Low Scores Anyone?

Conan Joined: 11 Oct 05 Posts: 153 Credit: 4,460,094 RAC: 0
Message 30263 - Posted: 30 Oct 2006, 1:31:02 UTC Last modified: 30 Oct 2006, 1:35:25 UTC

> The problem Work Unit Type that I have is "1hz6A_BOINC_NATIVEJUMP_CLOSE_CHAINBREAKS_......" (my preference is 21600 seconds, or 6 hours); see below.

>> Workunit 38998468 / Result 44201063
> Took 32946.174 seconds for 2 decoys, granted cobblestones 19.94 = 2.18 c/hour
>> Workunit 38983264 / Result 44184941
> Took 42094.899 seconds for 1 decoy, granted cobblestones 9.96 = 0.85 c/hour
>> Workunit 38891074 / Result 44104983
> Took 44900.914 seconds for 5 decoys, granted cobblestones 49.685 = 3.98 c/hour
>> Workunit 38893424 / Result 44089375
> Took 16400.803 seconds for 1 decoy, granted cobblestones 5.928 = 1.30 c/hour
>> Workunit 38866928 / Result 44061211
> Took 18923.17 seconds for 2 decoys, granted cobblestones 11.804 = 2.24 c/hour
>> Workunit 38819735 / Result 44010746
> Took 20713.144 seconds for 1 decoy, granted cobblestones 5.885 = 1.023 c/hour

As can be seen, these took lots of time for very little return, with only a small number of decoys processed in that time. Possibly these are very large proteins; if so, then given the extra work done on those proteins, a higher granted cobblestone amount would have been expected.

These are the lowest ones I grabbed. I did not look at all my results, so there could be others.

Getting less than 1 cobblestone per hour does not seem fair, especially considering the results that some people are getting (see the "High Scores Anyone" thread, with 1 computer getting over 520 cobblestones for 18 seconds of work).

It would appear that there may be two problems with some workunits: some give very high and others very low amounts of credit.
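For readers who want to check their own results the same way, here is a minimal sketch of the arithmetic behind the c/hour figures above: granted credit divided by CPU time in hours. The function name `credits_per_hour` is made up for illustration; the numbers are the ones listed in the post.

```python
def credits_per_hour(cpu_seconds: float, granted: float) -> float:
    """Granted credit divided by CPU time expressed in hours."""
    return granted / (cpu_seconds / 3600.0)


# Result ID -> (CPU seconds, granted credit), copied from the post above.
results = {
    44201063: (32946.174, 19.94),
    44184941: (42094.899, 9.96),
    44104983: (44900.914, 49.685),
    44089375: (16400.803, 5.928),
    44061211: (18923.17, 11.804),
    44010746: (20713.144, 5.885),
}

for result_id, (cpu_seconds, granted) in results.items():
    rate = credits_per_hour(cpu_seconds, granted)
    print(f"Result {result_id}: {rate:.2f} c/hour")
```

Sorting the results by this rate is a quick way to spot the work units worth reporting.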

Gen_X_Accord Joined: 5 Jun 06 Posts: 154 Credit: 279,018 RAC: 0
Message 30266 - Posted: 30 Oct 2006, 5:10:08 UTC

The credit I'm getting now on these other work units is more like normal. Maybe you should send the weird ones to XS_DDTUNG; that person has predicted the lowest structure 3 times in the last 2 months, so he/she must have some really good equipment.

Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0
Message 30270 - Posted: 30 Oct 2006, 6:21:48 UTC - in response to Message 30263. Last modified: 30 Oct 2006, 6:22:08 UTC

To the posters who have been alerting us about workunits that produce very few decoys, I just posted something over on this thread. Thanks for helping us so far!

[quote]
> The problem Work Unit Type that I have is "1hz6A_BOINC_NATIVEJUMP_CLOSE_CHAINBREAKS_......" (my preference is 21600 seconds, or 6 hours); see below.

>> Workunit 38998468 / Result 44201063
> Took 32946.174 seconds for 2 decoys, granted cobblestones 19.94 = 2.18 c/hour
>> Workunit 38983264 / Result 44184941
> Took 42094.899 seconds for 1 decoy, granted cobblestones 9.96 = 0.85 c/hour
>> Workunit 38891074 / Result 44104983
> Took 44900.914 seconds for 5 decoys, granted cobblestones 49.685 = 3.98 c/hour
>> Workunit 38893424 / Result 44089375
> Took 16400.803 seconds for 1 decoy, granted cobblestones 5.928 = 1.30 c/hour
>> Workunit 38866928 / Result 44061211
> Took 18923.17 seconds for 2 decoys, granted cobblestones 11.804 = 2.24 c/hour
>> Workunit 38819735 / Result 44010746
> Took 20713.144 seconds for 1 decoy, granted cobblestones 5.885 = 1.023 c/hour

As can be seen, these took lots of time for very little return, with only a small number of decoys processed in that time. Possibly these are very large proteins; if so, then given the extra work done on those proteins, a higher granted cobblestone amount would have been expected.

These are the lowest ones I grabbed. I did not look at all my results, so there could be others.

Getting less than 1 cobblestone per hour does not seem fair, especially considering the results that some people are getting (see the "High Scores Anyone" thread, with 1 computer getting over 520 cobblestones for 18 seconds of work).

It would appear that there may be two problems with some workunits: some give very high and others very low amounts of credit.
[/quote]

Fafnir Joined: 20 Jul 06 Posts: 2 Credit: 369 RAC: 0
Message 30326 - Posted: 30 Oct 2006, 22:49:28 UTC - in response to Message 30270.

My MacBook: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=275356
Measured floating point speed 795 million ops/sec
Measured integer speed 1148.68 million ops/sec

Another MacBook: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=298066
Measured floating point speed 1555.36 million ops/sec
Measured integer speed 4334.73 million ops/sec

I get only around 12 credits granted (and claim even less), while the other MacBook gets around 30 credits. Both machines finish the work units in more or less 10000 CPU seconds, so where is the difference?
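One possible piece of the answer is the benchmark-based claim. The sketch below is not taken from this thread: it assumes the classic BOINC cobblestone definition (100 credits for one day of CPU time on a reference machine scoring 1,000 million ops/sec on both the floating point and the integer benchmark) and assumes the two benchmark figures are simply averaged. The function name `benchmark_claim` is made up for illustration, and how Rosetta turns claims into granted credit is a separate question.

```python
SECONDS_PER_DAY = 86400.0


def benchmark_claim(cpu_seconds: float, fp_mops: float, int_mops: float) -> float:
    """Estimate a benchmark-based credit claim.

    Assumption: 100 credits per CPU-day on a host whose floating point and
    integer benchmarks both read 1,000 million ops/sec, with the two
    benchmark figures averaged.
    """
    average_gops = (fp_mops + int_mops) / 2.0 / 1000.0
    cpu_days = cpu_seconds / SECONDS_PER_DAY
    return cpu_days * average_gops * 100.0


# Benchmark figures from the two MacBooks above, at roughly 10000 CPU seconds.
first = benchmark_claim(10000, 795, 1148.68)
second = benchmark_claim(10000, 1555.36, 4334.73)
print(f"First MacBook:  {first:.1f} credits claimed")
print(f"Second MacBook: {second:.1f} credits claimed")
```

Under these assumptions the same CPU time yields roughly three times the claim on the second machine, purely because its benchmark figures are higher.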

dcdc Joined: 3 Nov 05 Posts: 1834 Credit: 124,281,057 RAC: 396
Message 30327 - Posted: 30 Oct 2006, 23:04:57 UTC - in response to Message 30229.

[quote]
Is it possible that when a WU errors out (or is aborted) and reports back, that it gets averaged into the credit per model as a zero? And then during the nightly run is granted some credit for the failure, which is not added back in to the running average claim?
[/quote]

Sounds very plausible - (I'll give it a bump in case anyone missed it ;))

Mats Petersson Joined: 29 Sep 05 Posts: 225 Credit: 951,788 RAC: 0
Message 30371 - Posted: 31 Oct 2006, 15:18:47 UTC - in response to Message 30184.

As I was "requested" to try to comment on this, I will...

[quote]
System A -- AMD Athlon TBIRD Uni-Processor @ 1533MHZ
System B -- Intel XEON MP @ 1500MHZ (one of 4 physical/8 logical)
System A -- 64kI/64kD 256K-L2
System B -- 12kI/8kD 512K-L2 1024K-L3 32MB-L4

AMD                                  INTEL
___________________________________________________________________
48402.95            Actual-time      33413.59
59.7020724185046    claim            12.3420001842521
101.790368563704    grant            9.82971861593104
30                  decoys-gen       1
43200               Preferred-time   43200
1306_38409_0        SUFFIX           1306_34283_0

System A -- Rosetta id's and name.
https://boinc.bakerlab.org/rosetta/result.php?resultid=44225313
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=39021279
1hz6A_BOINC_NATIVEJUMPS_CLOSE_CHAINBREAKS_VARY_ALL_BOND_DISTANCES_SAVE_ALL_OUT__1306_38409_0

System B -- Rosetta id's and name.
https://boinc.bakerlab.org/rosetta/result.php?resultid=44199022
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=38996524
1hz6A_BOINC_NATIVEJUMPS_CLOSE_CHAINBREAKS_VARY_ALL_BOND_DISTANCES_SAVE_ALL_OUT__1306_34283_0
[/quote]

I think the Intel system is performing absolutely abysmally - the benchmark result is 175 MFlops and 400 MIPS - which is about 5-10x lower than one of my AMD64 processors... That's not what I expect from a 1.5GHz P4 type processor - something is wrong with this, but I can't really say what's wrong without further information. It's not the memory/caches as far as I can tell, as the benchmark would fit in the L1 cache, even with the small P4 caches...

The Athlon XP system appears to be about right for that class of processor - around 6-8 credits per hour on the few samples I looked at.

--
Mats

dcdc Joined: 3 Nov 05 Posts: 1834 Credit: 124,281,057 RAC: 396
Message 30399 - Posted: 31 Oct 2006, 23:43:42 UTC - in response to Message 30327.

[quote]
[quote]
Is it possible that when a WU errors out (or is aborted) and reports back, that it gets averaged into the credit per model as a zero? And then during the nightly run is granted some credit for the failure, which is not added back in to the running average claim?
[/quote]

Sounds very plausible - (I'll give it a bump in case anyone missed it ;))
[/quote]

From this WU: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=38953742

As Feet1st suggested, it looks like the errored WU score of 0 is being taken into account for the averaging. There has been a pretty large drop in RAC recently - I assume this is part of the reason? (The other part being the large checkpoint times.)
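To make the suspicion concrete, here is a small sketch of what would happen if an errored result's claim of 0 were counted in the average credit per model. This only illustrates the hypothesis raised in this thread and is not a description of how the project's server actually computes credit; the function name and the claim values are made up.

```python
def average_claim_per_model(claims: list[float]) -> float:
    """Plain mean of the per-model claims reported so far."""
    return sum(claims) / len(claims)


# Hypothetical per-model claims from three hosts that finished normally.
valid_claims = [4.0, 4.0, 4.0]

# The same work unit type after one errored result reports a claim of 0.
claims_with_error = valid_claims + [0.0]

before = average_claim_per_model(valid_claims)
after = average_claim_per_model(claims_with_error)
print(f"Average per model without the error: {before}")
print(f"Average per model with the error:    {after}")
```

If credit is granted from such an average, every later result for that work unit type would be paid at the lowered rate until enough valid results dilute the zero.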

netwraith Joined: 3 Sep 06 Posts: 80 Credit: 13,483,227 RAC: 0
Message 30402 - Posted: 1 Nov 2006, 2:31:21 UTC - in response to Message 30371.

[quote]
As I was "requested" to try to comment on this, I will...

[quote]
System A -- AMD Athlon TBIRD Uni-Processor @ 1533MHZ
System B -- Intel XEON MP @ 1500MHZ (one of 4 physical/8 logical)
System A -- 64kI/64kD 256K-L2
System B -- 12kI/8kD 512K-L2 1024K-L3 32MB-L4

AMD                                  INTEL
___________________________________________________________________
48402.95            Actual-time      33413.59
59.7020724185046    claim            12.3420001842521
101.790368563704    grant            9.82971861593104
30                  decoys-gen       1
43200               Preferred-time   43200
1306_38409_0        SUFFIX           1306_34283_0
[/quote]

I think the Intel system is performing absolutely abysmally - the benchmark result is 175 MFlops and 400 MIPS - which is about 5-10x lower than one of my AMD64 processors... That's not what I expect from a 1.5GHz P4 type processor - something is wrong with this, but I can't really say what's wrong without further information. It's not the memory/caches as far as I can tell, as the benchmark would fit in the L1 cache, even with the small P4 caches...

The Athlon XP system appears to be about right for that class of processor - around 6-8 credits per hour on the few samples I looked at.

--
Mats
[/quote]

Thanks for the response. I agree that the Intel is performing particularly poorly with this W.U., and I did not have any really good reason for it to do so poorly. It tracks better with the AMD on other types of W.U.s and generally scores just a bit lower. It is only with this type of work unit that the performance was so poor, and the only thing that the AMD did was prove it was possible to do properly/better.

I think your analytical skills are pretty good, even if this one is confusing... It is confusing me too.. I just wanted to confirm suspicions and possibly rule out something I had not thought of...

Thanks again... I, also, had not gotten any comments or help from the programmers... They just skipped right by and glossed it over. Maybe they did not perceive it as a problem.. Oh well.. Just when I expect impartiality from scientific minds, I get the opposite...

*Looking for a team ??? Join BoincSynergy!!*

BennyRop Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0
Message 30405 - Posted: 1 Nov 2006, 3:17:57 UTC

Give 5.36 a try, and if you get equally bad results, then Rhiju's changes to 5.36 didn't cure the issue for your machine. Post a message in the 5.36 error thread.

We don't always get personalized replies here. I may have got a few - but most of them were (Your message has been moved to another thread because it was off topic.) grin

dcdc Joined: 3 Nov 05 Posts: 1834 Credit: 124,281,057 RAC: 396
Message 30427 - Posted: 1 Nov 2006, 12:12:36 UTC

It seems the longer a WU runs, the more chance there is of it getting a low score. This kinda makes sense, as they're more likely to error out, I guess. This is one of mine that ran 4x longer than usual and only got 1/3 the usual points!

Computer: https://boinc.bakerlab.org/rosetta/results.php?hostid=301687
This particular WU: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=39020122

netwraith Joined: 3 Sep 06 Posts: 80 Credit: 13,483,227 RAC: 0
Message 30429 - Posted: 1 Nov 2006, 12:14:42 UTC - in response to Message 30405. Last modified: 1 Nov 2006, 12:35:38 UTC

[quote]
Give 5.36 a try, and if you get equally bad results, then Rhiju's changes to 5.36 didn't cure the issue for your machine. Post a message in the 5.36 error thread.

We don't always get personalized replies here. I may have got a few - but most of them were (Your message has been moved to another thread because it was off topic.) grin
[/quote]

Well -- I just got done with a similar job on RALPH... Now the machine that was doing 30 decoys w/5.34 is getting stuck w/5.36... Proper Change Control Anyone ??? May have to temporarily switch off projects if this continues... And I just added both of my quad cores to RALPH...

Just an ACK would be sufficient..!! (I still miss 'Bloom County')...

*Looking for a team ??? Join BoincSynergy!!*

Mats Petersson Joined: 29 Sep 05 Posts: 225 Credit: 951,788 RAC: 0
Message 30430 - Posted: 1 Nov 2006, 12:24:51 UTC - in response to Message 30402.

[quote]
It is only with this type of work unit that the performance was so poor, and the only thing that the AMD did was prove it was possible to do properly/better.
[/quote]

I definitely think your benchmark score should be higher for the processor - I would investigate why this is so low before looking at WU/Task related issues. I've got every single one of my processors showing more than double your benchmark scores, even on Linux distros which are notorious for giving poor benchmark results. Intel processors shouldn't be THAT much worse than the AMD processors per clock, so I do think there's something in your system causing a problem - assuming of course this isn't a file- or db-server that is loaded by hundreds of clients fetching data from it all the time...

--
Mats

netwraith Joined: 3 Sep 06 Posts: 80 Credit: 13,483,227 RAC: 0
Message 30432 - Posted: 1 Nov 2006, 12:43:18 UTC - in response to Message 30430. Last modified: 1 Nov 2006, 12:48:50 UTC

[quote]
[quote]
It is only with this type of work unit that the performance was so poor, and the only thing that the AMD did was prove it was possible to do properly/better.
[/quote]

I definitely think your benchmark score should be higher for the processor - I would investigate why this is so low before looking at WU/Task related issues. I've got every single one of my processors showing more than double your benchmark scores, even on Linux distros which are notorious for giving poor benchmark results. Intel processors shouldn't be THAT much worse than the AMD processors per clock, so I do think there's something in your system causing a problem - assuming of course this isn't a file- or db-server that is loaded by hundreds of clients fetching data from it all the time...

--
Mats
[/quote]

The benchmark scores on all of my multi-core machines seem slow. For example, I have a dual XEON 2.8 w/HT on.. The benchmark stops all 4 threads, runs, and divides the result by 4.. Now that only really works if the benchmark is fully multithreaded, which I don't think it is..

Now... as far as the one system in question, yes, that one is a bit of a special case... That one has a very strange high resolution clock (SUMMIT cyclone -- IBM IA32) and very low loops-per-jiffy (lpj ... used to calculate BOGOMIPS among other things)... The BOGOMIPS per CPU is ~200!!! (should be around 3000)... I am still using the CentOS recommended kernel with these, but have gotten a recent 2.6.17ish kernel compiled.. I just stopped there, because the system does everything else so well.. Maybe it's time to try the newer kernel or to force the system to use something other than the SUMMIT cyclone timer for hi-res...

And, no, the machine (and its twin) are currently unloaded except for BOINC... They are IBM xSeries 440s.

Thanks again for the reply...

*Looking for a team ??? Join BoincSynergy!!*

Mats Petersson Joined: 29 Sep 05 Posts: 225 Credit: 951,788 RAC: 0
Message 30434 - Posted: 1 Nov 2006, 12:52:23 UTC - in response to Message 30432.

[quote]
[quote]
[quote]
It is only with this type of work unit that the performance was so poor, and the only thing that the AMD did was prove it was possible to do properly/better.
[/quote]

I definitely think your benchmark score should be higher for the processor - I would investigate why this is so low before looking at WU/Task related issues. I've got every single one of my processors showing more than double your benchmark scores, even on Linux distros which are notorious for giving poor benchmark results. Intel processors shouldn't be THAT much worse than the AMD processors per clock, so I do think there's something in your system causing a problem - assuming of course this isn't a file- or db-server that is loaded by hundreds of clients fetching data from it all the time...

--
Mats
[/quote]

The benchmark scores on all of my multi-core machines seem slow. For example, I have a dual XEON 2.8 w/HT on.. The benchmark stops all 4 threads, runs, and divides the result by 4.. Now that only really works if the benchmark is fully multithreaded, which I don't think it is..

Now... as far as the one system in question, yes, that one is a bit of a special case... That one has a very strange high resolution clock (SUMMIT cyclone -- IBM IA32) and very low loops-per-jiffy (lpj ... used to calculate BOGOMIPS among other things)... The BOGOMIPS per CPU is ~200!!! (should be around 3000)... I am still using the CentOS recommended kernel with these, but have gotten a recent 2.6.17ish kernel compiled.. I just stopped there, because the system does everything else so well.. Maybe it's time to try the newer kernel or to force the system to use something other than the SUMMIT cyclone timer for hi-res...

And, no, the machine (and its twin) are currently unloaded except for BOINC... They are IBM xSeries 440s.

Thanks again for the reply...
[/quote]

I'd try to get the Bogomips number correct before trying to find any other solutions to any other problems.

I'm fairly confident that the benchmark _IS_ multithreaded - at least I see four instances of "boinc" instead of "rosetta" in "top" when I force a run of the benchmark on my 2 socket dual-core machine.

--
Mats
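For anyone wanting to check the BogoMIPS figure on their own Linux box, here is a minimal sketch that pulls the per-CPU values out of `/proc/cpuinfo`-style text and flags the ones that look too low. It assumes the x86 layout where each logical CPU has a line starting with `bogomips`; the sample text, the function names, and the 1000 threshold are made up for illustration and are not a rule from this thread.

```python
import re


def bogomips_per_cpu(cpuinfo_text: str) -> list[float]:
    """Return the BogoMIPS value reported for each logical CPU."""
    pattern = r"^bogomips\s*:\s*([0-9.]+)"
    matches = re.findall(pattern, cpuinfo_text, flags=re.MULTILINE | re.IGNORECASE)
    return [float(value) for value in matches]


def suspiciously_low(values: list[float], floor: float = 1000.0) -> list[float]:
    """Values under the (arbitrary, illustrative) floor."""
    return [value for value in values if value < floor]


# Hypothetical excerpt: one CPU calibrated badly, one as expected.
sample = (
    "processor\t: 0\n"
    "bogomips\t: 199.68\n"
    "\n"
    "processor\t: 1\n"
    "bogomips\t: 2998.27\n"
)

values = bogomips_per_cpu(sample)
print("BogoMIPS per CPU:", values)
print("Suspiciously low:", suspiciously_low(values))
```

On a real machine the text would come from reading `/proc/cpuinfo` instead of the sample string. A value far below what the clock speed suggests points at timer calibration rather than at the work units.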

blueroom Joined: 14 May 06 Posts: 3 Credit: 42,145 RAC: 0
Message 30601 - Posted: 4 Nov 2006, 12:00:05 UTC Last modified: 4 Nov 2006, 12:08:28 UTC

Hmm, don't know if this is normal, but it seems kind of low when I am getting about 10 points/hour/core with an AMD X2 3800+, WinXP, 512MB. Runtime is set to 6h/WU.

Edit: It seems normal compared to others, so this question/post may be ignored.

RWIoffice Joined: 7 Jun 06 Posts: 4 Credit: 37,344 RAC: 0
Message 30652 - Posted: 5 Nov 2006, 17:31:06 UTC

FWIW, this machine gets only occasional foreground interactive use, and crunches for Rosetta in the background 24/7. RAC has dropped from 226 on 10/5 to 207 (11/5). The BOINC manager stats graph shows pretty much linear decline, with the exception of one upward spike around 10/20-21. A quick backward glance through Messages shows nothing in red since the machine's last reboot for updates on 10/25.

Conan Joined: 11 Oct 05 Posts: 153 Credit: 4,460,094 RAC: 0
Message 30675 - Posted: 6 Nov 2006, 4:37:43 UTC

> @ RWIoffice, you are probably getting the lower RAC due to some long workunits that have taken longer to process than your preference settings allow. I have had the same problem (I got as low as 0.85 credits/hour for one long workunit that did 1 decoy, and I am set to 6 hours). I also have had screensaver problems that froze the machine for some time, and this may also have caused an issue.

The workunits do complete, which is why there are no error messages. 5.36 is supposed to have fixed the long workunit problem.