Low Scores Anyone?

Message boards : Number crunching : Low Scores Anyone?


Conan
Joined: 11 Oct 05
Posts: 151
Credit: 4,244,078
RAC: 128
Message 30263 - Posted: 30 Oct 2006, 1:31:02 UTC
Last modified: 30 Oct 2006, 1:35:25 UTC

> The problem Work Unit Type that I have is "1hz6A_BOINC_NATIVEJUMP_CLOSE_CHAINBREAKS_......"
(My preference is 21600 seconds or 6 hours), see below

>> Workunit 38998468 / Result 44201063
> Took 32946.174 seconds for 2 decoys, granted cobblestones 19.94 = 2.18 c/hour
>> Workunit 38983264 / Result 44184941
> Took 42094.899 seconds for 1 decoy, granted cobblestones 9.96 = 0.85 c/hour
>> Workunit 38891074 / Result 44104983
> Took 44900.914 seconds for 5 decoys, granted cobblestones 49.685 = 3.98 c/hour
>> Workunit 38893424 / Result 44089375
> Took 16400.803 seconds for 1 decoy, granted cobblestones 5.928 = 1.30 c/hour
>> Workunit 38866928 / Result 44061211
> Took 18923.17 seconds for 2 decoys, granted cobblestones 11.804 = 2.24 c/hour
>> Workunit 38819735 / Result 44010746
> Took 20713.144 seconds for 1 decoy, granted cobblestones 5.885 = 1.023 c/hour

As can be seen, these took a lot of time for very little return, and only a small number of decoys were processed in that time.
Possibly these are very large proteins; if so, a higher granted cobblestone amount would have been expected for the extra work done on them.
These are the lowest ones I grabbed; I did not look at all my results, so there could be others.
Getting less than 1 cobblestone per hour does not seem fair, especially considering the results that some people are getting (see the "High Scores Anyone" thread, with 1 computer getting over 520 cobblestones for 18 seconds of work).

It would appear that there may be two problems with some workunits: some give very high and others very low amounts of credit.
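
For anyone who wants to check the c/hour figures against their own results, here is a minimal Python sketch. The numbers are copied from the list above; the function name `credits_per_hour` is just an illustrative choice, not anything from BOINC or Rosetta.

```python
# (result id, CPU seconds, decoys, granted credit), copied from the list above.
results = [
    (44201063, 32946.174, 2, 19.94),
    (44184941, 42094.899, 1, 9.96),
    (44104983, 44900.914, 5, 49.685),
    (44089375, 16400.803, 1, 5.928),
    (44061211, 18923.17, 2, 11.804),
    (44010746, 20713.144, 1, 5.885),
]


def credits_per_hour(cpu_seconds, granted):
    """Granted credit divided by CPU time expressed in hours."""
    return granted / (cpu_seconds / 3600.0)


for result_id, secs, decoys, granted in results:
    hours_per_decoy = secs / decoys / 3600.0
    print(f"Result {result_id}: {credits_per_hour(secs, granted):.2f} c/hour, "
          f"{hours_per_decoy:.2f} hours per decoy")
```

The hours-per-decoy column makes the pattern easier to see: with a 6 hour preference, a single decoy that takes several hours leaves very little room for a second one.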



ID: 30263 · Rating: 0

Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 30266 - Posted: 30 Oct 2006, 5:10:08 UTC

The credit I'm getting now on these other work units is more like normal. Maybe you should send the weird ones to XS_DDTUNG; that person has predicted the lowest structure 3 times in the last 2 months, so he/she must have some really good equipment.
ID: 30266 · Rating: 0

Rhiju
Volunteer moderator
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 30270 - Posted: 30 Oct 2006, 6:21:48 UTC - in response to Message 30263.  
Last modified: 30 Oct 2006, 6:22:08 UTC

To the posters who have been alerting us about workunits that produce very few decoys, I just posted something over on this thread. Thanks for helping us so far!

ID: 30270 · Rating: 0

Fafnir
Joined: 20 Jul 06
Posts: 2
Credit: 369
RAC: 0
Message 30326 - Posted: 30 Oct 2006, 22:49:28 UTC - in response to Message 30270.  

My MacBook:
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=275356
Measured floating point speed 795 million ops/sec
Measured integer speed 1148.68 million ops/sec

Another MacBook:
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=298066
Measured floating point speed 1555.36 million ops/sec
Measured integer speed 4334.73 million ops/sec

I get only around 12 credits granted (and claim even less), while the other MacBook gets around 30 credits. Both machines finish the work units in roughly 10,000 CPU seconds, so where is the difference?
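
One possible explanation is the benchmark difference itself. A rough sketch, assuming the classic benchmark-based BOINC claim (100 cobblestones per CPU-day on a host whose Whetstone and Dhrystone results average 1,000 million ops/sec); whether Rosetta's granted credit follows that claim directly is not stated in this thread, and the function name is only illustrative:

```python
SECONDS_PER_DAY = 86400.0


def claimed_credit(cpu_seconds, whetstone_mops, dhrystone_mops):
    """Assumed classic claim: CPU days x average benchmark (in 1,000 Mops units) x 100."""
    avg_benchmark = (whetstone_mops + dhrystone_mops) / 2.0 / 1000.0
    return cpu_seconds / SECONDS_PER_DAY * avg_benchmark * 100.0


# Benchmark figures from the two host pages quoted above, 10,000 CPU seconds each.
first = claimed_credit(10000, 795.0, 1148.68)
second = claimed_credit(10000, 1555.36, 4334.73)
print(f"first MacBook:  {first:.1f}")
print(f"second MacBook: {second:.1f}")
```

Under that assumption the two hosts would claim roughly 11 and 34 credits for the same 10,000 seconds, which is in the same range as the ~12 and ~30 reported above.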
ID: 30326 · Rating: 0

dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,867,273
RAC: 1,759
Message 30327 - Posted: 30 Oct 2006, 23:04:57 UTC - in response to Message 30229.  

Is it possible that when a WU errors out (or is aborted) and reports back, that it gets averaged into the credit per model as a zero? And then during the nightly run is granted some credit for the failure, which is not added back in to the running average claim?

Sounds very plausible - (I'll give it a bump in case anyone missed it ;))
ID: 30327 · Rating: 0

Mats Petersson
Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 30371 - Posted: 31 Oct 2006, 15:18:47 UTC - in response to Message 30184.  

As I was "requested" to try to comment on this, I will...

System A -- AMD Athlon TBIRD Uni-Processor @ 1533MHZ
System B -- Intel XEON MP @ 1500MHZ (one of 4 physical/8 logical)

System A -- 64kI/64kD 256K-L2
System B -- 12kI/8kD 512K-L2 1024K-L3 32MB-L4

AMD                                     INTEL
___________________________________________________________________
48402.95                Actual-time     33413.59
59.7020724185046        claim           12.3420001842521
101.790368563704        grant           9.82971861593104
30                      decoys-gen      1
43200                   Preferred-time  43200
1306_38409_0            SUFFIX          1306_34283_0


System A -- Rosetta id's and name.
https://boinc.bakerlab.org/rosetta/result.php?resultid=44225313
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=39021279
1hz6A_BOINC_NATIVEJUMPS_CLOSE_CHAINBREAKS_VARY_ALL_BOND_DISTANCES_SAVE_ALL_OUT__1306_38409_0

System B -- Rosetta id's and name.
https://boinc.bakerlab.org/rosetta/result.php?resultid=44199022
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=38996524
1hz6A_BOINC_NATIVEJUMPS_CLOSE_CHAINBREAKS_VARY_ALL_BOND_DISTANCES_SAVE_ALL_OUT__1306_34283_0


I think the Intel system is performing absolutely abysmally - the benchmark result is 175 MFLOPS and 400 MIPS, which is about 5-10x lower than one of my AMD64 processors... That's not what I expect from a 1.5GHz P4-type processor - something is wrong here, but I can't really say what without further information. It's not the memory/caches as far as I can tell, as the benchmark would fit in the L1 cache, even with the small P4 caches...

The Athlon XP system appears to be about right for that class of processor - around 6-8 credits per hour on the few samples I looked at.

--
Mats

ID: 30371 · Rating: 0

dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,867,273
RAC: 1,759
Message 30399 - Posted: 31 Oct 2006, 23:43:42 UTC - in response to Message 30327.  

From this WU:
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=38953742

As Feet1st suggested, it looks like the errored WU's score of 0 is being taken into account in the averaging. There has been a pretty large drop in RAC recently - I assume this is part of the reason? (The other part being the large checkpoint times.)
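
To show how much a single zero can drag things down, here is a toy Python sketch. The server's actual averaging code is not shown anywhere in this thread, so the function and the claim values below are made up purely for illustration:

```python
def average_claim_per_model(claims, include_errors):
    """Average claimed credit per model; errored results are recorded as 0.0.

    Hypothetical helper: it only illustrates the suspected effect,
    not how the project server really computes the average.
    """
    used = claims if include_errors else [c for c in claims if c > 0]
    return sum(used) / len(used)


# Made-up claims; the 0.0 stands for a WU that errored out or was aborted.
claims = [10.0, 12.0, 0.0, 11.0]

print(average_claim_per_model(claims, include_errors=True))
print(average_claim_per_model(claims, include_errors=False))
```

With one errored result in four, the average per model falls by a quarter in this toy case, so everyone crunching that batch would be granted less if zeros really are counted.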

ID: 30399 · Rating: 0

netwraith
Joined: 3 Sep 06
Posts: 80
Credit: 13,483,227
RAC: 0
Message 30402 - Posted: 1 Nov 2006, 2:31:21 UTC - in response to Message 30371.  

Thanks for the response. I agree that the Intel is performing particularly poorly with this W.U. And I did not have any real good reason for it to do so poorly. It tracks better with the AMD on other types of W.U.'s.. and generally scores just a bit lower.

It is only with this type of work unit that the performance was so poor and the only thing that the AMD did was prove it was possible to do properly/better.

I think your analytical skills are pretty good, even if this one is confusing... It is confusing me too.. I just wanted to confirm suspicions and possibly rule out something I had not thought of... Thanks again...

I, also, had not gotten any comments or help from the programmers... They just skipped right by and glossed it over. Maybe did not perceive it as a problem.. Oh well.. Just when I expect impartiality from scientific minds, I get the opposite...






Looking for a team ??? Join BoincSynergy!!


ID: 30402 · Rating: 0

BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 30405 - Posted: 1 Nov 2006, 3:17:57 UTC

Give 5.36 a try, and if you get equally bad results, then Rhiju's changes to 5.36 didn't cure the issue for your machine. Post a message in the 5.36 error thread. We don't always get personalized replies here.

I may have got a few - but most of them were (Your message has been moved to another thread because it was off topic.) *grin*
ID: 30405 · Rating: 0

dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,867,273
RAC: 1,759
Message 30427 - Posted: 1 Nov 2006, 12:12:36 UTC

It seems the longer a WU runs, the more chance there is of it getting a low score. This kinda makes sense, as they're more likely to error out, I guess.

This is one of mine that ran 4x longer than usual and only got 1/3 of the usual points:

Computer: https://boinc.bakerlab.org/rosetta/results.php?hostid=301687
This particular WU: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=39020122
ID: 30427 · Rating: 0

netwraith
Joined: 3 Sep 06
Posts: 80
Credit: 13,483,227
RAC: 0
Message 30429 - Posted: 1 Nov 2006, 12:14:42 UTC - in response to Message 30405.  
Last modified: 1 Nov 2006, 12:35:38 UTC

Well -- I just got done with a similar job on RALPH... Now the machine that was doing 30 decoys w/5.34 is getting stuck w/5.36... Proper Change Control Anyone ???

May have to temporarily switch off projects if this continues...

And I just added both of my quad cores to RALPH...

Just an *ACK* would be sufficient..!! (I still miss 'Bloom County')...




ID: 30429 · Rating: 0

Mats Petersson
Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 30430 - Posted: 1 Nov 2006, 12:24:51 UTC - in response to Message 30402.  

It is only with this type of work unit that the performance was so poor and the only thing that the AMD did was prove it was possible to do properly/better.


I definitely think your benchmark score should be higher for the processor - I would investigate why this is so low before looking at WU/Task related issues. I've got every single one of my processors showing more than double your benchmark scores, even on Linux distros, which are notorious for giving poor benchmark results. Intel processors shouldn't be THAT much worse than the AMD processors per clock, so I do think there's something in your system causing a problem - assuming of course this isn't a file- or db-server that is loaded by hundreds of clients fetching data from it all the time...

--
Mats

ID: 30430 · Rating: 0

netwraith
Joined: 3 Sep 06
Posts: 80
Credit: 13,483,227
RAC: 0
Message 30432 - Posted: 1 Nov 2006, 12:43:18 UTC - in response to Message 30430.  
Last modified: 1 Nov 2006, 12:48:50 UTC

The benchmark scores on all of my multi-core machines seem slow. For example, I have a dual XEON 2.8 w/HT on.. The benchmark stops all 4 threads, runs, and divides the result by 4.. Now that only really works if the benchmark is fully multithreaded, which I don't think it is..

Now... as far as the one system in question, yes, that one is a bit of a special case... That one has a very strange high-resolution clock (SUMMIT cyclone -- IBM IA32) and very low loops-per-jiffy (lpj ... used to calculate BOGOMIPS among other things)... The BOGOMIPS per CPU is ~200!!! (should be around 3000)...
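
For reference, this is roughly how the kernel's boot message turns lpj into BogoMIPS (lpj * HZ / 500000). A small Python sketch; the HZ value of 1000 and the lpj values are assumptions picked to reproduce the ~200 and ~3000 figures, not readings from the actual machine:

```python
def bogomips(loops_per_jiffy, hz):
    """BogoMIPS as printed in the Linux boot log: lpj * HZ / 500000."""
    return loops_per_jiffy * hz / 500000.0


HZ = 1000  # assumed timer frequency; check the kernel config for the real value

# Illustrative lpj values only (not measured on the xSeries 440).
print(bogomips(100_000, HZ))    # a CPU reporting about 200 BogoMIPS
print(bogomips(1_500_000, HZ))  # a CPU reporting about 3000 BogoMIPS
```

Since lpj is calibrated against the timer tick, a timer source that misbehaves during calibration could plausibly produce a number that is off by an order of magnitude without the CPU itself being slow.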

I am still using the CentOS recommended kernel with these, but, have gotten a recent 2.6.17ish kernel compiled.. But I just stopped otherwise, because the system does everything else so well..

Maybe it's time to try the newer kernel or to force the system to use something other than the SUMMIT cyclone timer for hi-res... and, no, the machine (and its twin) are currently unloaded except for BOINC... They are IBM xSeries 440s

Thanks again for the reply...




ID: 30432 · Rating: 0

Mats Petersson
Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 30434 - Posted: 1 Nov 2006, 12:52:23 UTC - in response to Message 30432.  

I'd try to get the BogoMIPS number correct before trying to find solutions to any other problems.

I'm fairly confident that the benchmark _IS_ multithreaded - at least I see four instances of "boinc" instead of "rosetta" in "top" when I force a run of benchmark on my 2 socket dual-core machine.

--
Mats
ID: 30434 · Rating: 0

blueroom
Joined: 14 May 06
Posts: 3
Credit: 42,145
RAC: 0
Message 30601 - Posted: 4 Nov 2006, 12:00:05 UTC
Last modified: 4 Nov 2006, 12:08:28 UTC

Hmm, don't know if this is normal, but it seems kind of low: I am getting about 10 points/hour/core with an AMD X2 3800+, WinXP, 512 MB. Runtime is set to 6h/WU.
Edit: It seems normal compared to others, so this question/post may be ignored.

ID: 30601 · Rating: 0

RWIoffice
Joined: 7 Jun 06
Posts: 4
Credit: 37,344
RAC: 0
Message 30652 - Posted: 5 Nov 2006, 17:31:06 UTC

FWIW, this machine gets only occasional foreground interactive use, and crunches for Rosetta in the background 24/7. RAC has dropped from 226 on 10/5 to 207 (11/5). The BOINC manager stats graph shows pretty much linear decline, with the exception of one upward spike around 10/20-21. A quick backward glance through Messages shows nothing in red since the machine's last reboot for updates on 10/25.

ID: 30652 · Rating: 0

Conan
Joined: 11 Oct 05
Posts: 151
Credit: 4,244,078
RAC: 128
Message 30675 - Posted: 6 Nov 2006, 4:37:43 UTC

> @ RWloffice, you are probably getting the lower RAC due to some long workunits that have taken longer to process than preference settings.
I have had the same problem (I got as low as 0.85 credits/hour for one long workunit that did 1 decoy, and I am set to 6 hours). I also have had screensaver problems that froze the machine for some time, and this may also have caused an issue.
The workunits do complete, so that is why there are no error messages.
5.36 is supposed to have fixed the long workunit problem.
ID: 30675 · Rating: 0



©2025 University of Washington
https://www.bakerlab.org