Message boards : Number crunching : Problems with Rosetta version 5.45
Author | Message |
---|---|
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
What's this mean ? It means you have some weird problem with the automatic download of the new Rosetta version. I'd say leave it and see if it all sorts itself out after a retry or three - the software at your end is obviously backing off for 10mins to see if the issue resolves itself, but those who know more than I do may advise differently. R~~ ps - like your profile - I too am old enuff to remember punch cards |
Chu Joined: 23 Feb 06 Posts: 120 Credit: 112,439 RAC: 0 |
Thanks for the report, River. When this happened, did you happen to see whether the cpu run time was still being incremented? I agree with you it definitely looks like a bug somewhere, but not graphic related. I am wondering if this only happens on linux platforms or everywhere else. The 'stuck at 100%' bug has returned with this result here. |
EdMulock Joined: 14 Mar 06 Posts: 30 Credit: 2,347,485 RAC: 0 |
What's this mean ? A complete reinstall, including deleting the BOINC directory and re-linking to the project, did the trick. Thanks. Remember when people with BIG HANDS got all the promotions ? |
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 1 |
Lost 2 wu's with the same error. Both giving... <core_client_version>5.8.8</core_client_version> ... both arrived within a couple of hours and were crunched on different machines. The other commonality is the wu name, both were FRA_z020_STRUCTURAL_GENOMICS... etc wu's. Both machines are running 5.8.8 core, leave in memory set, no graphics. "I too am old enuff to remember punch cards" Ah, good times huh? Punched cards, paper tape. A program patch was done with scissors and sticky tape. I recall we had a mechanical card sorter, so WHEN you dropped your card deck, the machine would put them back into order determined by the line numbers in the last 8 columns of your Fortran-IV program. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
netwraith Joined: 3 Sep 06 Posts: 80 Credit: 13,483,227 RAC: 0 |
With my Linux client (standard 5.4.11), I am no longer getting the adjusted granted credit. For this computer... https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=299336 My claim and grant are identical... (Has not happened before) https://boinc.bakerlab.org/rosetta/result.php?resultid=61261566 Claimed credit 38.2147647381899 Granted credit 38.2147647381899 https://boinc.bakerlab.org/rosetta/result.php?resultid=61220331 Claimed credit 37.0220763114969 Granted credit 60.7551059618435 This seems to be happening to a lot of my hosts, and only the latest results.. Needless to say, I am not using any form of optimised client and am afraid that this issue, if intentional, might lead to another round of high claiming clients.... *EDIT* I went a bit deeper into my results, and anything reported from about 4:30AM UTC to around 9AM UTC had identical CLAIM/GRANT numbers... What happened ??? I suppose that there are much more pressing issues than this... Don't spend a bunch of time on my account... Looking for a team ??? Join BoincSynergy!! |
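
For anyone wanting to check their own results for the same symptom, here is a minimal sketch, assuming the result pages have been saved as plain text containing the "Claimed credit ... Granted credit ..." wording quoted above. The function names `parse_credits` and `unadjusted` are made up for illustration and are not part of BOINC or Rosetta.

```python
import re

# Matches "Claimed credit X Granted credit Y" pairs as they appear in the post above.
CREDIT_RE = re.compile(
    r"Claimed credit\s+([0-9]+(?:\.[0-9]+)?)\s+Granted credit\s+([0-9]+(?:\.[0-9]+)?)"
)


def parse_credits(text):
    """Return a list of (claimed, granted) float pairs found in the text."""
    return [(float(c), float(g)) for c, g in CREDIT_RE.findall(text)]


def unadjusted(pairs, tol=1e-9):
    """Return the pairs whose granted credit equals the claim (within tol)."""
    return [(c, g) for c, g in pairs if abs(c - g) <= tol]


# The two results quoted in the post above.
sample = (
    "Claimed credit 38.2147647381899 Granted credit 38.2147647381899 "
    "Claimed credit 37.0220763114969 Granted credit 60.7551059618435"
)
pairs = parse_credits(sample)
print(unadjusted(pairs))
```

Only the first of the two quoted results is flagged, since its grant matches the claim exactly; the second was adjusted by the server as usual.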
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
The credit issue with the new server BOINC code has been corrected now. See post Rosetta Moderator: Mod.Sense |
Christoph Joined: 10 Dec 05 Posts: 57 Credit: 1,512,386 RAC: 0 |
|
netwraith Joined: 3 Sep 06 Posts: 80 Credit: 13,483,227 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=61387146 This one went to 100%, was still in run mode and stopped accumulating time... I restarted BOINC and it then continued, resetting to 47%... I am not sure what behavior this indicates, but figured I would document it anyway... Looking for a team ??? Join BoincSynergy!! |
netwraith Joined: 3 Sep 06 Posts: 80 Credit: 13,483,227 RAC: 0 |
*UPDATE* Noticing lots of anomalies with W.U.'s with this pattern.... Hangs, SEGV's, and short runs.... DOC_????_fixbb_???? Looking for a team ??? Join BoincSynergy!! |
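
A rough sketch of how one might pick out work units from the suspect batch by name. The glob `DOC_*_fixbb_*` is an assumption: the post writes the pattern loosely as DOC_????_fixbb_????, and the full names reported later in this thread have more text between the pieces. `is_suspect` is a hypothetical helper, and the FRA name is only the truncated prefix given earlier in the thread.

```python
from fnmatch import fnmatchcase

# Assumed glob for the suspect batch (the post writes it loosely as DOC_????_fixbb_????).
SUSPECT_PATTERN = "DOC_*_fixbb_*"


def is_suspect(wu_name, pattern=SUSPECT_PATTERN):
    """Return True if the work unit name matches the suspect pattern (case-sensitive)."""
    return fnmatchcase(wu_name, pattern)


# Names taken from posts in this thread; the FRA entry is a truncated prefix.
names = [
    "DOC_1EO8_R070207_pose_u_global_search_fixbb_1549_713_0",
    "DOC_1DQJ_R070207_pose_u_global_search_fixbb_1549_663_0",
    "FRA_z020_STRUCTURAL_GENOMICS",
]
suspects = [n for n in names if is_suspect(n)]
print(suspects)
```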
MattDavis Joined: 22 Sep 05 Posts: 206 Credit: 1,377,748 RAC: 0 |
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=2883 |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
Dedicated BOINC machine, no other processes: Result ID 61735972 Name DOC_1EO8_R070207_pose_u_global_search_fixbb_1549_713_0 Workunit 54954229 Created 9 Feb 2007 7:06:19 UTC Sent 9 Feb 2007 7:11:54 UTC Received 9 Feb 2007 17:11:35 UTC Server state Over Outcome Success Client state Done Exit status 0 (0x0) Computer ID 358920 Report deadline 19 Feb 2007 7:11:54 UTC CPU time 18121.765625 stderr out <core_client_version>5.4.11</core_client_version> <stderr_txt> # random seed: 2083803 # cpu_run_time_pref: 21600 ********************************************************************** Rosetta score is stuck or going too long. Watchdog is ending the run! Stuck at score 0 for 3600 seconds ********************************************************************** GZIP SILENT FILE: .dd1EO8.out </stderr_txt> Validate state Valid Claimed credit 66.5583529721271 Granted credit 20 application version 5.45 |
Trog Dog Joined: 25 Nov 05 Posts: 129 Credit: 57,345 RAC: 0 |
Hi all. G'day Eric. The 127 error code is a file-not-found code, which normally means a missing library or dependency. It could also mean that the rosetta app does not have its executable bit set. From a terminal window, run the ldd command on the rosetta app and then on boincmgr, and note which packages/libraries you are missing: ldd /full/path/to/rosetta_5.43_i686-pc-linux-gnu and ldd /full/path/to/boincmgr, substituting the correct path for /full/path/to/ |
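
The two checks described above (executable bit, then missing libraries via ldd) could be scripted roughly as follows. This is only a sketch, assuming a Linux host with `ldd` on the PATH; `missing_libraries` and `check_app` are hypothetical helper names, and the path is the placeholder from the post, to be replaced with the real one.

```python
import os
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "not found" in line:
            # Such lines look like: "libfoo.so.1 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def check_app(path):
    """Check existence, the executable bit, and unresolved shared libraries."""
    report = {"exists": os.path.isfile(path), "executable": False, "missing": []}
    if not report["exists"]:
        return report
    report["executable"] = os.access(path, os.X_OK)
    # Assumes ldd is installed; raises FileNotFoundError otherwise.
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    report["missing"] = missing_libraries(result.stdout)
    return report


if __name__ == "__main__":
    # Placeholder path from the post; substitute the correct location.
    print(check_app("/full/path/to/rosetta_5.43_i686-pc-linux-gnu"))
```

An empty `missing` list together with `executable` being True would suggest the problem lies elsewhere.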
Michael.L Joined: 12 Nov 06 Posts: 67 Credit: 31,295 RAC: 0 |
Result ID 61729110 Name DOC_1DQJ_R070207_pose_u_global_search_fixbb_1549_663_0 Workunit 54947927 Created 9 Feb 2007 6:07:35 UTC Sent 9 Feb 2007 6:12:48 UTC Received 10 Feb 2007 8:42:19 UTC Server state Over Outcome Success Client state Done Exit status 0 (0x0) Computer ID 410873 Report deadline 19 Feb 2007 6:12:48 UTC CPU time 3591.15625 stderr out <core_client_version>5.4.11</core_client_version> <stderr_txt> # random seed: 2083863 # cpu_run_time_pref: 14400 ********************************************************************** Rosetta score is stuck or going too long. Watchdog is ending the run! Stuck at score 0 for 3600 seconds ********************************************************************** GZIP SILENT FILE: .dd1DQJ.out </stderr_txt> Validate state Valid Claimed credit 10.9597211979944 Granted credit 20 application version 5.45 |
Conan Joined: 11 Oct 05 Posts: 151 Credit: 4,244,078 RAC: 128 |
> Getting the same problem I had over at Ralph with the stuck for too long at zero. <core_client_version>5.4.11</core_client_version> <stderr_txt> # random seed: 2084050 # cpu_run_time_pref: 21600 ********************************************************************** Rosetta score is stuck or going too long. Watchdog is ending the run! Stuck at score 0 for 3600 seconds ********************************************************************** GZIP SILENT FILE: .dd1CHO.out https://boinc.bakerlab.org/rosetta/result.php?resultid=61657043 https://boinc.bakerlab.org/rosetta/result.php?resultid=61657041 https://boinc.bakerlab.org/rosetta/result.php?resultid=61596422 https://boinc.bakerlab.org/rosetta/result.php?resultid=61556930 On Ralph it was my Linux machine that had this problem for 6 WUs; now my Windows machines are having the same problem on Rosetta. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
There were some problems with a batch of DOC WUs being ended by the watchdog. These WUs have now been removed from the server. Further description and symptoms in the link below: quoting Chu Feb 9th, "...those [DOC] WUs have been temporarily removed from the queue...To further help us track down the problem, could you please report what kind of platform your host is? It is definitely happening on linux, what about the rest of you?" Rosetta Moderator: Mod.Sense |
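
To help sort through results that may have been hit by this, here is a small sketch that pulls the watchdog details out of a result's stderr text. It assumes the stderr wording shown in the reports in this thread; `parse_watchdog` is a hypothetical helper, not something provided by BOINC or Rosetta.

```python
import re

# Patterns based on the stderr text quoted in this thread.
STUCK_RE = re.compile(r"Stuck at score (-?\d+(?:\.\d+)?) for (\d+) seconds")
PREF_RE = re.compile(r"# cpu_run_time_pref: (\d+)")


def parse_watchdog(stderr_txt):
    """Extract watchdog-related fields from a result's stderr text."""
    stuck = STUCK_RE.search(stderr_txt)
    pref = PREF_RE.search(stderr_txt)
    return {
        "watchdog_ended": "Watchdog is ending the run!" in stderr_txt,
        "stuck_score": float(stuck.group(1)) if stuck else None,
        "stuck_seconds": int(stuck.group(2)) if stuck else None,
        "cpu_run_time_pref": int(pref.group(1)) if pref else None,
    }


# Text condensed from one of the reports in this thread.
sample = (
    "# random seed: 2083803 # cpu_run_time_pref: 21600 "
    "Rosetta score is stuck or going too long. Watchdog is ending the run! "
    "Stuck at score 0 for 3600 seconds"
)
report = parse_watchdog(sample)
print(report)
```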
Michael.L Joined: 12 Nov 06 Posts: 67 Credit: 31,295 RAC: 0 |
Stuck WU. Msg 36421. Windows XP 2 Home. BOINC 5.4.1. |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
Compaq sr2030nx. AMD A64 3800+. 1 GB RAM. Win XP MC 2005. BOINC 5.4.11 Result ID 61735972 Name DOC_1EO8_R070207_pose_u_global_search_fixbb_1549_713_0 Workunit 54954229 ********************************************************************** Rosetta score is stuck or going too long. Watchdog is ending the run! Stuck at score 0 for 3600 seconds ********************************************************************** "To further help us track down the problem, could you please report what kind of platform your host is? It is definitely happening on linux, what about the rest of you?" |
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
Sorry, I didn't take note of that... will let you know next time it happens. I would add that this is not an every-time bug, even on Linux, as I had cleared out the Rosetta work from 6 Linux boxes and 3 win2k using the same technique, and this bug only arose on 2 boxes. It could be coincidence that both failures occurred on Linux. Thanks for the report, River. When this happened, did you happen to see whether the cpu run time was still being incremented? I agree with you it definitely looks like a bug somewhere, but not graphic related. I am wondering if this only happens on linux platforms or everywhere else. |
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
and the stuck soon after start bug has also returned. This result hung for eight hours at 1min 16sec before I noticed it and aborted it. The next following result started and has already run for 14min cpu and still counting. I got the eight hours figure from looking up the 'computation started' time in the messages tab. River~~ |
anders n Joined: 19 Sep 05 Posts: 403 Credit: 537,991 RAC: 0 |
exit code 1 (0x1) ERROR:: Exit at: .refold.cc line:337 Wu https://boinc.bakerlab.org/rosetta/result.php?resultid=61784644 Anders n |
©2025 University of Washington
https://www.bakerlab.org