Message boards : Number crunching : Rosetta 5.40 locks up
Author | Message |
---|---|
Ivor Cogdell Joined: 7 Nov 06 Posts: 10 Credit: 21,469 RAC: 0 |
Message 31292 - Posted 17 Nov 2006 10:06:49 UTC
Hi folks, Running Seti@home enhanced 5.15, Einstein@home and Rosetta@home 5.40, Boinc 5.4.11 on PC Windows XP. After an overnight run, Rosetta refuses to break out of screensaver mode to normal operations. On a mouse click or keypress, the screen freezes, apart from the cursor. I have to turn off the computer to get it active again. Any thoughts? Ivor
Reply from Fluffy Chicken asked... ATI or Intel graphics card? What driver version? Pop along to the Number crunching section of the main message board, you should find a thread called Report problems with Rosetta@home v5.40. Post in there, including the answers to the questions I just asked.
ATI Graphics Radeon 9200 Series, Primary and Secondary. Driver ati2cqag.dll V 6.14.10.0265. Status shows as ok on both. Quick message startup for stats:
17/11/2006 19:29:11||Starting BOINC client version 5.4.11 for windows_intelx86
17/11/2006 19:29:11||libcurl/7.15.3 OpenSSL/0.9.8a zlib/1.2.3
17/11/2006 19:29:11||Data directory: C:\Program Files\BOINC
17/11/2006 19:29:12||Processor: 1 GenuineIntel x86 Family 6 Model 8 Stepping 10 996MHz
17/11/2006 19:29:12||Memory: 510.48 MB physical, 1.22 GB virtual
17/11/2006 19:29:12||Disk: 128.00 GB total, 89.02 GB free
17/11/2006 19:29:12|rosetta@home|URL: https://boinc.bakerlab.org/rosetta/; Computer ID: 349316; location: home; project prefs: home
17/11/2006 19:29:12|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 25471; location: home; project prefs: default
17/11/2006 19:29:12|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 1638859; location: home; project prefs: default
17/11/2006 19:29:12||General prefs: from SETI@home (last modified 2006-11-17 01:34:09)
17/11/2006 19:29:12||General prefs: using separate prefs for home
17/11/2006 19:29:12||Local control only allowed
17/11/2006 19:29:12||Listening on port 31416
17/11/2006 19:29:12|SETI@home|Resuming task 10jn03aa.8062.19266.804812.3.110_3 using setiathome_enhanced version 515
17/11/2006 19:29:12|rosetta@home|Deferring task FRA_t362_HOMOENV_hom001_8_t362_6_2gf6A_IGNORE_THE_REST_15_1398_4_0
17/11/2006 20:37:55||Suspending computation - running CPU benchmarks
17/11/2006 20:37:55|SETI@home|Pausing task 10jn03aa.8062.19266.804812.3.110_3 (removed from memory)
17/11/2006 20:37:55||Suspending network activity - running CPU benchmarks
17/11/2006 20:37:57||Running CPU benchmarks
17/11/2006 20:38:56||Benchmark results:
17/11/2006 20:38:56|| Number of CPUs: 1
17/11/2006 20:38:56|| 793 floating point MIPS (Whetstone) per CPU
17/11/2006 20:38:56|| 1411 integer MIPS (Dhrystone) per CPU
17/11/2006 20:38:56||Finished CPU benchmarks
17/11/2006 20:38:57||Resuming computation
17/11/2006 20:38:57||Rescheduling CPU: Resuming computation
17/11/2006 20:38:57||Resuming network activity
17/11/2006 20:38:57|SETI@home|Restarting task 10jn03aa.8062.19266.804812.3.110_3 using setiathome_enhanced version 515
Thanks, Ivor |
Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
Is this when you are running the Rosetta screensaver? Edit: my bad, you said it's when popping OUT of the screensaver? So you've set things up to not run when the user is active? Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
hugothehermit Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0 |
G'day Ivor. Disable the Rosetta@Home screensaver (right click the desktop -> Properties -> Screen Saver (tab) -> Screen saver (drop down box) and choose None), and see if that fixes the problem. It seems to me that Rosetta@Home is having problems with some of the ATI cards. I'm just guessing, so if you could tell me whether this worked it would be much appreciated. Thanks, Hugo. |
Keith Akins Joined: 22 Oct 05 Posts: 176 Credit: 71,779 RAC: 0 |
Actually you can include the Intel 82865G Extreme Graphics v2 onboard video. It ran 5.16 - 5.2X beautifully. When 5.3X - 5.4X came out, well, I had to set the screensaver to blank because one in three WUs was failing. I suspect some coding issues, maybe with the sidechain display, could be part of it. That's when it started on my box. |
MM Sihombing Joined: 22 May 06 Posts: 15 Credit: 1,424,082 RAC: 0 |
|
Ivor Cogdell Joined: 7 Nov 06 Posts: 10 Credit: 21,469 RAC: 0 |
Hi Gang, Some additional information. Setup is to run in the background, then activates screensaver at the stated time. I have now set the screensaver to none, as requested, to see if this alters anything. The lockup also occurs when seti@home is running as well, so it might be an underlying boinc problem. I am waiting to download the latest Einstein@home workunit to find out if that is affected too. |
Ivor Cogdell Joined: 7 Nov 06 Posts: 10 Credit: 21,469 RAC: 0 |
Hi gang, Just had an overnight seven hour run with screensaver turned off, no problem at all getting to rest of system this morning. Hope that narrows it down a bit. Only half a haystack to go through. Einstein@home workunit loaded, so I shall put screensaver back on and see if that has any effect. |
Ivor Cogdell Joined: 7 Nov 06 Posts: 10 Credit: 21,469 RAC: 0 |
Hi gang, Just had a lock up using rosetta, but I managed to get the task manager running by (control) (alt) (Delete) sequence. It stated that Rosetta was not responding. I ended the task, ran Boinc manager. Rosetta flagged a computational error and loaded another work unit and started on that. Are there any other debug logs that I can download to help?
20/11/2006 18:14:55||Starting BOINC client version 5.4.11 for windows_intelx86
20/11/2006 18:14:55||libcurl/7.15.3 OpenSSL/0.9.8a zlib/1.2.3
20/11/2006 18:14:55||Data directory: C:\Program Files\BOINC
20/11/2006 18:14:56||Processor: 1 GenuineIntel x86 Family 6 Model 8 Stepping 10 996MHz
20/11/2006 18:14:56||Memory: 510.48 MB physical, 1.22 GB virtual
20/11/2006 18:14:56||Disk: 128.00 GB total, 88.63 GB free
20/11/2006 18:14:56|rosetta@home|URL: https://boinc.bakerlab.org/rosetta/; Computer ID: 349316; location: home; project prefs: home
20/11/2006 18:14:56|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 25471; location: home; project prefs: default
20/11/2006 18:14:56|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 1638859; location: home; project prefs: default
20/11/2006 18:14:56||General prefs: from SETI@home (last modified 2006-11-17 01:34:09)
20/11/2006 18:14:56||General prefs: using separate prefs for home
20/11/2006 18:14:56||Local control only allowed
20/11/2006 18:14:56||Listening on port 31416
20/11/2006 18:14:56|SETI@home|Deferring task 10jn03aa.8062.19266.804812.3.110_3
20/11/2006 18:14:56|Einstein@Home|Deferring task l1_1383.0_S5R1__38_S5R1a_1
20/11/2006 18:14:57|rosetta@home|Resuming task DOC_1CSE_R061114_pose_u_global_search_1402_2152_0 using rosetta version 540
20/11/2006 18:14:58||Using earliest-deadline-first scheduling because computer is overcommitted.
20/11/2006 18:14:58||Suspending work fetch because computer is overcommitted.
20/11/2006 19:53:08||Rescheduling CPU: application exited
20/11/2006 19:53:08|rosetta@home|Computation for task DOC_1CSE_R061114_pose_u_global_search_1402_2152_0 finished
20/11/2006 19:53:09|Einstein@Home|Restarting task l1_1383.0_S5R1__38_S5R1a_1 using einstein_S5R1 version 424
20/11/2006 19:53:11|rosetta@home|Started upload of file DOC_1CSE_R061114_pose_u_global_search_1402_2152_0_0
20/11/2006 19:53:21|rosetta@home|Finished upload of file DOC_1CSE_R061114_pose_u_global_search_1402_2152_0_0
20/11/2006 19:53:21|rosetta@home|Throughput 24767 bytes/sec
20/11/2006 21:15:00||Allowing work fetch again.
20/11/2006 21:15:01|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
20/11/2006 21:15:01|rosetta@home|Reason: To fetch work
20/11/2006 21:15:01|rosetta@home|Requesting 43200 seconds of new work, and reporting 1 completed tasks
20/11/2006 21:15:06|rosetta@home|Scheduler request succeeded
20/11/2006 21:15:09|rosetta@home|Started download of file hom003_s014_.fasta.gz
20/11/2006 21:15:09|rosetta@home|Started download of file hom003_s014_.psipred_ss2.gz
20/11/2006 21:15:10|rosetta@home|Finished download of file hom003_s014_.fasta.gz
20/11/2006 21:15:10|rosetta@home|Throughput 470 bytes/sec
20/11/2006 21:15:10|rosetta@home|Finished download of file hom003_s014_.psipred_ss2.gz
20/11/2006 21:15:10|rosetta@home|Throughput 2968 bytes/sec
20/11/2006 21:15:10|rosetta@home|Started download of file boinc_hom003_aas014_03_05.200_v1_3.gz
20/11/2006 21:15:10|rosetta@home|Started download of file boinc_hom003_aas014_09_05.200_v1_3.gz
20/11/2006 21:15:15|rosetta@home|Finished download of file boinc_hom003_aas014_09_05.200_v1_3.gz
20/11/2006 21:15:15|rosetta@home|Throughput 51837 bytes/sec
20/11/2006 21:15:15|rosetta@home|Started download of file sg_target_description.txt
20/11/2006 21:15:17|rosetta@home|Finished download of file sg_target_description.txt
20/11/2006 21:15:17|rosetta@home|Throughput 330 bytes/sec
20/11/2006 21:15:18|rosetta@home|Finished download of file boinc_hom003_aas014_03_05.200_v1_3.gz
20/11/2006 21:15:18|rosetta@home|Throughput 136672 bytes/sec
20/11/2006 21:15:19||Rescheduling CPU: files downloaded
20/11/2006 21:15:19|Einstein@Home|Pausing task l1_1383.0_S5R1__38_S5R1a_1 (removed from memory)
20/11/2006 21:15:19|rosetta@home|Starting task s014__BOINC_ABRELAX_SAVE_ALL_OUT_hom003__1406_1937_0 using rosetta version 540
20/11/2006 21:15:22||Suspending work fetch because computer is overcommitted.
21/11/2006 00:11:02|rosetta@home|Unrecoverable error for result s014__BOINC_ABRELAX_SAVE_ALL_OUT_hom003__1406_1937_0 ( - exit code 1073807364 (0x40010004))
21/11/2006 00:11:02|rosetta@home|Deferring scheduler requests for 1 minutes and 0 seconds
21/11/2006 00:11:02||Rescheduling CPU: application exited
21/11/2006 00:11:02|rosetta@home|Computation for task s014__BOINC_ABRELAX_SAVE_ALL_OUT_hom003__1406_1937_0 finished
21/11/2006 00:11:02||Resuming round-robin CPU scheduling.
21/11/2006 00:11:02|Einstein@Home|Restarting task l1_1383.0_S5R1__38_S5R1a_1 using einstein_S5R1 version 424
21/11/2006 00:11:05||Allowing work fetch again.
21/11/2006 00:12:05|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
21/11/2006 00:12:05|rosetta@home|Reason: To fetch work
21/11/2006 00:12:05|rosetta@home|Requesting 43200 seconds of new work, and reporting 1 completed tasks
21/11/2006 00:12:10|rosetta@home|Scheduler request succeeded
21/11/2006 00:12:12|rosetta@home|Started download of file hom012_s018_.fasta.gz
21/11/2006 00:12:12|rosetta@home|Started download of file hom012_s018_.psipred_ss2.gz
21/11/2006 00:12:14|rosetta@home|Finished download of file hom012_s018_.fasta.gz
21/11/2006 00:12:14|rosetta@home|Throughput 435 bytes/sec
21/11/2006 00:12:14|rosetta@home|Finished download of file hom012_s018_.psipred_ss2.gz
21/11/2006 00:12:14|rosetta@home|Throughput 2442 bytes/sec
21/11/2006 00:12:14|rosetta@home|Started download of file boinc_hom012_aas018_03_05.200_v1_3.gz
21/11/2006 00:12:14|rosetta@home|Started download of file boinc_hom012_aas018_09_05.200_v1_3.gz
21/11/2006 00:12:18|rosetta@home|Finished download of file boinc_hom012_aas018_09_05.200_v1_3.gz
21/11/2006 00:12:18|rosetta@home|Throughput 52488 bytes/sec
21/11/2006 00:12:21|rosetta@home|Finished download of file boinc_hom012_aas018_03_05.200_v1_3.gz
21/11/2006 00:12:21|rosetta@home|Throughput 108299 bytes/sec
21/11/2006 00:12:23||Rescheduling CPU: files downloaded
21/11/2006 00:12:23||Using earliest-deadline-first scheduling because computer is overcommitted.
21/11/2006 00:12:23|Einstein@Home|Pausing task l1_1383.0_S5R1__38_S5R1a_1 (removed from memory)
21/11/2006 00:12:23|rosetta@home|Starting task s018__BOINC_ABRELAX_SAVE_ALL_OUT_hom012__1407_2120_0 using rosetta version 540
21/11/2006 00:12:27||Suspending work fetch because computer is overcommitted. |
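For anyone who wants to pick the failure out of a long message log like the one above, here is a rough Python sketch. It assumes each entry has the shape `date time|project|message`, which is only inferred from the lines quoted in this thread, and the names `parse_line` and `find_failures` are made up for illustration; they are not part of BOINC.

```python
import re
from datetime import datetime

# Pattern inferred from the "Unrecoverable error" entry quoted in this thread.
EXIT_RE = re.compile(
    r"Unrecoverable error for result (\S+) \( - exit code (-?\d+) \((0x[0-9a-fA-F]+)\)\)"
)


def parse_line(line):
    """Split one log entry into (timestamp, project, message)."""
    stamp, project, message = line.strip().split("|", 2)
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message


def find_failures(lines):
    """Return (timestamp, project, task, decimal code, hex code) for each failed task."""
    failures = []
    for line in lines:
        when, project, message = parse_line(line)
        match = EXIT_RE.search(message)
        if match:
            task, decimal_code, hex_code = match.groups()
            failures.append((when, project, task, int(decimal_code), hex_code))
    return failures


# Three entries copied from the log above.
sample_log = [
    "20/11/2006 21:15:22||Suspending work fetch because computer is overcommitted.",
    "21/11/2006 00:11:02|rosetta@home|Unrecoverable error for result "
    "s014__BOINC_ABRELAX_SAVE_ALL_OUT_hom003__1406_1937_0 ( - exit code 1073807364 (0x40010004))",
    "21/11/2006 00:11:02||Rescheduling CPU: application exited",
]
failures = find_failures(sample_log)
for when, project, task, decimal_code, hex_code in failures:
    print(when, project, task, decimal_code, hex_code)
```

Entries with an empty project field (the `||` ones) are general client messages, so the sketch simply returns an empty string for the project there.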
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
Hi gang,
The following advice may seem counterintuitive, and it is! If you don't have time to do all I am suggesting and you also don't want to leave your box idle, then what you did is a good quick way to get going again. I am not suggesting you did anything wrong; rather, I am trying to say that, if you can spare the extra time, doing the longer procedure given below might be of more help to the project.
It is better to exit boinc rather than use task manager (or on Linux top or kill) to end the rosetta process, even when it is the rosetta process causing the problem.
First, try forcing boinc to exit the normal way:
- file->exit for most people,
- but for those running boinc as a windows service, use control panel -> admin tools -> services -> right click boinc -> stop
- linux users will probably already know how to stop boinc on their installation. One of the many ways is to open a shell window, cd to the BOINC directory, and type ./boinc_cmd --quit
If that does not work then use task manager (top, kill) to kill the boinc process, still not the rosetta one.
However you ended boinc, please wait one minute after boinc is dead, then use task manager to see if rosetta is still there. If it is still running, only then use task manager to kill off the rosetta process.
The reason for this advice is that by doing things the more obvious way, as you did, the error report from the Rosetta app is about the fact that Rosetta was killed by intervention from the operating system, rather than about whatever went wrong in the first place. If you kill boinc, rosetta should die anyway after 30sec, which is why I suggest waiting 1min.
The same work will restart from the previous checkpoint when boinc is restarted. Sometimes it will then run OK, sometimes it will die again - either way this is useful info to be reported to the project team when the work is finally reported.
If rosetta dies a second time, I would again suggest you try not to use task manager to abort it, but restart boinc and use the abort facility built into boinc. Again this gives the project team a better idea of what went wrong.
Sometimes in the past the team have asked us to preserve files from such situations - usually now they already have enough debug info sent back in the stderr output. If you are asked to save files you will be told which ones, and then a good point in the above sequence to copy them is while boinc is not running.
R~~
edit: moved things around, added how to stop boinc on linux for benefit of other readers |
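The order of operations in the post above can be written down as a short Python sketch. This is only an illustration of the sequence, not a tool that ships with BOINC: `shut_down_in_order` and its callback parameters are hypothetical names, and the dry run uses an in-memory stand-in for the process list so that nothing real is stopped or killed.

```python
import time


def shut_down_in_order(quit_boinc, kill, is_running, wait_seconds=60, sleep=time.sleep):
    """Apply the stop order from the post and return the actions taken."""
    actions = ["asked boinc to exit"]
    quit_boinc()                 # 1. normal exit (File -> Exit, stop the service, ...)
    if is_running("boinc"):      # 2. only if that failed, kill boinc itself
        kill("boinc")
        actions.append("killed boinc")
    sleep(wait_seconds)          # 3. give rosetta a minute to notice and exit
    if is_running("rosetta"):    # 4. last resort: kill rosetta directly
        kill("rosetta")
        actions.append("killed rosetta")
    return actions


# Dry run: pretend the normal exit works and rosetta exits by itself while we wait.
running = {"boinc", "rosetta"}
log = shut_down_in_order(
    quit_boinc=lambda: running.discard("boinc"),
    kill=running.discard,
    is_running=lambda name: name in running,
    sleep=lambda seconds: running.discard("rosetta"),
)
print(log)
```

The callbacks are passed in so that the same order can be tried with whatever mechanism fits the machine (Task Manager by hand, a service stop, or a shell command) without the sketch claiming to know the right command for any particular installation.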
Ivor Cogdell Joined: 7 Nov 06 Posts: 10 Credit: 21,469 RAC: 0 |
Hi River and the gang, It is my usual policy to use the Boinc manager Exit before closing down windows, if I can gain access to it. I then try task manager next. I will aim for Boinc before Rosetta. |
hugothehermit Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0 |
21/11/2006 00:11:02|rosetta@home|Unrecoverable error for result s014__BOINC_ABRELAX_SAVE_ALL_OUT_hom003__1406_1937_0 ( - exit code 1073807364 (0x40010004))
Though note: it's a minus and yours isn't, but with the number corresponding I would hazard a guess that it's the same thing.
Unofficial BOINC Wiki Exit Code -1073807364 (0x40010004)
If you're not using the screensaver, I don't know, maybe a bug in BOINC version 5.4.11. I seem to remember that someone said one of the new BOINC versions used a different file name for one of its EXEs and was causing two BOINCs to load up, but that should be taken with a grain of salt. Though if you're clutching at straws you could have a look at the running processes with the task manager.
I use BOINC 5.4.9 with the screensaver set to none and it works, but like I said I don't know :? |
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
Though note: it's a minus and yours isn't, but with the number corresponding I would hazard a guess that it's the same thing.
Good spot Hugo! This is a documentation bug: the decimal value of this error code is positive and the wiki is wrong to show it as negative. So you are quite right to guess that this is the right interpretation of the code.
Fyi, hex numbers that begin with digits 0-7 are positive when converted to signed decimal values, those that begin 8-F are negative. (NB - you must make sure you have all the hex digits before applying this rule: 0xFFFF would be negative (-1) if it were a 16 bit value, but positive if it were shorthand for 0x0000FFFF)
Maybe someone with current wiki access could take out the - sign. Under the same rule, the minus sign is correct on the error codes above, which start 0xC. No doubt that is how the mistake crept in.
R~~ |
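The sign rule described above can be checked with a few lines of Python. This is a minimal sketch assuming the exit codes are 32-bit two's-complement values; `to_signed` is a hypothetical helper name, and 0xC0000005 is used only as an example of a code that starts with 0xC.

```python
def to_signed(value, bits=32):
    """Interpret an unsigned value of the given width as a two's-complement signed integer."""
    value &= (1 << bits) - 1           # keep exactly `bits` bits
    if value & (1 << (bits - 1)):      # top bit set: first hex digit is 8-F
        return value - (1 << bits)
    return value                       # top bit clear: first hex digit is 0-7


print(to_signed(0x40010004))           # the code from the log: starts with 4, so positive
print(to_signed(0xC0000005))           # starts with C, so negative
print(to_signed(0xFFFF, bits=16))      # all 16 bits set
print(to_signed(0xFFFF, bits=32))      # same digits read as 0x0000FFFF
```

The last two calls show why the width matters: the same four hex digits give -1 as a 16-bit value and 65535 as a 32-bit one.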
Ivor Cogdell Joined: 7 Nov 06 Posts: 10 Credit: 21,469 RAC: 0 |
Hi gang, This may fit into the dumb question category, but is there any way to check the status of the screensaver on a regular basis, to see when the problem occurs? In other words, does it lock up at the keypress or mouse input stage, or just after a set time or process? Just to clarify, Hugo: the screensaver was active (running Rosetta or Seti) when the lockup occurred. I mentioned both because it might be a factor. The screensaver is turned off at the moment, to preserve workunits, but I can put it back on for any tests anyone might like to suggest. |
ravens Joined: 25 Mar 06 Posts: 4 Credit: 85,122 RAC: 0 |
Hi gang, I had the same lockup problem if I had the Rosetta graphics on, while running BOINC 5.4.x, and I was also having issues getting CPU throttling to work. The throttling option was shown in 5.4.x, but did not actually throttle. I tried some later Beta versions; throttling worked in those (runs/suspends every second or so), but the tasks (both for Einstein and Rosetta) stopped running after a few minutes. Win XP Task Manager said it was still doing the running/suspended sequence, but the task was not running and showed no progress or CPU usage. In addition, I got the screen lockups with Rosetta graphics. I'm trying Beta 5.7.5 now: throttling works fine on my laptop, Rosetta and Einstein tasks both can run right through to completion, and if Rosetta graphics are on too, they do NOT cause a lockup. It is beta code, but I assume this fix will make it to the next official non-beta release. Mike |