Message boards : Number crunching : Problems with Rosetta version 5.98
Author | Message |
---|---|
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
FRA_t449_CASP8_MANUAL_1_IGNORE_THE_RESTt449_1_ttxxxxT0449_1CHIM_0001_0001_0001_4126_1926_1 errors Too many error results CPU time 36.42623 stderr out <core_client_version>5.10.13</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 10800 # random seed: 2406548 ERROR:: Exit from: .loop_relax.cc line: 1745 </stderr_txt> ]]> |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
FRA_t449_CASP8_MANUAL_1_IGNORE_THE_RESTt449_1_ttxxxxT0449_1CHIM_0001_0001_0001_4126_3868_1 errors Too many error results stderr out <core_client_version>5.10.13</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 10800 # random seed: 2404606 ERROR:: Exit from: .loop_relax.cc line: 1745 </stderr_txt> ]]> Validate state Invalid |
kb7rzf Joined: 7 Oct 05 Posts: 16 Credit: 35,427 RAC: 0 |
The 2nd result on this work unit also got the same error. Well, had 1 compute error since I started crunching again here. The wu is FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4163_1226 |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
rosetta@home\|Task t443_FULL_h001__CASP8_LONGRANGE_JUMP_SAVE_ALL_OUT_BARCODE__4133_145188_0 exited with a DLL initialization error. \|rosetta@home\|If this happens repeatedly you may need to reboot your computer. \|rosetta@home\|Restarting task t443_FULL_h001__CASP8_LONGRANGE_JUMP_SAVE_ALL_OUT_BARCODE__4133_145188_0 using rosetta_beta version 598 It was at 100% and 5 hrs, and when I opened the graphics window it just went blank. When I tried to close the window, it went into "not responding". I finally got the window to close, and the task reset itself to 42%. |
BrnmccO1 Joined: 26 Jun 07 Posts: 17 Credit: 578,825 RAC: 0 |
159639723 Compute error after a full run; the workunit also failed on someone else's host. Output file missing. <core_client_version>5.10.45</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 10800 # random seed: 2136320 ====================================================== DONE :: 1 starting structures 10289.6 cpu seconds This process generated 1 decoys from 1 attempts 0 starting pdbs were skipped ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> <message> <file_xfer_error> <file_name>FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4165_2909_0_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
This WU, FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4165_78, seemed to crunch correctly, then it bombed out with: <message><file_xfer_error> <file_name>FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4165_78_1_0</file_name> <error_code>-161</error_code> <error_message></error_message> </file_xfer_error> This WU did the same for the other cruncher as well. |
BrnmccO1 Joined: 26 Jun 07 Posts: 17 Credit: 578,825 RAC: 0 |
this FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4165_78 WU seemed to crunch correctly, then it bombed out with: I got the same -161 Output file missing error from one of my t453's as well. |
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Another 8 hrs wasted. Is this a BOINC, app, server, or workunit problem? Someone! Anyone! https://boinc.bakerlab.org/rosetta/workunit.php?wuid=159608923 7/6/2008 11:19:02 AM\|rosetta@home\|Output file for task FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4163_823_1 absent <core_client_version>5.10.30</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 21600 # random seed: 2143416 # cpu_run_time_pref: 21600 ====================================================== DONE :: 1 starting structures 29142 cpu seconds This process generated 10 decoys from 10 attempts 0 starting pdbs were skipped ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> <message> <file_xfer_error> <file_name>FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4163_823_1_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> pete. |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
Here are two more that crunched to completion, then bombed out with the -161 error for both me and the other cruncher: FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4163_1501 FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4163_1279 |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
t451_M4_grishin_IGNORE_THE_REST_renumbered_4150_1393_0 Outcome Validate error CPU time 10275.77 stderr out <core_client_version>5.10.13</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 10800 # </stderr_txt> ]]> Validate state Invalid Claimed credit 42.7165467457025 Granted credit 0 application version 5.98 |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
The Server Status page currently says that the validator is not running. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
The Server Status page currently says that the validator is not running. I've emailed the Project Team about this. Thanks for pointing it out. Rosetta Moderator: Mod.Sense |
Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
Thanks... we changed the validator code earlier today after testing on RALPH, but there's clearly still an issue! I've contacted DK to revert to the old code. The Server Status page currently says that the validator is not running. |
TeAm Enterprise Joined: 28 Sep 05 Posts: 18 Credit: 27,911,735 RAC: 125 |
You have bigger problems than the validator code. Since 5.96 I have never had more problems with errors. I have just aborted all the T484 WUs since these don't work on my machine. Are you folks getting any science, or just problems? Jim Crunch with friends - TeAm Anandtech |
Alan Roberts Joined: 7 Jun 06 Posts: 61 Credit: 6,901,926 RAC: 0 |
I've had so many problems with Mini (see this post) that I've had to resort to filtering it off of quite a few of my dual-core/dual-CPU machines. This morning I walked into my home's listening room to find my recycled laptop, low-power music server (that to date has happily consumed anything Rosetta sent its way) making excessive noise. Checking, I found this 5.98 WU stuck at 100% CPU, even though the machine's preferences were set for a max of 70% of CPU (BOINC 5.10.45, and the BOINC CPU setting has been honored in the past). Within BOINC, CPU time used and progress were **not** advancing; the job was sitting at 20-something percent progress. Suspending the project did not suspend the job. Shutting down the BOINC service did. Ran a round of Windows updates and rebooted. The work unit restarted and ran with CPU throttling for about 10 minutes, then locked up at 100% again. This time I aborted the task ... I believe the first time across any of the machines on my team that I've had to abandon a 5.98 work unit. The worst news for me is that the long (possibly better part of two days) non-cycling fan run seems to have put the fan into a permanent high-noise mode. I've got a spare fan assembly, but won't really enjoy the time to tear down and reassemble the unit this weekend. I guess I'll reinstall and set up Threadmaster, since the BOINC/Rosetta combination seems to be trending towards less operational reliability. I know everyone is busy with CASP, but I have to emphasize that this is important to me, and I assume to others who are trying to contribute with machines that are not dedicated crunchers. Most of the machines on my team are there because I committed to not loading the machine during business hours (time-of-day and when needed manual suspends) and not overheating the machine (CPU limits). If I can't reliably do this with minimal ongoing effort I'll end up having to pull machines off the project. |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
errors Too many error results FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4165_2851_0 CPU time 8831.965 stderr out <core_client_version>5.10.13</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 10800 # random seed: 2136378 ====================================================== DONE :: 1 starting structures 8831.64 cpu seconds This process generated 4 decoys from 4 attempts 0 starting pdbs were skipped ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> <message> <file_xfer_error> <file_name>FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4165_2851_0_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Validate state Invalid |
Harwood Joined: 15 Nov 05 Posts: 1 Credit: 1,789,800 RAC: 0 |
I am running on an AMD Athlon in Win Server 2k8 with rosetta_beta_5.98_windows_x86_64.exe and I noticed in the task manager that it is running in 32-bit compatibility. Could it be that a flag is thrown and this is a 64-bit app? Regardless, we are not getting the performance for the project. It's running, but it could be better. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I am running on an AMD Athlon in Win Server 2k8 with rosetta_beta_5.98_windows_x86_64.exe and I noticed in the task manager that it is running in 32-bit compatibility. Could it be that a flag is thrown and this is a 64-bit app? Regardless, we are not getting the performance for the project. It's running, but it could be better. At this point, there is no true 64-bit application. The Project Team is aware of the performance implications of that fact. Rosetta Moderator: Mod.Sense |
Azurrio Joined: 20 Feb 06 Posts: 8 Credit: 237,979 RAC: 0 |
Compute/validate error on this |
Alberthuang Joined: 5 Dec 05 Posts: 6 Credit: 182,638 RAC: 754 |
My computer's OS is Windows XP SP3, running BOINC manager version 5.10.45. It computed the workunit n004__BOINC_SYMMETRY_C4SYMM_FOLD_AND_DOCK_RELAX-n004_-t484__4207_923 with Rosetta beta version 5.98, and showed a compute error after a full run. At the same time, a Windows message reported a C++ Runtime error, and the output file n004__BOINC_SYMMETRY_C4SYMM_FOLD_AND_DOCK_RELAX-n004_-t484__4207_923_0_0 for this task was missing. The task details are as follows: Task ID 176331854 Name n004__BOINC_SYMMETRY_C4SYMM_FOLD_AND_DOCK_RELAX-n004_-t484__4207_923_0 Workunit 160936951 Created 9 Jul 2008 2:51:35 UTC Sent 9 Jul 2008 2:52:15 UTC Received 18 Jul 2008 10:17:47 UTC Server state Over Outcome Client error Client state Done Exit status 3 (0x3) Computer ID 224205 Report deadline 19 Jul 2008 2:52:15 UTC CPU time 14199.22 stderr out <core_client_version>5.10.45</core_client_version> <![CDATA[ <message> The system cannot find the path specified. (0x3) - exit code 3 (0x3) </message> <stderr_txt> # cpu_run_time_pref: 14400 # random seed: 1315613 </stderr_txt> ]]> Validate state Invalid Claimed credit 27.2519995630333 Granted credit 0 application version 5.98 Before this workunit crashed, the BOINC manager downloaded a workunit with Rosetta beta version 5.98. At the same time, two older Rosetta@home files were deleted when the BOINC manager received a server request from Rosetta@home! I wonder whether this workunit's crash was connected to the deletion of those two files. |
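
For anyone tallying the reports in this thread, the pasted stderr snippets share a few recurring fields: the client version, the exit code, the abort location, and the `-161` file transfer error that posters here describe as "output file missing". Below is a minimal Python sketch for pulling those fields out of a pasted snippet. The function name `summarize_stderr`, the dictionary keys, and the regular expressions are illustrative assumptions based only on the text quoted above; this is not a BOINC or Rosetta tool.

```python
import re

# Sample text condensed from one of the reports quoted in this thread.
SAMPLE = """<core_client_version>5.10.45</core_client_version>
<stderr_txt> # cpu_run_time_pref: 10800 # random seed: 2136320
DONE :: 1 starting structures 10289.6 cpu seconds
This process generated 1 decoys from 1 attempts
</stderr_txt>
<message> <file_xfer_error>
<file_name>FRA_t453_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4165_2909_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error> </message>"""


def summarize_stderr(text):
    """Extract a few fields from a pasted result snippet (hypothetical helper).

    Any field that does not appear in the text is returned as None.
    """
    def first(pattern, cast=str):
        # Return the first capture group of the first match, converted by cast.
        match = re.search(pattern, text)
        return cast(match.group(1)) if match else None

    return {
        "client_version": first(r"<core_client_version>([^<]+)</core_client_version>"),
        "exit_code": first(r"exit code (-?\d+)", int),
        "abort_location": first(r"ERROR:: Exit from: (\S+ line: \d+)"),
        "xfer_error_code": first(r"<error_code>(-?\d+)</error_code>", int),
        "xfer_file_name": first(r"<file_name>([^<]+)</file_name>"),
        "decoys": first(r"generated (\d+) decoys", int),
        "cpu_seconds": first(r"(\d+(?:\.\d+)?) cpu seconds", float),
    }


if __name__ == "__main__":
    for key, value in summarize_stderr(SAMPLE).items():
        print(f"{key}: {value}")
```

Running the function over each pasted report and grouping by `xfer_error_code` or `abort_location` would make it easier to see which failures share a cause, for example the `-161` reports on the t453 workunits versus the `loop_relax.cc` exits on t449.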
©2024 University of Washington
https://www.bakerlab.org