minirosetta 2.05

Message boards : Number crunching : minirosetta 2.05

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 65038 - Posted: 19 Jan 2010, 0:17:29 UTC
Last modified: 19 Jan 2010, 0:20:37 UTC

{wrong version area for my posts}
ID: 65038 · Rating: 0
namtraf
Joined: 6 Jun 06
Posts: 1
Credit: 535,242
RAC: 0
Message 65040 - Posted: 19 Jan 2010, 5:49:43 UTC

minirosetta 2.05 hangs frequently on my computer, a Windows Vista machine. When it hangs, the CPU meter shows no activity, the time to completion increments instead of decrementing, and the R@h screen saver is blank. I've shut my machine off and on 3 times, and R@h runs normally after each restart: the CPU meter shows activity, the time to completion decrements (it has dropped from more than 10 hours to around 2 hours), and the screen saver works. After a few hours of running, R@h hangs again. This started happening in the second week of January; my machine was off from December 18 to January 8.
ID: 65040 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 65043 - Posted: 19 Jan 2010, 14:40:44 UTC - in response to Message 65034.  

Hello,

based on the reports of validator issues, David Kim has now fixed the validator. He also asked me to remind people that credit is granted based on the client's claimed credit, regardless of validator results.

Let us know if you see more such problems.

Thanks, Sarel.


If I have my facts straight, Sarel means to say that credit is issued as normal. This means based on the average credit claims PER MODEL of the tasks reported before yours. This is a bit odd for Sarel's tasks because, as he's been explaining, there is a new technique where a quick cursory review of a given model is performed, and then some small percentage of those are deemed worth a more detailed review. And so model runtimes can vary from around 60 seconds, to several hours. So you will see credit all over the map. But it seems that on average most tasks spend the majority of their time crunching on one low level model, and so over time credit is still comparable with other types of Rosetta work.

If you somehow run through 60 models, and none require low level analysis, and you only allow a 1hr runtime preference, then you would probably see considerably more credit granted than your claim. As I say, this would be rather rare. If you run for a 24hr runtime preference, then you'll probably see several low level models. But then that is over a longer period of crunching too. But once you've run through several such tasks the credit will average out, as it always does.
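A rough sketch of the averaging described above may help. It assumes that granted credit is the pooled average of earlier claims per model, multiplied by the number of models in your own task. The function name `granted_credit`, the pooling rule, and the sample numbers are all hypothetical; none of this is taken from the project's server code.

```python
def granted_credit(prior_claims, my_models):
    """Estimate granted credit from earlier reports of the same batch.

    prior_claims: list of (claimed_credit, models_completed) tuples from
                  tasks reported before yours (hypothetical input format).
    my_models:    number of models your own task completed.
    """
    total_credit = sum(credit for credit, _ in prior_claims)
    total_models = sum(models for _, models in prior_claims)
    if total_models == 0:
        # Assumption: with no earlier reports there is nothing to average.
        return None
    per_model = total_credit / total_models
    return per_model * my_models


# Earlier tasks spent most of their time on a few long models.
earlier = [(20.0, 2), (30.0, 3)]      # 10 credits per model on average
print(granted_credit(earlier, 60))    # a task with 60 quick models
```

With these made-up numbers, a short task that happened to finish 60 quick models would be granted far more than it claimed, which is the rare case described above.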
Rosetta Moderator: Mod.Sense
ID: 65043 · Rating: 0
Sarel
Joined: 11 May 06
Posts: 51
Credit: 81,712
RAC: 0
Message 65044 - Posted: 19 Jan 2010, 19:33:12 UTC

Thanks RosettaMod for the clarification!

On another note, I've isolated why on restart the *gnb* runs report starting over from model 1. The fix for this will be part of the next update of the minirosetta application. Despite the confusion, the models that we get are unharmed and credit is allocated correctly.

Many thanks to the users who reported this for another bug catch!
ID: 65044 · Rating: 0
macko
Joined: 25 Jun 09
Posts: 32
Credit: 153,495
RAC: 0
Message 65047 - Posted: 20 Jan 2010, 13:28:01 UTC


Hi

These WUs, "8gbnnotyr" and the older "dock" types, won't be listed on the results pages?

With regards
ID: 65047 · Rating: 0
Sarel
Joined: 11 May 06
Posts: 51
Credit: 81,712
RAC: 0
Message 65048 - Posted: 20 Jan 2010, 16:57:44 UTC - in response to Message 65047.  

Could you elaborate on what you're seeing? These types of jobs are treated the same as others in these respects.


Hi

These WUs, "8gbnnotyr" and the older "dock" types, won't be listed on the results pages?

With regards


ID: 65048 · Rating: 0
macko
Joined: 25 Jun 09
Posts: 32
Credit: 153,495
RAC: 0
Message 65049 - Posted: 20 Jan 2010, 21:07:17 UTC - in response to Message 65048.  

Could you elaborate on what you're seeing? These types of jobs are treated the same as others in these respects.


Hi

These WUs, "8gbnnotyr" and the older "dock" types, won't be listed on the results pages?

With regards


Hi

There were some WUs not shown on the results page. Here is a small (incomplete) collection from recent months:
aTt13
histone
1 famA
foldit WUs
denovo_design_rossmann2x3_flxbb (really RAM-hungry and long-running ones)
NeR103A
CGR26A
and finally these two from 2010: CtR69A_2KRU_BOINC_ABRELAX, 3gbn bla-bla&gz_dock

And now the 8gbnnotyr WUs seem to have a similar fate, crunching only for credit.

ID: 65049 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 65050 - Posted: 21 Jan 2010, 6:47:12 UTC

This only ran for 19 min, no idea what happened.

boinc.loopbuild_threading_hb_2kruA_IGNORE_THE_REST_17084_3403_0

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=284668104

# cpu_run_time_pref: 14400
======================================================
DONE :: 5 starting structures 1201 cpu seconds
This process generated 5 decoys from 5 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

Over__Validate error__Done__1,152.13
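To compare a run like this against the run-time preference, the stderr summary lines can be pulled apart with a few regular expressions. This is only a sketch: the function name `summarize_stderr` is made up for illustration, and it assumes the log lines keep the wording shown in the excerpt above.

```python
import re


def summarize_stderr(text):
    """Extract run-time preference, CPU time and decoy counts from a stderr excerpt."""
    pref = re.search(r"cpu_run_time_pref:\s*(\d+)", text)
    done = re.search(
        r"DONE ::\s*(\d+)\s+starting structures\s+([\d.]+)\s+cpu seconds", text
    )
    made = re.search(r"generated\s+(\d+)\s+decoys from\s+(\d+)\s+attempts", text)
    if not (pref and done and made):
        return None  # excerpt does not contain all three summary lines
    pref_s = int(pref.group(1))
    cpu_s = float(done.group(2))
    return {
        "pref_seconds": pref_s,
        "structures": int(done.group(1)),
        "cpu_seconds": cpu_s,
        "decoys": int(made.group(1)),
        "attempts": int(made.group(2)),
        "fraction_of_pref": cpu_s / pref_s,
    }


sample = """# cpu_run_time_pref: 14400
DONE :: 5 starting structures 1201 cpu seconds
This process generated 5 decoys from 5 attempts"""

summary = summarize_stderr(sample)
print(summary["cpu_seconds"] / 60, "minutes of", summary["pref_seconds"] / 3600, "hours")
```

For the task above, 1201 CPU seconds is about 20 minutes, roughly 8% of the 4-hour (14400 s) preference.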

ID: 65050 · Rating: 0
AMD_is_logical
Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 65053 - Posted: 21 Jan 2010, 15:24:13 UTC

Here are a couple of relaxopt_grow WUs that exited after a few seconds with the error:

ERROR: LoopRebuild::ERROR Loop definition out of boundary
ERROR:: Exit from: src/protocols/loops/Loops.cc line: 595
BOINC:: Error reading and gzipping output datafile: default.out

https://boinc.bakerlab.org/rosetta/result.php?resultid=312109858
https://boinc.bakerlab.org/rosetta/result.php?resultid=312068049
ID: 65053 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 65058 - Posted: 22 Jan 2010, 6:49:19 UTC

This ran for 1 hr 30 min, then fell over.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=284866813

tyrsim_3gbn_2c2p_20Jan2010_17119_14_0

<message>
process exited with code 193 (0xc1, -63)
</message>

# cpu_run_time_pref: 14400
SIGSEGV: segmentation violation
Stack trace (64 frames):
[0x96c49b3]
[0x96ee888]
[0xb7fad420]

JUST A FEW OF THEM.

ID: 65058 · Rating: 0
TomaszPawel
Joined: 28 Apr 07
Posts: 54
Credit: 2,791,145
RAC: 0
Message 65068 - Posted: 22 Jan 2010, 18:51:02 UTC - in response to Message 65058.  

ID: 65068 · Rating: 0
Admin
Joined: 13 Apr 07
Posts: 42
Credit: 260,782
RAC: 0
Message 65069 - Posted: 22 Jan 2010, 19:03:58 UTC

Compute Error

relaxopt_grow.1bk2.1bk2.IGNORE_THE_REST.S_00066_0000013_0_0000_noncon_00066.pdb.JOB_16957_8

ERROR: LoopRebuild::ERROR Loop definition out of boundary

ERROR:: Exit from: ..\..\src\protocols\loops\Loops.cc line: 595
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
ID: 65069 · Rating: 0
macko
Joined: 25 Jun 09
Posts: 32
Credit: 153,495
RAC: 0
Message 65070 - Posted: 22 Jan 2010, 19:13:26 UTC - in response to Message 65053.  

Here are a couple of relaxopt_grow WUs that exited after a few seconds with the error:

ERROR: LoopRebuild::ERROR Loop definition out of boundary
ERROR:: Exit from: src/protocols/loops/Loops.cc line: 595
BOINC:: Error reading and gzipping output datafile: default.out

https://boinc.bakerlab.org/rosetta/result.php?resultid=312109858
https://boinc.bakerlab.org/rosetta/result.php?resultid=312068049


Same error, same WUs: relaxopt_grow.1ctf.1ctf.
ID: 65070 · Rating: 0
svincent
Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 65072 - Posted: 22 Jan 2010, 19:48:39 UTC

Another "Loop definition out of boundary" error as reported by others. On Mac OS X 10.6.

Task : 312278520
Name : relaxopt_grow.1c9o.1c9o.IGNORE_THE_REST.S_00082_0000671_0.pdb.JOB_16963_8_0

ERROR: LoopRebuild::ERROR Loop definition out of boundary

ERROR:: Exit from: src/protocols/loops/Loops.cc line: 595
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>
ID: 65072 · Rating: 0
Mad_Max
Joined: 31 Dec 09
Posts: 209
Credit: 26,262,530
RAC: 19,111
Message 65077 - Posted: 23 Jan 2010, 14:49:56 UTC - in response to Message 65034.  

Hello,

based on the reports of validator issues, David Kim has now fixed the validator. He also asked me to remind people that credit is granted based on the client's claimed credit, regardless of validator results.

Let us know if you see more such problems.

Thanks, Sarel.


The last 20 tasks on my computer were completed without any validation errors.
(Among them were *gbnnotyr* tasks and tasks that were restarted during execution.)
So it seems this problem is solved.
ID: 65077 · Rating: 0
Mad_Max
Joined: 31 Dec 09
Posts: 209
Credit: 26,262,530
RAC: 19,111
Message 65078 - Posted: 23 Jan 2010, 15:00:24 UTC - in response to Message 65044.  

Thanks RosettaMod for the clarification!

On another note, I've isolated why on restart the *gnb* runs report starting over from model 1. The fix for this will be part of the next update of the minirosetta application. Despite the confusion, the models that we get are unharmed and credit is allocated correctly.

Many thanks to the users who reported this for another bug catch!


Besides the *gnb* WUs, the bug where only 1 model is shown after a restart also occurs in many other types of tasks. There it does not seem to affect the results sent to the server, only the model display in the graphics. So it is not a significant error. Does it make sense to report such cases?
ID: 65078 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 65081 - Posted: 23 Jan 2010, 22:45:11 UTC

Here's another one of these, zero run time.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=284355878

homopt_nat2.t331_.t331_.IGNORE_THE_REST.S_00004_0000011_06.pdb_00004.pdb.JOB_16832_30_1

<message>
process exited with code 1 (0x1, -255)
</message>

ERROR: No values of the appropriate type specified for multi-valued option -loops:loop_file
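Several distinct failures have now been reported in this thread, so it can help to sort stderr excerpts by their error message before reporting. The sketch below is illustrative only: the signature strings are copied from the excerpts posted above, while the labels and the function names `classify` and `tally` are made up here.

```python
from collections import Counter

# Error signatures copied from stderr excerpts posted in this thread.
# The labels on the left are hypothetical names chosen for this sketch.
SIGNATURES = {
    "loop_out_of_boundary": "Loop definition out of boundary",
    "segfault": "SIGSEGV: segmentation violation",
    "missing_loop_file": "multi-valued option -loops:loop_file",
}


def classify(stderr_text):
    """Return the labels of all known signatures found in one stderr excerpt."""
    return [label for label, needle in SIGNATURES.items() if needle in stderr_text]


def tally(reports):
    """Count how often each signature appears across a list of stderr excerpts."""
    counts = Counter()
    for text in reports:
        counts.update(classify(text) or ["unrecognized"])
    return counts


reports = [
    "ERROR: LoopRebuild::ERROR Loop definition out of boundary",
    "SIGSEGV: segmentation violation\nStack trace (64 frames):",
    "ERROR: No values of the appropriate type specified for multi-valued option -loops:loop_file",
    "ERROR: LoopRebuild::ERROR Loop definition out of boundary",
]
print(tally(reports))
```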

ID: 65081 · Rating: 0
Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 65091 - Posted: 24 Jan 2010, 21:45:03 UTC

This one failed after about 40 seconds
cst2.loopbuild_threading_hb_i1705_IGNORE_THE_REST_17160_389_0

- exit code -1073741819 (0xc0000005)

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0054FC53 read attempt to address 0xFFFFFFC0
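The two numbers in this report are the same value in different notations: -1073741819 is 0xc0000005 read as a signed 32-bit integer. A small sketch of the conversion follows; the helper names are made up for illustration. The last lines check a pattern seen in two earlier reports in this thread, where the third number in the "process exited with code" message equals the code minus 256. That is only an observation from those two examples, not documented behaviour.

```python
def to_unsigned32(code):
    """Reinterpret an exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def to_signed32(code):
    """Reinterpret an exit code as a signed 32-bit value."""
    code &= 0xFFFFFFFF
    return code - (1 << 32) if code & 0x80000000 else code


print(hex(to_unsigned32(-1073741819)))  # 0xc0000005
print(to_signed32(0xC0000005))          # -1073741819

# Pairs taken from earlier reports in this thread: (code, third number shown).
for code, shown in [(193, -63), (1, -255)]:
    assert code - 256 == shown
```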


ID: 65091 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 65093 - Posted: 25 Jan 2010, 1:59:23 UTC - in response to Message 65044.  
Last modified: 25 Jan 2010, 1:59:57 UTC

Thanks RosettaMod for the clarification!

On another note, I've isolated why on restart the *gnb* runs report starting over from model 1. The fix for this will be part of the next update of the minirosetta application. Despite the confusion, the models that we get are unharmed and credit is allocated correctly.

Many thanks to the users who reported this for another bug catch!

=================================================================================

I'm assuming that these two tasks have been affected by this bug. Both had a few hundred models showing in the graphics before the rig was rebooted, then they were gone. The credits are O.K.

tyrsim_3gbn_2znr_20Jan2010_17119_66_0

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=285166061

# cpu_run_time_pref: 14400
======================================================
DONE :: 2 starting structures 13670.3 cpu seconds
This process generated 2 decoys from 2 attempts
======================================================

--------------------------------------------------------------
tyrsim_3gbn_1s2x_20Jan2010_17119_291_0

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=285312382

# cpu_run_time_pref: 14400
======================================================
DONE :: 2 starting structures 9627.95 cpu seconds
This process generated 2 decoys from 2 attempts
======================================================
ID: 65093 · Rating: 0
Rabinovitch
Joined: 28 Apr 07
Posts: 28
Credit: 5,439,728
RAC: 0
Message 65095 - Posted: 25 Jan 2010, 3:44:38 UTC - in response to Message 64974.  

The new app is working well. It seems that the WUs now need less RAM (about 100 MB per WU). Is that true? If so, maybe this is a step toward Rosetta's GPU client? :-)


Well, now I see two WUs being processed; one is consuming about 510 MB of RAM and the other about 480 MB. I like such heavy WUs, give me more please! :-)
ID: 65095 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org