Granted Credit taking forever....

Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 63215 - Posted: 9 Sep 2009, 0:07:23 UTC

I have a lot of work units that have been reported but are still awaiting "granted" credit. This is strange because normally the longest I have ever seen Rosetta take to tabulate the granted credit is about half an hour. Some of these items are taking all day. And for some reason I have these really short one-hour work units even though my preference setting is for 8 hours. Anybody know what is going on?
ID: 63215 · Rating: 0
Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 63218 - Posted: 9 Sep 2009, 5:25:28 UTC
Last modified: 9 Sep 2009, 5:30:46 UTC

I'm not the only one this is happening to; there are other users whose finished work units are also waiting for granted credit. My RAC is starting to plunge because finished work units that have been uploaded are not being granted any credit; they are stuck in "pending" limbo.
ID: 63218 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2127
Credit: 41,266,340
RAC: 8,573
Message 63220 - Posted: 9 Sep 2009, 6:48:55 UTC

I've seen it too. One of my team-mates is waiting up to 8 hours.

RAC will come back of course - no need to concern yourself with that once they've all been validated.
ID: 63220 · Rating: 0
Mark Brown
Joined: 8 Aug 09
Posts: 21
Credit: 602,685
RAC: 0
Message 63226 - Posted: 9 Sep 2009, 12:33:59 UTC - in response to Message 63220.  

I've seen it too. One of my team-mates is waiting up to 8 hours.

RAC will come back of course - no need to concern yourself with that once they've all been validated.


The latest system I put online has 20+ pending with only 2 showing any granted credit. This seems to be the pattern for my other systems too.


ID: 63226 · Rating: 0
Mark Brown
Joined: 8 Aug 09
Posts: 21
Credit: 602,685
RAC: 0
Message 63229 - Posted: 9 Sep 2009, 19:00:47 UTC - in response to Message 63226.  

I have 82 tasks pending now. My average has dropped 80 points today!
ID: 63229 · Rating: 0
John Hunt
Joined: 18 Sep 05
Posts: 446
Credit: 200,755
RAC: 0
Message 63230 - Posted: 9 Sep 2009, 19:47:52 UTC

Seems to be a backlog in the validator.

Don't panic; all your pending credits are 'money in the bank'.




ID: 63230 · Rating: 0
Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 63233 - Posted: 9 Sep 2009, 23:42:19 UTC

It's sure causing a drop in the teraflops estimate on the main page as well.
ID: 63233 · Rating: 0
MarcoA
Joined: 2 Sep 08
Posts: 9
Credit: 777,433
RAC: 0
Message 63236 - Posted: 10 Sep 2009, 9:39:39 UTC

Same here... since yesterday, all of my WUs have stayed pending.

But the Server Status (https://boinc.bakerlab.org/rosetta/rah_status.php) lists every program as running and operating normally...

I took a look at some of the pending WUs and most of them have a name like
1XXX (XXX being three different capital letters)
or
lr5_combine(_smooth_torsion|_mod)

Furthermore, almost every pending WU (about 90%) ran much shorter than the run time I asked for (12 hours).

Perhaps there is something wrong with a new batch of WUs?
ID: 63236 · Rating: 0
Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 63239 - Posted: 10 Sep 2009, 10:32:22 UTC

It seems to be a problem with the validator, and it is affecting everybody; look at the main teraflops estimate on the front page.
(I think it was that Romulan Nero! He sabotages EVERYTHING. ;-) )
ID: 63239 · Rating: 0
John Hunt
Joined: 18 Sep 05
Posts: 446
Credit: 200,755
RAC: 0
Message 63249 - Posted: 10 Sep 2009, 18:09:37 UTC

A message has been posted on the front page:


Sep 10, 2009
The validator and scheduler servers are currently slowly processing a large work unit. We have reprioritized the WU after finding that it is causing server problem. However, it will take a while for the existing jobs to clean out. Meanwhile, server lags are expected.



ID: 63249 · Rating: 0
Yifan Song
Volunteer moderator
Project developer
Project scientist
Joined: 26 May 09
Posts: 62
Credit: 7,322
RAC: 0
Message 63250 - Posted: 10 Sep 2009, 18:39:24 UTC

Hi guys. Sorry about all the lags. The job I sent was a lot more IO intensive than I had expected.
ID: 63250 · Rating: 0
rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 63254 - Posted: 10 Sep 2009, 20:28:26 UTC - in response to Message 63250.  

Hi guys. Sorry about all the lags. The job I sent was a lot more IO intensive than I had expected.


Thanks for the info. It's OK with me as long as the project is going forward.
ID: 63254 · Rating: 0
Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 63265 - Posted: 11 Sep 2009, 4:14:13 UTC

Great...now how am I supposed to win the Golden Chromosome Award back from Chilean? ;)
ID: 63265 · Rating: 0
Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 63279 - Posted: 11 Sep 2009, 18:10:08 UTC
Last modified: 11 Sep 2009, 18:10:47 UTC

I know it sounds bad, but I'm tempted to stop crunching for Rosetta for a while and put my computer fully over to WCG to help reduce the number of finished work units that the validator has to catch up with. Anyone want to talk me out of that idea?
ID: 63279 · Rating: 0
Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 63280 - Posted: 11 Sep 2009, 18:15:27 UTC

You could download several more days' worth of units, or else increase the number of hours per work unit, and then suspend network activity.
ID: 63280 · Rating: 0
Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 63281 - Posted: 11 Sep 2009, 18:35:01 UTC
Last modified: 11 Sep 2009, 18:37:41 UTC

But that would just flood their servers when I report all those finished tasks. And how the heck do you get 0 granted credit anyway? I know they say "cancelled" in the error section, so they are screw-ups, right?
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=254777531
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=254958434
ID: 63281 · Rating: 0
Path7
Joined: 25 Aug 07
Posts: 128
Credit: 61,751
RAC: 0
Message 63284 - Posted: 11 Sep 2009, 20:59:25 UTC

Hello Gen_X_Accord,
Evan's idea looks good to me.
Don't think you will flood the servers:
221,150 WUs a day = 9,215 WUs/hour = 2.56 WUs each second
How many seconds do you need to upload / report a single WU? The server will have handled lots of other WUs in the meantime.
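
For anyone who wants to check the arithmetic above, here is a minimal Python sketch. The 221,150 WUs per day is the figure quoted in this post; the function name and the 5-second report time are hypothetical, picked only to illustrate the point, not measured values.

```python
HOURS_PER_DAY = 24
SECONDS_PER_HOUR = 3600


def throughput(wus_per_day: float) -> tuple[float, float]:
    """Convert a daily WU total into (WUs per hour, WUs per second)."""
    per_hour = wus_per_day / HOURS_PER_DAY
    per_second = per_hour / SECONDS_PER_HOUR
    return per_hour, per_second


# Daily total quoted in the post above.
per_hour, per_second = throughput(221_150)
print(f"{per_hour:.0f} WUs/hour, {per_second:.2f} WUs/second")

# Hypothetical assumption: reporting one WU takes about 5 seconds.
report_seconds = 5
handled_meanwhile = per_second * report_seconds
print(f"WUs handled while one report is in flight: {handled_meanwhile:.1f}")
```

Under that assumed 5 seconds, the server would be dealing with roughly a dozen other WUs in the time one report takes, so one host reporting a backlog is a small share of the load.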

About choosing between Rosetta & WCG: personally I think there is no wrong choice to make; it's up to you.

"Cancelled" in the error section: yes, that is weird. I'll ask about that.

Have a nice day,
Path7.
ID: 63284 · Rating: 0
Trotador
Joined: 30 May 09
Posts: 108
Credit: 291,214,977
RAC: 0
Message 63286 - Posted: 11 Sep 2009, 21:44:08 UTC

Yes, a lot of units with 0 granted credit here too (17 so far), with the same cancelled statement in the error section. Standing by until I learn whether it is my fault or the server's.
ID: 63286 · Rating: 0
Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 63290 - Posted: 11 Sep 2009, 23:33:40 UTC

Those 0-credit work units were granted credit at some point today. They are strange ones. The work units only ran for about an hour, they had really weird names, and the graphics were strange-looking too; maybe that is why they show up like they do.
ID: 63290 · Rating: 0
Yifan Song
Volunteer moderator
Project developer
Project scientist
Joined: 26 May 09
Posts: 62
Credit: 7,322
RAC: 0
Message 63291 - Posted: 11 Sep 2009, 23:44:50 UTC

The jobs that were canceled are the ones that created the IO problem for the server. DEK and I thought that if we removed the job, it would stop the validator server from processing it, but it turned out it didn't. So we'll have to wait for the server to finish processing the rest of the data.
ID: 63291 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org