SERVER PROBLEMS.

Message boards : Number crunching : SERVER PROBLEMS.

Sid Celery
Joined: 11 Feb 08
Posts: 2130
Credit: 41,424,155
RAC: 16,102
Message 62324 - Posted: 21 Jul 2009, 2:04:39 UTC - in response to Message 62323.  

As of 20 Jul 2009 23:19:13 UTC (updated every 10 minutes)
Ready to send 5

I have the same problem.

I'm struggling, like most people, but a few WUs are sneaking through from time to time. Current ready to send figure is 3030. I don't think I'm very far short of a full buffer (1.5 days, completing about 30/day).

Has anyone actually run out of work? I suspect not many quite yet.
ID: 62324 · Rating: 0
tralala
Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 62328 - Posted: 21 Jul 2009, 6:10:35 UTC

I have run out of work, and I expect quite a few others have as well. Look at the TeraFLOPS estimate on the start page and you can see there is something wrong.
ID: 62328 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2130
Credit: 41,424,155
RAC: 16,102
Message 62333 - Posted: 21 Jul 2009, 11:17:47 UTC - in response to Message 62328.  
Last modified: 21 Jul 2009, 11:22:10 UTC

I have run out of work, and I expect quite a few others have as well. Look at the TeraFLOPS estimate on the start page and you can see there is something wrong.

At the time of my last post I'd picked up 6 WUs in the previous 90 minutes. In the 10 hours since then I've only picked up another 4 while completing/reporting 12. Clearly there are still some issues to iron out.

At least it's not the weekend and people are around to take a look at it.

That said, there were about 280k WUs in progress last night and this figure is now up at 596k. Ready to Send was over 5000 a few minutes ago but just updated to only 6. Someone's picking them up, even if it's not me right now. I still have a buffer of about a day so I'm not anxious just yet.
ID: 62333 · Rating: 0
S8ER01Z
Joined: 29 Jan 07
Posts: 4
Credit: 705,064
RAC: 0
Message 62335 - Posted: 21 Jul 2009, 14:06:45 UTC - in response to Message 62324.  
Last modified: 21 Jul 2009, 14:07:47 UTC

Edit: disregard.
ID: 62335 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 62357 - Posted: 21 Jul 2009, 23:29:44 UTC

looks like the issue was solved:

7/21/2009 10:16:02 PM|rosetta@home|Scheduler request completed: got 20 new tasks

As of 21 Jul 2009 23:23:26 UTC (updated every 10 minutes)
Ready to send 30,380

I always keep a cache of 5 days work just for problems like this.
ID: 62357 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 62360 - Posted: 21 Jul 2009, 23:59:17 UTC

Quite honestly, I don't know what caused the slowdown in our work unit generators. They just were not creating enough work (in the BOINC database) even though the work existed in our project-specific queue. I cleaned up our project-specific database table by removing old job entries and their template files, and started up another work unit generator daemon. Things seem to be going okay so far.
ID: 62360 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 62437 - Posted: 26 Jul 2009, 5:02:23 UTC
Last modified: 26 Jul 2009, 5:14:07 UTC

Hi.

I'm getting this type of message on all rigs.

Sun 26 Jul 2009 14:59:09 EST|rosetta@home|Scheduler request failed: Failed sending data to the peer
Sun 26 Jul 2009 14:59:10 EST||Internet access OK - project servers may be temporarily down.
Sun 26 Jul 2009 14:59:18 EST|rosetta@home|Temporarily failed upload of lr10_seq_score12_rlbd_1npu_IGNORE_THE_REST_DECOY_13841_2574_0_0: HTTP error
Sun 26 Jul 2009 14:59:18 EST|rosetta@home|Temporarily failed upload of lr8_seq_score12_ss5.0_rlbd_1l6p_IGNORE_THE_REST_DECOY_14281_2574_0_0: HTTP error

Servers are all green, but something is broken!

I saw one rig trying to D/L a new app; could that be what's jamming up the works?
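
For anyone sifting through a long message log like the one above, here is a minimal sketch of splitting the pipe-delimited lines into fields and picking out the failures. It assumes the `timestamp|project|message` layout seen in this post (other clients in this thread print a space-separated layout, which it does not handle), and the names `LogLine`, `parse_log_line` and `is_failure` are made up for illustration.

```python
from typing import NamedTuple


class LogLine(NamedTuple):
    timestamp: str
    project: str  # empty string for client-wide messages
    message: str


def parse_log_line(line: str) -> LogLine:
    """Split one 'timestamp|project|message' line into its three fields.

    Only the first two pipes are treated as separators, so a pipe inside
    the message text is preserved. A line with fewer than two pipes raises
    ValueError.
    """
    timestamp, project, message = line.rstrip("\n").split("|", 2)
    return LogLine(timestamp.strip(), project.strip(), message.strip())


def is_failure(entry: LogLine) -> bool:
    """Crude heuristic (an assumption): flag messages mentioning a failure or error."""
    text = entry.message.lower()
    return "failed" in text or "error" in text


if __name__ == "__main__":
    sample = [
        "Sun 26 Jul 2009 14:59:09 EST|rosetta@home|Scheduler request failed: Failed sending data to the peer",
        "Sun 26 Jul 2009 14:59:10 EST||Internet access OK - project servers may be temporarily down.",
    ]
    for raw in sample:
        entry = parse_log_line(raw)
        if is_failure(entry):
            print(entry.timestamp, "->", entry.message)
```

The keyword match is deliberately simple; it would also flag any informational message that happens to contain the word "error".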
ID: 62437 · Rating: 0
Warped
Joined: 15 Jan 06
Posts: 48
Credit: 1,788,185
RAC: 0
Message 62438 - Posted: 26 Jul 2009, 5:25:28 UTC - in response to Message 62437.  

I'm getting timeouts as well.
Warped

ID: 62438 · Rating: 0
Heidi1
Joined: 11 Aug 07
Posts: 49
Credit: 1,786,248
RAC: 0
Message 62439 - Posted: 26 Jul 2009, 5:37:54 UTC

I'm having problems again as well. After the server fix a couple of days ago, I was able to get WUs like normal. Now, I can't upload. I'll see what happens.
ID: 62439 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 62440 - Posted: 26 Jul 2009, 6:05:15 UTC
Last modified: 26 Jul 2009, 7:02:43 UTC

I just got this after downloading a 27 MB file. Not good.


Sun 26 Jul 2009 15:58:41 EST|rosetta@home|[error] Signature verification failed for minirosetta_database_rev31588.zip
Sun 26 Jul 2009 15:58:41 EST|rosetta@home|[error] Checksum or signature error for minirosetta_database_rev31588.zip
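
For anyone wanting to check a suspect download by hand, here is a minimal sketch of comparing a file's digest with an expected value. It assumes an MD5 digest and an expected hex string obtained from somewhere trustworthy (none is given in this thread). It does not reproduce the signature check the client performs, and the function names are illustrative only.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with an expected hex string, ignoring case and whitespace."""
    return file_md5(path) == expected_hex.strip().lower()
```

A mismatch on a file that finished downloading would suggest the copy being served (or the digest published for it) is wrong, rather than a problem on the local machine.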
ID: 62440 · Rating: 0
JAMES
Joined: 5 May 07
Posts: 8
Credit: 275,386
RAC: 0
Message 62442 - Posted: 26 Jul 2009, 7:19:55 UTC

Is there a problem with the servers? For the last several hours I have had (a) no uncompleted rosetta WU’s and (b) a rosetta zip file stuck in the transfer tab.

ID: 62442 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 62447 - Posted: 26 Jul 2009, 9:10:30 UTC

Sunday morning European time:

7/26/2009 11:07:29 AM|rosetta@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 2 completed tasks
7/26/2009 11:07:52 AM||Project communication failed: attempting access to reference site
7/26/2009 11:07:53 AM||Internet access OK - project servers may be temporarily down.
7/26/2009 11:07:54 AM|rosetta@home|Scheduler request failed: Failure when receiving data from the peer
ID: 62447 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2130
Credit: 41,424,155
RAC: 16,102
Message 62456 - Posted: 26 Jul 2009, 12:45:22 UTC

Ditto to all the above.

In the Tasks tab, all the download problems relate to a new version of Mini Rosetta 1.86, which hasn't been announced anywhere either. Some individual files have snuck through, but not the minirosetta_database_rev31588.zip file.
ID: 62456 · Rating: 0
Paul
Joined: 29 Oct 05
Posts: 193
Credit: 66,501,314
RAC: 9,302
Message 62457 - Posted: 26 Jul 2009, 12:53:53 UTC - in response to Message 62456.  

Results are stacking up here. All of my projects are on 1.82 and running an older version of BOINC.

I would assume everything will be corrected on Monday.
Thx!

Paul

ID: 62457 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2130
Credit: 41,424,155
RAC: 16,102
Message 62458 - Posted: 26 Jul 2009, 13:15:22 UTC - in response to Message 62456.  

Some individual files have snuck through, but not the minirosetta_database_rev31588.zip file.

Unsuccessfully:

26/07/2009 14:09:08 rosetta@home Finished download of minirosetta_database_rev31588.zip
26/07/2009 14:09:09 rosetta@home [error] Signature verification error for minirosetta_database_rev31588.zip
26/07/2009 14:09:09 rosetta@home [error] Checksum or signature error for minirosetta_database_rev31588.zip
ID: 62458 · Rating: 0
[TiDC] yattote
Joined: 13 Mar 07
Posts: 2
Credit: 12,427,729
RAC: 0
Message 62461 - Posted: 26 Jul 2009, 15:12:11 UTC

Looks like the server neither wants to receive results nor send WUs, despite showing green status. Hope it gets fixed soon.
ID: 62461 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 62463 - Posted: 26 Jul 2009, 15:15:23 UTC

The uploads all go to the same server as the scheduler requests. And the scheduler is being driven hard right now due to some problems that are causing the issued tasks to immediately fail. These all then report back and the host needs more work, and the cycle continues.

Your completed tasks will continue to retry uploading periodically, and will go through when things settle down again. No change is required on your part, unless your machine has no other project to work on during this period and you wish to keep it busy. If that is the case, I would suggest attaching to a second project of your choice as well. There is a list of them here

The servers are active, just too busy to keep up right now. The Project Team is aware of the problem.
Rosetta Moderator: Mod.Sense
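
The periodic retrying described above can be pictured as a backoff schedule: each failed attempt waits longer than the last, so a busy server gets breathing room instead of a flood of immediate retries. The sketch below is an illustration under assumed numbers, not the client's actual algorithm; `backoff_delays`, `jittered`, the 60-second base and the one-hour cap are all hypothetical.

```python
import random
from typing import List, Optional


def backoff_delays(base_s: float, cap_s: float, attempts: int) -> List[float]:
    """Return the wait before each retry: doubling from base_s, never above cap_s."""
    delays = []
    delay = base_s
    for _ in range(attempts):
        delays.append(min(delay, cap_s))
        delay *= 2
    return delays


def jittered(delay_s: float, rng: Optional[random.Random] = None) -> float:
    """Pick a random wait between half the delay and the full delay.

    Randomising the wait spreads out hosts that all failed at the same
    moment, so they do not all come back at the same moment either.
    """
    rng = rng or random.Random()
    return rng.uniform(delay_s / 2, delay_s)


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays(60, 3600, 8), start=1):
        print(f"retry {attempt}: wait up to {delay} s")
```

With these assumed numbers the waits grow from one minute to the one-hour cap within seven retries, which is why a stuck upload can sit untouched for a while even after the server recovers.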
ID: 62463 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 62483 - Posted: 26 Jul 2009, 22:13:19 UTC

Morning.

It doesn't look like it's much better this morning. All files but this one are O.K., which is a problem because it's the biggest, at 27-plus MB.

Mon 27 Jul 2009 07:57:25 EST|rosetta@home|Finished download of minirosetta_database_rev31588.zip
Mon 27 Jul 2009 07:57:25 EST|rosetta@home|[error] Signature verification failed for minirosetta_database_rev31588.zip
Mon 27 Jul 2009 07:57:25 EST|rosetta@home|[error] Checksum or signature error for minirosetta_database_rev31588.zip



ID: 62483 · Rating: 0
JackOnTheRoxs
Joined: 28 Jun 09
Posts: 3
Credit: 1,925,358
RAC: 0
Message 62485 - Posted: 27 Jul 2009, 0:32:36 UTC

I had been trying all day to get some work units uploaded. Unfortunately, none of them seem to be getting "Received" and credited. Has anyone else been seeing this problem along with all the other communication issues this weekend? I am deferring communications for now; fortunately, I have several days of work units for my Quad to crunch on while they figure out what's gotten us so screwed up.
ID: 62485 · Rating: 0


©2024 University of Washington
https://www.bakerlab.org