Message boards : Number crunching : SERVER PROBLEMS.
Author | Message |
---|---|
Michael G.R. Joined: 11 Nov 05 Posts: 264 Credit: 11,247,510 RAC: 0 |
Am crunching for Folding@home and World Community Grid (Human Proteome Folding phase 2) in the meantime.. Hope this delay is because they're working on a new improved version, or a new project, and not just some random hardware failure. |
LizzieBarry Joined: 25 Feb 08 Posts: 76 Credit: 201,862 RAC: 0 |
Hate to be the bearer of bad news but: As of 6 Jan 2009 12:11:20 UTC |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Same status as of 6 Jan 2009 15:25:37 UTC (updated every 10 minutes). It is now 7:30 am Pacific time. Hopefully the team will see this problem and get things running again in a few hours. |
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. It's all green but I'm still having problems! Wed 07 Jan 2009 08:51:48 EST||Project communication failed: attempting access to reference site Wed 07 Jan 2009 08:51:48 EST|rosetta@home|Temporarily failed upload of abinitio_norelax_homfrag_129_B_1vkkA_SAVE_ALL_OUT_4626_4788_0_0: HTTP error Wed 07 Jan 2009 08:51:48 EST|rosetta@home|Backing off 40 min 0 sec on upload of abinitio_norelax_homfrag_129_B_1vkkA_SAVE_ALL_OUT_4626_4788_0_0 Wed 07 Jan 2009 08:51:50 EST||Internet access OK - project servers may be temporarily down. Been getting this all morning. pete. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Hi. Just let it ride, or suspend communication for a few hours and try again. I see the same thing in my messages as well. The home page had this note: "Sunday, January 06, 2008 9:00 AM Our database server lost power accidentally as work was being done on the rack. We are back up and running now." Now it looks like there is something else going on: the server status page shows 73,580 tasks ready to send, and the main page says [ Scheduler running ] Queued: 45,698. So there must be some sort of comms issue now. |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
I'm currently seeing: Tue 06 Jan 2009 05:02:47 PM EST|rosetta@home|Sending scheduler request to http://srv4.bakerlab.org/rosetta_cgi/cgi Tue 06 Jan 2009 05:02:47 PM EST|rosetta@home|Reason: To fetch work Tue 06 Jan 2009 05:02:47 PM EST|rosetta@home|Requesting 191267 seconds of new work Tue 06 Jan 2009 05:04:47 PM EST||Attempting to communicate with [srv4.bakerlab.org] timed out Tue 06 Jan 2009 05:04:49 PM EST|rosetta@home|Scheduler request to http://srv4.bakerlab.org/rosetta_cgi/cgi failed with a return value of -182 Tue 06 Jan 2009 05:04:49 PM EST|rosetta@home|No schedulers responded |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Just a repeat of the weekend's failure. Sit back and wait it out; that's all there is to do. When they get the problem fixed, you will get work. BOINC Manager will keep delaying the contact time every time it encounters an error. I have had failures all night (European time), so it's backed off to 2 hrs before trying to communicate again. |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
"Just a repeat of the weekend's failure." No, the symptoms are different. According to the status page, there is now plenty of work and everything is green, but my machines are getting "Scheduler request to http://srv4.bakerlab.org/rosetta_cgi/cgi failed with a return value of -182" errors when they try to download work. "Sit back and wait it out..that's all there is to do." On the home page they say "We are back up and running now." It sounds like they may think things are running properly, in which case some people need to post and tell them things are not running properly. |
David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
The external interface to our upload/download servers is currently down. Keith (our sys admin guru) is aware of the issue but cannot fix it until he gets in tomorrow. It is related to the database server issue we had. The connection must have been pulled accidentally when work was being done on the rack. Sorry for any inconvenience. I imagine our servers are going to be quite busy tomorrow. |
Gray Handcock Joined: 26 Sep 05 Posts: 20 Credit: 2,018,415 RAC: 0 |
Hi 10:47 am here - still getting this error: 2009/01/07 10:36:00 AM|rosetta@home|Sending scheduler request: Requested by user. Requesting 259201 seconds of work, reporting 0 completed tasks 2009/01/07 10:36:08 AM||Project communication failed: attempting access to reference site 2009/01/07 10:36:10 AM|rosetta@home|Scheduler request failed: Transferred a partial file 2009/01/07 10:36:11 AM||Internet access OK - project servers may be temporarily down. I assume the issue is still to be worked on ? |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
I've now gotten some work (although the servers do indeed seem pretty busy at the moment). It's clear someone must have been working very hard into the wee hours of the morning to get things working. Thanks. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Guys, the project is based in Seattle, which is 8 hours behind GMT/UTC. The current time in Belgium is 11:28 am, and back in Seattle it is 2:28 am. There won't be a resolution to this problem for at least another 6 hours, and perhaps longer. I would set your project status to 'no new tasks' for now to save yourself a long list of failure messages. Take that setting off later today once they have solved the problem listed in DEK's message. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
The system is back to normal. I just restarted communications and got 6 new tasks. |
Gray Handcock Joined: 26 Sep 05 Posts: 20 Credit: 2,018,415 RAC: 0 |
Guess it will still be a while... --------------------------------------------- 2009/01/07 10:49:28 PM|rosetta@home|Sending scheduler request: Requested by user. Requesting 30240 seconds of work, reporting 0 completed tasks 2009/01/07 10:49:31 PM||Project communication failed: attempting access to reference site 2009/01/07 10:49:33 PM|rosetta@home|Scheduler request failed: Transferred a partial file 2009/01/07 10:49:34 PM||Internet access OK - project servers may be temporarily down. 2009/01/07 11:05:33 PM|rosetta@home|Sending scheduler request: Requested by user. Requesting 30240 seconds of work, reporting 0 completed tasks 2009/01/07 11:05:38 PM||Project communication failed: attempting access to reference site 2009/01/07 11:05:39 PM|rosetta@home|Scheduler request failed: Transferred a partial file 2009/01/07 11:05:41 PM||Internet access OK - project servers may be temporarily down. ---------------------------------------------- cheers |
Gray Handcock Joined: 26 Sep 05 Posts: 20 Credit: 2,018,415 RAC: 0 |
...and as soon as I posted the message before this the downloads started !! Thanks ! |
Gray Handcock Joined: 26 Sep 05 Posts: 20 Credit: 2,018,415 RAC: 0 |
are there still server issues ? ----------------------------------------- 2009/01/09 08:30:08 AM|rosetta@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 1 completed tasks 2009/01/09 08:30:13 AM||Project communication failed: attempting access to reference site 2009/01/09 08:30:13 AM|rosetta@home|Scheduler request failed: Transferred a partial file 2009/01/09 08:30:16 AM||Internet access OK - project servers may be temporarily down. ------------------------------------------ thanks |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
"are there still server issues ?" Yes. It looks like the make-work servers are offline or out of jobs to send, so we are stuck with no new work for at least 7 hours. I suggest that the next time you get connected, you download about 3-4 days of extra work. Then you can survive this problem if it happens again. Server status as of 9 Jan 2009 9:11:47 UTC (updated every 10 minutes): rah_make_work1 on srv3: Not running; rah_make_work2 on srv4: Not running; Ready to send: 74. The main page shows 0 tasks ready to send. |
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. I just got two of these out of four D/L's. Sat 10 Jan 2009 07:46:04 EST|rosetta@home|[error] Checksum or signature error for homfragments_2ci2.zip Sat 10 Jan 2009 07:46:37 EST|rosetta@home|[error] Checksum or signature error for homfragments_2d4f.zip pete. |
ComfortablyNumb Joined: 6 Jul 07 Posts: 8 Credit: 658,196 RAC: 0 |
"are there still server issues ?" I've tried to download 3-4 days' worth of work before. Both times (in the last week) my PC crashed. Couldn't even use ASR to recover. Had to reformat and start over. I have 3 pages of WUs that my PC will not do. I'm sure they'll reassign them to other people, eventually. |
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,208,737 RAC: 3,249 |
"I have 3 pages of wu's, that my pc will not do. I'm sure they'll reassign them to other people, eventually." Oh yes, if you look, the units are probably already marked with an error message on your list, and they could even have been sent out already to a different person. |
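
For anyone who wants to sift a long run of client messages like the ones pasted in this thread, here is a minimal sketch in Python. It assumes each message is a single line of the form `timestamp|project|text`, which is inferred only from the lines quoted above and not from any BOINC documentation. The names `ClientMessage`, `parse_message`, `is_failure` and `FAILURE_MARKERS` are made up for this example.

```python
from typing import NamedTuple, Optional


class ClientMessage(NamedTuple):
    timestamp: str  # kept as text: the posts above use several date formats
    project: str    # empty string for client-wide messages
    text: str


def parse_message(line: str) -> Optional[ClientMessage]:
    """Split one 'timestamp|project|text' line; return None if it does not fit."""
    # maxsplit=2 keeps any later '|' characters inside the message text.
    parts = line.strip().split("|", 2)
    if len(parts) != 3:
        return None
    timestamp, project, text = (p.strip() for p in parts)
    return ClientMessage(timestamp, project, text)


# Assumption: these substrings are taken from the failure lines quoted in
# this thread; they are not an official list of BOINC error messages.
FAILURE_MARKERS = ("failed", "timed out", "no schedulers responded", "[error]")


def is_failure(msg: ClientMessage) -> bool:
    """Return True if the message text contains one of the failure markers."""
    lowered = msg.text.lower()
    return any(marker in lowered for marker in FAILURE_MARKERS)


sample = [
    "Tue 06 Jan 2009 05:02:47 PM EST|rosetta@home|Reason: To fetch work",
    "Tue 06 Jan 2009 05:04:47 PM EST||Attempting to communicate with [srv4.bakerlab.org] timed out",
    "Tue 06 Jan 2009 05:04:49 PM EST|rosetta@home|No schedulers responded",
]

# Print only the lines that look like failures.
for line in sample:
    msg = parse_message(line)
    if msg is not None and is_failure(msg):
        print(f"{msg.timestamp} [{msg.project or 'client'}] {msg.text}")
```

The timestamp is left unparsed on purpose, since the posts here show at least two different date layouts. Lines such as "Backing off 40 min 0 sec" are not counted as failures by this sketch, so extend the marker list if you want to catch those too.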
©2024 University of Washington
https://www.bakerlab.org