Problems and Technical Issues with Rosetta@home

Sid Celery

Joined: 11 Feb 08
Posts: 2363
Credit: 44,949,968
RAC: 25,412
Message 112715 - Posted: 30 May 2025, 1:24:21 UTC - in response to Message 112712.  

Event log today:
5/29/2025 2:10:57 PM | Rosetta@home | Server error: feeder not running
5/29/2025 2:10:57 PM | Rosetta@home | Project requested delay of 3600 seconds

Steve, take a look at kotenok2000's message above
The lines I've extracted from your log tell me the change you've made in your hosts file isn't being read by your PC.
"hosts" can't have any extension at all, such as hosts.txt, hosts.doc, hosts.bak or hosts.old
It has to simply be "hosts"

When you showed us the hosts edit you made, it looked good to me, but if the filename is wrong, that's the only explanation I can see for "Server error: feeder not running" still being reported in your log file.
That's the line we have to get rid of.
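A quick way to check the filename is to list everything in the hosts directory whose name starts with "hosts." and therefore carries an extension. The sketch below is only an illustration: the helper name `find_misnamed_hosts` is made up for this example, and the directory in the usage note is the standard Windows location, which is an assumption about this particular PC.

```python
from pathlib import Path


def find_misnamed_hosts(directory):
    """Return file names in `directory` that look like a hosts file saved with an extension.

    Hypothetical helper for illustration only. The operating system reads
    the file named exactly "hosts" and ignores the ones listed here.
    """
    misnamed = []
    for entry in Path(directory).iterdir():
        name = entry.name.lower()
        # "hosts.txt", "hosts.bak", "hosts.old" and so on all match this test.
        if entry.is_file() and name.startswith("hosts."):
            misnamed.append(entry.name)
    return sorted(misnamed)
```

On Windows it could be called as `find_misnamed_hosts(r"C:\Windows\System32\drivers\etc")`. If the edited entries live in one of the files it reports, that copy is not the one being read.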
ID: 112715 · Rating: 0
Stevie G

Joined: 15 Dec 18
Posts: 124
Credit: 1,028,210
RAC: 1,498
Message 112716 - Posted: 31 May 2025, 5:15:07 UTC - in response to Message 112715.  

Event log today:
5/29/2025 2:10:57 PM | Rosetta@home | Server error: feeder not running
5/29/2025 2:10:57 PM | Rosetta@home | Project requested delay of 3600 seconds

Steve, take a look at kotenok2000's message above
The lines I've extracted from your log tell me the change you've made in your hosts file isn't being read by your PC.
"hosts" can't have any extension at all, such as hosts.txt, hosts.doc, hosts.bak or hosts.old
It has to simply be "hosts"

When you showed us the hosts edit you made, it looked good to me, but if the filename is wrong, that's the only explanation I can see for "Server error: feeder not running" still being reported in your log file.
That's the line we have to get rid of.


Sid, Tom, et al:

Guess what??

Today I initially got two Rosetta tasks!!

Then a few hours later, lo and behold, I got NINE tasks!

All due by June 3, but running between 183 and 187 degrees F. My cut-off temp is 190 F.

So it looks like things have returned to something resembling normality.

We'll see how long it lasts.

I want to express my sincere thanks to you both and all the others who helped me get through these tribulations.

S. Gaber
Oldsmar, FL
ID: 112716 · Rating: 0
Tom M

Joined: 20 Jun 17
Posts: 141
Credit: 30,772,102
RAC: 102,763
Message 112717 - Posted: 31 May 2025, 16:42:50 UTC - in response to Message 112712.  
Last modified: 31 May 2025, 16:44:25 UTC

Tom:

You said "5/28/2025 9:48:51 AM | Rosetta@home | Not requesting tasks: don't need (CPU: ; AMD/ATI GPU: )
Usually this means you have other projects that have filled your cache up. So you might want to No New Task all your other CPU projects until you start getting some Rosetta downloads."

I have paused other projects or requested no new tasks for a day or two. And still didn't get any Rosetta tasks. That's why I said I was ready to give up on Rosetta.

Event log today:
5/29/2025 2:10:43 PM | Asteroids@home | project suspended by user
5/29/2025 2:10:47 PM | Einstein@Home | project suspended by user
5/29/2025 2:10:48 PM | Universe@Home | Sending scheduler request: To fetch work.
5/29/2025 2:10:48 PM | Universe@Home | Requesting new tasks for CPU
5/29/2025 2:10:49 PM | Milkyway@home | project suspended by user
5/29/2025 2:10:50 PM | Universe@Home | Scheduler request completed: got 0 new tasks
5/29/2025 2:10:50 PM | Universe@Home | Project has no tasks available
5/29/2025 2:10:50 PM | Universe@Home | Project requested delay of 11 seconds
5/29/2025 2:10:53 PM | World Community Grid | project suspended by user
5/29/2025 2:10:55 PM | Rosetta@home | Sending scheduler request: To fetch work.
5/29/2025 2:10:55 PM | Rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU
5/29/2025 2:10:57 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
5/29/2025 2:10:57 PM | Rosetta@home | Server error: feeder not running
5/29/2025 2:10:57 PM | Rosetta@home | Project requested delay of 3600 seconds
5/29/2025 2:10:59 PM | Rosetta@home | update requested by user
5/29/2025 2:11:02 PM | Rosetta@home | Sending scheduler request: Requested by user.
5/29/2025 2:11:02 PM | Rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU
5/29/2025 2:11:03 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
5/29/2025 2:11:03 PM | Rosetta@home | Server error: feeder not running
5/29/2025 2:11:03 PM | Rosetta@home | Project requested delay of 3600 seconds

S. Gaber
Oldsmar, FL


I have only been able to get rid of that message by setting No New Tasks (NNT) on other projects for extended periods, so that the total number of tasks in your cache goes down significantly. A day or two may not be long enough.
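The event log lines quoted above follow a `timestamp | project | message` layout, so it is easy to tally how often each server error shows up for each project. This is a minimal sketch that assumes the log has been copied into a list of strings in exactly that layout; the function names are made up for illustration and are not part of BOINC.

```python
from collections import Counter
from datetime import datetime


def parse_event_line(line):
    """Split one event-log line into (timestamp, project, message)."""
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    # Timestamp layout assumed from the log in this thread, e.g. "5/29/2025 2:10:57 PM".
    return datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p"), project, message


def count_server_errors(lines):
    """Count each "Server error: ..." message per project."""
    errors = Counter()
    for line in lines:
        if line.count("|") < 2:
            continue  # not an event-log line, skip it
        _, project, message = parse_event_line(line)
        if message.startswith("Server error:"):
            errors[(project, message)] += 1
    return errors
```

Run against a fresh copy of the log after a change (a hosts edit, or a spell of No New Tasks), a count of zero for Rosetta@home would suggest the change has taken effect.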
Proud member of the O.F.A. (Old Farts Association)
ID: 112717 · Rating: 0
Tom M

Joined: 20 Jun 17
Posts: 141
Credit: 30,772,102
RAC: 102,763
Message 112718 - Posted: 31 May 2025, 16:59:44 UTC - in response to Message 112714.  
Last modified: 31 May 2025, 17:08:40 UTC

Sid,


Saying that, and poking my nose in where it isn't wanted yet again, is your cache of Einstein tasks too high?
I note that several tasks were cancelled before starting, despite a 2-week deadline.
Or was that period a while ago, and have you already resolved it?


The deadline I am seeing is currently June 2. So I believe I am getting 3 day deadlines.

===edit== And dropped the cpu tasks allowed for Rosetta (in the Pandora config file) to 200.

For a while, a cache under 300 seemed to not be producing expired tasks.
And I don't want to have tasks being expired.

The problem is to keep more than 120 processing at once while not taking more than 3 days (I think) to get them done.

Your previous calculations seem to think that 12 hour tasks should manage to do that for a cache of 300 or under. 3 x 120 = 360.
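One way to sanity-check a cache size is to work out how many tasks the host could finish before the deadline. The sketch below assumes every thread runs Rosetta tasks back to back and that each task takes the full 12 hours; the function names and parameters are made up for illustration, and the 120 threads, 12-hour runtime and 3-day deadline are simply the figures mentioned in this thread.

```python
def max_tasks_before_deadline(threads, task_hours, deadline_days):
    """Upper bound on tasks finished before the deadline if no thread ever idles."""
    tasks_per_thread = (deadline_days * 24) // task_hours
    return threads * tasks_per_thread


def days_to_clear_cache(cache_size, threads, task_hours):
    """Rough number of days to work through a cache, one batch of `threads` tasks at a time."""
    batches = -(-cache_size // threads)  # ceiling division
    return batches * task_hours / 24
```

Under those assumptions `max_tasks_before_deadline(120, 12, 3)` gives 720, because each thread can finish two 12-hour tasks a day, and `days_to_clear_cache(300, 120, 12)` gives 1.5 days. Both figures are ceilings: time given to other projects or longer-running tasks would eat into them.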

I started up my polling script when I hit some 90 Rosetta tasks processing and my backup cpu project started downloading tasks.

<scratch head>

:)

===edit==
I looked at the list of canceled by server tasks. They were all canceled the same day they were sent. So they were not even getting to the 3 day deadline and being canceled.
Proud member of the O.F.A. (Old Farts Association)
ID: 112718 · Rating: 0
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1849
Credit: 18,534,891
RAC: 0
Message 112719 - Posted: 31 May 2025, 19:43:00 UTC - in response to Message 112718.  
Last modified: 31 May 2025, 19:52:02 UTC

I note that several tasks were cancelled before starting, despite a 2-week deadline.
Or was that period a while ago, and have you already resolved it?
Those Tasks were all from the beginning of last month/the end of the previous month.
Current Einstein Tasks appear to be getting done within 1.5-2 days.



I looked at the list of canceled by server tasks. They were all canceled the same day they were sent. So they were not even getting to the 3 day deadline and being canceled.
Cancelled by server Tasks are Tasks that have been resent because they hadn't been returned by the deadline, but ended up being returned by that other system before your system started processing them; so the project cancelled those resends as they were no longer needed.
The larger your cache, the more often that will occur.
Grant
Darwin NT
ID: 112719 · Rating: 0
Stevie G

Joined: 15 Dec 18
Posts: 124
Credit: 1,028,210
RAC: 1,498
Message 112720 - Posted: 31 May 2025, 21:57:51 UTC - in response to Message 112717.  

Tom:

You said "5/28/2025 9:48:51 AM | Rosetta@home | Not requesting tasks: don't need (CPU: ; AMD/ATI GPU: )
Usually this means you have other projects that have filled your cache up. So you might want to No New Task all your other CPU projects until you start getting some Rosetta downloads."

I have paused other projects or requested no new tasks for a day or two. And still didn't get any Rosetta tasks. That's why I said I was ready to give up on Rosetta.

Event log today:
5/29/2025 2:10:43 PM | Asteroids@home | project suspended by user
5/29/2025 2:10:47 PM | Einstein@Home | project suspended by user
5/29/2025 2:10:48 PM | Universe@Home | Sending scheduler request: To fetch work.
5/29/2025 2:10:48 PM | Universe@Home | Requesting new tasks for CPU
5/29/2025 2:10:49 PM | Milkyway@home | project suspended by user
5/29/2025 2:10:50 PM | Universe@Home | Scheduler request completed: got 0 new tasks
5/29/2025 2:10:50 PM | Universe@Home | Project has no tasks available
5/29/2025 2:10:50 PM | Universe@Home | Project requested delay of 11 seconds
5/29/2025 2:10:53 PM | World Community Grid | project suspended by user
5/29/2025 2:10:55 PM | Rosetta@home | Sending scheduler request: To fetch work.
5/29/2025 2:10:55 PM | Rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU
5/29/2025 2:10:57 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
5/29/2025 2:10:57 PM | Rosetta@home | Server error: feeder not running
5/29/2025 2:10:57 PM | Rosetta@home | Project requested delay of 3600 seconds
5/29/2025 2:10:59 PM | Rosetta@home | update requested by user
5/29/2025 2:11:02 PM | Rosetta@home | Sending scheduler request: Requested by user.
5/29/2025 2:11:02 PM | Rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU
5/29/2025 2:11:03 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
5/29/2025 2:11:03 PM | Rosetta@home | Server error: feeder not running
5/29/2025 2:11:03 PM | Rosetta@home | Project requested delay of 3600 seconds

S. Gaber
Oldsmar, FL


I have only been able to get rid of that message by setting No New Tasks (NNT) on other projects for extended periods, so that the total number of tasks in your cache goes down significantly. A day or two may not be long enough.


Tom:

Today and yesterday, my computer processed 13 Rosetta tasks!

Right now it's working on 16 more. But I have to suspend most of those because it was running at 202 degrees F.

To keep it under 190 F I have to suspend other projects and limit it to 5, sometimes 6 Rosetta tasks.

This computer will run any combination of 17 Asteroids or Einstein tasks and still stay under 190 F. Maybe 9 WCG tasks.

It will only run 1 Milky Way task, plus occasionally one Asteroids or Einstein task at a time. But Milky way tasks go very quickly, sometimes in 12 minutes or one hour at the most.

I see that Rosetta now has no tasks.

S. Gaber
Oldsmar, FL
ID: 112720 · Rating: 0
Tom M

Joined: 20 Jun 17
Posts: 141
Credit: 30,772,102
RAC: 102,763
Message 112721 - Posted: 1 Jun 2025, 2:13:21 UTC - in response to Message 112720.  

Steve G,
There used to be a Windows app that played well with BOINC. It would throttle the CPU to keep its temperature below the limit you set.

Maybe someone else remembers or can find it...
Proud member of the O.F.A. (Old Farts Association)
ID: 112721 · Rating: 0
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1849
Credit: 18,534,891
RAC: 0
Message 112722 - Posted: 1 Jun 2025, 3:09:06 UTC - in response to Message 112721.  
Last modified: 1 Jun 2025, 3:59:01 UTC

Steve G,
There used to be a Windows app that played well with BOINC. It would throttle the CPU to keep its temperature below the limit you set.
And it is bad for your CPU to run it flat out, then run it slow, then run it flat out, then run it slow, then run it flat out then run it slow (which is what that does).
Thermal stress is not good for electronics.
Any system from the last 10 years or so will self-throttle when it reaches its thermal limit. Even so, keeping your CPU at 80°C or less will do wonders for its longevity, even if it is rated to operate at as much as 105°C.


The AMD Ryzen 7 5700G is rated at a max of 90W, so very low power. Even at 100% load the stock Wraith cooler, correctly fitted, should keep the temperature below 95°C in a reasonably well ventilated case with an ambient temperature of around 30°C, but the stock cooler is very, very basic.
Fixing the problem with the CPU/system cooling would be the better option: a US $30 dual fan aftermarket cooler would be overkill, but it would keep it cool no matter what.
e.g. the Thermalright Peerless Assassin 120 SE V3 is capable of keeping a 200W CPU at less than 60°C over ambient at 100% load for around US $33.

The next best option would be to just limit the number of cores/threads being used by BOINC.
Grant
Darwin NT
ID: 112722 · Rating: 0
Bill Swisher

Joined: 10 Jun 13
Posts: 63
Credit: 54,084,334
RAC: 140,779
Message 112723 - Posted: 1 Jun 2025, 4:49:33 UTC - in response to Message 112722.  

The AMD Ryzen 7 5700G is rated at a max of 90W, so very low power.

Agree with this. I have 3 of them running, well 2 at the moment...1 is down in Arizona and it's powered down, apparently it was 109F there today and it's supposed to warm up a bit next week. I did use a Noctua NH-L9a-AM4 fan for them, the case has a limitation on how "tall" things can be. Right now, running 16 threads at 100% utilization, one's at 90C and the other is at 95.8C. Long ago I wrote a bash script, that ran several times per hour, to adjust the <niu_max_ncpus_pct>100.000000</niu_max_ncpus_pct> in the /var/lib/boinc/global_prefs_override.xml file to keep the temps down. Then I discovered the processor will throttle itself. So it was all a nice learning experience. :-)
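For anyone curious what that kind of adjustment might look like, here is a minimal sketch in Python rather than bash. It is not the script described above, which is not shown in this thread. The function names, the 10-point step and the floor are assumptions for illustration, and reading the CPU temperature is left to the caller because that part is hardware-specific.

```python
import xml.etree.ElementTree as ET


def set_niu_max_ncpus_pct(xml_text, percent):
    """Return the override XML with <niu_max_ncpus_pct> set to `percent`."""
    root = ET.fromstring(xml_text)
    node = root.find("niu_max_ncpus_pct")
    if node is None:
        # Element missing from the override file: add it under the root.
        node = ET.SubElement(root, "niu_max_ncpus_pct")
    node.text = f"{percent:.6f}"  # same six-decimal style as the value quoted above
    return ET.tostring(root, encoding="unicode")


def pick_percent(temp_c, limit_c, current_pct, step=10.0, floor=10.0):
    """Step the CPU percentage down when over the limit, back up otherwise."""
    if temp_c > limit_c:
        return max(floor, current_pct - step)
    return min(100.0, current_pct + step)
```

A scheduled job would read the file, call these two functions, write the result back, and then ask the client to re-read the override (as far as I know, `boinccmd --read_global_prefs_override` does that). As noted above, the processor's own throttling makes this largely unnecessary.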
ID: 112723 · Rating: 0
PMH_UK

Joined: 9 Aug 08
Posts: 24
Credit: 1,243,749
RAC: 0
Message 112724 - Posted: 1 Jun 2025, 8:36:59 UTC - in response to Message 112721.  

TThrottle from efmer https://efmer.com/ throttles to keep temperatures below the limit set by the user.
Paul.
ID: 112724 · Rating: 0
Tom M

Joined: 20 Jun 17
Posts: 141
Credit: 30,772,102
RAC: 102,763
Message 112725 - Posted: 1 Jun 2025, 8:40:45 UTC

Over a thousand tasks ready to send (RTS).
Proud member of the O.F.A. (Old Farts Association)
ID: 112725 · Rating: 0
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1849
Credit: 18,534,891
RAC: 0
Message 112726 - Posted: 1 Jun 2025, 8:48:09 UTC - in response to Message 112725.  

Over a thousand RTS.
For now.
There are still over 4.5 million queued up to be processed, but for months now the Feeder has had issues, and only dribs & drabs are being released at any given time for people to actually download.
Grant
Darwin NT
ID: 112726 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2363
Credit: 44,949,968
RAC: 25,412
Message 112727 - Posted: 1 Jun 2025, 11:29:51 UTC - in response to Message 112716.  

Event log today:
5/29/2025 2:10:57 PM | Rosetta@home | Server error: feeder not running
5/29/2025 2:10:57 PM | Rosetta@home | Project requested delay of 3600 seconds

Steve, take a look at kotenok2000's message above
The lines I've extracted from your log tell me the change you've made in your hosts file isn't being read by your PC.
"hosts" can't have any extension at all, such as hosts.txt, hosts.doc, hosts.bak or hosts.old
It has to simply be "hosts"

When you showed us the hosts edit you made, it looked good to me, but if the filename is wrong, that's the only explanation I can see for "Server error: feeder not running" still being reported in your log file.
That's the line we have to get rid of.

Sid, Tom, et al:

Guess what??

Today I initially got two Rosetta tasks!!

Then a few hours later, lo and behold, I got NINE tasks!

All due by June 3, but running between 183 and 187 degrees F. My cut-off temp is 190 F.

So it looks like things have returned to something resembling normality.

We'll see how long it lasts.

I want to express my sincere thanks to you both and all the others who helped me get through these tribulations.

S. Gaber
Oldsmar, FL

Great news.
Because you quoted that section about removing the file extension, was that what finally solved your issues? It's the only thing that makes sense.
In any case, we'll take it.
It won't be intermittent - it'll keep working until something else breaks at Rosetta. Hopefully nothing.

On temperatures, I've just looked up the operating temps for your 5700G processor and it's ok up to 95C, which is 203F, at which point it'll start throttling back your running speeds to keep itself safe.
Rosetta does run hard so it'll always push your PC to its limits. It depends on your own comfort levels what you limit it to.
I let the processor decide what it can handle (and occasionally suffer the consequences tbh) but 200F might be a more practical limit (93.33C)
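The conversions behind those figures are easy to check. A small sketch, with function names chosen just for this example:

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def f_to_c(fahrenheit):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9
```

`c_to_f(95)` gives 203.0 and `round(f_to_c(200), 2)` gives 93.33, matching the numbers above. By the same formula the 190F cut-off mentioned earlier in the thread is about 87.8C.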
ID: 112727 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2363
Credit: 44,949,968
RAC: 25,412
Message 112728 - Posted: 1 Jun 2025, 11:43:06 UTC - in response to Message 112718.  

Sid,

Saying that, and poking my nose in where it isn't wanted yet again, is your cache of Einstein tasks too high?
I note that several tasks were cancelled before starting, despite a 2-week deadline.
Or was that period a while ago, and have you already resolved it?

The deadline I am seeing is currently June 2. So I believe I am getting 3 day deadlines.

===edit== And dropped the cpu tasks allowed for Rosetta (in the Pandora config file) to 200.

For a while, a cache under 300 seemed to not be producing expired tasks.
And I don't want to have tasks being expired.

The problem is to keep more than 120 processing at once while not taking more than 3 days (I think) to get them done.

Your previous calculations seem to think that 12 hour tasks should manage to do that for a cache of 300 or under. 3 x 120 = 360.

I started up my polling script when I hit some 90 Rosetta tasks processing and my backup cpu project started downloading tasks.

<scratch head>

:)

===edit==
I looked at the list of canceled by server tasks. They were all canceled the same day they were sent. So they were not even getting to the 3 day deadline and being canceled.

I've entirely misled you.
I didn't mean Rosetta 3-day deadline tasks were being timed out - nothing has reappeared since we addressed that very early on. The 300 tasks cache looked ideal.
I meant you were getting Einstein 2-week deadline tasks timing out through not being started.
Sorry for the confusion.
ID: 112728 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2363
Credit: 44,949,968
RAC: 25,412
Message 112729 - Posted: 1 Jun 2025, 12:08:03 UTC - in response to Message 112722.  

Steve G,
There used to be a Windows app that played well with BOINC. It would throttle the CPU to keep its temperature below the limit you set.
And it is bad for your CPU to run it flat out, then run it slow, then run it flat out, then run it slow, then run it flat out then run it slow (which is what that does).
Thermal stress is not good for electronics.
Any system from the last 10 years or so will self-throttle when it reaches its thermal limit. Even so, keeping your CPU at 80°C or less will do wonders for its longevity, even if it is rated to operate at as much as 105°C.


The AMD Ryzen 7 5700G is rated at a max of 90W, so very low power. Even at 100% load the stock Wraith cooler, correctly fitted, should keep the temperature below 95°C in a reasonably well ventilated case with an ambient temperature of around 30°C, but the stock cooler is very, very basic.
Fixing the problem with the CPU/system cooling would be the better option: a US $30 dual fan aftermarket cooler would be overkill, but it would keep it cool no matter what.
e.g. the Thermalright Peerless Assassin 120 SE V3 is capable of keeping a 200W CPU at less than 60°C over ambient at 100% load for around US $33.

The next best option would be to just limit the number of cores/threads being used by BOINC.

I weighed in on my view of temps before reading what others said.
I don't disagree with a word written here, but I'd repeat that it also depends on your personal comfort levels.
Steve reports hitting temps of 202F and being concerned.
Whereas I'm thinking the thermal limit is 203F, so 202F is absolutely perfect. If my temps were lower than 202F I might look at boosting my clock speeds so temps would increase back to 202F !!

A factor in this may be that I'm writing from Birmingham, England (temp 18C) not Darwin, Australia or Florida, USA (temps 31C) - Lol
ID: 112729 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2363
Credit: 44,949,968
RAC: 25,412
Message 112730 - Posted: 1 Jun 2025, 13:03:32 UTC - in response to Message 112661.  

Two more Validate errors tonight, meaning 2x12hr tasks not being awarded credit.
Another unheard appeal for the daily job that cleans this up to be reinstated.

Probably caused by some disk errors I'm getting locally, but annoying nonetheless :(

More disk errors, 5 more Validation errors (likely more to come). All lost credits again.
I'm going to have to do something about this...
ID: 112730 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2363
Credit: 44,949,968
RAC: 25,412
Message 112731 - Posted: 1 Jun 2025, 16:16:21 UTC - in response to Message 112730.  

Two more Validate errors tonight, meaning 2x12hr tasks not being awarded credit.
Another unheard appeal for the daily job that cleans this up to be reinstated.

Probably caused by some disk errors I'm getting locally, but annoying nonetheless :(

More disk errors, 5 more Validation errors (likely more to come). All lost credits again.
I'm going to have to do something about this...

More did come - 8 in all. A temporary fix is in, but it'll return until I can clone onto a new drive :(
ID: 112731 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2363
Credit: 44,949,968
RAC: 25,412
Message 112734 - Posted: 2 Jun 2025, 21:13:21 UTC - in response to Message 112731.  

Two more Validate errors tonight, meaning 2x12hr tasks not being awarded credit.
Another unheard appeal for the daily job that cleans this up to be reinstated.

Probably caused by some disk errors I'm getting locally, but annoying nonetheless :(

More disk errors, 5 more Validation errors (likely more to come). All lost credits again.
I'm going to have to do something about this...

More did come - 8 in all. A temporary fix is in, but it'll return until I can clone onto a new drive :(

Another 8 validation errors and 4 compute errors on top <sigh>
ID: 112734 · Rating: 0



©2025 University of Washington
https://www.bakerlab.org