Message boards : Number crunching : Question about constant RAC
Author | Message |
---|---|
tiger Send message Joined: 16 Jul 06 Posts: 17 Credit: 1,083,385 RAC: 0 |
I don't mean a level that stays within a tight range; I mean a RAC that is constant to 6 significant figures. This fellow here: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1105953 stays at 3088.71. After the most recent server downtime, when my three Q9550s went for hours with nothing to do, I increased the "additional work buffer" to the maximum of 10 days. That is when the RAC of this one system seemed to freeze. Any ideas? |
Snags Send message Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
Quoting tiger: "I don't mean a level that stays within a tight range, I mean, a constant 6-significant-figure RAC. This fellow here:" At the moment it's 3,086.14. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
So your question is "why did they get work when I didn't?" It just depends on when you hit the scheduler and how caught up the feeder is. The file server was overloaded for extended periods, so various pieces of the process had stretches where they couldn't keep up with the 80,000 hosts that run R@h. Increasing your work buffer is the main way to ensure you have enough work during such adverse events. Another way to improve your odds would be to increase your runtime preference. Now that you have a 10 day buffer, you'll want to ratchet that down before increasing runtime. Make any increases gradually over the course of days, or you will find yourself with too much work to complete before the 10 day deadline. For example, if you have 6 days of work on hand and double your runtime preference, it suddenly becomes 12 days of work. That is, the change in runtime preference applies to work you already have waiting to run. Rosetta Moderator: Mod.Sense |
tiger Send message Joined: 16 Jul 06 Posts: 17 Credit: 1,083,385 RAC: 0 |
Quoting Mod.Sense: "So your question is 'why did they get work when I didn't?'" No. I was marvelling at the constant RAC of one machine. It started at about the time I increased the additional work buffer. I was putting that out there in case someone knew whether that causes the RAC to be calculated differently, that's all. Quoting Mod.Sense: "Another way to improve your odds would be to increase your runtime preference. Now that you have a 10 day buffer, you'll want to ratchet that down before increasing runtime. Make any increases gradually over the course of days or you will find yourself with too much work to complete before the 10 day deadline. For example, if you have 6 days of work on hand and double your runtime preference, suddenly it becomes 12 days of work. i.e. the change in runtime preference applies to work you already had waiting to run." I re-read that a few times and still am not sure what you're saying. I have three (soon to be 4!) quads that run R@H non-stop. They can have 100% of any idle CPU time available. I did get from your post that 10 days might be pushing it, in that the report deadline may come and go before the last piece is available. I reduced the AWB to 5 days. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
RAC is based on completed work, not in any way on the pending work you have waiting to begin. Perhaps what you observed was an extended period of time when the host didn't contact the scheduler and the credit decay script hadn't adjusted its RAC yet. If you aren't familiar with it, the Rosetta preferences allow you to define a preference for how long each task should run. The default is 3 hours, but you can select 1-24 hours per task. The application will do its best to follow your preference, but that will not always be possible. If you have a 3 hour preference, then each CPU core will complete about 8 tasks on an average day. You have a quad, so that's a total of about 32 tasks per day. So if you do get a full 10 day cache of work, you would have about 320 tasks queued up. Now, if you were to change your preference to 8 hours, those existing 320 tasks are going to start running for roughly 8 hours instead of 3. That is about 2,560 CPU-hours of work, while four cores provide only 960 CPU-hours in 10 days, so the 320 tasks cannot be completed within the 10 day deadline. So, work your cache of pending work down. Then ratchet up your runtime preference. Once the initial time to completion shown for a task is roughly in line with your current target runtime, you can safely increase the number of days of work you keep on hand, because the estimates will be reasonably close. It takes the BOINC client a day or so to get used to the new runtime preference. Rosetta Moderator: Mod.Sense |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,821,902 RAC: 15,180 |
was it actively reporting new work units? i think if a machine doesn't report for a while then its RAC remains constant (there is a decay function but i don't think it runs that often - once a week maybe?) (Quoting Mod.Sense: "So your question is 'why did they get work when I didn't?'") |
tiger Send message Joined: 16 Jul 06 Posts: 17 Credit: 1,083,385 RAC: 0 |
Yep. I even manually did network communication. The RAC seems to be floating again, so maybe whatever updates RAC was just not running at the time. (Quoting dcdc: "was it actively reporting new work units?") |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,821,902 RAC: 15,180 |
yeah - probably because of the recent validator backlog then... (Quoting tiger: "Yep. I even manually did network communication. The RAC seems to be floating again, so maybe it was just that whatever updates RAC, was not running at the time.") |
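
The work-buffer arithmetic from Mod.Sense's posts can be checked with a short Python sketch. This is only an illustration under simplifying assumptions: every task runs for exactly the runtime preference, every core crunches 24 hours a day, and the function names (`tasks_in_buffer`, `days_of_work`, `fits_deadline`) are hypothetical helpers, not part of BOINC or Rosetta@home.

```python
def tasks_in_buffer(cores: int, runtime_hours: float, buffer_days: float) -> int:
    """Approximate number of tasks needed to fill a work buffer.

    Assumes each core runs 24 h/day and each task takes exactly
    runtime_hours (a simplification of real behaviour).
    """
    tasks_per_core_per_day = 24.0 / runtime_hours
    return round(cores * tasks_per_core_per_day * buffer_days)


def days_of_work(tasks: int, runtime_hours: float, cores: int) -> float:
    """Days needed to finish the queued tasks at the given runtime."""
    total_cpu_hours = tasks * runtime_hours
    return total_cpu_hours / (cores * 24.0)


def fits_deadline(tasks: int, runtime_hours: float, cores: int,
                  deadline_days: float) -> bool:
    """True if the queued tasks can finish before the deadline."""
    return days_of_work(tasks, runtime_hours, cores) <= deadline_days


if __name__ == "__main__":
    # A quad-core host with a 3 hour preference and a 10 day buffer.
    queued = tasks_in_buffer(cores=4, runtime_hours=3, buffer_days=10)
    print("tasks queued:", queued)

    # The runtime preference is raised to 8 hours after the buffer is full.
    print("days of work at 8 h:", days_of_work(queued, 8, 4))
    print("fits 10 day deadline:", fits_deadline(queued, 8, 4, 10))
```

Under these assumptions the queue only just fits the deadline at the 3 hour preference, which is why the advice is to shrink the buffer first and raise the runtime preference afterwards.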
©2024 University of Washington
https://www.bakerlab.org