Message boards : Number crunching : BOINC v6.6.20 scheduler issues
Author | Message |
---|---|
Mod.Zilla Volunteer moderator Joined: 5 Sep 06 Posts: 423 Credit: 6 RAC: 0 |
New thread created and posts moved in as requested. Rosetta Informational Moderator: Mod.Zilla |
TomaszPawel Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
On one of my hosts I have a "nice" problem... Video: http://www.youtube.com/watch?v=kfclLVJ7cyc On this host: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=791178 I installed 6.6.20, and the problem started: Rosetta@home doesn't download WUs. On this machine I run GPUGRID and Rosetta@home at 50/50 (2000/2000). GPUGRID gets new WUs but Rosetta@home does not, even if I manually request a project update: rosetta@home Scheduler request completed: got 0 new tasks Any tips? Soon my R@H WUs will be depleted... |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
On one of my hosts I have a "nice" problem... Not off the top of my head... but the video idea is interesting to say the least ... what the heck, not that I expect that it will do any good, but let me post the link on the alpha list and see what gives. My only suggestion would be to reset debts. You have to make a cc_config file and stop and restart the BOINC client, not just the manager. Shut down the client from the Advanced menu, then close the manager. Check Task Manager and make sure all the science applications are stopped. With the config file in place, restart BOINC and the debts should be reset and you *MAY* get work... |
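The cc_config file mentioned above is a small XML file in the BOINC data directory. The following is a minimal sketch of generating one, assuming the 6.x client honours a `zero_debts` option inside `<options>` (verify the option name against the client configuration documentation for your version). The helper names `build_cc_config` and `write_cc_config` and the `data_dir` argument are illustrative and not part of BOINC.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def build_cc_config(zero_debts: bool = True) -> str:
    """Return cc_config.xml text that asks the client to reset project debts.

    Assumption: the 6.x client reads <options><zero_debts>1</zero_debts></options>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "zero_debts").text = "1" if zero_debts else "0"
    return ET.tostring(root, encoding="unicode")


def write_cc_config(data_dir: str) -> Path:
    """Write cc_config.xml into data_dir (a path you supply; hypothetical here)."""
    target = Path(data_dir) / "cc_config.xml"
    target.write_text(build_cc_config(), encoding="utf-8")
    return target
```

As described in the post, the file only takes effect after the client itself (not just the manager) has been stopped and restarted. If the option behaves as assumed here, it would be applied on every start, so it is probably worth removing it or setting it to 0 once the debts have been reset.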
Buckeye74 Joined: 5 Jun 06 Posts: 1 Credit: 110,354 RAC: 0 |
Suspend GPUGRID and your machine should download 5 days' worth of Rosetta work, then resume GPUGRID. |
TomaszPawel Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
I tried suspending GPUGRID but Rosetta did not download anything... Hmmm, I decided to wait and see what happened. There were 5 tasks running: 4 Rosetta and 1 GPUGRID. No more Rosetta WUs were waiting. One task finished, 3 were crunching (+1 GPUGRID), and then, suddenly: 2009-04-21 16:52:49 rosetta@home Computation for task 1dhn__BOINC_ABINITIO_IGNORE_THE_REST-MOO12--1dhn_-_10770_76_0 finished 2009-04-21 16:52:51 rosetta@home Started upload of 1dhn__BOINC_ABINITIO_IGNORE_THE_REST-MOO12--1dhn_-_10770_76_0_0 2009-04-21 16:52:53 rosetta@home Sending scheduler request: To fetch work. 2009-04-21 16:52:53 rosetta@home Requesting new tasks 2009-04-21 16:52:56 rosetta@home Finished upload of 1dhn__BOINC_ABINITIO_IGNORE_THE_REST-MOO12--1dhn_-_10770_76_0_0 2009-04-21 16:52:58 rosetta@home Scheduler request completed: got 8 new tasks Only 8 new tasks were sent... lol. It is a really serious bug in 6.6.20. |
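To track how many tasks each scheduler reply delivers over a longer period, the message log can be scanned for the "got N new tasks" lines. This is a rough sketch that assumes the log lines look exactly like the ones quoted in the post (date, time, project name, message); other client versions or log exports may format them differently. The function name `tasks_per_reply` is made up for this example.

```python
import re

# Matches message-log lines like the ones quoted in the post, e.g.
# "2009-04-21 16:52:58 rosetta@home Scheduler request completed: got 8 new tasks"
REPLY = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<project>\S+) Scheduler request completed: got (?P<count>\d+) new tasks"
)


def tasks_per_reply(log_text: str, project: str) -> list[tuple[str, int]]:
    """Return (timestamp, task count) for each scheduler reply from `project`."""
    results = []
    for match in REPLY.finditer(log_text):
        if match.group("project") == project:
            stamp = f"{match.group('date')} {match.group('time')}"
            results.append((stamp, int(match.group("count"))))
    return results


# Two of the lines from the post, joined the way they appear there.
sample = (
    "2009-04-21 16:52:53 rosetta@home Requesting new tasks "
    "2009-04-21 16:52:58 rosetta@home Scheduler request completed: got 8 new tasks"
)
replies = tasks_per_reply(sample, "rosetta@home")
```

A run of replies that always report the same count, with zero-task replies in between, would match the pattern described in this thread.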
TomaszPawel Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
It seems that 6.6.20 on my host has some strange scheduler cycle... It only downloads 8 WUs from Rosetta, crunches them, and when 5 of them are crunched it downloads another 8!!! LOL So for a few minutes one core of my quad is idle... until it finishes downloading the WUs. |
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,208,737 RAC: 2,882 |
It seems that 6.6.20 on my host has some strange scheduler cycle... Okay Tomas... how long does Boinc think it will take you to complete a unit? How long is it really taking you to finish a unit? If Boinc thinks it will take 8 hours and you are really only taking 2 hours, you have found your problem: Boinc thinks you have too much work for your settings. You can either change the settings to have a larger cache or try to fiddle with the Boinc settings. I have not moved to the Boinc 6.6.? versions yet so I can't help you on that part. But the cache settings are changeable either on the website for all your computers, or in Boinc itself for that one computer. Boinc is designed to fix itself; how long that will take is anyone's guess, but it could take anywhere from a day or so to a week or so, or longer. It kind of depends on the other projects running on that PC. |
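The estimate-versus-reality point can be put into numbers. The sketch below is a deliberately simplified model, not the client's actual work-fetch algorithm: it assumes queued tasks are spread evenly over the cores and compares the 8-hour estimate with the 2-hour real run time from the post, using the 8-task batch and the quad-core host mentioned earlier in the thread as illustrative inputs.

```python
def queued_days(task_count: int, hours_per_task: float, cores: int) -> float:
    """Days of work a queue represents if tasks are spread evenly over cores."""
    return task_count * hours_per_task / cores / 24.0


# Illustrative inputs: 8 tasks, 4 cores, 8 h estimated vs 2 h actual per task.
estimated = queued_days(task_count=8, hours_per_task=8.0, cores=4)
actual = queued_days(task_count=8, hours_per_task=2.0, cores=4)
print(f"client's view: {estimated:.2f} days, reality: {actual:.2f} days")
```

Under these assumptions the client would believe it holds four times as much work as it really does, which is one way a cache could look full while cores are about to go idle.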
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
There are also issues with 6.6.20 and 6.6.23 with regard to some things... I am in the process of proving one bug, though I do not know the cause at all ... and we may be closing in on the logic flaw that has been driving us nuts for some time (though not related to the problem reported here). @TomaszPawel Can you tell me if the debt for Rosetta is just rising negatively? You can turn on the logging flag for work fetch debug or use the "Properties" button (with Rosetta selected) on the Projects tab. Look at it, wait some time, look at it again ... tell me if it is just going in one direction. Wouldn't hurt to get the number for GPU Grid while there ... |
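The "look, wait, look again" check can be made a little more systematic by writing down each debt reading and classifying the sequence. This is only a sketch: the function name `debt_trend` is invented for illustration, and the readings must be typed in by hand from the Properties dialog or the debug log.

```python
def debt_trend(samples: list[float]) -> str:
    """Classify successive debt readings as 'falling', 'rising', 'flat' or 'mixed'."""
    # Differences between consecutive readings, oldest first.
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    if not deltas or all(d == 0 for d in deltas):
        return "flat"
    if all(d <= 0 for d in deltas):
        return "falling"
    if all(d >= 0 for d in deltas):
        return "rising"
    return "mixed"


# Hypothetical readings, NOT taken from the host in this thread.
example = debt_trend([-1200.0, -1850.5, -2400.0])
```

A result of "falling" across several readings would correspond to debt that is "just rising negatively", while "mixed" would suggest the value is still moving in both directions.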
TomaszPawel Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
Hi! It looks like this: and after some WUs |
TomaszPawel Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
After a longer period of time...: so... WTF, still 8 ... |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
@Mod.Sense, Can we extract out Tomasz's issue to a new thread? It is important, but a sidetrack from RS theory. @Tomasz We are looking into your issues. Are you up for more experiments? If so, could you try 6.5.0 for me for a couple of days? The major downside is that the run times for GPU Grid are not as well reported on the client. Other than that, this is the version I use as my standard on all my other systems. We *ARE* talking about this problem; the problem is that we need data ... As part of that I want you to fall back and run for a couple of days to a week with the other version. If it runs well and as you expect, we can go back to 6.6.20 and see if you work back into this situation. In the meantime, as I learn things I will apprise you of what they are ... honest ... :) I am sure someone will vouch for me being relatively decent about working the issues I am aware of ... |
TomaszPawel Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
Still nothing...
OK, but first I will try 6.6.15; it works great on my other computer. I read that version 6.6.20 has a new scheduler... http://boinc.berkeley.edu/dev/forum_thread.php?id=2518&nowrap=true#24183 |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
I don't know in which 6.6.x version they started the development of the new scheduler. The only version I have experience with that I personally know worked is 6.5.0 ... if the later one worked, fine, try that ... :) I am going to try to look at the code tonight or tomorrow. This *IS* an important issue and I think it is affecting me, though in a different way (not sure why), and as such we need to get to the bottom of it. |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Posted this this morning: Ok, I have a glimmer, not sure if I got it right ... but let me try to put my limited understanding down on paper and see if one of you chrome domes can straighten me out. |
TomaszPawel Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
6.6.15 is also affected. I will try a clean install of 6.6.20: uninstall 6.6.15, then delete the BOINC files and folders... |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
I would just roll back to 6.5.0 ... the only thing you lose is that the CPU time column shows real CPU time on the GPU tasks, so you don't get that nice "elapsed" display. We are trying to get them to look at this ... honest ... The problem is that there are at least 3 or 4 main show-stopper type issues that we are working (this being one) and it is difficult to get them to focus ... |
©2024 University of Washington
https://www.bakerlab.org