Should Rosetta Limit the amount of tasks queued per PC?

3k7cGiWzDHhDmVziFjKz4UkRa1sm

Joined: 21 Feb 11
Posts: 4
Credit: 6,609,198
RAC: 0
Message 100179 - Posted: 27 Dec 2020, 11:04:18 UTC
Last modified: 27 Dec 2020, 11:11:44 UTC

Basically the title. I noticed I got an influx of tasks that had all timed out from the same Computer ID. https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=4379211

This user's PC has an 8-thread i7 processor with (at the time of writing) 397 tasks in progress and 425 errors (all "Not started by deadline", of course, because that's a crazy amount for one PC with 8 threads, haha).

It doesn't appear that there's any hard and fast limit on how many work units can be queued for a single computer. Would it make sense to limit the amount of queued work on a per-PC basis by (at the very least) saying you can only queue up some TBD multiple of your thread count?

E.g. I have 8 threads, and Rosetta has a threadCount-to-tasksQueued multiple of 10. So for that PC the most I can queue up for Rosetta is 80 tasks before the server refuses to assign me any more.
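
A minimal sketch of that proposed rule, just to make the arithmetic concrete. This is not how the BOINC server works today; the function names and the default multiplier of 10 are hypothetical and come only from the example above.

```python
# Hypothetical per-host cap: at most `multiplier` queued tasks per CPU thread.
# Neither function exists in BOINC; they only illustrate the proposal.

def max_queued_tasks(thread_count: int, multiplier: int = 10) -> int:
    """Return the proposed upper bound on queued tasks for one host."""
    if thread_count < 1 or multiplier < 1:
        raise ValueError("thread_count and multiplier must be positive")
    return thread_count * multiplier


def may_send_more(tasks_in_progress: int, thread_count: int,
                  multiplier: int = 10) -> bool:
    """True if the host is still below its proposed cap."""
    return tasks_in_progress < max_queued_tasks(thread_count, multiplier)


if __name__ == "__main__":
    print(max_queued_tasks(8))    # cap for an 8-thread host
    print(may_send_more(397, 8))  # the host linked above, with 397 in progress
```

Under this rule the 8-thread host with 397 tasks in progress would have been refused new work long before it reached that number.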

This might help server load and overall project efficiency, as it would prevent more novice BOINC users from inadvertently pulling down tasks that they have zero chance of touching before the deadline.
Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 100181 - Posted: 27 Dec 2020, 12:15:57 UTC - in response to Message 100179.  
Last modified: 27 Dec 2020, 13:13:17 UTC

There is a hard upper limit on the server to the number of tasks sent to any host per application per day, but that needs to be high enough to cater for high core-count hosts. I don’t know what it’s set at, but it’s definitely over 800.

There is also a dynamic per-host limit – the ‘Max tasks per day’ on the Application details page, whose purpose is to limit the amount of work sent out to bad hosts. The snag with that is that it increases way faster than it decreases: it doubles with every valid task returned, but only decreases by one with every invalid task. So a machine that’s doing a small amount of valid work will effectively never be prevented from downloading vast numbers of tasks, even if it will complete only a tiny proportion of them.
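
To illustrate the asymmetry, here is a rough simulation of the update rule as I described it (double on a valid result, subtract one on an invalid one). It is a sketch, not the actual BOINC scheduler code: the function names, the starting quota of 10, and the ceiling of 800 are all assumptions made up for the example.

```python
# Toy model of the 'Max tasks per day' behaviour described above.
# Assumed, not taken from BOINC source: names, start value, ceiling, floor.

def update_quota(quota: int, valid: bool, ceiling: int = 800, floor: int = 1) -> int:
    """Double the quota on a valid result; reduce it by one on an invalid one."""
    if valid:
        return min(quota * 2, ceiling)
    return max(quota - 1, floor)


def simulate(outcomes, start: int = 10, ceiling: int = 800) -> int:
    """Apply update_quota to a sequence of True/False task outcomes."""
    quota = start
    for valid in outcomes:
        quota = update_quota(quota, valid, ceiling)
    return quota


if __name__ == "__main__":
    # A host that returns 1 valid task for every 9 that miss the deadline.
    cycle = [True] + [False] * 9
    for cycles in (1, 5, 10, 20):
        print(cycles, simulate(cycle * cycles))
```

Even with a 90% failure rate, the quota in this model climbs from 10 to just under the assumed ceiling within ten cycles and stays there, which is the snag described above.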

It occurs to me that the current shortage of tasks could be explained by a large number of machines suddenly requesting work faster than the server is able to replenish its supply. Has some other project recently increased its default work cache setting (or recommended that users increase it themselves)?
Edit: That doesn’t make sense; ‘Tasks in progress’ wouldn’t be decreasing if that were the case…
Kissagogo27
Joined: 31 Mar 20
Posts: 86
Credit: 2,915,353
RAC: 2,605
Message 100182 - Posted: 27 Dec 2020, 12:47:18 UTC

Hi, I just have 2 tasks from them:

https://boinc.bakerlab.org/rosetta/results.php?hostid=5486166
https://boinc.bakerlab.org/rosetta/results.php?hostid=3407218
https://boinc.bakerlab.org/rosetta/results.php?hostid=4046401
Sid Celery
Joined: 11 Feb 08
Posts: 2122
Credit: 41,203,821
RAC: 10,286
Message 100198 - Posted: 27 Dec 2020, 17:36:14 UTC - in response to Message 100179.  

I don't know why you've grabbed so many tasks. Some time in April a "3 day delay_bound" was put in, which meant that "tasks will only be issued to machines that estimate they can complete the task in less than 3 days. It also means that if the task is not returned within 3 days, it will be reissued (and BOINC Manager will abort it on your machine)". I thought that still applied.

Of course, it does require that you actually run Rosetta tasks and I notice you have several projects - maybe not prioritised sufficiently?
But how you got them in the first place remains a mystery.
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,164,050
RAC: 4,004
Message 100207 - Posted: 27 Dec 2020, 21:51:13 UTC - in response to Message 100179.  
Last modified: 27 Dec 2020, 21:51:59 UTC

Merry Christmas
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1679
Credit: 17,816,373
RAC: 22,802
Message 100211 - Posted: 27 Dec 2020, 22:44:11 UTC - in response to Message 100179.  

It doesn't appear that there's any hard and fast boundary to how many work units can be queued for a single computer?
Because not all computers are the same.
Some single-core systems can process more work than some multi-core systems. An extremely slow system with a high core/thread count can process more work than many extremely fast systems with a low core/thread count.
Some systems are up 24/7, others for only a few hours a day. Some systems that are up 24/7 can only process BOINC work for a few hours; others are dedicated crunchers and do nothing but BOINC. Some systems do only a single project, others several, and still others are attached to almost everything. Some projects have short deadlines of a couple of days, other projects have deadlines of months, and still other projects have variable deadlines depending on the type of work.

The initial values for a system when it first starts processing work for a project are very conservative, in order to limit the amount of work it gets until the BOINC Manager knows how long it takes to process work, how much uptime the system has, how much of that time is spent processing work, etc. Over time, the values will settle down to their actual ones.
It would appear that in this case the user has their values set right on a boundary: they are returning enough work often enough to be able to get more work, even with their extremely high number of missed deadlines.
I've sent them a PM. We'll see if that has any effect.

A big part of the issue here is how Rosetta works: even errors can be useful, which is why you get credit for the time your system spent processing a Task even if it errors out. So you can have a high percentage of errors and still be returning useful work, unlike other projects where a computation error doesn't return any useful work.
It would require some work on BOINC to have errors from missed deadlines treated differently from Computation or Validation errors, and then to further limit the amount of work such a system could get.
Grant
Darwin NT
©2024 University of Washington
https://www.bakerlab.org