Problems and Technical Issues with Rosetta@home

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105847 - Posted: 5 Apr 2022, 22:00:22 UTC - in response to Message 105844.  

What is the difference between your CPU time and run time?
Have you gone through your stderr file to see if anything abnormal showed up there?
Any idea what percentage of the core that task is using?
30 hours is absurd. If it doesn't finish in 8 hours of CPU time, then it is stuck somehow.
ID: 105847
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,717,270
RAC: 11,974
Message 105848 - Posted: 5 Apr 2022, 22:07:36 UTC

Update, I made a mistake.

I actually only have 2 machines (of 7) that can run Rosetta Python. VB 5 makes Cosmology work. Antivirus settings make LHC work. But Rosetta needs a CPU with AVX (or maybe FMA) instructions, and 5 of mine are too old.

So I got 3 aaam tasks, one on an i5-8600K which works ok. Two on a Ryzen 9 3900XT, which both work ok.
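
For anyone who wants to check a machine before accepting these tasks, here is a minimal sketch of testing for the AVX and FMA flags. It assumes a Linux host where `/proc/cpuinfo` has a `flags` line; the function names are hypothetical, and whether Rosetta Python needs AVX, FMA, or both is not confirmed in this thread.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def report_avx_fma(cpuinfo_text: str) -> dict[str, bool]:
    """Report whether the exact tokens 'avx' and 'fma' are present."""
    flags = cpu_flags(cpuinfo_text)
    return {"avx": "avx" in flags, "fma": "fma" in flags}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # assumption: Linux only
    if path.exists():
        print(report_avx_fma(path.read_text()))
```

On Windows the same information has to come from another source, such as the CPU features line BOINC writes to its event log at startup.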
ID: 105848
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,717,270
RAC: 11,974
Message 105849 - Posted: 5 Apr 2022, 22:08:47 UTC - in response to Message 105846.  

From what I could understand, his tasks do not fail, but some run for 30 hours before completing.
That is a fail. If they run 30 hours, it means no calculations are actually being done. I assume something times out.
ID: 105849
dcdc
Joined: 3 Nov 05
Posts: 1831
Credit: 119,513,695
RAC: 9,561
Message 105850 - Posted: 5 Apr 2022, 22:40:44 UTC
Last modified: 5 Apr 2022, 22:41:02 UTC

BoincTasks is helpful for showing any tasks where the CPU usage has dropped to near 0. I find this happens on some machines more than others, but I'm not sure why. Windows seems more problematic than Linux.

Typically, I find that VirtualBox tasks that have run for more than 10 hrs are broken and need aborting. Unfortunately, that often also triggers the project to stop you getting more VB tasks until you click the Allow button again.
ID: 105850
Bryn Mawr
Joined: 26 Dec 18
Posts: 389
Credit: 12,073,013
RAC: 8,289
Message 105853 - Posted: 6 Apr 2022, 9:30:09 UTC - in response to Message 105840.  
Last modified: 6 Apr 2022, 9:33:16 UTC

Subject: +30 hours to run 8 hr tasks

I recently installed VB. I'm on Win 11, Intel i5-11, running VB 6.1.32 (the recommended Boinc version), and all Rosetta tasks are estimated by Boinc to take 8 hrs, which sometimes take 4 hrs, and some take +30 hrs. Is this normal? The ones that take the longest often indicate, for ex, time remaining = 30 min, and 8 hrs later, the time remaining = 20 min -- which seems odd. Do I need to tweak something?


It can take some days for Boinc to get used to your system and how quickly it runs. To help it do so make sure you run the benchmarks.

Having said that, Rosetta differs from other projects in that it has fixed-length jobs rather than jobs with a fixed amount of work. Two things can interfere with that. First, if the task has a low number of starting points, it might run for a shorter time than selected, often 3 hours rather than 8. Second, if the first iteration of the work takes longer than the selected time, the task overruns, because it only checks whether to finish at the end of each iteration.

Ach! I forgot Python and its failings - yes, this is likely to be VB going to sleep and only pretending to work.
ID: 105853
tullio
Joined: 10 May 20
Posts: 63
Credit: 630,125
RAC: 0
Message 105854 - Posted: 6 Apr 2022, 12:53:02 UTC

My last two tasks took 3 hours and 8 minutes and 3 hours and 22 minutes on my Windows 11 Intel i5 with VirtualBox 6.1.32. RAM is 12 GB.
Tullio
ID: 105854
kotenok2000
Joined: 22 Feb 11
Posts: 258
Credit: 483,503
RAC: 133
Message 105855 - Posted: 6 Apr 2022, 12:58:49 UTC - in response to Message 105854.  
Last modified: 6 Apr 2022, 12:59:29 UTC

I got connection reset errors several times while sending a reply.
ID: 105855
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105856 - Posted: 6 Apr 2022, 20:00:10 UTC - in response to Message 105853.  

Look through your stderr file and see if you can find anything unusual: lost communication, or something that burped weirdly during the process. Look for a radical time-change message. Compare your CPU run time vs actual run time where it makes checkpoints. If they are not close, then something went wrong before the point where the values become too far apart to make any sense. See if you can find anything in the text.
If you can't, then it is something BOINC cannot identify, and you can chalk it up to a "bug" from the person who wrote that protein sequence.
ID: 105856
Sid Celery
Joined: 11 Feb 08
Posts: 2117
Credit: 41,140,182
RAC: 15,917
Message 105863 - Posted: 8 Apr 2022, 0:42:39 UTC
Last modified: 8 Apr 2022, 0:46:41 UTC

Repeating my earlier message for those who haven't seen it:
If you have a task you think is stalled or taking a long time, click on it and select properties on the left.
If there's a large difference between CPU time and Elapsed time, then it's stalled and you can only abort it. They <never> restart.

Also, if later tasks are completing before earlier tasks, it's a clue to check Properties of those earlier tasks and, if they've stalled in the way described above, abort them.
This wastes the least amount of processing time.

It's not your fault and there's <nothing> you can do to correct it
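
The CPU time versus Elapsed time comparison can be sketched as a simple ratio test. This is only an illustration: the function name, the 0.5 ratio cut-off, and the 10-minute grace period are assumptions, not values from BOINC or from the project.

```python
def looks_stalled(cpu_seconds: float, elapsed_seconds: float,
                  min_ratio: float = 0.5, grace_seconds: float = 600.0) -> bool:
    """Return True when CPU time lags far behind elapsed time.

    min_ratio and grace_seconds are hypothetical thresholds.
    """
    if elapsed_seconds <= grace_seconds:
        return False  # too early to judge; startup can be slow
    return cpu_seconds / elapsed_seconds < min_ratio


# Hypothetical readings, in seconds, copied from a task's Properties dialog.
healthy = looks_stalled(cpu_seconds=7.5 * 3600, elapsed_seconds=8 * 3600)
stuck = looks_stalled(cpu_seconds=1 * 3600, elapsed_seconds=30 * 3600)
print(healthy, stuck)  # False True
```

A task that shows 1 hour of CPU time after 30 hours elapsed is flagged; one that shows 7.5 hours of CPU time after 8 hours elapsed is not.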

ID: 105863
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,717,270
RAC: 11,974
Message 105865 - Posted: 8 Apr 2022, 7:09:44 UTC - in response to Message 105788.  

Cosmology doesn't need it. I can run Cosmology on all 7 of my machines, most are missing AVX. The only thing that annoys Cosmology is VB 6. VB 5 is ok.

Had a look , the q9450 is on Boinc 7.16.20 so its got VB 6.1.2
I will finish all work and revert/uninstall/nuke back to Boinc 7.14.2 uses VB 5.2.8 to see what happens .
I have got versions of boinc mangler back to 5.10.13
Oh! , that's 45 all together in win/Lin 32/64/VB or not , sad case . . . .
Just in case .
But sometimes they come in usefull .
I stick with the latest Boinc (well actually later than latest because I have contacts) but just change the VB version.
ID: 105865
.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 105873 - Posted: 8 Apr 2022, 23:53:56 UTC - in response to Message 105865.  
Last modified: 9 Apr 2022, 0:01:22 UTC

Reverting did no good , it still won`t run any VB tasks from Rosetta or Cosmo , it trashes them
so put it back to 7.16.20 without VB Looks like a Q9450 haznt got what it takes to run boinc `s kind of VB stuff on the projects that I do .
And I will not be bothering to try out others to see what happens .

Over at Cosmo my opteron16 system does work with 7.16.20 and VB 6.1.2 with little or no bother
ID: 105873
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,717,270
RAC: 11,974
Message 105874 - Posted: 9 Apr 2022, 0:00:05 UTC - in response to Message 105873.  

Cosmology should run on anything if you use VB 5. Look at these old things I'm using on it:

https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=6181958
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=6181968

The second one uses DDR2 RAM! Both run Cosmology perfectly. Your computers are better than those, even your Core 2 Quads are newer than mine.
ID: 105874
.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 105875 - Posted: 9 Apr 2022, 0:07:27 UTC
Last modified: 9 Apr 2022, 0:09:16 UTC

Looks like I waz editing my above post while you where replying to it
I see they are both on win 11 , I wonder if that haz anything to do with it , as the Q9450 iz still on wind7 .
Though the opteron iz on wyn 7 and crunch`s along without any bother , such iz life .
ID: 105875
kotenok2000
Joined: 22 Feb 11
Posts: 258
Credit: 483,503
RAC: 133
Message 105876 - Posted: 9 Apr 2022, 0:15:50 UTC - in response to Message 105875.  

The Q9450 doesn't support AVX. VirtualBox jobs get stuck if AVX is missing.
ID: 105876
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,717,270
RAC: 11,974
Message 105877 - Posted: 9 Apr 2022, 0:15:58 UTC - in response to Message 105875.  

Looks like I waz editing my above post while you where replying to it
I see they are both on win 11 , I wonder if that haz anything to do with it , as the Q9450 iz still on wind7 .
Though the opteron iz on wyn 7 and crunch`s along without any bother , such iz life .
I don't think I was doing Cosmology back on Windows 10. It is possible that 11 helps; I think the way the OS uses the CPU is different. Just get 11 onto them using "Rufus", which can bypass the TPM requirement. I used an 8 GB USB stick to perform an upgrade. You can probably do it just on the hard disk if you don't have one.
ID: 105877
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,717,270
RAC: 11,974
Message 105878 - Posted: 9 Apr 2022, 0:16:35 UTC - in response to Message 105876.  
Last modified: 9 Apr 2022, 0:17:05 UTC

The Q9450 doesn't support AVX. VirtualBox jobs get stuck if AVX is missing.
Not on Cosmology or LHC; that's only a Rosetta requirement. I have 5 computers with no AVX and they all run Cosmology and LHC VB tasks fine.
ID: 105878
bcavnaugh
Joined: 7 Dec 13
Posts: 7
Credit: 2,389,640
RAC: 0
Message 105879 - Posted: 9 Apr 2022, 1:34:52 UTC

Selecting 2 hours seems to really take up to, and even more than, a day.
Can this be fixed?
ID: 105879
robertmiles
Joined: 16 Jun 08
Posts: 1232
Credit: 14,269,631
RAC: 2,588
Message 105880 - Posted: 9 Apr 2022, 3:41:56 UTC - in response to Message 105879.  

Selecting 2 hours seems to really take up to, and even more than, a day.
Can this be fixed?

Each task is divided into sections called decoys. The checks of whether to end a task are normally only run at the end of a decoy, except for a check that the task has run so long that its outputs are unlikely to be very useful.

For the Rosetta tasks, this is probably unfixable. It usually means that the time per decoy for this task is much more than two hours, and you're unlikely to accomplish anything before the first decoy finishes.

For the Python tasks, you can partially fix some of them by comparing the emulated CPU time with the CPU time used by the emulation, and aborting them if they start using much more CPU time for running the emulation than the emulated CPU time, except for a few minutes for starting up the emulation.
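
To illustrate why a 2-hour setting can turn into a day, here is a rough model of the decoy behaviour described above. It is a simplification under stated assumptions, not the project's actual logic: the stop check is assumed to run only when a decoy completes, every decoy is assumed to take the same time, and the separate check for tasks that have run far too long is ignored. The function name is hypothetical.

```python
import math


def expected_runtime_hours(target_hours: float, hours_per_decoy: float) -> float:
    """Estimate total run time when the task can only stop between decoys."""
    if hours_per_decoy <= 0:
        raise ValueError("hours_per_decoy must be positive")
    # At least one decoy always runs to completion before the first check.
    decoys = max(1, math.ceil(target_hours / hours_per_decoy))
    return decoys * hours_per_decoy


print(expected_runtime_hours(2, 0.5))  # 2.0: four short decoys fit the target
print(expected_runtime_hours(2, 26))   # 26: one decoy far exceeds the target
```

In this model the selected run time only matters when a single decoy is shorter than the target; otherwise the first decoy sets the run time.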
ID: 105880
robertmiles
Joined: 16 Jun 08
Posts: 1232
Credit: 14,269,631
RAC: 2,588
Message 105881 - Posted: 9 Apr 2022, 3:48:57 UTC

BOINC is giving me error messages about being unable to run Python tasks because they need very large amounts of free memory to even start, such as about 19 GB. Is this a real requirement for them, or just bad calculations?
ID: 105881
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,717,270
RAC: 11,974
Message 105882 - Posted: 9 Apr 2022, 3:58:38 UTC - in response to Message 105881.  
Last modified: 9 Apr 2022, 4:00:56 UTC

BOINC is giving me error messages about being unable to run Python tasks because they need very large amounts of free memory to even start, such as about 19 GB. Is this a real requirement for them, or just bad calculations?
Sometimes they ask for more than they will actually need, but they do use a lot. For example, I have a 6-core i5 with 16 GB of RAM. It can get 5 Pythons running, then the memory is 80-90% full and it doesn't load the 6th.

19GB sounds unusual, since that would stop any from running on my above machine at all, and I've never seen that.
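
The arithmetic behind that can be sketched as follows. Five tasks filling 80-90% of 16 GB works out to very roughly 2.6-2.9 GB each, ignoring what the OS itself uses. The 2.8 GB per-task figure and the 90% usable fraction below are assumptions for illustration, not project values.

```python
def tasks_that_fit(total_gb: float, per_task_gb: float,
                   usable_fraction: float = 0.9) -> int:
    """Count how many tasks fit in the share of RAM allowed for them."""
    if per_task_gb <= 0:
        raise ValueError("per_task_gb must be positive")
    return int((total_gb * usable_fraction) // per_task_gb)


print(tasks_that_fit(16, 2.8))  # 5: matches the five Pythons seen running
print(tasks_that_fit(16, 19))   # 0: a 19 GB requirement would block every task
```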
ID: 105882



©2024 University of Washington
https://www.bakerlab.org