Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
Why can't message text be included in email notification?

I'll guess either server load or stupidity.
Rachael Lines Send message Joined: 11 Feb 22 Posts: 2 Credit: 2,865 RAC: 0 |
Hey everyone, I started crunching yesterday, but four of my tasks are showing a computation error. Is this normal, or do I need to suspend each one before shutting down my computer? Is turning off my PC what caused this? Thanks in advance.
kotenok2000 Send message Joined: 22 Feb 11 Posts: 258 Credit: 483,503 RAC: 133 |
You need to enable SVM in the BIOS to compute VirtualBox apps, and reset the switch that automatically disables VirtualBox apps when the host fails to compute them: at https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=6177189 press "Allow".
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1673 Credit: 17,609,434 RAC: 22,266 |
Hey everyone

It has tried to process Python Tasks, and they require VirtualBox in order to run. Your system has VirtualBox, but is having problems running it.

Waiting for VM "boinc_722c34e89dac8a69" to power on...
VBoxManage.exe: error: Not in a hypervisor partition (HVP=0) (VERR_NEM_NOT_AVAILABLE).
VBoxManage.exe: error: AMD-V is disabled in the BIOS (or by the host OS) (VERR_SVM_DISABLED)
VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component ConsoleWrap, interface IConsole
2022-02-11 18:14:31 (7376): VM failed to start.
2022-02-11 18:14:31 (7376): Could not start
2022-02-11 18:14:31 (7376): ERROR: VM failed to start
2022-02-11 18:14:31 (7376): Powering off VM.
2022-02-11 18:14:31 (7376): Deregistering VM. (boinc_722c34e89dac8a69, slot#4)
2022-02-11 18:14:31 (7376): Removing network bandwidth throttle group from VM.
2022-02-11 18:14:31 (7376): Removing VM from VirtualBox.

I'd suggest checking your BIOS to make sure virtualisation is enabled, and then make sure that Hyper-V isn't enabled (under Windows Features). This may be of use (similar hardware & OS and what they had to do to get VirtualBox to work).

If you can get it working, you will be limited in the number of Python Tasks you can process due to the amount of RAM you have - from memory, at least 3GB of RAM is required per Task to start processing (even though they actually use much less). You will also probably need to increase the default amount of disk space BOINC can use if you want to process more than a few Python Tasks - just under 8GB of disk space is required per Task being processed.

You shouldn't have any issues processing Rosetta 4.20 Tasks (unless we get some that require more RAM than the present ones), but until the last couple of days they have been pretty much non-existent for the last few months.

I would also suggest running the BOINC Manager benchmarks - they are used to determine the amount of Credit you get for doing work, and your system is showing just the default values.

Grant
Darwin NT
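For anyone who wants to sanity-check those limits against their own machine, here is a minimal Python sketch. It is not project code; the 3GB RAM and 8GB disk per-task figures are simply the approximate numbers quoted above, and the example host specs are made up.

# Rough estimate of how many Rosetta Python (VirtualBox) tasks a host could run
# at once. The ~3 GB RAM and ~8 GB disk per task are the approximate figures
# mentioned above, not official project limits.

def max_python_tasks(ram_gb: float, boinc_disk_gb: float, cpu_threads: int,
                     ram_per_task_gb: float = 3.0,
                     disk_per_task_gb: float = 8.0) -> int:
    """Return the smallest of the RAM, disk and CPU-thread limits."""
    by_ram = int(ram_gb // ram_per_task_gb)
    by_disk = int(boinc_disk_gb // disk_per_task_gb)
    return max(0, min(by_ram, by_disk, cpu_threads))

# Hypothetical examples:
print(max_python_tasks(ram_gb=8, boinc_disk_gb=10, cpu_threads=4))    # -> 1 (disk-limited)
print(max_python_tasks(ram_gb=32, boinc_disk_gb=80, cpu_threads=16))  # -> 10 (RAM-limited)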
Rachael Lines Send message Joined: 11 Feb 22 Posts: 2 Credit: 2,865 RAC: 0 |
They were the Rosetta Python ones that were showing the error; I have some of the Rosetta 4.20 tasks working now. I will have a better look at it this afternoon. Thanks for the info.
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,524,889 RAC: 7,500 |
All PcrV10AA_PcrV_HYF_ tasks fail after a few seconds: <stderr_txt>
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1673 Credit: 17,609,434 RAC: 22,266 |
All PcrV10AA_PcrV_HYF_ tasks fail after a few seconds:

I've got plenty of _PcrV_ Tasks that have been processed and Validated, but around 50% of them crashed and burned within seconds of starting.

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00007FF7DA118316 read attempt to address 0xFFFFFFFF

Grant
Darwin NT
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,524,889 RAC: 7,500 |
I've got plenty of _PcrV_ Tasks that have been processed and Validated, but around 50% of them crashed and burned within seconds of starting.

+1. Now some of these WUs are running...
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Everything from that protein died and my wingmen had the same errors. Very good of them to dump untested tasks on the server. Thought they dumped them on RALPH first and if he liked them then they came to Rosie. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,524,889 RAC: 7,500 |
Thought they dumped them on RALPH first and if he liked them then they came to Rosie.

Completely agree with you. Ralph is VERY underused.
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,149,199 RAC: 15,933 |
Finished the latest batch of Rosetta 4.20 tasks, so flicked back to WCG tasks automatically...

Yeah, I didn't read WCG's recent announcement properly. I thought it was going to be down from 14th to 28th February, not stop sending tasks that will complete in that period and then have the whole project be down until April 22nd. Already completed everything, ffs...

I'm going to have to install Virtual Box and give that another try, aren't I. God help me.
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,149,199 RAC: 15,933 |
Finished the latest batch of Rosetta 4.20 tasks, so flicked back to WCG tasks automatically...

Was just about to say I completed my first tasks (which I did) when something crashed and all my remaining tasks errored out:

16/02/2022 2:41:27 | Rosetta@home | [error] MD5 check failed for AIMNet_vm_v2.vdi

Still, better than my previous attempts. I had up to 9 tasks running at a time within 32GB RAM on my 8C/16T machine.
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,524,889 RAC: 7,500 |
I'm going to have to install Virtual Box and give that another try, aren't I

TN-Grid?? SiDock?
computezrmle Send message Joined: 9 Dec 11 Posts: 63 Credit: 9,680,103 RAC: 0 |
16/02/2022 2:41:27 | Rosetta@home | [error] MD5 check failed for AIMNet_vm_v2.vdi

Looks like the vdi image got damaged and needs to be refreshed. Best would be to:
- Shut down BOINC
- Delete AIMNet_vm_v2.vdi from the projects directory
- Restart BOINC

This will initiate a fresh download of the compressed image (~2 GB), which will then be expanded to 6.9 GB.
Whenever a fresh task starts, AIMNet_vm_v2.vdi is run through the checksum calculator (MD5 check) and the result is compared to the checksum sent by the project. Only if that check succeeds is the image copied to a slot directory and renamed vm_image.vdi, which is what the task actually uses.
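If you would rather confirm the image really is damaged before deleting it, one way is to compute the MD5 yourself and compare it against the checksum the project lists for the file. A minimal sketch, assuming a default Windows BOINC data directory (adjust the path for your own install; the expected checksum below is a placeholder, not the real value):

import hashlib
from pathlib import Path

def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """MD5 a file in chunks so a multi-GB .vdi doesn't have to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder path and checksum - substitute your own BOINC data directory
# and the MD5 the project publishes for this file.
image = Path(r"C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta\AIMNet_vm_v2.vdi")
expected = "0123456789abcdef0123456789abcdef"  # hypothetical value

actual = md5_of_file(image)
print("OK" if actual == expected else f"MISMATCH: {actual}")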
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,149,199 RAC: 15,933 |
I'm going to have to install Virtual Box and give that another try, aren't I

I couldn't even access the home page of Sidock. TN-Grid may be something, but I'm not sure what. I'll persist here for a while longer.
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,149,199 RAC: 15,933 |
16/02/2022 2:41:27 | Rosetta@home | [error] MD5 check failed for AIMNet_vm_v2.vdi

A new version of AIMNet_vm_v2.vdi comes down after updating, with all the attributes you mention, but without a shutdown and restart. I've had a few Rosetta 4.20 and WCG tasks dribble through too.

I overclock my PC, and sometimes these checksum errors have been associated with overclocking, so I'm wary of that factor. At the same time, VBox tasks seem slightly less demanding than Rosetta tasks and I'm running a lot cooler with VBox, so maybe not. And I'm sure that only being able to run 8 or 9 tasks at a time rather than 16 plays into that too.

I've completed all the Rosetta and WCG tasks I got, but now I only have 2 VBox tasks and none further will download. Is it normal for VBox tasks to only be available intermittently? I'm getting what I'm getting and it's not the complete failure it was when I first tried. I'll give it a few more days.

Edit: I've had to click "Allow" on my PC's profile. I guess all the crashed tasks tripped it to restrict downloads.
Yup, that's done it.
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,524,889 RAC: 7,500 |
I couldn't even access the home page of Sidock.

SiDock is in maintenance today, but will be back online soon.

TN-Grid may be something, but I'm not sure what.

It's a historical BOINC project about gene networks: http://gene.disi.unitn.it/test/
Killersocke@rosetta Send message Joined: 13 Nov 06 Posts: 29 Credit: 2,579,125 RAC: 0 |
World Community Grid: WCG Data Transfer Underway, Stress Test of New Infrastructure Scheduled For Feb 28th
We have started to transfer data for all active WCG projects to the Krembil Research Institute. We are gearing up to start testing the whole system on February 28, 2022.
09.02.2022 20:41:43 · read more...

--------------------------------------------------------------------------------

SiDock@home: Technical maintenance on February 15th
Hello! Additional server maintenance planned on February 15th, for several hours.
14.02.2022 22:57:41 · read more...
BoredEEdude Send message Joined: 11 Apr 12 Posts: 11 Credit: 38,954,694 RAC: 5 |
I have been running Rosetta on multiple computers for years, and it has been a mostly hands-off background task requiring minimal supervision. For the past few months, Rosetta work units have been unavailable for days on end. No errors are shown, just "got 0 new tasks".

2/16/2022 11:45:16 AM | Rosetta@home | update requested by user
2/16/2022 11:45:20 AM | Rosetta@home | Sending scheduler request: Requested by user.
2/16/2022 11:45:20 AM | Rosetta@home | Requesting new tasks for CPU
2/16/2022 11:45:22 AM | Rosetta@home | Scheduler request completed: got 0 new tasks
2/16/2022 11:45:22 AM | Rosetta@home | No tasks sent
2/16/2022 11:45:22 AM | Rosetta@home | Project requested delay of 31 seconds

When this happens, the online server status page shows approximately 5000 tasks ready to send, a large number (~100k) of tasks in progress, and little server-side processing occurring.

Computing status
Work
Tasks ready to send: 4992
Tasks in progress: 115529
Workunits waiting for validation: 0
Workunits waiting for assimilation: 1
Workunits waiting for file deletion: 1
Tasks waiting for file deletion: 1
Transitioner backlog (hours): 0.00

It seems that whenever the number of available tasks gets down to around 5000, all work units are considered sent, and the server backend is just waiting for completed work to be returned. I don't recall ever seeing the available tasks go down to zero. When I do eventually get some tasks, everything runs as expected locally until all tasks are finished. Then I go idle for days waiting for more tasks to become available.

It seems to me that the project is just not generating as much work for all of its users these days. I don't know if that is because the number of work units is down, or there are many more users available to process the same number of generally available units, or if the type of work has changed and I am unaware of what my system is lacking so it can be sent some of these "new" types of tasks now being made available. Is there a checklist somewhere that I can use to verify my system is set up correctly? Because my BOINC Manager currently thinks everything is running just fine.

I used to run Rosetta work exclusively. But to keep my computers occupied (non-idle) I have since added other projects so I can pick up other tasks when no Rosetta tasks are available. The downside is that when Rosetta tasks are available, these other projects dilute the amount of resources I can devote to Rosetta in the hands-off processing approach I prefer, as all projects now have to share the available CPU time.

If many Rosetta users are running out of work, but there are still tens or hundreds of thousands of tasks still in progress, can Rosetta start limiting the number of tasks sent to individual users (even if they are willing to backlog a large number of tasks locally)? I have seen other projects where tasks were only generated in large bursts, and the users knew to backlog days' or weeks' worth of tasks since the server would quickly run out of new tasks to send out. The result was that if you didn't stockpile tasks during the initial big release, you would virtually never see any tasks unless BOINC happened to check in during a new big release of tasks days or weeks in the future. Limiting the size of individual user backlogs would spread the available work out across all the available users. That would help retain more users, since everyone would feel like they are contributing to the project.
At this point, I feel like I'm getting sidelined with no work, while others are sitting on a lot of work units they cannot run immediately. And the rate of results getting back to Rosetta is delayed unnecessarily while it waits for the return of backlogged tasks from a few users instead of sending them to idle machines. My Rosetta@home statistics graph clearly shows 3 bursts of activity over a total of 8 days within the past 30 days. That leaves me sitting idle for 22 days (or about 75% of that time). My main PC (which the graph comes from) is capable of running 16 concurrent tasks in 32 GB of RAM at ~3.5 GHz CPU speed, so while I can normally complete many concurrent tasks in about 8 hours, for 75% of the month Rosetta gets ZERO results from me for lack of tasks to run.

https://drive.google.com/file/d/1X5aBWy0xj2wgV7DpF9tqjrRg8i8E-XEY/view
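For anyone curious how often their own work requests come back empty, the "got 0 new tasks" replies quoted above can be counted from a saved copy of the BOINC event log. A rough, unofficial sketch (the log path is just the usual Windows default; adjust it for your own setup):

from pathlib import Path

# Typical Windows location of the BOINC event log; adjust for your own install.
log_path = Path(r"C:\ProgramData\BOINC\stdoutdae.txt")

empty = 0      # Rosetta scheduler replies that returned no work
with_work = 0  # Rosetta scheduler replies that returned at least one task

for line in log_path.read_text(errors="ignore").splitlines():
    if "Rosetta@home" in line and "Scheduler request completed" in line:
        if "got 0 new tasks" in line:
            empty += 1
        else:
            with_work += 1

print(f"Empty replies: {empty}, replies with work: {with_work}")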
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
I couldn't even access the home page of Sidock.

QuChem has been offline for 3 days now... must have blown something up to be offline this long. No webserver, no project server... dead.