Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
CPUs have many different combinations of instruction sets, it should be tested for before running! Newer machines work, older machines don't. I'm going to guess the Python app is using newer instruction sets only available on newer processors, and the incompetent fools at Rosetta are handing them out to everybody instead of only those that can handle it. |
JohnDK Joined: 6 Apr 20 Posts: 33 Credit: 2,390,240 RAC: 0 |
I just can't get the Python WUs to work properly on my Linux host; all too often they pause with the VM unmanageable error. I started with 9 WUs, it was suggested that I lower that, and one by one I'm down to 5. Right now I only have 4 left in cache and running, but 3 of them have already paused with the VM error message after a BOINC restart. So the question of having enough RAM doesn't seem to apply, to my PC anyway. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 2,588 |
VirtualBox comes in two major versions, vbox and vbox64. The Python tasks use only the newer of these, vbox64. Since vbox emulates a 32-bit instruction set and vbox64 emulates a 64-bit instruction set, they are not interchangeable. Each is a program, and therefore requires a certain list of instructions from the physical CPU core it runs on. BOINC makes a list of the major groups of instructions available as it starts up. It appears that vbox has been in use long enough that it only uses CPU instructions available on nearly all computers still in use, but vbox64 hasn't. VirtualBox https://www.virtualbox.org/wiki/Downloads https://www.virtualbox.org/ If some of you can identify specific emulated CPU instructions for which emulation fails and shuts down the emulation, you might give the details to Oracle and see if they will fix at least part of the problem, even if Rosetta@Home won't help. The details you send them should include the list of CPU instruction groups produced when BOINC starts up. One thing many of us might send them is a request that when the VM unmanageable error is given, vbox64 should give more details on why. |
zxcvbob Joined: 4 Jan 06 Posts: 8 Credit: 830,878 RAC: 0 |
Are there no 32-bit work units? One of my better systems (that crunches WCG very well when that project is up) has Windows 10 Pro 32-bit. I attached it to R@H several hours ago and it's getting no tasks. It does not have vbox installed, but my 64-bit machine without vbox (or do you call it vbox64?) is getting new work. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
"If some of you can identify specific emulated CPU instructions for which emulation fails and shuts down the emulation, you might give the details to Oracle and see if they will fix at least part of the problem, even if Rosetta@Home won't help." The best way to do this would be for many of us to create a big list of CPUs, their instruction sets as reported by Boinc, and whether they run Python or not. We can then see which instruction is causing the problem. I'll start us off with my 7 machines, add your own please, and I'll shove them all in a spreadsheet and see what's what: Ryzen 9 3900XT, WORKS, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 svm sse4a osvw ibs skinit wdt tce topx page1gb r (I think this is truncated?); i5-8600K, WORKS, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 hle; Core 2 Quad Q8400, DOESN'T WORK, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 syscall nx lm vmx tm2 pbe; Pentium N3700, DOESN'T WORK, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 movebe popcnt aes rdrandsyscall nx lm vmx tm2 pbe smep; Xeon X5650, DOESN'T WORK, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx smx tm2 dca pbe; i3 M350, DOESN'T WORK, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt syscall nx lm vmx tm2 pbe |
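A rough sketch of the comparison proposed in the post above, in Python: keep the flags that every working machine reports, then discard any that also show up on a failing machine. The names `candidate_flags`, `BASE` and `REPORTS` are made up for illustration. The flag names are copied from the post exactly as BOINC printed them, and the truncated trailing "r" on the Ryzen 9 line is left out. With only six machines this can point at a correlation at most, not a cause.

```python
# Flags shared by all six machines in the post (pulled out only to shorten the lines below).
BASE = ("fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
        "pse36 clflush mmx fxsr sse sse2")

# (machine, runs Python tasks?, flags as reported by BOINC).
# Spellings such as "movebe" and the fused "rdrandsyscall" are kept as posted.
REPORTS = [
    ("Ryzen 9 3900XT", True, BASE + " htt pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes "
     "f16c rdrandsyscall nx lm avx avx2 svm sse4a osvw ibs skinit wdt tce topx page1gb"),
    ("i5-8600K", True, BASE + " dts acpi ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe "
     "popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 hle"),
    ("Core 2 Quad Q8400", False, BASE + " dts acpi ss htt tm pni ssse3 cx16 sse4_1 syscall "
     "nx lm vmx tm2 pbe"),
    ("Pentium N3700", False, BASE + " dts acpi ss htt tm pni ssse3 cx16 sse4_1 sse4_2 movebe "
     "popcnt aes rdrandsyscall nx lm vmx tm2 pbe smep"),
    ("Xeon X5650", False, BASE + " dts acpi ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes "
     "syscall nx lm vmx smx tm2 dca pbe"),
    ("i3 M350", False, BASE + " dts acpi ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt "
     "syscall nx lm vmx tm2 pbe"),
]


def candidate_flags(reports):
    """Flags present on every working machine and absent from every failing one."""
    working = [set(flags.split()) for _, ok, flags in reports if ok]
    failing = [set(flags.split()) for _, ok, flags in reports if not ok]
    if not working:
        return set()
    in_all_working = set.intersection(*working)
    in_any_failing = set().union(*failing)
    return in_all_working - in_any_failing


print(sorted(candidate_flags(REPORTS)))
```

On these six machines it prints `['avx', 'avx2', 'f16c', 'fma']`. That fits the newer-instruction-set guess, but it would take many more reports before it meant much.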
Grant (SSSF) Joined: 28 Mar 20 Posts: 1673 Credit: 17,602,814 RAC: 22,347 |
"Are there no 32-bit work units?" Work units are just data that can be processed by any software that has been written to process it. Rosetta has both 32-bit and 64-bit applications. Non-Python Rosetta 4.20 tasks are very rare; it's just the luck of the draw whether your system happens to request work when some are actually available. Grant Darwin NT |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
"Are there no 32-bit work units?" "Work units are just data that can be processed by any software that has been written to process it." My non-Python-capable machines are attached to Rosetta so if 4.2 appears, they grab it, since Boinc will have a work debt for that project. But I give them other projects to do as well. |
kotenok2000 Joined: 22 Feb 11 Posts: 258 Credit: 483,503 RAC: 133 |
Ryzen 3 3100, works fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero wbnoinvd arat npt svm_lock nrip_save vmcb_clean flushbyasid decodeassists umip rdpid overflow_recov succor used cat /proc/cpuinfo on vmware workstation |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
"Ryzen 3 3100, works fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero wbnoinvd arat npt svm_lock nrip_save vmcb_clean flushbyasid decodeassists umip rdpid overflow_recov succor" Thanks, building spreadsheet, more people add please. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
"Ryzen 3 3100, works fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero wbnoinvd arat npt svm_lock nrip_save vmcb_clean flushbyasid decodeassists umip rdpid overflow_recov succor" You're getting more info than I got from Boinc. |
kotenok2000 Joined: 22 Feb 11 Posts: 258 Credit: 483,503 RAC: 133 |
When I run cat /proc/cpuinfo on WSL1, I get this: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave osxsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni umip rdpid |
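For anyone collecting these lists by script rather than by copy and paste, here is a minimal Python sketch. It assumes an x86 Linux-style /proc/cpuinfo, where each processor block carries a line whose key is "flags". The helper name `cpu_flags` is hypothetical, and the sample text is cut down for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (for example, not an x86 Linux host)


# Cut-down sample in the same "key : value" layout as /proc/cpuinfo.
SAMPLE = "processor\t: 0\nmodel name\t: example CPU\nflags\t\t: fpu vme sse2 avx avx2\n"

print(sorted(cpu_flags(SAMPLE)))
```

On a real Linux host the same function could be fed `open("/proc/cpuinfo").read()`. As noted further down the thread, this list can differ from what BOINC itself reports at startup.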
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
"When I run cat /proc/cpuinfo on wsl1 i get this" Maybe we should stick to what Boinc shows. That's presumably what's important for Boinc. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Processor: 16 AuthenticAMD AMD Ryzen 7 3700X 8-Core Processor [Family 23 Model 113 Stepping 0] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 svm sse4a osvw ibs skinit wdt tce topx page1gb rdtscp fsgsbase bmi1 smep bmi2. Runs 64-bit Win10. Can run Python just fine if the task data is OK. |
Jean-David Beyer Joined: 2 Nov 05 Posts: 187 Credit: 6,370,872 RAC: 5,700 |
When I run cat /proc/cpuinfo on wsl1 i get this fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave osxsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni umip rdpid I get way more than you do, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req avx512_vnni md_clear flush_l1d arch_capabilities Does the difference matter? |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 2,588 |
"When I run cat /proc/cpuinfo on wsl1 i get this" Maybe. We are still trying to determine which entries on the list are important. You might add how often your computer runs Python tasks properly to help determine this. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
WTF?!?! I get booted off Python for an error in computing 2 days ago and an abort 4 days before that? What the hell is up with that? Really stupid shit is happening now. What's this? 2022-03-23 15:16:13 (3364): Starting VM using VBoxManage interface. (boinc_d09093615cdfbe0f, slot#9) 2022-03-23 15:16:13 (3364): Error in start VM for VM: -2135228415 Command: VBoxManage -q startvm "boinc_d09093615cdfbe0f" --type headless Output: VBoxManage.exe: error: Could not find a registered machine named 'boinc_d09093615cdfbe0f' VBoxManage.exe: error: Details: code VBOX_E_OBJECT_NOT_FOUND (0x80bb0001), component VirtualBoxWrap, interface IVirtualBox, callee IUnknown VBoxManage.exe: error: Context: "FindMachine(Bstr(pszVM).raw(), machine.asOutParam())" at line 722 of file VBoxManageMisc.cpp 2022-03-23 15:16:13 (3364): VM failed to start. 2022-03-23 15:16:13 (3364): Could not start 2022-03-23 15:16:13 (3364): ERROR: VM failed to start 2022-03-23 15:16:13 (3364): Powering off VM. 2022-03-23 15:16:13 (3364): Deregistering VM. (boinc_d09093615cdfbe0f, slot#9) 2022-03-23 15:16:14 (3364): Removing network bandwidth throttle group from VM. 2022-03-23 15:16:14 (3364): Removing VM from VirtualBox. 2022-03-23 15:16:13 (3364): Command: VBoxManage -q showvminfo "boinc_d09093615cdfbe0f" --machinereadable Exit Code: -108 Output: 2022-03-23 15:16:13 (3364): Command: VBoxManage -q startvm "boinc_d09093615cdfbe0f" --type headless Exit Code: -2135228415 Output: VBoxManage.exe: error: Could not find a registered machine named 'boinc_d09093615cdfbe0f' VBoxManage.exe: error: Details: code VBOX_E_OBJECT_NOT_FOUND (0x80bb0001), component VirtualBoxWrap, interface IVirtualBox, callee IUnknown VBoxManage.exe: error: Context: "FindMachine(Bstr(pszVM).raw(), machine.asOutParam())" at line 722 of file VBoxManageMisc.cpp 2022-03-23 15:16:14 (3364): Command: VBoxManage -q snapshot "boinc_d09093615cdfbe0f" list Exit Code: -2135228415 Output: VBoxManage.exe: error: Could not find a registered machine named 'boinc_d09093615cdfbe0f' VBoxManage.exe: error: Details: code VBOX_E_OBJECT_NOT_FOUND (0x80bb0001), component VirtualBoxWrap, interface IVirtualBox, callee IUnknown VBoxManage.exe: error: Context: "FindMachine(bstrMachine.raw(), pMachine.asOutParam())" at line 333 of file VBoxManageSnapshot.cpp 2022-03-23 15:16:14 (3364): Command: VBoxManage -q bandwidthctl "boinc_d09093615cdfbe0f" remove "boinc_d09093615cdfbe0f_net" Exit Code: -2135228415 Output: VBoxManage.exe: error: Could not find a registered machine named 'boinc_d09093615cdfbe0f' VBoxManage.exe: error: Details: code VBOX_E_OBJECT_NOT_FOUND (0x80bb0001), component VirtualBoxWrap, interface IVirtualBox, callee IUnknown VBoxManage.exe: error: Context: "FindMachine(Bstr(a->argv[0]).raw(), machine.asOutParam())" at line 320 of file VBoxManageBandwidthControl.cpp 2022-03-23 15:16:14 (3364): Command: VBoxManage -q unregistervm "boinc_d09093615cdfbe0f" --delete Exit Code: -2135228415 Output: VBoxManage.exe: error: Could not find a registered machine named 'boinc_d09093615cdfbe0f' VBoxManage.exe: error: Details: code VBOX_E_OBJECT_NOT_FOUND (0x80bb0001), component VirtualBoxWrap, interface IVirtualBox, callee IUnknown VBoxManage.exe: error: Context: "FindMachine(Bstr(VMName).raw(), machine.asOutParam())" at line 150 of file VBoxManageMisc.cpp 15:16:25 (3364): called boinc_finish(-2135228415) https://boinc.bakerlab.org/rosetta/result.php?resultid=1480251240 https://boinc.bakerlab.org/rosetta/result.php?resultid=1480014826 This one I killed off because it died: Run time 3 hours 25 min 47 sec, CPU time 4 sec. Both their fault and I get the hit for it? Cheap. |
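The exit code -2135228415 and the 0x80bb0001 printed next to VBOX_E_OBJECT_NOT_FOUND in the log above are the same 32-bit value, shown once as a signed integer and once in hex. A small Python check (the helper name `as_unsigned32` is made up here):

```python
def as_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as its unsigned 32-bit bit pattern."""
    return code & 0xFFFFFFFF


# Exit code taken from the log above.
print(hex(as_unsigned32(-2135228415)))  # 0x80bb0001
```

So every later VBoxManage command in that log is failing with the same "registered machine not found" error as the first startvm call, not with a new one each time.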
Jean-David Beyer Joined: 2 Nov 05 Posts: 187 Credit: 6,370,872 RAC: 5,700 |
"Maybe. We are still trying to determine which entries on the list are important. You might add how often your computer runs Python tasks properly to help determine this." Sorry. I am unwilling to install VirtualBox on my machine, so I do not run any. Wed 23 Mar 2022 11:20:40 PM EDT | Rosetta@home | Message from server: VirtualBox is not installed |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
"Maybe. We are still trying to determine which entries on the list are important. You might add how often your computer runs Python tasks properly to help determine this." Be sure to take that setting off your profile so the project does not try to send you pythons. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1673 Credit: 17,602,814 RAC: 22,347 |
"Be sure to take that setting off your profile so the project does not try to send you pythons." "Maybe. We are still trying to determine which entries on the list are important. You might add how often your computer runs Python tasks properly to help determine this." ??? VirtualBox isn't installed, so it doesn't try to send any Python Tasks. Grant Darwin NT |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
"WTF?!?! I get booted off Python for an error in computing 2 days ago and an abort 4 days before that?" I thought you had to return 100 faulty tasks to get booted? You can just reset it anyway. |
©2024 University of Washington
https://www.bakerlab.org