Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
kotenok2000 Joined: 22 Feb 11 Posts: 258 Credit: 483,503 RAC: 219 |
I see this in stderr.txt:
command: projects/boinc.bakerlab.org_rosetta/rosetta_beta_6.04_windows_x86_64.exe @7ahall_e_hal_7aa_15545_d239_0001.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937
Extracting in project directory: database_0f7f01a1b07.zip
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/rotamer/bbdep02.May.sortlib Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/rotamer/peptoid_rotlibs/001.rotlib Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/chemical/pdb_components/components.R.cif Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/chemical/pdb_components/components.6.cif Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/chemical/pdb_components/components.1.cif Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/chemical/pdb_components/components.D.cif Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/chemical/pdb_components/components.V.cif Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/chemical/pdb_components/components.4.cif Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/sampling/disulfide_jump_database_wip.dat Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/sampling/fragpicker_rama_tables/L_QP.counts.gz Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/sampling/vall.jul19.2011.torsions.gz Permission denied
error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07/database/protocol_data/tensorflow_graphs/gcn_test_model/gcn_test_model_plot.png Permission denied
Extracting in slot directory: minirosetta_database.zip
Using database: database
It looks like every task tried to extract the database to E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/database_0f7f01a1b07 at the same time, gave up, and extracted to the slot directory instead. |
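For anyone comparing their own stderr.txt against the log above, here is a minimal sketch that counts the "cannot delete old ... Permission denied" failures per directory. The function name `summarize_delete_errors` and the shortened `sample` string are illustrative only; the regular expression assumes the message layout shown in the excerpt and has not been checked against other unzip versions.

```python
import re
from collections import Counter

# Matches one failure in the form seen in the stderr.txt excerpt:
#   error: cannot delete old <path> Permission denied
# Assumption: paths contain no whitespace, as in the excerpt.
PATTERN = re.compile(r"error:\s+cannot delete old\s+(\S+)\s+Permission denied")


def summarize_delete_errors(stderr_text: str) -> Counter:
    """Count 'cannot delete old' failures, grouped by parent directory."""
    counts = Counter()
    for path in PATTERN.findall(stderr_text):
        parent = path.rsplit("/", 1)[0]  # drop the file name
        counts[parent] += 1
    return counts


# Three entries copied from the excerpt, joined the way the log was posted.
sample = (
    "error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/"
    "database_0f7f01a1b07/database/rotamer/bbdep02.May.sortlib Permission denied "
    "error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/"
    "database_0f7f01a1b07/database/chemical/pdb_components/components.R.cif Permission denied "
    "error: cannot delete old E:/ProgramData/BOINC/projects/boinc.bakerlab.org_rosetta/"
    "database_0f7f01a1b07/database/chemical/pdb_components/components.6.cif Permission denied"
)

for directory, n in summarize_delete_errors(sample).most_common():
    print(n, directory)
```

If several tasks that started at the same moment all report failures in the same directories, that would be consistent with the simultaneous-extraction guess, though it does not prove it.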
Sid Celery Joined: 11 Feb 08 Posts: 2114 Credit: 41,105,271 RAC: 21,658 |
It's a strange thing, but every time tasks run out recently, another ~million seem to be added to the queue to keep us going. I realise many have given up a bit on expecting reliability from Rosetta, but it almost seems like someone is paying a little attention on the quiet. Or maybe I'm just wishing that was the case. Either way, it's appreciated. And there are still enough people around to blast through and return them quickly too. (Comparatively) good times... |
kotenok2000 Joined: 22 Feb 11 Posts: 258 Credit: 483,503 RAC: 219 |
When I run the graphics app, it closes immediately. I see this in stderrgfx.txt:
user@ubuntu:/var/lib/boinc/slots/19$ cat stderrgfx.txt
ERROR: Unable to open file: /var/lib/boinc/projects/boinc.bakerlab.org_rosetta/../database/chemical/residue_type_sets/fa_standard/residue_types.txt
ERROR:: Exit from: src/core/chemical/GlobalResidueTypeSet.cc line: 145
13:38:44 (25733): called boinc_finish(0) |
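The path in that error contains a `..` component, so it is worth seeing where it actually points. The sketch below only normalizes the string lexically with the standard library's `posixpath.normpath`; it does not touch the filesystem or follow symlinks, and the variable names are illustrative.

```python
import posixpath

# Path copied from the stderrgfx.txt excerpt.
reported = (
    "/var/lib/boinc/projects/boinc.bakerlab.org_rosetta/../database/"
    "chemical/residue_type_sets/fa_standard/residue_types.txt"
)

# Lexical normalization: "boinc.bakerlab.org_rosetta/.." cancels out.
resolved = posixpath.normpath(reported)
print(resolved)
```

The normalized path sits directly under /var/lib/boinc/projects/, not under the Rosetta project directory or the slot directory. That may mean the graphics app is looking for the database somewhere other than where it was extracted, but that is an inference from the path alone, not something the log confirms.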
[VENETO] boboviz Joined: 1 Dec 05 Posts: 1991 Credit: 9,520,400 RAC: 12,860 |
"And there are still enough people around to blast through and return them quickly too."
+1. But after months and hundreds of thousands of WUs, maybe it's time to let the app out of the beta stage. |
kotenok2000 Joined: 22 Feb 11 Posts: 258 Credit: 483,503 RAC: 219 |
Tasks finish in 3 hours for me. I have set "Target CPU run time" to "not selected" |
just1vet Joined: 13 Nov 05 Posts: 4 Credit: 3,148,987 RAC: 0 |
Big problems with Rosetta on Linux Mint 20 and 21. I had to remove the project from the client on both of my machines. It would freeze the computers to the point where they had to be rebooted, only to lock up again as soon as they started on Rosetta. I narrowed it down to the Rosetta project after replacing the hard drive, motherboard and RAM. They run fine on the other projects. This has been going on for a while. It runs fine on my Windows computers. Any ideas? |
kotenok2000 Joined: 22 Feb 11 Posts: 258 Credit: 483,503 RAC: 219 |
Maybe you can reduce the number of CPUs allocated from 100% to 75%? |
Sid Celery Joined: 11 Feb 08 Posts: 2114 Credit: 41,105,271 RAC: 21,658 |
"Tasks finish in 3 hours for me."
Has something changed then? I noticed this elsewhere too. Doesn't apply to me - since tasks have been harder to come by I've changed my default to 12hrs. Maybe it's time to be explicit on runtime and change it to 8hrs, rather than let it run at a dodgy default value. |
just1vet Joined: 13 Nov 05 Posts: 4 Credit: 3,148,987 RAC: 0 |
Right now they are 32 cores with 16GB of RAM, which should be enough for crunching. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,506,871 RAC: 24,507 |
"Right now they are 32 core with 16gb of RAM. Which should be enough for crunching."
Some Rosetta 4.20 Tasks require over 2GB of RAM. 32*2 = way more than 16GB. Although the larger RAM Tasks have been very few and far between, 500MB to 1GB has been the usual range for Rosetta 4.20 Tasks lately. And 32*.5 = all your RAM. 16GB RAM on a system with 64 cores/threads is way, way, way too little.
Grant
Darwin NT |
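The arithmetic in that reply can be sketched as below, using the figures quoted in the thread (32 threads, 16GB of RAM, and 0.5GB to 2GB per task). The function `max_concurrent_tasks` and its `reserve_gb` parameter are hypothetical names for illustration; real per-task memory use varies from task to task, so treat the result as a rough bound only.

```python
def max_concurrent_tasks(total_ram_gb: float,
                         ram_per_task_gb: float,
                         reserve_gb: float = 0.0) -> int:
    """Return how many tasks fit in RAM after setting aside reserve_gb.

    reserve_gb is a hypothetical allowance for the OS and other programs.
    """
    usable = max(total_ram_gb - reserve_gb, 0.0)
    return int(usable // ram_per_task_gb)


threads = 32        # from the thread: "32 core"
total_ram_gb = 16   # from the thread: "16gb of RAM"

for per_task in (0.5, 1.0, 2.0):
    # Never run more tasks than there are threads.
    fits = min(max_concurrent_tasks(total_ram_gb, per_task), threads)
    pct = 100 * fits / threads
    print(f"{per_task} GB/task: {fits} tasks ({pct:.0f}% of {threads} threads)")
```

With 0.5GB tasks all 32 threads fit only if nothing else on the machine needs memory; with 1GB tasks 16 fit, and with 2GB tasks 8 fit. Under these assumptions, lowering the CPU percentage in the BOINC preferences to roughly the printed value, or adding RAM, would keep the tasks within physical memory.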
RDTSC Joined: 29 Jan 24 Posts: 4 Credit: 596,739 RAC: 16,380 |
A flock of work units arrived recently that are behaving oddly, well, all but one of them... I have one machine, a workstation with an Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz running Arch Linux, which crunches Rosetta and WCG packets fine. A few months ago, I added a really old dual Intel(R) Xeon(TM) CPU 2.80GHz machine (an old Dell server running the latest Ubuntu Server LTS). The old machine was getting Rosetta Beta workunits and choking on them: error, error, error... It was able to crunch through several non-beta workunits, though. I thought it was the old CPUs, like an unsupported instruction or something. Reading this, I'm now thinking it was bad workunits. |
Sid Celery Joined: 11 Feb 08 Posts: 2114 Credit: 41,105,271 RAC: 21,658 |
And we're back... Looks like the whole website went down for about 10 hours today. Couldn't even get to the Rosetta home page, let alone upload results. Everything is going through fine now. |
GDB Joined: 5 Oct 17 Posts: 1 Credit: 4,185,976 RAC: 8,428 |
All my units are getting validate errors now. |
MStenholm Joined: 18 Apr 20 Posts: 18 Credit: 25,821,080 RAC: 27,889 |
GDB: you are not alone in having all returned results get validate errors. The top 10 computers I checked, plus my own, got the same verdict. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,506,871 RAC: 24,507 |
Yep, the Validator is borked. For me, anything returned from 3 Apr 2024, 22:02:46 UTC onwards fails, and a quick look at the top computers shows the same thing: everything going back at present fails Validation. If someone could get the Project's attention?
Grant
Darwin NT |
[VENETO] boboviz Joined: 1 Dec 05 Posts: 1991 Credit: 9,520,400 RAC: 12,860 |
"If someone could get the Project's attention?"
+1. After more than 60 WUs failed some hours ago, I'm ready to upload about ten more. Should I stop the upload? |
Daniel Graf Joined: 2 Nov 05 Posts: 10 Credit: 68,374,886 RAC: 78,958 |
Let's see if these work units are still credited. But I have the feeling that after calculating they will go straight into the trash can. Unfortunately, one computer will be running until this afternoon and will probably only produce garbage before I can separate it from Rosetta. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,506,871 RAC: 24,507 |
"Let's see if these work units are still credited."
If it's not Valid, there is no Credit.
Grant
Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,506,871 RAC: 24,507 |
"Have i to stop the upload?"
It will stop you from getting new work, but it is the only way to keep returned work from failing Validation until the project fixes the issue. They could also re-run the validation of the presently failed Tasks, but I don't like the odds of that actually happening.
Grant
Darwin NT |
[VENETO] boboviz Joined: 1 Dec 05 Posts: 1991 Credit: 9,520,400 RAC: 12,860 |
"It will stop you from getting new work, but it is the only way to stop returned work from not Validating until the project fixes the issue."
The problem is that some of these WUs are near the deadline. It's a pity to throw away the work done... (I don't care a lot about points, I care about the science.) |
©2024 University of Washington
https://www.bakerlab.org