Message boards : Number crunching : Manual downloading
Author | Message |
---|---|
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,688,048 RAC: 10,544 |
Hi all, I built a machine for a friend some time ago (Athlon X2 4000+) and want to get it crunching, but he doesn't have an internet connection. He connects through his mobile occasionally, which is fine for surfing but expensive if Rosetta is downloading. He's got loads of disk space, so I want to cut the downloads by copying all the new Rosetta files from my installation over to his once a week or so; the send/receive should then just skip the files it already has. Is there somewhere in the bakerlab.org domain where I can download the files I don't have, so I can take them to his? For example, when I last connected his computer through my phone it was trying to download a 9MB file called lb_all_multi_threshold.1.5.loopbuild.t317_.tex.boinc_files.zip. If I can download the occasional large file like that manually, I can keep the machine crunching... ta Danny |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
You can download files from the project download directory, but there are tens of thousands of them that have not been used for a long time. The problem is knowing which ones the machine will need, and keeping them around long enough for it to use them. You also need to know which of the 1024 fan-out subdirectories the desired file is in. You might create a cache of the files this machine has ever used by using a proxy. If the machine has internet access long enough, and the problem is just that it interferes with surfing, you can set the preferences to limit the download bandwidth BOINC uses. This makes the downloads take even longer, but improves your surfing experience, so perhaps you stay attached to the network longer anyway :) edit: sorry, I missed the point... it is EXPENSIVE, as in the more MB you download, the higher the bill for the mobile. I was thinking of the other ways that word might be used. Rosetta Moderator: Mod.Sense |
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,688,048 RAC: 10,544 |
Yeah, a proxy might be useful. Anyone know of a decent one for XP? I'll increase the Rosetta run-time to 6hrs too, as it should be OK to hit the deadlines at that duration... |
Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
Hey Danny. I would think you'd get the maximum of what you want if you treat that machine like one of your flash drives. In other words have another machine with a full BOINC data directory that requests all the work, gets all the files, and then move the entire directory over to your friend's machine where it crunches for a few days. Then bring it all back and upload the results. In other words, treat it like a machine that has no internet access at all. This only works well if the machines are the same type. Some projects detect CPU type and send different types of work, or tasks optimized for specific CPUs. Rosetta (from what I can tell) just looks at memory, and OS type. If the machine is on, and running BOINC most of the time, I'd think you'll minimize the number of downloads overall if you work your way gradually up to 24hr runtime preference. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,188,754 RAC: 3,501 |
That sort of gave me an idea along those lines... dcdc could create a second BOINC installation on his own machine, within the guidelines of OS and memory as stated above, and use it to feed the friend's machine. I mean load BOINC in an alternate directory and get the work units, download them onto a flash drive and upload them onto your friend's machine. When he has finished crunching them, bring them back, import them into your alternate BOINC location, upload, and get new units. This is going to be a royal pain; it would be easier, if he lives close, to just give him access to your wireless net when he needs it! I guess I just kinda said what was said previously, didn't I? Hmmmm |
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,688,048 RAC: 10,544 |
In the past I have run BOINC from two thumb drives, leaving one while I take the other home to upload. I was hoping that dumping all the large files from my BOINC dir into his would mean I could just do a send/receive on my phone and most of the files would already be there, so the downloads would be minimal, but it seems there are a lot more files than my computer has in its cache. I know new ones are added regularly, but I was thinking I could create a little script to copy the new ones to my phone when at my house (via Bluetooth when in range) and then over to his when I'm at his house (he doesn't have Bluetooth but I can get an adapter for £3 from ebuyer or just use the cable)... |
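A rough Python sketch of the "little script to copy the new ones" idea from the post above. It assumes both folders are reachable as ordinary directories (for example a project folder and a mounted phone or flash drive); the function name and the example paths are hypothetical and not from the thread. A file is treated as "new" when it is missing from the target or its size differs, which is only a cheap heuristic for spotting partial copies.

```python
import shutil
from pathlib import Path
from typing import List


def copy_missing_files(source_dir: Path, target_dir: Path) -> List[str]:
    """Copy files that are absent from target_dir or whose size differs.

    Returns the names of the files that were copied, in sorted order.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(source_dir.iterdir()):
        if not src.is_file():
            continue  # this sketch ignores subdirectories
        dst = target_dir / src.name
        if dst.exists() and dst.stat().st_size == src.stat().st_size:
            continue  # same name and size: assume the file is already there
        shutil.copy2(src, dst)  # overwrites a partial or stale copy
        copied.append(src.name)
    return copied


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    names = copy_missing_files(Path("my_rosetta_files"), Path("flash_drive/rosetta_files"))
    print(f"copied {len(names)} file(s)")
```

Comparing sizes rather than checksums keeps the run fast on slow media, at the cost of missing a same-size file with different contents.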
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Unless you actually do the WU downloads for your friend's machine, your odds of having the files his machine needs to do the work it gets are only about 50/50. If you follow an approach as described by Feet1st and mikey, you will have 100% of the files. |
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,688,048 RAC: 10,544 |
Yeah, I know. It just makes things more complicated because I'll have to upload when I get a key back to mine and then download new tasks before I go to his, to get a decent chance of getting them crunched and returned before the deadlines. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Yes, you either run with 2 keys and visit your friend twice as often, or you have to visit twice the same day. With 2 keys, I mean you bring him one loaded with work, and pick up one loaded with results. Bring the results home, update the project to report the results, get more work, and then repeat the process, meeting again when his machine has completed most of the work to swap keys. |
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,188,754 RAC: 3,501 |
Without internet, your friend's crunching for an internet-based project may not make a whole lot of sense! This may end up being a lot more work than you really want to be a part of. I do computer repair work for friends, and when some call it is a real chore to find the time to go over to their home and fix their PCs, spending most of the day chit-chatting and going over all the things we have missed. When they give me a handful of money it is nice, but sometimes they don't pay me. I do it for friends so I don't charge them; they just pay what they want to pay. Some pay a lot, most pay something. I do charge for any parts though. Anyway, my point was that this may become a "chore", not a labor of love! |
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,688,048 RAC: 10,544 |
Where would I find, for example, lb_all_multi_threshold.0.5.loopbuild.t373_.tex.boinc_files.zip? I'm not sure how the project download directory is organised! ta Danny |
Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
I was thinking you might end up asking that. BOINC projects spread the download files out across a number of directories to keep a modest number of files in each; on some systems, a directory with too many files begins to slow down. Rosetta uses 1024 subdirectories (BOINC calls this "fanout"). These subdirectories are named with the hexadecimal form of the numbers 0-1023, which in hex is x'0' to x'3ff'. BOINC takes the file name of anything that is not part of the base application executable (those are served directly from /download/...) and uses the following formula to assign a hashed fanout directory:

```php
<?php
// GetHashDir
// Returns the hashed directory name in the /download directory for a given file.
//
// $ToHash - in, string of the file name shown in BOINC Manager messages
// returns hex characters of the hashed directory name, 0 thru fanout-1
function GetHashDir($ToHash)
{
    $HashFanout = 1024;
    // Take 7 hex characters of the MD5 digest, starting at offset 1.
    $Hashed = substr(md5($ToHash), 1, 7);
    $Hashed2 = hexdec($Hashed);
    $H2 = $Hashed2 % $HashFanout;
    return sprintf("%x", $H2);
}
?>
```

So, if you run the above with a value for $ToHash of your file name "lb_all_multi_threshold.0.5.loopbuild.t373_.tex.boinc_files.zip" (being careful not to include any leading or trailing blanks) you get a value of '2ac'. So your file will be found in https://boinc.bakerlab.org/download/2ac/lb_all_multi_threshold.0.5.loopbuild.t373_.tex.boinc_files.zip ...if that was more than you wanted to chew, I set up a form you can use to compute these for you: http://www.violetoaks.com/TinkerTools/GetHashDir.php |
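For anyone who would rather script this than use PHP, here is a minimal Python port of the GetHashDir function from the post above. The function names `get_hash_dir` and `download_url` are made up for this sketch; the base URL and the fanout of 1024 are the values quoted in the post, and the port assumes the file name is plain ASCII so that Python's UTF-8 encoding matches what PHP hashes.

```python
import hashlib

DOWNLOAD_BASE = "https://boinc.bakerlab.org/download"  # base URL quoted in the post
FANOUT = 1024  # number of fan-out subdirectories quoted in the post


def get_hash_dir(file_name: str, fanout: int = FANOUT) -> str:
    """Python port of the PHP GetHashDir function."""
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    # PHP substr($md5, 1, 7): seven hex characters starting at offset 1.
    bucket = int(digest[1:8], 16) % fanout
    return format(bucket, "x")  # lowercase hex, no leading zeros, like sprintf("%x")


def download_url(file_name: str) -> str:
    """Build the download URL for a file in its hashed fan-out directory."""
    return f"{DOWNLOAD_BASE}/{get_hash_dir(file_name)}/{file_name}"


if __name__ == "__main__":
    name = "lb_all_multi_threshold.0.5.loopbuild.t373_.tex.boinc_files.zip"
    print(download_url(name))
```

The post reports directory '2ac' for that file name, so a faithful port should print a URL with the same directory; comparing the two is a quick way to check the port before relying on it.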
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,688,048 RAC: 10,544 |
legend ;) |
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,688,048 RAC: 10,544 |
Next question... It might seem like I'm putting more effort into this than it's worth, but I might be able to get another couple if this works. If I can get a list of the files that Rosetta has in the download queue, then I can let it upload results using the phone and request new tasks normally. Then I can: 1. have a script that copies the list of required files to a flash drive (parsed from one of the Rosetta log files?); 2. have another script on the flash drive to download the files in the list (based on Feet1st's fanout logic), which he can run himself at work when on the t'internet; 3. set another script to run regularly as a scheduled task on his machine, so whenever the flash drive is plugged in it copies the files to the Rosetta folder (overwriting any partial downloads). It might need a net stop boinc and net start boinc at the start and end to make sure they're not locked, I guess. That means all he has to do is take the USB stick in to work once a week and run a script. Easy! Can anyone see any theoretical flaws in my logic? If not, does anyone know how to find the list of files to download!?! |
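Step 2 of the plan above could look roughly like this in Python. This is only a sketch under assumptions: the thread never establishes how to obtain the list of pending files, so the one-name-per-line text file, the `fetch_listed_files` name, and the example paths are all hypothetical. The hashing follows Feet1st's fanout formula, and each download is written to a temporary ".part" name first so that an interrupted transfer never looks like a finished file.

```python
import hashlib
import urllib.request
from pathlib import Path
from typing import Callable, List

DOWNLOAD_BASE = "https://boinc.bakerlab.org/download"  # base URL quoted in the thread


def hash_dir(file_name: str, fanout: int = 1024) -> str:
    """Fan-out directory per the formula posted earlier in the thread."""
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    return format(int(digest[1:8], 16) % fanout, "x")


def fetch_listed_files(
    list_path: Path,
    out_dir: Path,
    fetch: Callable[[str, str], object] = urllib.request.urlretrieve,
) -> List[str]:
    """Download every file named in list_path that out_dir does not hold yet.

    list_path is assumed to contain one file name per line (hypothetical format).
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for line in list_path.read_text().splitlines():
        name = line.strip()
        if not name or (out_dir / name).exists():
            continue  # blank line, or the file is already on the drive
        url = f"{DOWNLOAD_BASE}/{hash_dir(name)}/{name}"
        part = out_dir / (name + ".part")
        fetch(url, str(part))          # download to a temporary name
        part.replace(out_dir / name)   # rename only once the download finished
        fetched.append(name)
    return fetched


if __name__ == "__main__":
    # Hypothetical locations on the flash drive.
    done = fetch_listed_files(Path("needed_files.txt"), Path("rosetta_files"))
    print(f"downloaded {len(done)} file(s)")
```

The `fetch` parameter exists so the download step can be swapped out, for example to go through a work proxy or to dry-run the script without touching the network.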
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
How is the result much different than suspending internet access until he is at work? ...or are you moving the flash drive from his home machine to one at work? So why not just leave the BOINC data directory on the flash drive and run directly from that? |
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,688,048 RAC: 10,544 |
Yeah, it's a desktop, so it's just the flash drive that moves. He can't run Rosetta on the machines at work in order to do the send/receive, but I don't think there'd be an issue with a script that runs a few downloads... |
Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
If the company doesn't trust Rosetta to run on a machine in their network, then they really don't trust downloading its .exe when the version changes, nor downloading its other components. But if you suspend the CPU on the machine at work, and never allow it to run, you should be able to do all the scheduler requests and file downloads without actually running Rosetta there. Use your thumb drive as the BOINC data directory. Always exit (i.e. completely shut down) BOINC before attaching or removing the thumb drive. Round-robin with two thumb drives, and the machine at home is active even when your friend goes to work to do updates. You wouldn't have to have all work completed either. It will continue when it gets back to a machine with BOINC allowing the CPU to be used. |
Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
You'd still be running BOINC, just not the applications. So... thinking of BOINC as a "script that does some internet accesses and file storage", you have what you are talking about, without all the hassle. |
©2024 University of Washington
https://www.bakerlab.org