Message boards : Number crunching : difference between robetta and crunchers
Author | Message |
---|---|
Jaykay Send message Joined: 13 Nov 08 Posts: 29 Credit: 1,743,205 RAC: 0 |
the title says nearly all, where's the difference between robetta and us "normal" crunchers? and why could the work of robetta not be done by us? a link is enough, as i think this question is already answered anywhere and i only didnt find it :) johannes |
Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Well, I know I cannot explain the differences, but, the end goals of the applications are different. Also, different software ... |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
the title says nearly all, where's the difference between robetta and us "normal" crunchers? and why could the work of robetta not be done by us? Robetta uses a small dedicated local cluster and computing resources provided by NCSA to do fully automated protein structure prediction for the public (among other things like alanine scanning and fragment library generation). During CASP8, robetta actually used R@h for doing full-atom (high resolution) de novo structure prediction which requires a lot of computing, much more than what robetta currently uses. There just aren't enough computing resources available right now to run robetta jobs on R@h and also do research within the lab -- right now there are millions of jobs queued necessary for improving and applying rosetta and more to come. |
Jaykay Send message Joined: 13 Nov 08 Posts: 29 Credit: 1,743,205 RAC: 0 |
first of all, thanks for your replies :)
so if i understand that correctly robetta is a small (?) supercomputer and not just a normal server? are there any numbers for comparison with rosetta, like gflops?
so is robetta "better" than rosetta? and robetta does research for anyone who queues his work and rosetta does research for "your lab", but its basically the same work?
where queued, in robetta or rosetta or both? and can improvements for rosetta also be done for robetta? sorry for so many questions, but i'm really curious :) |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
If you think of Rosetta@home as being a research project developing and improving an industrial metal press, Robetta is the assembly line use of such a metal press. So, improving Rosetta@home to use less "electricity" (runtime) means Robetta begins using presses that need less electricity. But first you must prove to yourself that the quality of the work produced is otherwise the same etc. It is helpful to separate the assembly line from the project development. And if you were using volunteers for both, it would be very (even more) unclear what they are contributing to. Also, researchers that have submitted work to Robetta, may not want to have it sent out to the public domain for various reasons, and perhaps to comply with their own research grant guidelines. The tasks DK refers to are "for improving and applying [to] rosetta", and therefore he's referring to Rosetta@home, and once improvements are confirmed, then to Robetta. The project homepage typically shows 20,000 tasks ready to send, but there is a queue of work behind that. Rather than produce the 2.5 million tasks referred to in the Feb 18 news item, and bog down the databases etc. the system creates work units from the queue as the number of available tasks drops below 20,000. So the actual WUs that the BOINC server sees available are generated dynamically from this queue. Confusing because BOINC is using the word queue as well. One queue feeds the other and is not reported to you by BOINC. Rosetta Moderator: Mod.Sense |
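The two-queue arrangement described in the post above can be sketched roughly as follows. This is only an illustration of the idea, not the project's or BOINC's actual code: the names `top_up`, `send_tasks` and `READY_TARGET` are made up here, and the 20,000 and 2.5 million figures are simply the ones quoted in the post.

```python
from collections import deque

READY_TARGET = 20_000  # "ready to send" level quoted in the post (assumed threshold)

def top_up(ready, backlog, target=READY_TARGET):
    """Move jobs from the hidden backlog into the ready pool until the pool
    holds `target` tasks or the backlog runs dry. Returns the number moved."""
    moved = 0
    while len(ready) < target and backlog:
        ready.append(backlog.popleft())
        moved += 1
    return moved

def send_tasks(ready, backlog, count, target=READY_TARGET):
    """Hand out up to `count` ready tasks, then refill the pool from the backlog."""
    sent = [ready.popleft() for _ in range(min(count, len(ready)))]
    top_up(ready, backlog, target)
    return sent

# Hypothetical backlog of 2.5 million jobs waiting behind the visible pool.
backlog = deque(range(2_500_000))
ready = deque()

top_up(ready, backlog)                   # the visible pool is filled to 20,000
batch = send_tasks(ready, backlog, 500)  # volunteers fetch 500 tasks
print(len(batch), len(ready), len(backlog))
```

Only the `ready` pool corresponds to the "tasks ready to send" number on the homepage; the backlog behind it is the part that BOINC does not report.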
Michael G.R. Send message Joined: 11 Nov 05 Posts: 264 Credit: 11,247,510 RAC: 0 |
Very interesting. So what is Robetta running on? Do you have an estimate of the teraFLOPS number, or any info about the hardware? How does it compare to R@H in speed? |
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
Perhaps the better question is how are things coming along with Blue Waters?? Which Keith has already said will be running Robetta work. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
Jaykay Send message Joined: 13 Nov 08 Posts: 29 Credit: 1,743,205 RAC: 0 |
mod.sense: many many thanks, that was very helpful! but your answer raised some more questions, sorry :) so basically rosetta is the testing project for robetta, and robetta is doing the "real" work, right? could rosetta be replaced by a second robetta, i.e. a second supercomputer? why are so many tasks needed for testing? and does rosetta also directly help research on protein structure, or is it really only for testing? many thanks again! |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
When Microsoft works on a new and improved version of Windows... they devise a plan on the improvements to make, they design and code the changes, then they study it (through usability studies and beta testing) to assure that it meets the objectives they set out with originally. They don't do any "production" activity with this new version until they formally release the new product. At that point, everyone using it, is doing "production" not lab work with it. With a science application like Rosetta, that step where you assess whether your improvements are really better or not, is not as simple as surveying some users that tried the new screen layout and seeing if they understand how to use the program. But yes, once you've mastered that new science, everyone benefits. The Rosetta program is used in various ways in labs throughout the world. And yes, if you would like to purchase a new supercomputer and run Robetta as a free service to the scientific community, I'm sure Dr. Baker would be willing to work with you. why are so many tasks needed for testing? Because there are thousands of proteins. Your changes might work great for the first 10 proteins that you try it on. But that doesn't mean it provides better results for all of them. So you have to try it on dozens of proteins to get a fair assessment. And from there, if it works well on some and not others, then you try to determine why, and make further adjustments to help the others work better too, and then you start over again with did your changes really improve things or not? ...does rosetta also directly help research on protein structure, or is it really only for testing? I'm not certain I understand the question. Think of Rosetta@home as the research on protein structure, so it works to answer the question "how can we predict the structure of proteins"? 
and think of Robetta as working to answer the question "with what is known about this specific protein, what does Rosetta predict its structure will be?" If you wanted to test possible drugs to treat bird flu, you would want to run work on Robetta. If your efforts there don't find any viable drugs, then you need further research done on Rosetta@home. Perhaps this is the terminology clarification you are needing: "Rosetta" is a computer program that does protein structure prediction (and numerous other related stuff!) "Rosetta@home" is a public project using distributed volunteers to further develop and improve the Rosetta program. These volunteers are running the Rosetta computer program. The work here improves the computer program so that its predictions are more accurate, or take less time to produce, or both. Work at Rosetta@home also expands the scope of how much "other related stuff" Rosetta is able to do. Protein docking predictions come to mind. The Rosetta program didn't used to handle them. Code is being developed to perform docking predictions. It is a different class of the protein structure prediction problem. Once the ability to make accurate predictions in this field is proven, then Robetta can offer this function to the researchers that are using it. "Robetta" is a project using an in-house cluster to run the Rosetta computer program and deliver protein structure predictions to other researchers. Since it is running the same computer program, the work done on Rosetta@home directly benefits Robetta and the other researchers. So Rosetta@home does the "real work" of learning how to make better predictions of proteins in general. And Robetta does the "real work" of predicting the structure of a specific protein. Rosetta Moderator: Mod.Sense |
Mike Tyka Send message Joined: 20 Oct 05 Posts: 96 Credit: 2,190 RAC: 0 |
I should add to the excellent above description by ModSense that Robetta uses BOINC(Rosetta@HOME) for ab initio predictions since last summer. (at least it did during CASP, i'm not sure it does right now) http://beautifulproteins.blogspot.com/ http://www.miketyka.com/ |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
yeah, it doesn't use BOINC now. not enough resources. |
Jaykay Send message Joined: 13 Nov 08 Posts: 29 Credit: 1,743,205 RAC: 0 |
many thanks again, i wasnt aware of the difference between rosetta and rosetta@home.
i meant rosetta@home, so the question was whether rosetta@home directly helps research. if i understood it correctly it "only" helps/improves robetta and robetta directly helps research. and another question: i already found two threads regarding this, but the question was not really answered: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=4718 https://boinc.bakerlab.org/rosetta/forum_thread.php?id=4712 if i got that correctly, minirosetta is basically a cleaner and more improved version of rosetta. now the questions: if minirosetta is better, why is it not always used in rosetta@home? and why does robetta use rosetta although minirosetta is better? i hope that this are one of my last questions, i dont want to annoy you johannes |
Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
If someone did not make the test tube, we could not have chemistry labs. So, which is more important? The test tube maker, or the lab? Which contributes directly to the science? Without the test tube there is no work done in the lab. If we don't do the research on the algorithm in RaH, then the algorithm cannot be used to do research. It is like a car, it cannot go without the gas ... so ... all pieces are important. If we find a more effective algorithm then the science can be done faster with fewer resources. But, you have to have the effort to prove the algorithm... no work on seeking a better algorithm, no algorithm, no science ... chicken and the egg ... and the answer is egg ... not quite a chicken + not quite a chicken + random change in DNA = Chicken in the egg ... We are kinda like not quite a Robetta ... :) |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,688,048 RAC: 10,544 |
now the questions: if minirosetta is better, why is it not always used in rosetta@home? and why does robetta use rosetta although minirosetta is better? Not all of the Rosetta functionality has been ported to minirosetta yet (the figure given is ~80% is ported) |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
The main point you seem to be missing is that the Rosetta program is used in research labs throughout the world. The other point you seem to be missing is that without programs like Rosetta, computers are not capable of these predictions. So, when Robetta and others use Rosetta for production work, it is not the perfect solution. But it is the current state-of-the-art that mankind has available. There are still proteins that trick it into the wrong answer, and there is still much room to make predictions faster and more accurate. As CASP shows you every 2 years, there are other scientists working to develop programs like Rosetta and contributing to the Rosetta code as well. I do not mean to belittle their efforts and results by calling Rosetta "the" state-of-the-art. That is up for debate and someone else may actually have a program that works better. I'm simply trying to make the point that you are right here on the edge of what mankind knows about proteins. When you improve what mankind knows about protein structure prediction, and then provide Robetta for others to use that program for free, you build up quite a queue of work for Robetta. This is why supercomputers like Blue Waters are considered public projects for the common good. If we learn to target a specific virus and actually design a protein that neutralizes it safely, everyone benefits and new fields of medical research are opened. Rosetta Moderator: Mod.Sense |
Jaykay Send message Joined: 13 Nov 08 Posts: 29 Credit: 1,743,205 RAC: 0 |
i know that without programs like rosetta a computer cant predict protein structures... and paul showed quite good that both the testing and robetta is important. but you didnt answer my last questions: if minirosetta is better, why is it not always used in rosetta@home? and why does robetta use rosetta although minirosetta is better? that should be the last questions i ask, thanks for your patience |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
ya, that one I'm not sure of the details of why. But in the big scheme of things, they are both the "Rosetta program" that we've discussed in this thread. They are sort of different versions of the same program, which raises the question "why not run the best version available?". Some possible answers to that are: "because you aren't yet certain the new version is indeed better" "you are in the middle of an extensive study and want all of your results done using the same version of the program so your results are uniform" "you want to use some of the 'other stuff' that was not enhanced and brought into mini yet" Regardless of the answer, the specifics of the program and what is being studied at any point in time will evolve as time moves forward. Dawned on me that another great example of "other stuff" is the recent work on zinc. Someone starts out with a theory that the presence of zinc gives some cues you can follow to a better prediction. They write some code to attempt to take advantage of these cues. They run thousands of models on Rosetta@home and study the results. They then compare the new results with results from prior Rosetta versions for the same proteins. Are the new predictions more accurate? Let's say they are... but now what about the predictions for proteins that do not have zinc? Have your new changes perhaps affected these predictions adversely? See how you might find yourself running your prior version for a frame of reference? More models to study and confirm and prove. More analysis and comparison. The process continues... Rosetta Moderator: Mod.Sense |
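The comparison described in the zinc example above (new version against prior version, checked separately on proteins with and without zinc) might look something like this in outline. Everything here is hypothetical: the protein names, the error values and the `mean_change` helper are invented purely to show the shape of the check, and are not project data or Rosetta code.

```python
from statistics import mean

# Placeholder numbers invented for illustration; lower error = better prediction.
# Each row: (protein id, contains zinc?, error with prior version, error with new version)
benchmark = [
    ("p1", True, 4.0, 2.5),
    ("p2", True, 3.5, 3.0),
    ("p3", False, 2.0, 2.0),
    ("p4", False, 3.0, 3.5),
]

def mean_change(rows):
    """Average (new - old) error over the rows; negative means the new version is better."""
    return mean(new - old for _, _, old, new in rows)

with_zinc = [row for row in benchmark if row[1]]
without_zinc = [row for row in benchmark if not row[1]]

zinc_delta = mean_change(with_zinc)        # did the change help where it was meant to?
no_zinc_delta = mean_change(without_zinc)  # did it hurt anything else?

print(f"zinc proteins:     {zinc_delta:+.2f}")
print(f"non-zinc proteins: {no_zinc_delta:+.2f}")
```

With these made-up numbers the change helps the zinc proteins and slightly hurts the others, which is exactly the situation that sends you back to run more models with the prior version as a frame of reference.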
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Sort of ironic isn't it? The pattern of this thread resembles the process of the science work? You start out with what seems like a simple question... attempt to answer it and in so doing, realize the field is more complex than you knew before and so the number of questions multiplies. If you think about the progression of science, one can honestly say that after all these generations that mankind has been in existence, we now know that we don't know about more than ever before. Thanks for the questions. I'm sure we will all reference this thread many times in the future. Rosetta Moderator: Mod.Sense |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
Very interesting. Robetta uses a local cluster consisting of 16 quad core machines. It also uses resources provided by NCSA, our last allocation was 200,000 process hours and a 1 million supplement has recently been awarded. To meet public demand, Robetta needs around 10000 process hours a day (rough estimate). |
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
What is a "process hour"? An hour of CPU time on an array of CPUs? Or each CPU? How many "process hours" is Rosetta@home using now? Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
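The thread leaves that question open, but a rough back-of-the-envelope reading of David E K's figures is possible under one explicit assumption: that a "process hour" means one CPU core busy for one hour. That definition is a guess, not something confirmed in the thread, and the sketch also assumes the local cluster runs flat out around the clock.

```python
# Assumption (not confirmed in the thread): 1 process hour = 1 CPU core for 1 hour.
machines, cores_per_machine = 16, 4          # "16 quad core machines"
local_per_day = machines * cores_per_machine * 24   # core-hours the local cluster can supply daily

demand_per_day = 10_000                      # "around 10000 process hours a day (rough estimate)"
shortfall = demand_per_day - local_per_day   # what would have to come from NCSA each day

days_200k = 200_000 / shortfall              # how long the 200,000-hour allocation would last
days_1m = 1_000_000 / shortfall              # how long the 1 million hour supplement would last

print(local_per_day, shortfall)
print(round(days_200k, 1), round(days_1m))
```

Under those assumptions the local cluster covers only about 15% of the quoted daily demand, and the NCSA allocations would be used up in weeks to a few months, which fits the "not enough resources" remarks earlier in the thread.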
©2024 University of Washington
https://www.bakerlab.org