difference between robetta and crunchers

Message boards : Number crunching : difference between robetta and crunchers

Jaykay
Joined: 13 Nov 08
Posts: 29
Credit: 1,743,205
RAC: 0
Message 59779 - Posted: 24 Feb 2009, 19:35:54 UTC

the title says nearly all: what's the difference between robetta and us "normal" crunchers? and why could the work of robetta not be done by us?

a link is enough, as i think this question has already been answered somewhere and i just didn't find it :)


johannes
ID: 59779 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 59781 - Posted: 24 Feb 2009, 21:48:02 UTC

Well, I know I cannot explain the differences, but, the end goals of the applications are different.

Also, different software ...
ID: 59781 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 59786 - Posted: 25 Feb 2009, 1:08:55 UTC - in response to Message 59779.  

the title says nearly all: what's the difference between robetta and us "normal" crunchers? and why could the work of robetta not be done by us?

a link is enough, as i think this question has already been answered somewhere and i just didn't find it :)


johannes



Robetta uses a small dedicated local cluster and computing resources provided by NCSA to do fully automated protein structure prediction for the public (among other things like alanine scanning and fragment library generation). During CASP8, robetta actually used R@h for doing full-atom (high resolution) de novo structure prediction which requires a lot of computing, much more than what robetta currently uses. There just aren't enough computing resources available right now to run robetta jobs on R@h and also do research within the lab -- right now there are millions of jobs queued necessary for improving and applying rosetta and more to come.
ID: 59786 · Rating: 0
Jaykay
Joined: 13 Nov 08
Posts: 29
Credit: 1,743,205
RAC: 0
Message 59806 - Posted: 25 Feb 2009, 20:56:06 UTC - in response to Message 59786.  
Last modified: 25 Feb 2009, 20:56:41 UTC

first of all, thanks for your replies :)


Robetta uses a small dedicated local cluster and computing resources provided by NCSA to do fully automated protein structure prediction for the public (among other things like alanine scanning and fragment library generation).

so if i understand that correctly, robetta is a small (?) supercomputer and not just a normal server? are there any numbers for comparison with rosetta, like gflops?

During CASP8, robetta actually used R@h for doing full-atom (high resolution) de novo structure prediction which requires a lot of computing, much more than what robetta currently uses. There just aren't enough computing resources available right now to run robetta jobs on R@h and also do research within the lab

so is robetta "better" than rosetta? and robetta does research for anyone who queues work and rosetta does research for "your lab", but it's basically the same work?


right now there are millions of jobs queued necessary for improving and applying rosetta and more to come.


where queued, in robetta or rosetta or both? and can improvements for rosetta also be done for robetta?



sorry for so many questions, but i'm really curious :)
ID: 59806 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59816 - Posted: 26 Feb 2009, 15:40:41 UTC
Last modified: 26 Feb 2009, 18:08:15 UTC

If you think of Rosetta@home as being a research project developing and improving an industrial metal press, Robetta is the assembly line use of such a metal press. So, improving Rosetta@home to use less "electricity" (runtime) means Robetta begins using presses that need less electricity. But first you must prove to yourself that the quality of the work produced is otherwise the same etc.

It is helpful to separate the assembly line from the project development. And if you were using volunteers for both, it would be very (even more) unclear what they are contributing to. Also, researchers who have submitted work to Robetta may not want to have it sent out to the public domain for various reasons, perhaps to comply with their own research grant guidelines.

The tasks DK refers to are "for improving and applying [to] rosetta", and therefore he's referring to Rosetta@home, and once improvements are confirmed, then to Robetta.

The project homepage typically shows 20,000 tasks ready to send, but there is a queue of work behind that. Rather than produce the 2.5 million tasks referred to in the Feb 18 news item all at once and bog down the databases etc., the system creates work units from the queue as the number of available tasks drops below 20,000. So the actual WUs that the BOINC server sees as available are generated dynamically from this queue. This is confusing because BOINC uses the word queue as well. One queue feeds the other, and the first is not reported to you by BOINC.
Rosetta Moderator: Mod.Sense
ID: 59816 · Rating: 0
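
As a rough illustration of the top-up behavior described in the post above, here is a minimal Python sketch. It is not the actual BOINC or Rosetta@home server code. The function names (`top_up`, `simulate`) and the idea of handing out tasks in fixed "ticks" are assumptions made for the example; only the 20,000 ready level and the 2.5 million job backlog come from the post.

```python
READY_TARGET = 20_000  # "tasks ready to send" level mentioned in the post


def top_up(ready, backlog, target=READY_TARGET):
    """Move jobs from the hidden backlog into the ready pool, up to `target`.

    Both arguments are plain counts; returns the new (ready, backlog) pair.
    """
    needed = max(0, target - ready)
    moved = min(needed, backlog)
    return ready + moved, backlog - moved


def simulate(total_jobs, requests_per_tick, ticks, target=READY_TARGET):
    """Hand out tasks for a number of ticks, topping up after each one.

    `requests_per_tick` is a hypothetical, constant demand from volunteers.
    """
    ready, backlog = top_up(0, total_jobs, target)
    for _ in range(ticks):
        sent = min(requests_per_tick, ready)  # cannot send more than is ready
        ready -= sent
        ready, backlog = top_up(ready, backlog, target)
    return ready, backlog


if __name__ == "__main__":
    # 2.5 million queued jobs, 5,000 tasks requested per tick, 10 ticks
    print(simulate(2_500_000, 5_000, 10))  # -> (20000, 2430000)
```

In this sketch the ready pool is back at 20,000 after every tick while the backlog behind it shrinks, which would explain why the number on the homepage stays roughly constant even though millions of jobs are waiting.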
Michael G.R.
Joined: 11 Nov 05
Posts: 264
Credit: 11,247,510
RAC: 0
Message 59817 - Posted: 26 Feb 2009, 16:12:28 UTC

Very interesting.

So what is Robetta running on? Do you have an estimate of the teraFLOPS number, or any info about the hardware? How does it compare to R@H in speed?
ID: 59817 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 59819 - Posted: 26 Feb 2009, 18:14:15 UTC

Perhaps the better question is how are things coming along with Blue Waters?? Which Keith has already said will be running Robetta work.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 59819 · Rating: 0
Jaykay
Joined: 13 Nov 08
Posts: 29
Credit: 1,743,205
RAC: 0
Message 59829 - Posted: 26 Feb 2009, 22:11:18 UTC

mod.sense: many many thanks, that was very helpful!

but your answer raised some more questions, sorry :)

so basically rosetta is the testing project for robetta, and robetta is doing the "real" work, right?

could rosetta be replaced by a second robetta, i.e. a second supercomputer?

why are so many tasks needed for testing?

and does rosetta also directly help research on protein structure, or is it really only for testing?

many thanks again!
ID: 59829 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59831 - Posted: 26 Feb 2009, 22:43:33 UTC
Last modified: 26 Feb 2009, 22:56:23 UTC

When Microsoft works on a new and improved version of Windows... they devise a plan on the improvements to make, they design and code the changes, then they study it (through usability studies and beta testing) to assure that it meets the objectives they set out with originally. They don't do any "production" activity with this new version until they formally release the new product. At that point, everyone using it, is doing "production" not lab work with it.

With a science application like Rosetta, that step where you assess whether your improvements are really better or not, is not as simple as surveying some users that tried the new screen layout and seeing if they understand how to use the program. But yes, once you've mastered that new science, everyone benefits. The Rosetta program is used in various ways in labs throughout the world.

And yes, if you would like to purchase a new supercomputer and run Robetta as a free service to the scientific community, I'm sure Dr. Baker would be willing to work with you.

why are so many tasks needed for testing?

Because there are thousands of proteins. Your changes might work great for the first 10 proteins that you try it on. But that doesn't mean it provides better results for all of them. So you have to try it on dozens of proteins to get a fair assessment. And from there, if it works well on some and not others, then you try to determine why, and make further adjustments to help the others work better too, and then you start over again with did your changes really improve things or not?

...does rosetta also directly help research on protein structure, or is it really only for testing?


I'm not certain I understand the question. Think of Rosetta@home as the research on protein structure, so it works to answer the question "how can we predict the structure of proteins?" and think of Robetta as working to answer the question "with what is known about this specific protein, what does Rosetta predict its structure will be?"

If you wanted to test possible drugs to treat bird flu, you would want to run work on Robetta. If your efforts there don't find any viable drugs, then you need further research done on Rosetta@home.

Perhaps this is the terminology clarification you are needing:

"Rosetta" is a computer program that does protein structure prediction (and numerous other related stuff!)

"Rosetta@home" is a public project using distributed volunteers to further develop and improve the Rosetta program. These volunteers are running the Rosetta computer program. The work here improves the computer program so that its predictions are more accurate, or take less time to produce, or both.

Work at Rosetta@home also expands the scope of how much "other related stuff" Rosetta is able to do. Protein docking predictions come to mind. The Rosetta program didn't use to handle them. Code is being developed to perform docking predictions. It is a different class of the protein structure prediction problem. Once the ability to make accurate predictions in this field is proven, then Robetta can offer this function to the researchers that are using it.

"Robetta" is a project using an in-house cluster to run the Rosetta computer program and deliver protein structure predictions to other researchers. Since it is running the same computer program, the work done on Rosetta@home directly benefits Robetta and the other researchers.

So Rosetta@home does the "real work" of learning how to make better predictions of proteins in general. And Robetta does the "real work" of predicting the structure of a specific protein.
Rosetta Moderator: Mod.Sense
ID: 59831 · Rating: 0
Mike Tyka
Joined: 20 Oct 05
Posts: 96
Credit: 2,190
RAC: 0
Message 59832 - Posted: 26 Feb 2009, 23:18:49 UTC


I should add to the excellent description above by ModSense that Robetta has used BOINC (Rosetta@HOME) for ab initio predictions since last summer. (at least it did during CASP, i'm not sure it does right now)

http://beautifulproteins.blogspot.com/
http://www.miketyka.com/
ID: 59832 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 59833 - Posted: 26 Feb 2009, 23:42:55 UTC - in response to Message 59832.  


I should add to the excellent description above by ModSense that Robetta has used BOINC (Rosetta@HOME) for ab initio predictions since last summer. (at least it did during CASP, i'm not sure it does right now)


yeah, it doesn't use BOINC now. not enough resources.
ID: 59833 · Rating: 0
Jaykay
Joined: 13 Nov 08
Posts: 29
Credit: 1,743,205
RAC: 0
Message 59840 - Posted: 27 Feb 2009, 7:36:50 UTC - in response to Message 59831.  

many thanks again, i wasn't aware of the difference between rosetta and rosetta@home.


...does rosetta also directly help research on protein structure, or is it really only for testing?


I'm not certain I understand the question. Think of Rosetta@home as the research on protein structure, so it works to answer the question "how can we predict the structure of proteins?" and think of Robetta as working to answer the question "with what is known about this specific protein, what does Rosetta predict its structure will be?"


i meant rosetta@home, so the question was whether rosetta@home directly helps research. if i understood it correctly it "only" helps/improves robetta and robetta directly helps research.

and another question: i already found two threads regarding this, but the question was not really answered:
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=4718
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=4712

if i got that correctly, minirosetta is basically a cleaner and improved version of rosetta.

now the questions: if minirosetta is better, why is it not always used in rosetta@home? and why does robetta use rosetta although minirosetta is better?


i hope that these are among my last questions, i don't want to annoy you


johannes
ID: 59840 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 59842 - Posted: 27 Feb 2009, 7:43:50 UTC
Last modified: 27 Feb 2009, 7:44:16 UTC

If someone did not make the test tube, we could not have chemistry labs. So, which is more important? The test tube maker, or the lab?

Which contributes directly to the science?

Without the test tube there is no work done in the lab.

If we don't do the research on the algorithm in RaH, then the algorithm cannot be used to do research. It is like a car, it cannot go without the gas ... so ... all pieces are important.

If we find a more effective algorithm then the science can be done faster with less resources. But, you have to have the effort to prove the algorithm... no work on seeking a better algorithm, no algorithm, no science ... chicken and the egg ... and the answer is egg ...

not quite a chicken + not quite a chicken + random change in DNA = Chicken in the egg ...

We are kinda like not quite a Roberta ... :)
ID: 59842 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,688,048
RAC: 10,544
Message 59851 - Posted: 27 Feb 2009, 14:23:38 UTC - in response to Message 59840.  

now the questions: if minirosetta is better, why is it not always used in rosetta@home? and why does robetta use rosetta although minirosetta is better?

Not all of the Rosetta functionality has been ported to minirosetta yet (the figure given is that ~80% is ported).

ID: 59851 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59852 - Posted: 27 Feb 2009, 14:35:02 UTC

The main point you seem to be missing is that the Rosetta program is used in research labs throughout the world. The other point you seem to be missing is that without programs like Rosetta, computers are not capable of these predictions. So, when Robetta and others use Rosetta for production work, it is not the perfect solution. But it is the current state-of-the-art that mankind has available. There are still proteins that trick it in to the wrong answer, and there is still much room to make predictions faster and more accurate.

As CASP shows you every 2 years, there are other scientists working to develop programs like Rosetta and contributing to the Rosetta code as well. I do not mean to belittle their efforts and results by calling Rosetta "the" state-of-the-art. That is up for debate and someone else may actually have a program that works better. I'm simply trying to make the point that you are right here on the edge of what mankind knows about proteins. When you improve what mankind knows about protein structure prediction, and then provide Robetta for others to use that program for free, you build up quite a queue of work for Robetta. This is why supercomputers like Blue Waters are considered public projects for the common good. If we learn to target a specific virus and actually design a protein that neutralizes it safely, everyone benefits and new fields of medical research are opened.
Rosetta Moderator: Mod.Sense
ID: 59852 · Rating: 0
Jaykay
Joined: 13 Nov 08
Posts: 29
Credit: 1,743,205
RAC: 0
Message 59853 - Posted: 27 Feb 2009, 15:13:25 UTC

i know that without programs like rosetta a computer can't predict protein structures... and paul showed quite well that both the testing and robetta are important.

but you didn't answer my last questions:
if minirosetta is better, why is it not always used in rosetta@home? and why does robetta use rosetta although minirosetta is better?



those should be the last questions i ask, thanks for your patience
ID: 59853 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59855 - Posted: 27 Feb 2009, 15:27:52 UTC

ya, that one I'm not sure of the details of why. But in the big scheme of things, they are both the "Rosetta program" that we've discussed in this thread. They are sort of different versions of the same program, which raises the question "why not run the best version available?". Some possible answers to that are:
"because you aren't yet certain the new version is indeed better"
"you are in the middle of an extensive study and want all of your results done using the same version of the program so your results are uniform"
"you want to use some of the 'other stuff' that was not enhanced and brought in to mini yet"

Regardless of the answer, the specifics of the program and what is being studied at any point in time will evolve as time moves forward.

Dawned on me that another great example of "other stuff" is the recent work on zinc. Someone starts out with a theory that the presence of zinc gives some cues you can follow to a better prediction. They write some code to attempt to take advantage of these cues. They run thousands of models on Rosetta@home and study the results. They then compare the new results with results from prior Rosetta versions for the same proteins. Are the new predictions more accurate? Let's say they are... but now what about the predictions for proteins that do not have zinc? Have your new changes perhaps affected these predictions adversely? See how you might find yourself running your prior version for a frame of reference? More models to study and confirm and prove. More analysis and comparison. The process continues...
Rosetta Moderator: Mod.Sense
ID: 59855 · Rating: 0
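
The old-versus-new comparison in the zinc example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the lab actually evaluates Rosetta. The thread does not say which accuracy measure is used, so the "error" values here are a hypothetical per-protein score where lower is better, and all numbers, group labels and function names are made up for the example.

```python
from statistics import mean


def mean_error_by_group(results):
    """results: list of (group, error) pairs -> {group: mean error}."""
    groups = {}
    for group, error in results:
        groups.setdefault(group, []).append(error)
    return {group: mean(errors) for group, errors in groups.items()}


def compare_versions(old_results, new_results, tolerance=0.0):
    """Label each protein group as improved, regressed or unchanged.

    A group only counts as changed if its mean error moved by more than
    `tolerance` (lower error is better in this sketch).
    """
    old_means = mean_error_by_group(old_results)
    new_means = mean_error_by_group(new_results)
    verdict = {}
    for group in old_means.keys() & new_means.keys():
        delta = new_means[group] - old_means[group]
        if delta < -tolerance:
            verdict[group] = "improved"
        elif delta > tolerance:
            verdict[group] = "regressed"
        else:
            verdict[group] = "unchanged"
    return verdict


if __name__ == "__main__":
    # Hypothetical numbers, purely for illustration.
    old = [("zinc", 4.0), ("zinc", 5.0), ("no_zinc", 3.0), ("no_zinc", 3.0)]
    new = [("zinc", 2.0), ("zinc", 3.0), ("no_zinc", 3.5), ("no_zinc", 4.5)]
    print(compare_versions(old, new, tolerance=0.25))
```

With these made-up numbers the zinc group improves while the non-zinc group gets worse, which is exactly the situation where you would keep running the prior version as a frame of reference.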
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59857 - Posted: 27 Feb 2009, 15:59:07 UTC

Sort of ironic isn't it? The pattern of this thread resembles the process of the science work? You start out with what seems like a simple question... attempt to answer it and in so doing, realize the field is more complex than you knew before and so the number of questions multiplies.

If you think about the progression of science, one can honestly say that after all these generations that mankind has been in existence, we now know that we don't know about more than ever before.

Thanks for the questions. I'm sure we will all reference this thread many times in the future.
Rosetta Moderator: Mod.Sense
ID: 59857 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 59859 - Posted: 27 Feb 2009, 18:48:18 UTC - in response to Message 59817.  

Very interesting.

So what is Robetta running on? Do you have an estimate of the teraFLOPS number, or any info about the hardware? How does it compare to R@H in speed?



Robetta uses a local cluster consisting of 16 quad-core machines. It also uses resources provided by NCSA; our last allocation was 200,000 process hours, and a 1 million hour supplement has recently been awarded. To meet public demand, Robetta needs around 10,000 process hours a day (rough estimate).

ID: 59859 · Rating: 0
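
To put the figures in the post above into perspective, here is a small back-of-the-envelope calculation in Python. It rests on one loud assumption that the post does not confirm: that a "process hour" means one CPU core running for one hour. The constant and function names are made up for the example; the numbers themselves are the ones quoted in the post.

```python
# Figures quoted in the post
MACHINES = 16
CORES_PER_MACHINE = 4            # "quad core"
DAILY_DEMAND_HOURS = 10_000      # rough estimate of public demand per day
LAST_ALLOCATION_HOURS = 200_000  # previous NCSA allocation
SUPPLEMENT_HOURS = 1_000_000     # recently awarded supplement


def cluster_hours_per_day(machines=MACHINES, cores=CORES_PER_MACHINE):
    """Core-hours the local cluster can supply in 24 hours.

    ASSUMPTION: one process hour == one core busy for one hour.
    """
    return machines * cores * 24


def days_covered(allocation_hours, daily_draw_hours):
    """How many days an allocation lasts at a constant daily draw."""
    return allocation_hours / daily_draw_hours


if __name__ == "__main__":
    local = cluster_hours_per_day()
    shortfall = DAILY_DEMAND_HOURS - local
    print(local)      # -> 1536 core-hours per day from the local cluster
    print(shortfall)  # -> 8464 core-hours per day needed from elsewhere
    print(days_covered(LAST_ALLOCATION_HOURS, DAILY_DEMAND_HOURS))  # -> 20.0
    print(round(days_covered(SUPPLEMENT_HOURS, shortfall), 1))      # -> 118.1
```

Under that assumption the local cluster covers only about 15% of the estimated daily demand, and the 1 million hour supplement would last roughly four months if it had to make up the rest.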
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 59860 - Posted: 27 Feb 2009, 19:40:29 UTC
Last modified: 27 Feb 2009, 19:41:12 UTC

What is a "process hour"? An hour of CPU time on an array of CPUs? Or each CPU? How many "process hours" is Rosetta@home using now?
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 59860 · Rating: 0