Message boards : Number crunching : Core 2 QX6700 beats the 4 Opterons Dual core for the Number 1 position
Author | Message |
---|---|
Who? Send message Joined: 2 Apr 06 Posts: 213 Credit: 1,366,981 RAC: 0 |
Top 20 Of course, this QX6700 is overclocked to 4.0GHz, but it is rock stable. It also proves that Hypertransport is totally overhyped: if it were so good at memory bandwidth, it would be much faster than it is. The QX6700 is beating an AMD system that has 4 memory controllers! What poor efficiency! Grandfather (they call it Quadfather) 4x4 will have about half of these processing units, will use about 125 watts per socket, and will cost more than the QX6700. The motherboard will be more expensive than the 975XBX that I used. Hypertransport and all its high pin counts raise the price, but not the performance!!! The demonstration is made: smart cores (Core 2) with prefetchers and good L2 caches are much better than expensive Hypertransport with aging cores (K8) and an expensive motherboard. So, Misters the 3 marketeers, stop telling us that Hypertransport is the future. It is marketing hype, and we exposed it here!!!! May the Core be with you ;-) who? |
Mats Petersson Send message Joined: 29 Sep 05 Posts: 225 Credit: 951,788 RAC: 0 |
Great machine. However, since Rosetta has very few L2-cache misses with 1MB of L2 cache, I expect that the number of memory controllers you have will not matter in this case, and of course, Hypertransport only really matters if there is communication between the processors... As Rosetta jobs don't need to communicate with each other (there is no shared memory between them, the application is statically linked, and each instance has its own data set), this is not a particularly good benchmark for how good or bad any type of inter-processor communication is. My best machine is on the next page down... :-( -- Mats |
Who? Send message Joined: 2 Apr 06 Posts: 213 Credit: 1,366,981 RAC: 0 |
Great machine. You got my point: for crunching data, the programmer is usually smart enough to do a little data locality work and avoid L2 cache misses, making 4 memory controllers totally useless. On top of this, you have to choose between avoiding 1 of the 3 latencies of a memory access by using an on-die memory controller, or avoiding all 3 latencies by prefetching the data correctly. On Rosetta, Core 2 has a 99.99% L2 cache hit rate, and after running a little test on my X2, that is far from true on the X2. The X2 hit rate is more like 97%, so about 3% of its L2 accesses go to a memory subsystem much slower than the core frequency. Time to drive to work... Stay tuned, I'll release an optimized version of SETI soon to prove my point. I wish the Rosetta source code was open too. who? |
Mats Petersson Send message Joined: 29 Sep 05 Posts: 225 Credit: 951,788 RAC: 0 |
Great machine. Yes, if AMD built a processor JUST for Rosetta, then I doubt it would be built with one memory controller per processor. However, I also doubt that AMD would be a reasonably successful company if that was its speciality. Which model of X2 processor are you referring to? There are models with 256K, 512K and 1024K of L2 cache per core. With 1024K per core, I get a 99% L2-cache hit rate... A smaller L2 cache would obviously reduce the cache hit rate... -- Mats |
Michael G.R. Send message Joined: 11 Nov 05 Posts: 264 Credit: 11,247,510 RAC: 0 |
Let's also keep things in perspective: newer CPU architectures being faster than old ones is nothing new. The K8 architecture (A64) was introduced in 2003, so it's pretty impressive that it has stayed competitive this long. |
FluffyChicken Send message Joined: 1 Nov 05 Posts: 1260 Credit: 369,635 RAC: 0 |
Actually, RAC may not be the best indicator, since it doesn't take into account bad work units or the effect of bulk uploading. BUT just by looking at the credit per hour: Opteron ~10 credits/hr (per core); the 870 is 2GHz, I think. C2Q ~30 credits/hr (per core); you say that is at 4GHz, so at its stock speed (2.66GHz) it is about ~20 credits/hr (per core). Overall the C2Q is faster at the same clock speed, but a slight overclock of all the Opteron cores would put that system past the C2Q in rosetta@home. Of course, it would be cheaper to run the C2Q ;-) Mind you, you should be comparing it to the 3GHz Opteron 856 (for that platform) or the AM2 platform 8220SE (2.8GHz, PC2-5300) rather than to the aging 870. ;-) Team mauisun.org |
Who? Send message Joined: 2 Apr 06 Posts: 213 Credit: 1,366,981 RAC: 0 |
Actually, RAC may not be the best indicator, since it doesn't take into account bad work units or the effect of bulk uploading... Hehehehe, I am not going to slow down my machine just to figure out that it is still faster than the Opteron when I slow it down... nah! Who? |
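
For anyone who wants to redo the credit-per-hour arithmetic from FluffyChicken's post, here is a minimal Python sketch. It uses only the rough figures quoted in the thread (~10 credits/hr per core for the Opteron at an assumed 2GHz, ~30 credits/hr per core for the C2Q at 4GHz, 2.66GHz stock). The assumption that credit scales linearly with clock speed, the core counts (8 for four dual-core Opterons, 4 for the QX6700), and the function name `scale_to_clock` are illustrative choices made for this sketch, not measurements.

```python
def scale_to_clock(credit_per_hour, measured_ghz, target_ghz):
    """Rescale a per-core credit rate to another clock speed.

    Assumption: credit/hr scales linearly with core frequency, which
    ignores memory-bound effects and is only a rough approximation.
    """
    return credit_per_hour * target_ghz / measured_ghz


# Rough per-core figures quoted in the thread (not measured here).
OPTERON_RATE, OPTERON_GHZ, OPTERON_CORES = 10.0, 2.0, 8   # 4 dual-core Opterons
C2Q_RATE, C2Q_OC_GHZ, C2Q_STOCK_GHZ, C2Q_CORES = 30.0, 4.0, 2.66, 4

# Per-core rate of the C2Q if it ran at its stock clock.
c2q_stock_rate = scale_to_clock(C2Q_RATE, C2Q_OC_GHZ, C2Q_STOCK_GHZ)

# Credit per hour per GHz, a crude "same clock speed" comparison.
opteron_per_ghz = OPTERON_RATE / OPTERON_GHZ
c2q_per_ghz = C2Q_RATE / C2Q_OC_GHZ

# Whole-machine throughput under the same assumptions.
opteron_total = OPTERON_RATE * OPTERON_CORES
c2q_stock_total = c2q_stock_rate * C2Q_CORES
c2q_oc_total = C2Q_RATE * C2Q_CORES

if __name__ == "__main__":
    print(f"C2Q per core at stock: {c2q_stock_rate:.2f} credits/hr")
    print(f"Per GHz: Opteron {opteron_per_ghz:.2f}, C2Q {c2q_per_ghz:.2f}")
    print(f"Totals: Opteron {opteron_total:.1f}, "
          f"C2Q stock {c2q_stock_total:.1f}, C2Q at 4GHz {c2q_oc_total:.1f}")
```

Under these assumptions the two machines land very close together at stock clocks, which is why a small Opteron overclock would tip the whole-machine total, while the 4GHz C2Q stays well ahead.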
©2025 University of Washington
https://www.bakerlab.org