CPU Folding Revisited: AMD FX-8320E 8-Core CPU

In the last article, I made the statement that running Stanford’s Folding@home distributed computing project on CPUs is a planet-killing waste of electricity.  Well, perhaps I didn’t say it in such harsh terms, but that was basically the point.  Graphics cards, which are massively multi-threaded by design, offer much more computational power for molecular dynamics solutions than traditional desktop processors.  More importantly, they do more science per watt of electricity consumed.

If you’ve been following along, you’ve probably noticed that the processors I’ve been playing around with are relatively elderly (if you are still using a Core2 anything, you might consider upgrading).  In this article, I’m going to take a look at a much newer processor, AMD’s Vishera-based 8-core FX-8320e.  This processor, circa 2015, is the newest piece of hardware I currently have (although as promised in the previous article, I’ve got a brand new graphics card on the way).  The 8-core FX-8320e is a bit of a departure for AMD in terms of power consumption.  While many of their high end processors are creeping north of 125 watts in TDP, this model sips a relatively modest (for an 8-core) 95 watts of power.  As shown previously here, with more cores, F@H efficiency increases along with overall performance.  The 8320e chip should be no exception.

Processor Specs:

  • Designation: AMD FX-8320e
  • Architecture: Vishera
  • Socket: AM3+
  • Manufacturing Process: 32 nm
  • # Cores: 8
  • Clock Speed: 3.2 GHz (4.0 Turbo)
  • TDP: 95 Watts

Side Note: As many will undoubtedly mention, this processor isn’t really a true 8-core in the sense that each pair of cores shares one Floating Point Unit, whereas an ideal 8-core CPU would have 1 FPU per core.  So, it will be interesting to see how this processor does against a true 1 to 1 processor such as the 1100T (six FPUs, reviewed here).

All of my power readings are at the plug, so the host system plays a part in the overall efficiency numbers reported.  Here is the configuration of my current test computer, for reference:

Test Setup Specs:

  • CPU: AMD FX-8320e
  • Mainboard : Gigabyte GA-880GMA-USB3
  • GPU: Sapphire Radeon HD 7970
  • Ram: 16 GB DDR3L (low voltage)
  • Power Supply: Seasonic X-650 80+ Gold
  • Drives: 1x SSD, 2 x 7200 RPM HDDs, Blu-Ray Burner
  • Fans: 1x CPU, 2 x 120 mm intake, 1 x 120 mm exhaust, 1 x 80 mm exhaust
  • OS: Win7 64 bit

Folding Results

Since I’ve been out of CPU folding for a while, I had to run through 10 CPU work units in order to be eligible to start getting Stanford’s quick return bonus (extra points received for doing very fast science).  You can see the three regions on the plot.  The first region is GPU-only folding on the 7970.  The second region is CPU-only folding on the FX-8320e prior to the bonus points being awarded.  The third region is CPU-only folding with QRB bonus points.  Credit for the graph goes to http://folding.extremeoverclocking.com/.

Radeon 7970 GPU vs AMD FX 8320e CPU Folding@home Performance

An 8-core processor is no match for a graphics card with 2048 Shaders!

The 8-core AMD chip averages about 20K PPD when doing science on the older A4 core. Stanford’s latest A7 core, which supports Advanced Vector Extensions, returns about 30K PPD on the processor.  In either case, this is well short of the 150K PPD on the graphics card, which is also about three years older than the CPU!  Clearly, if your goal is doing the most science, the high-end graphics card trumps the processor.  (Update note: Intel’s latest processors such as the 6900X have been shown to return in excess of 120K PPD on the A7 core.  This makes CPUs relevant again for folding, but not as relevant as modern high-end graphics cards, which can return up to a million PPD!  I’ll have more articles on these later, I think…)

Efficiency Numbers

I used both HFM.net and the local V7 client to obtain an estimated PPD for the A7 core work unit, which should represent about the highest PPD achievable on the FX-8320e in stock trim.

FX 8320e PPD Performance

According to the watt meter, my system is drawing about 160 watts from the wall.  So, 29534 PPD / 160 watts is 185 PPD/Watt.  Here’s how this stacks up with the hardware tested so far.

Folding@Home Performance Table with AMD 8320e
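
The PPD/Watt figure quoted above is plain division of the client's estimated PPD by the wall power reading. A minimal sketch of that arithmetic, using the two numbers from the text (the function name `ppd_per_watt` is just an illustrative choice):

```python
def ppd_per_watt(ppd: float, wall_watts: float) -> float:
    """Points per day earned for each watt drawn at the plug."""
    if wall_watts <= 0:
        raise ValueError("wall_watts must be positive")
    return ppd / wall_watts


# Figures from the text: 29534 PPD estimated by the client,
# about 160 watts measured at the wall for the whole system.
fx8320e_efficiency = ppd_per_watt(29534, 160)
print(f"FX-8320e: {fx8320e_efficiency:.1f} PPD/Watt")  # rounds to 185 as a whole number
```

Because the wattage is measured at the plug, the result describes the whole test system rather than the CPU alone.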

Conclusion

Even though the Radeon HD 7970 was released 3 years earlier than AMD’s flagship line of 8-core processors, it still trounces the CPU in terms of Folding@home performance. Efficiency plots show the same story.  If you are interested in turning electricity into disease research, you’d be better off using a high-end graphics card than a high-end processor.  I hope to be able to illustrate this with higher end, modern hardware in the future.

As a side note, the FX-8320e is the most efficient folder of the processors tested so far. Although not half as fast as the latest Intel offerings, it has performed well for me as a general multi-tasking processor.  Now, if only I could get my hands on a new CPU, such as a Kaby Lake or a Ryzen (anyone want to donate one to the cause?)…

PPD/Watt Shootout: Uniprocessor Client is a Bad Idea

My Gaming / Folding computer with Q6600 / GTX 460 Installed

Since the dawn of Folding@Home, Stanford’s single-threaded CPU client known as “uniprocessor” has been the standard choice for stable Folding@home installations.  For people who don’t want to tinker with many settings, and for people who don’t plan on running 24/7, this has been a good choice of client because it allows a small science contribution to be made without very much hassle.  It’s a fairly invisible program that runs in the background and doesn’t spin up all your computer’s fans and heat up your room.  But, is it really efficient?

The question, more specifically targeted for folding freaks reading this blog, is this:  Does the uniprocessor client make sense for an efficient 24/7 folding@home rig?  My answer:  a resounding NO!  Kill that process immediately!

A basic Google search on this will show that you can get vastly more points per day running the multicore client (SMP), a dedicated graphics card client (GPU), or both.  Just type “PPD Uniprocessor SMP Folding” into Google and read for about 20 minutes and you’ll get the idea.  I’m too lazy to point to any specific threads (no pun intended), but the various forum discussions reveal that the uniprocessor client is slower than slow.  This should not be surprising.  One CPU core is slower than two, which is slower than three!  Yay, math!

Also, Stanford’s point reward system isn’t linear.  The faster you return a work unit, the larger the bonus on top of its base points, so if you return work units twice as fast, you earn more than twice as many points per day.  Prompt results are very valuable in the scientific world.  This bonus is known as the Quick Return Bonus (QRB), and it is available to users running with a passkey (a long auto-generated password that proves you are who you say you are to Stanford’s servers).  I won’t regurgitate all that info on passkeys and points here, because if you are reading this site then you most likely know it already.  If not, start by downloading Stanford’s latest all-in-one client known as Client V7.  Make sure you set yourself up with a username as well as a passkey, in case you didn’t have one.  Once you return 10 successful work units using your passkey, you can get the extra QRB points.  For the record, this is the setup I am using for this blog at the moment: V7 Client Version 7.3.6, running with passkey.
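
To see why faster returns pay off so disproportionately, here is a rough sketch of the bonus math.  It assumes the square-root form of the Quick Return Bonus as I understand it from the project's published points documentation (final points = base points × max(1, sqrt(k × deadline / elapsed time))).  The work unit values below (base points, `k`, deadline) are made up purely for illustration, and the function names are my own.

```python
import math


def qrb_points(base_points: float, k: float,
               deadline_days: float, elapsed_days: float) -> float:
    """Estimated credit for one work unit under a square-root quick return bonus.

    Assumed form: base_points * max(1, sqrt(k * deadline / elapsed)).
    """
    bonus = max(1.0, math.sqrt(k * deadline_days / elapsed_days))
    return base_points * bonus


def points_per_day(base_points: float, k: float,
                   deadline_days: float, elapsed_days: float) -> float:
    """PPD if identical work units are folded back to back."""
    return qrb_points(base_points, k, deadline_days, elapsed_days) / elapsed_days


# Hypothetical work unit: 1000 base points, k = 2.0, 4-day deadline.
slow = points_per_day(1000, 2.0, 4.0, elapsed_days=2.0)  # returned in 2 days
fast = points_per_day(1000, 2.0, 4.0, elapsed_days=1.0)  # returned in 1 day

print(f"slow: {slow:.0f} PPD, fast: {fast:.0f} PPD, ratio: {fast / slow:.2f}x")
```

Under these assumptions, halving the return time raises the points per work unit by a factor of sqrt(2) and doubles the number of work units finished per day, so PPD rises by about 2.83x rather than 2x.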

Unlike the older 6.x client interfaces, the new V7 client lets you pick the specific work package type you want to do within one program.  “Uniprocessor” is no longer a separate installation, but is selectable by adding a CPU slot within the V7 client and telling it how many threads to run.  V7 then downloads the correct work unit to munch on.

I thought I was talking efficiency!  Well, to that end, what we want to do is maximize the F@H output relative to the input.  We want to earn as many Points per Day as possible while drawing as few watts from the wall as possible.  It should be clear by now where this is going (I hope).  Because Stanford’s points system heavily favors the fast return of work units, it is often the case that the PPD/Watt increases as more and more CPU cores or GPU shaders are engaged, even though the resulting power draw of the computer increases.

Limiting ourselves to CPU-only folding for the moment, let’s have a look at what one of my Folding@Home rigs can do.  It’s Specs Time (Yay SPECS!). Here are the specs of my beloved gaming computer, known as Sagitta (outdated picture was up at the top).

  • Intel Q6600 Quad Core CPU @ 2.4 GHz
  • Gigabyte AMD Radeon HD 7870 Gigahertz Edition
  • 8 GB Kingston DDR2-800 Ram
  • Gigabyte 965-P S3 motherboard
  • Seasonic X-650 80+ Gold PSU
  • 2 x 500 GB Western Digital HDDs RAID-1
  • 2 x 120 MM Intake Fans
  • 1 x 120 MM Exhaust Fan
  • 1 x 80 MM Exhaust Fan
  • Arctic Cooling Freezer 7 CPU Cooler
  • Generic PCI Slot centrifugal exhaust fan

Ancient Pic of Sagitta (2006 Vintage). I really need to take a new pic of the current configuration.

You’ll probably say right away that this system, except for the graphics card, is pretty out of date for 2014, but for relative A to B comparisons within the V7 client this doesn’t matter.  On new i7 CPUs, increasing the number of CPU cores used for folding reveals the same performance and efficiency trend that will be shown here.  I’ll start by just looking at the 1-core option (uniprocessor) vs a dual-core F@H solve.

Uniprocessor Is Slow

As you can see, switching to a 2-core solve within the V7 client yields almost twice as many PPD per watt (12.11 vs 6.82).  And, this isn’t even a fair comparison, because the dual-core work unit I received was one of the older A3 cores, which tend to produce fewer PPD than the A4 work units.
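
The “almost twice” claim can be checked directly from the two figures quoted above.  This is only a sketch of the ratio; `relative_gain` is a helper name I made up for the example.

```python
def relative_gain(before: float, after: float) -> float:
    """How many times larger `after` is than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return after / before


# Figures quoted in the text for the Q6600: uniprocessor vs dual-core.
uniprocessor = 6.82
dual_core = 12.11

gain = relative_gain(uniprocessor, dual_core)
print(f"Dual-core is {gain:.2f}x the uniprocessor result "
      f"({(gain - 1) * 100:.0f}% better)")
```

The ratio comes out a little under 1.8, so “almost twice” is a fair summary even with the slower A3 work unit on the dual-core side.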

In conclusion, if everyone who is out there running the uniprocessor client switched to a dual-core client, FOLDING AT HOME WOULD BECOME NEARLY TWICE AS EFFICIENT!  I can’t scream this loudly enough.  Part of the reason is that it doesn’t take many more watts to feed another core in a computer that is already fired up and folding.  In the above example, we got nearly twice the amount of work done for only 13 more watts of power consumed.  THIS IS AWESOME, and it is just the beginning.  In the next article, I’ll look at the efficiency of 3 and 4 CPU Folding on the Q6600, as well as 6-CPU folding on my other computer, which is powered by a newer processor (AMD Phenom II X6 1100T). I’ll then move on to dual-CPU systems (non BIGADV at this point for those of you who know what this means, but we will get there too), and to graphics cards.  If you think 12 PPD/Watt is good, just wait until you read the next article!

Until next time…

-C