Category Archives: PPD/Watt

Nvidia GeForce GTX 1070 Ti Folding@Home Review

In an effort to make as much use of the colder months in New England as I can, I’m running tons of Stanford University’s Folding@Home on my computer to do charitable science for disease research while heating my house. In the last article, I reviewed a slightly older AMD card, the RX 480, to determine its performance and efficiency running Folding@Home. Today, I’ll be taking a look at one of the favorite cards from Nvidia for both folding and gaming: The 1070 Ti.

The GeForce GTX 1070 Ti was released in November 2017, and sits between the 1070 and 1080 in terms of raw performance. As of February 2019, the 1070 Ti can be had at a deep discount on the used market, now that the RTX 20xx series cards have been released. I got my Asus version on eBay for $250.

Based on Nvidia’s 16 nm Pascal architecture, the 1070 Ti has 2432 CUDA cores and 8 GB of GDDR5 memory, with a memory bandwidth of 256 GB/s. The base clock rate of the GPU is 1607 MHz, although the cards automatically boost well past the advertised boost clock of 1683 MHz. Thermal Design Power (TDP) is 180 Watts.

The 3rd party Asus card I got is nothing special. It appears to be a dual-slot reference design, and uses a blower cooler to exhaust hot air out the back of the case. It requires one supplemental 8-pin PCI-E Power connection.

ASUS GeForce GTX 1070 Ti

One thing I will note about this card is its length. At 10.5 inches (which is similar to many Nvidia high-end cards), it can be a bit problematic to fit in some cases. I have a Raidmax Sagitta mid-tower case from way back in 2006, and it fits, but barely. I had the same problem with the EVGA GeForce 1070 I reviewed earlier.

ASUS GTX 1070 Ti – Installed.

Test Environment

Testing was done in Windows 10 on my AMD FX-based system, which is old but holds up pretty well, all things considered. You can read more on that here. The system was built for both performance and efficiency, using AMD’s 8320e processor (a bit less power hungry than the other 8-core FX processors), a Seasonic 650 80+ Gold Power Supply, and 8 GB of low voltage DDR3 memory. The real key here, since I take all my power measurements at the wall with a P3 Kill-A-Watt meter, is that the system is the same for all of my tests.

The Folding@Home Client version is 7.5.1, running a single GPU slot with the following settings:

GPU Slot Options

GPU Slot Options for Maximum PPD

These settings tend to result in a slightly higher points per day (PPD), because they request large, advanced work units from Stanford.

Initial Test Results

Initial testing was done on one of the oldest drivers I could find to support the 1070 Ti (driver version 388.13). The thought here was that older drivers would have fewer gaming optimizations, which tend to hurt performance for compute jobs (unlike AMD, Nvidia doesn’t include a compute mode in their graphics driver settings).

Unfortunately, the best Nvidia driver for the non-Ti GTX 10xx cards (372.90) doesn’t work with the 1070 Ti, because the Ti version came out a few months later than the original cards. So, I was stuck with version 388.13.

Nvidia 1070 TI Baseline Clocks

Nvidia GTX 1070 Ti Monitoring – Baseline Clocks

I ran F@H for three days using the stock clock rate of 1823 MHz core, with the memory at 3802 MHz. Similar to what I found when testing the 1070, Folding@Home does not trigger the card to go into the high power (max performance) P0 state. Instead, it is stuck in the power-saving P2 state, so the core and memory clocks do not boost.

The PPD average for three days when folding at this rate was 632,380 PPD. Checking the Kill-A-Watt meter over the course of those days showed an approximate average system power consumption of 220 watts. Interestingly, this is less power draw than the GTX 1070 (which used 227 watts, although that was with overclocking + the more efficient 372.90 driver). The PPD average was also less than the GTX 1070, which had done about 640,000 PPD. Initial efficiency, in PPD/Watt, was thus 2875 (compared to the GTX 1070’s 2820 PPD/Watt).
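For anyone who wants to repeat the math, here is a minimal Python sketch of the efficiency calculation used throughout this blog. The function name is made up for illustration; the inputs are the averaged PPD and wall power figures quoted above.

```python
def ppd_per_watt(ppd, wall_watts):
    """Full-system efficiency: points per day divided by power at the wall."""
    return ppd / wall_watts


# Averages quoted in the text (PPD, watts measured with the Kill-A-Watt).
stock_1070_ti = ppd_per_watt(632_380, 220)
tuned_1070 = ppd_per_watt(640_000, 227)

print(f"GTX 1070 Ti (stock clocks): {stock_1070_ti:.0f} PPD/Watt")
print(f"GTX 1070 (tuned):           {tuned_1070:.0f} PPD/Watt")
```

The unrounded values land just under the rounded figures quoted in the text, which is well within the precision of a watt meter reading.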

The lower power consumption number and lower PPD performance score were a bit surprising, since the GTX 1070 Ti has 512 more CUDA cores than the GTX 1070. However, in my previous review of the 1070, I had done a lot of optimization work, both with overclocking and with driver tuning. So, now it was time to do the same to the 1070 Ti.

Tuning the Card

By running UNIGINE’s Heaven video game benchmark in windowed mode, I was able to watch what the card did in MSI Afterburner. The core clock boosted up to 1860 MHz (a modest increase from the 1823 MHz clock seen while folding), and the memory went up to 4000 MHz (the default). I tried these overclocking settings and saw only a modest increase in PPD numbers. So, I decided to push it further, despite the Asus card having only a reference-style blower cooler. From my 1070 review, I found I was able to fold nice and stable with a core clock of 2012 MHz and a memory clock of 3802 MHz. So, I set up the GTX 1070 Ti with those same settings. After running it for five days, I pushed the core a little higher to 2050 MHz. A few days later, I upgraded the driver to the latest (417.71).

Nvidia 1070 TI OC

Nvidia GTX 1070 Ti Monitoring – Overclocked

With these settings, I did have to increase the fan speed to keep the card below 70 degrees Celsius. Since the Asus card uses a blower cooler, it was a bit loud, but nothing too crazy. Open-air coolers with lots of heat pipes and multiple fans would probably let me push the card higher, but from what I’d read, people start running into stability problems at core clocks over 2100 MHz. Since the goal of Folding@home is to produce reliable science to help Stanford University fight disease, I didn’t want to risk dropping a work unit due to an unstable overclock.

Here’s the production vs. time history from Stanford’s servers, courtesy of https://folding.extremeoverclocking.com/

Nvidia GTX 1070 Ti Time History

Nvidia GTX1070 Ti Folding@Home Production Time History

As you can see above, the overclock helped improve the performance of the GTX 1070 Ti. Using the last five days’ worth of data points (which has the graphics driver set to 417.71 and the 2050 MHz core overclock), I got an average of 703,371 PPD with a power consumption at the wall of 225 Watts. This gives an overall system efficiency of 3126 PPD/Watt.

Finally, these results are starting to make more sense. Now, this card is outpacing the GTX 1070 in terms of both PPD and energy efficiency. However, the gain in performance isn’t enough to confidently say the card is doing better, since there is typically a +/- 10% PPD difference depending on what work unit the computer receives. This is clear from the amount of variability, or “hash”, in the time history plot.
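To put a number on that caution, the sketch below compares the tuned 1070 Ti against the tuned GTX 1070 (640,000 PPD) and checks the gain against the +/- 10% work unit variability. The helper names, and the idea of treating 10% as a hard threshold, are assumptions made for illustration; this is a rough sanity check, not a formal statistical test.

```python
def relative_gain(new_ppd, old_ppd):
    """Fractional change in PPD going from the old card to the new one."""
    return (new_ppd - old_ppd) / old_ppd


def exceeds_noise(new_ppd, old_ppd, noise_fraction=0.10):
    """True if the change is larger than the assumed work unit variability."""
    return abs(relative_gain(new_ppd, old_ppd)) > noise_fraction


gain = relative_gain(703_371, 640_000)
print(f"Gain over the GTX 1070: {gain:.1%}")
print("Outside the +/- 10% band:", exceeds_noise(703_371, 640_000))
```

The gain comes out just under 10%, which is why the result sits right at the edge of what the work unit variability can explain.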

Interestingly, the GTX 1070 Ti is still using about the same amount of power as the base model GTX 1070, which has a Thermal Design Power of 150 Watts, compared to the GTX 1070 Ti’s TDP of 180 Watts. So, why isn’t my system consuming 30 watts more at the wall than it did when equipped with the base 1070?

I suspect the issue here is that the drivers available for the 1070 Ti are not as good for folding as the 372.90 driver for the non-Ti 10-series Nvidia cards. As you can see from the MSI Afterburner screen shots above, GPU Usage on the GTX 1070 Ti during folding hovers in the 80-90% range, which is lower than the 85-93% range seen when using the non-Ti GTX 1070. In short, folding on the 1070 Ti seems to be a bit handicapped by the drivers available in Windows.

Comparison to Similar Cards

Here are the Production and Efficiency Plots for comparison to other cards I’ve tested.

GTX 1070 Ti Performance Comparison

GTX 1070 Ti Efficiency Comparison

Conclusion

The Nvidia GTX 1070 Ti is a very good graphics card for running Folding@Home. With an average PPD of 703K and a system efficiency of 3126 PPD/Watt, it is the fastest and most efficient graphics card I’ve tested so far. As far as maximizing the amount of science done per unit of electricity consumed, this card continues the trend: higher-end video cards are more efficient, despite the increased power draw.

One side note about the GTX 1070 Ti is that the drivers don’t seem as optimized as they could be. This is a known problem for running Folding@Home in Windows. But, since the proven Nvidia driver 372.90 is not available for the Ti-flavor of the 1070, the hit here is more than normal. On the used market in 2019, you can get a GTX 1070 for $200 on eBay, whereas the GTX 1070 Ti goes for $250. My opinion is that if you’re going to fold in Windows, a tuned GTX 1070 running the 372.90 driver is the way to go.

Future Work

To fully unlock the capability of the GTX 1070 Ti, I realized I’m going to have to switch operating systems. Stay tuned for a follow-up article in Linux.

Folding@Home Efficiency vs. GPU Power Limit

Folding@Home: The Need for Efficiency

Distributed computing projects like Stanford University’s Folding@Home sometimes get a bad rap on account of all the power that is consumed in the name of science.  Critics argue that any potential gains that are made in the area of disease research are offset by the environmental damage caused by thousands of computers sucking down electricity.

This blog hopes to find a balance by optimizing the way the computational research is done. In this article, I’m going to show how a simple setting in the graphics card driver can improve Folding@Home’s Energy Efficiency.

This blog uses an Nvidia graphics card, but the general idea should also work with AMD cards. The specific card here is an EVGA GeForce GTX 1060 (6 GB).  Green F@H Review here: Folding on the NVidia GTX 1060

If you are folding on a CPU, similar efficiency improvements can be achieved by optimizing the clock frequencies and voltages in the BIOS.  For an example on how to do this, see these posts:

F@H Efficiency: AMD Phenom X6 1100T

F@H Efficiency: Overclock or Undervolt?

(At this point, I really just recommend folding on a GPU for optimum production and efficiency.)

GPU Power Limit Overview

The GPU Power limit slider is a quick way to control how much power the graphics card is allowed to draw. Typically, graphics cards are optimized for speed, with efficiency a second goal (if at all). When a graphics card is pushed harder, it will draw more power (until it runs into the power limit). Today’s graphics cards will also boost their clock rate when loaded, and reduce it when the load goes away. Sometimes, a few extra MHz can be achieved for minimal extra power, but go too far and the amount of power needed to drive the card will grow exponentially. Sure the card is doing a bit more work (or playing a game a bit faster), but the heaps of extra power needed to do this are making it very inefficient.

What I’m going to quickly show is that going the other way (reducing power) can actually improve efficiency, albeit at a reduction of raw output. For this quick test, I’m just going to look at the default power limit, 100%, vs. 50%. Specific tuning is going to be dependent on your actual graphics card. But, with a few days at different settings, you should be able to find a happy balance between performance and efficiency.

For these plots, I used my watt meter to obtain actual power consumption at the wall. You can read about my watt meters here.

Changing the Power Limit

A tool such as MSI Afterburner can be used to view the graphics card’s settings, including the power limit. In the below screenshot, I reduced the card’s power limit by 50% midway through taking data. You can clearly see the power consumption and GPU temperature drop. This suggests the entire computer should be drawing less power from the wall. I confirmed this with my watt meter.

Adjust Power Limit MSI Afterburner

MSI Afterburner is used to reduce the graphics card’s power limit.

Effect on Results

I ran the card for multiple days at each power setting and used Stanford’s actual stats to generate an averaged number for PPD. Reporting an average number like this lends more confidence that the results are real, since PPD as reported in the client varies a lot with time, and PPD can bounce around by +/- 10 percent with different projects.

Below is the production time history plot, courtesy of https://folding.extremeoverclocking.com/. I marked on the plot the actual power consumption numbers I was seeing from my computer at the wall. As you can see, reducing the power limit on the 1060 from 100% to 50% saved about 40 watts of power at the wall.

GTX 1060 F@H Reduced Power Limit Production

GTX 1060 Folding@Home Performance at 100% and 50% Power

On the efficiency plot, you can see that reducing the power limit on the 1060 actually improved its efficiency slightly. This is a great way to fold more effectively.

Nvidia 1060 PPD per Watt Updated

NVidia GTX 1060 Folding@Home Efficiency Results

There is a downside of course, and that is in raw production. The Points Per Day plot below shows a pretty big reduction in PPD for the reduced power 1060, although it is still beating its little brother, the 1050 Ti. One of the reasons PPD falls off so hard is that Stanford provides bonus points that are tied to how fast your computer can return a work unit. This bonus grows nonlinearly the faster your computer can do work. So, by slowing the card down, we not only lose on base points, but we lose on the quick return bonus as well.
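To see why the quick return bonus punishes a slower card so heavily, here is a small Python sketch. It assumes the bonus follows the square-root formula from the Folding@home points FAQ as I understand it (final points = base points × max(1, sqrt(k × deadline / elapsed time)), with times in days). The work unit parameters below (base points, k, deadline) are hypothetical placeholders, not values from any real project.

```python
import math


def qrb_points(base_points, k, deadline_days, elapsed_days):
    """Credit for one work unit, including the quick return bonus."""
    bonus = math.sqrt(k * deadline_days / elapsed_days)
    return base_points * max(1.0, bonus)


def ppd(base_points, k, deadline_days, elapsed_days):
    """Points per day when identical work units are folded back to back."""
    return qrb_points(base_points, k, deadline_days, elapsed_days) / elapsed_days


# Hypothetical work unit parameters, for illustration only.
BASE, K, DEADLINE = 10_000, 0.75, 3.0

fast = ppd(BASE, K, DEADLINE, elapsed_days=0.10)
slow = ppd(BASE, K, DEADLINE, elapsed_days=0.20)  # same card at half speed

print(f"PPD ratio, half speed vs. full speed: {slow / fast:.3f}")
```

Under these assumptions, halving the speed cuts PPD to roughly 35% of the original rather than 50%, because the card returns fewer work units per day and each one earns a smaller bonus.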

Nvidia 1060 PPD Updated

NVidia GTX 1060 Folding@Home Performance Results

Conclusion

Reducing the power limit on a graphics card can increase its computational energy efficiency in Folding@Home, although at the cost of raw PPD. There is probably a sweet spot for efficiency vs. performance at some power setting between 50% and 100%. This will likely be different for each graphics card. The process outlined above can be used for various power limit settings to find the best efficiency point.

 

Folding on the Nvidia GTX 1070

Overview

Folding@home is Stanford University’s charitable distributed computing project. It’s charitable because you can donate electricity, as converted into work through your home computer, to fight cancer, Alzheimer’s, and a host of other diseases. It’s distributed, because anyone can run it with almost any desktop PC hardware. But, not all hardware configurations are created equal. If you’ve been following along, you know the point of this blog is to do the most work for as little power consumption as possible. After all, electricity isn’t free, and killing the planet to cure cancer isn’t a very good trade-off.

Today we’re testing out Folding@home on an EVGA NVIDIA GTX 1070 graphics card.  This card offers a big step up in gaming and compute horsepower compared to the 1060 I reviewed previously, and is capable of pushing solid frame rates at 4K resolution. So, how well does it fold?

Card Specifications (Nvidia Reference Specs)

1070 specs

Nvidia GTX 1070 Specifications

evga 1070 acx stock photo

EVGA Nvidia GTX 1070 ACX 3.0 (photo credit: EVGA)

FOLDING@HOME TEST SETUP

For this test I used my normal desktop computer as the benchmark machine.  Testing was done using Stanford’s V7 client on Windows 10 64-bit running FAH Core 21 work units.  The video driver version used was initially 388.59, and subsequently 372.90. Power consumption measurements reported in the charts were taken at the wall and are thus full system power consumption numbers.

If you’re interested in reading about the hardware configuration of my test rig, it is summarized in this post:

https://greenfoldingathome.com/2017/04/21/cpu-folding-revisited-amd-fx-8320e-8-core-cpu/

Information on my watt meter readings can be found here:

I Got a New Watt Meter!

Initial Testing and Troubleshooting

Like the GTX 1060, the 1070 uses Nvidia’s Pascal architecture, which is very efficient and has a reputation for solid compute performance. The 1070 has 50% more CUDA cores than the 1060, and with Folding@Home’s nonlinear points system (the quick return bonus gives you more points for doing work quickly), we should see roughly double the PPD of the 1060, which does 300 to 350 thousand PPD depending on the work unit. Based on various people’s experiences, and especially this forum post, I was expecting the 1070 to produce somewhere in the range of 600-700K PPD.
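The "roughly double" estimate can be sanity-checked with a few lines of Python. This sketch rests on two assumptions: that folding throughput scales linearly with CUDA core count (clock speeds and memory are ignored), and that the quick return bonus makes PPD scale with speed to the 1.5 power, which is what the square-root bonus formula in the Folding@home points FAQ implies as I read it. The function name is made up for illustration.

```python
def expected_ppd_ratio(speedup, exponent=1.5):
    """PPD ratio for a card that finishes work units `speedup` times faster."""
    return speedup ** exponent


cores_1060 = 1280  # GTX 1060 6 GB
cores_1070 = 1920  # GTX 1070
speedup = cores_1070 / cores_1060  # assumes throughput scales with core count

ratio = expected_ppd_ratio(speedup)
low, high = 300_000 * ratio, 350_000 * ratio

print(f"Expected PPD ratio: {ratio:.2f}")
print(f"Estimated GTX 1070 range: {low:,.0f} to {high:,.0f} PPD")
```

With 50% more cores, the ratio works out to about 1.84, so "roughly double" is a fair first guess, if slightly optimistic.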

That wasn’t what happened. The card wasn’t exactly slow, but initial testing showed an estimated 450 to 550K PPD, as reported by the client. I ran it for a few days, since PPD can vary a good deal depending on the work unit, but the result was unfortunately the same. 550K PPD was about as much as my card would do.

initial_1070_results

Initial GTX 1070 Results – 544K PPD

At first I thought it might be due to the card running hot. Unlike my test of a brand new 1060, I obtained my 1070 used off of eBay for a great price of $200 + shipping. It was a little dusty, so I blew it all out and fired up MSI Afterburner to check out the temps. Unfortunately, the fans on the card weren’t even breaking a sweat, and it was nice and cool. Points didn’t increase.

evga 1070 acx 3.0

My Used EVGA GTX 1070 ACX 3.0 – eBay Price: $200

initial 1070 afterburner report

MSI Afterburner Report: NVidia GTX 1070, Stock Clocks, Driver 388.59

After doing some more digging, I ran across a few threads online that indicated the 1070 (along with a few other GTX models) doesn’t always boost up to its maximum clock rate for compute loads. Opening up a video, or Folding@home’s protein viewer, can sometimes force the card to clock up. I tried this and didn’t have any luck. My card was running at the stock clocks, and in fact the memory even appeared to be running 200 MHz below the 4000 MHz reference clock rate. This suggested the card was in a low-power mode.

Thankfully, Nvidia’s System Management Interface tool can be used to see what is going on. This tool, which in Windows 10 lives in C:\Program Files\Nvidia Corporation, can be run from the command line. I followed the tutorial here to learn a few things about what my 1070 was doing. Although that write-up is geared at people mining for cryptocurrency, the steps are still relevant.

As can be seen here, my card was in the “P2” state, which is not the high-performance “P0” state. This is why the card wasn’t boosting, and why the memory clock seems diminished.

1070 performance state

Nvidia 1070 Performance State

Another feature of the Nvidia System Management Interface is the ability to get the power consumption at the card. This is measured by the driver, using the card’s hardware, and is the total instantaneous power the card is consuming (PCI slot power + supplemental power connections). As you can see, in the P2 state, the card is very rarely nearing the 150 watt TDP.
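If you would rather log these readings than read them off the screen, the sketch below wraps the tool in a few lines of Python. It assumes `nvidia-smi` is on the PATH (otherwise substitute the full path to the executable) and uses the `--query-gpu` option with the `pstate`, `clocks.sm`, `clocks.mem`, and `power.draw` fields. The function names are my own, and some fields may report "N/A" depending on the card and driver.

```python
import csv
import io
import shutil
import subprocess

# Field names passed to `nvidia-smi --query-gpu`.
FIELDS = ["pstate", "clocks.sm", "clocks.mem", "power.draw"]


def parse_gpu_csv(text):
    """Turn nvidia-smi CSV output (no header, no units) into one dict per GPU."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(FIELDS, (cell.strip() for cell in row))) for row in rows if row]


def query_gpus():
    """Run nvidia-smi once and return the parsed readings."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    # Assumes nvidia-smi is on the PATH; bail out politely if it is not.
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        try:
            for reading in query_gpus():
                print(reading)
        except subprocess.CalledProcessError as err:
            print("nvidia-smi failed:", err)
```

Each reading is a dictionary of strings, so a card stuck in the power-saving state would show up with a `pstate` of `P2` alongside its clocks in MHz and its power draw in watts.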

Now, this doesn’t necessarily mean the card would get closer to 150 watts in the P0 state. F@H does not utilize every portion of the graphics card, and it is expected that the power consumption would not be right at the limit. Still, these numbers seemed a bit low to me.

1070 card-level power consumption (before tuning)

Overclocking Manually to Approximate P0 State

Unlike what was suggested in that crypto mining article, I wasn’t able to use the NVSMI tool to force a P0 state. For some reason, my NVSMI tool wouldn’t show me the available clock rate settings for my 1070. However, manual overclocking with a program such as MSI Afterburner is really easy. By maxing out the power limit and setting the core clock to a higher value, I can basically make the card run at its boost frequency, or higher.

First, I set the power limit to the maximum allowed (112%). Don’t worry, this won’t hurt anything. It is limited in the driver to not cause any damage. Basically, this will allow the card to sip a bit more electricity (albeit at a reduction of efficiency). For a card that was in the P0 state (say, running a video game), this would allow higher boost clocks.

Next, I started upping the core clock in increments of 100 MHz. I didn’t run into any stability problems, and settled in on a core clock of 2000 MHz (factory clock is 1506 MHz / 1683 MHz boost). Note that the factory boost number is deceiving, since the latest drivers will crank the GPU core up past 1900 MHz if there is power and voltage headroom. From what I read, many people can run the 1070 stable at 2050 MHz without adding voltage.

I decided not to boost the voltage, and to stay 50 MHz below that supposedly stable number, because it’s not worth risking the stability of Folding@home. We want accurate, repeatable science! Plus, dropping work units is much worse for PPD than running slightly below a card’s maximum capability.

I experimented with clocking the memory up from 3800 MHz to 4000 MHz (note it’s double data rate so this equates to 8000 MHz as reported by some programs). This didn’t seem to affect results. F@H has historically been fairly insensitive to memory clocks, and boosting memory too much can cause slowdowns due to the error-checking routines having to work harder to ensure clean results. Basically, everyone says it’s not worth it. I ran it at 4000 MHz long enough to confirm this (a day), then throttled it back down to 3800 MHz. The benefit here will be more power available for the GPU cores, which is what really counts for folding.

Here are my final overclock numbers. The card has been running with these clocks for a week and a half non-stop, with no stability issues:

final 1070 afterburner report

Overclocked Settings: +160 MHz Core, 112% Power Limit

Note the driver version as shown in the updated Afterburner screen shot is different…as it turns out, this can have a huge effect on F@H PPD. More on that in a moment.

Overclocking Result: An Extra 50,000 PPD

Running the core at 2012 MHz (+160 MHz boost from the P2 power state) and upping the card’s power limit by 12% made the average PPD, as observed over two days, climb from 500-550K PPD to 550K-600K PPD. So, that’s a 50,000 PPD increase for minimal effort. But, something still seemed off. At the time I was still running driver version 388.59, and one of the things I had discovered when searching around for 1070 tuning tips is that not all drivers are created equal.

Nvidia Driver 372.90: The Best Folding Driver for the GTX 1070

Nvidia has been updating drivers with more and more emphasis on gaming optimizations and less on compute. So, it makes sense that older drivers might actually offer better compute performance. There are many threads in the Folding@Home Hardware Forum discussing this, and one driver version that keeps being mentioned is 372.90. It’s a bit tricky to keep it installed on Windows 10, since Windows is always trying to push a newer version, but for my 24/7 folding rig, I installed it and simply never rebooted it in order to get a week’s worth of data.

This driver change alone seemed to also offer a 50,000 point boost. After running various core 21 work units, the GTX 1070’s PPD has stayed between 630,000 and 660,000. This is normal variation between work units, and I feel confident reporting a final PPD of 640K. As I write this, the client is estimating 660K PPD.

final_1070_results

Nvidia GTX 1070: 660K PPD on Project 13815 (Core 21)

This is an excellent result. It’s twice the PPD of the GTX 1060, although eking out that last 100K PPD took a manual overclock plus a driver “update” to an older version.

Now, for the fun part. Efficiency! This 1070 is rated at 150 watts, which is only 30 watts more than the 1060. So we are supposedly doing 100% more science for Stanford University, and for a meager 25% increase in power consumption. Time to bust out the watt meter and find out!

Power Consumption at the Wall

Using my P3 Kill-A-Watt Power Meter, I measured the total system power consumption. This is the same way I measure all of my graphics cards (as opposed to estimating the card’s power by the TDP or using the video card driver to spit out instantaneous card power). The reason is that I like to have a full-system view, factoring in the power usage of my CPU, main board, and RAM, all essential components to keep the card happy.

While folding with the GTX 1070, my system’s total power draw varied between 225 and 230 watts. I’m going to go with 227 watts as the average power number. 

Efficiency

Computing computational efficiency as Points Per Day (PPD) / Power (Watts) gives:

640,000 PPD / 227 Watts = 2820 PPD/Watt.

Conclusion

The Nvidia GTX 1070 is a very efficient card for running Stanford’s Folding@Home Distributed Computing Project. The trend established in my previous articles seems to be continuing, namely that the more expensive high-end video cards are more efficient, despite their higher power draw. In the case of the 1070, some manual overclocking was needed to unlock the full PPD potential. As proven by many others, the default drivers weren’t very good, but the 372.90 drivers really opened it up.

Base PPD: 550,000

Tuned PPD (drivers + overclock): 640,000

PPD/Watt (at the wall): 2820

1070 ppd plot

Nvidia GTX 1070 Performance Comparison

1070 efficiency plot

Nvidia 1070 Efficiency Comparison

As a final note, this post focused more on PPD than efficiency, since for much of the testing my watt meter was not installed (my kids keep playing with it). At some point in the future, I’ll do an article where I tune one of these cards to find the best efficiency point. This will likely be at a lower power limit than 100%, with perhaps a slight reduction in clock rate.

Squeezing a few more PPD out of the FX-8320E

In the last post, the 8-core AMD FX-8320E was compared against the AMD Radeon 7970 in terms of both raw Folding@home computational performance and efficiency.  It lost, although it is the best processor I’ve tested so far.  It also turns out it is a very stable processor for overclocking.

Typical CPU overclocking focuses on raw performance only, and involves upping the clock frequency of the chip as well as the supplied voltage.  When tuning for efficiency, doing more work for the same (or less) power is what is desired.  In that frame of mind, I increased the clock rate of my FX-8320e without adjusting the voltage to try and find an improved efficiency point.

Overclocking Results

My FX-8320E proved to be very stable at stock voltage at frequencies up to 3.6 GHz. By very stable, I mean running Folding@home at max load on all cores for over 24 hours with no crashes, while also using the computer for daily tasks. This is a 400 MHz increase over the stock clock rate of 3.2 GHz. As expected, F@H production went up a noticeable amount (over 3000 PPD). Power consumption also increased slightly. It turns out the efficiency was also slightly higher (190 PPD/watt vs. 185 PPD/watt). So, overclocking was a success on all fronts.

FX 8320e overclock PPD

FX 8320e overclock efficiency

Folding Stats Table FX-8320e OC

Conclusion

As demonstrated with the AMD FX-8320e, mild overclocking can be a good way to earn more Points Per Day at a similar or greater efficiency than the stock clock rate.  Small tweaks like this to Folding@home systems, if applied everywhere, could result in more disease research being done more efficiently.

CPU Folding Revisited: AMD FX-8320E 8-Core CPU

In the last article, I made the statement that running Stanford’s Folding@home distributed computing project on CPUs is a planet-killing waste of electricity.  Well, perhaps I didn’t say it in such harsh terms, but that was basically the point.  Graphics cards, which are massively multi-threaded by design, offer much more computational power for molecular dynamics solutions than traditional desktop processors.  More importantly, they do more science per watt of electricity consumed.

If you’ve been following along, you’ve probably noticed that the processors I’ve been playing around with are relatively elderly (if you are still using a Core2 anything, you might consider upgrading).  In this article, I’m going to take a look at a much newer processor, AMD’s Vishera-based 8-core FX-8320e.  This processor, circa 2015, is the newest piece of hardware I currently have (although as promised in the previous article, I’ve got a brand new graphics card on the way).  The 8-core FX-8320e is a bit of a departure for AMD in terms of power consumption.  While many of their high end processors are creeping north of 125 watts in TDP, this model sips a relatively modest (for an 8-core) 95 watts of power.  As shown previously here, with more cores, F@H efficiency increases along with overall performance.  The 8320e chip should be no exception.

Processor Specs:

  • Designation: AMD FX-8320e
  • Architecture: Vishera
  • Socket: AM3+
  • Manufacturing Process: 32 nm
  • # Cores: 8
  • Clock Speed: 3.2 GHz (4.0 Turbo)
  • TDP: 95 Watts

Side Note: As many will undoubtedly mention, this processor isn’t really a true 8-core in the sense that each pair of cores shares one Floating Point Unit, whereas an ideal 8-core CPU would have 1 FPU per core.  So, it will be interesting to see how this processor does against a true 1 to 1 processor such as the 1100T (six FPUs, reviewed here).

All of my power readings are at the plug, so the host system plays a part in the overall efficiency numbers reported.  Here is the configuration of my current test computer, for reference:

Test Setup Specs:

  • CPU: AMD FX-8320e
  • Mainboard : Gigabyte GA-880GMA-USB3
  • GPU: Sapphire Radeon 7970 HD
  • Ram: 16 GB DDR3L (low voltage)
  • Power Supply: Seasonic X-650 80+ Gold
  • Drives: 1x SSD, 2 x 7200 RPM HDDs, Blu-Ray Burner
  • Fans: 1x CPU, 2 x 120 mm intake, 1 x 120 mm exhaust, 1 x 80 mm exhaust
  • OS: Win7 64 bit

Folding Results

Since I’ve been out of CPU folding for a while, I had to run through 10 CPU work units in order to be eligible to start getting Stanford’s quick return bonus (extra points received for doing very fast science).  You can see the three regions on the plot.  The first region is GPU-only folding on the 7970.  The second region is CPU-only folding on the FX-8320e prior to the bonus points being awarded.  The third region is CPU-only folding with QRB bonus points.  Credit for the graph goes to http://folding.extremeoverclocking.com/.

Radeon 7970 GPU vs AMD FX 8320e CPU Folding@home Performance

An 8-core processor is no match for a graphics card with 2048 Shaders!

The 8-core AMD chip averages about 20K PPD when doing science on the older A4 core. Stanford’s latest A7 core, which supports Advanced Vector Extensions, returns about 30K PPD on the processor.  In either case, this is well short of the 150K PPD on the graphics card, which is also about three years older than the CPU!  Clearly, if your goal is doing the most science, the high-end graphics card trumps the processor.  (Update note: Intel’s latest processors such as the 6900X have been shown to return in excess of 120K PPD on the A7 core.  This makes CPUs relevant again for folding, but not as relevant as modern high-end graphics cards, which can return up to a million PPD!  I’ll have more articles on these later, I think…)

Efficiency Numbers

I used both HFM.net and the local V7 client to obtain an estimated PPD for the A7 core work unit, which should represent about the highest PPD achievable on the FX-8320e in stock trim.

FX 8320e PPD Performance

According to the watt meter, my system is drawing about 160 watts from the wall.  So, 29534 PPD / 160 watts is 185 PPD/Watt.  Here’s how this stacks up with the hardware tested so far.

Folding@Home Performance Table with AMD 8320e

Conclusion

Even though the Radeon HD 7970 was released 3 years earlier than AMD’s flagship line of 8-core processors, it still trounces the CPU in terms of Folding@home performance. Efficiency plots show the same story.  If you are interested in turning electricity into disease research, you’d be better off using a high-end graphics card than a high-end processor.  I hope to be able to illustrate this with higher end, modern hardware in the future.

As a side note, the FX-8320e is the most efficient folder of the processors tested so far. Although not half as fast as the latest Intel offerings, it has performed well for me as a general multi-tasking processor.  Now, if only I could get my hands on a new CPU, such as a Kaby Lake or a Ryzen (anyone want to donate one to the cause?)…

F@H Efficiency on Dell Inspiron 1545 Laptop

Laptops!  

When browsing internet forums looking for questions that people ask about F@H, I often see people asking if it is worth folding on laptops (note that I am talking about normal, battery-life optimized laptops, not Alienware gaming laptops / desktop replacements).  In general, the consensus from the community is that folding on laptops is a waste of time.  Well, that is true from a raw performance perspective.  Laptops, tablets, and other mobile devices are not the way to rise to the top of the Folding at Home leader boards.  They’re just too slow, due to the reduced clock speeds and voltages employed to maximize battery life.

But wait, didn’t you say that low voltage is good for efficiency?

I did, in the last article.  By undervolting and slightly underclocking the Phenom II X6 in a desktop computer, I was able to get close to 90 PPD/Watt while still doing an impressive twelve thousand PPD.

However, this raised the interesting question of what would happen if someone tried to fold on a computer that was optimized for low voltage, such as a laptop.  Let’s find out!

Dell Inspiron 1545

Specs:

  • Intel T9600 Core 2 Duo
  • 8 GB DDR2 Ram
  • 250 GB spinning disk style HDD (5400 RPM, slow as molasses)
  • Intel Integrated HD Graphics (horrible for gaming, great for not using much extra electricity)
  • LCD off during test to reduce power

I did this test on my Dell Inspiron 1545, because it is what I had lying around.  It’s an older laptop that originally shipped with a slow Socket P Intel Pentium dual core.  That 2.1 GHz chip was going to be so slow at folding that I decided to splurge and pick up a 2.8 GHz T9600 Core 2 Duo from eBay for 25 bucks (can you believe this processor used to cost $400?).  This high-end laptop processor has the same 35 watt TDP as the Pentium it is replacing, but has 6 times the total cache.  It is a dual core part that is roughly similar in architecture to the Q6600 I tested earlier, so one would expect the PPD and the efficiency to be close to the Q6600 when running on only 2 cores (albeit a bit higher due to the T9600’s higher clock speed).  I didn’t bother doing a test with the old laptop processor, because it would have been pretty bad (same power consumption but much slower).

After upgrading the processor (rather easy on this model of laptop, since there is a rear access panel that lets you get at everything), I ran this test in Windows 7 using the V7 client.  My computer picked up a nice A4 work unit and started munching away.  I made sure to use my passkey to ensure I get the quick return bonus.

Results:

The Intel T9600 laptop processor produced slightly more PPD than the similar Q6600 desktop processor when running on 2 cores (2235 PPD vs 1960 PPD). This is a decent production rate for a dual core, but it pales in comparison to the roughly 6000 PPD of the Q6600 running with all 4 cores, or newer processors such as the AMD 1100T (over 12K PPD).

However, from an efficiency standpoint, the T9600 Core2 Duo blows away the desktop Core2 Quad by a lot, as seen in the chart and graph below.

Intel T9600 Folding@Home Efficiency

Intel T9600 Folding@Home Efficiency vs. Intel Desktop Processors

Conclusion

So, the people who say that laptops are slow are correct.  Compared to all the crazy desktop processors out there, a little dual core in a laptop isn’t going to do very many points per day.  Even modern quad-core laptops are fairly tame compared to their desktop brethren.  However, the efficiency numbers tell a different story.

Because everything in the laptop, from the motherboard, video card, and audio circuit to the hard drive and processor, is optimized for low voltage, the total system power consumption was only 39 watts (with the lid closed).  This meant that the 2235 PPD was enough to earn an efficiency score of 57.29 PPD/Watt.  This number beats all of the efficiency numbers from the most similar desktop processor tested so far (Q6600), even when the Q6600 is using all four cores.
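To put the laptop and the desktop side by side, here is a small Python sketch that ranks the two systems by PPD/Watt. The Q6600 inputs are the rounded all-four-cores figures from the Q6600 article (about 6000 PPD at 169 watts), so treat the output as approximate. Small differences from the quoted scores come from rounding in the inputs.

```python
# Figures quoted in these articles: (points per day, watts at the wall).
# The Q6600 entry uses rounded numbers, so its result is approximate.
systems = {
    "T9600 laptop (2 cores)": (2235, 39),
    "Q6600 desktop (4 cores)": (6000, 169),
}

# Efficiency is points per day divided by whole-system wall power.
efficiency = {name: ppd / watts for name, (ppd, watts) in systems.items()}

# Print the most efficient system first.
for name, value in sorted(efficiency.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.1f} PPD/Watt")
```

Even with fewer than half the points per day, the laptop comes out ahead because its wall power is less than a quarter of the desktop's.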

So, laptops can be efficient F@H computers, even though they are not good at raw PPD production.  It should also be noted that during this experiment the little T9600 processor heated up to a whopping 67 degrees C. That’s really warm compared to the 40 degrees Celsius the Q6600 runs at in the desktop.  Over time, that heat load would probably break my poor laptop and give me an excuse to get that Alienware I’ve been wanting.  

F@H Efficiency: AMD Phenom II X6 1100T

Welcome back to the fold!  In the last post, I showed how increasing the # of CPU cores has a massive positive impact on the amount of cancer-fighting research your computer does, as well as how efficiently it does it.  In stock form, the quad core Intel Q6600 delivered just shy of 6000 points per day of F@H with all 4 cores engaged.  My computer’s total power draw at the wall was 169 watts.  So, that works out to be 6000 PPD / 169 Watts = 35 PPD/Watt.  Not too bad, considering the horrible efficiency numbers of the uniprocessor client.

In this article, I’m jumping forward in time to a more modern processor…the AMD Phenom II X6 1100T.  This six-core beast is the last of the true core-for-core chips from AMD (Bulldozer and newer CPUs pair 2 integer units with only 1 shared floating point unit per module).  With 6 physical floating point cores, the AMD 1100T should be good at folding.

Note that I am obviously using a completely different computer setup here than in the last post (I have an AMD machine and an Intel machine).  So, the efficiency numbers aren’t a perfect apples-to-apples comparison, due to the different supporting parts in both computers.  However, the difference between processors is so large that the differences in the host computers really don’t matter.  The newer AMD chip is much better, and that is what is driving the results!

Test Rig Specs:

AMD Phenom II X6 1100T
Gigabyte GA-880GMA-USB3 Micro ATX Motherboard
8 GB Kingston ValueRam DDR3 1333 MHz (4 x 2GB)
Seasonic S12 II 380W 80+ PSU
Hitachi 80 GB SATA Hard Drive
Linkworld MicroATX
Fans: 2 x 80mm Side Intake, 1 x 80mm front intake, 1 x 92 mm Exhaust
Noctua NH-C12P SE14 140mm SSO CPU Cooler

A note about the operating system…

The previous tests on my Intel Q6600 were performed using Windows 7 with the V7 folding client.  Due to Windows costing money, I used Ubuntu Linux on my AMD system with the V7 folding client.  Linux is a bit more capable of maxing out a PC’s hardware than Windows, so the resulting PPD numbers are likely slightly higher than they would be had the machine been running Windows.  However, the difference is typically small (5 percent or so).  Note that over time, this performance bonus can really add up.  This is why Linux is the preferred operating system for many dedicated Folding at Home users.
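To get a feel for how a small percentage adds up, here is a rough Python sketch. The 12,000 PPD baseline and the flat 5 percent advantage are assumptions chosen for illustration, not measurements, and the function name `extra_points` is mine.

```python
def extra_points(base_ppd: float, bonus_fraction: float, days: int) -> float:
    """Extra points earned over `days` from a fractional PPD advantage."""
    return base_ppd * bonus_fraction * days


# Assumed inputs: a 12,000 PPD baseline and a 5 percent advantage,
# running around the clock for one year.
yearly_gain = extra_points(12_000, 0.05, 365)
print(f"{yearly_gain:,.0f} extra points per year")
```

The daily difference is only a few hundred points in this example, but a machine that folds all year accumulates it every single day.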

AMD Folding Rig - Phenom II X6 Configuration

Test Results

AMD Phenom II X6 1100T Folding at Home Performance and Efficiency

AMD 1100T 6-core CPU pushes the efficiency curve further

As expected, the 6-core 1100T is a strong performer when it comes to F@H.  Producing just shy of 13,000 Points Per Day with a total system power draw of 185 watts, this setup has an efficiency of 67 PPD/Watt.  This is almost twice that of the older Intel quad-cores.  Note that I am not Intel-bashing here…if you do some Google searching, you will likely see that the new Intel Core i5 and i7 processors do even better in both raw PPD and PPD/Watt than the AMD 1100T.  The moral of the story is that you should try to set up your folding rig with the most powerful, latest-generation processor you can.  I recommend upgrading at least once a year to keep improving the performance and efficiency of your F@H contributions.  Don’t be that guy running an old-school Athlon X2 generating 300 points per day (while using 150 watts to do it).
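Another way to look at the same numbers is energy spent per point. Since a machine drawing one watt all day uses 24 watt-hours, energy per point is 24 divided by PPD/Watt. A minimal Python sketch, using only the figures quoted above (the function name is my own):

```python
HOURS_PER_DAY = 24


def watt_hours_per_point(ppd_per_watt: float) -> float:
    """Energy used per point, given efficiency in PPD/Watt."""
    return HOURS_PER_DAY / ppd_per_watt


# Old Athlon X2 example: 300 PPD at 150 watts is 2 PPD/Watt.
athlon_x2 = watt_hours_per_point(300 / 150)

# Phenom II X6 1100T: 67 PPD/Watt as measured in this article.
phenom_1100t = watt_hours_per_point(67)

print(f"Athlon X2: {athlon_x2:.2f} Wh per point")
print(f"1100T:     {phenom_1100t:.2f} Wh per point")
print(f"The Athlon X2 uses about {athlon_x2 / phenom_1100t:.1f}x the energy per point")
```

The ratio is the same as the ratio of the two PPD/Watt figures, just flipped, which is why the efficiency number is the one worth tracking.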