AMD Ryzen 9 3950X Folding@Home Review: Part 3: SMT (Hyperthreading)

Hi all. In my last post, I showed that the AMD Ryzen 9 3950x is quite a good processor for fighting diseases like Cancer, Alzheimer’s, and COVID-19. Folding@Home, the distributed computing project helping researchers understand various diseases, definitely makes good use of the 16 cores / 32 threads on the 3950x.

In this article, I’m taking a look at how virtualized CPU cores (Simultaneous Multithreading in AMD speak, or Hyperthreading for you Intel fans) help computational performance and efficiency when running Folding@Home on a high-end CPU such as the Ryzen 9 3950x.

Instead of regurgitating all of the previous information, here are some links to bring you up to speed if you haven’t read the previous posts.

Socket AM4 Benchmark Machine

AMD Ryzen 9 3950x Review: Part 1 (Overview)

AMD Ryzen 9 3950X Review: Part 2 (Average Results vs. # of Threads)

Test Setup

For this test, I used the same settings as in Part 2, except that I disabled SMT in the BIOS on my motherboard. Thus, Windows 10 will only see the 16 physical CPU cores, and will not be able to run two logical threads per CPU core. As before, I ran all testing using Folding@Home’s V7 client. I set the CPU slot configuration for a thread value of 1-16. At each setting, I ran five work units and averaged the results. Note that AMD’s core performance boost was turned off for all tests, so at all times the processor ran at 3.5 GHz.


As expected, as you throw more CPU cores at a problem, the computer can chew through the math faster. Thus, more science gets done in a given amount of time. In the case of Folding@Home, this performance is rated in terms of Points Per Day (PPD). The following plot shows the increase in computational performance as a function of # of threads utilized by the solver. Unlike in my previous testing on the 3950x, here an increase of 1 thread corresponds to an increase of 1 engaged CPU core, since virtual threads (SMT / Hyperthreading) are disabled.

The plot below includes the individual samples at each data point as light gray dots, as well as a + / – 2 sigma (95%) confidence interval. This means that 95% of the results for a given thread setting are statistically predicted to fall within the dashed lines.
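For anyone who wants to build a similar band from their own results, here is a minimal Python sketch of a +/- 2 sigma interval for five work units at one thread setting. The PPD values are made-up placeholders (not measurements from this test), and using the sample standard deviation is an assumption about how such a band can be computed.

```python
import statistics

def two_sigma_band(samples):
    """Return (mean, lower, upper) for a +/- 2 sigma band around the mean."""
    mean = statistics.mean(samples)
    sigma = statistics.stdev(samples)  # sample standard deviation (assumption)
    return mean, mean - 2 * sigma, mean + 2 * sigma

# Hypothetical PPD values for five work units at one thread setting
ppd_samples = [250_000, 262_000, 241_000, 270_000, 255_000]
mean, lower, upper = two_sigma_band(ppd_samples)
print(f"mean = {mean:.0f} PPD, band = {lower:.0f} to {upper:.0f} PPD")
```

With only five samples per setting, the band is a rough guide rather than a precise statistical bound.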

AMD Ryzen 9 3950x Performance SMT Off

As a side note, certain settings of thread count actually result in the exact same performance, because the Folding@Home client is internally using a different number than the specified value. For example, setting the CPU slot to 5 threads will still result in a 4-thread solve, because the solver is avoiding the numerical issues that occur when trying to stitch the solution together with 5 threads (5 is a tricky prime number to work with numerically). I noted these regions on the plot. If you would like more detail about this, please read the previous part of this review (part 2).

One interesting observation is that the maximum performance occurs with 15 CPU cores enabled, not the complete 16! This is somewhat similar to what was observed in Part 2 of this review (SMT enabled), where 30 threads provided slightly more points than 32 threads. More on that in a moment…

Power Consumption

Using my P3 Kill A Watt Power Meter, I measured the power consumption of the entire computer at the wall. As expected, as you increase the number of CPUs engaged, the instantaneous power consumption goes up. The power numbers reported here are averaged by “the eyeball method”, since the actual instantaneous power goes up and down by a few watts as the computer does its thing. I’d estimate that these numbers are accurate within 5 watts.

AMD Ryzen 9 3950x Power Consumption SMT Off


The ultimate goal of this blog is to find the most efficient settings for computer hardware, so that we can do the most scientific research for a given amount of power consumption. Thus, this next plot is just performance (in PPD) divided by power consumption (in watts). I left off all the work unit variation and confidence interval lines, since it looks about the same as the performance plot, and it’s cleaner with just the one average line.

AMD Ryzen 9 3950x Efficiency SMT Off

As with performance, setting Folding@Home to use 15 CPUs instead of the full 16 is surprisingly the best option for efficiency. The difference is pretty profound here, as the processor used more power at 16 threads than at 15 threads while producing fewer points.

Comparison to Hyperthreaded Results

To get a better idea of what’s going on, here are the same three plots again with the average results overlaid on the previous results from when SMT was enabled. Of course the SMT results go up to 32 threads, since with virtual cores enabled, the 16-core Ryzen 9 3950x can support 32 total threads.

AMD Ryzen 9 3950x Performance SMT Off vs On

AMD Ryzen 9 3950X Performance: SMT Study

AMD Ryzen 9 3950x Power SMT Off vs On

AMD Ryzen 9 3950X Power Consumption: SMT Study

AMD Ryzen 9 3950x Efficiency SMT Off vs On

AMD Ryzen 9 3950X Efficiency: SMT Study


Disabling SMT (aka Hyperthreading) essentially limits the Ryzen 9 3950x to a maximum thread count of 16 (one thread per physical core). The results from 1-16 threads are very similar to those results obtained with SMT enabled. Due to work unit variation, the performance and efficiency plots show what I would say is effectively the same result with SMT on vs. off, up to 16 threads. One thing to note was that the power consumption in the 12-16 thread range did trend higher for the SMT off case, although the offset was small (about 5-10 watts). This is likely due to Windows scheduling work to a new physical core to handle the higher thread count when SMT is disabled, as opposed to virtualizing the work onto an already-running core using SMT. Ultimately, this slightly higher power consumption didn’t have a noticeable effect on the efficiency plot.

The big takeaway is that for thread counts above 16 (the physical core count), the Ryzen 9 3950x can utilize thread virtualization very well. The logical processors that Windows sees don’t work quite as well as true physical cores (hence the decrease in slope on the performance and efficiency plots above 16 CPUs). However, when the thread count is doubled, SMT still does allow the processor to eke out an extra 100K PPD (about 33% more) and run more efficiently than when it is limited to scheduling work to physical CPUs.

Pro Tip #1: Turn on Hyperthreading / SMT and run with high core counts to get the most out of Folding@Home!

The final observation worth noting is that in both cases, setting the F@H client to use the maximum available number of threads (16 for SMT off, 32 for SMT on) is not the fastest or most efficient setting. Backing the physical core count down to 15 (and, similarly, the SMT thread count down to 30) results in the fastest and most efficient solver performance.

My theory is that by leaving one physical core free (one physical core = 2 threads with SMT on), the computer has enough spare capacity to run all the crap that Windows 10 does in the background. Thus, there is less competition for CPU resources, and everything just works better. The computer is also easier to use for other tasks when you don’t fully max out the CPU core count. This is also especially valuable for those people also trying to fold on a GPU while CPU folding (more on that in the next article).

Pro Tip #2: For high core count CPUs, don’t fold at 100% of your processor’s core capacity. Go right to the limit, and then back it off by a core.

Since you’re using SMT / Hyperthreading due to Pro Tip #1, this means setting the CPUs box in the client to 2 less than the maximum allowed. On my 16-core, 32-thread Ryzen 9 3950x, this means CPUs = 32 (theoretical max) – 2 (2 threads per core) = 30
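The "back it off by a core" rule can be written as a tiny Python helper. This is only a sketch: the function name `recommended_threads` is made up for illustration, and the rule reflects the results on this one high core count processor, so treat it as a starting point for your own testing.

```python
def recommended_threads(physical_cores, smt_enabled=True):
    """Leave one physical core free; with SMT each core exposes two threads."""
    threads_per_core = 2 if smt_enabled else 1
    max_threads = physical_cores * threads_per_core
    return max_threads - threads_per_core

print(recommended_threads(16, smt_enabled=True))   # 30
print(recommended_threads(16, smt_enabled=False))  # 15
```

Keep in mind that the client may still adjust the value you enter, as described in Part 2.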

CPU Slot Config

This result will be different on CPUs with different numbers of cores, so YMMV…I always recommend testing out your individual processor. For lower core count processors such as Intel’s quad core Q6600, running with the maximum number of cores offers the best performance. I previously showed this here.

Future Work

In the next article, I’m going to kick off folding on the GPU, an Nvidia GeForce GTX 1650, which I previously tested by its lonesome here. In a CPU + GPU folding configuration, it’s important to make sure the CPU has enough resources free to “feed” the GPU, or else points will suffer.

I’ve also started re-running the thread tests with Core Performance Boost enabled. This allows the processor to scale up in frequency automatically based on the power and thermal headroom. This should significantly change the character of the SMT On and SMT Off plots, since everything up till now has been run at the stock speed of 3.5 GHz.

Support My Blog (please!)

If you are interested in measuring the power consumption of your own computer (or any device), please consider purchasing a P3 Kill A Watt Power Meter from Amazon. You’ll be surprised what a $35 investment in a watt meter can tell you about your home’s power usage, and if you make a few changes based on what you learn you will save money every year! Using this link won’t cost you anything extra, but will provide me with a small percentage of the sale to support the site hosting fees of GreenFolding@Home.

If you enjoyed this article, perhaps you are in the market for an AMD Ryzen 9 3950x or similar Ryzen processor. If so, please consider using one of the links below to buy one from Amazon. Thanks for reading!

AMD Ryzen 9 3950x Direct Link

AMD Ryzen (Amazon Search)

AMD Ryzen 9 3950X Folding@Home Review: Part 2: Averaging, Efficiency, and Variation

Welcome back everyone! In my last post, I used my rebuilt benchmark machine to revisit CPU folding on my AMD Ryzen 9 3950x 16-core processor. This article is a follow-on. As promised, this includes the companion power consumption and efficiency plots for thread settings of 1-32. As a quick reminder, I did this test with multi-threading (SMT) on, but with Core Performance Boost disabled, so all cores are running at the base 3.5 GHz setting.


The Folding@Home distributed computing project has come a long way from its humble disease-fighting beginnings back in 2000. The purpose of this testing is to see just how well the V7 CPU client scales on a modern, high core-count processor. With all the new Folding@Home donors coming onboard to fight COVID, having some insight into how to set up the configuration for the most performance is hopefully helpful.

For this test, I simply set the # of threads the client can use to a value and ran five sequential work units. I averaged the performance (Points Per Day), but I also plot the individual work unit performance values to give you a sense of the variation. Since the Ryzen 9 3950x supports 32 threads, I essentially ran 160 tests. Since I wanted the Folding@Home Consortium to get useful data in their fight against COVID-19, I let each work unit run to completion, even though I only need them to run to about 10-20% complete to get an accurate PPD estimate from the client.

So, without further blabbing on my part, here is the graph of Folding@Home performance vs. thread count in Windows 10 on the Ryzen 9 3950x


Here, the solid blue line is the averaged performance, and the gray circles are the individual tests. The dashed blue lines represent a statistical 95% confidence interval, which is computed based on the variation. The Points Per Day (PPD) of a work unit run on the 3950x is expected to fall within this band 95% of the time.

My first observation is, holy crap! This is a fast processor. Some work units at high thread counts get really close to 500K PPD, which for me has only been achievable by GPU folding up to this point.

My second observation is that there is a lot of variation between different work units. This makes sense, because some work units have much larger molecules to solve than others. In my testing, I found the average variation of all 160 tests to be 12.78%, with individual variance up to 25%.

My third observation is that there seem to be two different regions on this plot. For the first half, the thread count setting is less than the number of physical cores on the chip, and the results are fairly linear. For the second half, the thread count setting is higher than the number of physical cores on the chip (thus forcing the CPU to virtualize those cores using SMT). Performance seems to fall off when the CPU cores become fully saturated (threads = 16), and it takes a while to climb out of the hole (threads = 24 starts showing some more gains).

As a side note, the client does not actually run all of these thread count settings, since some prime numbers, especially large primes (7, 11) and multiples thereof cause numerical issues. For example, when you try to run a 7-thread solve, the client automatically backs the thread count down to 6. You can see warnings in the log file about this when it happens.

Prime Number Thread Adjust

I noted all the relevant thread counts where this happens on the x-axis of the plot. Theoretically, these should be equivalent settings. The fact that the average performance varies a bit between them is just due to work unit variation (I’d have to run hundreds of averages to cancel all the variation out).

Finally, I noticed that the highest PPD actually occurred with a thread count of 30 (PPD = 407200) vs a thread count of 32 (PPD = 401485). This is a small but interesting difference, and is within the range of statistical variation. Thus I would say that setting the thread count to 30 vs 32 provides the same performance, while leaving two CPU threads free for other tasks (such as GPU folding…more on that later!).
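To put that gap in perspective, here is a quick Python check using only numbers quoted in this article. Comparing the gap against the 12.78% average work unit variation is a rough sanity check, not a formal significance test.

```python
ppd_30 = 407_200  # average PPD at 30 threads (from the text above)
ppd_32 = 401_485  # average PPD at 32 threads (from the text above)
avg_variation_pct = 12.78  # average work unit variation across all 160 tests

diff_pct = (ppd_30 - ppd_32) / ppd_32 * 100
print(f"30 vs 32 threads: {diff_pct:.2f}% difference")
print("within work unit variation:", diff_pct < avg_variation_pct)
```

The difference works out to roughly 1.4%, far smaller than the spread between individual work units.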

Power Consumption

Power consumption numbers for each thread setting were taken at the wall, using my P3 Kill A Watt meter. Since the power numbers tend to walk around a bit as the computer works, it’s hard to get an instantaneous reading. Thus these are “eyeball averaged”. There was enough change at each CPU thread setting to clearly see a difference (not counting those thread settings that are actually equivalent to an adjacent setting).


The total measured power consumption rose fairly linearly from just under 80 watts to just under 160 watts. There’s not too much surprising here. As you throw more threads at the CPU, it clocks up idle cores and does more work (which causes more transistors to switch, which thus takes more power). This seems pretty believable to me. The AMD Ryzen 9 3950x is rated at a 105 watt TDP, and with CPB turned off it should be pretty close to this number. My rough back-of-the-envelope calculation for this rig was as follows:

  1. CPU Loaded Power = 105 Watts
  2. GPU Idle Power (Nvidia GTX 1650) = 10 Watts
  3. Motherboard Power = 15 Watts
  4. Ram Power = 2 watts * 4 sticks = 8 watts
  5. NVME Power = 2 watts * 2 drives = 4 watts
  6. SSD Power = 2 watts

Total Estimated Watts @ F@H CPU Load = 144 Watts

Factor in a boat load of case fans, some silly LED lights, and a bit of PSU efficiency hit (about 90% efficient for my Seasonic unit) and it’ll be close to the 160 watts as measured.
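The estimate above can be reproduced in a few lines of Python. This is a sketch under two assumptions: the listed components are treated as loads on the output side of the power supply, and the wall draw is that load divided by the roughly 90% PSU efficiency. Fans and LEDs are left out because no wattage is given for them.

```python
# Component estimates in watts, copied from the list above
dc_loads_w = {
    "CPU (loaded)": 105,
    "GPU (idle GTX 1650)": 10,
    "Motherboard": 15,
    "RAM (4 sticks x 2 W)": 4 * 2,
    "NVMe (2 drives x 2 W)": 2 * 2,
    "SSD": 2,
}
psu_efficiency = 0.90  # rough figure for the Seasonic unit

dc_total_w = sum(dc_loads_w.values())
wall_total_w = dc_total_w / psu_efficiency  # fans and LEDs not included

print(f"Component estimate: {dc_total_w} W")
print(f"Estimated draw at the wall: {wall_total_w:.0f} W")
```

Under these assumptions the PSU loss alone brings the 144 watt estimate to about 160 watts at the wall, so the measured number is in the right neighborhood.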


This being a blog about saving the planet while still doing science with computers, I am very interested in energy efficiency. For Folding@Home, this means doing the most work (PPD) for the least amount of power (watts). So, this plot is just PPD/Watts. Easy!
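As a worked example of the metric, here is a short Python sketch. The PPD value is the 32-thread average quoted earlier in this article; the 160 watt figure is a rounded stand-in for the "just under 160 watts" measured at the high end, so the result is approximate.

```python
def efficiency(ppd, wall_watts):
    """Folding efficiency in PPD per watt of wall power."""
    return ppd / wall_watts

# 32-thread average PPD from this article; 160 W is a rounded assumption
print(f"{efficiency(401_485, 160):.0f} PPD/Watt")
```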

Similar to the PPD plot, this efficiency plot averages five data points for each thread setting. I chose to leave off the individual points and the confidence interval, because that looks about the same on this plot as it does on the PPD plot, and leaving all the clutter off makes this easier to read.


As with the PPD plot, there seem to be two regions on the efficiency curve. The first region (threads less than 16) shows a pretty good linear ramp-up in efficiency as more threads are added. The second region (threads 16 or greater) is what I’m calling the “core saturation” region. Here, there are more threads than physical cores, and efficiency stays relatively flat. It actually drops off at 16 cores (similar to the PPD plot), and doesn’t start improving again until 24 or more threads are allocated to the solver.

This plot, at first glance, suggests that the maximum efficiency is realized at # of threads = 30. However, it should be noted that work unit variation still has a lot of influence, even with reporting results of a 5-sample average. You can see this effect by looking at the efficiency drop at threads = 31. Theoretically, the efficiency should be the same at threads = 31 and threads = 30, because the solver runs a 30-thread solution even when set to 31 to avoid domain decomposition problems.

Thus, similar to the PPD plot, I’d say the max efficiency is effectively achieved at thread counts of 30 and 32. My personal opinion is that you might as well run with # of threads = 30 (leaving two threads free for other tasks). This setting results in the maximum PPD as well.

Weird Results at Threads = 16-23

Some of you might be wondering why the performance and efficiency drop off when the thread count is set to the actual number of cores (16) or higher. I was too, so I re-ran some tests and looked at what was happening with AMD’s built-in Ryzen Master tool. As you can see in the screen shot below, even though the # of threads was set to 18 in Folding@Home (a number greater than the 16 physical cores), not all 16 cores were fully engaged on the processor. In fact, only 14 were clocked up, and two were showing relatively lazy clock rates.

Two Cores are Lazy!

Folding@Home 18-Thread CPU Solve on 16-Core Processor

I suspect what is happening is that some of the threads were loaded onto “virtual” CPU cores (i.e. SMT / hyper threading). This might be something Windows 10 does to preserve a few free CPU cores for other tasks. In fact, I didn’t see all of the cores turbo up to full speed until I set Folding@Home’s thread count to 24. This incidentally is when performance starts coming back in on the plots above.

This weird SMT / Hyper-threading behavior is likely what is responsible for the large drop-off / flat part of the performance and efficiency curves that exists from thread count = 16 to 23. As you can see in the picture below, once you fully load all the available threads, the CPU frequencies on each core all hit the maximum value, as expected.


Folding@Home 32-Thread CPU Solve on 16-Core Processor

Results Comparison

The following plots compare overall performance, power consumption, and efficiency of my new AMD Ryzen 9 3950x Folding@Home rig to other hardware configurations I have tested so far.


As you can see from the plot below, the Ryzen 9 3950x running a 32-thread Folding@Home solve can compete with relatively modern graphics cards in terms of raw performance. High-end GPUs will still offer more performance, but for a processor, getting over 400K PPD is very impressive. This is significantly more PPD than the previous processors I have tested (AMD Bulldozer-based FX-8320e, AMD Phenom II X6 1100t, Intel Core2Quad Q6600, etc). Admittedly I have not tested very many CPUs, since this is much more involved than just swapping out graphics cards to test.

AMD Ryzen 9 3950x Performance

Power Consumption

From a total system power consumption standpoint, my new benchmark machine with the AMD Ryzen 9 3950x has a surprisingly low total power draw when running Folding. Another interesting point is that since the 3950x lacks onboard graphics, I had to have a graphics card installed to get display. In my case, I had the Nvidia GTX 1650 installed, since this is a relatively low power consumption card that should provide minimal overhead. As you can see below, folding on the 3950x CPU (with the 1650 GPU idle) uses nearly the same amount of power as folding on the 1650 GPU (with the 3950x idle).

AMD Ryzen 9 3950x Power Consumption


Efficiency is the point of this blog, and in this respect the 3950x comes in towards the upper middle of the pack of hardware configurations I have tested. It’s definitely the most efficient processor I have tested so far, but graphics cards such as the 1660 Super and 1080 Ti are more efficient. Despite drawing more total power from the wall, these high-end GPUs do a lot more science.

Still, a PPD/Watt of over 2500 is not bad, and in this case the 3950x is more efficient than folding on the modest GPU installed in the same box (the Nvidia GTX 1650). Compared to the much older AMD FX-8320e, the Ryzen 9 3950x is 14x more efficient! What a difference 7 years can make!

AMD Ryzen 9 3950x Efficiency


The 16-core, 32-thread AMD Ryzen 9 3950x is one fast processor, and can do a lot of science for the Folding@Home distributed computing project. Although mid to high-end graphics cards such as the 1080 Ti ($450 on the used market) can outperform the $700 3950x in terms of performance and efficiency, it is still important to have a smattering of high-end CPU folding rigs on the Folding@Home network, because some molecules can only be solved on CPUs.

There is a general trend of increasing efficiency and performance as the # of CPU threads allocated to Folding@Home increases. For the Ryzen 9 3950x, using a setting of 30 or 32 threads is recommended for maximum performance and efficiency. If you plan on using your computer for other tasks, or for simultaneously folding on the GPU, 30 threads is the ideal CPU slot setting.

Future Work

In the next article, I’ll disable multithreading (SMT) to see the effect of virtualized CPU cores on Folding@Home performance.

Later, I plan to enable core performance boost on the 3950x to see what effect the automatic clock frequency and voltage overclocking has on Folding@Home performance and efficiency.



How to Make a Folding@Home Space Heater (and why would you want to?)

My normal posts on this site are all about how to do as much science as possible with Folding@Home, for the least amount of power. This is because I think disease research, while a noble and essential cause, shouldn’t be done without respecting the environment.

With that said, I think there is a use case for a power-hungry, inefficient Folding@Home computer. Namely, as a space heater for those in colder climates.

The logic is this: Running Folding@Home, or any other piece of software, makes your computer do work. Electricity flows through the circuits, flipping tiny silicon switches, and producing heat in the process. Ultimately all of the energy that flows into your computer comes back out as heat (well, a small amount comes out as light, or electromagnetic radiation, or noise, but all of those can and do get converted back into heat as they strike things in the room).

Have you ever noticed how running your gaming computer with the door to your room closed makes your feet nice and toasty in the winter? It’s the same idea. Here, one of my high-performance rigs (dual NVidia 980 Ti GPUs) is silently humming away, putting off about 500 watts of pleasant heat. My son is investigating:

My Folding@Home Space Heater Experiment

Folding@Home uses CPUs and GPUs to run molecular dynamics models to help researchers understand and fight diseases. You get the most points per day (PPD) by using cutting-edge hardware, but the Folding@Home Consortium and Stanford University openly encourage everyone to run the software on whatever they happen to have.

With this in mind, I started thinking about all the old hardware that is out there…CPUs and graphics cards that are destined for landfills because they are no longer fast enough to do any useful gaming or decode 4K video. People describe this type of hardware as “bricks” or “space heaters”–useful for nothing other than wasting power.

That gave me an idea…

It didn’t take me long to find a sweet deal on an nForce 680i-based system on eBay for $60 shipped (EVGA board with Nvidia n680i chipset, supporting three full-length PCI-E X16 slots). I swapped out the Core 2 Duo that this machine came with for a Core 2 Quad, and purchased four Fermi-based Nvidia graphics cards, plus a used 1300 Watt Seasonic 80+ Gold power supply. All of this was amazingly cheap. The beautiful Antec case alone was worth the $60 I paid for the whole system. Because I knew lots of power would be critical here, I spent most of the money on a high-end power supply (also bought used on eBay). Later on, I found that I needed to also upgrade the cooling (read: cut a hole in the side panel and strap on some more fans).

  • Antec Mid-Tower Case + Corsair 520 Watt PSU, EVGA 680i motherboard, Core 2 Duo CPU, 4 GB RAM, CD Drives, and 4 Fans: $60
  • 2 x EVGA Nvidia GeForce GTX 480 Graphics Cards: $40
  • 1 x EVGA Nvidia GeForce GTX 580 Graphics Card: $50
  • 1 x EVGA Nvidia GeForce GTX 460 Graphics Card: $20
  • 1 x PCI-E X1 to X16 Riser: $10
  • 1 x Core 2 Quad Q6600 CPU (Socket 775): $6
  • 1 x Seasonic 1300 Watt 80+ Gold Modular Power Supply: $90
  • 2 x Noctua 120 mm Fans + custom aluminum bracket (for modifying side panel): $60
  • 1 x Arctic Cooling Freezer Tower Cooler: $10
  • 1 x Western Digital Black 640GB HDD: $10

Total Cost (Estimated): $356

This is the cost before I sold some of the parts I didn’t need (Core 2 Duo, Corsair PSU, etc).
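The total can be double-checked by summing the parts list in Python. This sketch assumes the $40 figure covers both GTX 480 cards together, which is the reading that matches the stated total.

```python
# Prices in US dollars, copied from the parts list above
parts_usd = {
    "Antec case bundle (PSU, 680i board, Core 2 Duo, RAM, drives, fans)": 60,
    "2x GTX 480 (assumed price for the pair)": 40,
    "GTX 580": 50,
    "GTX 460": 20,
    "PCI-E X1 to X16 riser": 10,
    "Core 2 Quad Q6600": 6,
    "Seasonic 1300 W PSU": 90,
    "2x Noctua fans + bracket": 60,
    "Arctic Cooling tower cooler": 10,
    "WD Black 640GB HDD": 10,
}
total_usd = sum(parts_usd.values())
print(f"Total: ${total_usd}")
```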

Here is a shot of the final build. It took a bit of tweaking to get it to this point.


Used Parts Disclaimer!

Note that when dealing with used parts on eBay, it’s always good to do some basic service. For the GPUs in this build, I took them apart, cleaned them, applied fresh thermal paste (Arctic MX-4), and re-assembled. It was good that I did…these cards were pretty gross, and the decade-old thermal paste was dried on from years of use.


I mean, come on now, look at the dust cake on the second GTX 480! Clean your graphics cards, random eBay people!

GTX 480 Dust

Here’s how the 3 + 1 GPUs are set up. The two GTX 480s and the GTX 580 are on the mobo in the X16 slots. I remotely mounted the GTX 460 in the drive bay. I used blower-style (slot exhaust) cards on purpose here, because they exhaust 100% of the hot air outside the case. Open-fan style cards would have overheated instantly in this setup.

To keep costs down, I just used Ubuntu Linux as the operating system. I configured the machine for 4-slot GPU folding using proprietary Nvidia drivers. Although I ultimately control all of my remote Linux machines with TeamViewer, it is helpful to have a portable monitor and combo wireless keyboard/mouse for initial configuration and testing. In the shot below (of an earlier config), I learned a lot just trying to get the machine stable with 3 cards.


Initial Testing on the Space Heater (3 GPUs installed). This test showed me that I needed better CPU cooling (hence I chucked that stock Intel cooler)

I also did some thermal testing along the way to make sure things weren’t getting too hot. It turns out this testing was a bit misleading, because the system was running a lot cooler with the side panel off than with it on.

Some Thermal Camera Images During Initial Burn-In (3 GPUs, stock CPU cooler):

Now that’s some heat coming out of this beast! Thankfully, the upgraded 14-gauge power plug and my watt meter aren’t at risk of melting, although they are pretty warm.

Once I had the machine up and running with all four GPUs in the final configuration, I found that it produced about 55-95K PPD on average (based on the work unit), with the following breakdown:

  • GTX 460: 10-20K PPD
  • GTX 480: 20-30K PPD each
  • GTX 580: 25-45 K PPD

Power consumption, as measured at the wall, ranged from 900 to 1000 watts with all 4 GPUs engaged. By turning different GPUs on and off, I could get varying levels of power. The machine drew about 200 watts at idle, and I typically ran it with one 580 and one 480 folding, for an average power consumption of about 600 watts.


After running the machine for a while, my room was nice and toasty, as expected!

One thing that I should mention was the effect of the two additional intake fans that I mounted in the side panel. Originally I did not have these, and the top graphics card in the stack was hitting 97 degrees C according to the onboard monitoring! After modding this custom side-intake into the case (found a nice fan bracket on Amazon, and put my dremel tool to good use), the temps went down quite a lot. I used fan grilles on the inside of the fans to keep internal cables out of them, and mesh filters on the outside to match the intake filters on the rest of the case.


The top card stays under 85 degrees C (with the fan at 50%). The middle card stays under 80 degrees C, and the bottom card runs at 60 degrees C. The GTX 460 mounted in the drive bay never goes over 60 degrees C, but it’s a less powerful card and is mounted on the other side of the case.

Here are some more pictures of the modded side panel, along with a little cooling diagram I threw together:

PPD, Wattage, and Efficiency Comparison

I debated about putting these plots in here, because the point of this machine was not primarily to make points (pun intended), or to be efficient from a PPD/Watt perspective. The point of this machine was to replace the 1500 watt space heater I use in the winter to keep a room warm.

As you can see, the scientific production (PPD) on this machine, even with 4 GPUs, is not all that impressive in 2020, since the GPUs being used are ten years old. Similarly, the efficiency (PPD/Watt) is terrible. There’s no surprise there, since it averages just under 1000 watts of power consumption at the wall!


It is totally possible to build a (relatively) inexpensive desktop computer out of old, used parts to use as a space heater. If the primary goal is to make heat, then this might not be a bad idea (although at $350, it still costs way more than a $20 heater from Walmart). The obvious benefit is that this sort of space heater is actually doing something useful besides keeping you warm (in this case, helping scientists learn more about diseases thanks to Folding@Home).

Another benefit that I found was the remote control (TeamViewer), which lets me use my cellphone to turn GPUs on and off to vary the heat output. Also, I think running this machine for extended durations in its medium-high setting (700 watts or so) is much healthier for the electrical wiring in my house vs. the constant cycling on and off of a traditional 1500 watt space heater.

From an environmental standpoint, you can do much worse than using electric heat. In my case, electric space heaters make a lot of sense, especially at night. I can shut off the entire heating zone (my house only has two zones) to the upstairs and just keep the bedroom warm. This drastically reduces my fossil fuel usage (good old New England, where home heating oil is the primary method of keeping warm in the winter). Since my house has an 8.23 kW solar panel array on the roof, a lot of my electricity comes directly from the sun, making this electric heat solution even greener.

Parting Thoughts:

I would not recommend running a machine like this during the warmer months. If warm air is not wanted, all the waste heat from this machine will do nothing but rack up your power bill for relatively little science being done. If you want to run an efficient summer-time F@H rig that uses low power (so as to not fight your AC), check out my article on the GTX 1660 and 1650.

In a future article, I plan to show how I actually saved on heating costs by running Folding@Home space heaters all last winter (with a total of seven Folding@Home desktops placed strategically throughout my house, so that I hardly had to burn any oil).


Folding@Home on Turing (NVidia GTX 1660 Super and GTX 1650 Combined Review)

Hey everyone. Sorry for the long delay (I have been working on another writing project, more on that later…). Recently I got a pair of new graphics cards based on Nvidia’s new Turing architecture. This has been advertised as being more efficient than the outgoing Pascal architecture, and is the basis of the popular RTX series Geforce cards (2060, 2070, 2080, etc). It’s time to see how well they do some charitable computing, running the now world-famous disease research distributed computing project Folding@Home.

Since those RTX cards with their ray-tracing cores (which do nothing for Folding) are so expensive, I opted to start testing with two lower-end models: the GeForce GTX 1660 Super and the GeForce GTX 1650.


These are really tiny cards, and should be perfect for some low-power consumption summertime folding. Also, today is the first time I’ve tested anything from Zotac (the 1650). The 1660 Super is from EVGA.

GPU Specifications

Here’s a quick table I threw together comparing these latest Turing-based GTX 16xx series cards to the older Pascal lineup.

Turing GPU Specs

It should be immediately apparent that these are very low power cards. The GTX 1650 has a design power of only 75 watts, and doesn’t even need a supplemental PCI-Express power cable. The GTX 1660 Super also has a very low power rating at 125 Watts. Due to their small size and power requirements, these cards are good options for small form factor PCs with non-gaming oriented power supplies.

Test Setup

Testing was done in Windows 10 using Folding@Home Client version 7.5.1. The Nvidia Graphics Card driver version was 445.87. All power measurements were made at the wall (measuring total system power consumption) with my trusty P3 Kill-A-Watt Power Meter. Performance numbers in terms of Points Per Day (PPD) were estimated from the client during individual work units. This is a departure from my normal PPD metric (averaging the time-history results reported by Folding@Home’s servers), but was necessary due to the recent lack of work units caused by the surge in F@H users due to COVID-19.

Note: This will likely be the last test I do with my aging AMD FX-8320e based desktop, since the motherboard only supports PCI Express 2.0. That is not a problem for the cards tested here, but Folding@Home on very fast modern cards (such as the RTX 2080 Ti) shows a modest slowdown (around 10%) if the cards are limited by PCI Express 2.0 x16. Thus, in the next article, expect to see a new benchmark machine!

System Specs:

  • CPU: AMD FX-8320e
  • Mainboard : Gigabyte GA-880GMA-USB3
  • GPU: EVGA 1080 Ti (Reference Design)
  • Ram: 16 GB DDR3L (low voltage)
  • Power Supply: Seasonic X-650 80+ Gold
  • Drives: 1x SSD, 2 x 7200 RPM HDDs, Blu-Ray Burner
  • Fans: 1x CPU, 2 x 120 mm intake, 1 x 120 mm exhaust, 1 x 80 mm exhaust
  • OS: Win10 64 bit

Goal of the Testing

For those of you who have been following along, you know that the point of this blog is to determine not only which hardware configurations can fight the most cancer (or coronavirus), but to determine how to do the most science with the least amount of electrical power. This is important. Just because we have all these diseases (and computers to combat them with) doesn’t mean we should kill the planet by sucking down untold gigawatts of electricity.

To that end, I will be reporting the following:

Net Worth of Science Performed: Points Per Day (PPD)

System Power Consumption (Watts)

Folding Efficiency (PPD/Watt)

As a side-note, I used MSI Afterburner to reduce the GPU Power Limit of the GTX 1660 Super and GTX 1650 to the minimum allowed by the driver / board vendor (in this case, 56% for the 1660 Super and 50% for the 1650). This is because my previous testing, plus the results of various people in the Folding@Home forums and elsewhere, have shown that by reducing the power cap on the card, you can get an efficiency boost. Let’s see if that holds true for the Turing architecture!
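To put those percentages into watts, here is a rough sketch. It assumes the power-limit slider is a percentage of the rated board power quoted in the spec section (125 watts and 75 watts). That is an assumption on my part: board vendors can set a different default limit, so treat the results as ballpark figures. The function and dictionary names are made up for this example.

```python
# Rated board power in watts, as quoted in the GPU Specifications section.
RATED_POWER_W = {"GTX 1660 Super": 125, "GTX 1650": 75}

# Minimum power-limit slider settings mentioned in the text, in percent.
MIN_POWER_LIMIT_PCT = {"GTX 1660 Super": 56, "GTX 1650": 50}


def capped_power_watts(rated_watts, limit_percent):
    """Estimate the GPU power cap implied by a power-limit percentage.

    Assumption: the slider percentage is relative to the rated board power.
    """
    return rated_watts * limit_percent / 100.0


for card, rated in RATED_POWER_W.items():
    cap = capped_power_watts(rated, MIN_POWER_LIMIT_PCT[card])
    print(f"{card}: roughly {cap:.1f} W at the minimum power limit")
```

Keep in mind this is the cap for the graphics card only. The numbers in the plots below are total system power at the wall, which also includes the CPU, drives, fans, and power supply losses.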


The following plots show the two new Turing architecture cards relative to everything else I have tested. As can be seen, these little cards punch well above their weight class, with the GTX 1660 Super and GTX 1650 giving the 1070 Ti and 1060 a run for their money. Also, the power throttling applied to the cards did reduce raw PPD, but not by too much.

Nvidia GTX 1650 and 1660 performance

Power Draw

This is the plot where I was most impressed. In the summer, any Folding@Home I do directly competes with the air conditioning. Running big graphics cards, like the 1080 Ti, causes not only my power bill to go crazy due to my computer, but also due to the increased air conditioning required.

Thus, for people in hot climates, extra consideration should be given to the overall power consumption of your Folding@Home computer. With the GTX 1660 Super running in reduced power mode, I was able to get a total system power consumption of just over 150 watts while still making over 500K PPD! That’s not half bad. On the super low power end, I was able to beat the GTX 1050’s power consumption level…getting my beastly FX-8320e 8-core rig to draw 125 watts total while folding was quite a feat. The best thing was that it still made almost 300K PPD, which is well above last generation’s small cards.

Nvidia GTX 1650 and 1660 Power Consumption


This is my favorite part. How do these low-power Turing cards do on the efficiency scale? This is simply looking at how many PPD you can get per watt of power draw at the wall.
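For anyone who wants to run the same math on their own rig, here is a minimal sketch of the calculation. The function name is mine, and the example numbers are just the rounded figures quoted in the power draw discussion above (about 500K PPD at about 150 watts, and about 300K PPD at 125 watts), so the results are approximate and will not exactly match the plot.

```python
def folding_efficiency(ppd, wall_watts):
    """Return folding efficiency in PPD per watt of wall power."""
    if wall_watts <= 0:
        raise ValueError("wall power must be a positive number of watts")
    return ppd / wall_watts


# Rounded figures from the text, not exact measurements.
examples = {
    "GTX 1660 Super (reduced power)": (500_000, 150),
    "GTX 1650 (reduced power)": (300_000, 125),
}

for name, (ppd, watts) in examples.items():
    print(f"{name}: about {folding_efficiency(ppd, watts):.0f} PPD/Watt")
```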

Nvidia GTX 1650 and 1660 Efficiency

And…wow! Just wow. For about $220 new, you can pick up a GTX 1660 Super and be just as efficient as the previous generation’s top card (GTX 1080 Ti), which still goes for $400-500 used on eBay. Sure, the 1660 Super won’t be as good of a gaming card, and it makes only about two-thirds of the PPD of the 1080 Ti, but on an energy efficiency metric it holds its own.

The GTX 1650 did pretty well too, coming in somewhere towards the middle of the pack. It is still much more efficient than the similar market segment cards of the previous generation (GTX 1050), but it is overall hampered by not being able to return work units as quickly to the scientists, who prioritize fast work with bonus points (Quick Return Bonus).


NVIDIA’s entry-level Turing architecture graphics cards perform very well in Folding@Home, both from a performance and an efficiency standpoint. They offer significant gains relative to legacy cards, and can be a good option for a budget Folding@Home build.

Join My Team!

Interested in fighting COVID-19, Cancer, Alzheimer’s, Parkinson’s, and many other diseases with your computer? Please consider downloading Folding@Home and joining Team Nuclear Wessels (54345). See my tutorial here.

Interested in Buying a GTX 1660 or GTX 1650?

Please consider supporting my blog by using one of the below Amazon affiliate search links to find your next card! It won’t cost you anything extra, but will provide me with a small part of Amazon’s profit so I can keep paying for this site.

GTX 1660 Amazon Search Affiliate Link!

GTX 1650 Amazon Search Affiliate Link!

How to Run Folding@Home on a Graphics Card in Windows 10

(A Folding at Home Unofficial Configuration Guide for GPU, Multi-GPU, and CPU/GPU Folding)

Folding@Home is a distributed computing project for fighting diseases. If you’re reading this post, then you are probably looking for some help getting Folding@Home running on your graphics card. GPU folding, when configured properly, is one of the best ways to do tons of science efficiently. I hope this Folding@Home GPU Guide helps you start kicking butt against cancer and other diseases. So, let’s get started.

Note: for people who already have the Folding@Home client up and running and you want to switch from CPU folding to GPU folding, skip right to Step 3. Please note that if you are changing your hardware configuration on a machine that is already folding, it is courteous to let the existing work units finish by using the “finish” option on the client prior to re-arranging hardware. This keeps work units from being lost.

Step 0: System Requirements

Yes, we’re starting at zero, because computer indexing starts here too. Plus, before you even try this, you need the right stuff in the box.

Operating System

While Folding@Home supports many operating systems, this guide is aimed at Windows users. I’ll be using Windows 10, but the steps are the same for Windows 7.

Overall Computer

Give Me Efficiency or Give Me an Empty CPU Socket!

You do need to think about what goes in this socket, even if you’re GPU folding

Even though this is a guide about graphics card folding, the rest of your computer needs to be up to snuff to keep the card fed. Ideally, you want one dedicated CPU core for your overall Windows environment, plus one CPU core for each graphics card you want to run F@H on. So, for a 1-GPU computer, having two CPU cores available is optimal. A dual-GPU computer should have three cores available, a three-GPU computer should have four cores available, etc. In terms of clock rate, almost all modern processors with clock rates above 2.0 GHz will work. Remember, we aren’t doing CPU folding here; the CPU just needs to be fast enough to keep the GPU fed.

Circuit City


Motherboards don’t matter too much, except that you should have a full-width PCI-Express x16 slot for each graphics card you want to fold on. When you get into really fast, new graphics cards like the RTX 2080,  a PCI-E 3.0 x16 slot will ensure the data flows fast enough to the card. PCI-Express 2.0 bandwidth will work with these ultra-fast cards, but there will be a slight bottleneck. Note I have never seen any bottlenecks with my GTX 1080 Ti on PCI-Express 2.0 x16 in Windows, but when adding a second card (using an x1 riser), I did see a slowdown on my Gigabyte 880-series socket AM3 board.


You should also aim to have 8 GB of RAM (16 ideally), just because Windows tends to be a resource hog. Some people can fold just fine with 4 GB, but for this guide I am assuming you want to be able to use the machine as well. Memory channel configuration and speed don’t matter very much for Folding@Home, especially on GPUs.

Hard Drives

Any old hard drive with 60 GB or so of free space will do. The F@H client takes up almost no space. The 60 GB of free space is really just what you need for Windows 10 to not run really badly, regardless of what the machine is being used for.

Internet Connection

Almost anything works, as long as it doesn’t drop out.

Power Supply

PC Power & Cooling SILENCER PSU

This is a critical and often overlooked component in the world of scientific computing. I’ve written many articles on power supplies, so feel free to browse through my site to learn more. In short, make sure your system has enough PSU wattage to drive the video card, based on the video card’s recommendation. You’ll also need to make sure your power supply has the correct auxiliary power cables (PCI-Express 6-pin and/or 8-pin) to supply enough current to cards requiring supplemental power.

For multiple cards, you’ll need more nameplate PSU wattage. Power supplies should be 80+ Bronze certified or better to help deliver power efficiently, because no one likes wasting money on misused electricity (and this hurts the environment). Also, you should try and stick with major manufacturers, such as (but not limited to) Corsair, Antec, Seasonic, Cooler Master, PC Power & Cooling, Thermaltake, etc.

Here are some common computer configurations and a reasonable power supply wattage to drive them:

  • 1 x Low-End GPU (GTX 1050, RX560, etc) –> 380 Watt PSU
  • 1 x Mid-Range GPU (GTX 1060, RX570, etc) –> 450 Watt PSU
  • 1 x High-End GPU (GTX 1080, Vega64, etc) –> 550 Watt PSU
  • 2 x Mid-Range GPUs or 3 x Low-End GPUs –> 600 Watt PSU
  • 2 x High-End GPUs or 3 x Mid-Range GPUs –> 800 Watt PSU
  • 3 x High-End GPUs or 4 x Mid-Range GPUs –> 1000 Watt PSU
  • 4 x High-End GPUs (you’re crazy!) –> 1200+ Watt PSU

Saving the Planet Tip: Any PSU supplying an active load of 600 Watts or more should be 80+ Gold certified or better. This will minimize waste heat due to efficiency losses, which really start to add up for high power-draw computers.
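If you like having this sort of thing in a script, the list and the tip above can be captured as a small lookup. This is only a sketch: the tier names ("low", "mid", "high") and the function names are labels I made up for the example, and combinations that aren’t in the list raise an error rather than guessing a wattage.

```python
# Suggested nameplate PSU wattage, keyed by (GPU tier, number of GPUs).
# Values are copied from the list above; the 4 x high-end entry is a minimum.
PSU_TABLE = {
    ("low", 1): 380,
    ("mid", 1): 450,
    ("high", 1): 550,
    ("mid", 2): 600,
    ("low", 3): 600,
    ("high", 2): 800,
    ("mid", 3): 800,
    ("high", 3): 1000,
    ("mid", 4): 1000,
    ("high", 4): 1200,
}


def suggested_psu_watts(tier, count):
    """Look up the suggested PSU wattage for `count` GPUs of a given tier."""
    try:
        return PSU_TABLE[(tier, count)]
    except KeyError:
        raise ValueError(f"no suggestion listed for {count} x {tier} GPU(s)") from None


def suggested_certification(active_load_watts):
    """Return the minimum 80+ rating suggested for a given active load."""
    return "80+ Gold" if active_load_watts >= 600 else "80+ Bronze"


print(suggested_psu_watts("high", 2), "Watt PSU")
print(suggested_certification(700))
```

Note that the certification check is based on the active load the PSU is actually supplying, not the nameplate wattage, which matches how the tip is worded.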


This is another overlooked requirement. Any computer doing 24/7 computations on a graphics card is going to get pretty toasty. Thankfully, most modern computer cases come with enough space and fans to deal with this. You’ll want at least one dedicated 120 mm exhaust fan (not including the PSU fan) and one 120 mm intake fan to keep the air flowing. If you have dual graphics cards, having an intake fan right on the side panel blowing on the cards is one of the best ways to keep a hot pocket of air from forming between the cards. Consider reference-style video cards (centrifugal 2-slot blower coolers) for multi-card setups to help dump the heat, since open-fan cards tend to just drown in their own heat if there isn’t enough airflow. I also recommend aftermarket coolers on CPUs, since your processor will be actively spooled up and feeding your graphics card. That said, CPU cooling doesn’t need to be overkill.

Icy Opteron 4184


Graphics Card
Graphics Card Showdown: EVGA Nvidia Geforce GTX 1050 TI vs. Gigabyte AMD Radeon HD 7970 GHz Edition

Graphics Cards: You’ll Need One

First off, you should actually have a discrete graphics card. While F@H might run on some onboard / APU graphics solutions, the performance won’t be worth it, and you might as well just run CPU folding.

Folding@Home works on many discrete graphics cards that support OpenCL, but not all cards are supported. AMD Radeon HD 5xxx cards and Nvidia GeForce 4xx cards and newer are currently supported, but that can always change. See the project’s system requirements for a complete list. I personally recommend using Nvidia 9xx series cards or AMD RX 5xx cards or newer, since these are more efficient than older hardware. My review of the GeForce 1080 Ti has some plots on efficiency and performance that might be helpful if you are selecting a card specifically for folding. Make sure you have the latest drivers for your card from either AMD or Nvidia.

Step 1: System Prep

Before even downloading Folding@Home, you should do a few basic things just to make sure the system is going to be stable for heavy computations. On the software side, this means updating drivers, making sure virus definitions and Windows updates are up to date, etc. On the hardware side, I recommend fully air-canning the dust out of your machine to optimize cooling. If the computer is older and the GPU you plan to use has been installed for a while, it’s worth taking the graphics card out and hitting it with some compressed air from all angles to clean out the heat sinks.

Step 2: Download and Install V7 Client

The Folding@Home V7 client can be found here:

The current client version is 7.5.1. Go ahead and install it. For this part, it’s basically just following the prompts. F@H’s default Windows install guide works well enough, and you can read that here. I’m linking to the standard install guide instead of regurgitating the steps, because I want this to be done identically to how the F@H Consortium recommends it be done. If you don’t want to fold anonymously, select the “Set up an identity” button. You’ll want to pick a user name and enter a team number if you have one you’d like to join. All of this can be configured later within the client (and this will be required for GPU folding).

For example, if you wanted to join our team, you’d enter number 54345 in the team number field to join team Nuclear Wessels!

A note about Passkeys: you want one of these if you want to get lots of points and compete on the F@H leaderboards. Passkeys are a secure key that makes sure your points are your own (i.e. no one is using your username to generate points elsewhere). You need to have a Passkey if you want to be eligible for the Quick Return Bonus (more points given to users who do science quickly). You become eligible for the bonus once you have successfully completed ten work units and you have a valid passkey. You can get a Passkey here (but you don’t have to do this right away. Just like configuring your GPUs, it can be done later).

Step 3: Configure the Client for GPU Folding


Now we are going to edit settings within the Advanced Control section of the Folding@Home client. To get here, look at your Windows task bar (next to the clock). Once F@H is installed, there should be a little molecule there. Right-click that bad boy and select “Advanced Control” to open the local client window.

Right-Click FAH

This opens up the client view. Here is what mine currently looks like (with GPU slots configured). Depending on how you got here, you might or might not have a team name and user identity displayed, and you might or might not have CPU folding enabled.

F@H Control V7

Go ahead and click the “Configure” button in the top-left of the window. Go to the “Identity” tab first.


Here, you can change any of the user info and team name info you entered when you installed the V7 client. You can also enter a Passkey if you have one (for those sweet, sweet Quick Return Bonus Points!).

Pitch: I’d be honored if you joined team # 54345 (Nuclear Wessels). We are currently doing everything we can to fight the COVID-19 coronavirus.

Nuclear Wessels Meme

Next, go one tab over to “Slots”. Here, you can see what devices Folding@Home plans to use (either CPU or GPU). For my setup, I have removed all CPU slots and added two GPU slots (one slot for my 980 Ti and one for my 1080 Ti). If you originally started folding on the CPU and want to switch to GPU folding, you can delete your CPU slot here and add GPU slot(s) for your graphics card(s).

Note: If you want to do mixed hardware folding (CPU + GPU), I will talk about that in Step 4.


The slot configuration window opens up when you add or edit a slot. Here are the options.

Slot Config Selecting the GPU button and leaving all the index settings at -1 is a good place to start. Nine times out of ten, the client will properly detect graphics cards this way. For my computer, adding two GPU slots with settings like this resulted in it properly detecting and folding on my installed GTX 980 Ti and GTX 1080 Ti cards.

In rare cases, the client might get confused. This happens in systems with onboard graphics (such as with AMD APUs). What happens is you are trying to fold on your discrete graphics card, and instead the F@H client is running the GPU slot on the APU. When this happens, I’ve found the easiest thing to do is reboot the computer, go into the BIOS, and disable the APU graphics from there, so that the client can’t even see the APU. Thus, the GPU slot with a -1 index defaults to the discrete graphics card.

Alternatively, you can use the gpu-index, opencl-index, and cuda-index boxes to try and get the slot to run on the correct graphics card. This is a trial and error process that is beyond the scope of this guide (leave me a comment if you need help with this, or ask someone in the Folding@Home Forums).

Advanced Slot Options

The Extra Slot Options (expert only) box on the bottom can sometimes help you eke a bit more performance out of the GPU slots. However, your mileage may vary. You can add or remove slot options with the + and – buttons on the bottom-right.

The settings I tend to add are these:

Advanced Options

Here, client-type advanced lets me get “late stage beta” work units, which might be a bit more unstable than normal work units, yet this helps the Folding@Home Consortium get new projects tested sooner. Max-Packet-Size Big (other options are “normal” and “small”) lets me download large molecules that will push the system a bit harder (more VRAM needed, more internet bandwidth, etc). Pause-on-start (value of “true” or “false”) tells the system to pause the folding slot when the computer boots (instead of automatically folding as soon as the machine is on). This is nice for when I want to kick folding off manually. Set this to “false” or leave it blank if you want the computer to fold automatically after a restart.

For a detailed list of these slot options, see the config guide here. Note: some of this is out of date.

Step 4 (Optional): Configure a CPU Slot as well

If you have CPU cores to spare, you can add a CPU folding slot in addition to the GPU slots. I recommend leaving 1 CPU core free for Windows background tasks (unless you are making a dedicated folding rig and don’t mind it being a bit slow to use). You should also keep 1 CPU core free for feeding each GPU that you have in your system. So, for my 8-core AMD FX-8320e with my two graphics cards, I could do something like this:

Total CPU Cores: 8

Cores needed for Windows: 1

Cores Needed for GPU Slots: 2 (one for each GPU)

Cores Remaining: = 8-1-2 = 5

So, theoretically, I can set my CPU folding slot to use 5 CPU cores. Now, an interesting fact is that in multi-core computing, prime numbers like 3, 5, and 7 do not work so well. Folding at home also doesn’t do well with high prime numbers, or multiples thereof (such as 14 threads, which is a multiple of prime number 7). It has to do with how all the data threads are stitched together.

For example, you get similar performance folding with 4 CPU cores as with 5 (4 is a nice power of 2 that computers like). In my case, for a non-dedicated folding rig, I set up a CPU slot with 4 CPU cores enabled, leaving two cores to handle whatever else the computer is doing and two cores to feed the graphics cards. Incidentally, if this were a guide about just setting up CPU folding, I would leave this box at “-1”.
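The core budget and the "avoid awkward thread counts" rule of thumb can be sketched in a few lines. Fair warning: this is my own heuristic for illustration, not how the Folding@Home client picks thread counts. The function names are made up, and the cutoff for what counts as a "high" prime (`max_prime`) is an assumption. The default of 3 still allows counts like 6 and 12; pass `max_prime=2` if you want to stick to powers of two.

```python
def spare_cpu_cores(total_cores, gpu_count, os_cores=1):
    """Cores left for CPU folding after reserving one per GPU plus the OS."""
    return max(total_cores - os_cores - gpu_count, 0)


def largest_prime_factor(n):
    """Return the largest prime factor of n (returns 1 for n == 1)."""
    factor, largest = 2, 1
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    # Whatever is left above 1 is itself a prime factor, and the largest one.
    return n if n > 1 else largest


def friendly_thread_count(spare_cores, max_prime=3):
    """Largest thread count <= spare_cores with no prime factor above max_prime."""
    for threads in range(spare_cores, 0, -1):
        if largest_prime_factor(threads) <= max_prime:
            return threads
    return 0


spare = spare_cpu_cores(total_cores=8, gpu_count=2)
print("Spare cores:", spare)
print("Suggested CPU slot threads:", friendly_thread_count(spare))
```

For the FX-8320e example above (8 cores, two GPUs), this lands on 5 spare cores and a 4-thread CPU slot, which matches what I set up by hand.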

4 CPU Core Config

Now, just hit the OK button and then save the slot configuration.

Save Slot Config

Step 5: Observe Slots Descriptions in the Client

Now, I can see that I have three slots (two GPU and one CPU) listed in the client window.

Ready Slots

Here, you should see that the CPU slot is using the number of threads you told it to use (4, in my case), and that the graphics cards are correctly identified. This all looks good.

Step 6: Watch it run!

Once you have your slots configured, you should be able to sit back and watch your computer fight disease with everything it’s got. One last thing: A helpful tool for graphics card monitoring is something like MSI Afterburner, or AMD’s built-in tool Wattman. It’s good to use these to make sure your card has enough thermal headroom to perform (keep it under 80 degrees C if you can!). If your card is thermally throttling, you’ll see an impact to Folding@Home PPD. I find that setting custom fan curves, or just setting the fan to run a bit faster than it normally would, is often enough to eliminate this.


The V7 client installer does the best job at detecting your specific graphics hardware during initial software installation. If you added a new graphics card that is not recognized, you should do a clean re-install of the V7 client. Write down your Name, Team Number, and Passkey, uninstall the client completely (including data), reinstall, and see if the new card is detected.

Some new graphics cards are also not immediately supported upon release. For example, the Radeon 5700 XT is only recently gaining support with advanced beta work units, but work is progressing to get this card fully supported (as of 3/2020). You can read up on which cards are supported and which aren’t yet on the GPU Whitelist Thread.

Leave me a comment if…

Did this guide help you? Did I miss something? Let me know how I can help and make this better by leaving a comment. Thanks!


Addendum: Helpful Links to Other Tutorials

HFM.net – A remote monitoring program for F@H Clients

HFM.net monitoring tutorial (Youtube) – Video Tutorial by Frax1006

Teamviewer Guide – A remote desktop solution to let you log into folding machines and monitor / configure them. This is an excellent write-up by Pyroball.

Official F@H Advanced User Custom Installation Guide

Official F@H Configuration Guide

Overclocker’s Club F@H Guide


Folding@Home Review: NVIDIA GeForce GTX 1080 Ti

Released in March 2017, Nvidia’s GeForce GTX 1080 Ti was the top-tier card of the Pascal line-up. This is the graphics card that super-nerds and gamers drooled over. With an MSRP of $699 for the base model, board partners such as EVGA, Asus, Gigabyte, MSI, and Zotac (among others) all quickly jumped on board (pun intended) with custom designs costing well over the MSRP, as well as their own takes on the reference design.

GTX 1080 Ti Reference EVGA

EVGA GeForce GTX 1080 Ti – Reference

Three years later, with the release of the RTX 2080 Ti, the 1080 Ti still holds its own, and still commands well over $400 on the used market. These are beastly cards, capable of running most games with max settings in 4K resolutions.

But, how does it fold?


Folding at home is a distributed computing project originally developed by Stanford University, where everyday users can lend their PC’s computational horsepower to help disease researchers understand and fight things like cancer, Alzheimer’s, and most recently the COVID-19 Coronavirus. Users’ computers solve molecular dynamics problems in the background, which help the Folding@Home Consortium understand how proteins “misfold” to cause disease. For computer nerds, this is an awesome way to give (money–>electricity–>computer work–>fighting disease).

Folding at home (or F@H) can be run on both CPUs and GPUs. CPUs provide a good baseline of performance, and certain molecular simulations can only be done here. However, GPUs, with their massively parallel shader cores, can do certain types of single-precision math much faster than CPUs. GPUs provide the majority of the computational performance of F@H.

Geforce GTX 1080 Ti Specs

The 1080 Ti is at the top of Nvidia’s lineup of their 10-series cards.

1080 Ti Specs

With 3584 CUDA Cores, the 1080 Ti is an absolute beast. In benchmarks, it holds its own against the much newer RTX cards, besting even the RTX 2080 and matching the RTX 2080 Super. Only the RTX 2080 Ti is decidedly faster.

Folding@Home Testing

Testing is performed in my old but trusty benchmark machine, running Windows 10 Pro and using Stanford’s V7 Client. The Nvidia graphics driver version was 441.87. Power consumption measurements are taken on the system-level using a P3 Watt Meter at the wall.

System Specs:

  • CPU: AMD FX-8320e
  • Mainboard : Gigabyte GA-880GMA-USB3
  • GPU: EVGA 1080 Ti (Reference Design)
  • Ram: 16 GB DDR3L (low voltage)
  • Power Supply: Seasonic X-650 80+ Gold
  • Drives: 1x SSD, 2 x 7200 RPM HDDs, Blu-Ray Burner
  • Fans: 1x CPU, 2 x 120 mm intake, 1 x 120 mm exhaust, 1 x 80 mm exhaust
  • OS: Win10 64 bit

I did extensive testing of the 1080 Ti over many weeks. Folding@Home rewards donors with “Points” for their contributions, based on how much science is done and how quickly it is returned. A typical performance metric is “Points per Day” (PPD). Here, I have averaged my Points Per Day results out over many work units to provide a consistent number. Note that any given work unit can produce more or less PPD than the average, with variation of 10% being very common. For example, here are five screen shots of the client, showing five different instantaneous PPD values for the 1080 Ti.
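Here is a minimal sketch of that kind of averaging, along with a check on how far individual work units stray from the average. The five PPD values below are hypothetical placeholders I picked for illustration. They are not the numbers from my screenshots, and the function name is made up.

```python
from statistics import mean


def summarize_ppd(samples):
    """Return (average PPD, largest deviation from the average in percent)."""
    average = mean(samples)
    worst = max(abs(sample - average) for sample in samples)
    return average, 100.0 * worst / average


# Hypothetical instantaneous PPD readings from five work units.
ppd_samples = [1_000_000, 1_100_000, 1_050_000, 1_150_000, 1_200_000]

average, spread_pct = summarize_ppd(ppd_samples)
print(f"Average: {average:,.0f} PPD, largest deviation: {spread_pct:.1f}%")
```

The more work units you average over, the less any one unusually generous or stingy project skews the result.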


GTX 1080 Ti Folding@Home Performance

The following plot shows just how fast the 1080 Ti is compared to other graphics cards I have tested. As you can see, with nearly 1.1 Million PPD, this card does a lot of science.

1080 Ti Folding Performance

GTX 1080 Ti Power Consumption

With a board power rating of 250 Watts, this is a power hungry graphics card. Thus, it isn’t surprising to see that power consumption is at the top of the pack.

1080 Ti Folding Power

GTX 1080 Ti Efficiency

Power consumption alone isn’t the whole story. Being a blog about doing the most work possible for the least amount of power, I am all about finding Folding@Home hardware that is highly efficient. Here, efficiency is defined as Performance Out / Power In. So, for F@H, it is PPD/Watt. The best F@H hardware is gear that maximizes disease research (performance) done per watt of power consumed.

Here’s the efficiency plot.

1080 Ti Folding Efficiency


The Geforce GTX 1080 Ti is the fastest and most efficient graphics card that I’ve tested so far for Stanford’s Folding@Home distributed computing project. With a raw performance of nearly 1.1 Million PPD in Windows and an efficiency of almost 3500 PPD/Watt, this card is a good choice for doing science effectively.

Stay tuned to see how Nvidia’s latest Turing architecture stacks up.

Folding@Home: Nvidia GTX 1080 Review Part 3: Memory Speed

In the last article, I investigated how the power limit setting on an Nvidia Geforce GTX 1080 graphics card could affect the card’s performance and efficiency for doing charitable disease research in the Folding@Home distributed computing project. The conclusion was that a power limit of 60% offers only a slight reduction in raw performance (Points Per Day), but a large boost in energy efficiency (PPD/Watt). Two articles ago, I looked at the effect of GPU core clock. In this article, I’m experimenting with a different variable. Namely, the memory clock rate.

The effect of memory clock rate on video games is well defined. Gamers looking for the highest frame rates typically overclock both their graphics GPU and Memory speeds, and see benefits from both. For computation projects like Stanford University’s Folding@Home, the results aren’t as clear. I’ve seen arguments made both ways in the hardware forums. The intent of this article is to simply add another data point, albeit with a bit more scientific rigor.

The Test

To conduct this experiment, I ran the Folding@Home V7 GPU client for a minimum of 3 days continuously on my Windows 10 test computer. Folding@Home points per day (PPD) numbers were taken from Stanford’s Servers via the helpful team at https://folding.extremeoverclocking.com.  I measured total system power consumption at the wall with my P3 Kill A Watt meter. I used the meter’s KWH function to capture the total energy consumed, and divided out by the time the computer was on in order to get an average wattage value (thus eliminating a lot of variability). The test computer specs are as follows:
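The energy-to-average-power conversion is simple, but here it is as a sketch in case you want to do the same with your own meter. The function name is mine, and the example reading (18 KWH over 72 hours) is a made-up illustration, not one of my measurements.

```python
def average_watts(energy_kwh, hours_on):
    """Convert a meter's accumulated energy reading into average power."""
    if hours_on <= 0:
        raise ValueError("the run time must be a positive number of hours")
    # 1 KWH is 1000 watt-hours; dividing by hours leaves watts.
    return energy_kwh * 1000.0 / hours_on


# Hypothetical reading: 18 KWH accumulated over a 3-day (72 hour) run.
print(f"{average_watts(18.0, 72):.1f} W average")
```

Averaging over the whole run this way smooths out the moment-to-moment swings you would see if you just glanced at the meter’s instantaneous wattage display.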

Test Setup Specs

  • Case: Raidmax Sagitta
  • CPU: AMD FX-8320e
  • Mainboard : Gigabyte GA-880GMA-USB3
  • GPU: Asus GeForce 1080 Turbo
  • Ram: 16 GB DDR3L (low voltage)
  • Power Supply: Seasonic X-650 80+ Gold
  • Drives: 1x SSD, 2 x 7200 RPM HDDs, Blu-Ray Burner
  • Fans: 1x CPU, 2 x 120 mm intake, 1 x 120 mm exhaust, 1 x 80 mm exhaust
  • OS: Win10 64 bit
  • Video Card Driver Version: 372.90

I ran this test with the memory clock rate at the stock clock for the P2 power state (4500 MHz), along with the gaming clock rate of 5000 MHz and a reduced clock rate of 4000 MHz. This gives me three data points of comparison. I left the GPU core clock at +175 MHz (the optimum setting from my first article on the GTX 1080) and the power limit at 100%, to ensure I had headroom to move the memory clock without affecting the core clock. I verified I wasn’t hitting the power limit in MSI Afterburner.

*Update. Some people may ask why I didn’t go beyond the standard P0 gaming memory clock rate of 5000 MHz (same thing as 10,000 MHz double data rate, which is the card’s advertised memory clock). Basically, I didn’t want to get into the territory where the GDDR5’s error checking comes into play. If you push the memory too hard, there can be errors in the computation but work units can still complete (unlike a GPU core overclock, where work units will fail due to errors). The reason is the built-in error checking on the card memory, which corrects errors as they come up but results in reduced performance. By staying away from 5000+ MHz territory on the memory, I can ensure the relationship between performance and memory clock rate is not affected by memory error correction.

1080 Memory Boost Example

Memory Overclocking Performed in MSI Afterburner

Tabular Results

I put together a table of results in order to show how the averaging was done, and the # of work units backing up my +500 MHz and -500 MHz data points. Having a bunch of work units is key, because there is significant variability in PPD and power consumption numbers between work units. Note that the performance and efficiency numbers for the baseline memory speed (+0 MHz, aka 4500 MHz) come from my extended testing baseline for the 1080 and have even more sample points.

Geforce 1080 PPD Production - Ram Study

Nvidia GTX 1080 Folding@Home Production History: Data shows increased performance with a higher memory speed

Graphic Results

The following graphs show the PPD, Power Consumption, and Efficiency curves as a function of graphics card memory speed. Since I had three data points, I was able to fit a simple linear trendline through them. The R-squared value of the trendline shows how well a straight line represents the data (higher is better, with 1 being ideal). Note that for the power consumption, the card seems to have used more power with a lower memory clock rate than with the baseline memory clock. I am not sure why this is…however, the difference is so small that it is likely due to work unit variability or background tasks running on the computer. One could even argue that all of the power consumption results are suspect, since the changes are so small (on the order of 5-10 watts between data points).

Geforce 1080 Performance vs Ram Speed

Geforce 1080 Power vs Ram Speed

Geforce 1080 Efficiency vs Ram Speed
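For anyone who wants to reproduce the trendline fit without a spreadsheet, here is a minimal sketch of a least-squares line and its R-squared value in plain Python. The function name `linear_fit` is my own, and the PPD numbers in the example are placeholders for illustration only, not the measured results from the table.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept, r_squared) for a least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Memory clocks from the test; the PPD values are PLACEHOLDERS, not measurements.
memory_clock_mhz = [4000, 4500, 5000]
placeholder_ppd = [700_000, 750_000, 780_000]

slope, intercept, r2 = linear_fit(memory_clock_mhz, placeholder_ppd)
print(f"slope: {slope:.1f} PPD per MHz, R-squared: {r2:.3f}")
```

With only three points, R-squared will tend to look good even for noisy data, so it is worth reading alongside the work unit counts behind each point.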


Increasing the memory speed of the Nvidia Geforce GTX 1080 results in a modest increase in PPD and efficiency, and arguably a slight increase in power consumption. The differences between the fastest (+500 MHz) and slowest (-500 MHz) data points I tested are:

PPD: +81K PPD (11.5%)

Power: +9.36 Watts (3.8%)

Efficiency: +212.8 PPD/Watt (7.4%)

Keep in mind that these are for a massive difference in ram speed (5000 MHz vs 4000 MHz).

Another way to look at these results is that underclocking the graphics card ram in hopes of improving efficiency doesn’t work (you’ll actually lose efficiency). I expect this trend will hold true for the rest of the Nvidia Pascal series of cards (GTX 10xx), although so far my testing of this has been limited to this one card, so your mileage may vary. Please post any insights if you have them.

NVIDIA GEFORCE GTX 1080 Folding@Home Review (Part 1)


It’s hard to believe that the Nvidia GTX 1080 is almost three years old now, and I’m just getting around to writing a Folding@Home review of it. In the realm of graphics cards, this thing is legendary, and only recently displaced from the enthusiast podium by Nvidia’s new RTX series of cards. The 1080 was Nvidia’s top of the line gaming graphics card (next to the Ti edition of course), and has been very popular for both GPU coin mining and cancer-curing (or at least disease research for Stanford University’s charitable distributed computing project: Folding@Home). If you’ve been following along, you know it’s that second thing that I’m interested in. The point of this review is to see just how well the GTX 1080 folds…and by well, I mean not just raw performance, but also energy efficiency.

Quick Stats Comparison

I threw together a quick table to give you an idea of where the GTX 1080 stacks up (I left the newer RTX cards and the older GTX 9-series cards off of here because I’m lazy…).

Nvidia Pascal Cards

Nvidia Pascal Family GPU Comparison

As you can see, the GTX 1080 is pretty fast, eclipsed only by the GTX 1080 Ti (which also has a higher Thermal Design Power, suggesting more electricity usage). From my previous articles, we’ve seen that the more powerful cards tend to do work more efficiently, especially if they are in the same TDP bracket. So, the 1080 should be a better folder (both in PPD and PPD/Watt efficiency) than the 1070 Ti I tested last time.

Test Card: ASUS GeForce GTX 1080 Turbo

As with the 1070 Ti, I picked up a pretty boring flavor of a 1080 in the form of an Asus Turbo card. These cards lack back plates (which help with circuit board rigidity and heat dissipation) and use cheap blower coolers, which suck in air with a single centrifugal fan on the underside and blow it out the back of the case (keeping the hot air from building up in the case). These are loud, and tend to run hotter than open-fan coolers, so overclocking and boost clocks are limited compared to aftermarket designs. However, like Nvidia’s own Founders Edition reference cards, this reference design provides a good baseline for a 1080’s minimum performance.

ASUS GeForce GTX 1080 Turbo

The new 1080 looks strikingly similar to the 1070 Ti…Asus is obviously reusing the exact same cooler since both cards have a 180 Watt TDP.

Asus GTX 1080 and 1070 Ti

Asus GTX 1080 and 1070 Ti (which one is which?)

Test Environment

Like most of my previous graphics card testing, I put this into my AMD FX-Based Test System. If you are interested in how this test machine does with CPU folding, you can read about it here. Testing was done using Stanford’s Folding@Home V7 Client (version 7.5.1) in Windows 10. Points Per Day (PPD) production was collected from Stanford’s servers. Power measurements were done with a P3 Kill A Watt Meter (taken at the wall, for a total-system power profile).

Test Setup Specs

  • Case: Raidmax Sagitta
  • CPU: AMD FX-8320e
  • Mainboard : Gigabyte GA-880GMA-USB3
  • GPU: Asus GeForce 1080 Turbo
  • Ram: 16 GB DDR3L (low voltage)
  • Power Supply: Seasonic X-650 80+ Gold
  • Drives: 1x SSD, 2 x 7200 RPM HDDs, Blu-Ray Burner
  • Fans: 1x CPU, 2 x 120 mm intake, 1 x 120 mm exhaust, 1 x 80 mm exhaust
  • OS: Win10 64 bit
  • Video Card Driver Version: 372.90

Video Card Configuration – Optimize for Performance

In my previous articles, I’ve shown how Nvidia GPUs don’t always automatically boost their clock rates when running Folding@home (as opposed to video games or benchmarks). The same is true of the GTX 1080. It sometimes needs a little encouragement in order to fold at the maximum performance. I overclocked the core by 175 MHz and increased the power limit* by 20% in MSI Afterburner, using settings similar to those for the GTX 1070. These values were shown to be stable after 2+ weeks of testing with no dropped work units.

*I also experimented with the power limit at 100% and I saw no change in card power consumption. This makes sense…folding is not using 100% of the GPU. Inspection of the MSI Afterburner plots shows that while folding, the card does not hit the power limit at either 100% or 120%. I will have to reduce the power limit to get the card to throttle back (this will happen in part 2 of this article).

As with previous cards, I did not push the memory into its performance zone, but left it at the default P2 (low-power) state clock rate. The general consensus is that memory clock does not significantly affect folding@home, and it is better to leave the power headroom for the core clock, which does improve performance. As an interesting side-note, the memory clock on this thing jumps up to 5000 MHz (effective) in benchmarks. For example, see the card’s auto-boost settings when running Heaven:

1080 Benchmark Stats

Nvidia GeForce GTX 1080 – Boost Clocks (auto) in Heaven Benchmark

Testing Overview

For most of my tests, I just let the computer run folding@home 24/7 for a couple of days and then average the points per day (PPD) results from Stanford’s stats server. Since the GTX 1080 is such a popular card, I decided to let it run a little longer (a few weeks) to get a really good sampling of results, since PPD can vary a lot from work unit to work unit. Before we get into the duration results, let’s do a quick overview of what the Folding@home environment looks like for a typical work unit.

The following is an example screen shot of the display from the client, showing an instantaneous PPD of about 770K, which is very impressive. Here, it is folding on a core 21 work unit (Project 14124).

F@H Client 1080

Folding@Home V7 Client – GeForce GTX 1080

MSI Afterburner is a handy way to monitor GPU stats. As you can see, the GPU usage is hovering in the low 80% region (this is typical for GPU folding in Windows. Linux can use a bit more of the GPU for a few percentage points more PPD). This Asus card, with its reference blower cooler, is running a bit warm (just shy of 70 degrees C), but that’s well within spec. I had the power limit at 120%, but the card is nowhere near hitting that…the power limit seems to just peak above 80% here and there.

GTX 1080 MSI Afterburner

GTX 1080 stats while folding.

Measuring card power consumption with the driver shows that it’s using about 150 watts, which seems about right when compared to the GPU usage and power % graphs. 100% GPU usage would be ideal (and would result in a power consumption of about 180 watts, which is the 1080’s TDP).

In terms of card-level efficiency, this is 770,000 PPD / 150 Watts = 5133 PPD/Watt.

Power Draw (at the card)

Nvidia Geforce GTX 1080 – Instantaneous Power Draw @ the Card

Duration Testing

I ran Folding@Home for quite a while on the 1080. As you can see from this plot (courtesy of https://folding.extremeoverclocking.com/), the 1080 is mildly beating the 1070 Ti. It should be noted that the stats for the 1070 Ti are a bit low in the left-hand side of the plot, because folding was interrupted a few times for various reasons (gaming). The 1080 results were uninterrupted.

1080 Production History

Geforce GTX 1080 Production History

Another thing I noticed was the amount of variation in the results. Normal work unit variation (at least for less powerful cards) is around 10-20 percent. For the GTX 1080, I saw swings of 200K PPD, which is closer to 30%. Check out that one point at 875K PPD!

Average PPD: 730K PPD

I averaged the PPD over two weeks on the GTX 1080 and got 730K PPD. Previous testing on the GTX 1070 Ti (based on continual testing without interruptions) showed an average PPD of 700K. Here is the plot from that article, reproduced for convenience.

Nvidia GTX 1070 Ti Time History

Nvidia GTX 1070 Ti Folding@Home Production Time History

I had expected my GTX 1080 to do a bit better than that. However, it only has about 5% more CUDA cores than the GTX 1070 Ti (2560 vs 2432). The GTX 1080’s faster memory also isn’t an advantage in Folding@Home. So, a 30K PPD improvement for the 1080, which corresponds to about 4.3% faster, makes sense.

System Average Power Consumption: 240 Watts @ the Wall

I spot checked the power meter (P3 Kill A Watt) many times over the course of folding. Although it varies with work unit, it seemed to most commonly use around 230 watts. Peak observed wattage was 257, and minimum was around 220. This was more variation than I typically see, but I think it corresponds with the variation in PPD I saw in the performance graph. It was very tempting to just say that 230 watts was the number, but I wasn’t confident that this was accurate. There was just too much variation.

In order to get a better number, I reset the Kill-A-Watt meter (I hadn’t reset it in ages) and let it log the computer’s usage over the weekend. The meter keeps track of the total kilowatt-hours (KWH) of energy consumed, as well as the time period (in hours) of the reading. By dividing the energy by time, we get power. Instead of an instantaneous power (the eyeball method), this is an average power over the weekend, and is thus a compatible number with the average PPD.

The end result of this was 17.39 KWH consumed over 72.5 hours. Thus, the average power consumption of the computer is:

17.39/72.5 (KWH/H) * 1000 (Watts/KW) = about 240 Watts (I round a bit for convenience in reporting, but the Excel sheet that backs up all my plots is exact)
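The same energy-over-time calculation as a short Python sketch, in case you want to plug in your own meter readings. The function name `average_power_watts` is just something I made up for illustration; the inputs are the two numbers read off the Kill A Watt above.

```python
def average_power_watts(energy_kwh, hours):
    """Average power in watts from a meter's cumulative energy (kWh) and elapsed time (hours)."""
    if hours <= 0:
        raise ValueError("elapsed time must be positive")
    # kWh / h gives kW; multiply by 1000 to get watts
    return energy_kwh / hours * 1000.0


avg_watts = average_power_watts(17.39, 72.5)
print(f"{avg_watts:.1f} W")  # roughly 239.9 W, which rounds to about 240 W
```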

This is a bit more power consumed than the GTX 1070 Ti results, which used an average of 225 watts (admittedly computed by the eyeball method over many days, but there was much less variation so I think it is valid). This increased power consumption of the GTX 1080 vs. the 1070 Ti is also consistent with what people have seen in games. This Legit Reviews article shows an EVGA 1080 using about 30 watts more power than an EVGA 1070 Ti during gaming benchmarks. The power consumption figure is reproduced below:


Modern Graphics Card Power Consumption. Source: Legit Reviews

This is a very interesting result. Even though the 1080 and the 1070 Ti have the same 180 Watt TDP, the 1080 draws more power, both in folding@home and in gaming.

System Computational Efficiency: 3044 PPD/Watt

For my Asus GeForce GTX 1080, the folding@home efficiency is:

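730,000 PPD / 240 Watts = 3044 PPD/Watt (computed from the unrounded values in my spreadsheet; the rounded figures shown here give about 3042).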

This is an excellent score. Surprisingly, it is slightly less than my Asus 1070 Ti, which I found to have an efficiency of 3126 PPD/Watt. In practice these are so close that the difference could just be attributed to work unit variation. The GeForce 1080 and 1070 Ti are both extremely efficient cards, and are good choices for folding@home.
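Here is a small sketch of the efficiency comparison, using the rounded numbers from this post. The helper name `ppd_per_watt` is my own. Because the inputs are rounded, the 1080 result lands a couple of PPD/Watt below the spreadsheet value quoted above, which is itself a good reminder of how little the last few digits mean here.

```python
def ppd_per_watt(ppd, wall_watts):
    """System-level folding efficiency: points per day divided by wall power."""
    return ppd / wall_watts


# Rounded values from this article
gtx_1080_eff = ppd_per_watt(730_000, 240)

# Efficiency reported for the 1070 Ti in the earlier review
gtx_1070_ti_eff = 3126

# Relative gap between the two cards, in percent of the 1070 Ti figure
gap_pct = (gtx_1070_ti_eff - gtx_1080_eff) / gtx_1070_ti_eff * 100

print(f"GTX 1080: {gtx_1080_eff:.0f} PPD/Watt")
print(f"Gap to 1070 Ti: {gap_pct:.1f}%")
```

A gap of a few percent is well inside the work unit variation discussed earlier, so I would not read much into which card "wins".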

Comparison plots here:

GeForce 1080 PPD Comparison

GeForce GTX 1080 Folding@Home PPD Comparison

GeForce 1080 Efficiency Comparison

GeForce GTX 1080 Folding@Home Efficiency Comparison

Final Thoughts

The GTX 1080 is a great card. With that said, I’m a bit annoyed that my GTX 1080 didn’t hit 800K PPD like some folks in the forums say theirs do (I bet a lot of those people getting 800K PPD use Linux, as it is a bit better than Windows for folding). Still, this is a good result.

Similarly, I’m annoyed that the GTX 1080 didn’t thoroughly beat my 1070 Ti in terms of efficiency. The results are so close though that it’s effectively the same. This is part one of a multi-part review, where I tuned the card for performance. In the next article, I plan to go after finding a better efficiency point for running this card by experimenting with reducing the power limit. Right now I’m thinking of running the card at 80% power limit for a week, and then at 60% for another week, and reporting the results. So, stay tuned!

Folding@Home Efficiency vs. GPU Power Limit

Folding@Home: The Need for Efficiency

Distributed computing projects like Stanford University’s Folding@Home sometimes get a bad rap on account of all the power that is consumed in the name of science.  Critics argue that any potential gains that are made in the area of disease research are offset by the environmental damage caused by thousands of computers sucking down electricity.

This blog hopes to find a balance by optimizing the way the computational research is done. In this article, I’m going to show how a simple setting in the graphics card driver can improve Folding@Home’s Energy Efficiency.

This blog uses an Nvidia graphics card, but the general idea should also work with AMD cards. The specific card here is an EVGA GeForce GTX 1060 (6 GB).  Green F@H Review here: Folding on the NVidia GTX 1060

If you are folding on a CPU, similar efficiency improvements can be achieved by optimizing the clock frequencies and voltages in the BIOS.  For an example on how to do this, see these posts:

F@H Efficiency: AMD Phenom X6 1100T

F@H Efficiency: Overclock or Undervolt?

(at this point in time I really just recommend folding on a GPU for optimum production and efficiency)

GPU Power Limit Overview

The GPU power limit slider is a quick way to control how much power the graphics card is allowed to draw. Typically, graphics cards are optimized for speed, with efficiency a secondary goal (if it is a goal at all). When a graphics card is pushed harder, it will draw more power (until it runs into the power limit). Today’s graphics cards will also boost their clock rate when loaded, and reduce it when the load goes away. Sometimes, a few extra MHz can be achieved for minimal extra power, but go too far and the amount of power needed to drive the card will grow exponentially. Sure, the card is doing a bit more work (or playing a game a bit faster), but the heaps of extra power needed to do this are making it very inefficient.

What I’m going to quickly show is that going the other way (reducing power) can actually improve efficiency, albeit at a reduction of raw output. For this quick test, I’m just going to look at the default power limit, 100%, vs 50%. Specific tuning is going to be dependent on your actual graphics card. But, with a few days at different settings, you should be able to find a happy balance between performance and efficiency.

For these plots, I used my watt meter to obtain actual power consumption at the wall. You can read about my watt meters here.

Changing the Power Limit

A tool such as MSI Afterburner can be used to view the graphics card’s settings, including the power limit. In the below screenshot, I reduced the card’s power limit by 50% midway through taking data. You can clearly see the power consumption and GPU temperature drop. This suggests the entire computer should be drawing less power from the wall. I confirmed this with my watt meter.

Adjust Power Limit MSI Afterburner

MSI Afterburner is used to reduce the graphics card’s power limit.

Effect on Results

I ran the card for multiple days at each power setting and used Stanford’s actual stats to generate an averaged number for PPD. Reporting an average number like this lends more confidence that the results are real, since PPD as reported in the client varies a lot with time, and PPD can bounce around by +/- 10 percent with different projects.

Below is the production time history plot, courtesy of https://folding.extremeoverclocking.com/. I marked on the plot the actual power consumption numbers I was seeing from my computer at the wall. As you can see, reducing the power limit on the 1060 from 100% to 50% saved about 40 watts of power at the wall.

GTX 1060 F@H Reduced Power Limit Production

GTX 1060 Folding@Home Performance at 100% and 50% Power

On the efficiency plot, you can see that reducing the power limit on the 1060 actually improved its efficiency slightly. This is a great way to fold more effectively.

Nvidia 1060 PPD per Watt Updated

NVidia GTX 1060 Folding@Home Efficiency Results

There is a downside of course, and that is in raw production. The Points Per Day plot below shows a pretty big reduction in PPD for the reduced power 1060, although it is still beating its little brother, the 1050 Ti. One of the reasons PPD falls off so hard is that Stanford provides bonus points that are tied to how fast your computer can return a work unit. This bonus grows nonlinearly the faster your computer can do work. So, by slowing the card down, we not only lose on base points, but we lose on the quick return bonus as well.

Nvidia 1060 PPD Updated

NVidia GTX 1060 Folding@Home Performance Results
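To put a rough number on the quick return bonus effect, here is a sketch based on my understanding of the bonus formula from the project's points documentation: credit = base points × max(1, sqrt(k × deadline / elapsed time)). Treat the formula as an assumption on my part, and note that the work unit values below (base points, k, deadline) are hypothetical, picked only to show the shape of the curve.

```python
import math


def credit_with_bonus(base_points, k, deadline_days, elapsed_days):
    """Credit for one work unit, assuming the square-root quick return bonus."""
    bonus = max(1.0, math.sqrt(k * deadline_days / elapsed_days))
    return base_points * bonus


def points_per_day(base_points, k, deadline_days, elapsed_days):
    """PPD if the card keeps finishing identical work units back to back."""
    return credit_with_bonus(base_points, k, deadline_days, elapsed_days) / elapsed_days


# Hypothetical work unit parameters (NOT taken from a real project)
BASE_POINTS = 10_000
K = 0.75
DEADLINE_DAYS = 3.0

full_speed = points_per_day(BASE_POINTS, K, DEADLINE_DAYS, elapsed_days=0.100)
throttled = points_per_day(BASE_POINTS, K, DEADLINE_DAYS, elapsed_days=0.125)  # 20% lower throughput

print(f"PPD retained after throttling: {throttled / full_speed:.1%}")
```

Under this assumed formula, PPD scales with speed to the 1.5 power while the bonus is active, so a card running at 80% of its original speed keeps only about 72% of its PPD. That is why the PPD penalty looks bigger than the clock speed reduction alone would suggest.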


Reducing the power limit on a graphics card can increase its computational energy efficiency in Folding@Home, although at the cost of raw PPD. There is probably a sweet spot for efficiency vs. performance at some power setting between 50% and 100%. This will likely be different for each graphics card. The process outlined above can be used for various power limit settings to find the best efficiency point.
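If you try several power limit settings, picking the best one is just a matter of computing PPD/Watt for each run and taking the maximum. Here is a minimal sketch of that bookkeeping. The class and function names are my own, and the three runs listed are placeholder numbers for illustration, not measurements from my 1060.

```python
from dataclasses import dataclass


@dataclass
class PowerLimitRun:
    """One multi-day test at a fixed power limit."""
    power_limit_pct: int
    avg_ppd: float          # averaged from the stats server
    avg_wall_watts: float   # averaged at the wall with a watt meter

    @property
    def ppd_per_watt(self):
        return self.avg_ppd / self.avg_wall_watts


def best_efficiency(runs):
    """Return the run with the highest PPD/Watt."""
    return max(runs, key=lambda run: run.ppd_per_watt)


# PLACEHOLDER data: substitute your own averaged results
runs = [
    PowerLimitRun(power_limit_pct=100, avg_ppd=300_000, avg_wall_watts=180.0),
    PowerLimitRun(power_limit_pct=75, avg_ppd=280_000, avg_wall_watts=160.0),
    PowerLimitRun(power_limit_pct=50, avg_ppd=240_000, avg_wall_watts=140.0),
]

best = best_efficiency(runs)
print(f"Best efficiency at {best.power_limit_pct}% power limit: {best.ppd_per_watt:.0f} PPD/Watt")
```

Whether the most efficient setting is also the one you want depends on how much raw PPD you are willing to give up, so it helps to keep the PPD column in view as well.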


Folding on the Nvidia GTX 1070


Folding@home is Stanford University’s charitable distributed computing project. It’s charitable because you can donate electricity, as converted into work through your home computer, to fight cancer, Alzheimer’s, and a host of other diseases.  It’s distributed, because anyone can run it with almost any desktop PC hardware.  But, not all hardware configurations are created equally.  If you’ve been following along, you know the point of this blog is to do the most work for as little power consumption as possible.  After all, electricity isn’t free, and killing the planet to cure cancer isn’t a very good trade-off.

Today we’re testing out Folding@home on an EVGA NVIDIA GTX 1070 graphics card.  This card offers a big step up in gaming and compute horsepower compared to the 1060 I reviewed previously, and is capable of pushing solid frame rates at 4K resolution. So, how well does it fold?

Card Specifications (Nvidia Reference Specs)

1070 specs

Nvidia GTX 1070 Specifications

evga 1070 acx stock photo

EVGA Nvidia GTX 1070 ACX 3.0 (photo credit: EVGA)


For this test I used my normal desktop computer as the benchmark machine.  Testing was done using Stanford’s V7 client on Windows 10 64-bit running FAH Core 21 work units.  The video driver version used was initially 388.59, and subsequently 372.90. Power consumption measurements reported in the charts were taken at the wall and are thus full system power consumption numbers.

If you’re interested in reading about the hardware configuration of my test rig, it is summarized in this post:


Information on my watt meter readings can be found here:

I Got a New Watt Meter!

Initial Testing and Troubleshooting

Like the GTX 1060, the 1070 uses Nvidia’s Pascal architecture, which is very efficient and has a reputation for solid compute performance. The 1070 has 50% more CUDA cores than the 1060, and with Folding@Home’s nonlinear points system (the quick return bonus gives you more points for doing work quickly), we should see roughly double the PPD of the 1060, which does 300 – 350 thousand PPD depending on the work unit. Based on various people’s experiences, and especially this forum post, I was expecting the 1070 to produce somewhere in the range of 600-700K PPD.

That wasn’t what happened. The card wasn’t exactly slow, but initial testing showed an estimated 450 to 550K PPD, as reported by the client. I ran it for a few days, since PPD can vary a good deal depending on the work unit, but the result was unfortunately the same. 550K PPD was about as much as my card would do.


Initial GTX 1070 Results – 544K PPD

At first I thought it might be due to the card running hot. Unlike my test of a brand new 1060, I obtained my 1070 used off of eBay for a great price of $200 plus shipping. It was a little dusty, so I blew it all out and fired up MSI Afterburner to check out the temps. Unfortunately, that wasn’t the problem: the fans on the card weren’t even breaking a sweat, and it was nice and cool. Points didn’t increase.

evga 1070 acx 3.0

My Used EVGA GTX 1070 ACX 3.0 – eBay Price: $200

initial 1070 afterburner report

MSI Afterburner Report: NVidia GTX 1070, Stock Clocks, Driver 388.59

After doing some more digging, I ran across a few threads online that indicated the 1070 (along with a few other GTX models) doesn’t always boost up to its maximum clock rate for compute loads. Opening up a video, or Folding@home’s protein viewer, can sometimes force the card to clock up. I tried this and didn’t have any luck. My card was running at the stock clocks, and in fact the memory even appeared to be running 200 MHz below the 4000 MHz reference clock rate. This suggested the card was in a low-power mode.

Thankfully, Nvidia’s System Management Interface tool can be used to see what is going on. This tool, which in Windows 10 lives in C:\Program Files\Nvidia Corporation, can be run from the command line. I followed the tutorial here to learn a few things about what my 1070 was doing. Although that write-up is geared at people mining for cryptocurrency, the steps are still relevant.

As can be seen here, my card was in the “P2” state, which is not the high-performance “P0” state. This is why the card wasn’t boosting, and why the memory clock seems diminished.

1070 performance state

Nvidia 1070 Performance State

Another feature of the Nvidia System Management Interface is the ability to get the power consumption at the card. This is measured by the driver, using the card’s hardware, and is the total instantaneous power the card is consuming (PCI slot power + supplemental power connections). As you can see, in the P2 state, the card is very rarely nearing the 150 watt TDP.

Now, this doesn’t necessarily mean the card would get closer to 150 watts in the P0 state. F@H does not utilize every portion of the graphics card, and it is expected that the power consumption would not be right at the limit. Still, these numbers seemed a bit low to me.

1070 card-level power consumption (before tuning)
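If you would rather log the performance state and card power from a script than read them off the console, the checks above can be sketched in Python by calling `nvidia-smi` with its CSV query options. This is only a rough sketch: it assumes `nvidia-smi` is on your PATH (on Windows you may need to pass the full path to the executable), the helper names are my own, and the sample line in the usage note is made up for illustration.

```python
import subprocess

# Fields requested from nvidia-smi: performance state, card power (W), core and memory clocks (MHz)
QUERY_FIELDS = ["pstate", "power.draw", "clocks.sm", "clocks.mem"]


def parse_gpu_line(line):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [value.strip() for value in line.split(",")]
    return dict(zip(QUERY_FIELDS, values))


def query_gpus(executable="nvidia-smi"):
    """Run nvidia-smi once and return one dict per GPU in the system."""
    cmd = [
        executable,
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    try:
        for gpu in query_gpus():
            print(gpu)
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"Could not query the GPU: {exc}")
```

For example, a made-up output line such as `P2, 98.50, 1850, 3802` would parse into a dict with `pstate` equal to `P2`. Running the query in a loop while folding gives a power time history similar to the plot above.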

Overclocking Manually to Approximate P0 State

Unlike what was suggested in that crypto mining article, I wasn’t able to use the NVSMI tool to force a P0 state. For some reason, my NVSMI tool wouldn’t show me the available clock rate settings for my 1070. However, manual overclocking with a program such as MSI Afterburner is really easy. By maxing out the power limit and setting the core clock to a higher value, I can basically make the card run at its boost frequency, or higher.

First, I set the power limit to the maximum allowed (112%). Don’t worry, this won’t hurt anything. It is limited in the driver to not cause any damage. Basically, this will allow the card to sip a bit more electricity (albeit at a reduction of efficiency). For a card that was in the P0 state (say, running a video game), this would allow higher boost clocks.

Next, I started upping the core clock in increments of 100 MHz. I didn’t run into any stability problems, and settled in on a core clock of 2000 MHz (factory clock is 1506 MHz / 1683 MHz boost). Note that the factory boost number is deceiving, since the latest drivers will crank the GPU core up past 1900 MHz if there is power and voltage headroom. From what I read, many people can run the 1070 stable at 2050 MHz without adding voltage.

I decided not to boost the voltage, and to stay 50 MHz below that supposedly stable number, because it’s not worth risking the stability of Folding@home. We want accurate, repeatable science! Plus, dropping work units is much worse for PPD than running slightly below a card’s maximum capability.

I experimented with clocking the memory up from 3800 MHz to 4000 MHz (note it’s double data rate so this equates to 8000 MHz as reported by some programs). This didn’t seem to affect results. F@H has historically been fairly insensitive to memory clocks, and boosting memory too much can cause slowdowns due to the error-checking routines having to work harder to ensure clean results. Basically, everyone says it’s not worth it. I ran it at 4000 MHz long enough to confirm this (a day), then throttled it back down to 3800 MHz. The benefit here will be more power available for the GPU cores, which is what really counts for folding.

Here are my final overclock numbers. The card has been running with these clocks for a week and a half non-stop, with no stability issues:

final 1070 afterburner report

Overclocked Settings: +160 MHz Core, 112% Power Limit

Note the driver version as shown in the updated Afterburner screen shot is different…as it turns out, this can have a huge effect on F@H PPD. More on that in a moment.

Overclocking Result: An Extra 50,000 PPD

Running the core at 2012 MHz (+160 MHz boost from the P2 power state) and upping the card’s power limit by 12% made the average PPD, as observed over two days, climb from 500-550K PPD to 550K-600K PPD. So, that’s a 50,000 PPD increase for minimal effort. But, something still seemed off. At the time I was still running driver version 388.59, and one of the things I had discovered when searching around for 1070 tuning tips is that not all drivers are created equal.

Nvidia Driver 372.90: The Best Folding Driver for the GTX 1070

Nvidia has been updating drivers with more and more emphasis on gaming optimizations and less on compute. So, it makes sense that older drivers might actually offer better compute performance. There are many threads in the Folding@Home Hardware Forum discussing this, and one driver version that keeps being mentioned is 372.90. It’s a bit tricky to keep it installed on Windows 10, since Windows is always trying to push a newer version, but for my 24/7 folding rig, I installed it and simply never rebooted it in order to get a week’s worth of data.

This driver change alone seemed to also offer a 50,000 point boost. After running various core 21 work units, the GTX 1070’s PPD has stayed between 630,000 and 660,000. This is normal variation between work units, and I feel confident reporting a final PPD of 640K. As I write this, the client is estimating 660K PPD.


Nvidia GTX 1070: 660K PPD on Project 13815 (Core 21)

This is an excellent result. It’s twice the PPD of the GTX 1060, although eking out that last 100K PPD took a manual overclock plus a driver “update” to an older version.

Now, for the fun part. Efficiency! This 1070 is rated at 150 watts, which is only 30 watts more than the 1060. So we are supposedly doing 100% more science for Stanford University, and for a meager 25% increase in power consumption. Time to bust out the watt meter and find out!

Power Consumption at the Wall

Using my P3 Kill-A-Watt Power Meter, I measured the total system power consumption. This is the same way I measure all of my graphics cards (as opposed to estimating the card’s power by the TDP or using the video card driver to spit out instantaneous card power). The reason is that I like to have a full-system view, factoring in the power usage of my CPU, main board, and RAM, all essential components to keep the card happy.

While folding with the GTX 1070, my system’s total power draw varied between 225 and 230 watts. I’m going to go with 227 watts as the average power number. 


Computing computational efficiency as Points Per Day (PPD) / Power (Watts) gives:

640,000 PPD / 227 Watts = 2820 PPD/Watt.


The Nvidia GTX 1070 is a very efficient card for running Stanford’s Folding@Home Distributed Computing Project. The trend established in my previous articles seems to be continuing, namely that the more expensive high-end video cards are more efficient, despite their higher power draw. In this case of the 1070, some manual overclocking was needed to unlock the full PPD potential. As proven by many others, the default drivers weren’t very good, but the 372.90 drivers really opened it up.

Base PPD: 550,000

Tuned PPD (drivers + overclock): 640,000

PPD/Watt (@ wall): 2820

1070 ppd plot

Nvidia GTX 1070 Performance Comparison

1070 efficiency plot

Nvidia 1070 Efficiency Comparison

As a final note, this post focused more on PPD than efficiency, since for much of the testing my watt meter was not installed (my kids keep playing with it). At some point in the future, I’ll do an article where I tune one of these cards to find the best efficiency point. This will likely be at a lower power limit than 100%, with perhaps a slight reduction in clock rate.