Tag Archives: Folding at Home

Folding@Home on GeForce RTX 3090 Review

Hi everyone, sorry for the delay in blog posts. Electricity in Connecticut has been so expensive lately that, except for our winter heating Folding@Home cluster, it wasn’t affordable to keep all those GPUs running (even with our solar panels, which is really saying something). However, I did manage to get some good data on the top-tier Nvidia RTX 3090, which I got during COVID as the GPU in a prebuilt HP Omen gaming desktop. I transplanted the 3090 into my benchmark desktop, so these stats are comparable to previous cards I’ve tested.

Wait, what are we doing here?

For those just joining, this is a blog about optimizing computers for energy efficiency. I’m running Folding@Home, a distributed computing research project that uses your computer to help fight diseases such as cancer, COVID-19, and a host of other ailments. For more information, check out the project website here: https://foldingathome.org/

Look at this bad boy!

This is the HP OEM version of an RTX 3090. I was impressed that it had lots of copper heat pipes and a metal back plate. Overall this was a very solid card for an OEM offering.

HP OEM Nvidia RTX 3090 installed in my AMD Ryzen 9 3950X benchmark desktop

At the time of my testing, the RTX 3090 was the top-tier card in Nvidia’s new Ampere line. They have since released the 3090 Ti, which is ever so slightly faster. To give you an idea of how the RTX 3090 stacks up against the previous cards I have tested, here is a table. Note that 350 watt TDP! That is a lot of power for this air cooler to dissipate.

The Test

I ran Folding@Home on my benchmark desktop in Windows 10, using Folding@Home client 7.6.13. I was immediately blown away by the insane Points Per Day (PPD) that the 3090 can spit out! Here’s a screenshot of the client, where the card was doing a very impressive 6.4 million PPD!

What was really interesting about the 3090, though, was how much the performance varied with the size of the molecule being worked on. Very large molecules with high atom counts benefited greatly from the number of CUDA cores on this card, and it kicked butt in both raw performance (PPD) and efficiency (PPD/Watt). Smaller molecules, however, did not fully utilize this card’s impressive potential, which resulted in lower efficiency and more wasted power. I would assume that running two smaller Ampere cards, for example 3080s, on small models would be more efficient than using the 3090, but I don’t have any 3080s to test that assumption (yet!).

In the plots below, you can see that the smaller model (89k atoms) resulted in a peak PPD of about 4 million, as opposed to the 7 million PPD with a 312k atom model. PPD/Watt at 100% card power was also lower for the smaller model, coming in at about 10,000 PPD/Watt vs. 16,500 PPD/Watt for the large model. These are still great efficiency numbers, which shows how far GPU computing has come since previous generations.

Reduce GPU TDP Power Target to Improve Efficiency

I’ve previously shown how GPUs are set up for maximum performance out of the box, which makes sense for video gaming. However, if you are trying to maximize the energy efficiency of your computational machines, reducing the power target of the GPU can result in massive efficiency gains. The GeForce RTX 3090 is a great example of this. When solving large models, this beast of a card benefits from throttling the power down, gaining 2.35% in energy efficiency with the power target set to 85%. The huge improvement, however, comes when solving smaller models. When running the 89k atom work unit, I got a whopping 29% efficiency improvement by setting the power target to 55%, with only a 14% performance reduction! Since the F@H project gives out a lot of smaller work units in addition to some larger ones, I chose to run my machine at a 75% power target. On average, this splits the difference, giving a noticeable efficiency improvement without sacrificing too much raw PPD. In the RTX 3090’s case, a 75% power target massively reduced the computer’s power draw (wall consumption dropped from 434 to 360 watts), as well as the heat and noise coming out of the chassis. This makes for a happier office environment and a happier computer that will last longer!
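For anyone who wants to replicate this, here is a minimal sketch of how a percentage power target can be turned into a watt limit and applied from the command line. I set mine through the Nvidia driver (and MSI Afterburner in other posts), so treat the nvidia-smi route, the constants, and the variable names below as illustrative assumptions rather than exactly what I ran:

```python
# Minimal sketch (assumes nvidia-smi is installed and the shell has admin
# rights). Converts a percentage power target into a watt limit for a card
# whose 100% target is 350 W, then applies and verifies it.
import subprocess

CARD_TDP_WATTS = 350       # RTX 3090: 100% power target = 350 W
power_target_pct = 75      # the compromise setting chosen in this article

limit_watts = round(CARD_TDP_WATTS * power_target_pct / 100)  # ~262 W

# Set the software power limit (typically persists until reboot)
subprocess.run(["nvidia-smi", "-pl", str(limit_watts)], check=True)

# Read back the active limit to confirm
subprocess.run(
    ["nvidia-smi", "--query-gpu=power.limit", "--format=csv"],
    check=True,
)
```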

Tuning Results: 89K Atoms (Small Model)

Here are the tuning plots for a smaller molecule. In all cases, the X-axis is the power target, set in the Nvidia Driver. 100% corresponds to 350 Watts in the case of the RTX 3090.

Tuning Results: 312K Atoms (Large Model)

And here are the tuning results for a larger molecule.

Overall Results

Here are the comparison results to the previous hardware configurations I have tested. Note that now that the F@H client supports enabling CUDA, I did some tests with CUDA on vs. off with the RTX 2080 Ti and the 3090. Pro Tip: MAKE SURE CUDA IS ON! It really speeds things up and also improves energy efficiency.

The key takeaways from the plots below are that the 3090 offers 50% more performance (PPD) than the 2080 Ti, and is almost 30% more energy efficient while doing it! Note that this does not mean the card sips power…it actually uses more watts than any of the other cards I’ve tested. However, it does a lot more computation with those watts, so it is putting the electricity to better use. A data center or workstation can thus get through more work in a shorter amount of time with 3090s than with other cards, and use less power overall to complete a given amount of work. This is better for the environment!

Nvidia RTX 3090 Folding@Home Performance (green bars) compared to other hardware configurations
Nvidia RTX 3090 Folding@Home Total System Power Consumption (green bars) compared to other hardware configurations
Nvidia RTX 3090 Folding@Home Energy Efficiency (green bars) compared to other hardware configurations.

Conclusion

The flagship Ampere architecture Nvidia GeForce RTX 3090 is an excellent card for compute applications. It does draw a ton of power, but this can be mitigated by reducing the power target in the driver to gain efficiency and reduce heat and noise. In the case of Folding@Home disease research, this card is a step change in both performance and energy efficiency, offering 50% more compute power and 30% more efficiency than the previous generation. I look forward to testing out other Ampere cards, as well as the new 40xx “Lovelace” architecture, if Eversource ever drops the electric rate back to normal levels in CT.

AMD Ryzen 9 3950X Folding@Home Review: Part 2: Averaging, Efficiency, and Variation

Welcome back everyone! In my last post, I used my rebuilt benchmark machine to revisit CPU folding on my AMD Ryzen 9 3950x 16-core processor. This article is a follow-on. As promised, this includes the companion power consumption and efficiency plots for thread settings of 1-32 cores. As a quick reminder, I did this test with multi-threading (SMT) on, but with Core Performance Boost disabled, so all cores are running at the base 3.5 GHz setting.

Performance

The Folding@Home distributed computing project has come a long way from its humble disease-fighting beginnings back in 2000. The purpose of this testing is to see just how well the V7 CPU client scales on a modern, high core-count processor. With all the new Folding@Home donors coming onboard to fight COVID, having some insight into how to set up the configuration for the most performance is hopefully helpful.

For this test, I simply set the # of threads the client can use to a value and ran five sequential work units. I averaged the performance (Points Per Day), but I also plot the individual work unit performance values to give you a sense of the variation. Since the Ryzen 9 3950x supports 32 threads, I essentially ran 160 tests. Since I wanted the Folding@Home Consortium to get useful data in their fight against COVID-19, I let each work unit run to completion, even though I only need them to run to about 10-20% complete to get an accurate PPD estimate from the client.

So, without further blabbing on my part, here is the graph of Folding@Home performance vs. thread count in Windows 10 on the Ryzen 9 3950x:

Ryzen 9 3950x Folding@Home Performance vs. Thread Count

Here, the solid blue line is the averaged performance, and the gray circles are the individual tests. The dashed blue lines represent a statistical 95% confidence interval, computed based on the variation. The Points Per Day (PPD) of a work unit run on the 3950x is expected to fall within this band 95% of the time.
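For the statistically curious, here is a minimal sketch of how such a band can be built from the five results at each thread setting. Since the band is meant to bound individual work units rather than the average, I’ve written it as a t-based prediction interval; the sample PPD values are made up for illustration:

```python
# Minimal sketch: a 95% band for single-work-unit PPD from 5 samples.
# The sample values are hypothetical, not measurements from this article.
from statistics import mean, stdev

ppd_samples = [398_000, 412_000, 371_000, 405_000, 389_000]

n = len(ppd_samples)
avg = mean(ppd_samples)
s = stdev(ppd_samples)          # sample standard deviation
t_95 = 2.776                    # two-sided 95% t value for n - 1 = 4 dof

# Prediction interval: where the next single work unit should land 95% of the time
half_width = t_95 * s * (1 + 1 / n) ** 0.5

print(f"mean = {avg:,.0f} PPD, "
      f"band = [{avg - half_width:,.0f}, {avg + half_width:,.0f}]")
```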

My first observation is, holy crap! This is a fast processor. Some work units at high thread counts get really close to 500K PPD, which for me has only been achievable by GPU folding up to this point.

My second observation is that there is a lot of variation between different work units. This makes sense, because some work units have much larger molecules to solve than others. In my testing, I found the average variation across all 160 tests to be 12.78%, with individual variation of up to 25%.

My third observation is that there seem to be two different regions on this plot. For the first half, the thread count setting is less than the number of physical cores on the chip, and the results are fairly linear. For the second half, the thread count setting is higher than the number of physical cores on the chip (thus forcing the CPU to virtualize those cores using SMT). Performance seems to fall off when the CPU cores become fully saturated (threads = 16), and it takes a while to climb out of the hole (threads = 24 starts showing some more gains).

As a side note, the client does not actually run all of these thread count settings, since thread counts that are large primes (7, 11, etc.) or multiples thereof cause numerical issues. For example, when you try to run a 7-thread solve, the client automatically backs the thread count down to 6. You can see warnings about this in the log file when it happens.

Prime Number Thread Adjust

I noted all the relevant thread counts where this happens on the x-axis of the plot. Theoretically, these should be equivalent settings. The fact that the average performance varies a bit between them is just due to work unit variation (I’d have to run hundreds of averages to cancel all the variation out).
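To illustrate the backdown rule, here is a sketch of logic that reproduces the adjustments marked on the x-axis (7 runs as 6, 31 runs as 30, and so on). The cutoff of 5 for the largest allowed prime factor is my assumption for illustration; I haven’t confirmed it against the client’s source:

```python
# Sketch of the thread-count adjustment: back a requested count down to the
# nearest count whose largest prime factor is "small". The max_prime cutoff
# of 5 is an assumption, not a value taken from the F@H client.
def largest_prime_factor(n: int) -> int:
    p, largest = 2, 1
    while n > 1:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return largest

def adjusted_thread_count(requested: int, max_prime: int = 5) -> int:
    t = requested
    while t > 1 and largest_prime_factor(t) > max_prime:
        t -= 1
    return t

for req in (7, 11, 13, 22, 26, 31):
    print(f"{req} threads requested -> {adjusted_thread_count(req)} threads run")
# 7 -> 6, 11 -> 10, 13 -> 12, 22 -> 20, 26 -> 25, 31 -> 30
```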

Finally, I noticed that the highest PPD actually occurred with a thread count of 30 (407,200 PPD) vs. a thread count of 32 (401,485 PPD). This is a small but interesting difference, and it is within the range of statistical variation. Thus, I would say that a thread count of 30 provides the same performance as 32, while leaving two CPU threads free for other tasks (such as GPU folding…more on that later!).

Power Consumption

Power consumption numbers for each thread setting were taken at the wall, using my P3 Kill A Watt meter. Since the power numbers tend to walk around a bit as the computer works, it’s hard to get an instantaneous reading. Thus these are “eyeball averaged”. There was enough change at each CPU thread setting to clearly see a difference (not counting those thread settings that are actually equivalent to an adjacent setting).

Ryzen 9 3950x Total System Power Consumption vs. Thread Count

The total measured power consumption rose fairly linearly from just under 80 watts to just under 160 watts. There’s not too much surprising here. As you throw more threads at the CPU, it clocks up idle cores and does more work (which causes more transistors to switch, and thus draws more power). This seems pretty believable to me. At the high end, the system is drawing just under 160 watts. The AMD Ryzen 9 3950x is rated at a 105 watt TDP, and with CPB turned off it should be pretty close to this number. My rough back-of-the-envelope calculation for this rig was as follows:

  1. CPU Loaded Power = 105 Watts
  2. GPU Idle Power (Nvidia GTX 1650) = 10 Watts
  3. Motherboard Power = 15 Watts
  4. Ram Power = 2 watts * 4 sticks = 8 watts
  5. NVME Power = 2 watts * 2 drives = 4 watts
  6. SSD Power = 2 watts

Total Estimated Watts @ F@H CPU Load = 144 Watts

Factor in a boatload of case fans, some silly LED lights, and a bit of a PSU efficiency hit (about 90% efficient for my Seasonic unit), and it’ll be close to the 160 watts as measured.
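The same sanity check in code form (a minimal sketch; the component figures are the estimates from the list above, and the 90% PSU efficiency is the rough number for my Seasonic unit):

```python
# Back-of-the-envelope wall power estimate: sum the DC-side component
# estimates, then divide by PSU efficiency to get AC power at the outlet.
components_watts = {
    "CPU (loaded)": 105,
    "GPU idle (GTX 1650)": 10,
    "Motherboard": 15,
    "RAM (4 sticks x 2 W)": 8,
    "NVMe (2 drives x 2 W)": 4,
    "SSD": 2,
}

dc_total = sum(components_watts.values())   # 144 W
psu_efficiency = 0.90
wall_estimate = dc_total / psu_efficiency   # ~160 W, matching the meter

print(f"DC-side estimate: {dc_total} W, wall estimate: {wall_estimate:.0f} W")
```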

Efficiency

This being a blog about saving the planet while still doing science with computers, I am very interested in energy efficiency. For Folding@Home, this means doing the most work (PPD) for the least amount of power (watts). So, this plot is just PPD/Watts. Easy!
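As a quick worked example using this rig’s best-case numbers (about 407,200 PPD at just under 160 watts at the wall):

```python
# Efficiency is just performance divided by power draw at the wall.
def ppd_per_watt(ppd: float, wall_watts: float) -> float:
    return ppd / wall_watts

# ~2545 PPD/Watt, in line with the "over 2500" figure discussed below
print(f"{ppd_per_watt(407_200, 160):.0f} PPD/Watt")
```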

Similar to the PPD plot, this efficiency plot averages five data points for each thread setting. I chose to leave off the individual points and the confidence interval, because that looks about the same on this plot as it does on the PPD plot, and leaving all the clutter off makes this easier to read.

Ryzen 9 3950x Folding@Home Efficiency vs. Thread Count

As with the PPD plot, there seem to be two regions on the efficiency curve. The first region (threads less than 16) shows a pretty good linear ramp-up in efficiency as more threads are added. The second region (threads 16 or greater) is what I’m calling the “core saturation” region. Here, there are more threads than physical cores, and efficiency stays relatively flat. It actually drops off at threads = 16 (similar to the PPD plot), and doesn’t start improving again until 24 or more threads are allocated to the solver.

This plot, at first glance, suggests that the maximum efficiency is realized at a thread count of 30. However, it should be noted that work unit variation still has a lot of influence, even when reporting results as a 5-sample average. You can see this effect by looking at the efficiency drop at threads = 31. Theoretically, the efficiency should be the same at threads = 31 and threads = 30, because the solver runs a 30-thread solution even when set to 31 in order to avoid domain decomposition issues.

Thus, similar to the PPD plot, I’d say the max efficiency is effectively achieved at thread counts of 30 and 32. My personal opinion is that you might as well run with # of threads = 30 (leaving two threads free for other tasks). This setting results in the maximum PPD as well.

Weird Results at Threads = 16-23

Some of you might be wondering why the performance and efficiency drops off when the thread count is set to the actual number of cores (16) or higher. I was too, so I re-ran some tests and looked at what was happening with AMD’s built-in Ryzen Master tool. As you can see in the screen shot below, even though the # of threads was set to 18 in Folding@Home (a number greater than the 16 physical cores), not all 16 cores were fully engaged on the processor. In fact, only 14 were clocked up, and two were showing relatively lazy clock rates.

Two Cores are Lazy!

Folding@Home 18-Thread CPU Solve on 16-Core Processor

I suspect what is happening is that some of the threads were loaded onto “virtual” CPU cores (i.e. SMT / hyper threading). This might be something Windows 10 does to preserve a few free CPU cores for other tasks. In fact, I didn’t see all of the cores turbo up to full speed until I set Folding@Home’s thread count to 24. This incidentally is when performance starts coming back in on the plots above.

This weird SMT / Hyper-threading behavior is likely what is responsible for the large drop-off / flat part of the performance and efficiency curves that exists from thread count = 16 to 23. As you can see in the picture below, once you fully load all the available threads, the CPU frequencies on each core all hit the maximum value, as expected.

Folding@Home 32-Thread CPU Solve on 16-Core Processor

Results Comparison

The following plots compare overall performance, power consumption, and efficiency of my new AMD Ryzen 9 3950x Folding@Home rig to other hardware configurations I have tested so far.

Performance

As you can see from the plot below, the Ryzen 9 3950x running a 32-thread Folding@Home solve can compete with relatively modern graphics cards in terms of raw performance. High-end GPUs will still offer more performance, but for a processor, getting over 400K PPD is very impressive. This is significantly more PPD than the previous processors I have tested (the AMD Bulldozer-based FX-8320e, AMD Phenom II X6 1100t, Intel Core 2 Quad Q6600, etc.). Admittedly, I have not tested very many CPUs, since testing a CPU is much more involved than just swapping out graphics cards.

AMD Ryzen 9 3950x Performance

Power Consumption

From a total system power consumption standpoint, my new benchmark machine with the AMD Ryzen 9 3950x has a surprisingly low total power draw when running Folding. Another interesting point: since the 3950x lacks onboard graphics, I had to have a graphics card installed to get a display. In my case, that was the Nvidia GTX 1650, since it is a relatively low power consumption card that should add minimal overhead. As you can see below, folding on the 3950x CPU (with the 1650 GPU idle) uses nearly the same amount of power as folding on the 1650 GPU (with the 3950x idle).

AMD Ryzen 9 3950x Power Consumption

Efficiency

Efficiency is the point of this blog, and in this respect the 3950x comes in towards the upper middle of the pack of hardware configurations I have tested. It’s definitely the most efficient processor I have tested so far, but graphics cards such as the 1660 Super and 1080 Ti are more efficient. Despite drawing more total power from the wall, these high-end GPUs do a lot more science.

Still, a PPD/Watt of over 2500 is not bad, and in this case the 3950x is more efficient than folding on the modest GPU installed in the same box (the Nvidia GTX 1650). Compared to the much older AMD FX-8320e, the Ryzen 9 3950x is 14x more efficient! What a difference 7 years can make!

AMD Ryzen 9 3950x Efficiency

Conclusion

The 16-core, 32-thread AMD Ryzen 9 3950x is one fast processor, and can do a lot of science for the Folding@Home distributed computing project. Although mid to high-end graphics cards such as the 1080 Ti ($450 on the used market) can beat the $700 3950x in both performance and efficiency, it is still important to have a smattering of high-end CPU folding rigs on the Folding@Home network, because some molecules can only be solved on CPUs.

There is a general trend of increasing efficiency and performance as the # of CPU threads allocated to Folding@Home increases. For the Ryzen 9 3950x, using a setting of 30 or 32 threads is recommended for maximum performance and efficiency. If you plan on using your computer for other tasks, or for simultaneously folding on the GPU, 30 threads is the ideal CPU slot setting.

Please Support My Blog!

If you are interested in measuring the power consumption of your own computer (or any device), please consider purchasing a P3 Kill A Watt Power Meter from Amazon. You’ll be surprised what a $35 investment in a watt meter can tell you about your home’s power usage, and if you make a few changes based on what you learn you will save money every year! Using this link won’t cost you anything extra, but will provide me with a small percentage of the sale to support the site hosting fees of GreenFolding@Home.

If you enjoyed this article, perhaps you are in the market for an AMD Ryzen 9 3950x or similar Ryzen processor. If so, please consider using one of the links below to buy one from Amazon. Thanks for reading!

AMD Ryzen 9 3950x Direct Link

AMD Ryzen (Amazon Search)

Future Work

In the next article, I’ll disable multithreading (SMT) to see the effect of virtualized CPU cores on Folding@Home performance.

Later, I plan to enable core performance boost on the 3950x to see what effect the automatic clock frequency and voltage overclocking has on Folding@Home performance and efficiency.

Folding@Home Review: NVIDIA GeForce GTX 1080 Ti

Released in March 2017, Nvidia’s GeForce GTX 1080 Ti was the top-tier card of the Pascal line-up. This is the graphics card that super-nerds and gamers drooled over. With an MSRP of $699 for the base model, board partners such as EVGA, Asus, Gigabyte, MSI, and Zotac (among others) all quickly jumped on board (pun intended) with custom designs costing well over the MSRP, as well as their own takes on the reference design.

EVGA GeForce GTX 1080 Ti – Reference

Three years after its release, and even with the newer RTX 2080 Ti out, the 1080 Ti still holds its own, commanding well over $400 on the used market. These are beastly cards, capable of running most games at max settings in 4K resolution.

But, how does it fold?

Folding@Home

Folding at home is a distributed computing project originally developed by Stanford University, where everyday users can lend their PC’s computational horsepower to help disease researchers understand and fight things like cancer, Alzheimer’s, and most recently the COVID-19 coronavirus. Users’ computers solve molecular dynamics problems in the background, which helps the Folding@Home Consortium understand how proteins “misfold” to cause disease. For computer nerds, this is an awesome way to give (money–>electricity–>computer work–>fighting disease).

Folding at home (or F@H) can be run on both CPUs and GPUs. CPUs provide a good baseline of performance, and certain molecular simulations can only be done on them. However, GPUs, with their massively parallel shader cores, can do certain types of single-precision math much faster than CPUs. GPUs provide the majority of F@H’s computational performance.

Geforce GTX 1080 Ti Specs

The 1080 Ti sits at the top of Nvidia’s 10-series lineup.

1080 Ti Specs

With 3584 CUDA Cores, the 1080 Ti is an absolute beast. In benchmarks, it holds its own against the much newer RTX cards, besting even the RTX 2080 and matching the RTX 2080 Super. Only the RTX 2080 Ti is decidedly faster.

Folding@Home Testing

Testing was performed on my old but trusty benchmark machine, running Windows 10 Pro and using Stanford’s V7 client. The Nvidia graphics driver version was 441.87. Power consumption measurements were taken at the system level, using a P3 watt meter at the wall.

System Specs:

  • CPU: AMD FX-8320e
  • Mainboard: Gigabyte GA-880GMA-USB3
  • GPU: EVGA 1080 Ti (Reference Design)
  • Ram: 16 GB DDR3L (low voltage)
  • Power Supply: Seasonic X-650 80+ Gold
  • Drives: 1x SSD, 2 x 7200 RPM HDDs, Blu-Ray Burner
  • Fans: 1x CPU, 2 x 120 mm intake, 1 x 120 mm exhaust, 1 x 80 mm exhaust
  • OS: Win10 64 bit

I did extensive testing of the 1080 Ti over many weeks. Folding@Home rewards donors with “points” for their contributions, based on how much science is done and how quickly it is returned. A typical performance metric is “Points per Day” (PPD). Here, I have averaged my Points Per Day results over many work units to provide a consistent number. Note that any given work unit can produce more or less PPD than the average, with variation of 10% being very common. For example, here are five screenshots of the client, showing five different instantaneous PPD values for the 1080 Ti.

GTX 1080 Ti Folding@Home Performance

The following plot shows just how fast the 1080 Ti is compared to other graphics cards I have tested. As you can see, with nearly 1.1 Million PPD, this card does a lot of science.

1080 Ti Folding Performance

GTX 1080 Ti Power Consumption

With a board power rating of 250 Watts, this is a power-hungry graphics card. Thus, it isn’t surprising to see that its power consumption is at the top of the pack.

1080 Ti Folding Power

GTX 1080 Ti Efficiency

Power consumption alone isn’t the whole story. This being a blog about doing the most work possible for the least amount of power, I am all about finding Folding@Home hardware that is highly efficient. Here, efficiency is defined as performance out / power in; for F@H, that is PPD/Watt. The best F@H hardware is gear that maximizes the disease research (performance) done per watt of power consumed.

Here’s the efficiency plot.

1080 Ti Folding Efficiency

Conclusion

The GeForce GTX 1080 Ti is the fastest and most efficient graphics card that I’ve tested so far for Stanford’s Folding@Home distributed computing project. With a raw performance of nearly 1.1 million PPD in Windows and an efficiency of almost 3500 PPD/Watt, this card is a good choice for doing science effectively.

Stay tuned to see how Nvidia’s latest Turing architecture stacks up.

GTX 460 Graphics Card Review: Is Folding on Ancient Hardware Worth It?

Recently, I picked up an old Core 2 Duo build on eBay for $25 + shipping. It was missing some pieces (graphics card, drives, etc.), but it was a good deal, especially for the all-metal Antec P182 case and the included Corsair PSU + Antec 3-speed case fans. So, I figured what the heck, let’s see if this vintage rig can fold!

Antec 775 Purchase

To complement this old Socket 775 build, I picked up a well-loved EVGA GeForce GTX 460 on eBay for a grand total of $26.85. It should be noted that this generation of Nvidia graphics cards (based on the Fermi architecture from back in 2010) is the oldest GPU hardware still supported by Stanford. It will be interesting to see how much science one of these old cards can do.

GTX 460 Purchase

I supplied a dusty Western Digital 640 Black Hard Drive that I had kicking around, along with a TP Link USB wireless adapter (about $7 on Amazon). The Operating System was free (go Linux!). So, for under $100 I had this setup:

  • Case: Antec P182 Steel ATX
  • PSU: Corsair HX 520
  • Processor: Intel Core 2 Duo E8300
  • Motherboard: EVGA nForce 680i SLI
  • Ram: 2 x 2 GB DDR2 6400 (800 MHz)
  • HDD: Western Digital Black 640GB
  • GPU: EVGA GeForce GTX 460
  • Operating System: Ubuntu Linux 18.04
  • Folding@Home Client: V7

I fired up folding, and after some fiddling I got it running nice and stable. The first thing I noticed was that the power draw was higher than I had expected. Measured at the wall, this vintage folding rig was consuming a whopping 220 Watts! That’s a good deal more than the 185 watts that my main computer draws when folding on a modern GTX 1060. Some of this is due to differences in hardware configuration between the two boxes, but one thing to note is that the older GTX 460 has a TDP of 160 watts, whereas the GTX 1060 has a TDP of only 120 Watts.

Here’s a quick comparison of the GTX 460 vs. the GTX 1060. At the time of their release, both of these cards were Nvidia’s baseline GTX model, offering serious gaming performance at a better price than the more aggressive 70- and 80-series variants. I threw a GTX 1080 into the table for good measure.

GTX 460 Specification Comparison

The key takeaway here is that, six years later, the equivalent graphics card to the GTX 460 was over three and a half times faster while using forty watts less power.

Power Consumption

I typically don’t report power consumption directly, because I’m more interested in optimizing efficiency (doing more work for less power). However, in this case, there is an interesting point to be made by looking at the wattage numbers directly. Namely, the GTX 460 (a mid-range card) uses almost as much power as a modern high-end GTX 1080, and seriously more power than the modern mid-range GTX 1060. Note: these power consumption numbers must be taken with a grain of salt, because the GTX 460 was installed in a different host system (the Core 2 Duo rig) than the other cards, but the results are still telling. This is also consistent with the advertised TDP of the GTX 460, which is 40 watts higher than that of the GTX 1060.

GTX 460 Power Consumption (Wall)

Total System Power Consumption

Folding@Home Results

Folding on the old GTX 460 produced a rough average of 20,000 points per day, with the normal +/- 10% variation in production seen between work units. Back in 2006 when I was making a few hundred PPD on an old Athlon 64 X2 CPU, this would have been a huge amount of points! Nowadays, this is not so impressive. As I mentioned before, the power consumption at the wall for this system was 220 Watts. This yields an efficiency of 20,000 PPD / 220 Watts = 90 PPD/Watt.

Based on the relative performance, one would think the six-year-newer GTX 1060 would produce somewhere between 3 and 4 times as many PPD as the older 460, or roughly 60-80K PPD. However, my GTX 1060 frequently produces over 300K PPD. This is due to Stanford’s Quick Return Bonus, which essentially rewards donors for doing science quickly. You can read more about this incentive-based points system at Stanford’s website. The gist is, the faster you return a work unit to the scientists, the sooner they can get to developing cures for diseases. Thus, they award you more points for fast work. As the performance plot below shows, this quick return bonus really adds up, so a card that does 3-4 times more raw work (GTX 1060 vs. GTX 460 in linear benchmark performance) ends up with 15 times more F@H points production.
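For reference, the commonly cited form of the Quick Return Bonus multiplies a work unit’s base credit by the square root of how quickly it came back relative to its deadline. Here is a minimal sketch of that formula; the per-project constant k, the base credit, and the deadline below are made-up illustrative values, not real project data:

```python
# Sketch of the commonly cited Quick Return Bonus:
#   credit = base_credit * max(1, sqrt(k * deadline / elapsed))
# All numbers here are illustrative, not actual F@H project values.
import math

def qrb_credit(base_credit: float, k: float,
               deadline_days: float, elapsed_days: float) -> float:
    bonus = math.sqrt(k * deadline_days / elapsed_days)
    return base_credit * max(1.0, bonus)

base, k, deadline = 10_000, 0.75, 10.0
for elapsed in (5.0, 1.0, 0.25):   # faster returns earn superlinear credit
    print(f"{elapsed:5.2f} days -> {qrb_credit(base, k, deadline, elapsed):,.0f} points")
```

Under this formula, a faster card earns a bigger bonus on each work unit and finishes more work units per day, so PPD grows superlinearly with speed; that is how a card that is only 3-4 times faster in raw throughput can end up an order of magnitude ahead in points.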

GTX 460 Performance and Efficiency

Old vs. New Graphics Card Comparison: Folding@Home Efficiency and PPD

This being a blog about energy-conscious computing, I’d be remiss if I didn’t point out just how inefficient the ancient GTX 460 is compared to the newer cards. Due to its relatively high power consumption for a mid-range card, the GTX 460 is eighteen times less efficient than the GTX 1060, and a whopping thirty-three times less efficient than the GTX 1080.

Conclusion

Stanford eventually drops support for old hardware (anyone remember PS3 folding?), and it might not be long before they do the same for Fermi-based GPUs. Compared with relatively modern GPUs, the GTX 460 just doesn’t stack up in 2020. Now that the 10-series cards are almost four years old, you can often get GTX 1060s for less than $200 on eBay, so if you can afford to build a folding rig around one of these cards, it will be 18 times more efficient and make 15 times more points.

Still, I only paid about $100 total to build this vintage folding@home rig for this experiment. One could argue that putting old hardware to use like this keeps it out of landfills and still does some good work. Additionally, if you ignore bonus points and look at pure science done, the GTX 460 is “only” about 4 times slower than its modern equivalent.

Ultimately, for the sake of the environment, I can’t recommend folding on graphics cards that are many years out of date, unless you plan on using the machine as a space heater to offset heating costs in the winter. More on that later…

Addendum

Since doing the initial testing and outline for this article, I picked up a GTX 480 and a few GTX 980 Ti cards. Here are some updated plots showing these cards added to the mix. The GTX 480 was tested in the Core2 build, and the GTX 980 Ti in my standard benchmark rig (AMD FX-based Socket AM3 system).

Various GPU Power Consumption

GTX 980 and 480 Performance

GTX 980 and 480 Efficiency

I think the conclusion holds: even though the GTX 480 is slightly faster and more efficient than its little brother, it is still leaps and bounds worse than the more modern cards. The 980 Ti, being a top-tier card from a few generations back, holds its own nicely, and is almost as efficient as a GTX 1060. I’d say that the 980 Ti is still a relatively efficient card to use in 2020 if you can get one for cheap enough.

Is Folding@Home a Waste of Electricity?

Folding@home has brought together thousands of people (81 thousand active folders as of the time of this writing, as evidenced by Stanford’s One in a Million contributor drive). This is awesome…tens of thousands of people teaming up to help researchers unravel the mysteries of terrible diseases.

But, there is a cost. If you are reading this blog, then you know the cost of scientific computing projects such as Folding@Home is environmental. In trying to save ourselves from the likes of cancer and Alzheimer’s disease, we are running a piece of software that causes our computers to use more electricity. In the case of dedicated Folding@home computers, this can be hundreds of watts of power consumed 24/7. It adds up to a lot of consumed power, which in the end exits your computer as heat (potentially driving up your air conditioning costs as well).

FLIR Thermal Cam – Folding@Home on Graphics Card

If Stanford reaches their goal of 1 million active folders, then we have an order of magnitude more power consumption on our hands. Let’s do some quick math, assuming each folder contributes 200 watts continuous (low compared to the power draw of most dedicated Folding@home machines). In this case, we have 200 watts/computer * 24 hours/day * 365 days/year * 1,000,000 computers *1 kilowatt-hour/1000 watt-hours = 1,752,000,000 kilowatt-hours of power consumed in a year, in the name of Science!
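Checking that math in code form (same assumptions: 200 watts continuous per machine, one million machines, around-the-clock folding):

```python
# Quick verification of the yearly energy estimate above.
watts_per_computer = 200
computers = 1_000_000
hours_per_year = 24 * 365

kwh_per_year = watts_per_computer * hours_per_year * computers / 1000
twh_per_year = kwh_per_year / 1e9

print(f"{kwh_per_year:,.0f} kWh/year = {twh_per_year:.2f} TWh/year")
# 1,752,000,000 kWh/year = 1.75 TWh/year
```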

That’s almost two billion kilowatt-hours, people.  It’s 1.75 terawatt-hours (TWh)! Using the EPA’s free converter can put that into perspective. Basically, this is like driving 279 thousand extra cars for a year, or burning 1.5 billion pounds of coal.  Yikes!

https://www.epa.gov/energy/greenhouse-gas-equivalencies-calculator

Potential Folding@Home Environmental Impact

Is all this disease research really harming the planet? If it is, is it worth it? I don’t know. It depends on the outcome of the research, the potential benefit to humans, and the detriment to humans, animals, and the environment caused by that research. This opens up all sorts of what-if scenarios.

For example: what if Folding@Home does help find a future cure for many diseases, resulting in extended life spans? Then the earth gets even more overpopulated than it already is. Wouldn’t the added environmental stresses negatively impact people’s health? Conversely, what if Folding@Home research results in a cure for a disease that allows a little girl or boy to grow to adulthood and become the inventor of some game-changing green technology?

It’s just not that easy to quantify.

Then, there is the topic of Folding@home vs. other distributed computing projects. Digital currency, for example. Bitcoin miners (and all the spinoffs) suck up a ton of power. Current estimates put Bitcoin alone at over 40 TWh a year.

Source: https://www.theguardian.com/technology/2018/jan/17/bitcoin-electricity-usage-huge-climate-cryptocurrency

That’s more power than some countries use, and twenty times more than my admittedly crude future Folding@home estimate. When you consider that the cryptocurrency product has only limited uses (many of which are on the dark web for shady purposes), it perhaps helps cast Folding@home in a better light.

There is always room for improvement, though. That is the point of this entire blog. If we crazies are committed to turning our hard-earned dollars into “points”, we might as well do it in the most efficient way possible. And, while we’re at it, we should consider the environmental cost of our hobby and think of ways to offset it (that goes for the Bitcoin folks too).

I once ran across a rant on another online blog about how Folding@home is killing the planet. This was years ago, before the rise of crypto. I wish I could find it now, but it seems to have been lost in the mists of time, long since indexed, ousted, and forgotten by the Google search crawler. In it, the author bemoaned how F@H was murdering Mother Earth in the name of science. I recall thinking to myself, “hey, they’ve got a point”. And then I realized that I had already done a bunch of things to help combat the rising electric bill, and I bet most distributed computing participants have done some of these things too.

These things are covered elsewhere in this blog, and range from optimizing the computer doing the work to going after other non-folding@home related items to help offset the electrical and environmental cost. I started by switching to LED light-bulbs, then went to using space heaters instead of whole house heating methods in the winter. As I upgraded my Folding@home computer, I made it more energy efficient not just for F@H but for all tasks executed on that machine.

In the last two years, my wife and I bought a house, which gave us a whole other level of control over the situation. We had one of those state-subsidized energy audits done. They put in some insulation and air-sealed our attic, thus reducing our yearly heating costs. Eventually, we even decided to put solar panels on the roof and get an electric car (these last two weren’t because I felt guilty about running F@H, but because my wife and I are just into green technologies). We even use our Folding@home computer as a space heater in the winter, thus offsetting home heating oil use and negating any environmental arguments against F@H in the winter months.

In conclusion, there is no doubt that distributed computing projects have an environmental cost. However, to claim that they are a waste of electricity or that they are killing the planet might be taking it too far. One has to ask if the cause is worth the environmental impact, and then figure out ways to lessen that impact (or, in some cases, get motivated to offset it completely. Solar-powered folding farm, anyone?)

LG 320 Solar Panel in my basement, awaiting roof install.

Folding on the NVidia GTX 1060

Overview

Folding@home is Stanford University’s charitable distributed computing project. It’s charitable because you can donate electricity, as converted into work through your home computer, to fight cancer, Alzheimer’s, and a host of other diseases. It’s distributed, because anyone can run it with almost any desktop PC hardware. But, not all hardware configurations are created equal. If you’ve been following along, you know the point of this blog is to do the most work for as little power consumption as possible. After all, electricity isn’t free, and killing the planet to cure cancer isn’t a very good trade-off.

Today we’re testing out Folding@home on EVGA’s single-fan version of the NVIDIA GTX 1060 graphics card.  This is an impressive little card in that it offers a lot of gaming performance in a small package.  This is a very popular graphics card for gamers who don’t want to spend $400+ on GTX 1070s and 1080s.  But, how well does it fold?

Card Specifications

Manufacturer:  EVGA
Model #:  06G-P4-6163
Model Name: EVGA GeForce GTX 1060 SC GAMING (Single Fan)
Max TDP: 120 Watts
Power:  1 x PCI Express 6-pin
GPU: 1280 CUDA Cores @ 1607 MHz (Boost Clock of 1835 MHz)
Memory: 6 GB GDDR5
Bus: PCI-Express X16 3.0
MSRP: $269

EVGA Nvidia GeForce GTX 1060 (photo by EVGA)

Folding@Home Test Setup

For this test I used my normal desktop computer as the benchmark machine.  Testing was done using Stanford’s V7 client on Windows 7 64-bit running FAH Core 21 work units.  The video driver version used was 381.65.  All power consumption measurements were taken at the wall and are thus full system power consumption numbers.

If you’re interested in reading about the hardware configuration of my test rig, it is summarized in this post:

https://greenfoldingathome.com/2017/04/21/cpu-folding-revisited-amd-fx-8320e-8-core-cpu/

Information on my watt meter readings can be found here:

I Got a New Watt Meter!

Folding@Home Test Results – 305K PPD and 1650 PPD/Watt

The Nvidia GTX 1060 delivers the best Folding@Home performance and efficiency of all the hardware I’ve tested so far. As seen in the screenshot below, the native F@H client has shown up to 330K PPD. I ran the card for over a week and averaged the results as reported to Stanford to come up with the nominal 305K Points Per Day number. I’m going to use 305K PPD in the charts in order to be conservative. The power draw at the wall was 185 watts, which is very reasonable, especially considering this graphics card is in an 8-core gaming rig with 16 GB of ram. This results in a F@H efficiency of about 1650 PPD/Watt, which is very good.

Screenshot from the F@H V7 client showing estimated Points per Day:

Nvidia GTX 1060 Folding @ Home Results: Windows V7 Client

Here are the averaged results based on actual returned work units

(Graph courtesy of http://folding.extremeoverclocking.com/)

NVidia 1060 GTX Folding PPD History

Note that in this plot, the reported results prior to the circled region are also from the 1060, but I didn’t have it running all the time. The 305K PPD average is generated only from the work units returned within the time frame of the red circle (7/12 through 7/21).

Production and Efficiency Plots

NVidia GTX 1060 Folding@Home PPD Production Graph

Nvidia GTX 1060 Folding@Home Efficiency Graph

Conclusion

For about 250 bucks (or $180 used if you get lucky on eBay), you can do some serious disease research by running Stanford University’s Folding@Home distributed computing project on the Nvidia GTX 1060 graphics card. This card is a good middle ground in terms of price (it is the entry level in Nvidia’s current generation of GTX gaming cards). Stepping up to a 1070 or 1080 will likely continue the trend of increased energy efficiency and performance, but those cards cost between $400 and $800. The GTX 1060 reviewed here was still very impressive, and I’ll also point out that it runs my old video games at absolute max settings (Skyrim, Need for Speed Rivals). Being a relatively small video card, it easily fits in a mid-tower ATX computer case, and only requires one supplemental PCI-Express power connector. Doing over 300K PPD on only 185 watts, this Folding@home setup is both efficient and fast. For 2017, the Nvidia GTX 1060 is an excellent bang-for-the-buck Folding@home graphics card.

Request: Anyone want to loan me a 1070 or 1080 to test?  I’ll return it fully functional (I promise!)

Folding@Home on the Nvidia GeForce GTX 1050 Ti: Extended Testing

Hi again. Last week, I looked at the performance and energy efficiency of using an Nvidia GeForce GTX 1050 Ti to run Stanford’s charitable distributed computing project Folding@home. The conclusion of that study was that the GTX 1050 Ti offers very good Points Per Day (PPD) and PPD/Watt energy efficiency. Now, after some more dedicated testing, I have a few more thoughts on this card.

Average Points Per Day

In the last article, I based the production and efficiency numbers on the estimated completion time of one work unit (Core 21), which resulted in a PPD of 192,000 and an efficiency of 1377 PPD/Watt.  To get a better number, I let the card complete four work units and report the results to Stanford’s collection server.  The end result was a real-world performance of 185K PPD and 1322 PPD/Watt (power consumption is unchanged at 140 watts @ the wall).  These are still very good numbers, and I’ve updated the charts accordingly.  It should be noted that this still only represents one day of folding, and I am suspicious that this PPD is still on the high end of what this card should produce as an average.  Thus, after this article is complete, I’ll be running some more work units to try and get a better average.

Folding While Doing Other Things

Unlike the AMD Radeon HD 7970 reviewed here, the Nvidia GTX 1050 Ti doesn’t like folding while you do anything else on the machine. To use the computer, we ended up pausing folding on multiple occasions to watch videos and browse the internet. This results in a pretty big hit in the amount of disease-fighting science you can do, and it is evident in the PPD results.

Folding on a Reduced Power Setting

Finally, we went back to uninterrupted folding on the card, but at a reduced power setting (90%, set using MSI Afterburner). This resulted in a 7 watt reduction in power consumption as measured at the wall (133 watts vs. 140 watts). However, in order to produce this power reduction, the graphics card’s clock speed is throttled back, which comes with a performance hit. The power settings can be seen here:

MSI Afterburner is used to reduce GPU Power Limit

Observing the estimated Folding@home PPD in the Windows V7 client shows what appears to be a massive reduction in PPD compared to previous testing.  However, since production is highly dependent on the individual projects and work units, this reduction in PPD should be taken with a grain of salt.

GTX 1050 V7 Throttled Performance

In order to get some more accurate results at the reduced power limit, we let the machine chug along uninterrupted for a week.  Here is the PPD production graph courtesy of http://folding.extremeoverclocking.com/

Nvidia GTX 1050 TI Folding@Home Extended Performance Testing

It appears here that the 90% power setting has caused a significant reduction in PPD. However, this is based on having only one day’s worth of results (4 work units) for the 100% power case, as opposed to 19 work units worth of data for the 90% power case. More testing at 100% power should provide a better comparison.

Updated Charts (pending further baseline testing)

Nvidia GTX 1050 PPD Chart

Nvidia GTX 1050 TI Efficiency

As expected, you can contribute the most to Stanford’s Folding@home scientific disease research with a dedicated computer. Pausing F@H to do other tasks, even for short periods, significantly reduces performance and efficiency. Initial results seem to indicate that reducing the power limit of the graphics card significantly hurts performance and efficiency. However, there still isn’t enough data to make a detailed comparison, since the initial PPD numbers for the GTX 1050 Ti were based on the results of only 4 completed work units. Further testing should help characterize the difference.

Squeezing a few more PPD out of the FX-8320E

In the last post, the 8-core AMD FX-8320E was compared against the AMD Radeon 7970 in terms of both raw Folding@home computational performance and efficiency. It lost, although it is the best processor I’ve tested so far. It also turns out to be a very stable processor for overclocking.

Typical CPU overclocking focuses on raw performance only, and involves upping the clock frequency of the chip as well as the supplied voltage. When tuning for efficiency, the goal is doing more work for the same (or less) power. With that in mind, I increased the clock rate of my FX-8320e without adjusting the voltage, to try and find an improved efficiency point.

Overclocking Results

My FX-8320E proved to be very stable at stock voltage at frequencies up to 3.6 GHz. By very stable, I mean running Folding@home at max load on all cores for over 24 hours with no crashes, while also using the computer for daily tasks. This is a 400 MHz increase over the stock clock rate of 3.2 GHz. As expected, F@H production went up a noticeable amount (over 3000 PPD). Power consumption also increased slightly. It turns out the efficiency was also slightly higher (190 PPD/Watt vs. 185 PPD/Watt). So, overclocking was a success on all fronts.

FX 8320e overclock PPD

FX 8320e overclock efficiency

Folding Stats Table FX-8320e OC

Conclusion

As demonstrated with the AMD FX-8320e, mild overclocking can be a good way to earn more Points Per Day at a similar or greater efficiency than the stock clock rate.  Small tweaks like this to Folding@home systems, if applied everywhere, could result in more disease research being done more efficiently.

F@H Efficiency on Dell Inspiron 1545 Laptop

Laptops!  

When browsing internet forums looking for questions that people ask about F@H, I often see people asking if it is worth folding on laptops (note that I am talking about normal, battery-life optimized laptops, not Alienware gaming laptops / desktop replacements).  In general, the consensus from the community is that folding on laptops is a waste of time.  Well, that is true from a raw performance perspective.  Laptops, tablets, and other mobile devices are not the way to rise to the top of the Folding at Home leader boards.  They’re just too slow, due to the reduced clock speeds and voltages employed to maximize battery life.

But wait, didn’t you say that low voltage is good for efficiency?

I did, in the last article.  By undervolting and slightly underclocking the Phenom II X6 in a desktop computer, I was able to get close to 90 PPD/Watt while still doing an impressive twelve thousand PPD.

However, this raised the interesting question of what would happen if someone tried to fold on a computer that was optimized for low voltage, such as a laptop. Let’s find out!

Dell Inspiron 1545

Specs:

  • Intel T9600 Core 2 Duo
  • 8 GB DDR2 Ram
  • 250 GB spinning-disk HDD (5400 RPM, slow as molasses)
  • Intel Integrated HD Graphics (horrible for gaming, great for not using much extra electricity)
  • LCD off during the test to reduce power

I did this test on my Dell Inspiron 1545, because it is what I had lying around. It’s an older laptop that originally shipped with a slow Socket P Intel Pentium dual core. This 2.1 GHz chip was going to be so slow at folding that I decided to splurge and pick up a 2.8 GHz T9600 Core 2 Duo from eBay for 25 bucks (can you believe this processor used to cost $400?). This high-end laptop processor has the same 35 watt TDP as the Pentium it is replacing, but has 6 times the total cache. It is a dual core part that is roughly similar in architecture to the Q6600 I tested earlier, so one would expect the PPD and the efficiency to be close to the Q6600’s when running on only 2 cores (albeit a bit higher due to the T9600’s higher clock speed). I didn’t bother doing a test with the old laptop processor, because it would have been pretty bad (same power consumption but much slower).

After upgrading the processor (rather easy on this model of laptop, since there is a rear access panel that lets you get at everything), I ran this test in Windows 7 using the V7 client. My computer picked up a nice A4 work unit and started munching away. I made sure to use my passkey to ensure I got the quick return bonus.

Results:

The Intel T9600 laptop processor produced slightly more PPD than the similar Q6600 desktop processor when running on 2 cores (2235 PPD vs. 1960 PPD). This is a decent production rate for a dual core, but it pales in comparison to the roughly 6,000 PPD of the Q6600 running with all 4 cores, or newer processors such as the AMD 1100T (over 12K PPD).

However, from an efficiency standpoint, the T9600 Core 2 Duo blows away the desktop Core 2 Quad, as seen in the chart and graph below.

Intel T9600 Folding@Home Efficiency

Intel T9600 Folding@Home Efficiency vs. Desktop Processors

Conclusion

So, the people who say that laptops are slow are correct. Compared to all the crazy desktop processors out there, a little dual core in a laptop isn’t going to do very many points per day. Even modern quad-core laptops are fairly tame compared to their desktop brethren. However, the efficiency numbers tell a different story.

Because everything from the motherboard, video card, audio circuit, hard drive, and processor is optimized for low voltage, the total system power consumption was only 39 watts (with the lid closed). This meant that the 2235 PPD was enough to earn an efficiency score of 57.29 PPD/Watt. This number beats all of the efficiency numbers from the most similar desktop processor tested so far (the Q6600), even when the Q6600 is using all four cores.

So, laptops can be efficient F@H computers, even though they are not good at raw PPD production.  It should also be noted that during this experiment the little T9600 processor heated up to a whopping 67 degrees C. That’s really warm compared to the 40 degrees Celsius the Q6600 runs at in the desktop.  Over time, that heat load would probably break my poor laptop and give me an excuse to get that Alienware I’ve been wanting.