Monthly Archives: February 2019

Nvidia GeForce GTX 1070 Ti Folding@Home Review

In an effort to make as much use of the colder months in New England as I can, I’m running tons of Stanford University’s Folding@Home on my computer to do charitable science for disease research while heating my house. In the last article, I reviewed a slightly older AMD card, the RX 480, to determine its performance and efficiency running Folding@Home. Today, I’ll be taking a look at one of the favorite cards from Nvidia for both folding and gaming: The 1070 Ti.

The GeForce GTX 1070 Ti was released in November 2017, and sits between the 1070 and 1080 in terms of raw performance. As of February 2019, the 1070 Ti can be had for a deep discount on the used market, now that the RTX 20xx series cards have been released. I got my Asus version on eBay for $250.

Based on Nvidia’s 16nm Pascal architecture, the 1070 Ti has 2432 CUDA cores and 8 GB of GDDR5 memory, with a memory bandwidth of 256 GB/s. The base clock rate of the GPU is 1607 MHz, although the cards automatically boost well past the advertised boost clock of 1683 MHz. Thermal Design Power (TDP) is 180 Watts.

The 3rd party Asus card I got is nothing special. It appears to be a dual-slot reference design, and uses a blower cooler to exhaust hot air out the back of the case. It requires one supplemental 8-pin PCI-E Power connection.


ASUS GeForce GTX 1070 Ti

One thing I will note about this card is its length. At 10.5 inches (which is similar to many Nvidia high-end cards), it can be a bit problematic to fit in some cases. I have a Raidmax Sagitta mid-tower case from way back in 2006, and it fits, but barely. I had the same problem with the EVGA GeForce 1070 I reviewed earlier.


ASUS GTX 1070 Ti – Installed.

Test Environment

Testing was done in Windows 10 on my AMD FX-based system, which is old but holds up pretty well, all things considered. You can read more on that here. The system was built for both performance and efficiency, using AMD’s 8320e processor (a bit less power hungry than the other 8-core FX processors), a Seasonic 650 80+ Gold Power Supply, and 8 GB of low voltage DDR3 memory. The real key here, since I take all my power measurements at the wall with a P3 Kill-A-Watt meter, is that the system is the same for all of my tests.

The Folding@Home Client version is 7.5.1, running a single GPU slot with the following settings:

GPU Slot Options

GPU Slot Options for Maximum PPD

These settings tend to result in a slightly higher points per day (PPD), because they request large, advanced work units from Stanford.

Initial Test Results

Initial testing was done on one of the oldest drivers I could find to support the 1070 Ti (driver version 388.13). The thought here was that older drivers would have fewer gaming optimizations, which tend to hurt performance for compute jobs (unlike AMD, Nvidia doesn’t include a compute mode in their graphics driver settings).

Unfortunately, the best Nvidia driver for the non-Ti GTX 10xx cards (372.90) doesn’t work with the 1070 Ti, because the Ti version came out a few months later than the original cards. So, I was stuck with version 388.13.

Nvidia 1070 TI Baseline Clocks

Nvidia GTX 1070 Ti Monitoring – Baseline Clocks

I ran F@H for three days using the stock clock rate of 1823 MHz core, with the memory at 3802 MHz. Similar to what I found when testing the 1070, Folding@Home does not trigger the card to go into the high power (max performance) P0 state. Instead, it is stuck in the power-saving P2 state, so the core and memory clocks do not boost.

The PPD average for three days when folding at this rate was 632,380 PPD. Checking the Kill-A-Watt meter over the course of those days showed an approximate average system power consumption of 220 watts. Interestingly, this is less power draw than the GTX 1070 (which used 227 watts, although that was with overclocking and the more efficient 372.90 driver). The PPD average was also less than the GTX 1070, which had done about 640,000 PPD. Initial efficiency, in PPD/Watt, was thus about 2875 (compared to the GTX 1070’s 2820 PPD/Watt).
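The efficiency figures here are simply average PPD divided by average wall power. A minimal Python sketch of that arithmetic, using only the numbers quoted above (the function name is my own, chosen for illustration):

```python
def ppd_per_watt(avg_ppd: float, wall_watts: float) -> float:
    """System efficiency: average points per day divided by wall power in watts."""
    if wall_watts <= 0:
        raise ValueError("wall power must be positive")
    return avg_ppd / wall_watts


# Figures quoted in the text above.
gtx_1070_ti_stock = ppd_per_watt(632_380, 220)  # 1070 Ti, stock clocks, driver 388.13
gtx_1070_tuned = ppd_per_watt(640_000, 227)     # 1070, overclocked, driver 372.90

print(f"GTX 1070 Ti (stock): {gtx_1070_ti_stock:.1f} PPD/Watt")
print(f"GTX 1070 (tuned):    {gtx_1070_tuned:.1f} PPD/Watt")
```

Because the wattage is measured at the wall, this is a whole-system efficiency, not the efficiency of the graphics card alone.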

The lower power consumption number and lower PPD performance score were a bit surprising, since the GTX 1070 Ti has 512 more CUDA cores than the GTX 1070. However, in my previous review of the 1070, I had done a lot of optimization work, both with overclocking and with driver tuning. So, now it was time to do the same to the 1070 Ti.

Tuning the Card

By running UNIGINE’s Heaven video game benchmark in windowed mode, I was able to watch what the card did in MSI Afterburner. The core clock boosted up to 1860 MHz (a modest increase from the 1823 MHz stock clock), and the memory went up to 4000 MHz (the default). I tried these overclocking settings and saw only a modest increase in PPD numbers. So, I decided to push it further, despite the Asus card having only a reference-style blower cooler. In my 1070 review, I found I was able to fold nice and stable with a core clock of 2012 MHz and a memory clock of 3802 MHz. So, I set up the GTX 1070 Ti with those same settings. After running it for five days, I pushed the core a little higher to 2050 MHz. A few days later, I upgraded the driver to the latest (417.71).

Nvidia 1070 TI OC

Nvidia GTX 1070 Ti Monitoring – Overclocked

With these settings, I did have to increase the fan speed to keep the card below 70 degrees Celsius. Since the Asus card uses a blower cooler, it was a bit loud, but nothing too crazy. Open-air coolers with lots of heat pipes and multiple fans would probably let me push the card higher, but from what I’d read, people start running into stability problems at core clocks over 2100 MHz. Since the goal of Folding@Home is to produce reliable science to help Stanford University fight disease, I didn’t want to risk dropping a work unit due to an unstable overclock.

Here’s the production vs. time history from Stanford’s servers, courtesy of https://folding.extremeoverclocking.com/

Nvidia GTX 1070 Ti Time History

Nvidia GTX1070 Ti Folding@Home Production Time History

As the time history plot above shows, the overclock helped improve the performance of the GTX 1070 Ti. Using the last five days’ worth of data points (with the graphics driver at 417.71 and the core overclocked to 2050 MHz), I got an average of 703,371 PPD with a power consumption at the wall of 225 Watts. This gives an overall system efficiency of 3126 PPD/Watt.

Finally, these results are starting to make more sense. Now, this card is outpacing the GTX 1070 in terms of both PPD and energy efficiency. However, the gain in performance isn’t enough to confidently say the card is doing better, since there is typically a +/- 10% PPD difference depending on what work unit the computer receives. This is clear from the amount of variability, or “hash”, in the time history plot.
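To put a number on that caveat, here is a small Python sketch that compares the gain over the GTX 1070 against the +/- 10% work-unit spread. It uses the 703,371 PPD measured here and the roughly 640,000 PPD quoted earlier for the 1070. The variable and function names are mine, and treating the 10% figure as a hard threshold is a simplifying assumption, not a proper statistical test.

```python
def relative_gain(new_ppd: float, old_ppd: float) -> float:
    """Fractional change in PPD going from old_ppd to new_ppd."""
    return (new_ppd - old_ppd) / old_ppd


WORK_UNIT_SPREAD = 0.10  # the +/- 10% project-to-project variation quoted in the text

gain = relative_gain(703_371, 640_000)
within_spread = abs(gain) <= WORK_UNIT_SPREAD

print(f"Gain over the GTX 1070: {gain:.1%}")
print(f"Inside the work-unit spread: {within_spread}")
```

The gain lands just inside the spread, which is why the result is suggestive rather than conclusive.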

Interestingly, the GTX 1070 Ti is still using about the same amount of power as the base model GTX 1070, which has a Thermal Design Power of 150 Watts, compared to the GTX 1070 Ti’s TDP of 180 Watts. So, why isn’t my system consuming 30 watts more at the wall than it did when equipped with the base 1070?

I suspect the issue here is that the drivers available for the 1070 Ti are not as good for folding as the 372.90 driver for the non-Ti 10-series Nvidia cards. As you can see from the MSI Afterburner screen shots above, GPU Usage on the GTX 1070 Ti during folding hovers in the 80-90% range, which is lower than the 85-93% range seen when using the non-Ti GTX 1070. In short, folding on the 1070 Ti seems to be a bit handicapped by the drivers available in Windows.

Comparison to Similar Cards

Here are the Production and Efficiency Plots for comparison to other cards I’ve tested.

GTX 1070 Ti Performance Comparison


GTX 1070 Ti Efficiency Comparison


Conclusion

The Nvidia GTX 1070 Ti is a very good graphics card for running Folding@Home. With an average PPD of 703K and a system efficiency of 3126 PPD/Watt, it is the fastest and most efficient graphics card I’ve tested so far. As far as maximizing the amount of science done per electricity consumed, this card continues the trend: higher-end video cards are more efficient, despite the increased power draw.

One side note about the GTX 1070 Ti is that the drivers don’t seem as optimized as they could be. This is a known problem for running Folding@Home in Windows. But, since the proven Nvidia driver 372.90 is not available for the Ti flavor of the 1070, the hit here is more than normal. On the used market in 2019, you can get a GTX 1070 for $200 on eBay, whereas the GTX 1070 Ti goes for $250. My opinion is that if you’re going to fold in Windows, a tuned GTX 1070 running the 372.90 driver is the way to go.

Future Work

To fully unlock the capability of the GTX 1070 Ti, I realized I’m going to have to switch operating systems. Stay tuned for a follow-up article in Linux.

AMD Radeon RX 480 Folding@Home Review

I’ve been reviewing a lot of Nvidia cards lately, so it’s high time I mixed it up a bit. The 4xx series of cards from AMD were released in June 2016, and featured AMD’s new Polaris 14nm architecture. The flagship card, the RX 480, was available in a 4 GB and 8 GB version. The Polaris architecture, which in the RX 480 features 2304 stream processors at a base clock rate of 1120 MHz (1266 boost) and a TDP of 150 watts, was designed to be more efficient than the older GCN generations used in the R5/R7/R9 300 series.

These cards can now be obtained relatively inexpensively on eBay, and I picked up a second-hand 8 GB card from XFX for $90. Let’s see how it folds compared to some similar graphics cards from Nvidia from that time period, namely the 1050 and 1060.

 


XFX Radeon RX 480 – 8GB – 150 Watt TDP

Folding@Home testing was done in Windows 10 on my AMD FX-based test system. The Folding@Home client was version 7.5.1. The GPU slot options were configured as usual for maximum points per day (PPD) jobs:

Name: client-type  Value: advanced

Name: max-packet-size Value: big

The video driver used was Crimson ReLive 17.7, which includes an essential option for running compute jobs like Folding@Home. This is the ‘compute’ mode for GPU Workload. As previously reported by other folders, this setting can offer significant performance improvement vs. the default gaming setting. I tested it both ways.

AMD Compute Mode

Make sure to set GPU Workload to ‘Compute’ for running Folding@Home Work Units!

Monitoring of the card while folding was done with MSI Afterburner. My particular version of the card by XFX got up to about 76 degrees C when folding, which is pretty warm but not dangerous. The fan settings were on auto, and it was spinning nice and quietly at a touch over 50% speed. The GPU workload % was nicely maxed out at 100 percent, which is something not typically seen on Nvidia cards in Windows. As expected, Folding@Home doesn’t use the full 150 watt TDP. The power usage, as reported at the card, bounced around but was centered at about 110 watts. Although it is expected that the actual power usage would be less than the TDP, this is a lot less, especially considering the 100% GPU usage. I suspect something might be fishy, considering my total system power consumption was pretty high (more on that later).

RX 480 Stock Settings

RX 480 Settings while Folding

Initially, I tested out the driver setting to see if there was a difference between ‘graphics’ and ‘compute’ mode. Although I didn’t see much of a power consumption change (hard to tell since it bounces around), the PPD as reported from the client did change. Note that for this testing, I just flipped the switch and observed the time-averaged PPD results as reported from the client. The key here is that the project (14152) was the same in both cases, so the results are directly comparable.

In Graphics Mode:

PPD (Estimated) = 290592, TPF (Estimated) = 3 minutes 12 seconds

In Compute Mode:

PPD (Estimated) = 304055, TPF (Estimated) = 2 minutes 59 seconds

That is about a 4.6% increase in PPD from just flipping a switch. In short, on AMD cards running Folding@Home, always use compute mode.
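For anyone who wants to check the size of the improvement, here is a short Python sketch that works it out from the client estimates quoted above (PPD and time per frame in each mode). The variable names are mine, and these are single-project client estimates, so treat the percentages as indicative only.

```python
# Client-reported estimates for project 14152, as quoted above.
graphics_ppd = 290_592
compute_ppd = 304_055

graphics_tpf_s = 3 * 60 + 12  # 3 minutes 12 seconds per frame
compute_tpf_s = 2 * 60 + 59   # 2 minutes 59 seconds per frame

# Fractional PPD increase and fractional reduction in time per frame.
ppd_gain = (compute_ppd - graphics_ppd) / graphics_ppd
tpf_reduction = (graphics_tpf_s - compute_tpf_s) / graphics_tpf_s

print(f"PPD gain in compute mode: {ppd_gain:.1%}")
print(f"TPF reduction in compute mode: {tpf_reduction:.1%}")
```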

Here are the screen shots from the client to back this up:

RX 480 Graphics Mode Client View

AMD RX 480 – Graphics Mode

RX 480 Compute Mode Client View

AMD RX 480 – Compute Mode

If you’ve been following along, you know I don’t like to rely on the client’s estimated values for overall PPD numbers. The reason is that it is just an estimate, and it varies a lot between work units. However, for this quick test of graphics vs. compute mode on the same work unit, the results are consistent with those found by other testers.

Overall Performance and Efficiency

I like to run cards for a few days on a variety of work units in order to get some statistics, which I can average to provide more certain results. In this case, I ran Folding@Home on my RX 480 for over three days. Here are the stats from Stanford’s server, as reported by the kind folks over at Extreme Overclocking.

RX 480 Stats History

Folding @ Home Server Statistics – AMD RX 480 Over 3 Days

As you can see, the average of about 245K PPD wasn’t that impressive, although to be fair the other cards on this plot are all in higher performance price points, except possibly the 1060. I also think this card has the potential to churn out over 300K PPD, as estimated by the client. This thread seems to suggest this is possible, although the card in that test was overclocked to 1328 MHz vs. the 1288 MHz I was running (I didn’t have time to do any overclock testing on mine).

Power consumption measured at the wall varied a bit with the different work units. Spot-checking the numbers with my P3 watt meter resulted in an approximate average total system power consumption of 243 watts. This is much higher than my EVGA GTX 1060 (185 watts at the wall). Just going by the TDP of both cards, I would have guessed the wall power consumption to be somewhere around 215 watts (since the TDP of the RX 480 is 30 watts higher than the 1060).
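The back-of-the-envelope estimate above can be written out as a few lines of Python. This is only a sketch: it assumes the rest of the system draws the same power with either card and ignores power supply losses. The 120 watt TDP for the GTX 1060 is inferred from the text (the RX 480's 150 watts minus the 30 watt difference), and the function name is hypothetical.

```python
def expected_wall_watts(reference_wall_watts: float,
                        reference_tdp: float,
                        candidate_tdp: float) -> float:
    """Naive estimate: shift the reference system's wall draw by the TDP difference."""
    return reference_wall_watts + (candidate_tdp - reference_tdp)


GTX_1060_TDP = 120  # inferred from the text: RX 480 TDP (150 W) minus 30 W
RX_480_TDP = 150

expected = expected_wall_watts(185, GTX_1060_TDP, RX_480_TDP)
measured = 243
excess = measured - expected

print(f"Expected at the wall: {expected} W")
print(f"Measured at the wall: {measured} W ({excess} W more than expected)")
```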

I ended up selling this card on eBay a lot faster than I had planned, so I wasn’t able to do detailed testing. However, I suspect the actual power consumption at the card was much higher than what was being reported in MSI Afterburner. After doing some research, it turns out the RX 480 is known to overdraw from both the PCI Express slot and the supplemental PCI-E power cable. For a card designed to be more efficient, this one is a failure.

Performance Comparison

RX 480 Performance Plot

AMD RX 480 Folding@Home Performance Comparison

Efficiency Comparison

RX 480 Efficiency Plot

AMD RX 480 Folding@Home Efficiency Comparison

Conclusion

The AMD RX 480 produces about 245K PPD while using a surprisingly high 243 watts of system power (measured at the wall). The efficiency is thus about 1000 PPD/Watt. Although better than AMD’s older cards such as a Radeon 7970, these numbers aren’t very competitive, especially when compared to Nvidia’s GTX 1060 (a similarly-priced card from 2016). As of Feb. 2019, the RX 480 can be obtained used for about $100, and the GTX 1060 for $120. If you’re considering buying one of these older cards to do some charitable science with Folding@Home, I recommend spending the extra $20 on the Nvidia 1060, especially because with a mild overclock and a few driver tweaks (use the 372.90 drivers), the Nvidia 1060 can crank out over 350K PPD.

TL;DR: The AMD RX 480 isn’t a very efficient graphics card for running Folding@Home. However, the XFX Version has Pretty Lights…

RX 480 by XFX

Ahh, pretty lights!

Folding@Home Efficiency vs. GPU Power Limit

Folding@Home: The Need for Efficiency

Distributed computing projects like Stanford University’s Folding@Home sometimes get a bad rap on account of all the power that is consumed in the name of science.  Critics argue that any potential gains that are made in the area of disease research are offset by the environmental damage caused by thousands of computers sucking down electricity.

This blog hopes to find a balance by optimizing the way the computational research is done. In this article, I’m going to show how a simple setting in the graphics card driver can improve Folding@Home’s Energy Efficiency.

This article uses an Nvidia graphics card, but the general idea should also work with AMD cards. The specific card here is an EVGA GeForce GTX 1060 (6 GB). Green F@H Review here: Folding on the NVidia GTX 1060

If you are folding on a CPU, similar efficiency improvements can be achieved by optimizing the clock frequencies and voltages in the BIOS.  For an example on how to do this, see these posts:

F@H Efficiency: AMD Phenom X6 1100T

F@H Efficiency: Overclock or Undervolt?

(At this point in time, I really just recommend folding on a GPU for optimum production and efficiency.)

GPU Power Limit Overview

The GPU power limit slider is a quick way to control how much power the graphics card is allowed to draw. Typically, graphics cards are optimized for speed, with efficiency a secondary goal (if it is a goal at all). When a graphics card is pushed harder, it will draw more power (until it runs into the power limit). Today’s graphics cards will also boost their clock rate when loaded, and reduce it when the load goes away. Sometimes, a few extra MHz can be achieved for minimal extra power, but go too far and the power needed to drive the card climbs much faster than the clock speed does. Sure, the card is doing a bit more work (or playing a game a bit faster), but the heaps of extra power needed to do this make it very inefficient.
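A rough way to see why those last few MHz are so expensive is the textbook approximation that dynamic power scales with clock frequency times voltage squared. The Python sketch below applies that approximation. The 10% clock and 8% voltage increases are made-up illustrative values, not measurements from any card, and the sketch assumes performance scales linearly with clock speed.

```python
def relative_dynamic_power(freq_ratio: float, voltage_ratio: float) -> float:
    """Approximate dynamic power ratio using P ~ f * V^2 (ignores static/leakage power)."""
    return freq_ratio * voltage_ratio ** 2


# Hypothetical example: a 10% higher clock that needs 8% more voltage to stay stable.
freq_ratio = 1.10
voltage_ratio = 1.08

power_ratio = relative_dynamic_power(freq_ratio, voltage_ratio)
# Assumption: work done scales linearly with clock speed.
efficiency_ratio = freq_ratio / power_ratio

print(f"Relative power:      {power_ratio:.2f}x")
print(f"Relative efficiency: {efficiency_ratio:.2f}x")
```

Under these assumptions, about 10% more work costs about 28% more power, so work per watt drops. Running the same reasoning in reverse is why lowering the power limit can help efficiency.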

What I’m going to quickly show is that going the other way (reducing power) can actually improve efficiency, albeit at a reduction of raw output. For this quick test, I’m just going to look at the default power limit, 100%, vs. 50%. Specific tuning is going to be dependent on your actual graphics card. But, with a few days at different settings, you should be able to find a happy balance between performance and efficiency.

For these plots, I used my watt meter to obtain actual power consumption at the wall. You can read about my watt meters here.

Changing the Power Limit

A tool such as MSI Afterburner can be used to view the graphics card’s settings, including the power limit. In the screenshot below, I reduced the card’s power limit to 50% midway through taking data. You can clearly see the power consumption and GPU temperature drop. This suggests the entire computer should be drawing less power from the wall. I confirmed this with my watt meter.

Adjust Power Limit MSI Afterburner

MSI Afterburner is used to reduce the graphics card’s power limit.

Effect on Results

I ran the card for multiple days at each power setting and used Stanford’s actual stats to generate an averaged number for PPD. Reporting an average number like this lends more confidence that the results are real, since PPD as reported in the client varies a lot with time, and PPD can bounce around by +/- 10 percent with different projects.

Below is the production time history plot, courtesy of https://folding.extremeoverclocking.com/. I marked on the plot the actual power consumption numbers I was seeing from my computer at the wall. As you can see, reducing the power limit on the 1060 from 100% to 50% saved about 40 watts of power at the wall.

GTX 1060 F@H Reduced Power Limit Production

GTX 1060 Folding@Home Performance at 100% and 50% Power

On the efficiency plot, you can see that reducing the power limit on the 1060 actually improved its efficiency slightly. This is a great way to fold more effectively.

Nvidia 1060 PPD per Watt Updated

NVidia GTX 1060 Folding@Home Efficiency Results

There is a downside of course, and that is in raw production. The Points Per Day plot below shows a pretty big reduction in PPD for the reduced power 1060, although it is still beating its little brother, the 1050 Ti. One of the reasons PPD falls off so hard is that Stanford provides bonus points that are tied to how fast your computer can return a work unit. This bonus grows nonlinearly the faster your computer returns the work. So, by slowing the card down, we not only lose on base points, but we lose on the quick return bonus as well.
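To illustrate why the bonus makes PPD drop faster than the card's speed, here is a Python sketch of the quick return bonus as I understand it from the project's published points scheme: base points are multiplied by max(1, sqrt(k × deadline / elapsed time)). The base points, k factor, deadline, and run times below are hypothetical values chosen for illustration, since the real ones are set per project.

```python
import math


def work_unit_points(base_points: float, k: float,
                     deadline_days: float, elapsed_days: float) -> float:
    """Points for one work unit, including the quick return bonus multiplier."""
    bonus = max(1.0, math.sqrt(k * deadline_days / elapsed_days))
    return base_points * bonus


def points_per_day(base_points: float, k: float,
                   deadline_days: float, elapsed_days: float) -> float:
    """PPD if the card continuously returns identical work units."""
    return work_unit_points(base_points, k, deadline_days, elapsed_days) / elapsed_days


# Hypothetical project parameters (not taken from any real project).
BASE_POINTS, K, DEADLINE_DAYS = 10_000, 0.75, 3.0

full_speed = points_per_day(BASE_POINTS, K, DEADLINE_DAYS, elapsed_days=0.100)
# A card running at 80% of full speed takes 1.25x as long per work unit.
reduced_speed = points_per_day(BASE_POINTS, K, DEADLINE_DAYS, elapsed_days=0.125)

ppd_ratio = reduced_speed / full_speed
print(f"Speed ratio: 0.80, PPD ratio: {ppd_ratio:.2f}")
```

With the bonus active, PPD scales with speed to the power of 1.5, so a card running at 80% speed earns only about 72% of the points.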

Nvidia 1060 PPD Updated

NVidia GTX 1060 Folding@Home Performance Results

Conclusion

Reducing the power limit on a graphics card can increase its computational energy efficiency in Folding@Home, although at the cost of raw PPD. There is probably a sweet spot for efficiency vs. performance at some power setting between 50% and 100%. This will likely be different for each graphics card. The process outlined above can be used for various power limit settings to find the best efficiency point.
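The search for a sweet spot can be sketched in a few lines of Python: record average PPD and wall power at each power limit, then pick the setting with the highest PPD per watt. The numbers below are placeholders invented purely to show the mechanics. They are not measurements from this article, and the class and function names are my own.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    power_limit_pct: int  # power limit setting, in percent
    avg_ppd: float        # multi-day average points per day
    wall_watts: float     # average system power measured at the wall


def efficiency(run: Run) -> float:
    """PPD per watt for one test run."""
    return run.avg_ppd / run.wall_watts


def best_efficiency(runs: List[Run]) -> Run:
    """Return the run with the highest PPD per watt."""
    return max(runs, key=efficiency)


# PLACEHOLDER data for illustration only (deliberately unrealistic units).
runs = [
    Run(power_limit_pct=100, avg_ppd=1000, wall_watts=200),
    Run(power_limit_pct=75, avg_ppd=950, wall_watts=180),
    Run(power_limit_pct=50, avg_ppd=800, wall_watts=155),
]

best = best_efficiency(runs)
print(f"Most efficient setting: {best.power_limit_pct}% "
      f"({efficiency(best):.2f} PPD/Watt)")
```

Each run should cover several days of folding, for the same reason as in the tests above: PPD varies too much between work units for a short sample to be trusted.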