Hi everyone, sorry for the delay in blog posts. Electricity in Connecticut has been so expensive lately that, except for our winter-heating Folding@Home cluster, it hasn’t been affordable to keep all those GPUs running (even with our solar panels, which is really saying something). However, I did manage to get some good data on the top-tier Nvidia RTX 3090, which I got during COVID as the GPU in a prebuilt HP Omen gaming desktop. I transplanted the 3090 into my benchmark desktop, so these stats are comparable to previous cards I’ve tested.
Wait, what are we doing here?
For those just joining, this is a blog about optimizing computers for energy efficiency. I’m running Folding@Home, a distributed computing research project that uses your computer to help fight diseases such as cancer, COVID-19, and a host of other ailments. For more information, check out the project website here: https://foldingathome.org/
Look at this bad boy!
This is the HP OEM version of an RTX 3090. I was impressed that it had lots of copper heat pipes and a metal back plate. Overall this was a very solid card for an OEM offering.
At the time of my testing, the RTX 3090 was the top-tier card from Nvidia’s new Ampere line. They have since released the 3090 Ti, which is ever so slightly faster. To give you an idea of where the RTX 3090 stacks up against the previous cards I have tested, here is a table. Note that 350 watt TDP! That is a lot of power for this air cooler to dissipate.
I ran Folding@Home on my benchmark desktop in Windows 10, using Folding@Home client 7.6.13. I was immediately blown away by the insane Points Per Day (PPD) that the 3090 can spit out! Here’s a screenshot of the client, where the card was doing a very impressive 6.4 million PPD!
What was really interesting about the 3090, though, was how much its performance varied with the size of the molecule being worked on. Very large molecules with high atom counts benefited greatly from the number of CUDA cores on this card, and it kicked butt in both raw performance (PPD) and efficiency (PPD/Watt). Smaller molecules, however, did not fully utilize this card’s impressive potential. This resulted in lower efficiency and more wasted power. I would assume that running two smaller Ampere cards (for example, the 3080) on small models would be more efficient than using the 3090 for small models, but I haven’t got any 3080s to test that assumption (yet!).
In the plots below, you can see that the smaller model (89k atoms) reached a peak of about 4 million PPD, as opposed to about 7 million PPD with a 312k atom model. The smaller model was also less efficient at 100% card power, coming in at about 10,000 PPD/Watt vs. 16,500 PPD/Watt for the larger model. These are still great efficiency numbers, which shows how far GPU computing has come from previous generations.
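For anyone who wants to run the same comparison on their own cards, here is a minimal Python sketch of the efficiency metric used throughout this post. The function names are my own, made up for illustration, and the inputs are just the rounded peak PPD figures quoted above, not exact measurements.

```python
def ppd_per_watt(ppd: float, watts: float) -> float:
    """Efficiency metric: points per day divided by power draw in watts."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return ppd / watts


def percent_lower(reference: float, value: float) -> float:
    """How far `value` falls below `reference`, as a percentage."""
    return (reference - value) / reference * 100.0


# Rounded peak PPD figures quoted above (312k-atom vs. 89k-atom model).
large_ppd, small_ppd = 7_000_000, 4_000_000
print(f"Small model peak PPD is {percent_lower(large_ppd, small_ppd):.0f}% lower")
```

Feed `ppd_per_watt` whatever power reading you trust most (card power or wall power), as long as you use the same one for every card you compare.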
Reduce GPU TDP Power Target to Improve Efficiency
I’ve previously shown how GPUs are set up for maximum performance out of the box, which makes sense for video gaming. However, if you are trying to maximize the energy efficiency of your computational machines, reducing the power target of the GPU can result in massive efficiency gains. The GeForce RTX 3090 is a great example of this. When solving large models, this beast of a card benefits modestly from throttling the power down, gaining 2.35% in energy efficiency with the power target set to 85%. However, the huge improvement comes when solving smaller models. When running the 89k atom work unit, I got a whopping 29% efficiency improvement when setting the power target to 55%, with only a 14% performance reduction! Since the F@H project gives out a lot of smaller work units in addition to some larger ones, I chose to run my machine at a 75% power target. On average, this splits the difference and gives a noticeable efficiency improvement without sacrificing too much raw PPD. In the RTX 3090’s case, a 75% power target substantially reduced the power draw of the computer (wall consumption dropped from 434 to 360 watts) and also reduced the heat and noise coming out of the chassis. This makes for a happier office environment and a happier computer that will last longer!
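As a quick sanity check on those percentages: since efficiency is just PPD divided by watts, a given performance change and efficiency change together imply how much the power draw must have changed. Here is a rough Python sketch of that arithmetic. The helper names are hypothetical, and the result applies to whichever power measurement the efficiency figures are based on.

```python
def implied_power_ratio(perf_ratio: float, efficiency_ratio: float) -> float:
    """Power ratio implied by efficiency = PPD / watts.

    new_watts / old_watts = (new_ppd / old_ppd) / (new_eff / old_eff)
    """
    return perf_ratio / efficiency_ratio


def percent_change(old: float, new: float) -> float:
    """Signed percentage change from `old` to `new`."""
    return (new - old) / old * 100.0


# 89k-atom work unit at a 55% power target: 14% less PPD, 29% better PPD/Watt.
ratio = implied_power_ratio(perf_ratio=0.86, efficiency_ratio=1.29)
print(f"Implied power draw: {ratio:.0%} of the 100% setting")

# Wall consumption at the 75% power target, using the figures above.
print(f"Wall power change: {percent_change(434, 360):.1f}%")
```

In other words, giving up 14% of the PPD for a 29% efficiency gain only works out if the power draw fell by roughly a third, which is why the small-model case is such a good candidate for a lower power target.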
Tuning Results: 89K Atoms (Small Model)
Here are the tuning plots for a smaller molecule. In all cases, the X-axis is the power target, set in the Nvidia driver. A 100% power target corresponds to 350 watts in the case of the RTX 3090.
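If your tuning tool takes a power limit in watts rather than a percentage (nvidia-smi's power-limit option works that way, for example), the conversion is simple. This is a small Python sketch that assumes the 350 watt figure above is the 100% point. The function name and the 0 to 100 range check are my own choices for illustration.

```python
RTX_3090_TDP_WATTS = 350  # 100% power target, per the text above


def power_target_watts(percent: float, tdp_watts: float = RTX_3090_TDP_WATTS) -> float:
    """Convert a power target percentage into a power limit in watts."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return tdp_watts * percent / 100.0


# The power targets discussed in this post.
for pct in (100, 85, 75, 55):
    print(f"{pct:>3}% -> {power_target_watts(pct):.1f} W")
```

Keep in mind that this is the limit for the card only. The wall power for the whole machine will be higher, since the rest of the system still needs to be fed.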
Tuning Results: 312K Atoms (Large Model)
And here are the tuning results for a larger molecule.
Here are the comparison results to the previous hardware configurations I have tested. Note that now that the F@H client supports enabling CUDA, I did some tests with CUDA on vs. off with the RTX 2080 Ti and the 3090. Pro Tip: MAKE SURE CUDA IS ON! It really speeds things up and also improves energy efficiency.
The key takeaway from the results below is that the 3090 offers 50% more performance (PPD) than the 2080 Ti, and is almost 30% more energy efficient while doing it! Note this does not mean this card sips power…it actually uses more watts than any of the other cards I’ve tested. However, it does a lot more computation with those watts, so it is putting the electricity to better use. A data center or workstation can therefore get through more work in a shorter amount of time with 3090s vs. other cards, and use less energy overall to solve a given amount of work. This is better for the environment!
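To put a number on "less energy for a given amount of work," here is a short Python sketch. It treats F@H points as a stand-in for work done, which is a simplification because of the quick return bonus. The baseline efficiency and the amount of work are hypothetical round numbers, not measurements from my testing. Only the 30% figure comes from the comparison above.

```python
def energy_kwh(points: float, ppd_per_watt: float) -> float:
    """Energy needed to earn `points`, given efficiency in PPD per watt.

    PPD/Watt is points per watt-day, so points / (PPD/Watt) gives watt-days.
    Multiplying by 24 h and dividing by 1000 converts watt-days to kWh.
    """
    return points / ppd_per_watt * 24.0 / 1000.0


baseline_eff = 10_000.0          # hypothetical PPD/Watt for an older card
better_eff = baseline_eff * 1.3  # 30% more efficient, as in the text above
work = 100_000_000               # hypothetical fixed amount of work, in points

old = energy_kwh(work, baseline_eff)
new = energy_kwh(work, better_eff)
print(f"{old:.1f} kWh vs {new:.1f} kWh ({1 - new / old:.0%} less energy)")
```

Note that a 30% efficiency gain works out to about 23% less energy for the same work, not 30%, because the energy scales with one over the efficiency.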
The flagship Ampere architecture Nvidia GeForce RTX 3090 is an excellent card for compute applications. It does draw a ton of power, but this can be mitigated by reducing the power target in the driver to gain efficiency and reduce heat and noise. In the case of Folding@Home disease research, this card is a step change in both performance and energy efficiency, offering 50% more compute power and 30% more efficiency than the previous generation. I look forward to testing out other Ampere cards, as well as the new 40xx “Lovelace” architecture, if Eversource ever drops the electric rate back to normal levels in CT.
Sweet. Nice to see a new post.
How do you retest the same size workload, at different power levels? Do you just take the 10 minute average, and tweak? Or restarting a known file?
I have 2080, 2080ti, 3090 (to be debugged), will try some tests to see if we get similar numbers. Right now, the two 2080’s pull 700W from the wall…
What’s your actual electrical rate?
Hi! For the individual workload tests, I stayed within one work unit and made adjustments and watched for the moving average to stabilize. It’s not 100% perfect, but good enough to reveal the trend. For the longer term tests that make up the bar chart at the end, I run a ton of work units at a given setting, such as the 75% TDP setting, and average the answer. I usually try to get at least a week’s worth of data for the major data points in the final plots.
Nice collection of GPUs! 700W at the wall from the 2080s seems about right, assuming about 150 watts for the rest of the system to keep them fed.
Electric rate over here at the moment is 24 cents / kWh for generation, not including transmission or service charges. https://insideinvestigator.org/report-connecticut-electric-rates-are-highest-in-continental-us/#:~:text=Connecticut%20residents%20paid%20some%20of,greatest%20price%20increases%20in%202022.