Duped by Nvidia, 4090 Benchmarks Don’t Apply to Gamers

Nvidia GeForce RTX 4090

Preface

I started this article as a news and analysis piece for the GeForce 4090 launch. It quickly grew to be a much longer piece, though. I’m going to lay out in great detail why enthusiasts shouldn’t care about the GeForce 4090, how Nvidia used tech enthusiasts for marketing, and why the true Lovelace gaming cards won’t be as impressive as you think. So, grab some coffee and get reading!

The News

The Nvidia GeForce 4090 benchmark embargo was lifted yesterday, and the numbers are impressive! Various outlets are measuring a generational performance increase of anywhere from 40% to 75% over the 3090 Ti. This GPU is so powerful that it can run Cyberpunk 2077 at 4K with raytracing and no DLSS at 60 FPS without breaking a sweat. Even more impressive is that power consumption holds steady at around 430 watts under load – less than the rumored 600 watts the 4090 was expected to consume.

It should be noted that at the time of writing Nvidia still has an embargo in place for AIB partner cards. The normal tech outlets have not released benchmarks for MSI et al. 4090 cards, though it’s safe to expect similar levels of performance.

FOMO need not apply, though. The 4090 isn’t for gamers. More on that below.

Here’s a quick spec recap for the Nvidia 4090 and both “4080” models. I’ll be referencing these numbers in the rest of the article:

4090 Specs:

  • 16384 CUDA cores
  • 24 GB memory
  • 384-bit memory bus
  • 2520 MHz GPU boost clock

4080 16 GB Specs:

  • 9728 CUDA cores
  • 16 GB memory
  • 256-bit memory bus
  • 2505 MHz GPU boost clock

“4080” 12 GB Specs:

  • 7680 CUDA cores
  • 12 GB memory
  • 192-bit memory bus
  • 2610 MHz GPU boost clock

The Opinion

I’m salivating over the 4090. I want it bad, but not for the same reason you do. If you’re a gamer, the 4090 isn’t a GPU for you. Go look somewhere else. It’s not because of price either. In fact, the MSRP for the 4090 is downright cheap, but Nvidia didn’t make the 4090 for gamers. It’s just using us for marketing.

Before you break out your pitchforks set ablaze with the blood of your firstborn type of witchery – hear me out. I’m going to explain who the 4090 is for, why this GPU doesn’t apply to you, and how Nvidia is using us for publicity.  So, keep reading.

The GeForce 4090 Is Not for Gamers

Nvidia GeForce RTX 4090 (2)

The moment Nvidia announced the pricing schedule for Lovelace, the crowd went wild, but not in a good way. Every enthusiast tech outlet balked at the price of the GeForce 4090. As gamers, $1,600 is a large pill to swallow, but the 4090 isn’t for gamers.

The GeForce 4090 is meant for freelancers and SMBs. Full stop. Here’s why.

The History of GeForce in Business

Small businesses and freelancers have always used GeForce cards as an alternative to the Quadro GPUs, and this has totally been Nvidia’s fault. Nvidia’s 6th generation of graphics cards, the 6×00 series (try Googling it – they are so old you only get AMD results now…) could easily be converted to a Quadro card with a simple registry hack. It was obvious that the GPU cores shared more in common between GeForce and Quadro than Nvidia wanted to admit.

Since the 6×00 series days, Nvidia has tried its best to force businesses to use their Quadro line instead of GeForce cards. Love it or hate it, Nvidia’s cash cow doesn’t depend on gamers. The profit margin is too thin on enthusiast cards.

That’s not to say that Nvidia doesn’t make a good chunk of change from us gamers, though. We buy a lot of hardware. It’s not a market that Nvidia can ignore.

But again, Nvidia’s bread and butter aren’t gamers. We simply get the leftovers. Let me explain.

Nvidia has always designed for Quadro before GeForce (with caveats for today’s markets). For instance, before Vulkan, OpenGL was the bee’s knees. Let’s save the debate comparing OpenGL, Vulkan, and DirectX for another article. OpenGL was, and to some extent still is, the de facto graphics layer for anything that isn’t a game (i.e., industries where unit sales are more than $60 a pop).

Nvidia trounced competitors in the OpenGL space. Even PowerVR had trouble keeping up, and back in the day PowerVR’s hardware rendering engine and Savage’s S3TC technologies were way ahead of 3DFX’s Glide and anything Nvidia or ATi had going for it.

A moment ago, I stated that the 6×00 cards could be converted to Quadro cards. At the time, the Nvidia 6800 GT was a monster card. When those 6800 GTs were benchmarked against competing Quadro cards using Quadro-certified drivers, they went toe-to-toe with each other.

Nvidia realized they dun screwed up.

Moving forward, Nvidia feature-locked GeForce cards. For example, NVenc couldn’t encode more than 2 video streams at once, and GeForce cards couldn’t be used for virtualization (more on that below). If you needed these features, it was Quadro or bust.

There’s a lot that happened between Nvidia’s 6th generation of GPUs and today’s cards, but it wasn’t until the GeForce 9×0 series that businesses started eyeing GeForce again despite feature locks.

Why do businesses buy Nvidia Quadro cards?

Nvidia Quadro GPUs are as expensive as used cars. That’s a hefty, open-ended statement, isn’t it? Think about it. A Quadro RTX 8000 runs for $5,800 on Amazon. Data center-focused Tesla GPUs are easily double that cost. The Nvidia DGX A100 compute system costs as much as a house ($199,000).

If the GeForce 4090 MSRP is listed at $1600, why are businesses buying Quadro cards? There are a few reasons.

Pi Is Not Accurate Enough and Other Things

Traditionally, Quadro’s claim to fame is floating point calculations. In the CompSci world, floating point numbers are messy. We don’t like them. Numbers after a decimal point are not accurate enough. This is the reason all FinTech apps convert currency to integer values (e.g., pennies) before they do anything else.
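The rounding problem is easy to see for yourself. Here is a minimal Python sketch of the pennies trick. It isn’t taken from any real FinTech codebase; the `to_cents` helper and the sample amounts are made up for illustration.

```python
from decimal import Decimal


def to_cents(amount: str) -> int:
    """Convert a currency string such as "19.99" to integer cents (hypothetical helper)."""
    # Decimal parses the string exactly, so no binary rounding sneaks in.
    return int(Decimal(amount) * 100)


# Binary floating point cannot represent 0.10 or 0.20 exactly,
# so the float sum does not compare equal to 0.30.
float_total = 0.10 + 0.20
float_matches = float_total == 0.30

# The same sum in integer cents is exact.
cents_total = to_cents("0.10") + to_cents("0.20")
cents_matches = cents_total == to_cents("0.30")

print(float_matches, cents_matches)
```

The float comparison fails while the integer comparison passes, which is the whole reason money gets stored as whole pennies.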

That rings true in all mathy professions including modeling and CAD software. 3D modelers, hipster data scientists, and engineers were among the first to use computer-aided workflows. In the engineering world, there is a massive difference between 0.00001 and 0.00009.

At one time, Nvidia Quadro cards came with application-certified drivers. If you used Maya with an Nvidia graphics card, Nvidia provided drivers that were tested and certified specifically for Maya. That’s not so much the case today (more below). Those driver certifications now come from the vendor more than Nvidia, though Nvidia still works very closely with market leaders like Autodesk.

But I digress. Those certified drivers guaranteed floating point math accuracy (among other things). Because the GeForce products were aimed at the gaming market, and game engines typically prefer whole values (e.g., integers), floating point accuracy isn’t a priority for them.

Before Pascal cards, this was easily proven. GeForce cards consistently trounced Quadro cards in game benchmarks. In my personal opinion, though, I’m not entirely convinced there was ever a hardware difference between GeForce and Quadro cards in the GPU die (other than CUDA core counts, etc.). The difference only existed in software.

Over time, Nvidia gave businesses other reasons to purchase Quadro cards. For instance, NVenc is one of the best hardware-based media encoders on the market, but GeForce cards can only process 2 video streams at once. That’s not feasible for organizations that work with video content.

Likewise, up until a year or two ago, GeForce cards did not work in a virtualized environment without a lot of hacks. Quadro cards do. In both the SMB and enterprise environments, everything has been virtualized for the past decade. If a hardware component can’t be supported in a VM, it’s not going into the server. Full stop.

For businesses, the Quadro tax isn’t a big deal. Paying a couple of thousand dollars more for a GPU is a drop in the bucket when a rounding error could literally cost you millions of dollars.

Likewise, every business I have worked for in my professional life has not paid me to sit on the phone waiting for support from a vendor. It costs businesses hundreds of dollars in wages alone for their IT staff to file a support ticket with Microsoft for a licensing issue. Certified graphics drivers reduce support costs.

Organizations don’t care about the Quadro tax. It’s just another OpEx cost like those pens Kathy in HR has been hoarding.

Nvidia Tricked SMBs Into Using GeForce

In 2013 Nvidia released its first Titan card. They were amazing! That was Nvidia’s first signal to the SMB market that GeForce cards might be a viable alternative to Quadro.

A year later, Nvidia released the Titan Z – a dual-die monster GPU. While SLI was a thing long before the Titan Z, the difference was that the Titan Z could support more onboard memory than either GeForce or Quadro cards, and it didn’t come with SLI’s headaches. Though the Titan Z was basically two graphics cards squished together, the OS treated it as a single GPU thus making it more reliable.

Though the Titan GPUs started to gain some traction in the business environment, it wasn’t until Pascal was released that GeForce cards really shined. Before Pascal, GeForce ate Quadro’s breakfast in games. With the Pascal architecture, a Quadro P6000 suddenly went toe-to-toe with a GeForce 1080 even when using Quadro-certified drivers. Making matters worse, a lot of industry professionals found that the GeForce 1080 GPU was about as good as Quadro cards in professional applications. The only difference was the application-certified drivers were only available for Quadro GPUs.

Instantly the gap between Quadro and GeForce cards in the SMB environment boiled down to features hidden behind a paywall.

The SMB Move to GeForce

Since Nvidia released the Pascal architecture, the GeForce experience (pun intended) in the SMB market has grown stronger. Small businesses are fine with the 2 stream NVenc limitations, and Nvidia has since started supporting GeForce products in virtualized environments.

Likewise, Titan cards are MIA. The latest Titan GPU, the Titan RTX, was released in 2018. It was based on the Turing architecture – better known as the GeForce 20×0 cards. The Titan RTX was also marketed to data scientists and people doing machine learning – not gamers.

The Ampere GPUs scrapped Titan altogether. Instead, we got the GeForce 3090. Do you remember the difference between the GeForce 20×0 and 30×0 launches? Nvidia totally pimped out the GeForce 2080 to every tech enthusiast YouTuber it could find. However, all was quiet on the Titan RTX front.

When Ampere launched, the marketing approach for the 3090 wasn’t any different from the 3080’s. Tech YouTubers galore benchmarked the 3090 until they were blue in the face. Every gamer wanted a 3090, but only the more affluent among us could afford one.

However, the GeForce 3090 was never meant for gamers. Nvidia only used gamers as a marketing arm. We willingly praised its graces, and the folks in the SMB world paid attention to it. We’re seeing the same thing happen with the 4090, too.

Who should buy the GeForce 4090?

Nvidia GeForce RTX 4090 (3)

The GeForce 4090 isn’t for gamers. It’s for the SMB crowd. Specifically, the GeForce RTX 4090 is for freelancers, startups, and small businesses doing content creation, machine learning, and data analytics. Gamers need not apply.

Where’s the proof?

Is AI real life?

This part is more complicated. One area of technology that has blown up is machine learning and artificial intelligence. The two are technically different things, but for this discussion they are effectively the same thing.

Machine learning is a royal ***** to compute. It requires tons of processing power even with specialized hardware. It just so happens that the parallel nature of GPU design fits this bill perfectly.

There are a few things to consider here. First, Nvidia capitalized on the ML market fast. They beat everyone else, and as such, the ML market heavily depends on CUDA. AMD is trying to catch up, but they are a few years off.

Even still, Nvidia has so much of a lead in the ML market that their cards (e.g., Tesla, Quadro, GeForce, et al.) are considered the standard. Software engineers don’t like CUDA, yet every machine-learning framework defaults to using it.

In every respect, AMD’s ROCm AI platform is easier to use and works better than Nvidia’s CUDA. However, because CUDA came first, the big ML platforms and libraries like TensorFlow and PyTorch support CUDA out of the box. ROCm isn’t as well supported but is gaining traction.

Okay, so Nvidia wins AI. So what?

Here’s the big deal. For most people, their experience using machine learning on their computers is running apps with premade models. While AI-powered apps, like Whisper or Stable Diffusion, use GPUs for acceleration, that’s not what makes GPUs a big deal in the machine learning space.

Creating those machine learning models from scratch requires a boatload of processing power even when using platforms like TensorFlow. The amount of computing resources required to create your Stable Diffusion porn is like a drop in the ocean compared to what Stability AI and its research partners had to use to create the models that power Stable Diffusion.

While a GeForce 3090 can generate an image with Stable Diffusion in a few seconds, it would take a GeForce 3090 days or weeks to process the ML models required to make Stable Diffusion work.

There’s another way, though. AWS, Azure, and Google all offer PaaS products specifically for machine learning. These PaaS products are backed by GPUs (specifically Nvidia GPUs in most cases).

Go look at the pricing schedule for AWS, though. An Nvidia-powered EC2 instance can easily cost $24.00 per hour for a P3 instance (recommended for ML training).

If we play the number fiction game, and the model for Stable Diffusion takes 3 days (72 hours) to crunch with that AWS EC2 P3 instance, that will cost $1,728. That’s more money than a new GeForce 4090 and only a couple hundred dollars shy of the inflated RTX 3090 Ti launch price. At least if businesses buy the GeForce 4090, they get to keep using it after that ML model is completed.
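If you want to play with those numbers yourself, here is a rough Python sketch of the rent-versus-buy math. The hourly rate, the three-day run, and the MSRP are the figures quoted in this article; the function names are made up for illustration. It ignores electricity, the rest of the workstation, and any difference in how fast each option actually finishes the job, so treat it as napkin math only.

```python
def rental_cost(hourly_rate: float, hours: float) -> float:
    """Total cloud bill for renting a GPU instance for the given hours."""
    return hourly_rate * hours


def break_even_hours(card_price: float, hourly_rate: float) -> float:
    """Hours of rental after which the cloud bill equals the card's price."""
    return card_price / hourly_rate


P3_HOURLY_RATE = 24.00   # USD per hour, the P3 figure quoted above
RTX_4090_MSRP = 1600.00  # USD, the launch MSRP quoted above
TRAINING_HOURS = 3 * 24  # the hypothetical three-day training run

cloud_bill = rental_cost(P3_HOURLY_RATE, TRAINING_HOURS)
hours_to_match = break_even_hours(RTX_4090_MSRP, P3_HOURLY_RATE)

print(f"Cloud bill for the run: ${cloud_bill:,.2f}")
print(f"Rental hours that equal one 4090: {hours_to_match:.1f}")
```

Under these assumptions, the rental bill passes the price of the card before the third day of training is over.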

Oh, okay. So, it’s cheaper for businesses to buy GeForce cards?

See where I’m going with this? It’s very cost-effective for businesses to buy GeForce cards.

The world runs on small businesses, and tech startups can’t afford the likes of AWS and Azure. The running myth is that the cloud is cheaper than buying and maintaining your own servers.

That’s flat-out wrong. If cloud resources are configured properly, they cost about as much (give or take depending on the business) as running hardware on-prem. This doesn’t include tiny businesses that can get away with using only Office 365 (now Microsoft 365) or Google GSuite (or whatever Google calls it this week).

SMBs routinely choose the cloud route for operational reasons, not to save money.

Since AI platforms like TensorFlow have matured, AI-powered startups are popping up like weeds. We’re in the middle of a huge societal shift – one that we haven’t witnessed since the birth of the small combustion engine.

Even I am running AI-powered software in my freelance life. However, like most small businesses, I can’t afford to spend $2000 to train an ML model. Nor do I have any desire to pay $7 per hour for an AWS GPU-powered virtual machine – not when a GeForce 3080 is only $800.

This only explains why startups offering ML-powered products would want a GeForce card. The truth is any GPU-powered operation will benefit from the power of a GeForce 4090. It doesn’t help that AWS and Azure helped murder the HPC (high-performance computer) in the workplace. High-powered workstations are only ever ordered for specific use cases now. The list of reasons to purchase Quadro is getting smaller.

Who wants Quadro anymore?

There will always be a demand for Quadro cards. Don’t get me wrong. I once worked with engineers that would run particle simulations. These simulations computed the flow of millions of grain-sized particles at once.

Even with SLI-enabled top-end Quadro cards, these simulations took days to run. Those simulations were used to design heavy equipment, and if they were wrong, that mistake could literally cost that business hundreds of thousands of dollars per unit sale. The Quadro tax for those certified drivers is worth the money to that business.

Nvidia has also pivoted in the enterprise environment. Quadro was Nvidia’s enterprise bread and butter. However, things like the DGX A100 and Tesla cards have taken Quadro’s place. This is probably the reason why Nvidia relented and started allowing GeForce cards to work in virtualized environments.

Quadro will likely not be laid to rest anytime soon. There is a lot of value behind that brand and those certified drivers. For the freelancer and SMB crowd, the 4090 is the new queen bee. So why is the media making a big fuss about the GeForce 4090 gaming performance?

What about the GeForce 4080?

Nvidia GeForce RTX 4080

You’re cute Nvidia. We see what you did there. Everyone is all up in a tizzy about the 4090, and now we’re pumped for the 4080. Should we be, though?

Meh, I guess. It’ll be fun for me to watch, but I’m a nerd. Here’s why the 4080 will be much less impressive.

Each new generation of graphics cards typically offers a 15-25% improvement. It doesn’t matter if it’s an AMD or Nvidia card. Lovelace is not unique despite Nvidia’s best efforts.

The top of this article states that the GeForce 4080 16 GB variant has 9728 CUDA cores. The GeForce 4090 has 16384. In other words, the Nvidia GeForce 4090 has 6,656 more CUDA cores, or roughly 68% more.

According to JayzTwoCents, a popular PC enthusiast YouTube channel, the 4090 gets a respectable score of 19553 in 3DMark Time Spy Extreme while the 3090 Ti clocks in at a measly 11185. The 3090 Ti boasts 10752 CUDA cores.

Let’s do some rough napkin math.

The GeForce 3090 Ti scores 1.04 points for every CUDA core in the Time Spy Extreme benchmark. The GeForce 4090 scores 1.19 points per CUDA core. That works out to a per-core gain of roughly 15%, which looks a lot closer to the gen-over-gen performance uplift we’re so familiar with.

Using that same math, we can expect the GeForce 4080 16 GB variant to score about 11576 in Time Spy Extreme (9728 cores × 1.19 points per core). That’s only slightly better than the GeForce 3090 Ti. We should note that this math doesn’t account for the difference in the memory bus – it’s wider in both the GeForce 3090 Ti and 4090 compared to the 4080. So, the actual Time Spy Extreme score for the 4080 16 GB variant will likely be lower.
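For anyone who wants to check the napkin math, here it is as a short Python sketch. The scores and core counts are the ones quoted above; the `points_per_core` helper is made up for illustration. The projection assumes the score scales linearly with core count, which is this article’s simplification, not something a benchmark guarantees.

```python
def points_per_core(score: int, cuda_cores: int) -> float:
    """Time Spy Extreme points earned per CUDA core."""
    return score / cuda_cores


# Scores and core counts as quoted in this article.
ratio_3090_ti = points_per_core(11185, 10752)
ratio_4090 = points_per_core(19553, 16384)

# Per-core improvement from the 3090 Ti to the 4090, as a fraction.
per_core_uplift = ratio_4090 / ratio_3090_ti - 1

# Projected 4080 16 GB score, using the rounded 1.19 figure from the text
# and assuming linear scaling with core count.
projected_4080 = round(round(ratio_4090, 2) * 9728)

print(f"3090 Ti: {ratio_3090_ti:.2f} points per core")
print(f"4090:    {ratio_4090:.2f} points per core")
print(f"Per-core uplift: {per_core_uplift:.1%}")
print(f"Projected 4080 16 GB score: {projected_4080}")
```

Swap in a different score or core count to see how sensitive the projection is to the linear-scaling assumption.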

Is that a problem? Not really. That’s comparing a flagship consumer card to an SMB-grade GPU. The better comparison is the 3080 vs the 4080. We’re waiting with bated breath for those numbers, but I suspect the 4080 will offer about a 30% improvement in performance compared to the 3080.

That’s still a good boost in performance, but it’s not the 50%-70% that we are seeing with the 4090. So, I have to conclude that Nvidia is using the enthusiast market to pimp their 4090s to businesses. We shouldn’t pay it any attention. I’m not even going to discuss the 4070, ehh… I mean the 4080 12 GB version. Shame on you, Nvidia…

RDNA3 Vs. Lovelace

What does all this mean for gamers using AMD? Other than mindshare, absolutely nothing.

Here’s the deal. It’s obvious how Nvidia is trying to control its narrative. They have an embargo on GeForce reviews. At the time of writing, outlets are only allowed to release benchmarks and reviews for the 4090 Founders Edition. Numbers for partners’ cards are not allowed to be released until the next day.

That serves two purposes.

One, this strategy aligns with Jensen’s views that partners aren’t important. Nvidia wants all the attention. Jensen believes that MSI and other AIB partners shouldn’t get praise since Nvidia does all the work.

Two, it keeps Nvidia in the news cycle longer. The more mindshare a business occupies, the more likely it is to be the dominant force in its market. I bet we’ll see the 4080 reviews about the same time RDNA3 is released. The 4080 review cycle will follow the 4090, too – 4090 FE reviews first followed by partner cards a day or two later.

Nvidia is not screwing around here. AMD is traditionally not a threat in the GPU world, but Nvidia watched how AMD started eating Intel’s lunch. Lisa Su very publicly gave Jensen the professional three-finger salute when RDNA launched stating that RDNA will age like fine wine – though it’s speculated Su may have said that as a nod to enthusiasts.

Nonetheless, RDNA2 went toe-to-toe with Ampere in rasterization performance. It was the first time in a long while that Nvidia wasn’t dominant in every benchmark. RDNA3 is expected to be even better this time while challenging Lovelace’s RT performance, too.

Hence Jensen’s response. There’s a good chance that RDNA3 will be extremely competitive in most markets compared to GeForce. By market, I don’t mean different geographic areas. I mean in different industries.

AMD is getting serious about ML, and their release notes for ROCm show that. AMD’s video encoders lag behind NVenc, but they continue to improve at a drastic rate. RDNA3 is rumored to support AV1 encoding, much like Intel Arc, across the entire RDNA3 GPU line, too. There’s a good chance that at a minimum RDNA3 will go toe-to-toe with Lovelace this generation, if not trounce Lovelace in game benchmarks altogether.

The best way for Nvidia to compete against AMD at the moment is through mindshare, and Nvidia crafted a master plan. The GeForce 3090 and 4090 are not meant for gamers, but tech outlets benchmark and treat them as such.

So, if the 4090 isn’t a gaming card, and the SMB market could use more horsepower, why not throw as many CUDA cores in the thing as possible? That way tech outlets go nuts over that substantial 70% gen-over-gen performance improvement, and the 4090 instantly goes viral in the tech space.

There’s a good chance that Nvidia did catch AMD off-guard. AMD may not have a competitor for the 4090 early in the RDNA3 generation. AMD will need to scramble and produce something like a Radeon 7950 XT Super or something. However, I must emphasize again, the 4090 isn’t for gamers, and we should only pay attention to the 4080 and lower GPUs.

Conclusion

Holy moly! This is a long article. I’m including a conclusion at the end as a TL;DR for you folks that scroll to the bottom right away.

Here’s the summary of this article:

  • Don’t be impressed by the GeForce 4090. Per CUDA core, it has about the same gen-over-gen performance improvement as the generations that came before it.
  • The 4090 is meant for freelancers and SMBs, not gamers.
  • Nvidia tightly controlled the 4090 launch narrative to go viral and gain mindshare. We’ve been duped.
  • The 3090 and 4090 are the new Quadro and Titan cards for small businesses. Oh yeah, and for affluent gamers, too, I suppose.
  • S3TC scared Nvidia poopless, but that’s for a different article.

I understand that this article is a short story. If you’re a tech or gaming enthusiast, I highly encourage you to read all of it – especially if you’re younger than an elder millennial. There is a lot of good history and personal experience forming my opinions, and if you think I’m wrong, let me know. Sound off on social media and tag us in your responses.

About Jonathan Welling 9 Articles
My name is Jon and I am a tech therapist. I'm a recovering sysadmin with more than a decade of served time in the tech industry doing everything from front-line support to managing systems and application development. I now spend my days helping others recover from tech phobias through exposure therapy, assisting SMBs with IT, writing words, and enjoying long walks on the beach.
