43 Comments
Trixanity - Thursday, March 29, 2018 - link
How come Xavier's quoted TFLOPS count is less than double Parker's? The TDP is twice as high, and the GPU is twice as big, with newer CUDA cores and spanking new (and more) custom ARM cores. One would think the TFLOPS would be more than twice as big - perhaps even surpassing 2 TFLOPS. My brain can't compute this.
karthik.hegde - Thursday, March 29, 2018 - link
It is FP32 FLOPs. A large number of workloads need only half-precision float, or even only 8-bit float. Hence, at the cost of FP32 FLOPs, they increased half-precision float throughput.
Trixanity - Thursday, March 29, 2018 - link
That makes sense. Thanks!
iter - Thursday, March 29, 2018 - link
There is no such thing as 8-bit float. Floating point numbers at 8 bits of precision defy basic common sense. The lowest precision floating point implementations are 16-bit, i.e. half precision.
It is 8-bit INTEGERS. Which is why the chart says TOPS - tera operations per second rather than tera floating point operations per second. 8-bit integers suffice for ML, and the compute units required are much simpler, smaller and more energy efficient. Which is how they reach those highly disproportionate throughput values.
name99 - Friday, March 30, 2018 - link
There are indeed 8-bit floats. See e.g.
http://www.cs.jhu.edu/~jorgev/cs333/readings/8-Bit...
https://en.m.wikipedia.org/wiki/Minifloat
It’s even possible that at some point they’ll be considered a good match for neural nets, or something else that doesn’t demand precision but does want dynamic range. (But yeah, the accumulator for such systems would have to be very carefully designed...)
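To make the trade-off between dynamic range and precision concrete, here is a minimal sketch that decodes a byte as an IEEE-style 8-bit float. The 1-4-3 split (sign, exponent, mantissa bits) and the bias of 7 are assumptions picked for illustration, not a format that any hardware discussed here is known to use, and the name decode_minifloat is hypothetical.

```python
def decode_minifloat(byte, exp_bits=4, man_bits=3, bias=7):
    """Decode one byte as a hypothetical IEEE-style 1-4-3 minifloat."""
    sign = -1.0 if (byte >> (exp_bits + man_bits)) & 1 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    if exp == (1 << exp_bits) - 1:
        # All-ones exponent is reserved for infinity and NaN, as in IEEE 754.
        return sign * float("inf") if man == 0 else float("nan")
    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of (1 - bias).
        return sign * man / (1 << man_bits) * 2.0 ** (1 - bias)
    # Normal number: implicit leading 1 in front of the mantissa bits.
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


# Codes 0x00..0x77 are the non-negative finite values in this layout.
largest = max(decode_minifloat(b) for b in range(0x78))
smallest = decode_minifloat(0x01)
step_at_one = decode_minifloat(0x39) - decode_minifloat(0x38)
print(largest, smallest, step_at_one)
```

Under these assumptions the format spans 1/512 to 240, but the gap between 1.0 and the next representable value is 0.125, which is the "range without precision" property being described.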
mode_13h - Friday, March 30, 2018 - link
I think it's pretty clear that @karthik.hegde meant to write 8-bit int. Everyone is all about inferencing with small-width ints.
name99 - Friday, March 30, 2018 - link
Sure. But I think it's interesting to know that the option for 8-bit floats exists.
Even though they don't have essential use cases today (and so haven't been implemented in HW), we don't know what tomorrow will bring. NNs, in particular, have created an environment where poor precision is often acceptable, but one has to play games with a manually tracked fixed point to maintain dynamic range. It's possible that (for at least some subset of the problem, once appropriately reconceptualized) 8-bit FP might be a better overall fit: higher per-op energy, yes, but easier to code because one doesn't have to track the position of the fixed point.
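As a rough illustration of what "manually tracking the fixed point" means, the sketch below multiplies two int8 fixed-point values. The helper names (quantize, fixed_mul) and the choice of fractional bits are hypothetical; real inference stacks use more elaborate scaling schemes. The point is only that the caller has to carry the position of the binary point alongside every value.

```python
def quantize(values, frac_bits):
    """Map floats to int8 with `frac_bits` fractional bits, saturating at the limits."""
    scale = 1 << frac_bits
    return [max(-128, min(127, round(v * scale))) for v in values]


def fixed_mul(a, b, frac_bits_a, frac_bits_b, frac_bits_out):
    """Multiply two fixed-point ints; the caller supplies every binary-point position."""
    product = a * b  # carries frac_bits_a + frac_bits_b fractional bits
    shift = frac_bits_a + frac_bits_b - frac_bits_out  # assumed to be >= 0
    return max(-128, min(127, product >> shift))


weights = quantize([0.5, -0.25], frac_bits=6)   # [32, -16]
activation = quantize([1.5], frac_bits=4)[0]    # 24
result = fixed_mul(weights[0], activation, 6, 4, 4)
print(result / (1 << 4))                        # 0.5 * 1.5 recovered as 0.75
```

Choosing too many fractional bits makes values saturate (3.0 with 6 fractional bits clips to 127), and too few throws away precision. That bookkeeping is what a floating-point format would do automatically, at the cost of a more complex ALU.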
mode_13h - Monday, April 2, 2018 - link
No. They also use up more die area than int, and if the benefits were that great, then we should've long ago seen a surge in games' demand for fp16. In fact, having to support denormals probably burns most of the die area savings vs. fp16.
The reality is that GPUs had fp32 in such abundance that games ditched fixed-point arithmetic long ago. In fact, it's in such low demand that GCN only bothered to implement a scalar integer ALU, as opposed to the 64-wide SIMD they have for fp32.
Looking towards the future, Intel has had 2xfp16 since Broadwell and AMD has it in Vega. If Nvidia brings it to their mainstream post-Pascal line, then we might actually see fp16 start to gain traction, in games. It just depends on how compute-bound they still are (fp16 load/store has been around for a while).
peevee - Tuesday, July 3, 2018 - link
fp16 is twice as energy efficient as fp32, and requires about half of the chip area for the same performance (or less, as multiplying 11-bit mantissas is way more than twice as cheap as multiplying 24-bit mantissas). That means enabling 1080p+ gaming on battery-powered, even small and light, laptops, just as it enabled phone and tablet games.
It is also more useful for AI for the same power+performance+cost reasons.
The whole fp32 thing was a strategic mistake made for 640x480 no-AA low-detail gaming.
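The "more than twice as cheap" remark about mantissa multipliers can be checked with a first-order estimate. The sketch below assumes a simple array multiplier whose cost grows with the number of partial-product bits (n times n for n-bit significands); it ignores exponent handling, normalization, rounding and wiring, so treat the ratio as a ballpark figure rather than a measured die-area number. The function name is hypothetical.

```python
def partial_product_bits(significand_bits):
    """First-order cost model: an n x n array multiplier forms n*n partial-product bits."""
    return significand_bits ** 2


fp16_cost = partial_product_bits(11)   # 10 stored mantissa bits + 1 implicit bit
fp32_cost = partial_product_bits(24)   # 23 stored mantissa bits + 1 implicit bit
ratio = fp32_cost / fp16_cost
print(fp16_cost, fp32_cost, round(ratio, 2))
```

Under this model the fp32 multiplier needs 576 partial-product bits against 121 for fp16, a ratio of a bit under 5 rather than 2.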
mode_13h - Monday, April 2, 2018 - link
And that first link you posted really drives home the point about how tricky float8 would be to actually use. You'd have epsilon issues all over the place!
Honestly, fp32 has enough pitfalls, for me. I've never used fp16, but contemplating fp8 is a wake-up call that even fp16 would have to be used with care. Especially if no denormals (as I imagine GPUs implement it).
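The epsilon concern is easy to reproduce for fp16 with nothing but the standard library. This sketch rounds through IEEE 754 half precision using the struct module's "e" format (available since Python 3.6); the helper name to_fp16 is hypothetical, and the loop emulates a naive fp16 accumulator rather than any particular GPU's behaviour.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Above 2048 the spacing between fp16 values is 2, so adding 1 is lost.
print(to_fp16(2048.0 + 1.0))

# A naive fp16 accumulator therefore stalls long before 3000.
total = 0.0
for _ in range(3000):
    total = to_fp16(total + 1.0)
print(total)

# Everyday decimal constants are also only approximated.
print(to_fp16(0.1))
```

The accumulator stops growing at 2048, which is why reduced-precision math is usually paired with a wider accumulator.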
peevee - Tuesday, July 3, 2018 - link
8-bit floats make no sense. 8-bit ENCODED floats have much better (as needed) precision and range, and all binary operations amount to simple direct reads from a 64kB table, using the 2 arguments as a 16-bit address into the table.
Flunk - Thursday, March 29, 2018 - link
Aquaman is a boring character, that's why Jason Momoa is nothing at all like the character. I like to think of him as the guy Aquaman hired to make him seem cool.
Holliday75 - Thursday, March 29, 2018 - link
LOL
syxbit - Thursday, March 29, 2018 - link
I wish they would tell us why they're not producing consumer Tegras.
I want a Shield portable, or an updated Shield TV.
Either they didn't sell many Shield devices, or they have some agreement with Nintendo to not use Tegra to compete against them (the Shield portable and Shield TV are kinda sorta competitors to the Switch).
S A - Thursday, March 29, 2018 - link
Well, I do think that Nvidia would need to make a new Tegra for a Nintendo Switch Pro or 2! But I think it would be custom made for the Switch only.
willis936 - Thursday, March 29, 2018 - link
They already made the new Tegra though. The X2 was out before the Switch came out (which uses the X1). A Switch hardware revision would almost certainly be an X2 and won't come out until the X2 earns a similar bargain bin status.
S A - Thursday, March 29, 2018 - link
Good point, but let's say the Nintendo Switch refresh comes in 2020 or 2021. The X2 will be over 3 years old and might not be available at that time, and don't forget that the PS5 and the Xbox 2 will be out in 2021! So they will have to use a custom chip, since Xavier has a TDP of 30W and the X2 will be outdated.
Ryan Smith - Thursday, March 29, 2018 - link
Note that Parker (X2) is not officially a Tegra. Tegra ended with TX1.
Thraktor - Friday, March 30, 2018 - link
Actually, it looks like Nintendo is getting a new SoC for a Switch refresh. The latest firmware update contained references to a T214 chip codenamed Mariko, where the TX1 is designated T210 and codenamed Erista, and the X2 is T186 and codenamed Parker. It's probably just a TX1 die-shrink with some unused hardware removed (I don't think Nintendo use the h.265 codec, for example), but interesting nonetheless.
Here's the link:
http://switchbrew.org/index.php?title=5.0.0
S A - Friday, March 30, 2018 - link
I think it's a sign of a Switch mini.
willis936 - Friday, March 30, 2018 - link
I’m aware. It’s hard to say whether it’s a new Nintendo SKU or security fixes. Nintendo has been going hard this generation on security compared to previous generations, but a new piece of silicon is not a cheap proposition. The details needed to come to a conclusion are in the Nintendo-Nvidia contract.
PeachNCream - Friday, March 30, 2018 - link
It'd be nice if Nintendo did drop in a new SoC. The hardware is okay as-is, but longer battery life and doing away with the need for active cooling would be welcome improvements.
Lolimaster - Friday, March 30, 2018 - link
What the Switch needs right now is a Maxwell Tegra shrunk from 20nm to 12nm. They could clock it up a bit for the games with some frame-rate instability, nothing more. It could easily increase battery life by 1 full hour.
S A - Friday, March 30, 2018 - link
But we already have a better option: the TX2 SoC. Look at what Nintendo does whenever it refreshes its handheld line-up: Game Boy to Game Boy Color (more powerful), Game Boy Advance to Game Boy Advance SP (added a front- or back-lit screen and a rechargeable battery), Nintendo DS to Nintendo DSi (more power, added the DSi Shop and cameras), and 3DS to New 3DS (went from a dual-core to a quad-core CPU, doubled the RAM from 128MB to 256MB and also added more VRAM). So the Switch Pro might use the TX2.
willis936 - Friday, March 30, 2018 - link
All signs point to (and Nintendo states) Nintendo treating the Switch like a home console from a business perspective. It isn’t expected to get the annual updates that their handhelds get.
S A - Friday, March 30, 2018 - link
Yes, but a survey done by Nintendo shows that people mostly play the Switch in handheld mode and not in docked mode.
S A - Friday, March 30, 2018 - link
Btw, the N64 had that RAM expansion, and there is also a possibility that an SCD with a GTX 1060 might come in 2021.
cfenton - Thursday, March 29, 2018 - link
'Because money' is the short answer. They can sell these things to car companies for much more than they can sell them to people buying Shield products.
There's also little reason to upgrade the Shield TV. It already does 4K video like a champ and gaming on it has always been a sad sideshow with very limited support. A new Shield Portable makes more sense, but again, they've never really been able to get developers on board and it can't just be a device to play Android games since everyone's phone can already do that.
nico_mach - Friday, March 30, 2018 - link
Why aren't Tegras available for the chromebook/new chromepad market? I understand their own tablet failed, but there's lots of chromebooks being sold that could use a little more oomph. Samsung uses lots of other people's chips, too. Android needs some more strategic partnerships outside of frenemies Samsung and Google.
0ldman79 - Thursday, March 29, 2018 - link
You must have missed the story where he cuts off his own hand with a rock so he could free himself to save his son and kill his brother...
TheReason8286 - Thursday, March 29, 2018 - link
wow Orin is my actual name.. However I'm not a big fan of Nvidia lol. /facepalm
bug77 - Friday, March 30, 2018 - link
Not that it helps you any, but while I've been buying their hardware for years, I wouldn't say I'm a fan either. Just like I'm not a fan of pretty much any company. I buy from those that offer what I need at a given time, knowing full well that in a few years the situation can be turned on its head (e.g. at some point my system sported an AMD CPU and an ATI video card; if I were to update today, instead of Intel I might go with AMD).
mode_13h - Monday, April 2, 2018 - link
Cool story, bug.
evilpaul666 - Thursday, March 29, 2018 - link
Bury the lead, much? Did anyone click other than to find out who an "Orin" is?
dromoxen - Thursday, March 29, 2018 - link
orin is an anagram of iron .. I never liked aquaman .. something fishy about him... Aquawoman , however, is a completely different kettle of fish.
vailr - Thursday, March 29, 2018 - link
It's spelled "Xavier", not "Xaiver"
patrickjp93 - Friday, March 30, 2018 - link
I just want AWS and Google Cloud to offer ARM-based Linux instances for super cheap micro service instances, including Lambda on ARM.
LinuxDevice - Friday, March 30, 2018 - link
The most recent Tegra is not the TX1, it is the TX2. The TX1 did not have the Denver cores mentioned in the Parker series ARM table...this is the TX2.
The TX2 (Parker) is actively developed.
eastcoast_pete - Thursday, April 5, 2018 - link
I know I am a bit late to the party, but...
Could "Orin" be the chip for the upcoming PS 5 from SONY? Right now, MS has bested the PS 4 Pro at least in sheer numbers (CUs, Tflops, memory), and while there are distinct features (how upscaling is handled etc.), the bottom line is that customers (i.e. us) can right now compare apples to apples (AMD custom core to AMD custom core). So, why think of NVIDIA? I would be surprised if the next generation of consoles will NOT have some AI/neural networking/machine learning circuitry in them, if only so one can brag about it when those new machines launch. Plus, AI circuitry just makes a lot of sense for gaming, at least for me. NVIDIA is a heavyweight in this area, and a contract to supply tens of millions of custom chips for SONY's new PS is the kind of order even chipzillas like NVIDIA will find tempting. Plus, unlike self-driving cars, consoles haven't killed anybody yet, so there is that upside, too.
So, has anybody heard any rumors of SONY hiring a larger number of programmers with ARM and CUDA experience?
mode_13h - Saturday, April 7, 2018 - link
No. Orin only beats it in Tensor FLOPS.
epobirs - Monday, April 9, 2018 - link
It's extremely unlikely Sony or Microsoft will move away from x86-64 architecture for their next generation. Such changes in the past happened when there was a lack of a good upgrade path.
Microsoft had an untenable situation with the way the original Xbox hardware was sourced; PowerPC had become a dead-end by the time a 360 successor needed to get underway.
Over at Sony, the PS3 had the dual problems of PowerPC and the CELL use of it, which never delivered on the absurd claims Sony, Toshiba, and IBM made for it back when it was first announced. The process of getting from the original plan to what became the PS3 design (Nvidia's GPU was added very late in the game, so much so that no actual PS3 existed for its E3 premiere, so all the demos were either CELL or Nvidia on PC) was more than a little painful and expensive.
I'd be very surprised if the partnership with AMD didn't continue. They have a strong upgrade path to offer in the combination of Ryzen and Vega. The recently announced APU products are very likely the early basis of what the console makers will request from AMD. There will likely be some customization to mitigate compatibility issues, among other things. At least 8/16 cores/threads, at least 16GB RAM, etc. A pretty natural progression that is a substantial jump from their current high end models.
Another question is whether the PS4 Pro and Xbox One X can see enough cost reduction to keep them going for a while as the low end models.
mode_13h - Monday, April 9, 2018 - link
I feel like Cell eventually delivered, though it was a steep learning curve and depended much on 2nd and 3rd party libraries optimized for it. The difference between early and late PS3 games is night and day.
The raw compute power of Cell was just monstrous, at the PS3's launch. Sure, it came at the expense of programmability, but it truly leapfrogged anything in the PC world, at the time. The console world hasn't seen anything like it, since. And probably never will.