How many pixels can G80 push?
One of the things traditionally used to describe the pixel pipeline is the number of pixels a chip can render in a single clock. With programmable units, the traditional pipeline died out, but many hacks out there still lean on this inaccurate description.
To cut a long story short, on the pixel-rendering side, the G80 can render the same number of pixels per clock as the G70 (7800) and G71 (7900) chips.
The G80 chip in its full configuration comes with six Raster Operation Partitions (ROPs), each of which can render four pixels. So, the 8800GTX can churn out 24, and the 8800GTS can push 20 pixels per clock. However, these are complete pixels. If only Z-processing is used, you can expect a massive 192 pixels per clock with one sample per pixel. If 4x FSAA is used, that number drops to 48 pixels per clock.
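For the arithmetic-minded, here is a minimal sketch of where such figures come from. The 192 Z-samples-per-clock number is taken from the text above; the 575MHz GTX clock is quoted later in this article, and treating fill rate as partitions times pixels times clock is our own simplification, not an Nvidia statement.

```python
# Rough fill-rate arithmetic for G80-based boards (illustrative only).

def colour_fill_rate(rop_partitions, pixels_per_partition, clock_hz):
    """Complete pixels per second: ROP partitions x pixels each x clock."""
    return rop_partitions * pixels_per_partition * clock_hz

def z_only_pixels_per_clock(z_samples_per_clock, samples_per_pixel):
    """Z-only throughput in pixels per clock for a given FSAA level."""
    return z_samples_per_clock // samples_per_pixel

# 8800GTX: 6 partitions x 4 pixels x 575MHz -> ~13.8 billion pixels/s
print(colour_fill_rate(6, 4, 575e6) / 1e9)

# Z-only: 192 samples per clock -> 192 pixels at 1 sample, 48 pixels at 4x FSAA
print(z_only_pixels_per_clock(192, 1))   # 192
print(z_only_pixels_per_clock(192, 4))   # 48
```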
For game developers, the important information is that eight MRTs (Multiple Render Targets) can be utilised, the ROPs support frame buffer blending of FP16 and FP32 render targets, and every type of frame buffer surface can be used with FSAA and HDR.
If you are not a game developer, the sentence above means that Nvidia now supports FP32 blending, which was not possible in the past, and that the FSAA/HDR combination is supported by default. In fact, 16xAA and 128-bit HDR can be used at the same time.
Lumenex Engine - New FSAA and HDR explained
ROPs are also in charge of anti-aliasing, which remains very similar to the GeForce 7 series, albeit with quality adjustments. The G80 chip supports multi-sampling (MSAA), supersampling (SSAA) and transparency adaptive anti-aliasing (TAA). The four new single-GPU modes are 8x, 8xQ, 16x and 16xQ. Of course, you can't expect to have enough horsepower to run the latest games with 16xQ enabled on a single 8800GTX, right?
Wrong. In certain games you can buy today, you can enjoy full 16xQ with the performance of regular 4xAA. The reason is precisely the difference between those 192 and 48 pixels in a single clock. But in games which aren't able to utilise the 16x and 16xQ optimisations, you're far better off with lower anti-aliasing settings.
Nvidia now calls this mode "Application Enhanced", joining the two old scoundrels "Application Override" and "Application Controlled". Only "App Enhanced" is new, and the idea is probably that the application talks to Nvidia's driver in order to decide which parts of a scene get the AA treatment and which do not. Can you say... partial AA?
Now, where did we hear that one before... ah, yes. EAA on the Rendition Vérité in the late '90s and the Matrox Parhelia in the early 21st century.
On the HDR (High Dynamic Range) side, Nvidia has designed the feature around the OpenEXR spec, offering 128-bit precision (32-bit FP per component, across the Red, Green, Blue and Alpha channels) instead of today's 64-bit version. Nvidia is calling its new feature True HDR, although you can bet your arse this isn't the last feature that vendors will call "true". Can't wait for "True AA", "True AF" and so on...
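To put the precision jump in perspective, here is a minimal sketch of what a 128-bit FP32 render target costs in memory compared with a 64-bit FP16 one. This is our own back-of-the-envelope arithmetic, not an Nvidia figure, and the 1600x1200 resolution is just an example.

```python
# Frame-buffer footprint of HDR render targets (illustrative arithmetic only).

def render_target_mb(width, height, bits_per_component, components=4, samples=1):
    """Size of a colour surface in megabytes."""
    bytes_per_pixel = components * bits_per_component // 8
    return width * height * bytes_per_pixel * samples / (1024 ** 2)

# 64-bit FP16 HDR (today's format) at 1600x1200, no AA
print(render_target_mb(1600, 1200, 16))              # ~14.6 MB

# 128-bit FP32 "True HDR" at 1600x1200, no AA
print(render_target_mb(1600, 1200, 32))              # ~29.3 MB

# ... and with 4x multisampling, where each pixel stores four samples
print(render_target_mb(1600, 1200, 32, samples=4))   # ~117 MB
```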
Anisotropic filtering has been raised in quality to match ATI's X1K marchitecture, so Nvidia now offers angle-independent aniso filtering as well, thus killing the shimmering effect which was so annoying in numerous battles in Alterac Valley (World of WarCraft), Spywarefied (pardon, BattleField), Enemy Territory and many more. Compared to the smoothness of the GeForce 8 series, the GeForce 7 looks like it was in the stone age. Expect interesting screenshots of D3D AF-Tester Ver 1.1 in many GF8 reviews on the 8th.
Oh yeah, you can use AA in conjunction with both high-quality AF and 128-bit HDR. The external I/O chip now offers a 10-bit DAC and supports over a billion colours, unlike the 16.7 million in previous GeForce marchitectures.
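Where do those colour counts come from? A quick calculation, assuming three colour channels per output pixel, which is our simplification of how the DAC precision maps to displayable colours:

```python
# Displayable colours for 8-bit-per-channel vs 10-bit-per-channel output.

def displayable_colours(bits_per_channel, channels=3):
    return (2 ** bits_per_channel) ** channels

print(displayable_colours(8))    # 16,777,216   -> the familiar 16.7 million
print(displayable_colours(10))   # 1,073,741,824 -> "over a billion"
```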
Quantum Effects
Since PhysX failed to take off in a spectacular manner, DAAMIT's Menage-a-Trois and Nvidia's SLI Physics used Havok to run simpler physics computation on their respective GPUs. Quantum Effects should take things to a more professional (usable) level, with hardware calculation of effects such as smoke, fire and explosions added to the mix of rigid-body physics, particle effects, fluids, cloth and many more things that should make their way into the games of tomorrow.
GeForce 8800GTX
Developed under the codename P355, the 8800GTX is Nvidia's flagship implementation. It features a fully fledged G80 chip clocked at 575MHz. Inside the GPU there are 128 scalar Shader units clocked at 1.35GHz, and raw Shader power is around 520GFLOPS. So, if anyone starts to talk about teraflops on a single GPU, we can tell you that we're still around a year away from that number becoming reality. Before the G90 and R700 arrive, such claims come from marketing alone.
The 768MB of Samsung memory is clocked at 900MHz DDR, or 1,800 MegaTransfers (1.8GHz), yielding a commanding 86.4GB/s of memory bandwidth.
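If you want to check the maths, a minimal sketch of how those two headline numbers fall out of the specs quoted above. The three-flops-per-unit-per-clock figure (a MADD plus a MUL per scalar unit) is our assumption for the purpose of the arithmetic, not an official Nvidia statement.

```python
# Back-of-the-envelope shader throughput and memory bandwidth for the 8800GTX.

def shader_gflops(units, clock_hz, flops_per_unit_per_clock=3):
    """Assumes a MADD (2 flops) plus a MUL per clock per scalar unit."""
    return units * clock_hz * flops_per_unit_per_clock / 1e9

def memory_bandwidth_gbs(bus_width_bits, transfers_per_second):
    """Bus width in bytes times transfer rate."""
    return bus_width_bits / 8 * transfers_per_second / 1e9

print(shader_gflops(128, 1.35e9))        # ~518 GFLOPS, i.e. "around 520"
print(memory_bandwidth_gbs(384, 1.8e9))  # 86.4 GB/s
```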
The PCB is a massive 10.3 inches, or roughly 27 centimetres, long, and on top of it there are a couple of new things. First of all, there are two power connectors, and secondly, the GTX features two new SLI MIO connectors. Their usage is "TBA" (To Be Announced), but we can tell you that this is not the only 8800 you will be seeing on the market. Connectors are two dual-link DVIs and one 7-pin HDTV out. HDMI 1.3 support is here from day one, but we don't think you'll be seeing too many 8800GTX boards with an HDMI connection.
The cooling is not a water/air hybrid, but a more manufacturer-friendly aluminium design with a copper heat pipe. The cooler is expected to be silent as a grave, and several AIBs are planning a more powerful version for a second-generation 8800GTX, expected to be overclocked to 600MHz for the GPU and 1GHz DDR for the memory.
The board's recommended price has changed a couple of times and now stands at 599 dollars/euros, or 399 pounds. However, due to the expected massive shortage, expect these prices to hit stratospheric levels.
GeForce 8800GTS
Codenamed P356, the 8800GTS is the smaller brother of the GTX. The G80 chip is the same as on the GTX, but the amount of wiring has been cut, so you get a 320-bit memory controller instead of 384-bit, 96 Shader units instead of 128, and 20 pixels per clock instead of 24.
The board itself is long, but comes with a simpler layout than the GTX one. Dual-link DVI and 7-pin HDTV out come by default. "Only" one 6-pin PEG connector is used, and the power-supply requirements are lighter on the wallet.
The clocks have been set at 500MHz for the GPU and 1.2GHz for the Shader units, while the 640MB of memory has been clocked down to 800MHz DDR, or 1,600 MegaTransfers (1.6GHz), yielding a bandwidth of 64GB/s. Both pixel and texel fill-rates also fall by a significant margin, to 24 billion pixels and 16 to 32 billion texels.
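The same bandwidth arithmetic as for the GTX, applied to the GTS's narrower bus and slower memory quoted above:

```python
# Memory bandwidth for the 8800GTS: 320-bit bus, 1.6 GT/s memory.

def memory_bandwidth_gbs(bus_width_bits, transfers_per_second):
    """Bus width in bytes times transfer rate."""
    return bus_width_bits / 8 * transfers_per_second / 1e9

print(memory_bandwidth_gbs(320, 1.6e9))   # 64.0 GB/s
```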
Recommended price is 399 dollars/euros, but who are we kidding? Expect a price at least 100 dollars/euros higher.
Performance is CPU Bound
Yes, you read that correctly. Both the GTS and GTX max out the CPUs of today, and even Kentsfield and the upcoming 4x4 will not have enough CPU power to max out the graphics card; the G80 chip just eats up all the processing power a CPU can provide to it.
Having said this, expect fireworks with AMD's 4x4 platform once true quad-core FX chips become available.