
The GPUs in the PS3 and Xbox 360.



You should all read this article:

http://anandtech.com/tradeshows/showdoc.aspx?i=2423

ATI has been working on the Xbox 360 GPU for approximately two years, and it has been developed independently of any PC GPU. So despite what you may have heard elsewhere, the Xbox 360 GPU is not based on ATI's R5xx architecture.

 

Unlike any of their current-gen desktop GPUs, the 360 GPU supports FP32 from start to finish (as opposed to the current FP24 spec that ATI has implemented). Full FP32 support puts this aspect of the 360 GPU on par with NVIDIA's RSX.

 

ATI was very light on details of their pipeline implementation on the 360's GPU, but we were able to get some more clarification on some items. Each of the 48 shader pipelines is able to process two shader operations per cycle (one scalar and one vector), offering a total of 96 shader ops per cycle across the entire array. Remember that because the GPU implements a Unified Shader Architecture, each of these pipelines features execution units that can operate on either pixel or vertex shader instructions.
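The 96 figure is just the pipeline count times the per-pipeline issue rate. A minimal sketch of that arithmetic, using only the numbers quoted above; the function name is made up for illustration, and the result is a peak issue count per cycle, not delivered performance:

```python
def peak_shader_ops_per_cycle(pipelines: int, ops_per_pipeline: int) -> int:
    """Peak shader operations issued per clock across the whole array."""
    return pipelines * ops_per_pipeline


# Figures quoted in the article: 48 unified pipelines, each issuing
# one scalar and one vector operation per cycle.
xbox360_ops = peak_shader_ops_per_cycle(pipelines=48, ops_per_pipeline=2)
print(xbox360_ops)  # 96
```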

 

Both consoles are built on a 90nm process, and thus ATI's GPU is also built on a 90nm process at TSMC. ATI isn't talking transistor counts just yet, but given that the chip has a full 10MB of DRAM on it, we'd expect the chip to be fairly large.

 

One thing that ATI did shed some light on is that the Xbox 360 GPU is actually a multi-die design, referring to it as a parent-daughter die relationship. Because the GPU's die is so big, ATI had to split it into two separate die on the same package - connected by a "very wide" bus operating at 2GHz.

 

The daughter die is where the 10MB of embedded DRAM resides, but there is also a great deal of logic on the daughter die alongside the memory. The daughter die features 192 floating point units that are responsible for a lot of the work in sampling for AA among other things.

 

Remember the 256GB/s bandwidth figure from earlier? It turns out that that's not how much bandwidth is between the parent and daughter die, but rather the bandwidth available to this array of 192 floating point units on the daughter die itself. Clever use of words, no?

 

Because of the extremely large amount of bandwidth available both between the parent and daughter die as well as between the embedded DRAM and its FPUs, multi-sample AA is essentially free at 720p and 1080p in the Xbox 360. If you're wondering why Microsoft is insisting that all games will have AA enabled, this is why.

 

ATI did clarify that although Microsoft isn't targeting 1080p (1920 x 1080) as a resolution for games, their GPU would be able to handle the resolution with 4X AA enabled at no performance penalty.

 

ATI has also implemented a number of intelligent algorithms on the daughter die to handle situations where you need more memory than the 10MB of DRAM on-die. The daughter die has the ability to split the frame into two sections if the frame itself can't fit into the embedded memory. A z-pass is done to determine the location of all of the pixels of the screen and the daughter die then fetches only what is going to be a part of the scene that is being drawn at that particular time.
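To get a feel for when a frame stops fitting into the 10MB, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not in the article: 4 bytes of color and 4 bytes of depth per sample, every AA sample stored separately, and "10MB" read as 10 * 1024 * 1024 bytes. The function names are hypothetical, and the real hardware may store and split frames differently.

```python
import math

# ASSUMPTION: "10MB" is taken as 10 * 1024 * 1024 bytes.
EDRAM_BYTES = 10 * 1024 * 1024


def framebuffer_bytes(width, height, aa_samples, color_bytes=4, depth_bytes=4):
    """Estimated size of a multisampled color + depth buffer.

    ASSUMPTION: each AA sample keeps its own color and depth value,
    4 bytes each by default.
    """
    return width * height * aa_samples * (color_bytes + depth_bytes)


def sections_needed(width, height, aa_samples):
    """How many sections the frame must be split into to fit the memory."""
    return math.ceil(framebuffer_bytes(width, height, aa_samples) / EDRAM_BYTES)


for samples in (1, 2, 4):
    print(samples, framebuffer_bytes(1280, 720, samples),
          sections_needed(1280, 720, samples))
```

Under these assumptions a 720p frame fits in one piece without AA, needs two sections with 2X AA and three with 4X AA.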

 

On the physical side, unlike ATI's Flipper GPU in the Gamecube, the 360 GPU does not use 1T-SRAM for its on-die memory. The memory on-die is actually DRAM. By using regular DRAM on-die, latencies are higher than SRAM or 1T-SRAM but costs should be kept to a minimum thanks to a smaller die than either of the aforementioned technologies.

 

Remember that in addition to functioning as a GPU, ATI's chip must also function as a memory controller for the 3-core PPC CPU in the Xbox 360. The memory controller services both the GPU and the CPU's needs, and as we mentioned before the controller is 256-bits wide and interfaces to 512MB of unified GDDR3 memory running at 700MHz. The memory controller resides on the parent die.

 

As we mentioned before, NVIDIA's RSX is the more PC-like of the two GPU solutions. Unlike ATI's offering, the RSX is based on an NVIDIA GPU, the upcoming G70 (the successor to the GeForce 6).

 

The RSX is a 90nm GPU weighing in at over 300 million transistors and fabbed by Sony at two plants, their Nagasaki plant and their joint fab with Toshiba.

 

The RSX follows a more conventional dataflow, with discrete pixel and vertex shader units. Sony has yet to announce the exact number of pixel and vertex shader units, potentially because that number may change as time goes by depending on yields. This time around Sony seems to be very careful not to let too many specs out that are subject to change to avoid any sort of backlash as they did back with the PS2. Given the transistor count and 90nm process, you can definitely expect the RSX to feature more than the 16 pipes of the present day GeForce 6800 Ultra. As for how many, we'll have to wait for Sony on that.

 

NVIDIA confirmed that the RSX features full FP32 support, like the current generation GeForce 6 as well as ATI's Xbox 360 GPU. NVIDIA did announce that the RSX would be able to execute 136 shader operations per cycle, a number that is greater than ATI's announced 96 shader ops per cycle. Given that we don't know anything more about where NVIDIA derived this value from, we can't be certain if we are able to make a direct comparison to ATI's 96 shader ops per cycle.

 

Given that the RSX is based off of NVIDIA's G70 architecture, you can expect to have a similar feature set later this year on the PC. In fact, NVIDIA stated that by the time PS3 ships there will be a more powerful GPU available on the desktop. This is in stark contrast to ATI's stance that a number of the features of the Xbox 360 GPU won't make it to the desktop for a matter of years (potentially unified shader architecture), while others will never be seen on the desktop (embedded DRAM?).

 

There will definitely be some differences between the RSX GPU and future PC GPUs, for a couple of reasons:

 

    1) NVIDIA stated that they had never had as powerful a CPU as Cell, and thus the RSX GPU has to be able to swallow a much larger command stream than any of the PC GPUs as current generation CPUs are pretty bad at keeping the GPU fed.

 

    2) The RSX GPU has a 35GB/s link to the CPU, much greater than any desktop GPU, and thus the turbo cache architecture needs to be reworked quite a bit for the console GPU to take better advantage of the plethora of bandwidth. Functional unit latencies must be adjusted, buffer sizes have to be changed, etc...

 

We did ask NVIDIA about technology like unified shader model or embedded DRAM. Their stance continues to be that at every GPU generation they design and test features like unified shader model, embedded DRAM, RDRAM, tiling rendering architectures, etc... and evaluate their usefulness. They have apparently done a unified shader model design and the performance just didn't make sense for their architecture.

 

NVIDIA isn't saying that a unified shader architecture doesn't make sense, but at this point in time, for NVIDIA GPUs, it isn't the best call. From NVIDIA's standpoint, a unified shader architecture offers higher peak performance (e.g. all pixel instructions, or all vertex instructions) but getting good performance in more balanced scenarios is more difficult. The other issue is that the instruction mix for pixel and vertex shaders is very different, so the optimal functional units required for each are going to be different. The final issue is that a unified shader architecture, from NVIDIA's standpoint, requires a much more complex design, which will in turn increase die area.
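The peak-performance part of that argument can be illustrated with a toy model. Everything in it is an assumption made for illustration: the unit counts (24 units, split 16 pixel + 8 vertex in the discrete case) are invented, every operation is treated as costing one unit-cycle, and scheduling overhead is ignored. It is not a model of any real NVIDIA or ATI part.

```python
import math


def cycles_discrete(pixel_ops, vertex_ops, pixel_units, vertex_units):
    """Cycles needed when each pool can only run its own kind of work.

    The slower of the two pools determines the total.
    """
    return max(math.ceil(pixel_ops / pixel_units),
               math.ceil(vertex_ops / vertex_units))


def cycles_unified(pixel_ops, vertex_ops, units):
    """Cycles needed when every unit can run either kind of work."""
    return math.ceil((pixel_ops + vertex_ops) / units)


# Hypothetical workloads of 960 operations in total.
all_pixel = (960, 0)
balanced = (640, 320)  # matches the 16:8 split of the discrete design

for name, (p, v) in (("all pixel", all_pixel), ("balanced", balanced)):
    print(name, cycles_discrete(p, v, 16, 8), cycles_unified(p, v, 24))
```

In this toy model the unified array finishes the all-pixel workload in 40 cycles against 60 for the split design, and the two tie at 40 cycles on the balanced workload. The model deliberately leaves out the costs NVIDIA points to (more complex scheduling, functional units that are not ideal for either instruction mix), which is where the balanced case could tip the other way.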

 

NVIDIA stated that they will eventually do a unified shader GPU, but before then there are a number of other GPU enhancements that they are looking to implement. Potentially things like a programmable ROP, programmable rasterization, programmable texturing, etc...

And for those who still haven't understood how Sony's and Microsoft's PR departments FOOL gullible forum users (from Wikipedia):
FLOPS in isolation are arguably not very useful as a benchmark for modern computers. There are many other factors in computer performance other than raw floating-point computation speed, such as interprocessor communication, cache coherence, and the memory hierarchy.

 

For ordinary (non-scientific) applications, integer operations (measured in MIPS) are far more common. Measuring floating point operation speed, therefore, does not predict accurately how the processor will perform on just any problem.

Don't let yourselves be fooled by advertising. Hold off on claims about who crushes whom until you have the knowledge to back them up.
I saw video from the Xbox 360 and the PS3 today. From what I saw, the PS3 had better graphics.

MS could surely have shown off tech demos that are at least as impressive.

It is also hard to judge performance from video, even when it is claimed to be rendered in real time. After all, what is the definition of rendering in real time? They obviously don't need to render more than a small part of the screen, a small part of the graphics, while the vast majority of the objects are pre-rendered, which lets them call it real time without telling an outright lie.


I think we should be very careful about comparing these two until we get to see 100% genuine in-game videos in proper quality. And I find it a bit odd that Sony and MS haven't put any SERIOUSLY good teasers online, with in-game videos in HIGH resolution on their official sites, so they can build the hype that they both perhaps need. We have seen plenty of pre-rendered videos, and to be blunt, I have seen some really good ones on the original Xbox too...

 

But in any case, I think buying both consoles could be a smart move :) Then you're covered either way.
