
Some info on the next-generation graphics chip from ATi


Has there been any indication of whether the pipelines will have a different instruction ordering, or different/more/fewer instructions compared to C1?


I assume you are referring to this:

If we just consider a single one of the arrays for the time being - with 16 ALU's available this means that on every cycle it is processing a maximum of either 16 vertices or four 2x2 pixel quads. However, as there is no pipelining from one set of ALU's to the next, the ALU array will need to first process the first shader instruction, then go back and process the second shader instruction. For cases where there is a direct data dependency (i.e. the first instruction says A + B = C and the resultant value for C is used in the next instruction), there must be some way of making sure that C is available in time for the second instruction to execute.

When the ALU's move from one instruction to the next, there is an inherent latency (this is the amount of pipeline clocks it takes to execute the first instruction).

 

But further on they also write:

The Xenos shader contains a large number of independent groups of pixels and vertices (threads) which are 16 wide. In order to hide the latency of an instruction for a given thread, a number of other threads are used to "fill in the gaps". By doing this, the ALU's are fully utilized all the time, and the shader can have direct data dependency on every instruction and still run full rate. Xenos has a very large number of these independent threads ready to process, so there are always enough independent instructions to execute such that the ALU's are fully utilized. Each of these different threads can be executing a different shader, can be at different places within the same shader, can be pixels or vertices, etc.

 

I would think this is such a fundamental part of ATi's unified shader architecture that we won't see any changes here for R600.
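
To make the latency-hiding idea from the quoted passages a bit more concrete, here is a rough toy model in Python. It is only a sketch under my own assumptions, not how Xenos actually schedules work: one ALU array issues at most one instruction per cycle, every instruction in a thread depends on the previous one, and a thread can only issue again after a fixed pipeline latency. The function name `alu_utilization`, the round-robin policy and the latency of 8 cycles are all made up for illustration.

```python
def alu_utilization(num_threads, instrs_per_thread, latency):
    """Toy model of hiding instruction latency with many threads.

    Assumptions (hypothetical, not Xenos specifics):
    - the ALU array issues at most one instruction per cycle;
    - each thread is a chain of fully dependent instructions, so it
      may only issue again `latency` cycles after its previous issue;
    - ready threads are picked round-robin.
    Returns issued instructions divided by cycles spent issuing.
    """
    remaining = [instrs_per_thread] * num_threads
    ready_at = [0] * num_threads  # first cycle at which each thread may issue
    total = num_threads * instrs_per_thread
    issued = 0
    cycle = 0
    next_thread = 0
    while issued < total:
        # Look for a ready thread, starting after the last one that issued.
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if remaining[t] > 0 and ready_at[t] <= cycle:
                remaining[t] -= 1
                ready_at[t] = cycle + latency
                issued += 1
                next_thread = (t + 1) % num_threads
                break
        # If no thread was ready, this cycle is a bubble (idle ALUs).
        cycle += 1
    return issued / cycle


if __name__ == "__main__":
    for threads in (1, 2, 4, 8, 16):
        print(threads, "threads:", round(alu_utilization(threads, 100, 8), 3))
```

In this model a single thread keeps the ALUs busy only about one cycle in every `latency` cycles, while having at least as many ready threads as the latency keeps them busy every cycle, which is the "fill in the gaps" behaviour the article describes.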


Yes. If we take C1 (aka Xenos, aka the Xbox 360's GPU), remove the eDRAM, and add the ring-bus memory controller, more ROPs, more shader ALUs, possibly more TMUs, and full D3D10 support, we essentially have R600.

Edited by MistaPi

Yes, plus an AVIVO 2 or something like that...

 

This will be a fairly new architecture, so writing drivers will be quite a bit of work. Strictly speaking, they should probably also take the opportunity to write the OpenGL driver for R600 from scratch.


 

It won't really be that new for ATi. To my eyes this looks more or less like an upgrade of C1/Xenos rather than a new architecture.


Plus, they ran into a problem that meant the chips could not be clocked higher than 500MHz. It took them several months to solve it. In the end they found that some connections needed an extra metal layer, and then the problem was solved.

According to public reports ATI noticed that as late as July, issues occurred that prevented the R520 core being clocked close to its target speeds, which is consistent with leakage issues. Curiously, the issue did not occur across all their 90nm products - ATI had already delivered Xenos to Microsoft using the same 90nm process R520 does, and other derivatives of the R520 line suffered the same issue (RV530) but others did not (RV515) - the fact R520 and RV530 share the same memory bus, while RV515 and Xenos have different memory busses is not likely to be coincidental in this case. ATI were open about talking about the issue they faced bringing up R520, sometimes describing the issue in such detail that only Electronic Engineers are likely to understand, however their primary issue when trying to track it down was that it wasn't a consistent failure - it was almost random in its appearance, causing boards to fail in different cases at different times, the only consistent element being that it occurs at high clockspeeds. Although, publicly, ATI representatives wouldn't lay blame on exactly were the issue existed, quietly some will point out that when the issue was eventually traced it had occurred not in any of ATI's logic cells, but instead in a piece of "off-the-shelf" third party IP whose 90nm library was not correct. Once the issue was actually traced, after nearly 6 months of attacking numerous points where they felt the problems could have occurred, it took them less than an hour to resolve in the design, requiring only a contact and metal change, and once back from the fab with the fix in place stable, yield-able clockspeeds jumped in the order of 160MHz.