ATI Radeon HD 4800: Newest Graphics Processor from ATI
The RV670 (Radeon HD 38x0 series) received a new number in its series name even though it differed only slightly from the previous R600 core (Radeon HD 2900). The RV770 has a better claim to a new number: it is indeed a new product, even if it inherits many traits from its predecessors. The new series is called ATI Radeon HD 4800, and it keeps the naming system first used for the Radeon HD 3800 series. The first numeral denotes the graphics architecture generation, the second stands for the family, and the last two identify the specific graphics card model.
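The naming scheme described above can be illustrated with a short Python sketch. The function and type names here (parse_radeon_hd, RadeonModel) are hypothetical and introduced only for illustration; the sketch simply splits a four-digit model number the way the text describes.

```python
from typing import NamedTuple


class RadeonModel(NamedTuple):
    """Hypothetical container for the three parts of a model number."""
    generation: int  # first numeral: graphics architecture generation
    family: int      # second numeral: family within the generation
    model: int       # last two numerals: specific card model


def parse_radeon_hd(number: str) -> RadeonModel:
    """Split a four-digit Radeon HD model number such as '4870'."""
    digits = number.strip()
    if len(digits) != 4 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected a four-digit model number, got {number!r}")
    return RadeonModel(int(digits[0]), int(digits[1]), int(digits[2:]))


if __name__ == "__main__":
    for name in ("3870", "4850", "4870"):
        print(name, parse_radeon_hd(name))
```

Under this reading, the Radeon HD 4850 and 4870 share the generation (4) and family (8) and differ only in the model part (50 versus 70).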
Two models were revealed when the new graphics architecture was announced: the ATI Radeon HD 4850 with a recommended price of $199, and the ATI Radeon HD 4870 with high-speed GDDR5 memory and a recommended price of $299.
The RV770 core incorporates an impressive 956 million transistors. Nvidia’s GT200 incorporates even more, 1.4 billion, but this is hardly an achievement for Nvidia because its GPU is manufactured on the less advanced 65nm process. Given the huge size and complexity of that GPU, fewer chips fit on one wafer and the chip yield is lower, so the manufacturing cost of each chip is high. Nvidia has been following this approach for the last few years. GT200-based cards are therefore unlikely to get much cheaper, unlike ATI’s new RV770-based solutions, so ATI’s approach seems to work better here.
The GPU clock rates have been lowered considerably in comparison with the RV670-based cards because of the higher complexity of the new core. This shouldn’t be a problem, as the new chip offers high computing and texture-mapping capacity. Another noteworthy point is that the fast GDDR5 memory installed on the senior Radeon HD 4800 model helps achieve high bandwidth without widening the memory interface, as ATI did last year and Nvidia does today. When the memory bus is wider than the traditional 256 bits, the PCB becomes more complex and more expensive to make. Of course, GDDR5 is more expensive than GDDR3, but this difference seems to be offset by the simpler PCB design. This is indicated by the relatively low recommended price of the Radeon HD 4870: it is only $299, i.e. $100 lower than the recommended price of the Nvidia GeForce GTX 260 and $130 lower than the price of the GeForce GTX 280. There is only one potential problem with the memory: GDDR5 is not yet widespread, so graphics cards may be in short supply if the required memory chips are scarce.
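The trade-off between a wider bus and faster memory follows from simple arithmetic: peak bandwidth is the bus width in bytes multiplied by the effective transfer rate. The Python sketch below illustrates this. The transfer rates used (2000 MT/s for a GDDR3-class part, 3600 MT/s for a GDDR5-class part) are illustrative assumptions, not figures taken from this article, and the function name is hypothetical.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_mtps: float) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits  -- width of the memory interface in bits
    data_rate_mtps  -- effective transfer rate in millions of transfers/s
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mtps * 1e6 / 1e9


if __name__ == "__main__":
    # All data rates below are assumed values for illustration only.
    configs = [
        ("256-bit, GDDR3-class at 2000 MT/s", 256, 2000),
        ("512-bit, GDDR3-class at 2000 MT/s", 512, 2000),
        ("256-bit, GDDR5-class at 3600 MT/s", 256, 3600),
    ]
    for label, width, rate in configs:
        print(f"{label}: {memory_bandwidth_gbs(width, rate):.1f} GB/s")
```

With these assumed rates, a 256-bit bus with the faster memory comes close to a 512-bit bus with the slower memory, which is the effect the paragraph above describes.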
The specifications table shows that the RV770 is stronger in both computing and texture-mapping capacity. The higher performance of the texture processors is the most important innovation because they used to be the weak spot of the RV670. Now that there are 40 of them instead of 16, the TMUs should no longer be a bottleneck even if their architecture hasn’t changed. For comparison, the Nvidia GT200 core incorporates 80 TMUs in its maximum configuration, but their performance in real-life applications is only half of the potential. So, the RV770 should be equal in this respect, even though the Radeon HD 4870 is not actually positioned as an opponent to the Nvidia GeForce GTX 280/260. We’ll check this out in our practical tests soon.
The superscalar Radeon HD architecture is known to be sensitive to shader code optimizations, despite the special-purpose task dispatcher available in every GPU of this generation. With the number of ALUs increased from 320 to 800, the GPU will not slow down too much if the driver is not optimized for a specific 3D application. In the worst case, only one out of every five ALUs will operate, which means 160 working ALUs in total (as opposed to the RV670’s 64). These 160 ALUs should deliver about the same performance as the GeForce 9800 GTX provides. In the best case, the computing capacity of the Radeon HD 4850 may reach 1 teraflop. That is, this modest $199 card can challenge the expensive GeForce GTX 280, which is also declared to have a computing capacity of about 1 teraflop.
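The worst-case and best-case figures above can be reproduced with a small Python sketch. It assumes that the ALUs are grouped into five-wide units (consistent with the "one out of every five" remark), that each ALU performs one multiply-add (2 floating-point operations) per clock, and that the Radeon HD 4850 core runs at 625 MHz. The clock rate and the 2-flops figure are assumptions not stated in this section, and the function names are hypothetical.

```python
def active_alus(total_alus: int, lanes_used: int, lanes_per_unit: int = 5) -> int:
    """Number of ALUs doing useful work when each five-wide unit
    manages to fill only `lanes_used` of its lanes."""
    if not 0 <= lanes_used <= lanes_per_unit:
        raise ValueError("lanes_used must be between 0 and lanes_per_unit")
    units = total_alus // lanes_per_unit
    return units * lanes_used


def peak_gflops(alus: int, clock_mhz: float, flops_per_alu_per_clock: int = 2) -> float:
    """Theoretical throughput in GFLOPS, assuming one multiply-add
    (2 floating-point operations) per ALU per clock."""
    return alus * flops_per_alu_per_clock * clock_mhz / 1000


if __name__ == "__main__":
    ASSUMED_CLOCK_MHZ = 625  # assumed HD 4850 core clock, not from this section

    for chip, total in (("RV670", 320), ("RV770", 800)):
        print(chip, "worst case:", active_alus(total, 1), "ALUs,",
              "best case:", active_alus(total, 5), "ALUs")

    print("RV770 best case:", peak_gflops(800, ASSUMED_CLOCK_MHZ), "GFLOPS")
    print("RV770 worst case:", peak_gflops(active_alus(800, 1), ASSUMED_CLOCK_MHZ), "GFLOPS")
```

Under these assumptions the best case works out to 1000 GFLOPS, i.e. the 1 teraflop quoted above, while the worst case is one fifth of that.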
The raster processors have been improved, too. There are as many of them as in the previous core, but they are twice as efficient at processing the Z-buffer: the RV770 can process 64 Z-values per clock cycle, whereas the RV670 can process only 32.
The new card seems to have huge potential. We’ll discuss its innovations in more detail before testing its performance.






