Supercomputing Performance Using Graphics Processing Units (GPUs)

Updated on April 7, 2011

Supercomputing For Cheap

For many, supercomputers conjure up vivid images of a huge room filled with towers of hardware, data cables and power cables. These rooms looked scary but also evoked a sense of power and awe. These mean machines could crunch numbers at scary speeds. Until parallel processing came along, Cray computers were among the fastest machines humans could build, using exotic technologies like liquid immersion cooling to speed things up (many technologies go into making a supercomputer, but I shall not delve into them at present). These machines were very expensive, costing millions of dollars, and were used to solve highly complex problems like weather simulation and modeling, molecular dynamics, genome mapping, fluid dynamics etc.

But scientists and designers realized that a single processor, or even a handful of processors, could do only so much. Such designs were not scalable beyond a certain point and required expensive, exotic technologies to cool the processors and the associated circuits. Building a supercomputer was therefore pretty much out of reach for most organizations in the world. Then came the brilliant idea of harnessing the power of many processors working in parallel to solve a problem. Parallel computing has its drawbacks and challenges, and requires applications to be reprogrammed to harness its full power, but it also has the potential to put supercomputing performance in the hands of the masses. However, using many single-core processors was also turning out to be expensive, and the interconnects for transferring data between them were becoming the new bottleneck to scaling up speeds. In the meantime, gamers were driving the industry to give them hardware that could render their computer games in greater detail and at higher frame rates. As games increased in complexity, the default graphics cards were not able to crunch the numbers needed to render graphics-intensive scenes.

Requirement for Dedicated Graphics Processors

So why do we need dedicated graphics processors? To put it in perspective, let's say a room in a game is made of 1000 points, with each point represented by 32 bits. The room has walls, and the walls have texture. These surfaces are in turn represented by a combination of polygons, generally triangles. Representing every part of a wall by its own point would mean that every wall needed millions of points to render, and that would need enormous memory and computing power. Triangles are therefore used to model such surfaces, reducing the number of points required at the cost of detail. That is why figures in computer games can look odd, with pointed surfaces: they have been modeled using triangles.
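To make the memory argument concrete, here is a rough back-of-the-envelope sketch. The 1000 points and 32 bits per point come from the text; the 1000 x 1000 sampling grid for the wall and the two-triangle wall are assumptions chosen purely for illustration.

```python
BITS_PER_POINT = 32  # figure used in the text


def bytes_for_points(num_points: int, bits_per_point: int = BITS_PER_POINT) -> int:
    """Memory needed to store num_points points, in bytes."""
    return num_points * bits_per_point // 8


# The room from the text: 1000 points at 32 bits each.
room_bytes = bytes_for_points(1000)

# Assumption: one wall sampled point by point on a 1000 x 1000 grid.
dense_wall_bytes = bytes_for_points(1000 * 1000)

# Assumption: the same flat wall modeled as two triangles sharing an edge,
# which needs only its 4 corner points.
triangle_wall_bytes = bytes_for_points(4)

print(room_bytes)           # 4000
print(dense_wall_bytes)     # 4000000
print(triangle_wall_bytes)  # 16
```

The gap between the last two numbers is the whole point of using triangles: far fewer points to store and move, paid for with less surface detail.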

Now if you have to move this figure, you have to translate every point of every triangle that makes up the image by the amount of movement the user wants, so a new position has to be calculated for each point. To keep the movement smooth, the image is redrawn many times a second so that to the human eye the movement looks fluid and not jerky. All this is pure maths, and it required tremendous memory capacity and computing power which regular processors were not able to handle.

Therefore special-purpose cards called Graphics Processing Units (GPUs) came into existence. These were separate cards plugged into a PCI or AGP slot, and games running on the CPU used them to do the number crunching. For example, when Need for Speed runs on a computer, the main processor hands the rendering work over to the GPU. These cards had multiple processors, dedicated memory and faster interconnects that enabled real-time rendering. As hardware technology improved, more processing cores were added to GPUs, along with more memory, faster interconnects and more efficient architectures, turning these cards into high-performance computing devices. The processing cores on a GPU are not complex processors like the Pentium 4 or Core 2 Duo; they are streamlined for floating-point computation, with a far smaller instruction set. This meant that computer games had to be designed to leverage these multiple processors so that rendering would be extremely fast and the frame rate high enough to be enjoyable.
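The per-point movement described above can be sketched in a few lines. This is a plain CPU illustration, not how any particular game engine does it; the names `translate_mesh` and `additions_per_second`, the sample triangle, and the 100,000-triangle, 60 frames-per-second workload are all assumptions made up for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Triangle = Tuple[Point, Point, Point]


def translate_point(p: Point, offset: Point) -> Point:
    """Move one point by the given (dx, dy, dz) offset."""
    return (p[0] + offset[0], p[1] + offset[1], p[2] + offset[2])


def translate_mesh(mesh: List[Triangle], offset: Point) -> List[Triangle]:
    """Move every vertex of every triangle; each vertex is independent work."""
    return [tuple(translate_point(v, offset) for v in tri) for tri in mesh]


def additions_per_second(num_triangles: int, frames_per_second: int) -> int:
    """3 vertices per triangle, 3 coordinate additions per vertex, every frame."""
    return num_triangles * 3 * 3 * frames_per_second


# One triangle moved 0.5 units along x on each of 4 frames.
mesh = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
for _ in range(4):
    mesh = translate_mesh(mesh, (0.5, 0.0, 0.0))

print(mesh[0])  # ((2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (2.0, 1.0, 0.0))
print(additions_per_second(100_000, 60))  # 54000000
```

Because no vertex depends on any other, this work can be split across as many simple cores as are available, which is exactly the kind of job a GPU is built for.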


Harnessing the Power of GPUs

But GPUs sit idle much of the time. To let programmers leverage these processing cores for general-purpose computation, NVIDIA came up with CUDA. To learn more about CUDA, visit NVIDIA's CUDA site.

Basically, what CUDA does is give the programmer an interface to the GPU cores. In programming parlance, it's an API (Application Programming Interface). For example, a programmer downloads the CUDA development framework, installs it on a PC, ports an application using CUDA calls to hand processing over to the GPUs (the code also has to be parallelized to make use of multiple cores), and hey presto: application speedup! It's not as simple as it sounds; however, it's far easier than building a conventional supercomputer.

So if you buy a high-end motherboard from, say, Supermicro with 8 PCI Express slots, buy four GTX 295 or GTX 460 cards (only four will fit, as each card takes up 2 slots), put in an i7 processor, add some heavy-duty power supplies (SMPS) to feed the GPU cards, and buy some custom-made liquid cooling gear from Koolance to dissipate the heat generated, then hey presto again, you have a 2-3 teraflop supercomputer for about $4000. How's that!
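The porting workflow described above comes down to rewriting a loop so that each GPU thread handles one element. The sketch below imitates that idea in plain Python: it is not CUDA code and calls no CUDA API, it runs sequentially on the CPU, and the names `saxpy_kernel` and `launch` are hypothetical. What it does borrow from CUDA is the indexing scheme, where a thread finds its element as block index times block size plus thread index, and checks that the index is in range.

```python
from typing import Callable, List


def saxpy_kernel(block_idx: int, block_dim: int, thread_idx: int,
                 a: float, x: List[float], y: List[float],
                 out: List[float]) -> None:
    """Work done by ONE thread: compute a single output element."""
    i = block_idx * block_dim + thread_idx  # global element index
    if i < len(x):  # guard: the last block may extend past the data
        out[i] = a * x[i] + y[i]


def launch(kernel: Callable[..., None], num_blocks: int, block_dim: int,
           *args) -> None:
    """Sequential stand-in for a GPU launch (a real GPU runs these in parallel)."""
    for block_idx in range(num_blocks):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


n = 10
x = [float(i) for i in range(n)]
y = [1.0] * n
out = [0.0] * n

block_dim = 4
num_blocks = (n + block_dim - 1) // block_dim  # round up to cover all elements

launch(saxpy_kernel, num_blocks, block_dim, 2.0, x, y, out)

print(num_blocks)  # 3
print(out)  # [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0, 19.0]
```

Three blocks of four threads give twelve threads for ten elements, so the last two threads fail the range check and do nothing. Restructuring code so that every element can be computed independently like this is the "parallelize" step, and it is usually the hard part of a port.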

There has been tremendous interest in this field, and many organizations have started building their own rigs. This article by the Fraunhofer Institute gives a great overview of this technology. The flip side is that hackers can also use this technique to boost their processing power for cryptanalysis.

What I have described above is just the tip of the iceberg, and only one possible way to build such a rig. There is a competing framework called OpenCL (which, as the name suggests, is an open standard and supports a plethora of cards). Just visit the site for more information. Stay tuned for more on this particular technology.



      movingfinger 5 years ago

      Well... there are many links to do it.. u need to let me know what u exactly want and I shall tell u the way to do it..

      technogeek 5 years ago

      Interesting article .. and achievable at home. keep such articles coming. do u have a hands on guide which I can use to make such a machine.