What is a GPU?
Short answer: Graphics Processing Unit – a computer chip (or a part of one) that is responsible for processing graphics and video image data. To understand it, let’s begin by looking at how it differs from the Central Processing Unit (CPU).
- Caveat: This is a very simplified discussion that conveys the general concepts but may not convey a precise model of modern chips.
The CPU (generally considered the “brains” of a computer) is responsible for running the overall computer system. It reads programs from memory and executes them. Using both built-in (hardwired or firmware-based) and memory-based instructions, it is responsible for scheduling all the different processes that want to run concurrently, responding to user input, sending data to various peripheral systems, maintaining security, and so on.
In order to accomplish all this, the CPU has a very rich set of instructions that it understands, and this instruction set has been tuned over decades to handle this wide range of operations efficiently. Remember that every processing unit can perform one and only one instruction at a time – or, more accurately these days, one per core or thread. (Detail: modern execution units actually perform several operations at once in a pipelined fashion, but the basic concept holds.)
Now a crescent wrench is a fine tool. Large and heavy, it is adjustable and therefore adaptable to many uses, from plumbing to auto repair. If, however, you intend to remove your engine head, you will be much better served by a speedy little ratchet with the proper socket. That one socket fits only a single size of bolt – but handles it very well.
So too the GPU. It has a much smaller instruction set, but this is tuned to the types of mathematical operations that are needed for processing images, and – most importantly – it has lots of threads to run many operations in parallel.
What is all this? Let’s look at an image and see what we are talking about. Below is a photo of mine from Colombia (2011), with a zoom into one eye, and then an even closer zoom.
As you can see, the photo is composed of millions of tiny squares that blend together to form the image. Each of these squares is one pixel. There are pixels for the image (what we see here), but also pixels for your screen; these are what GPUs process. A Full HD (1080p) monitor has a resolution of 1920×1080 – 2,073,600 pixels in all, roughly 2.1 megapixels. Now say you want to brighten the photo. You will have to apply your adjustment to over 2 million pixels! But wait – the original was a 10-megapixel image, so the computer actually needs to adjust the value of 10 million pixels. How long do you want to wait? The CPU can do this – it has the capacity to perform all the instructions – but it will take a long time to process each pixel individually. It is like moving with a bulldozer: strong, but rather slow. What you need instead is a whole fleet of little Porsches. And this is the GPU.
A modern GPU consists of dozens or even hundreds of processing threads. Each of these has a very limited instruction set, but is designed meticulously to perform those instructions cleanly and rapidly. The big thing, though, is that it performs its operations in parallel. It has what is called a SIMD architecture – it performs a Single Instruction on Multiple Data all at once.
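To make the brightening example concrete, here is a minimal sketch in Python with NumPy; the image dimensions, pixel values and brightness factor are made up for illustration. NumPy’s whole-array operations mirror the SIMD idea – one instruction applied to many data elements – though here they run on the CPU rather than on a GPU.

```python
import numpy as np

# A made-up 8-bit grayscale "photo": roughly 10 megapixels of random values.
rng = np.random.default_rng(2011)
image = rng.integers(0, 256, size=(2736, 3648), dtype=np.uint8)

def brighten_scalar(img, factor):
    """One pixel at a time -- what a naive CPU loop does."""
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = min(int(img[i, j] * factor), 255)
    return out

def brighten_vectorized(img, factor):
    """One operation over all pixels at once -- the SIMD idea."""
    return np.clip(img * float(factor), 0, 255).astype(np.uint8)

# Brighten all ~10 million pixels in a single data-parallel expression.
bright = brighten_vectorized(image, 1.2)
```

On a typical machine the vectorized version runs orders of magnitude faster than the per-pixel Python loop, and a GPU kernel pushes the same idea further by running many pixels on separate hardware threads.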
The diagram below illustrates this, where each blue column is a Thread Processor – a pipe of computational units that will operate on one piece of picture data at a time. Meanwhile, its brethren are doing exactly the same thing on the neighboring pixels. (These are sometimes called simply threads, or stream processors, or execution units…)
So you can see how a buffer (green) can be filled with pixel data from our image and then queued up for processing, dozens or hundreds of pixels at a time. The next diagram is a more detailed look at the NVIDIA GeForce 8800 GPU architecture. More modern GPUs are even more complex.
Where are they? GPUs come in two flavors – discrete chips, or integrated as part of a CPU. Many of Intel’s CPUs come with GPUs built in, and most product lines offer several options for the embedded GPU. Thus, the Core i5-4570 has very similar specs to the Core i5-4570R, except that the former has the HD Graphics 4600 GPU and the latter has the (supposedly more powerful) Iris Pro Graphics 5200 GPU. When comparing systems, it is important to check exactly what you are buying so you know to what degree two systems are comparable.
Apple, in both its MacBook Air and mobile products, has mostly opted for reduced CPU speed (processor GHz) to save energy, paired with higher-end GPUs to speed up graphics processing. In their view, most work is done on GPUs these days. Remember, it is not just editing your photos or home movies – graphics processing is applied every time a web page is displayed. Their intent appears to be to improve overall user-perceived performance, and this is how they weigh their tradeoffs.
It is important to note that photo/video editing and games are not the only uses for the GPU. Science and mathematical uses also abound, and not just for strictly graphic applications such as medical image processing.
In mathematics, when you have a list of values on which you want to apply a single operation, then this set of values is called a vector. Operating on vectors is precisely what the GPU does so well, and so chip designers have been adding methods for scientific applications developers to access this power directly for any general vector processing – not just image vectors.
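As a tiny sketch of this general vector processing, here is an illustrative NumPy example (the temperature values are made up). The same pattern – a single operation applied across a whole vector, with no per-element loop in your code – is exactly what GPU compute frameworks accelerate:

```python
import numpy as np

# One vector of values, one operation applied to every element at once.
celsius = np.array([0.0, 12.5, 25.0, 37.0, 100.0])

# A single "multiply and add" expression over the whole vector --
# no explicit loop over individual elements.
fahrenheit = celsius * 9.0 / 5.0 + 32.0
```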
OpenCL (Open Computing Language – note the ‘C’ for compute, not graphics) is an open-standard language for programming GPUs and other specialty processors, designed specifically to help in the development of these types of applications.
Today, vector spaces are applied throughout mathematics, science and engineering. They are the appropriate linear-algebraic notion for dealing with systems of linear equations; they offer a framework for the Fourier expansion employed in image-compression routines; and they provide an environment for solution techniques for partial differential equations.
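For instance, solving a system of linear equations is exactly this kind of vector-and-matrix workload. Here is a small sketch with made-up coefficients, using NumPy’s linear-algebra routines; GPU math libraries expose the same kinds of operations for far larger systems:

```python
import numpy as np

# An illustrative 3x3 system, A @ x = b:
#   2x + 1y      = 5
#   1x + 3y + 1z = 10
#        1y + 2z = 7
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([5.0, 10.0, 7.0])

# Solve for the unknown vector x.
x = np.linalg.solve(A, b)
```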
The GPU is powerful because it:
- Specializes in a small, optimized instruction set, and
- Performs its operations with massive parallelism
Of course, it adheres to the same laws of physics – for any given circuit technology (i.e., size of the transistors), the greater the power of the device, the greater the electric power it consumes. Hence, any given computing system, be it a desktop, smartphone or tablet, will always have to make a tradeoff between energy usage and processing power.

Aside: This is where Apple has proven to be rather brilliant. They have been able to define these tradeoffs very precisely, and to engineer their processors to exquisitely fit their goals on mobile platforms. They can do this in large part because they alone control both the software and the hardware.

Technologists will please forgive me for skipping over the enormous number of details that go into these designs. Please leave your comments, and certainly correct me if I have made any errors.