What is Computer Graphics?

Nipun David · Published in XRPractices · Dec 21, 2020

Hello! When we hear the term computer graphics, the things that come to mind are the screen of your mobile phone, the display of a digital camera, the digital screen on the dashboard of your car, your laptop, or the big displays we see in malls or stadiums.

The reason I am writing this article is that I have always wanted to be a graphics programmer (not there yet…) who knows the stuff behind the scenes, i.e. how pixels are lit and transformed into such mesmerizing streams of images.

credits — pixabay.com

From an application standpoint, computer graphics plays an essential role in many business verticals: the manufacturing industry, where CAD is the starting point of everything, and movies, where VFX helps create situations/scenarios that are not possible to capture in the real world. I have left my favourite for last: gaming, especially 3D gaming, which uses the power of modern GPUs to render tons of triangles in real time on your screen. Having said that, these examples are just the tip of the iceberg.

This series of articles is sort of my journey in computer graphics and how I learned the trade secrets. Given that I have no formal education in this field, I have learned most of the stuff on the job or through the world wide web, so don't rely completely on what you read here; do a little research of your own after you read it. So let's begin…

Disclaimer — Given that I am a full-time employee at a software consulting company (Nagarro Software), the frequency of these posts may vary significantly.

Some History

When the computer came into existence, the only way to communicate (I/O operations) with that big machine was punch cards. Someone would punch holes in the cards as input, and the computer would spit out its output on cards with holes punched in them.

https://www.computerhope.com/jargon/p/punccard.htm

It doesn't take a genius to see the flaw in all this, so a much better way of communicating was needed. Ivan Sutherland's Sketchpad solved this, and hence computer graphics came into the picture.

And in 2020, we have come a long way in the field of computer graphics: an 8K monitor has 7680x4320 pixels, which is roughly 95 megabytes of data for a single frame, and VR devices push around 2.3 gigabytes of data per second (two displays, one for each eye, at 2160x2160 pixels each).
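To show where numbers like these come from, here is my own back-of-the-envelope arithmetic as a small Python sketch. It assumes 3 bytes (24 bits) per pixel and a 90 Hz refresh rate for the VR headset; the exact figures depend on the actual bit depth and refresh rate.

```python
# Back-of-the-envelope display bandwidth, assuming 3 bytes per pixel.
pixels_8k = 7680 * 4320                  # pixels in one 8K frame
print(pixels_8k * 3 / 2**20)             # ~95 MiB for a single 8K frame

pixels_vr = 2160 * 2160 * 2              # two eye displays
print(pixels_vr * 3 * 90 / 2**30)        # ~2.3 GiB/s at a 90 Hz refresh rate
```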

A road to formally defining CG

Umm, but why graphics at all? Why not some other way to communicate with computers? I mean, we could have sound/audio as the primary means of I/O. Well, because roughly 30% of the human brain is dedicated to visual processing, and we can say that the eyes are the highest-bandwidth port into our brain. Thus computer graphics is the aptest way of interacting with computers, as it is intuitive for humans.

Definition I — The use of computers to generate information which humans can perceive

Now, a lot of people confuse CV, i.e. computer vision, with CG, i.e. computer graphics. The difference is very crisp and clear: CG takes digital information and synthesizes perceptual stimuli (input for our eyes/ears/skin… yes, I mean it; I will explain how later), whereas CV converts what we can perceive through our eyes into digital information.

Definition II — The use of computation to turn digital information into sensory stimuli

CG In Action

I believe it is time to see some of the stuff we do in CG, before this post becomes boring.

Let's take up an exercise of modeling and rendering a cube on a screen. When I say modeling, I mean storing the definition of the cube (its representation — length, breadth and height), and rendering means drawing that 3D cube on a 2D screen which we can see.

GOAL — Generate a realistic drawing of a cube

But before we can actually begin, there are two questions we must address: modeling (how do we describe the cube?) and rendering (how do we visualize a 3D object in 2D?).

Part I: Modeling — How do we describe the Cube

The problem with modeling a cube is how we save a 3D object to disk. For that, we make a few assumptions to begin with:

  • Cube’s centre is at the origin of 3D space (0,0,0)
  • Cube’s length, breadth and height are 2 units each
  • We are looking at the cube such that our eyes are aligned with the x, y and z axes

But these assumptions alone are not enough (think of more complex objects like a face or a car), so we also capture the coordinates of the cube's vertices.

Coordinates of the cube vertices

But this is still not enough: if we draw only these coordinates on a 2D canvas, we won't be able to perceive the shape of the cube, so we also capture the edges of the cube.

Edges of the cube

We now have a digital description of the cube — vertices and edges. We can encode this data as binary to save it to memory or disk, and hence the problem of modeling is solved.
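As a small illustration, here is one way such a description could be stored, a minimal Python sketch assuming the unit-2 cube centred at the origin from the assumptions above. Each edge is just a pair of indices into the vertex list.

```python
# Digital description of the cube: 8 vertices and 12 edges.
vertices = [
    (-1, -1, -1), (-1, -1,  1), (-1,  1, -1), (-1,  1,  1),
    ( 1, -1, -1), ( 1, -1,  1), ( 1,  1, -1), ( 1,  1,  1),
]

# Each edge connects two vertices, referenced by their index above.
edges = [
    (0, 1), (0, 2), (0, 4), (1, 3), (1, 5), (2, 3),
    (2, 6), (3, 7), (4, 5), (4, 6), (5, 7), (6, 7),
]
```

Serializing these two lists (as binary, or even plain text) is all it takes to save the cube and load it back later.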

Part II: Rendering — How we visualize a 3D object in 2D

Rendering is tricky (you will see why in a moment), but for starters, how do we represent a 3D object on a 2D canvas? We have two options:

  1. Opt 1 — Throw away one axis, i.e. the z-axis, but we will see only a square in that case, and we certainly don’t want that — Rejected
  2. Opt 2 — Map the 3D vertices to 2D points in a plane, and connect the 2D points with straight lines…
Okay but how??

So, my dear readers, here come projections and matrices… and from now on, things will start getting a little freaky, but we will not go all the way in. We will see just a glimpse of it and gradually turn up the heat.

Perspective Projection — Have you ever wondered why objects get smaller as we move further away from them? We have all stumbled upon this question at some point in our lives. This is one of the key concepts in CG, and we will try to understand it now.

In the image below, you can see the concept on which a camera works, at least the old cameras where you had to load a film and get it processed later to get the photograph…

This is how the camera captures images

Here we have a tree (a 3D object), and on the left side of the image, we have our simple camera. Light enters the camera through the pinhole, and a 2D image is formed on the projection screen. The longer you keep the pinhole open, the more exposure you get (all that camera stuff).

So now let's make our simple camera not so simple and overlay some geometry on it.

I have intentionally kept only one ray, starting from point p on the tree and ending at point q on the projection screen inside our camera.

Here we are going to assume a few things so that we can arrive at an equation, which we will then use to create our rendering algorithm.

Assumptions:-

  1. There is a Z-axis passing from the centre of the cube through the pinhole. The Y-axis is perpendicular to the Z-axis, positive in the up direction.
  2. The X-axis is perpendicular to both the Z and Y axes — not visible in the diagram for the sake of simplicity.
  3. The Y-axis is vertical and the X-axis is horizontal.
  4. The size of our cube camera is 1 unit.
  5. The distance of point p on the 3D object (the tree) along the Z-axis is z units.
  6. Also, point p is y units above the Z-axis along the Y-axis.
  7. A light ray starts from point p (x, y, z) on the 3D object and meets the projection plane at point q — (u, v, 1).

Note: u and x are distances on the horizontal plane that is perpendicular to both the Y and Z axes. In the image above, the Y and Z axes are clearly visible; imagine the X-axis on a horizontal plane perpendicular to both.

We can see two similar triangles in this image.

Considering the similar-triangle property: the ratio of corresponding edge lengths is the same.

From the property above, we can say that v/1 = y/z → v = y/z

v = y/z — vertical coordinate

Therefore, we can say that v is the slope and is equal to y/z.

In a similar way, we can get the horizontal coordinate

i.e. u/1 = x/z → u = x/z

u = x/z — horizontal coordinate

Therefore, we can infer from the two equations above that if z becomes bigger, which means we move away from the 3D object, the values of u and v become smaller and the image shrinks on the projection plane.

Now replace the pinhole camera with a human eye: the pinhole becomes the pupil, and the projection plane is the retina, which sends the image to the brain via the optic nerve. Now you know why objects look smaller as we move further away from them.
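To see the two equations in action, here is a tiny sketch (plain Python, the function name is mine): the same point projected from twice the distance lands at half the height and half the width on the projection plane.

```python
# Perspective projection onto the plane z = 1: u = x/z, v = y/z.
def project(x, y, z):
    return x / z, y / z

print(project(3.0, 2.0, 10.0))   # (0.3, 0.2)
print(project(3.0, 2.0, 20.0))   # (0.15, 0.1): twice as far, half the size
```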

From the above two equations, we can create our algorithm for rendering the cube (remember our problem… how to draw 3D on 2D?):

  • Assume the camera is at c = (2,3,5)
  • Convert the (x,y,z) points of the 3D object to (u,v) for all 12 edges as below
  • Subtract the camera c from each vertex (x,y,z) — as we have assumed the camera is at (2,3,5) — to normalize the position of the cube w.r.t. the camera
  • Divide (x,y) by z to get (u,v)
  • Draw a line between (u1,v1) and (u2,v2)
BAAM!!! 😎
phew!!!

And there we go: we have successfully found the coordinates of the 3D representation on a 2D screen without losing the shape of the cube.

In the steps above, we saw how we convert digital information into purely visual information with an algorithm that we derived with the help of high-school maths.
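Pulling the steps above together, here is a minimal sketch of the algorithm in Python. The camera at (2, 3, 5) and the unit-2 cube are the assumptions from the list above; matplotlib is just one convenient way to draw the projected lines. Note that the translated z values are negative here, so the wireframe comes out mirrored, but its shape is preserved.

```python
import matplotlib.pyplot as plt

# The cube model: 8 vertices and the 12 edges connecting them.
vertices = [
    (-1, -1, -1), (-1, -1,  1), (-1,  1, -1), (-1,  1,  1),
    ( 1, -1, -1), ( 1, -1,  1), ( 1,  1, -1), ( 1,  1,  1),
]
edges = [
    (0, 1), (0, 2), (0, 4), (1, 3), (1, 5), (2, 3),
    (2, 6), (3, 7), (4, 5), (4, 6), (5, 7), (6, 7),
]

camera = (2, 3, 5)

def project(vertex, cam):
    # Step 1: subtract the camera position to get the vertex relative to it.
    x, y, z = (v - c for v, c in zip(vertex, cam))
    # Step 2: divide by the depth z to get the 2D point (u, v).
    return x / z, y / z

points_2d = [project(v, camera) for v in vertices]

# Step 3: draw a straight line between the projected endpoints of each edge.
for a, b in edges:
    (u1, v1), (u2, v2) = points_2d[a], points_2d[b]
    plt.plot([u1, u2], [v1, v2], color="black")

plt.gca().set_aspect("equal")
plt.show()
```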

Another Problem —

But how do we draw these lines on a computer screen? I mean, yeah, we now know the coordinates/positions on the 2D plane, but what next?

I am sure by now you must be thinking that this is where pixels kick in, and yes, you are right. But I am not going to get into that right now; wait for the next post, which I hope to publish soon.

In the next post, we will look at raster displays and line rasterization algorithms, their time and space complexity, etc.

I highly recommend visiting the links below if you are interested in seeing more about the latest trends in computer graphics and its future.

A few of my other blogs

Credits —

I want to wholeheartedly thank the people who have published their work and explained the concepts of CG so that people like me can learn. The links below helped me while writing this blog.
