Moving into 3 dimensions, first we must be clear about some conventions.
In a 2D coordinate system, there’s only really one way to position the two perpendicular axes. Sure you could rotate and flip them around, but that would just be viewing the same thing from a different angle.
In contrast, when we add a 3rd axis, a Z axis, perpendicular to the other two, there’s an arbitrary choice of which way the third axis points. Here the Z axis points away from us when we view X and Y from the normal perspective, but here the Z axis points in the opposite direction. If you take some real world pointy objects, like pencils, and try to add a third perpendicular axis to two existing perpendicular axes, you’ll find that you can’t just rotate one configuration around to get the other. The difference is not just a matter of perspective. To be sure, it’s an arbitrary choice which system we use, but we must make a choice.
The two choices are known as a left-hand system and a right-hand system, so called because if you form 3 perpendicular axes with your thumb as X, your index finger as Y, and your middle finger as Z, you get the two different coordinate systems with your left and right hands. So it helps to remember the hand gesture and which digit corresponds to which axis.
In the previous video on perspective, we actually used a left-hand system where X pointed right on the screen, Y pointed up, and Z pointed inwards, away from the viewer. It’s actually most common in graphics to use right-handed systems, so that’s the convention we’ll mostly stick with here.
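If you'd rather check handedness with numbers than with your fingers, here is a minimal sketch. It is not something from the video: it assumes the three axis directions are written out in a reference frame that is itself right-handed, and it uses the sign of the scalar triple product (X cross Y, dotted with Z). The function names are made up for illustration.

```python
def cross(a, b):
    """Standard 3D cross product of two vectors given as (x, y, z) tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Standard 3D dot product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def handedness(x_axis, y_axis, z_axis):
    """Classify three axis directions, expressed in a right-handed reference frame."""
    triple = dot(cross(x_axis, y_axis), z_axis)
    if triple > 0:
        return "right"
    if triple < 0:
        return "left"
    return "degenerate"  # the axes are not independent


# X right, Y up, Z toward the viewer.
print(handedness((1, 0, 0), (0, 1, 0), (0, 0, 1)))
# X right, Y up, Z into the screen, as in the earlier perspective setup.
print(handedness((1, 0, 0), (0, 1, 0), (0, 0, -1)))
```

Flipping any single axis flips the sign of the triple product, which matches the observation that you can't rotate one system into the other.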
When rotating around the 3 axes, we’re effectively rotating points in three different planes. When we rotate around the x-axis, we’re rotating in the YZ plane, when we rotate around the y-axis, we’re rotating in the ZX plane, and when rotating around the z-axis, we’re rotating in the XY plane.
Now, quite confusingly, in addition to left-hand and right-hand 3D coordinate systems, we also have a so-called left-hand rule and right-hand rule for expressing the conventions of direction of rotation. The rule involves a different hand gesture, where we imagine our grip is wrapped around an axis, our thumb points in the positive direction along the axis, and we imagine the direction of rotations for a positive angle going in the direction of our curled fingers. So let’s say we go by the convention of the right-hand rule and a right-hand coordinate system: imagine then a right-hand 3D system, in which the Z axis points towards you, and imagine gripping the Z axis with your right hand; your fingers would then be wrapping counter-clockwise around the Z axis, and as you can see, this corresponds to our usual convention in 2D rotations that a positive rotation in the XY plane goes counter-clockwise. So it’s not surprising that, in a right-hand system, the right-hand rule is the dominant convention.
As mentioned briefly in the previous video, translations in 3D aren’t any more complicated than in 2D: the vectors and points simply have one more dimension, and that dimension gets treated just the same. Rotations are a different case, as things get considerably more complicated in three dimensions than in two dimensions.
Getting to it, let’s start with the simple cases of rotating around one of the three axes. It shouldn’t be hard to see that rotating around one of the axes in three dimensions is really just a 2D rotation in disguise because it will mean only affecting two of the three coordinates of a point. For example, if I rotate a point around the Y axis, only the X and Z values change. So we can in fact use the very same 2D rotations formula…as long as we’re mindful of plugging in the right coordinates in place of the X and Y of our formula for 2 dimensional rotations. So starting with the assumption of right-hand-rule rotations in a right-handed coordinate system, we rotate around the z-axis by plugging in x and y into our 2D rotation formula as normal. However, for the same right-hand conventions but rotating around the x-axis, we want to plug our 3D Y value in place of the usual 2D x and our 3D Z value in place of our usual 2D Y. This is clear if you visualize the coordinate system from the perspective where X points toward us as Z conventionally does, Z points up as Y conventionally does, and Y points right as X conventionally does. By the same logic, for rotating around the y-axis in the right-hand conventions, our 3D Z value plugs in for 2D X and our 3D X value plugs in for 2D Y.
Now, for a left-hand system with left-hand rule rotations, the rotations around the axes actually work out the same: rotations around the z-axis use X for X and Y for Y, rotations around the x-axis use Y for X and Z for Y, and rotations around the y-axis use Z for X and X for Y.
But what if we mix the left-hand-rule convention with a right-hand system or mix the right-hand-rule convention with a left-hand system? Well we could do one of two things: we could either swap the two coordinate values we plug into our formula, e.g. X becomes Y and Y becomes X, or we could simply negate the angle of rotation.
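To see that those two fixes really are the same thing, here is a small sketch. It assumes the usual counter-clockwise 2D rotation formula, and the function names are just for illustration.

```python
import math


def rotate_2d(a, b, degrees):
    """Usual 2D rotation: a plays the formula's X, b plays its Y."""
    t = math.radians(degrees)
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))


def rotate_swapped(a, b, degrees):
    """Plug the two coordinates into the formula in swapped roles."""
    new_b, new_a = rotate_2d(b, a, degrees)
    return (new_a, new_b)


# Swapping the plugged-in coordinates matches negating the angle.
print(rotate_swapped(3.0, 4.0, 25))
print(rotate_2d(3.0, 4.0, -25))
```

Both lines should print the same pair of values up to floating-point rounding, so in practice negating the angle is the simpler of the two options.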
Again, though, we’ll stick to the dominant convention of right-hand-rule-rotations in a right-hand coordinate system. So, again, to be clear, for us, rotations around the z-axis will use X for X and Y for Y, rotations around the x-axis will use Y for X and Z for Y, and rotations around the y-axis will use Z for X and X for Y.
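Here is one way that convention might look in code. This is only a sketch: points are plain (x, y, z) tuples, angles are in degrees, and the names rotate_2d, rot_x, rot_y, and rot_z are my own choices rather than anything standard.

```python
import math


def rotate_2d(a, b, degrees):
    """Usual counter-clockwise 2D rotation of the pair (a, b)."""
    t = math.radians(degrees)
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))


def rot_z(p, degrees):
    """Around Z: X plugs in for X, Y plugs in for Y."""
    x, y, z = p
    x, y = rotate_2d(x, y, degrees)
    return (x, y, z)


def rot_x(p, degrees):
    """Around X: Y plugs in for X, Z plugs in for Y."""
    x, y, z = p
    y, z = rotate_2d(y, z, degrees)
    return (x, y, z)


def rot_y(p, degrees):
    """Around Y: Z plugs in for X, X plugs in for Y."""
    x, y, z = p
    z, x = rotate_2d(z, x, degrees)
    return (x, y, z)
```

A quick sanity check is that a 90 degree turn carries each axis onto the next one in the cycle: rot_z takes the X axis to Y, rot_x takes Y to Z, and rot_y takes Z to X.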
Probably the most aggravating thing about rotations is that, just as rotations in 2D around different pivot points are not commutative, rotations in 3D around different axes are not commutative. This can actually be quite hard to visualize: if you just pick up a random object and try rotating it around different axes in different orders, you can easily fool yourself with edge cases where different rotation orders seem to produce similar results. So it’s probably clearer to look at some numbers. Say that I have a 3D coordinate at (900, 500, 0) which I’ll rotate 30 degrees around the x-axis, 20 degrees around the y-axis, and 70 degrees around the z-axis. For all six possible orders in which I can apply these three rotations, the point ends up in a totally different position. For example, when I rotate in the order x, then y, then z, the point ends up at (-88.4, 1023.2, -72.9), but when I rotate in the order x, then z, then y, the point ends up at (-7.6, 993.8, 268.8).
The primary use of rotations in 3D rendering is to take object models defined in a fixed orientation around their own local coordinate system and not only position them in our world coordinates by translation but also orient them in any direction by rotation. So here for example, we have an airplane model which is represented as a bunch of vertices fixed in space such that the nose of the plane faces straight down the x-axis and its tail points up parallel to the y-axis. The model just as well could have been built in any other orientation, but it usually makes most sense to build models with their front facing forward and their top facing up. Though of course, which way in our coordinate system we decide faces forward and which way faces up, that’s an arbitrary decision; in this case, we decided that the x axis points forward, and the y axis points up. It’s also most common to build models with the apparent center at the origin because this is usually the point we want our rotations to pivot around.
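If you want to try the numbers yourself, this sketch runs the same point through all six orders. It assumes right-hand-rule rotations in a right-hand system, with the axis rotations written the way described earlier; the helper names are my own.

```python
import itertools
import math


def rotate_2d(a, b, degrees):
    t = math.radians(degrees)
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))


def rot_x(p, degrees):
    x, y, z = p
    y, z = rotate_2d(y, z, degrees)  # Y for X, Z for Y
    return (x, y, z)


def rot_y(p, degrees):
    x, y, z = p
    z, x = rotate_2d(z, x, degrees)  # Z for X, X for Y
    return (x, y, z)


def rot_z(p, degrees):
    x, y, z = p
    x, y = rotate_2d(x, y, degrees)  # X for X, Y for Y
    return (x, y, z)


# Each axis always gets the same angle; only the order changes.
ROTATIONS = {"x": (rot_x, 30), "y": (rot_y, 20), "z": (rot_z, 70)}


def apply_order(point, order):
    """Apply the three axis rotations to point in the given order."""
    for axis in order:
        rotate, degrees = ROTATIONS[axis]
        point = rotate(point, degrees)
    return point


results = {"".join(order): apply_order((900, 500, 0), order)
           for order in itertools.permutations("xyz")}

for order, point in results.items():
    print(order, tuple(round(c, 1) for c in point))
```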
The question, now, is how do we rotate the model to any desired orientation. As we’ll informally demonstrate later, we can get a model from one orientation into any other in just three (or sometimes fewer) rotations around X, Y, and Z. Any order of X, Y, and Z will work, so Z, then X, then Y or Y, then Z, then X can do the same work as the order X, then Y, then Z. In fact, as we’ll show later, we can even achieve any orientation using just two axes as long as we rotate around one axis, then the other, then the first again. For instance, X-Z-X works, but X-X-Z does not.
In any case, when we do use all three axes, the natural question is which order of X, Y, and Z is easiest to use? Well first consider the three different ways in which we usually wish to rotate an object. Once again, this is easiest to visualize with an object like a plane that has a facing direction and a right-side-up. When we rotate the plane around the length of its fuselage, we’re controlling its roll; when we point the nose of a plane up and down, we are controlling its pitch; and when we point the nose left and right we’re controlling its yaw.
Where this gets confusing is that we have both extrinsic and intrinsic roll, pitch, and yaw: while an extrinsic rotation is a rotation around the fixed axes of our coordinate system, an intrinsic rotation is relative to the current orientation of the object. So here for example, if we again have our plane starting out facing down the x-axis, but then we rotate it, the plane then implicitly has its own axes, X prime, Y prime, and Z prime, which are relative to its new orientation. A rotation now around the original X, Y, or Z would be an extrinsic rotation, but a rotation around x prime, y prime, or z-prime would be intrinsic. And be clear that, each time we rotate the plane, we implicitly produce a new set of intrinsic rotation axes. After a second rotation, we’d call the third set of axes x double prime, y double prime, and z double prime, and then after a third rotation, we’d call the fourth set of axes x triple prime, y triple prime, and z triple prime, and so forth.
A convenient fact (though one we won’t prove here) is that a series of extrinsic rotations produces the same result as the reverse order of intrinsic rotations. So for example, if we extrinsically rotate around X, then Y, then Z, we end up with the same orientation as if we instead intrinsically rotate around Z, then Y prime, then X double prime.
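We won't prove it, but we can at least check it numerically. The sketch below is my own illustration, not the method used elsewhere in this video: to rotate around the object's own moving axes it borrows Rodrigues' rotation formula, a standard formula for rotating around any unit vector through the origin. The function names are hypothetical.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def rotate_about(v, axis, degrees):
    """Rodrigues' formula: rotate v around a unit-length axis through the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(v[i] * c + k_cross_v[i] * s + axis[i] * k_dot_v * (1 - c)
                 for i in range(3))


X, Y, Z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)


def extrinsic_xyz(p, ax, ay, az):
    """Rotate around the fixed X, then fixed Y, then fixed Z."""
    for axis, degrees in ((X, ax), (Y, ay), (Z, az)):
        p = rotate_about(p, axis, degrees)
    return p


def intrinsic_zyx(p, az, ay, ax):
    """Rotate around Z, then Y prime, then X double prime."""
    frame = [X, Y, Z]  # the object's own axes, which move with it
    for index, degrees in ((2, az), (1, ay), (0, ax)):
        axis = frame[index]
        p = rotate_about(p, axis, degrees)
        frame = [rotate_about(a, axis, degrees) for a in frame]
    return p


print(extrinsic_xyz((900.0, 500.0, 0.0), 30, 20, 70))
print(intrinsic_zyx((900.0, 500.0, 0.0), 70, 20, 30))
```

The two printed points should agree up to floating-point rounding, even though the second function never rotates around a fixed axis after its first step.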
This is good to know, as depending upon the situation, it may be easier to think in terms of intrinsic rotations than extrinsic rotations, and so converting from a series of intrinsic rotations to a series of extrinsic rotations allows us to use our formulas for rotations around the fixed axes.
Back now to the question of, ‘What is the most natural way to describe an object’s orientation in space?’ Well if we imagine an airplane flying somewhere in the world, we of course would first describe its position in world space, in longitude and latitude, but then for its rotation, most people would start with its heading, a.k.a. its yaw, relative to the ground. Next, most people would describe its pitch relative to its heading, a.k.a. its intrinsic pitch rotation. Lastly, we describe the plane’s roll relative to its heading and pitch, a.k.a. its intrinsic roll rotation.
So back to our plane model, which is oriented such that X is its roll axis, Y is its yaw axis, and Z is its pitch axis. In this configuration, then, the natural way to orient the plane is by intrinsic rotations around Y, then Z prime, then X double prime. (Note that in a series of intrinsic rotations, the first axis is just a normal cardinal axis like in an extrinsic rotation, assuming that the object we’re rotating starts out aligned correctly on the cardinal axes, as is the case here.) To express this as a series of extrinsic rotations, we simply reverse the order, so X, then Z, then Y.
Be clear that this all depends on the original orientation of our model: if our plane model was instead built oriented with its nose facing up the Z axis, then Z would be our roll axis and X our pitch axis, so the order of intrinsic rotations would be Y, then X prime, then Z double prime, and the extrinsic rotations would be Z, then X, then Y. The simplest, most general way to remember it is as the extrinsic rotations in the order roll, then pitch, then yaw. This order is significantly easier to work with than other orders of rotations. If I tell you an object is rolled 30 degrees right, then pitched 45 degrees up, and then yawed 90 degrees left, you should be able to picture the result fairly easily, and if presented with an object in some orientation, you should be able to visually estimate its roll, pitch, and yaw for the same specific order. The same cannot be said of alternative extrinsic-rotation orders. Presented with an object in some arbitrary orientation, it can be difficult to visually estimate how to produce that orientation by applying rotations in some other order, like say, yaw, then roll, then pitch.
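As a sketch of what roll, then pitch, then yaw looks like for our plane model, assume the nose points down X and the top points up Y, so roll is around X, pitch around Z, and yaw around Y, all as extrinsic right-hand-rule rotations. The function name orient and the tuple representation are my own.

```python
import math


def rotate_2d(a, b, degrees):
    t = math.radians(degrees)
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))


def rot_x(p, degrees):
    x, y, z = p
    y, z = rotate_2d(y, z, degrees)
    return (x, y, z)


def rot_y(p, degrees):
    x, y, z = p
    z, x = rotate_2d(z, x, degrees)
    return (x, y, z)


def rot_z(p, degrees):
    x, y, z = p
    x, y = rotate_2d(x, y, degrees)
    return (x, y, z)


def orient(vertex, roll, pitch, yaw):
    """Extrinsic roll (X), then pitch (Z), then yaw (Y) for a nose-along-X model."""
    vertex = rot_x(vertex, roll)
    vertex = rot_z(vertex, pitch)
    vertex = rot_y(vertex, yaw)
    return vertex


nose = (1.0, 0.0, 0.0)
print(orient(nose, 30, 0, 0))   # rolling leaves the nose where it was
print(orient(nose, 0, 90, 0))   # pitching up 90 degrees points the nose along Y
print(orient(nose, 0, 0, 90))   # a positive 90 degree yaw points the nose along -Z
```

To orient a whole model, you would apply orient with the same three angles to every vertex and then translate the result into place.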
We already know how to rotate objects around the coordinate axes, but what if we wish to rotate around some other line that also runs through the origin? The trick for such cases is similar to what we did to rotate in 2 dimensions around an arbitrary point. We’ll change our frame of reference to change the problem into one we’ve already solved. This means we first apply the rotations which make the line overlap any one of the three coordinate axes, it doesn’t matter which. Assuming we rotate to make the line overlap the X axis, we can then perform the rotation around the line by rotating around the X axis, something we already know how to do. Having done that, we simply undo the rotations that changed the frame of reference.
The only question now is how exactly to rotate the line onto one of the axes. Let’s say we choose to rotate onto the X-axis. We have a few options:
We could first rotate the line to the ZX plane by rotating around the Z axis, or we could do the same by rotating around the X axis. Both work. Once the line is on the ZX plane, we rotate it around the Y-axis until it also lies on the XY plane, which puts it on the X-axis, the line where those two planes meet. In this process, the important thing to remember is that rotating around Z or X may change the line’s angle to the X-axis in the ZX plane, so you should figure the second rotation angle only after the first rotation, not before.
We have two more options: we can rotate to the XY plane first, by rotating around either X or Y, and then we would rotate to the ZX plane by rotating around Z.
So note that we can rotate the line to overlap the X-axis in four possible combinations of two axis rotations: either Z then Y, or X then Y, or X then Z, or Y then Z. As you might imagine then, if we instead decided to rotate the line to the Y axis, we could do so with four more possible combinations of two axis rotations, and likewise if we decided to rotate the line to the Z axis, we could do so with four possible combinations of two axis rotations. Just remember that for all of these options, the first rotation changes the angle of rotation needed for the second rotation, so we must perform the first rotation before figuring the second rotation angle.
In any case, whichever axis and rotations we choose, we can then rotate around that axis to rotate around the line. Then we simply apply the same rotations in reverse order. So say, if we rotated the line to the X-axis by rotating 20 degrees around Z and then 30 degrees around Y, we would undo those rotations by rotating -30 degrees around Y and then -20 degrees around Z. Note that we reverse both the order of the rotations and their directions.
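A quick numeric check of that last point, using the 20 and 30 degree example. This is just a sketch with my own helper names, assuming the right-hand conventions from earlier.

```python
import math


def rotate_2d(a, b, degrees):
    t = math.radians(degrees)
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))


def rot_y(p, degrees):
    x, y, z = p
    z, x = rotate_2d(z, x, degrees)
    return (x, y, z)


def rot_z(p, degrees):
    x, y, z = p
    x, y = rotate_2d(x, y, degrees)
    return (x, y, z)


def to_frame(p):
    """20 degrees around Z, then 30 degrees around Y."""
    return rot_y(rot_z(p, 20), 30)


def from_frame(p):
    """Undo: -30 degrees around Y first, then -20 degrees around Z."""
    return rot_z(rot_y(p, -30), -20)


def wrong_undo(p):
    """Negated angles but in the original order, which does not undo anything."""
    return rot_y(rot_z(p, -20), -30)


start = (900.0, 500.0, 0.0)
print(from_frame(to_frame(start)))  # back at the start, up to rounding
print(wrong_undo(to_frame(start)))  # somewhere else entirely
```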
Looking now at an actual example line running through the origin, how exactly do we rotate this line to lie on the x-axis? Well first note that a rotation around the z-axis is a rotation in the XY plane, and the slope of a vector in that plane is y of v over x of v (v here standing for vector). Likewise, rotation around the x-axis is a rotation in the YZ plane, and the slope of a vector in the YZ plane is its z over its y. And lastly, rotation around the y-axis is a rotation in the ZX plane, and the slope of a vector in the ZX plane is its x over its z. Notated more formally, the slope of a line in the xy plane we would call m of xy, and it equals y of v over x of v. Likewise m of yz equals z of v over y of v, and m of zx equals x of v over z of v.
Once we have our slopes, we can then figure the angles of those slopes by getting their arctangents. So theta of z, the angle of the vector in the XY plane, is arctangent of m of xy. Likewise, theta of x, the angle of the vector in the YZ plane, is arctangent of m of yz, and theta of y, the angle of the vector in the ZX plane, is arctangent of m of zx.
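In code, I would hedge a little on taking the arctangent of the slope directly, since the slope divides by zero when the denominator coordinate is zero and a plain arctangent can't tell opposite directions apart. The sketch below swaps in Python's math.atan2, which takes the numerator and denominator separately; the rise-over-run pairs are the same ones described above. The function name is made up.

```python
import math


def plane_angles(v):
    """Angles of vector v, in degrees, in the YZ, ZX, and XY planes.

    Returns (theta_x, theta_y, theta_z), named after the axis you would
    rotate around to change each angle.
    """
    x, y, z = v
    theta_z = math.degrees(math.atan2(y, x))  # XY plane: slope y over x
    theta_x = math.degrees(math.atan2(z, y))  # YZ plane: slope z over y
    theta_y = math.degrees(math.atan2(x, z))  # ZX plane: slope x over z
    return (theta_x, theta_y, theta_z)


print(plane_angles((1.0, 1.0, 0.0)))
print(plane_angles((0.0, 1.0, 1.0)))
```

For a vector like (1, 1, 0), the XY angle comes out at 45 degrees, as you would expect from a slope of 1.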
So now that we can find the angle between a vector and the coordinate planes, we can rotate the vector to line up on an axis. Again, just remember that the first rotation changes the angle we apply in the second rotation.
Here’s the process in more detail:
If we wish to rotate theta degrees around the line, and assuming we’ll use the X axis as our temporary frame of reference, then we rotate to that frame of reference, first by finding theta of z, the angle of the line in the XY plane, then rotating around Z by negative theta of z, then finding theta of y, the new angle of the line in the ZX plane, then rotating around Y by negative theta of y. Now we’re in a frame of reference where the line overlaps the x axis, so we can rotate around the line by rotating around the x-axis. Now we have to shift the frame of reference back, applying the same rotations from steps 1 and 2, but reversing their order and directions. So we first rotate around Y by positive theta of y, and then rotate around Z by positive theta of z. Be clear that these are the same theta of z and theta of y we found in steps 1 and 2, respectively.
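The steps above can be sketched as follows. A few assumptions of mine to flag: the line is given as a direction vector from the origin, angles come from math.atan2 rather than an arctangent of a slope, and theta of y is measured relative to the X axis in the positive direction of rotation around Y, which is what makes rotating by negative theta of y land the line on the X axis. The function names are hypothetical.

```python
import math


def rotate_2d(a, b, degrees):
    t = math.radians(degrees)
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))


def rot_x(p, degrees):
    x, y, z = p
    y, z = rotate_2d(y, z, degrees)
    return (x, y, z)


def rot_y(p, degrees):
    x, y, z = p
    z, x = rotate_2d(z, x, degrees)
    return (x, y, z)


def rot_z(p, degrees):
    x, y, z = p
    x, y = rotate_2d(x, y, degrees)
    return (x, y, z)


def rotate_about_line(p, direction, degrees):
    """Rotate p around the line through the origin that points along direction."""
    # Step 1: angle of the line in the XY plane, measured from the X axis.
    theta_z = math.degrees(math.atan2(direction[1], direction[0]))
    d = rot_z(direction, -theta_z)  # the line now lies in the ZX plane
    # Step 2: angle of the rotated line relative to the X axis, measured
    # in the positive direction of rotation around Y (from Z toward X).
    theta_y = math.degrees(math.atan2(-d[2], d[0]))
    # Move the point into the frame where the line lies along X.
    p = rot_z(p, -theta_z)
    p = rot_y(p, -theta_y)
    # Rotate around the line, which is now the X axis.
    p = rot_x(p, degrees)
    # Undo the frame change: reverse order, opposite directions.
    p = rot_y(p, theta_y)
    p = rot_z(p, theta_z)
    return p


# 120 degrees around the diagonal (1, 1, 1) should carry the X axis onto Y.
print(rotate_about_line((1.0, 0.0, 0.0), (1.0, 1.0, 1.0), 120))
```

Note that theta of y is computed from the direction vector after the first rotation, not before, for the reason already given.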
So that’s how we can rotate around an arbitrary axis. Again, using the x-axis as our temporary frame of reference is an arbitrary choice: we just as well could have used the Y or Z axes. Also recall that, for whichever axis we choose, we actually have four possible pairs of rotations around the axes that will make the line overlap the chosen axis. This gives us a total of 12 possible combinations of rotations around X, Y, and Z that allow us to rotate around an arbitrary axis through the origin.
Not coincidentally, these are the same 12 possible combinations which allow us to rotate an object from one orientation to any other. To see why this isn’t a coincidence, imagine we have an object oriented on some arbitrary line through the origin, like this airplane here. We know we can rotate the line, and thus the airplane, to point down one of the axes, let’s say the X axis, in two rotations, and then we can roll the airplane around the x axis so that it faces right-side-up. So no matter which way the line originally points, and no matter how the airplane starts rolled around that line, we can use three rotations to get the airplane facing down the x-axis with its right side up. In other words, we know we can get the airplane from any orientation into one particular orientation using the same 12 combinations of rotations around X, Y, and Z which we used to rotate objects around an arbitrary line through the origin. So if we can get from any orientation to one particular orientation in three rotations, then logically we can go the other way just by reversing the moves: we can get from one particular orientation to any other orientation in three rotations.
The next interesting observation from all this is an informal statement of Euler’s rotation theorem. A bastardized, informal version of the proof might go something like this:
Given that we can rotate an object from a given orientation to any other orientation with certain combinations of three (or fewer) rotations around the cardinal axes…
And given that we can rotate around any arbitrary axis with the same combinations of three (or fewer) rotations around the cardinal axes…
Therefore, we can rotate an object from a given orientation to any other orientation by rotating around some arbitrary axis, that is, we can get to any orientation in one rotation. Moreover, multiple rotations around different axes which all intersect at the same point can be combined into one rotation. This last part will be quite significant when we revisit rotations using matrices.
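To make that concrete, here is a sketch that takes three rotations around X, Y, and Z and recovers a single axis and angle that do the same job. It leans on two things not covered in this video, so treat it as an aside: a standard result for reading the axis and angle off the images of the three unit axes, and Rodrigues' formula for the single rotation. It also assumes the combined angle is not 0 or 180 degrees. All names are my own.

```python
import math


def rotate_2d(a, b, degrees):
    t = math.radians(degrees)
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))


def compose_xyz(p, ax, ay, az):
    """Extrinsic rotations around X, then Y, then Z (right-hand conventions)."""
    x, y, z = p
    y, z = rotate_2d(y, z, ax)
    z, x = rotate_2d(z, x, ay)
    x, y = rotate_2d(x, y, az)
    return (x, y, z)


def equivalent_axis_angle(ax, ay, az):
    """Single (unit axis, angle in degrees) matching compose_xyz."""
    cx = compose_xyz((1.0, 0.0, 0.0), ax, ay, az)  # where the X axis ends up
    cy = compose_xyz((0.0, 1.0, 0.0), ax, ay, az)  # where the Y axis ends up
    cz = compose_xyz((0.0, 0.0, 1.0), ax, ay, az)  # where the Z axis ends up
    cos_angle = (cx[0] + cy[1] + cz[2] - 1) / 2
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    axis = (cy[2] - cz[1], cz[0] - cx[2], cx[1] - cy[0])
    length = math.sqrt(sum(c * c for c in axis))  # zero only at 0 or 180 degrees
    return tuple(c / length for c in axis), angle


def rotate_about(v, k, degrees):
    """Rodrigues' formula for rotating v around the unit axis k."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    k_cross_v = (k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0])
    k_dot_v = k[0] * v[0] + k[1] * v[1] + k[2] * v[2]
    return tuple(v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1 - c)
                 for i in range(3))


axis, angle = equivalent_axis_angle(30, 20, 70)
point = (900.0, 500.0, 0.0)
print(compose_xyz(point, 30, 20, 70))    # three rotations
print(rotate_about(point, axis, angle))  # one rotation, same destination
```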
For completeness’ sake, here are the 12 possible combinations of three rotations around X, Y, and Z that achieve any rotation around the origin. For reasons I won’t get into, the combinations are all known as Euler angles, after the 18th century mathematician, but a distinction is sometimes made between the ‘proper’ or ‘classic’ Euler angles and Tait-Bryan angles. Notice that the proper Euler angles all start and end with a rotation around the same axis. Remember much earlier when I said that we surprisingly can achieve any rotation using just rotations around two of the three cardinal axes? Well now you should understand how this is possible. For instance, the combination X-Y-X works because, as we discussed, a rotation around x and then around y will get any line through the origin to overlap the x axis, and a subsequent rotation around x will then effectively rotate around the line. This means that, using the combination X-Y-X, we can rotate an object from any orientation to one particular orientation, so logically we can do the reverse with X-Y-X, rotate from one particular orientation to any other orientation.
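If you'd like to generate that list rather than memorize it, here is a short sketch. The rule it encodes is simply that no axis may be used twice in a row; the variable names are my own.

```python
import itertools

# Every three-letter sequence of axes in which neighbors differ.
sequences = ["".join(s) for s in itertools.product("XYZ", repeat=3)
             if s[0] != s[1] and s[1] != s[2]]

# Proper Euler angles start and end on the same axis.
proper_euler = [s for s in sequences if s[0] == s[2]]

# Tait-Bryan angles use all three axes once each.
tait_bryan = [s for s in sequences if s[0] != s[2]]

print(len(sequences), "sequences in total")
print("proper Euler:", proper_euler)
print("Tait-Bryan:", tait_bryan)
```

This also shows why something like X-X-Z is excluded: two consecutive rotations around the same axis collapse into one, leaving only two effective rotations.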
Just be very clear that while any of these combinations of rotation orders can be used to achieve any orientation, different rotation orders produce different results for the same angles. For example, if I rotate 30 degrees around X, then 50 degrees around Y, and then 20 degrees around Z, I get a different orientation than if I reverse the order, first rotating 20 degrees around Z, then 50 degrees around Y, then 30 degrees around X. The same angles in a different order equal a different result.
Also recall that an order of extrinsic rotations is equivalent to the reverse order of intrinsic rotations. So we actually have 12 more combinations of rotations, which are the reverse order of our first 12 with intrinsic rotations.
Now that we can rotate around an arbitrary axis running through the origin, what about rotations around axes not running through the origin? Well the solution is very simple, and similar to the trick we used in 2D to rotate around pivots other than the origin. Once again, we temporarily change the frame of reference, this time translating the line so that it does pass through the origin before performing our rotation and then translating back. We first select any point on the line, it doesn’t matter which; we then find the vector from that point to the origin, and this vector is the translation we perform to temporarily change our frame of reference.
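Here is a sketch of the translate, rotate, translate-back idea. To keep it short, the rotation around the line through the origin is done with Rodrigues' formula, a standard formula that stands in for the axis-by-axis procedure described earlier. The function names and the way the line is specified (one point on it plus a direction) are my assumptions.

```python
import math


def rotate_about(v, k, degrees):
    """Rodrigues' formula: rotate v around the unit axis k through the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    k_cross_v = (k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0])
    k_dot_v = k[0] * v[0] + k[1] * v[1] + k[2] * v[2]
    return tuple(v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1 - c)
                 for i in range(3))


def rotate_about_offset_line(p, point_on_line, direction, degrees):
    """Rotate p around a line that need not pass through the origin."""
    length = math.sqrt(sum(c * c for c in direction))
    axis = tuple(c / length for c in direction)
    # Translate so the chosen point on the line sits at the origin.
    shifted = tuple(p[i] - point_on_line[i] for i in range(3))
    # Rotate around the line, which now runs through the origin.
    rotated = rotate_about(shifted, axis, degrees)
    # Translate back.
    return tuple(rotated[i] + point_on_line[i] for i in range(3))


# A vertical line through (1, 0, 0): which point on the line we pick doesn't matter.
print(rotate_about_offset_line((2.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 90))
print(rotate_about_offset_line((2.0, 0.0, 0.0), (1.0, 0.0, 5.0), (0.0, 0.0, 2.0), 90))
```

Both calls describe the same line, so both should land on the same point.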
One problem with using Euler angles as a system of describing object orientations is that Euler angles are susceptible to a phenomenon called “Gimbal lock”. A Gimbal is a device of interlocking rings, with the outer ring hinged to rotate on an axis, and each inner ring hinged to rotate around the diameter of the ring immediately surrounding it. With a Gimbal of three rings, the inner ring can effectively rotate in any direction from any starting orientation, except in the special case of Gimbal lock. Gimbal lock occurs when any two of the rings align, such that rotations of either aligned ring produce the same rotation on the inner ring. In a sense, a degree of freedom is lost (though as Wikipedia points out, the term “lock” is perhaps misleading because all three rings are still free to rotate; nothing is “locked” in place).
So how do the problems with gimbals relate to our rotations? Well notice that a gimbal of three rings is just like a system of intrinsic Euler rotations: the rings can start out perpendicular to each other just like 3 axes, and rotations of the outer ring change the orientation of the two inner rings, and rotations of the middle ring change the orientation of the innermost ring. So a gimbal is actually a physical manifestation of intrinsic Euler angles. It follows then that Euler angles share the same problems. Because of scenarios like gimbal lock where two intrinsic rotation axes align, some orientations can be described in Euler angles with multiple angle values (even within a single order of rotations). So for one example, in the extrinsic rotation order X Y Z, the X Y Z angles (45, -90, -45) produce the same result as the X Y Z angles (0, -90, 0). In fact, for any set of angles where the y angle equals -90 in this extrinsic rotation order, the rotations around X and Z end up acting around the same axis: a rotation of n degrees around X has the same effect as a rotation of n degrees around Z, so n degrees around X is cancelled out by -n degrees around Z. So for this rotation order X Y Z, any two sets of angles where the y angle equals -90 and where the sums of the X and Z angles are equal will produce the same rotation. The problem with all this redundancy is that it introduces yet more arbitrary choices for us to make, on top of the choice of which of the 12 rotation orders to use; the sum effect is that Euler angles burden their users with extraneous decisions, not to mention potential confusion.
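You can check the redundancy numerically. This sketch assumes extrinsic X, then Y, then Z rotations under the right-hand conventions used throughout; the function names are made up.

```python
import math


def rotate_2d(a, b, degrees):
    t = math.radians(degrees)
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))


def euler_xyz(p, ax, ay, az):
    """Extrinsic rotations around X, then Y, then Z."""
    x, y, z = p
    y, z = rotate_2d(y, z, ax)  # around X
    z, x = rotate_2d(z, x, ay)  # around Y
    x, y = rotate_2d(x, y, az)  # around Z
    return (x, y, z)


point = (900.0, 500.0, 0.0)

# Two different sets of angles, same resulting position.
print(euler_xyz(point, 45, -90, -45))
print(euler_xyz(point, 0, -90, 0))

# With Y at -90, only the sum of the X and Z angles matters.
print(euler_xyz(point, 10, -90, 20))
print(euler_xyz(point, 30, -90, 0))
```

Each pair of lines should print the same point up to floating-point rounding.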
Another troublesome aspect of Euler angles arises in animations. We can animate a rotation frame-by-frame and so describe any motion we want with Euler angles. Most commonly, though, computer animations are specified by an animator using keyframes, such that the object positions for frames in between get interpolated automatically. So imagine an animation system that uses Euler angles and interpolates between keyframes by simply interpolating between the respective angles. If our first keyframe has an object rotated at x, y, z angles (0, 30, 50) and our second keyframe has the object rotated at x, y, z angles (100, 70, 40), then for each frame interpolated between those positions, the X angle will smoothly change from 0 to 100, the y angle from 30 to 70, and the z angle from 50 to 40. While the resulting animation would appear smooth, the nose of the object would appear to travel in a bent arc along the sphere of rotation. So Euler angles have a tendency to frustrate animators because they often produce unintuitive animations. One fix for this is to have the animation system do a more sophisticated form of interpolation such that the rate of change between angles differs for each coordinate. The other fix is to use an alternative system of describing rotations, called quaternions.
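Here is a sketch of the naive keyframe interpolation just described. The two keyframes are the ones from the example; the choice of extrinsic X, Y, Z order, the nose pointing down X, and the ten in-between steps are all my assumptions for illustration.

```python
import math


def rotate_2d(a, b, degrees):
    t = math.radians(degrees)
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))


def euler_xyz(p, ax, ay, az):
    """Extrinsic rotations around X, then Y, then Z."""
    x, y, z = p
    y, z = rotate_2d(y, z, ax)
    z, x = rotate_2d(z, x, ay)
    x, y = rotate_2d(x, y, az)
    return (x, y, z)


def lerp_angles(start, end, t):
    """Interpolate each angle independently; t runs from 0 to 1."""
    return tuple(a + (b - a) * t for a, b in zip(start, end))


key_a = (0, 30, 50)
key_b = (100, 70, 40)

# Where the nose (assumed to start along X) points at each in-between frame.
path = []
for frame in range(11):
    angles = lerp_angles(key_a, key_b, frame / 10)
    path.append(euler_xyz((1.0, 0.0, 0.0), *angles))

for nose in path:
    print(tuple(round(c, 3) for c in nose))
```

Plotting those nose positions on a sphere is a good way to see the bent path for yourself.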
Unfortunately, quaternions are well beyond our scope here, as they involve complex numbers in a way that goes beyond high school algebra. They were first described by the 19th-century Irish mathematician William Rowan Hamilton, who famously, in a moment of inspiration, devised the key formula i² = j² = k² = ijk = -1 while taking a walk and then immediately carved the formula into a stone of the bridge he stood on. While a bit tricky to grasp, quaternions don’t suffer the same drawbacks as Euler angles and so are commonly used in 3D animation. I’d like to cover them in a later video, but I’m not well qualified to do so, as I barely understand them myself, so I can’t make any promises. Perhaps I’ll later just direct you to other resources that cover quaternions well.