Nuclear Nutcracker

3D from Scratch - Day 2

2016-12-10T00:00:00Z

This is my attempt to figure out enough 3D graphics from scratch to clone the original Elite. The start of the series is here: 3D from Scratch - Intro.

Weird Cube Bug

When I presented Day 1 at our work Breakfast Club this week, the first thing someone did was grab the controls and instantly find a bug. 😠

Here it is: if you move the cube directly towards the camera, at some point you see strange lines that cross the screen. And then if you keep moving the cube back, it reappears, moving forward again!

So first I removed all but one edge of the cube, to get a clearer idea what’s happening:

And it seems that the points are transposed to the position that they would be in if I turned around, and turned upside down.

I think the lines that unaccountably cross the screen are just an artefact of this – when one point is in normal view and one point is in upside-down-reverse view it still tries to draw a line between them.

That means I can remove all but one point to debug it:

My normal debugging strategy at this point is to add a shit-ton of logging to everything, but first let’s just look at the point drawing code and think about what it might be:

function drawPoint3d(ctx, p) {
  var x = Math.round(p.x * (screen_dist / p.z))
  var y = Math.round(p.y * (screen_dist / p.z))
  setPixel(ctx, x + PIXEL_WIDTH/2, y + PIXEL_HEIGHT/2)
}

First thing to notice is that p.z will turn negative when the point comes towards the camera and then passes it. That flips the sign of x and y too. The x value then effectively gets subtracted from PIXEL_WIDTH/2 rather than added to it, which is why it appears reversed! Solved!
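To make the sign flip concrete, here is a minimal sketch of the same projection arithmetic on its own. The SCREEN_DIST value of 100 and the projectX helper are assumptions for illustration; neither is taken from the demo code.

```javascript
// Hypothetical stand-in for screen_dist (the real value isn't shown in this post)
var SCREEN_DIST = 100

// Same arithmetic as the x line in drawPoint3d, pulled out into a helper
function projectX(x, z) {
  return Math.round(x * (SCREEN_DIST / z))
}

var inFront = projectX(20, 50)   // point 50 units in front of the camera
var behind  = projectX(20, -50)  // same point once it has passed the camera

console.log(inFront, behind)  // 40 -40
```

The same x coordinate lands on opposite sides of the screen centre depending on the sign of z, which is the mirroring seen in the demo.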

So the obvious thing to do is to add a guard that just bails on drawing the point if p.z is negative. That would probably work here, but in general it is quite possible to have one end of a line be behind the camera, and the other end in front of it, and still want to draw the line.

Also, the criterion for whether a point is drawn or not is not actually whether it’s behind the camera, but whether it’s inside the camera’s view of the screen, and a point leaves that field of view well before p.z turns negative.

So to do this right I’ll have to figure out what appears in the field of view of the camera.

This has turned into a larger task than I was expecting.

Task: Don’t display things not in the camera’s field of view

I knew this was going to be an issue eventually, so might as well get right into it. This was a bit tricky to figure out, I had to break out the old Maths textbooks and everything.

How to decide whether the camera can see something? Here are two diagrams I drew while thinking about this.

This one depicts a side view of the camera and screen, with two lines that do or do not appear in view:

This one is my attempt to draw the camera, screen and field of view in 3d, again with one line that is not visible and one that is, partially:

After half an hour of staring at these diagrams, here are the observations I made:

the field of view of the camera is a pyramid, with the apex centred on the camera. This space is defined as the interior of the shape formed by 4 planes, each of which contains the camera and two (different) corners of the screen.
a point is visible if it is on the inside of all four planes
a line with both points inside the field of view is entirely visible
a line with both points outside the field of view may still be partially visible if it intersects the field of view (the second line in both diagrams)
it’s fairly easy to tell which side of a plane a point lies on, using Maths… so for any point you can tell if it is inside the field of view by checking it against the four planes and seeing if it is inside all of them.

Now practically you could exclude from drawing any line that doesn’t have both ends inside the field of view. This is tempting as most things will be composed of many fairly short lines, and it is not common to fly right into things…

But this is not good enough in the long run, as there are real world cases where a line might cross the vision and you still want to draw it… for instance you might be docking with a space station and only be able to see part of the docking port:

Subtask: don’t draw a line with either end offscreen

However, it’s a good first step, so that’s what I’m going to focus on first. (Only drawing a line if both ends are inside the field of view.)

Here’s where it gets mathsy. I had to look a bunch of this up but it’s all coming back to me now. Maths concepts, with code:

A plane. A plane can be defined as a single point (which is in that plane) and a single “normal” vector, which is perpendicular to the plane. There are infinite points and vectors which work for this, so we will pick ones that make things easy for us.

Dot product. This is the sort-of multiplication of two vectors to give a single number. It shows sort-of “how much in the same direction they are but multiplied”. Vectors at right angles have a zero dot product, and as will become important in a moment, if two vectors are in opposite directions the dot product is negative. It’s written as n•v if n and v are vectors.

// u and v are vectors with x,y,z components
function dot(u, v) {
  return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]
}
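As a quick sanity check of those claims about sign, here is a small sketch using three unit vectors. The vector names are made up for the example.

```javascript
// u and v are vectors with x,y,z components
function dot(u, v) {
  return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]
}

var right = [ 1, 0, 0]
var up    = [ 0, 1, 0]
var left  = [-1, 0, 0]

var same     = dot(right, right)  // same direction
var atRight  = dot(right, up)     // at right angles
var opposite = dot(right, left)   // opposite directions

console.log(same, atRight, opposite)  // 1 0 -1
```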

Cross product. Taking the cross product of two vectors gives you another vector that is at right angles to the plane formed by them.

It’s defined like this, don’t ask me why:

function cross(u, v) {
  return [
    u[1]*v[2] - u[2]*v[1], 
    u[2]*v[0] - u[0]*v[2],
    u[0]*v[1] - u[1]*v[0],
  ]
}
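A short sketch to check the two properties that matter later: the result is at right angles to both inputs, and swapping the arguments flips its direction. The sample vectors are arbitrary.

```javascript
function dot(u, v) {
  return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]
}

function cross(u, v) {
  return [
    u[1]*v[2] - u[2]*v[1],
    u[2]*v[0] - u[0]*v[2],
    u[0]*v[1] - u[1]*v[0],
  ]
}

var u = [1, 2, 3]
var v = [4, 5, 6]

var w = cross(u, v)        // [-3, 6, -3]
var flipped = cross(v, u)  // [3, -6, 3], same line, opposite direction

// Both dot products are zero, so w is perpendicular to u and to v
console.log(dot(w, u), dot(w, v))  // 0 0
```

The argument order is what decides whether a plane normal points inwards or outwards.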

Now here’s the plan. We’re going to:

find normal vectors for the four planes that define the field of view, and in particular, they’re going to be normal vectors that point “inwards” towards the field of view, not ones that point “outwards”.
each normal vector will be calculated by taking the cross product of two vectors that we know are in the plane. In this case, that’s the two vectors from the camera to the corners of the screen.
for any point we care about, check which side of each of the four planes it is on.

This is pictured in top-down view here (with only the two side planes):

How do we tell which side of the planes the point is on? Well, this is what the dot product is for. Look at this next diagram. p and q are the normal vectors of the two planes. The vector a is to a point that is in view, and the vector b is to a point that is not.

The dot product of a with p will be positive, as they are both pointing inwards from the left plane. Same for q. So since the dot products are both positive, we know the point at a is in view.

The dot product of b with p will also be positive BUT the dot product of b with q is negative, as b and q are pointing different ways away from the plane on the right.

So you can see, if we take the dot products of the point with each of the four normals, we need them all to be positive to know that the point is in view.

Let’s code that up. Here are the coordinates of the four corners of the screen:

// clockwise from bottom right
var screen_coords = [
  [ PIXEL_WIDTH/2,  PIXEL_HEIGHT/2, screen_dist],
  [-PIXEL_WIDTH/2,  PIXEL_HEIGHT/2, screen_dist], // bottom left
  [-PIXEL_WIDTH/2, -PIXEL_HEIGHT/2, screen_dist], // top left
  [ PIXEL_WIDTH/2, -PIXEL_HEIGHT/2, screen_dist], // top right
]

And here are the inward-pointing normals. This is done by taking the cross product of two vectors in each plane to get a new vector at right angles to both of them, and because the camera is at (0,0,0) the points at the corners represent vectors straightaway. (I actually did these in the opposite order first, and that produced outward-pointing normals – once I noticed it was the wrong way round I just flipped the order):

var view_plane_normals = [
  cross(screen_coords[0], screen_coords[1]), // bottom plane
  cross(screen_coords[1], screen_coords[2]), // left plane
  cross(screen_coords[2], screen_coords[3]), // top plane
  cross(screen_coords[3], screen_coords[0]), // right plane
]

Now the code that actually checks if a point is in view. All it does is check that the dot product of the point with each normal is positive, as we discussed before:

function isPointInView(p) {
  for (var i = 0; i < view_plane_normals.length; i++)
    if (dot([p.x, p.y, p.z], view_plane_normals[i]) < 0)
      return false
  return true
}
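Here is a self-contained sketch that puts the pieces together with concrete numbers, so the "inward-pointing" claim can be checked directly. SCREEN_DIST = 100 is an assumed value, and the names corners and normals are stand-ins for screen_coords and view_plane_normals.

```javascript
var PIXEL_WIDTH  = 160
var PIXEL_HEIGHT = 120
var SCREEN_DIST  = 100  // assumed; the real screen_dist isn't shown in this post

function dot(u, v) {
  return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]
}

function cross(u, v) {
  return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]
}

// clockwise from bottom right, as in the post
var corners = [
  [ PIXEL_WIDTH/2,  PIXEL_HEIGHT/2, SCREEN_DIST],
  [-PIXEL_WIDTH/2,  PIXEL_HEIGHT/2, SCREEN_DIST],
  [-PIXEL_WIDTH/2, -PIXEL_HEIGHT/2, SCREEN_DIST],
  [ PIXEL_WIDTH/2, -PIXEL_HEIGHT/2, SCREEN_DIST],
]

var normals = [
  cross(corners[0], corners[1]), // bottom plane: [0, -16000, 9600]
  cross(corners[1], corners[2]), // left plane:   [12000, 0, 9600]
  cross(corners[2], corners[3]), // top plane:    [0, 16000, 9600]
  cross(corners[3], corners[0]), // right plane:  [-12000, 0, 9600]
]

function isPointInView(p) {
  return normals.every(function (n) {
    return dot([p.x, p.y, p.z], n) >= 0
  })
}

var centreVisible = isPointInView({x: 0,    y: 0, z: SCREEN_DIST})   // true
var behindVisible = isPointInView({x: 0,    y: 0, z: -SCREEN_DIST})  // false
var farLeftVisible = isPointInView({x: -500, y: 0, z: SCREEN_DIST})  // false
```

The centre of the screen passes all four tests, which confirms the normals point inwards; a point behind the camera or well off to the left fails at least one.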

And now we can adjust drawFrame to only draw a line when both of its points are in view:

  ...
  if (isPointInView(newP1) && isPointInView(newP2))
    drawLine3d(ctx, newP1, newP2)
  ...

And run it:

And that looks much better! Our weird bug is fixed, and you can see it is correctly removing any line with an end offscreen.

Subtask: Draw the part of the line that appears onscreen

However, you can now see that we are having lines disappear when they should still be partially visible. This means that when we have a line that has one or both ends offscreen, we should figure out exactly what part of it is visible and draw that.

It’s clear that we’re going to need a way to compute the intersection of a line with a plane, to be able to figure out exactly the point on the edge of the view to draw the lines from and to. For instance, in this diagram the line goes across the view so we need to know the two intersection points with the left and right side of the view:

So first let’s figure out how to do that. /Daunting

Every so often in this series, I’m going to say something like “and then I sat and stared into space for three solid hours”. This is one of those times.

Here’s what I came up with.

Representation of the line. What we have is two points p = (a,b,c) and q = (d,e,f) that are the start and the end of the line (these are the corners of the cube). The line runs from one point to the other. So the representation of the line is as the three equations:

x = a + t(d-a),
y = b + t(e-b),
z = c + t(f-c).

This is called the parametric representation. t runs from zero to one and is “the proportion we are along from one end to the other”. To see that these are the right equations, notice how when t = 0, x, y and z are just a, b and c, and when t = 1 they are d, e, f. So the ends are right. And since the equations are linear they describe a straight line, no curving going on. So if the ends are right and the line is straight these must be the right equations.
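The three equations translate almost directly into code. This is a minimal sketch; pointOnLine is a hypothetical helper name, not something from the demo.

```javascript
// Returns the point a proportion t of the way from p to q
function pointOnLine(p, q, t) {
  return {
    x: p.x + t*(q.x - p.x),
    y: p.y + t*(q.y - p.y),
    z: p.z + t*(q.z - p.z),
  }
}

var p = {x: 0,  y: 0,  z: 0}
var q = {x: 10, y: 20, z: 30}

var start  = pointOnLine(p, q, 0)    // same as p
var middle = pointOnLine(p, q, 0.5)  // {x: 5, y: 10, z: 15}
var end    = pointOnLine(p, q, 1)    // same as q
```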

Representation of the plane. Now, there are three ways to represent a plane. One is the normal vector and point representation we already discussed. One is a parametric representation, but using three points in the plane and two variables instead of one. And one is the equation of the plane: ax + by + cz = d

Now I could see how to find the intersection point by plugging the equations of the parametric representation of the line into the equation of the plane.

However, we don’t actually have the equation of the plane. What we have is the parametric representation. And blowed if I could figure out how to go from one to the other.

I’m afraid I had to look this up (I’m allowed to look up maths, though I’m trying not to). Once I had, it seems so obvious and I’m a bit annoyed at myself for not figuring this out.

We actually have two representations of the plane already, the parametric representation and the normal & point representation. And going from the normal & point representation to the equation of the plane is really easy.

What does every vector in the plane have in common? They are all at right angles to the plane’s normal vector. And we said earlier that the dot product of two vectors at right angles is zero. Therefore, if v is a vector in the plane, and n is the plane normal, n•v = 0.

From this we can get the equation of the plane. Let’s pick a point in the plane p = (a, b, c). For all points q = (x, y, z), the vector q - p is in the plane if and only if n •(q-p) = 0. If n=(n,m,o) then by the definition of the dot product this gives us an equation of the plane n(x-a) + m(y-b) + o(z-c) = 0

This is even easier in our case, because the camera is in all our planes, and the camera is at (0,0,0), we can just use that point as our p and the equation is just nx + my + oz = 0.

So now we can work out a formula for t that we can use to give us the intersection point of a line and a plane, by substituting the parametric representation of the line into the equation of the plane. Here’s my derivation, with different constant names:
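The derivation image isn’t reproduced here, so what follows is a reconstruction in vector form rather than the original working. It assumes p is the start of the line, v = q - p is its direction, n is the plane normal, and the plane passes through (0,0,0):

```latex
\begin{aligned}
\mathbf{n} \cdot (\mathbf{p} + t\,\mathbf{v}) &= 0 \\
\mathbf{n} \cdot \mathbf{p} + t\,(\mathbf{n} \cdot \mathbf{v}) &= 0 \\
t &= -\frac{\mathbf{n} \cdot \mathbf{p}}{\mathbf{n} \cdot \mathbf{v}}
\end{aligned}
```

If n•v is zero the line runs parallel to the plane and there is no single intersection point.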

And here’s the code for it:

// takes two points that define a line and a plane normal
// and returns where the line intersects the plane
// (assumes (0,0,0) is in the plane)
function linePlaneIntersection(p, q, n) {
  var v = [q.x - p.x, q.y - p.y, q.z - p.z]
  var denom = n[0]*v[0] + n[1]*v[1] + n[2]*v[2]
  // line is parallel to the plane: no single intersection point
  if (denom == 0)
    return null
  var t = -1*(n[0]*p.x + n[1]*p.y + n[2]*p.z) / denom
  // intersection lies outside the segment between p and q
  if (t < 0 || t > 1)
    return null
  return {x: p.x + t*v[0], y: p.y + t*v[1], z: p.z + t*v[2]}
}
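A worked example helps to check the formula. The sketch below carries its own compact copy of the intersection logic so it runs standalone; the explicit parallel-line guard is an addition for this sketch. The normal [12000, 0, 9600] is what the left plane works out to for a 160x120 screen at an assumed screen_dist of 100.

```javascript
function linePlaneIntersection(p, q, n) {
  var v = [q.x - p.x, q.y - p.y, q.z - p.z]
  var denom = n[0]*v[0] + n[1]*v[1] + n[2]*v[2]
  if (denom == 0)  // parallel to the plane (guard added for this sketch)
    return null
  var t = -1*(n[0]*p.x + n[1]*p.y + n[2]*p.z) / denom
  if (t < 0 || t > 1)
    return null
  return {x: p.x + t*v[0], y: p.y + t*v[1], z: p.z + t*v[2]}
}

// Left view plane for a 160x120 screen, assuming screen_dist = 100
var leftNormal = [12000, 0, 9600]

// A line that starts off to the left and ends at the screen centre
var crossing = linePlaneIntersection(
  {x: -160, y: 0, z: 100}, {x: 0, y: 0, z: 100}, leftNormal)
// t works out to 0.5, giving {x: -80, y: 0, z: 100}: the left edge of the screen

// A line that starts at the centre and heads right never reaches the left plane
var missing = linePlaneIntersection(
  {x: 0, y: 0, z: 100}, {x: 100, y: 0, z: 100}, leftNormal)
// t works out to -0.8, so the result is null
```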

Clamping the line to the view

Ok! Now we know how to compute intersections, we are very close to being able to truncate the lines by finding their intersection points with the viewplanes! Problem is, how do we know which planes to compute the intersections of the line with?

There are quite a few cases here depending on where the line starts and finishes, and indeed I spent quite a long time drawing lines across squares to try to figure out a neat way of figuring out exactly which planes the lines will intersect with.

This went nowhere, because of the case where the line goes from the top offscreen to the side offscreen. This might or might not intersect the view depending on the exact start and end points:

So then I thought, let’s just brute force it: for any line, collect all the points of intersection with the planes that define the view, and then just pick whichever two are visible (a point on the exact edge of the view we’ll define as visible). And if none are visible, then the line doesn’t intersect the view.

And if one end of the line is visible to begin with, then you only need one of those points of intersection to be visible.

So, that gives us a plan, and we can code it up. This function returns false if none of the line is visible, and otherwise returns the two points that define the part of the line that is visible:

// Returns false if the line between p and q is not
// visible at all. If it is, returns the points for the
// part of the line that is visible.
function clampLineToView(p, q) {
  var p_in = isPointInView(p)
  var q_in = isPointInView(q)

  // if both visible, we're done
  if (p_in && q_in)
    return [p, q]

  // we need two visible endpoints. Include p or q
  // if either of them is visible
  var visible_a = p_in ? p : (q_in ? q : null)
  var visible_b = null

  // now find the intersections and keep going until we 
  // have two visible points
  for (var i = 0; i < view_plane_normals.length; i++) {
    var ip = linePlaneIntersection(p, q, view_plane_normals[i])
    if (ip && isPointInView(ip)) {
      if (visible_a == null) {
        visible_a = ip
      } else if (visible_b == null) {
        visible_b = ip
        break
      }
    }
  }

  // if we have found two visible points, return them,
  // otherwise return false, meaning none of the line
  // is visible.
  if (visible_a != null && visible_b != null)
    return [visible_a, visible_b]
  else
    return false
}

To demo this I’ve done a few things. First I moved the edges of the view inwards a few pixels so we can see the lines vanish (otherwise I’d never be sure if it wasn’t getting the visible portion wrong but just drawing it off canvas):

// clockwise from bottom right
var screen_coords = [
  [ PIXEL_WIDTH/2 - 15,  PIXEL_HEIGHT/2 - 15, screen_dist], // bottom rt
  [-PIXEL_WIDTH/2 + 15,  PIXEL_HEIGHT/2 - 15, screen_dist], // bottom left
  [-PIXEL_WIDTH/2 + 15, -PIXEL_HEIGHT/2 + 15, screen_dist], // top left
  [ PIXEL_WIDTH/2 - 15, -PIXEL_HEIGHT/2 + 15, screen_dist], // top right
]

Then change drawFrame again to only draw the right part of the line, and to draw the intersection points too so we can see it clearly:

  …
  // draw edges
  for (var j = 0; j < edges.length; j++) {
    var p1 = cube[edges[j][0]]
    var p2 = cube[edges[j][1]]
    var newP1 = {x: p1.x + transform.x, y: p1.y + transform.y, z: p1.z + transform.z}
    var newP2 = {x: p2.x + transform.x, y: p2.y + transform.y, z: p2.z + transform.z}

    var clampedLine = clampLineToView(newP1, newP2)
    if (clampedLine) {
      ctx.fillStyle = "blue"
      drawLine3d(ctx, clampedLine[0], clampedLine[1])

      ctx.fillStyle = "red"
      drawPoint3d(ctx, clampedLine[0])
      drawPoint3d(ctx, clampedLine[1])
    }
  }
  …

And now to run it and see…

And that shows clearly that the lines are being truncated only to the visible portion!

Bug: flickering lines

However, we do have a bug here. The truncated lines sometimes flicker as they move. I made another video so it was clear:

This is very strange.

The debugging process was to remove all but one line, and add logging extensively until it became clear what was going on.

If you open up the JavaScript console and run this calculation:

there’s a chance that instead of getting 27.45 as your answer, you’ll in fact get 27.4500000000003. This is because floating point calculations that look as though they should be precise to us humans can have drift due to representational inaccuracies. In fact on my other laptop this was happening consistently, but not here, so I guess it’s due to system specific stuff exactly when this happens.

The problem is that our isPointInView function compares the value of the dot product against 0. And the intersection points we’ve been calculating are actually on the planes in question, so the dot product is often exactly 0 when this function is called. So a slight inaccuracy in the value of the coordinate is enough to render the point not visible when it is in fact on the plane.

The line flickers because this inaccuracy only occurs some of the time, again based on system specific factors.
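The calculation from my console isn’t shown here, but the same effect is easy to see with a well-known example (this is a different calculation, used purely as an illustration). It also shows why comparing with a small tolerance is more robust than comparing exactly:

```javascript
var sum = 0.1 + 0.2

// Looks like it should be exactly 0.3, but isn't
var exactlyEqual = (sum === 0.3)

// Comparing with a small tolerance behaves the way a human would expect
var TOLERANCE = 1e-9  // arbitrary small value chosen for this example
var closeEnough = Math.abs(sum - 0.3) < TOLERANCE

console.log(exactlyEqual, closeEnough)  // false true
```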

There’s probably a better way of doing this, but I’ve fixed it by adding a fudge factor to the comparison in isPointInView:

function isPointInView(p) {
  for (var i = 0; i < view_plane_normals.length; i++)
    if (dot([p.x, p.y, p.z], view_plane_normals[i]) < -0.001)
      return false
  return true
}

And the result, no flickering!

Conclusion

Now I’ve got this all working, I’ve removed the extra space, and the drawing of the intersection and corner points to show what we should really see:

And that’s it! This looks very similar to yesterday’s final demo, and indeed if the cube stays within the view it is identical. But although you can’t see it, it is properly not drawing lines that it can’t see.

So, there are two things that still bug me about today’s work:

computing the intersection of every line with every plane. If we assume there are going to be many many lines in the simulation, this might add up to a lot of work. I have an idea how to optimize this if need be though.
the assumption that the camera is at (0,0,0). This has been very handy but I’m starting to suspect that we’re going to have to move the camera eventually (as opposed to moving the entire rest of the world), which means revisiting some of these formulae.

But, for now, it’s all working, so let’s press on and come back to these when we have to!

On to Day Three…

3D from Scratch Day 3 - TypeScript, Refactoring, more Cubes!

2016-12-10T00:00:00Z

This is my attempt to figure out enough 3D graphics from scratch to clone the original Elite. The start of the series is here: 3D from Scratch - Intro.

Some smaller bits and bobs today.

Fixes for Chrome, Firefox

The demos only worked in Safari, because Safari does stuff weirdly and I’d coded for that weirdness (I’m a Safari user). In particular this code doesn’t work in Chrome or Firefox:

document.addEventListener('keydown', function(e) {
 if (e.keyIdentifier == "Up")
   keyState.up = true
 ...
})

This is because keyIdentifier is not standard, it’s only in Safari. And "Up" is not standard, in Chrome and Firefox it’s "ArrowUp".

So to make this code portable I’ve changed it to:

document.addEventListener('keydown', function(e) {
  var keyName = e.key || e.keyIdentifier
  if (keyName == "ArrowUp" || keyName == "Up")
    keyState.up = true
  ...
})

I’ve also gone back and changed this code in the Day 1 and Day 2 demos.
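The fallback is easy to test outside the browser by passing in plain objects that mimic the two kinds of event. This is only a sketch: isUpKey is a hypothetical helper, and the objects are stand-ins for real KeyboardEvents.

```javascript
// Prefer the standard key property, fall back to Safari's keyIdentifier
function isUpKey(e) {
  var keyName = e.key || e.keyIdentifier
  return keyName == "ArrowUp" || keyName == "Up"
}

// Stand-ins for events from a standards-based browser and from old Safari
var standardEvent = {key: "ArrowUp"}
var safariEvent   = {keyIdentifier: "Up"}
var otherEvent    = {key: "a"}

console.log(isUpKey(standardEvent), isUpKey(safariEvent), isUpKey(otherEvent))
// true true false
```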

TypeScript

I’ve ported the code to TypeScript because I wanted to try out the language and it changes almost nothing except that it gives me better error messages. It was basically trivial to do so.

Step one was just to create a few types:

interface Point {
  x: number,
  y: number,
  z: number
}

type Vector = number[]

And then to update function signatures like so:

function setPixel(ctx, x, y)
function setPixel(ctx: CanvasRenderingContext2D, x: number, y: number): void

function linePlaneIntersection(p, q, n)
function linePlaneIntersection(p: Point, q: Point, n: Vector): Point

There weren’t a ton of surprises here, it all just worked straight away (and didn’t miraculously surface any bugs … though I suppose it is only a 350 line file).

One annoyance is that the built in type definitions for the KeyboardEvent didn’t take Safari into account, so this code I just mentioned raised a TypeScript error saying that keyIdentifier wasn’t a thing.

  var keyName = e.key || e.keyIdentifier

This isn’t a great sign for TypeScript, really, as I’d like to be able to write portable code in it (I know Safari is in the wrong here but really TypeScript should take that into account in its type definitions).

I didn’t see an obvious way to fix this (say by adding to the built-in type definition) so I just did this to make it work:

 var keyName = e.key || e["keyIdentifier"]

One neat thing was using destructuring assignment (which TypeScript supports) to replace the long-winded variable swapping code in the drawLine function:

  // old
  if (x2 < x1) {
    var xt = x1
    var yt = y1
    x1 = x2
    y1 = y2
    x2 = xt
    y2 = yt
  }

  // new
  if (x2 < x1) {
    [x1, x2, y1, y2] = [x2, x1, y2, y1]
  }

As to the performance of this, is it making two new arrays on each invocation?? Or is it smart enough not to? The generated JavaScript is this:

if (x2 < x1) {
  _b = [x2, x1, y2, y1], x1 = _b[0], x2 = _b[1], y1 = _b[2], y2 = _b[3];
}

So it is creating one new array each time. Are the JavaScript VMs smart enough to optimize that object creation away, seeing that it is only used on that one line? Who knows….
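For what it’s worth, the same swap can be tried in plain JavaScript in any engine that supports destructuring assignment. The orderByX function here is a made-up wrapper so the behaviour can be checked in isolation:

```javascript
// Returns the two endpoints ordered so that the first has the smaller x
function orderByX(x1, y1, x2, y2) {
  if (x2 < x1) {
    [x1, x2, y1, y2] = [x2, x1, y2, y1]
  }
  return [x1, y1, x2, y2]
}

var swapped   = orderByX(9, 1, 2, 7)  // [2, 7, 9, 1]
var untouched = orderByX(2, 7, 9, 1)  // [2, 7, 9, 1]
```

The right-hand side is built in full before anything is assigned, which is why no temporary variables are needed.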

Refactoring

Writing down the exact types of things made me feel a little uncomfortable that I was slinging around Arrays for Vectors, Objects (with x,y,z fields) for Points, and separate x and y variables for 2D points! So I’ve refactored to make them all objects with classes:

class Point2D {
  constructor(public x: number, public y: number) {}
}

class Point {
  constructor(public x: number, public y: number, public z: number) {}
}

class Vector {
  constructor(public x: number, public y: number, public z: number) {}
}

So for instance the drawLine function signature has changed like this:

function drawLine(ctx: CanvasRenderingContext2D, x1: number, y1: number, x2: number, y2: number)
function drawLine(ctx: CanvasRenderingContext2D, p: Point2D, q: Point2D): void

and the dot function like this:

function dot(u: number[], v: number[]) {
  return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]
}

function dot(u: Vector, v: Vector): number {
  return u.x*v.x + u.y*v.y + u.z*v.z
}

It’s worth noting that there’s absolutely nothing in TypeScript that stops you passing in a Vector or Point to an argument of type Point2D, as all that is required is that the object have both an x and a y field.

For instance, this is valid code even though cross takes two Vectors not two points, because Points and Vectors have identical fields:

cross(new Point(1, 2, 3), new Point(4, 5, 6))

This is the principle of ‘duck typing’ (TypeScript calls it structural typing), which says that as long as an object fulfils the interface it doesn’t have to be of the same type. Usually I like this, but in the case of this project I was really hoping for a bit more type safety around when I was using a Point versus a Vector.

However, as soon as the Point and Vector classes diverge with unique properties of their own, I will get that type safety. It’s just that at the moment they are identical so I don’t.

We probably can’t keep it. Although all these objects make the code lovely indeed… we may have to undo it all later. This is because I foresee a time when we’ll want to represent all our data as much as possible as arrays of bytes – JavaScript VMs are optimised to operate on that kind of thing very fast and we’re probably going to need the speed.

ClearRect

I changed how it was clearing the frame to black from this:

ctx.fillStyle = "black"
ctx.fillRect(0, 0, PIXEL_WIDTH*pixel_size, PIXEL_HEIGHT*pixel_size)

to this:

ctx.clearRect(0, 0, PIXEL_WIDTH*pixel_size, PIXEL_HEIGHT*pixel_size)

as I read someplace that it was faster. This also required the canvas background colour to be black, which it was already.

Performance Info

I wanted to have some info dumped on frame rate and how much time we were using to render the frames, so I added this code, which should be fairly self-explanatory:

var PERF_INFO_FRAMES = 100 

// clear perf info after displaying it
function resetPerfInfo(perfInfo) {
    perfInfo.lastCalcUpdateTime    = Date.now()
    perfInfo.frameCounter          = 0
    perfInfo.elapsedTimeInFunction = 0
}

// update after every frame
function updatePerfInfo(perfInfo, funcStartTime) {
  perfInfo.elapsedTimeInFunction += Date.now() - funcStartTime
  perfInfo.frameCounter++

  // after every PERF_INFO_FRAMES frames, dump performance info
  if (perfInfo.frameCounter == PERF_INFO_FRAMES) {
    var timeSinceLast     = Date.now() - perfInfo.lastCalcUpdateTime
    var frameRate         = Math.round(1000*10*PERF_INFO_FRAMES/timeSinceLast)/10
    var runtimePercentage = perfInfo.elapsedTimeInFunction / (Date.now() - perfInfo.lastCalcUpdateTime)

    console.log({
      frameRate:      frameRate, 
      timeBudgetUsed: Math.round(runtimePercentage*1000)/10 + "%"
    })
    resetPerfInfo(perfInfo)
  }
}

And called it at the start and end of the drawFrame:

// set up data structure
var perfInfo = {}
resetPerfInfo(perfInfo)

function drawFrame(): void {
  var funcStartTime = Date.now()
  …
  updatePerfInfo(perfInfo, funcStartTime)
}

This gives us nice messages in the console like this that tell us what frame rate we are getting and how much of our “time budget” (the amount of time between each frame) we are using up:

{frameRate: 60, timeBudgetUsed: "5.2%"}
{frameRate: 60, timeBudgetUsed: "5.1%"}
{frameRate: 60, timeBudgetUsed: "5.7%"}
{frameRate: 60, timeBudgetUsed: "4.9%"}
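To see where numbers like these come from, here is the same arithmetic pulled out into a pure function and run on made-up timings (1667ms for 100 frames, with 87ms of that spent inside drawFrame). The summarise name and both timing values are assumptions for the example.

```javascript
var PERF_INFO_FRAMES = 100

// Same calculations as updatePerfInfo, without the Date.now() calls
function summarise(timeSinceLast, elapsedTimeInFunction) {
  var frameRate         = Math.round(1000*10*PERF_INFO_FRAMES/timeSinceLast)/10
  var runtimePercentage = elapsedTimeInFunction / timeSinceLast
  return {
    frameRate:      frameRate,
    timeBudgetUsed: Math.round(runtimePercentage*1000)/10 + "%"
  }
}

var summary = summarise(1667, 87)
// summary.frameRate is 60 and summary.timeBudgetUsed is "5.2%"
```

Both values are rounded to one decimal place by multiplying by ten (or a thousand, for the percentage), rounding, and dividing by ten again.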

5% is quite high considering we are only animating one cube! This includes a mixed bag of computing intersections, calculating perspective, and filling in all those little squares in the canvas. Maybe later we will analyse that in more depth to see where the time is going. For now, onwards!

More Cubes!

I decided I was bored with the single blue cube demo so I added some more cubes! I had to at last encapsulate the cube vertices and edges into a Model class. I rebased the cube vertices to make 0 the centre of the cube rather than it being offset by 100 into the world.

class Model {
  constructor(public vertices: Point[], public edges: Edge[]) {}
}

type Edge = number[]

var cubeModel = new Model(
  [
    new Point(50,  50,  50),
    new Point(50,  50,  -50),
    new Point(50,  -50, 50),
    new Point(-50, 50,  50),
    new Point(50,  -50, -50),
    new Point(-50, 50,  -50),
    new Point(-50, -50, 50), 
    new Point(-50, -50, -50),
  ],
  [
    [0, 1],
    [0, 2],
    [0, 3],
    [1, 4],
    [1, 5],
    [2, 4],
    [2, 6],
    [3, 5],
    [3, 6],
    [4, 7],
    [5, 7],
    [6, 7],
  ]
)
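It’s easy to mistype an index in a list like this, so here is a small sanity check, sketched in plain JavaScript with object literals standing in for the Point class. On an axis-aligned cube, every edge should join two vertices that differ along exactly one axis.

```javascript
var vertices = [
  {x: 50,  y: 50,  z: 50},  {x: 50,  y: 50,  z: -50},
  {x: 50,  y: -50, z: 50},  {x: -50, y: 50,  z: 50},
  {x: 50,  y: -50, z: -50}, {x: -50, y: 50,  z: -50},
  {x: -50, y: -50, z: 50},  {x: -50, y: -50, z: -50},
]

var edges = [
  [0, 1], [0, 2], [0, 3], [1, 4], [1, 5], [2, 4],
  [2, 6], [3, 5], [3, 6], [4, 7], [5, 7], [6, 7],
]

// Counts how many of x, y, z differ between two vertices
function differingAxes(a, b) {
  return (a.x !== b.x ? 1 : 0) + (a.y !== b.y ? 1 : 0) + (a.z !== b.z ? 1 : 0)
}

// Any edge that doesn't differ along exactly one axis is a diagonal or a typo
var badEdges = edges.filter(function (e) {
  return differingAxes(vertices[e[0]], vertices[e[1]]) !== 1
})

console.log(edges.length, badEdges.length)  // 12 0
```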

Then each cube that exists in the world is an “Instance” of this Model, containing a reference to the model and a location of the object:

class Instance {
  constructor(public model: Model, public location: Point) {}
}

Then creating an array that contains all the many many cubes there now are:

var objects: Instance[] = [
  new Instance(cubeModel, new Point(0,    0,    400)),
  new Instance(cubeModel, new Point(150,  0,    500)),
  ...
  new Instance(cubeModel, new Point(-150, +300, 500)),
  new Instance(cubeModel, new Point(-300, +300, 500)),
]

And rewriting drawFrame to draw the edges of the objects based on this array:

  // draw edges
  ctx.fillStyle = "yellow"
  for (var i = 0; i < objects.length; i++) {
    var object = objects[i]
    for (var j = 0; j < object.model.edges.length; j++) {
      var p1 = object.model.vertices[object.model.edges[j][0]]
      var p2 = object.model.vertices[object.model.edges[j][1]]
      var loc = object.location
      var newP1 = new Point(p1.x + loc.x + transform.x, p1.y + loc.y + transform.y, p1.z + loc.z + transform.z)
      var newP2 = new Point(p2.x + loc.x + transform.x, p2.y + loc.y + transform.y, p2.z + loc.z + transform.z)
      var clampedLine = clampLineToView(newP1, newP2)
      if (clampedLine)
        drawLine3d(ctx, clampedLine[0], clampedLine[1])
    }
  }

And….

Nice!

Conclusion

Pretty happy with all that. TypeScript I will keep I think, as it seems to provide a level of certainty about my JavaScript that I really appreciate. There were a few times I used the “Rename symbol” option in VSCode and it worked perfectly, which is very cool.

Day 4 I have some technical screen size issues to sort out. Not glamorous but I’ve been lax about things so want to get things fixed before moving on.

3D from Scratch - Day 1

2016-12-04T00:00:00Z

First day of seeing if I can figure out retro 3d graphics from scratch. (To catch up on this series, go here.)

My task for today: make a wireframe cube that I can move around with the cursor keys. In 3D.

This is going to be a test to see whether this is a remotely plausible project. Because if I can’t figure this out, I’ve got no hope for the rest of it…

CUBE!

Preliminary setup. The HTML file, with text if Canvas isn’t supported in the user’s browser:

<canvas id="canvas">Too bad.</canvas>

And the size our screen is going to be (for now):

// Our retro "screen" resolution
var PIXEL_WIDTH  = 160
var PIXEL_HEIGHT = 120

Step 1: Make a screen of pixels using Canvas

First job is figure out how to use Canvas in such a way that it seems as though it’s a simple screen of (huge) pixels. This is harder than you might think, for two reasons:

Every Canvas drawing function is anti-aliased, and bugger if I can figure out how to turn that off. I don’t think it’s possible.
Even if I could turn it off, the pixels on a modern screen are tiiiiiny. There’s no setting in canvas called “set pixel size = 10” or anything like that.

Now I had a few options here, and I’ve gone ahead and chosen the worst. But also the fastest to get up and running. I’ll revisit it later.

That is: I’ll use the canvas fillRect function to draw little squares where the pixels should be. Onwards!

// Actual space we can use in the browser window
var WIN_WIDTH  = window.innerWidth
var WIN_HEIGHT = window.innerHeight

// Calculate how big we can make the virtual pixels 
// (In whole numbers, can't have our virtual pixel be 2.5 
// real pixels, that would look terrible)
var ratio_width  = WIN_WIDTH / PIXEL_WIDTH
var ratio_height = WIN_HEIGHT / PIXEL_HEIGHT
var pixel_size = Math.floor(Math.min(ratio_width, ratio_height))

// Resize the canvas
var canvas = document.getElementById("canvas")
canvas.width = PIXEL_WIDTH * pixel_size
canvas.height = PIXEL_HEIGHT * pixel_size

// Get the context, which is where you do all the actual drawing
var ctx = canvas.getContext("2d")

// Make it a black screen
ctx.fillStyle = "black"
ctx.fillRect(0, 0, PIXEL_WIDTH*pixel_size, PIXEL_HEIGHT*pixel_size)

// function to set a pixel (to the colour set with fillStyle)
function setPixel(ctx, x, y) {
  if (x >= 0 && x < PIXEL_WIDTH && y >= 0 && y < PIXEL_HEIGHT)
    ctx.fillRect(x*pixel_size, y*pixel_size, pixel_size, pixel_size)
}

That looks good. And a little demo to check it works right:

ctx.fillStyle = "white"
for (var x = 0; x < PIXEL_WIDTH; x++) {
  setPixel(ctx, x, Math.floor(30*Math.sin(x/10)) + 55)
}

Suitably retro. OK, we have our screen.
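As a quick check of the pixel-size arithmetic above, here is a minimal sketch that pulls the calculation into a function. The name virtualPixelSize and the window sizes are made up for illustration; they are not part of the code in this post.

```javascript
// Hypothetical helper: the largest whole-number pixel size that still fits
// the whole virtual screen inside the window.
function virtualPixelSize(winWidth, winHeight, pixelWidth, pixelHeight) {
  var ratioWidth  = winWidth / pixelWidth
  var ratioHeight = winHeight / pixelHeight
  return Math.floor(Math.min(ratioWidth, ratioHeight))
}

// Made-up window sizes, just to see the numbers
var sizeLaptop = virtualPixelSize(1280, 800, 160, 120) // height is the limit: 800/120 = 6.67
var sizeTiny   = virtualPixelSize(100, 100, 160, 120)  // window smaller than the virtual screen
console.log(sizeLaptop, sizeTiny) // prints 6 0
```

The second case suggests a gap: in a window smaller than 160x120 the pixel size comes out as 0 and the canvas collapses to nothing. Clamping with Math.max(1, ...) would be one way around that, if it ever matters.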

Step 2: Draw the points of the cube

OK, let’s get straight in there and draw the corners of the cube as single pixels, in 3D.

Mental visualisation. I’m here, floating in space, looking straight ahead. There’s a cube hovering in front of me. How do I turn all that into a 2D screen?

After some thought, I’ve decided that this mental visualisation has to include the screen as well, as a flat rectangle hovering directly in front of me, between me and the cube. You can then find the location of any point on the cube on the screen, by drawing a line from my eye to the point on the cube, and marking where exactly it intersects the screen.

This makes the picture on the screen the projection of the cube. I recall projection being a thing from Maths.

Terrible diagram of this (the first of many):

I’m not a fan of trigonometry, but I think if you threw enough of it at this, you could figure out the locations of the points on the screen in pixels.

But I’m a bit at a loss as to how to code this straight away, so let’s simplify by tossing out the x-axis and just considering one point to start with.

To make things as easy as possible, I’m going to assume that the camera is at (0, 0, 0), that it is looking… up… the z-axis, and that the screen is a… units… up the z-axis, and that the cube is b units up the z-axis, and has … 2c units to a side. (Can you tell I’m making this up as I go along?)

This means that the x and y axes of the screen are aligned with the x and y axes of the space it’s floating in, again to simplify things. It also means that the centre point of the screen is x = 0, y = 0, z = a.

That gives me this diagram:

Now to draw the point on the screen, I need to know the y pixel coordinate, which is marked as y’. (We’re forgetting about x coordinate for a minute.)

The y coordinate of the pixel is something like c, but a bit less because of the perspective.

If I redraw that highlighting the triangle formed by the camera location, the point in 3D space, and the midpoint of the cube line, it becomes easy to see how to derive y’:

Because it’s a right-angled triangle, the value of y’ is proportional to c in the same ratio as the distance that the screen is along the base of the triangle. In other words, y’ = c(a/b). And that’s our screen coordinate!

And this works exactly the same in both the x-axis and the y-axis! (You can imagine the diagram with x-axis in place of the y-axis and nothing changes.) That means we can calculate the pixel coordinates (of that single point) now:

// cube corner, x and y are 'c' and z is 'b' in the diagram above
var p = {x:50, y:50, z:200}

// screen distance from camera is 'a' in the diagram above
var screen_dist = 100 

var screen_coordinates = {
  x: Math.round(p.x*screen_dist / p.z),
  y: Math.round(p.y*screen_dist / p.z)
}

And let’s draw it like this:

ctx.fillStyle = "white"
setPixel(ctx, screen_coordinates.x, screen_coordinates.y)

Fingers crossed…

BOOM! A point! In 3D!

I mean I guess it looks like it’s in about the right place… Let’s draw the rest of the cube to see. First I’ll wrap up the point drawing code into a function, then draw the rest of the points.

function drawPoint3d(ctx, p) {
  var x = Math.round(p.x * (screen_dist / p.z))
  var y = Math.round(p.y * (screen_dist / p.z))
  setPixel(ctx, x, y)
}

var cube = [
  { x: 50,  y: 50,  z: 250},
  { x: 50,  y: 50,  z: 150},
  { x: 50,  y: -50, z: 250},
  { x: -50, y: 50,  z: 250},
  { x: 50,  y: -50, z: 150},
  { x: -50, y: 50,  z: 150},
  { x: -50, y: -50, z: 250}, 
  { x: -50, y: -50, z: 150}
]

ctx.fillStyle = "white"
for (var i = 0; i < cube.length; i++) {
  drawPoint3d(ctx, cube[i])
}

And the result:

OK well that looks even more amazingly 3D than the last one!

Although it’s obviously centered the cube at (0, 0) on the screen, which is the top-left corner… so let’s shift every point by half the screen size in both axes, so that x=0, y=0 lands in the centre of the screen.

function drawPoint3d(ctx, p) {
  var x = Math.round(p.x * (screen_dist / p.z))
  var y = Math.round(p.y * (screen_dist / p.z))
  setPixel(ctx, x + PIXEL_WIDTH/2, y + PIXEL_HEIGHT/2)
}

And the result is very definitely a 3D cube!

And no trigonometry required at all. 🎉
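To check the numbers without needing a canvas, the projection and the centring can be sketched as a pure function that returns the pixel instead of drawing it. The name projectToScreen is an assumption for illustration; the arithmetic is the same y’ = c(a/b) rule as above, plus the half-screen offset.

```javascript
var PIXEL_WIDTH  = 160
var PIXEL_HEIGHT = 120

// Hypothetical pure version of drawPoint3d: returns the pixel coordinates
// rather than calling setPixel.
function projectToScreen(p, screenDist) {
  var scale = screenDist / p.z // this is a/b from the diagram
  return {
    x: Math.round(p.x * scale) + PIXEL_WIDTH / 2,
    y: Math.round(p.y * scale) + PIXEL_HEIGHT / 2
  }
}

// The nearer corner should land further from the centre than the farther one
var near = projectToScreen({x: 50, y: 50, z: 150}, 100)
var far  = projectToScreen({x: 50, y: 50, z: 250}, 100)
console.log(near.x, near.y, far.x, far.y) // prints 113 93 100 80
```

The near corner ends up 33 pixels from the centre and the far corner only 20, which is the perspective effect doing its job.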

Step 3: Move the cube with cursor keys

OK I want to be able to move this baby around, really feel the 3D.

Shouldn’t be too hard, just need to hook into some JS key events and change the cube coordinates if the cursor keys are pressed.

First let’s maintain the state of the cursor keys using JavaScript document events:

var keyState = {
  up: false,
  down: false,
  left: false,
  right: false,
}

document.addEventListener('keydown', function(e) {
  // e.key names the key that was pressed, e.g. "ArrowUp"
  if (e.key == "ArrowUp")    keyState.up = true
  if (e.key == "ArrowDown")  keyState.down = true
  if (e.key == "ArrowLeft")  keyState.left = true
  if (e.key == "ArrowRight") keyState.right = true
})

document.addEventListener('keyup', function(e) {
  // clear the flag again when the key is released
  if (e.key == "ArrowUp")    keyState.up = false
  if (e.key == "ArrowDown")  keyState.down = false
  if (e.key == "ArrowLeft")  keyState.left = false
  if (e.key == "ArrowRight") keyState.right = false
})

Now, if this is going to move about then I’m stepping into animation territory, which means I should switch to doing all my drawing inside a window.requestAnimationFrame call. (This allows the browser to call your function that draws a frame whenever it decides that it is time for a new frame to be drawn.)

It’s also going to have to keep track of where the cube is. I decided not to change the values of the coordinates in the cube points every time it moves, but just to keep track of the change in the points in a separate variable:

var transform = {x: 0, y: 0, z: 0}

Now I’ll have a drawFrame function that requestAnimationFrame can call. It needs to clear the screen before it renders the points each frame, and update the transform if it detects that the keys are pressed.

I’ll only have it move in the z and x axes (as there are only four cursor keys).

function drawFrame() {
  // clear frame
  ctx.fillStyle = "black"
  ctx.fillRect(0, 0, PIXEL_WIDTH*pixel_size, PIXEL_HEIGHT*pixel_size)

  // draw points
  ctx.fillStyle = "white"
  for (var i = 0; i < cube.length; i++) {
    var p = cube[i]
    // move point based on transform
    var newP = {x: p.x + transform.x, y: p.y + transform.y, z: p.z + transform.z}
    drawPoint3d(ctx, newP)
  }

  // update cube location
  if (keyState.down)  transform.z -= 10
  if (keyState.up)    transform.z += 10
  if (keyState.left)  transform.x -= 10
  if (keyState.right) transform.x += 10

  window.requestAnimationFrame(drawFrame)
}

// start the whole thing off:
window.requestAnimationFrame(drawFrame)

And that should do it!
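The idea of keeping the cube’s own coordinates fixed and tracking the movement separately can be sketched without any browser at all. The helper names stepTransform and translatePoint are hypothetical; they just isolate the two pieces of drawFrame that deal with movement.

```javascript
// Hypothetical helper: returns the transform after one frame of key input.
function stepTransform(transform, keyState, speed) {
  var next = {x: transform.x, y: transform.y, z: transform.z}
  if (keyState.down)  next.z -= speed
  if (keyState.up)    next.z += speed
  if (keyState.left)  next.x -= speed
  if (keyState.right) next.x += speed
  return next
}

// Hypothetical helper: a moved copy of the point, the original is untouched.
function translatePoint(p, transform) {
  return {x: p.x + transform.x, y: p.y + transform.y, z: p.z + transform.z}
}

// Two frames: up+left held, then only up held
var t = {x: 0, y: 0, z: 0}
t = stepTransform(t, {up: true, down: false, left: true, right: false}, 10)
t = stepTransform(t, {up: true, down: false, left: false, right: false}, 10)

var corner = {x: 50, y: 50, z: 250}
var moved = translatePoint(corner, t)
console.log(moved.x, moved.y, moved.z, corner.z) // prints 40 50 270 250
```

Because the cube array is never modified, resetting the cube to its starting position is just a matter of setting the transform back to zero.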

Step 4: Write a function to draw lines

This is just a set of points floating around. To make it a real cube it needs to be wireframe, and that means… I have to figure out how to draw lines. To be honest I’ve been putting this off a little bit, because I have no idea how to do it, but it’s clearly time now.

I spent quite a bit of time figuring this out, way longer than on the 3D point drawing. I’m not going to tell you all my errant thought processes here, so I’ll just show you a few lowlights.

The goal is to turn a platonic “perfect” line between two points into a set of illuminated pixels that approximate that line in a “nice” way (see the illustration below). The points at the start and end of the line are always at the exact centre of two pixels.

Now if you draw a few of these it becomes clear pretty quickly that sometimes it can be done more nicely than others. For instance line II in this diagram is a little unbalanced, and will appear a little curved on the screen, but there is just no way to do it better.

Red Herring Number 1. First thing I did was get properly side-tracked into trying to create a recursive algorithm. The first line in the diagram above passes through the exact midpoint of a pixel in the middle of the line. Therefore you can decompose the problem into two around that midpoint, and recursively call your line drawing algorithm with those two halves. This seemed to me to be very promising.

The problem is there’s no way to do the same for line II (or none that I can see) because the line midpoint falls on the boundary between two pixels, so the sub-problems don’t obey the constraint of starting and ending at the centre of two pixels. And there are more cases like this. So this turned out to go nowhere.

Solution. Eventually I noticed that (for a line that runs with a steep slope downwards, like Line I and Line II in the diagram), for each row of pixels there is only one active. This led me to think about how you would pick which of the row to activate. The answer is to look at the horizontal line that passes through the midpoint of the row of pixels, and ask: in which pixel does the perfect line intersect this midpoint horizontal line? Here’s what that looks like:

You can see that the intersections are in the correct pixels for this line.

That’s pretty easy to code up (remember we’re assuming that the line slopes steeply down and to the right). We step along the perfect line, increasing y by one each time and increasing x by the proportional amount. It’s the Math.round call that’s doing the work of selecting the pixel in the row:

function drawLine(ctx, x1, y1, x2, y2) {
  var x = x1
  var y = y1
  var slope = (x2 - x1) / (y2 - y1)
  while (y <= y2) {
    setPixel(ctx, Math.round(x), y)
    y++
    x += slope;
  }
}

And a demo:

// inside drawFrame
ctx.fillStyle = "yellow"
drawLine(ctx, 10, 10, 10, 50)
drawLine(ctx, 10, 10, 20, 50)
drawLine(ctx, 10, 10, 30, 50)
drawLine(ctx, 10, 10, 40, 50)
drawLine(ctx, 10, 10, 50, 50)

So that looks right. Now we need to consider the cases where the line doesn’t slope down and to the right like that.

First of all, there’s no reason that we have to consider lines that run from right to left. We can just swap the ends, and then for the rest of the function we are guaranteed they run from left to right:

function drawLine(ctx, x1, y1, x2, y2) {
  // ensure line from left to right
  if (x2 < x1) {
   var xt = x1
   var yt = y1
   x1 = x2
   y1 = y2
   x2 = xt
   y2 = yt
  }
  ...

Once we’ve done that, there are four cases to consider. If we calculate the slope of the line as s, which is change in x over change in y:

var s = (x2 - x1) / (y2 - y1)

Then the cases are:

For the two more horizontal cases, the algorithm steps along x one at a time (instead of y) and looks to see which pixel in the column (instead of row) of pixels should be activated:

Now we can code up those cases, very similar to the previous ones. The differences in each case are:

whether it is y or x that is incremented at each step
if y, whether it is incremented or decremented (line goes down or up)
if we are scanning across columns (x is being incremented), then y needs to change by 1/s, rather than s. This is because we expressed s as “change in x per y” but now we need “change in y per x”.

if (s > 0 && s <= 1) {
  while (y <= y2) {
    setPixel(ctx, Math.round(x), y)
    y++
    x += s
  }
} else if (s < 0 && s >= -1) {
  while (y >= y2) {
    setPixel(ctx, Math.round(x), y)
    y--
    x -= s
  }
} else if (s < -1) {
  while (x <= x2) {
    setPixel(ctx, x, Math.round(y))
    x++
    y += 1/s
  }
} else if (s > 1) {
  while (x <= x2) {
    setPixel(ctx, x, Math.round(y))
    x++
    y += 1/s
  }
}
} // end of drawLine

And try a demo with different colours for each of the four cases:

  ctx.fillStyle = "yellow"
  drawLine(ctx, 10, 60, 10, 100)
  drawLine(ctx, 10, 60, 20, 100)
  drawLine(ctx, 10, 60, 30, 100)
  drawLine(ctx, 10, 60, 40, 100)
  drawLine(ctx, 10, 60, 50, 100)

  ctx.fillStyle = "red"
  drawLine(ctx, 10, 60, 10, 20)
  drawLine(ctx, 10, 60, 20, 20)
  drawLine(ctx, 10, 60, 30, 20)
  drawLine(ctx, 10, 60, 40, 20)
  drawLine(ctx, 10, 60, 50, 20)

  ctx.fillStyle = "green"
  drawLine(ctx, 10, 60, 50, 30)
  drawLine(ctx, 10, 60, 50, 40)
  drawLine(ctx, 10, 60, 50, 50)
  drawLine(ctx, 10, 60, 50, 60)

  ctx.fillStyle = "purple"
  drawLine(ctx, 10, 60, 50, 70)
  drawLine(ctx, 10, 60, 50, 80)
  drawLine(ctx, 10, 60, 50, 90)

Which looks pretty good… Except we’re missing the vertical lines.

This puzzled me a bit. In both upwards and downwards vertical lines the slope s is zero. The difference is that s is either +0 or -0. Of course you can’t distinguish those with an inequality condition, so instead we can just check whether the second y coordinate is greater or less than the first. Adding that to the code:

  if ((s > 0 && s <= 1) || (s == 0 && y2 > y1)) {
    ...
  } else if ((s < 0 && s >= -1) || (s == 0 && y2 < y1)) {
    ...
  } else if (s < -1) {
    ...
  } else if (s > 1) {
    ...
  }

And bingo!
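To check all the cases without squinting at the screen, here is a sketch of the whole algorithm as a function that returns the pixels instead of drawing them. The name linePixels is an assumption, and I’ve merged the two shallow branches into one since their bodies are identical (1/s already carries the sign). Note that in JavaScript 0 / -40 evaluates to -0, and -0 == 0 is true, which is why the s == 0 tests need the extra comparison of y1 and y2.

```javascript
// Hypothetical helper: same stepping logic as drawLine, but it returns the
// chosen pixels as [x, y] pairs instead of calling setPixel.
function linePixels(x1, y1, x2, y2) {
  // ensure line from left to right
  if (x2 < x1) {
    var xt = x1
    var yt = y1
    x1 = x2
    y1 = y2
    x2 = xt
    y2 = yt
  }
  var pixels = []
  var x = x1
  var y = y1
  var s = (x2 - x1) / (y2 - y1) // change in x per unit of y
  if ((s > 0 && s <= 1) || (s == 0 && y2 > y1)) {
    // steep, going down: one pixel per row
    while (y <= y2) {
      pixels.push([Math.round(x), y])
      y++
      x += s
    }
  } else if ((s < 0 && s >= -1) || (s == 0 && y2 < y1)) {
    // steep, going up: one pixel per row
    while (y >= y2) {
      pixels.push([Math.round(x), y])
      y--
      x -= s
    }
  } else if (s < -1 || s > 1) {
    // shallow, going up or down: one pixel per column
    while (x <= x2) {
      pixels.push([x, Math.round(y)])
      x++
      y += 1/s
    }
  }
  return pixels
}

var down = linePixels(10, 60, 10, 100) // vertical, s is +0
var up   = linePixels(10, 60, 10, 20)  // vertical, s is -0
var flat = linePixels(10, 60, 50, 60)  // horizontal, s is Infinity
console.log(down.length, up.length, flat.length) // prints 41 41 41
```

One edge case this makes visible: a zero-length line such as linePixels(5, 5, 5, 5) gives s = 0/0 = NaN, which matches none of the branches, so nothing is drawn at all.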

Step 5: Draw the wireframe cube

The line drawing seems to be working. Let’s hook it up to the cube and win!

We need a description of which edges we want, which I’ve chosen to do in terms of which corners to go from and to. Here the numbers are indexes into the cube array of points from before:

var edges = [
  [0, 1],
  [0, 2],
  [0, 3],
  [1, 4],
  [1, 5],
  [2, 4],
  [2, 6],
  [3, 5],
  [3, 6],
  [4, 7],
  [5, 7],
  [6, 7],
]
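Typing twelve pairs of indexes by hand is easy to get wrong, so here is a small sanity check, sketched under the assumption that the cube and edges arrays are exactly as listed in this post. On a cube, every edge joins two corners that differ in exactly one coordinate, and every corner has three edges. The helper differingAxes is hypothetical.

```javascript
var cube = [
  { x: 50,  y: 50,  z: 250},
  { x: 50,  y: 50,  z: 150},
  { x: 50,  y: -50, z: 250},
  { x: -50, y: 50,  z: 250},
  { x: 50,  y: -50, z: 150},
  { x: -50, y: 50,  z: 150},
  { x: -50, y: -50, z: 250},
  { x: -50, y: -50, z: 150}
]

var edges = [
  [0, 1], [0, 2], [0, 3], [1, 4], [1, 5], [2, 4],
  [2, 6], [3, 5], [3, 6], [4, 7], [5, 7], [6, 7]
]

// Hypothetical helper: in how many coordinates do two corners differ?
function differingAxes(p, q) {
  var count = 0
  if (p.x !== q.x) count++
  if (p.y !== q.y) count++
  if (p.z !== q.z) count++
  return count
}

// Edges that are not real cube edges (diagonals or duplicates of a point)
var badEdges = edges.filter(function (e) {
  return differingAxes(cube[e[0]], cube[e[1]]) !== 1
})

// How many edges touch each corner
var degree = cube.map(function () { return 0 })
edges.forEach(function (e) {
  degree[e[0]]++
  degree[e[1]]++
})

console.log(badEdges.length, degree.join(",")) // prints 0 3,3,3,3,3,3,3,3
```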

And here’s a function that copies the code from drawPoint3d that turns the 3D points into screen coordinates, and then just draws the line between them:

function drawLine3d(ctx, p1, p2) {
  var x1 = Math.round(p1.x * (screen_dist / p1.z))
  var y1 = Math.round(p1.y * (screen_dist / p1.z))
  var x2 = Math.round(p2.x * (screen_dist / p2.z))
  var y2 = Math.round(p2.y * (screen_dist / p2.z))
  drawLine(ctx, x1 + PIXEL_WIDTH/2, y1 + PIXEL_HEIGHT/2, x2 + PIXEL_WIDTH/2, y2 + PIXEL_HEIGHT/2)
}

And adding code to drawFrame to actually draw them from point to point, suitably transformed as before:

for (var j = 0; j < edges.length; j++) {
  var p1 = cube[edges[j][0]]
  var p2 = cube[edges[j][1]]
  ctx.fillStyle = "blue"
  var newP1 = {x: p1.x + transform.x, y: p1.y + transform.y, z: p1.z + transform.z}
  var newP2 = {x: p2.x + transform.x, y: p2.y + transform.y, z: p2.z + transform.z}
  drawLine3d(ctx, newP1, newP2)
}

And run it:

And done!

Conclusion

Pleased with the day’s work! Was not nearly as hard as I expected, so maybe I can do this 💪

On to Day Two…

3D from Scratch - Introduction

2016-12-03T00:00:00Z

New Project!

I’m going to figure out enough about how 3D graphics and game programming work to write a simple retro Elite-style game from scratch. But I’m not going to look at any graphics or game programming references or tutorials. The rule is I’ve got to figure it all out myself.

To be clear: I have no idea how 3D graphics work. I have virtually no idea how 2D graphics work. I am pretty good with maths.

I’ll do it in JavaScript because I want to finish this century and because my JavaScript knowledge is still a little jQuery-era and I want to brush up.

I can look at any JavaScript documentation I want, including the Canvas API. But for the graphics programming I’ll treat HTML5 canvas entirely as a dumb screen of pixels (step one is to figure out how to do this). This means no OpenGL/WebGL either: all the 3D and rendering will be implemented in pure JavaScript.

Also I’ll allow myself to look up generic Maths definitions and theorems, otherwise it will take all year.

What will be in these posts will be all my reasoning as I work this stuff out, including diagrams and full code (in a kind of a literate style). And I’ll check in each day’s code in the repo.

It might not be pretty, but it’s going to be a lot of fun.

Thoughts on JavaScript as the implementation language

Again, I’m choosing this language because I don’t have much free time and I want to make rapid progress.

Also, if it works I can eventually figure out how to deploy it anywhere. There are many HTML/JS wrapper frameworks that I forget the names of that can deploy to all platforms.

However I’m not entirely sure that JavaScript is fast enough to do this. (Again, this is 3D graphics without OpenGL.) I wouldn’t even consider it except that this is going to be very retro graphics style with something like a 320x200 resolution and wireframe or flat-shaded models.

When you compare to the hardware that Bell and Braben had in the early 80’s when making Elite, it certainly seems as though the modern browser should be able to compete.

And I’ve heard of people getting amazing performance out of the modern runtimes (like emulating x86 CPUs with a fair speed!) so surely it’s possible. And learning how to optimize JavaScript is an interesting project in its own right.

A bigger worry is GC pause times for a realtime application. I know the V8 team have done a bunch of work making the runtime prioritize live stuff. But what about Safari? I don’t want to have jerky graphics. If necessary, learning how to maintain object pools to keep GC minimal will also be a very interesting project, and something I’ve wanted to try forever.

And finally I reserve the right to bail out into any other language at any time if it turns out not to be possible or just too much work to be reasonable.

Where to start

First things first, how do we even do retro pixel graphics in HTML Canvas? And then something very simple to test the waters…

Series TOC

Simplex for Ruby

2013-12-20T00:00:00Z

I’ve released a pure-Ruby implementation of the Simplex algorithm for solving linear problems. It may be useful to you if you can only run Ruby, or if you want to learn about the Simplex algorithm from a simple implementation.

I really really didn’t want to write my own LP solver implementation. If someone else told me they had done this, I would smirk. It’s a notoriously hard algorithm to implement correctly.

My use case is allocating power through circuits to weapons and shields and things in imaginary space ships (see right) created by users in my web game (codenamed Fantasy Star Fleets). The optimal power allocation (given whichever power cores and couplings have been blown up by enemy action at the present time) is a linear program.

The game runs on Heroku, which is a terrific time-saver. Have you tried compiling the “pro” LP solver packages for Heroku? Have you tried compiling them on Heroku for Ruby 2.0?

I spent most of a day trying to make that work with various different solvers. Then I gave up and spent an hour writing my own.

It works for me because the little space ships are little, so the problems are small. Plus they are all of a very standard simple form, so there is no chance of degeneracy or hard things in general.

To solve the maximization in standard form (which is the only kind it can do atm):

max x +  y

   2x +  y <= 4
    x + 2y <= 3

    x, y >= 0

Do this:

> simplex = Simplex.new(
  [1, 1],       # coefficients of objective function
  [             # matrix of inequality coefficients on the lhs ...
    [ 2,  1],
    [ 1,  2],
  ],
  [4, 3]        # .. and the rhs of the inequalities
)
> simplex.solution
=> [(5/3), (2/3)]

Although it may not be of much practical use outside the restricted Heroku environment, I’ve tried to make it clean and easy to learn from. You can run the algorithm step by step and inspect the tableau as you go along:

> simplex = Simplex.new([1, 1], [[2, 1], [1, 2]], [4, 3])
> puts simplex.formatted_tableau
 -1.000   -1.000    0.000    0.000            
----------------------------------------------
 *2.000    1.000    1.000    0.000  |    4.000
  1.000    2.000    0.000    1.000  |    3.000

> simplex.can_improve?
=> true
> simplex.pivot
=> [0, 3]

> puts simplex.formatted_tableau
  0.000   -0.500    0.500    0.000            
----------------------------------------------
  1.000    0.500    0.500    0.000  |    2.000
  0.000   *1.500   -0.500    1.000  |    1.000
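For anyone wondering what a pivot actually does to the numbers, here is a minimal sketch of a single pivot step. It is written in JavaScript and is not the gem’s code: the function name pivot, the row layout, and the extra right-hand-side entry on the objective row are all my own assumptions for illustration. It reproduces the step between the two tableaus above.

```javascript
// Hypothetical sketch of one simplex pivot. Each row is
// [x, y, slack1, slack2, rhs]; row 0 is the objective row.
function pivot(rows, pivotRow, pivotCol) {
  var pivotValue = rows[pivotRow][pivotCol]
  // scale the pivot row so the pivot element becomes 1
  var scaled = rows[pivotRow].map(function (v) { return v / pivotValue })
  return rows.map(function (row, i) {
    if (i === pivotRow) return scaled
    // subtract a multiple of the pivot row to zero out the pivot column
    var factor = row[pivotCol]
    return row.map(function (v, j) { return v - factor * scaled[j] })
  })
}

var tableau = [
  [-1, -1, 0, 0, 0], // objective row (rhs column added for this sketch)
  [ 2,  1, 1, 0, 4],
  [ 1,  2, 0, 1, 3]
]

// Pivot on the starred 2.000 in the first constraint row
var next = pivot(tableau, 1, 0)
console.log(next[1].join(" ")) // prints 1 0.5 0.5 0 2
console.log(next[2].join(" ")) // prints 0 1.5 -0.5 1 1
```

The two constraint rows match the second formatted tableau, and the objective row comes out as 0, -0.5, 0.5, 0, with the remaining negative entry signalling that another pivot can still improve the result.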

In the project description on Github and Rubygems I call this a “naive” solver, and that it certainly is. For example, it assumes your problem has a feasible origin (because in my use case this is always true). I’d like to improve it so that it doesn’t make this assumption, but I might not find the time.