Reboot Early, and Often!

A wise man once said to me “Reboot early, and often!” He said it many times actually. When you’re trying to troubleshoot programs for which you don’t have the source code, you tend to develop little rituals that you go through to “fix” them…

So here I was today, trying to power through the boilerplate to draw a triangle in Vulkan. I had just finished up creating my Pipeline object, and ran my program to see if it worked. As you can imagine, it didn’t work.

Shockingly, C++ Causes Problems

First up: my shaders are invalid. The error:

validation layer: ObjectTracker : Invalid Shader Module Object 0xa.

Of course, the Object Tracker layer is there to tell you when you try to use objects that are not valid handles. Usually this is because vkCreateFoo did not return VK_SUCCESS. Of course, vkCreateShaderModule definitely did return VK_SUCCESS, because if it hadn’t, my program would have died in a fire thanks to my expect assertion macro. Surely either I have a driver bug or C++ is doing “exactly what I told it to.”

So I do some troubleshooting. I upgrade my driver, which does nothing. Then I go searching. Buried in a comment thread on this Stack Overflow question is the suggestion that a destructor is being called. I examine my code, and sure enough, the RAII object that I’m storing my shader in is being destructed before the call to vkCreateGraphicsPipelines. C++ was doing “exactly what I told it to.”

Sound Advice

So an hour later, and one problem down. I recompile and try again:

vkEnumeratePhysicalDevices: returned VK_ERROR_INITIALIZATION_FAILED, indicating that initialization of an object has failed

Shenanigans! There’s no way that vkEnumeratePhysicalDevices is messed up; that’s been working for weeks! I turn to the Googler, but to no avail. Then the words rang in my head: “Reboot early, and often!” It’s not like I just updated my graphics driver without rebooting or anything… Oh wait!

Moral of the story: reboot early, and often!

Something Broke!

Linux has come a long way since the days of non-existent wifi drivers and flash plugins. I dare say Linux is ready for non-technical users. At least I would say this if I didn’t still occasionally get random errors. You know, like this one:

thanks for letting me know

Awesome, something broke! Thanks for letting me know! Twice!

This one is a common one, and easily fixed. However, I never remember what the specific incantation for this one is, so I always have to Google it. But no more! Today I record the solution in my log!

The Solution

This one is caused by some process crashing. When a process crashes, apparently it dumps some junk into /var/crash that presumably the developer of said process knows about and cares about. Unfortunately I’m not said developer, and I don’t care about a one time crash of some random process. I do care that I’m getting 2-3 useless popups whenever I boot my Linux partition up. Let’s fix this:

NOTE: Consider looking in /var/crash to see what happened before randomly deleting all the stuff in it.

$ sudo rm /var/crash/*
$ sudo init 6

After your machine reboots, you should be good to go.

The Specter of Undefined Behavior

If you’ve ever spoken to a programmer, and really got them on a roll, they may have said the words “undefined behavior” to you. Since you speak English, you probably know what each of those words means, and can imagine a reasonable meaning for them in that order. But then your programmer friend goes on about “null-pointer dereferencing” and “invariant violations” and you start thinking about cats or football or whatever because you are not a programmer.

I often find myself being asked what it is that I do. Since I’ve spent the last few years working on my Computer Science degree, and have spent much of that time involved in programming language research, I often find myself trying to explain this concept. Unfortunately, when put on the spot, I usually am only able to come up with the usual sort of explanation that programmers use among themselves: “If you invoke undefined behavior, anything can happen! Try to dereference a null pointer? Bam! Lions could emerge from your monitor and eat your family!” Strictly speaking, while I’m sure some compiler writer would implement this behavior if they could, it’s not a good explanation for a person who doesn’t already kind of understand the issues at play.

Today, I’d like to give an explanation of undefined behavior for a lay person. Using examples, I’ll give an intuitive understanding of what it is, and also why we tolerate it. Then I’ll talk about how we go about mitigating it.

Division By Zero

Here is one that most of us know. Tell me, what is 8 / 0? The answer of course is “division by zero is undefined.” In mathematics, there are two sorts of functions: total and partial. A total function is defined for all inputs. If you say a + b, this can be evaluated to some result no matter what you substitute for a and b. Addition is total. The same cannot be said for division. If you say a / b, this can be evaluated to some result no matter what you substitute for a and b unless you substitute b with 0. Division is not total.

If you go to the Wikipedia article for division by zero you’ll find some rationale for why division by zero is undefined. The short version is that if it were defined, then it could be mathematically proven that one equals two. This would of course imply that cats and dogs live in peace together and that pigs fly, and we can’t have that!

ti86_calculator_divbyzero

However, there is a way we can define division to be total that doesn’t have this issue. Instead of defining division to return a number, we could define division to return a set of numbers. You can think of a set as a collection of things. We write this as a list in curly braces: {this, is, a, set, of, words}. I have two cats named Gatito and Moogle. I can have a set of cats by writing {Gatito, Moogle}. Sets can be empty; we call the empty set the null set and can write it as {} or using this symbol: ∅. I’ll stick with empty braces because one of the things I hate about mathematics is everybody’s insistence on writing in Greek.

So here is our new total division function:

totalDivide(a, b)
    if (b does not equal 0)
        output {a / b}
    otherwise
        output {}

If you use totalDivide to do your division, then you will never have to worry about the undefined behavior of division! So why didn’t Aristotle (or Archimedes or Yoda or whoever invented division) define division like this in the first place? Because it’s super annoying to deal with these sets. None of the other arithmetic functions are defined to take sets, so we’d have to constantly test that the division result did not produce the empty set, and extract the result from the set. In other words: while our division is now total, we still need to treat division by zero as a special case. Let us try to evaluate 2/2 + 2/2 and totalDivide(2,2) + totalDivide(2,2):

1: 2/2 + 2/2
2: 1 + 1
3: 2

Even showing all my work, that took only 3 lines.

1: let {1} = totalDivide(2,2)
2: let {1} = totalDivide(2,2)
3: 1 + 1
4: 2

Since you can’t add two sets, I had to evaluate totalDivide out of line, and extract the values and add them separately. Even this required my human ability to look at the denominator and see that it wasn’t zero for both cases. In other words, making division total made it much more complicated to work with, and it didn’t actually buy us anything. It’s slower. It’s easier to mess up. It has no real value. As humans, it’s fairly easy for us to look at the denominator, see that it’s zero, and just say “undefined.”
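For readers who do program, the same bookkeeping can be sketched in Python. This is only an illustration: total_divide mirrors the totalDivide pseudocode above, while add_quotients is a hypothetical helper invented here to show the unpacking that every caller would have to do.

```python
def total_divide(a, b):
    """Total division: always returns a set, which is empty when b is zero."""
    if b != 0:
        return {a / b}
    return set()


def add_quotients(a, b, c, d):
    """Compute a/b + c/d using total_divide, unpacking the sets by hand."""
    left = total_divide(a, b)
    right = total_divide(c, d)
    # Division by zero is still a special case; it just moved to the caller.
    if not left or not right:
        return set()
    (x,) = left   # extract the single value from each set
    (y,) = right
    return {x + y}
```

Calling add_quotients(2, 2, 2, 2) gives a set containing 2.0, and any zero denominator gives the empty set. The special case has not gone away; every caller now has to handle it.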

Cartons of Eggs

I’m sure many of you have a carton of eggs in your fridge. Go get me the 17th egg from your carton of eggs. Some of you will be able to do this, and some of you will not. Maybe you only have a 12 egg carton. Maybe you only have 4 eggs in your 18 egg carton, and the 17th egg is one of the ones that are missing. Maybe you’re vegan.

A basic sort of construct in programming is called an “array.” Basically, this is a collection of the same sort of things packed together in a region of memory on your computer. You can think of a carton of eggs as an array of eggs. The carton only contains one sort of thing: an egg. The eggs are all packed together right next to each other with nothing in between. There is some finite number of eggs.

If I told you “for each egg in the carton, take it out and crack it, and dump it in a bowl starting with the first egg”, you would be able to do this. If I told you “take the 7th egg and throw it at your neighbor’s house” you would be able to do this. In the first example, you would notice when you cracked the last egg. In the second example you would make sure that there was a 7th egg, and if there wasn’t you probably picked some other egg because your neighbor is probably a jerk who deserves to have his house egged. You did this unconsciously because you are a human who can react to dynamic situations. The computer can’t do this.

If you have some array that looks like this (array locations are separated by | bars | and * stars * are outside the array) ***|1|2|3|*** and you told the computer “for each location in the array, add 1 to the number, starting at the first location” it would set the first location to be 2, the second location to be 3, the third location to be 4. Then it would interpret the bits in the location of memory directly to the right of the third location as a number, and it would add 1 to this “number” thereby destroying the data in that location. It would do this forever because this is what you told the machine to do. Suppose that part of memory was involved in controlling the brakes in your 2010 era Toyota vehicle. This is obviously incredibly bad, so how do we prevent this?

The answer is that the programmer (hopefully) knows how big the array is and actually says “starting at location one, for the next 3 locations, add one to the number in the location”. But suppose the programmer messes up, and accidentally says “for the next 4 locations” and costs a multinational company billions of dollars? We could prevent this. There are programming languages that give us ways to prevent these situations. “High level” programming languages such as Java have built-in ways to tell how long an array is. They are also designed to prevent the programmer from telling the machine to write past the end of the array. In Java, the program will successfully write |2|3|4| and then it will crash, rather than corrupting the data outside of the array. This crash will be noticed in testing, and Toyota will save face.

We also have “low level” programming languages such as C, which don’t do this. Why do we use low level programming languages? Let’s step through what these languages actually have the machine do for “starting at location one, for the next 3 locations, add one to the number in the location”. First, the C program:

NOTE: location[some value] is shorthand for “the location identified by some value.” egg_carton[3] is the third egg in the carton. Additionally, you should read these as sequential instructions: “first do this, then do that.” Finally, these examples are greatly simplified for the purposes of this article.

1: counter = 1
2: location[counter] = 1 + 1
3: if (counter equals 3) terminate
4: counter = 2
5: location[counter] = 2 + 1
6: if (counter equals 3) terminate
7: counter = 3
8: location[counter] = 3 + 1
9: if (counter equals 3) terminate

Very roughly speaking, this is what the computer does. The programmer will use a counter to keep track of their location in the array. After updating each location, they will test the counter to see if they should stop. If they keep going they will repeat this process until the stop condition is satisfied. The Java programmer would write mostly the same program, but the program that translates the Java code into machine code (called a compiler) will add some stuff:

1: counter = 1
2: if (counter greater than array length) crash
3: location[counter] = 1 + 1
4: if (counter equals 3) terminate
5: counter = 2
6: if (counter greater than array length) crash
7: location[counter] = 2 + 1
8: if (counter equals 3) terminate
9: counter = 3
10: if (counter greater than array length) crash
11: location[counter] = 3 + 1
12: if (counter equals 3) terminate

As you can see, 3 extra lines were added. If you know for a fact that the array you are working with has a length that is greater than or equal to three, then this code is redundant.
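The difference between the two listings can be mimicked in a few lines of Python. Treat this as a rough simulation rather than real C or Java: memory is a hypothetical flat list standing in for the machine’s memory, the array occupies only part of it, and the function names are made up for this example. Python itself always bounds-checks its lists, so the “unchecked” version only imitates C by never comparing against the array’s length.

```python
def increment_unchecked(memory, start, count):
    """C-like: trust the caller's count and never look at the array length."""
    for offset in range(count):
        memory[start + offset] += 1


def increment_checked(memory, start, length, count):
    """Java-like: test every access against the array length first."""
    for offset in range(count):
        if offset >= length:
            raise IndexError("wrote past the end of the array")
        memory[start + offset] += 1


# The array |1|2|3| lives at positions 1..3; the 9 and the 7 belong to someone else.
memory = [9, 1, 2, 3, 7]
increment_unchecked(memory, start=1, count=4)  # off-by-one: 4 instead of 3
# memory is now [9, 2, 3, 4, 8]: the neighbouring 7 was silently changed to 8.
```

With increment_checked and the same mistaken count of 4, the three real locations are updated and then the function raises an error before touching the neighbouring cell.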

For such a small array, this might not be a huge deal, but suppose the array had a billion elements. Suddenly an extra billion instructions were added. Your phone’s processor likely runs at 1-3 gigahertz, which means that it has an internal clock that ticks 1-3 billion times per second. The smallest amount of time that an instruction can take is one clock cycle, which means that in the best case scenario on a 1 gigahertz processor, the Java program takes one entire second longer to complete. The fact of the matter is that “if (counter greater than array length) crash” definitely takes longer than one clock cycle to complete. For a game on your phone, this extra second may be acceptable. For the onboard computer in your car, it is definitely not. Imagine if your brakes took an extra second to engage after you pushed the pedal. Congressmen would get involved!
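The back-of-the-envelope arithmetic can be written out. This sketch assumes the best case described above, exactly one clock cycle per check, which real hardware will not achieve; the function name and its parameters are invented for illustration.

```python
def extra_seconds(num_checks, clock_hz, cycles_per_check=1):
    """Time added by bounds checks, assuming a fixed cycle cost per check."""
    return num_checks * cycles_per_check / clock_hz


one_ghz = extra_seconds(10**9, 1e9)    # a billion checks on a 1 GHz clock
three_ghz = extra_seconds(10**9, 3e9)  # the same checks on a 3 GHz clock
```

On the 1 gigahertz clock the checks cost a full second; on the 3 gigahertz clock they still cost about a third of a second, and any check that needs more than one cycle scales the figure up proportionally.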

windows_xp_bsod

In Java, reading off the end of an array is defined. The language defines that if you attempt to do this, the program will crash (it actually does something similar but not the same, but this is outside the scope of this article). In order to enforce this definition, it inserts these extra instructions into the program that implement the functionality. In C, reading off the end of an array is undefined. Since C doesn’t care what happens when you read off the end of an array, it doesn’t add any code to your program. C assumes you know what you’re doing, and have taken the necessary steps to ensure your program is correct. The result is that the C program is much faster than the Java program.

There are many such undefined behaviors in programming. For instance, your computer’s division function is partial just like the mathematical version. Java will test that the denominator isn’t zero, and crash if it is. C happily tells the machine to evaluate 8 / 0. Most processors will actually go into a failure state if you attempt to divide by zero, and most operating systems (such as Windows or Mac OS X) will crash your program to recover from the fault. However, there is no law that says this must happen. I could very well create a processor that sends lions to your house to punish you for trying to divide by zero. I could define x / 0 = 17. The C language committee would be perfectly fine with either solution; they just don’t care. This is why people often call languages such as C “unsafe.” This doesn’t mean that they are bad necessarily, just that their use requires caution. A chainsaw is unsafe, but it is a very powerful tool when used correctly. When used incorrectly, it will slice your face off.

What To Do

So, if defining every behavior is slow, but leaving it undefined is dangerous, what should we do? Well, the fact of the matter is that in most cases, the added overhead of adding these extra instructions is acceptable. In these cases, “safe” languages such as Java are preferred because they ensure program correctness. Some people will still write these sorts of programs in unsafe languages such as C (for instance, my own DMP Photobooth is implemented in C), but strictly speaking there are better options. This is part of the explanation for the phenomenon that “computers get faster every year, but [insert program] is just as slow as ever!” Since the performance of [insert program] was deemed to be “good enough”, this extra processing power is instead being devoted to program correctness. If you’ve ever used older versions of Windows, then you know that your programs not constantly crashing is a Good Thing.

windows_xp_bsod

This is fine and good for those programs, but what about the ones that cannot afford this luxury? These other programs fall into a few general categories, two of which we’ll call “real-time” and “big data.” These are buzzwords that you’ve likely heard before. “Big data” programs are the programs that actually process one billion element arrays. An example of this sort of software would be software that is run by a financial company. Financial companies have billions of transactions per day, and these transactions need to post as quickly as possible. (Suppose you deposit a check; you want those funds to be available as quickly as possible.) These companies need all the speed they can get, and all those extra instructions dedicated to totality are holding up the show.

Meanwhile “real-time” applications have operations that absolutely must complete in a set amount of time. Suppose I’m flying a jet, and I push the button to raise a wing flap. That button triggers an operation in the program running on the flight computer, and if that operation doesn’t complete immediately (where “immediately” is some fixed, non-zero-but-really-small amount of time) then that program is not correct. In these cases, the programmer needs to have very precise control over what instructions are produced, and they need to make every instruction count. In these cases, redundant totality checks are a luxury that is not in the budget.

Real-time and big data programs need to be fast, so they are often implemented in unsafe languages, but that does not mean that invoking undefined behavior is OK. If a financial company sets your account balance to be check value / 0, you are not going to have a good day. If your car reads the braking strength from a location off to the right of the braking strength array, you are going to die. So, what do these sorts of programs do?

One very common method, often used in safety-critical software such as a car’s onboard computer, is to employ strict coding standards. MISRA C is a set of guidelines for programming in C to help ensure program correctness. Such guidelines instruct the developer on how to program to avoid unsafe behavior. Enforcement of the guidelines is ensured by peer review, software testing, and static program analysis.

Static program analysis (or just static analysis) is the process of running a program on a codebase to check it for defects. For MISRA C, there exists tooling to ensure compliance with its guidelines. Static analysis can also be more general. Over the last year or so, I’ve been assisting with a research project at UCSD called Liquid Haskell. Simply put, Liquid Haskell provides the programmer with ways to specify requirements about the inputs and outputs of a piece of code. Liquid Haskell could ensure the correct usage of division by specifying a “precondition” that “the denominator must not equal zero.” (I believe that this actually comes for free if you use Liquid Haskell, as part of its basic built-in checks.) After specifying the precondition, the tool will check your codebase, find all uses of division, and verify that zero will never be used as the denominator.

It does this by determining where the denominator value came from. If the denominator is some literal (i.e. the number 7, and not some variable a that can take on multiple values), it will examine the literal and ensure it meets the precondition of division. If the number is an input to the current routine, it will ensure the routine has a precondition on that value that it not be zero. If the number is the output from some other routine, it verifies that the routine that produced the value has, as a “postcondition”, that its result will never be zero. If the check passes for all usages of division, your use of division will be declared safe. If the check fails, it will tell you what usages were unsafe, and you will be able to fix it before your program goes live. The Haskell programming language is very safe to begin with, but a Haskell program verified by Liquid Haskell is practically Fort Knox!
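To give a flavor of the idea, here is a deliberately tiny checker in Python. It is a toy written for this explanation and not how Liquid Haskell is implemented: the list of (numerator, denominator) pairs, the set of names declared nonzero, and the function name are all assumptions made up here.

```python
def find_unsafe_divisions(divisions, nonzero_names):
    """Return the divisions whose denominator is not known to be nonzero.

    Each division is a (numerator, denominator) pair. A denominator is either
    a numeric literal or the name of a value; names are trusted only if some
    precondition or postcondition has declared them nonzero.
    """
    unsafe = []
    for numerator, denominator in divisions:
        if isinstance(denominator, (int, float)):
            safe = denominator != 0              # literal: just look at it
        else:
            safe = denominator in nonzero_names  # name: needs a declared guarantee
        if not safe:
            unsafe.append((numerator, denominator))
    return unsafe


divisions = [("total", 7), ("total", "count"), ("total", "n"), ("x", 0)]
report = find_unsafe_divisions(divisions, nonzero_names={"count"})
```

Here report holds the two problem cases: dividing by n, about which nothing is known, and dividing by the literal 0. Nothing is executed to find them; the checker only inspects a description of the code, which is what makes the analysis “static.”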

The Human Factor

Humans are imperfect; we make mistakes. However, we make up for it in our ability to respond to dynamic situations. A human would never fail to grab the 259th egg from a 12 egg carton and crack it into a bowl; the human wouldn’t even try. The human can see that there are only 12 eggs without having to be told to do so, and will respond accordingly. Machines do not make mistakes; they do exactly what you tell them to, exactly how you told them to do it. If you tell the machine to grab the 259th egg and crack it into a bowl, it will reach its hand down, grab whatever is in the space 258 egg lengths to the right of the first egg, and smash it on the edge of a mixing bowl. You can only hope that nothing valuable was in that spot.

Most people don’t necessarily have a strong intuition for what “undefined behavior” is, but mathematicians and programmers everywhere fight this battle every day.

Procedural Moonbase: 30 Seconds to Launch

Mere hours from the deadline, the project comes to its conclusion. However, some sacrifices had to be made. Andrew tried valiantly to get the procedural city generation working, but it was a lost cause. Instead he implemented shadow mapping.

Honestly, I think this works better for our concept. I was a bit worried that the scene might be a bit too busy with the city, but I think the addition of shadows really works, and gives the trees the last little bit of “oomph” that they needed. The “Octopus” tree especially casts a nice ominous shadow over the landscape, and it all plays quite nicely with the orbiting light.

One Last Push

On the eve of the deadline, work continues at a feverish pace…

Procedural Terrain and L-systems

Work on procedural terrain generation and crystal formations was mostly done for last Friday’s post. A bit of polish work has occurred since then. Most notably, I’ve smoothed out the terrain of the real world moon heightmap data. The result is much more moon-like:

smoother

Also, I’ve tweaked the shader for the crystal growths. I’ve used a technique similar to what is done in cel shading to recognize the silhouettes of the growths. At first I was planning on implementing actual cel shading, but it did not look good. However, I discovered that I could favor the external shell portion of the texture in the silhouette area, much like how if you look at a transparent object from an extreme angle, you see more of the surface material. I find the result to be quite satisfying:

newShader

… compared to last week:

anemone

Procedural Cities and Buildings

This weekend, Andrew added functionality to render the smaller heightmap for the city on top of the buildable area of the larger heightmap.

The main C++ program invokes a Python script to generate the heightmap and road network, then reads the data into a new LandscapeModel at the appropriate location. The program then renders the road network using a solid color shader for now. By rendering only a portion of the road network at a time, it’s possible to animate the process, showing how the network grows from the population centers and intersects with itself.

roads

The more significant challenge has been extracting buildable areas from the road network. Implementing the road network growth required hacking together a spatially indexed graph database. The generated graphs are subtly broken – some roads cross without an intersection, so the graph is not necessarily planar.

Andrew needs to resolve this issue before he can reliably extract cycles, and he needs to extract buildable plots before he can place and generate buildings. He will be working on that tonight.

Where the graph is sensible, he is able to extract a cycle:

cycleSuccess

But in a case where the edges intersect somewhere besides an intersection point, this fails:

cycleFail
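One way such broken spots could be found is to test every pair of road segments for a crossing that does not happen at a shared endpoint. The sketch below is an assumption about how that might look, not the project’s code: it uses the standard orientation (cross product) test, represents each road as a pair of (x, y) tuples, and checks all pairs by brute force rather than through a spatial index.

```python
def orientation(p, q, r):
    """Cross product of (q - p) and (r - p): sign tells which side of pq r lies on."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def properly_cross(a, b, c, d):
    """True if segments ab and cd cross at a point strictly inside both."""
    d1 = orientation(a, b, c)
    d2 = orientation(a, b, d)
    d3 = orientation(c, d, a)
    d4 = orientation(c, d, b)
    # Each segment's endpoints must lie strictly on opposite sides of the other.
    return d1 * d2 < 0 and d3 * d4 < 0


def find_crossings(edges):
    """Return index pairs (i, j) of edges that cross away from an endpoint."""
    crossings = []
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            if properly_cross(*edges[i], *edges[j]):
                crossings.append((i, j))
    return crossings


roads = [((0, 0), (2, 2)), ((0, 2), (2, 0)), ((2, 2), (4, 2))]
bad_pairs = find_crossings(roads)
```

In this small example the first two roads cross in the middle with no intersection node, so they are reported, while the third road merely touches the first at a shared endpoint and is left alone. Inserting a node at each reported crossing would be one way to make the graph planar again.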

Plumbing

There is an additional requirement that all procedural elements be able to be regenerated at runtime. I’ve implemented the plumbing to ensure that this happens. At the press of a button, the terrain can be regenerated, the crystals can be regenerated and the real world heightmap data can be toggled on and off. Additionally, when the city generation is complete, the plumbing is in place for that as well.

All this regeneration happens much faster than I’d thought it would, which was nice. There was no need to optimize the regeneration or implement asynchronous regeneration.

Onwards To The Moon

As the deadline looms, work continues at a feverish pace. Much has happened, and much remains to be done.

Procedural Terrain Generation

The terrain generation is basically done. We are generating random landscape heightmaps using the Diamond-Square algorithm. After generating the heightmaps, we carve out a flat space for the city, and then place various mineral deposits and crystal growths. After the features are set, we tessellate the landscape.
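For readers unfamiliar with Diamond-Square, here is a minimal Python sketch of the algorithm. It is not the project’s implementation (which is in C++); the parameter names, the noise range, and the choice to halve the noise at every level are assumptions made for illustration.

```python
import random


def diamond_square(n, roughness=1.0, seed=0):
    """Return a (2**n + 1) x (2**n + 1) heightmap as a list of rows."""
    rng = random.Random(seed)
    size = 2 ** n + 1
    h = [[0.0] * size for _ in range(size)]
    # Start from random heights at the four corners.
    for r in (0, size - 1):
        for c in (0, size - 1):
            h[r][c] = rng.uniform(-1.0, 1.0)
    step, scale = size - 1, roughness
    while step > 1:
        half = step // 2
        # Diamond step: the centre of each square gets the mean of its corners plus noise.
        for r in range(half, size, step):
            for c in range(half, size, step):
                avg = (h[r - half][c - half] + h[r - half][c + half] +
                       h[r + half][c - half] + h[r + half][c + half]) / 4.0
                h[r][c] = avg + rng.uniform(-scale, scale)
        # Square step: each edge midpoint gets the mean of its in-bounds neighbours plus noise.
        for r in range(0, size, half):
            for c in range((r + half) % step, size, step):
                total, count = 0.0, 0
                for dr, dc in ((-half, 0), (half, 0), (0, -half), (0, half)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < size and 0 <= cc < size:
                        total += h[rr][cc]
                        count += 1
                h[r][c] = total / count + rng.uniform(-scale, scale)
        step, scale = half, scale / 2.0
    return h
```

Calling diamond_square(3) yields a 9 by 9 grid of heights. Shrinking the noise at each level is what gives large features a rough outline and small features a gentler one.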

progGenTerrain

We are also capable of reading in external heightmap data, and I’ve located a heightmap of the surface of the moon, which we are capable of turning into a 3D surface.

As you can see, the lunar surface is much rougher than one would think.

theMoon

Crystalline Structures

New this week are the crystalline structures seen in the screenshots above. These are generated using L-systems, with the following grammar:

V = { D(len, topScale, bottomScale, topLen, bottomLen),
      C(len, topScale, bottomScale, topLen, bottomLen),
      S(scale),
      T(minSegs) }
S = { F(theta), K(theta), A(r, s) }
ω1 = { A(C(6,1,1,1,1), D(1.0f, 3.0f, 2.5f, 0.5f, 1.0f)) }
ω2 = { A(C(6,1,1,1,1), A(F(1), S(1))) }
ω3 = { A(S(1), A(K(1), T(3))) }
P = { D -> A(D, A(K(1), C(2,1,1,1,1))),
      C -> A(C, A(F(0.75), T(3))),
      T : final iteration -> A(T, S(0.5)),
      T : otherwise -> A(T, T(1)),
      S -> A(S, D(3,2,1,1,1)) }

There is a basic crystal building block, represented by D and C. These are the same shape, but D has 5 branching points, and C has 1 (the center). To build on a mounting point, one uses the A rule, which states “For all mounting points produced by r, build s on it.”

Rounding out the bunch are S, a “scepter”-shaped crystal; F, which fans out in three directions; and K, which forks in two directions. Last but not least is the “angry tentacle” formation T.
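The rewriting mechanism behind any L-system is small enough to sketch. The Python below is a generic illustration using Lindenmayer’s classic two-symbol “algae” system, not the parametric crystal grammar above; the function and variable names are made up for this example.

```python
def expand(axiom, rules, iterations):
    """Rewrite every symbol of the string in parallel, `iterations` times."""
    current = axiom
    for _ in range(iterations):
        # Symbols without a production are copied through unchanged.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# Lindenmayer's algae system: A -> AB, B -> A
ALGAE_RULES = {"A": "AB", "B": "A"}
generations = [expand("A", ALGAE_RULES, n) for n in range(5)]
# generations == ["A", "AB", "ABA", "ABAAB", "ABAABABA"]
```

The crystal grammar works on the same principle, except that its symbols carry parameters and one of its productions depends on whether the current iteration is the last.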

Here we see ω1:

anemone

… ω2:

coral

…and ω3:

octopus

Procedural City Generation

We’ll be placing a procedurally generated moon base on the flat build site carved out of the terrain. The moon’s surface is very rough, so we can only place the base on the flat area.

However, the area under the city isn’t completely flat; we are using a technique called spectral synthesis to generate a smoother heightmap underneath the city. The edges of this heightmap will coincide smoothly with the surrounding terrain to ensure a natural transition.

167cg1

After generating the heightmaps, we generate a population density map using a similar procedure, attenuating it where the terrain is steepest (it’s difficult to build on the side of a hill!). Then we sample the density map and place population centers using k-means clustering. Finally, we triangulate the set of population centers to generate a connectivity graph for road generation.
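As a rough picture of the clustering step, here is a bare-bones k-means (Lloyd’s algorithm) in plain Python. It is a sketch under stated assumptions, not the project’s code, which leans on NumPy and SciPy: samples are (x, y) tuples, the first k samples serve as the naive initial centers, and the function name is invented here.

```python
def k_means(points, k, iterations=20):
    """Cluster 2D points into k groups and return the k cluster centers."""
    centers = list(points[:k])  # naive initialisation: the first k samples
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centers[i][0]) ** 2 + (p[1] - centers[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean_x = sum(x for x, _ in cluster) / len(cluster)
                mean_y = sum(y for _, y in cluster) / len(cluster)
                new_centers.append((mean_x, mean_y))
            else:
                new_centers.append(centers[i])  # keep a center that lost all its points
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers


samples = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
population_centers = k_means(samples, k=2)
```

With these six samples the two centers settle near (0.33, 0.33) and (10.33, 10.33), one per clump. In the pipeline described above, the samples would be drawn from the density map so that dense areas contribute more points.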

167cg2

Once we have a connectivity graph, we draw adaptive highways between population centers using the technique described in Citygen: An Interactive System for Procedural City Generation, using a heuristic combining Least Elevation Difference, population density, and the degree to which the road would deviate from its current direction. As these are drawn, we create additional perpendicular roads in areas of high population density. These roads are extended using the same heuristic until they reach a less populated area.

167cg3

So far, we’ve implemented the above in Python so we can leverage the performant array operations from NumPy and SciPy, in addition to the spatial indexing functionality from Rtree. We’ll need to use the Python C API to call into our Python code and store the resulting road graph in a buffer for rendering.

The Way Ahead

Much work remains to be done. For terrain and L-systems, there remains polish work. I’ve implemented the required algorithms, and it technically “works.” However, it’s still a bit rough. I hope to refine the shaders of the crystals to make them appear more 3-dimensional. Additionally, I’d like to increase the density of “trees” on the landscape as I think it looks a bit sparse. I hope to have this work wrapped up by tomorrow night.

For city generation, the road graph needs to be converted to buildable cells, and subdivided/populated with buildings. Andrew plans to have this done by tomorrow night. By Sunday, he plans to have the rendering set up for this, with basic buildings. Then by Monday, more elaborate buildings will follow.

Additionally, we need the capability to regenerate the various procedurally generated components at runtime. This is a simple matter of plumbing, and should not be difficult. Time allowing, I’d like to do this asynchronously, so there is no frame hiccup during regeneration. The user will press a button, and a few seconds later, the change will be reflected.

Procedural Moonbase

Having reached the end of UCSD’s Intro to Computer Graphics course, I have been tasked with creating a real-time demo that implements a subset of features covered in the class. Today is the first of three posts about the progress of this project.

For this project, my partner Andrew Buss and I (Chris Tetreault) will be implementing 4 features:

  • Procedurally modelled city
  • Procedurally modelled buildings
  • Procedurally generated terrain
  • Procedurally generated “plants” with L-systems

With all of these, we’ll be creating a procedurally generated moon base. First, we will generate a landscape, and carve out a flat spot for the base. Next, we will generate a city, which consists of a procedurally generated layout populated by procedurally generated buildings. Finally, we will procedurally generate doodads to be placed on the landscape throughout the undeveloped portion of the terrain.

All of this will be implemented in modern C++ with modern OpenGL.

Terrain and “Plants”

This portion of the project is coming along nicely. We are currently generating the landscape, with the exception of textures. We currently have the surface normals set as the pixel color in the fragment shader.

Procedurally Generated landscape

At some elevation lower than the city, I plan to add lava. Additionally, I plan to have two land-based textures that I will select between based on the elevation gain. Likely, this will be a smooth rock texture for steep hills, and gravel for flatter areas.

After completing this, I will use L-systems to generate stuff to put on the landscape. This will likely take the form of geological formations, such as rocks or lava flows as one isn’t likely to find trees on a blasted lunar wasteland.

Cities and Buildings

For the city portion, we’ll be procedurally generating moon base buildings. We decided to go with the moon base, as opposed to a traditional city, because we felt this could give us more creative freedom to do what we want. After all, if the task is to procedurally generate a “City”, how many urban downtown areas are you likely to see?

This portion of the project is still in the planning phase, so expect to see more on this next week.
