Archive | C RSS for this section

Reboot Early, and Often!

A wise man once said to me “Reboot early, and often!” He said it many times actually. When you’re trying to troubleshoot programs for which you don’t have the source code, you tend to develop little rituals that you go through to “fix” them…

So here I was today, trying to power through the boilerplate to draw a triangle in Vulkan. I had just finished up creating my Pipeline object, and ran my program to see if it worked. As you can imagine, it didn’t work.

Shockingly, C++ Causes Problems

First up: my shaders are invalid. The error:

validation layer: ObjectTracker : Invalid Shader Module Object 0xa.

Of course, the Object Tracker layer is there to tell you when you try to use objects that are not valid handles. Usually this is because vKcreateFoo did not return VK_SUCCESS. Of course, vkCreateShaderModule definitely did return VK_SUCCESS, because if it hadn’t my program would have died in a fire thanks to my expect assertion macro. Surely either I have a driver bug or C++ is doing “exactly what I told it to.”

So I do some troubleshooting. I upgrade my driver which does nothing. Then I examine my code. Buried in a comment thread on this stackoverflow question is the suggestion that a destructor is being called. I examine my code, and sure enough, the RAII object that I’m storing my shader in is being destructed before the call to vkCreateGraphicsPipeline. C++ was doing “exactly what I told it to.”

Sound Advice

So an hour later, and one problem down. I recompile and try again:

vkEnumeratePhysicalDevices: returned VK_ERROR_INITIALIZATION_FAILED, indicating that initialization of an object has failed

Shenanigans! There’s no way that vkEnumeratePhysicalDevices is messed up, that’s been working for weeks! I turn to the Googler, but to no avail. Then the words rang in my head “Reboot early and often!” It’s not like I just updated my graphics driver without rebooting or anything… Oh wait!

Moral of the story: reboot early, and often!

The Specter of Undefined Behavior

If you’ve ever spoken to a programmer, and really got them on a roll, they may have said the words “undefined behavior” to you. Since you speak English, you probably know what each of those words mean, and can imagine a reasonable meaning for them in that order. But then your programmer friends goes on about “null-pointer dereferencing” and “invariant violations” and you start thinking about cats or football or whatever because you are not a programmer.

I often find myself being asked what it is that I do. Since I’ve spent the last few years working on my Computer Science degree, and have spent much of that time involved in programming language research, I often find myself trying to explain this concept. Unfortunately, when put on the spot, I usually am only able to come up with the usual sort of explanation that programmers use among themselves: “If you invoke undefined behavior, anything can happen! Try to dereference a null pointer? Bam! Lions could emerge from your monitor and eat your family!” Strictly speaking, while I’m sure some compiler writer would implement this behavior if they could, it’s not a good explanation for a person who doesn’t already kind of understand the issues at play.

Today, I’d like to give an explanation of undefined behavior for a lay person. Using examples, I’ll give an intuitive understanding of what it is, and also why we tolerate it. Then I’ll talk about how we go about mitigating it.

Division By Zero

Here is one that most of us know. Tell me, what is 8 / 0? The answer of course is “division by zero is undefined.” In mathematics, there are two sorts of functions: total and partial. A total function is defined for all inputs. If you say a + b, this can be evaluated to some result no matter what you substitute for a and b. Addition is total. The same cannot be said for division. If you say a / b, this can be evaluated to some result no matter what you substitute for a and b unless you substitute b with 0. Division is not total.

If you go to the Wikipedia article for division by zero you’ll find some rationale for why division by zero is undefined. The short version is that if it were defined, then it could be mathematically proven that one equals two. This would of course imply that cats and dogs live in peace together and that pigs fly, and we can’t have that!

ti86_calculator_divbyzero

However, there is a way we can define division to be total that doesn’t have this issue. Instead of defining division to return a number, we could define division to return a set of numbers. You can think of a set as a collection of things. We write this as a list in curly braces. {this, is, a, set, of, words} I have two cats named Gatito and Moogle. I can have a set of cats by writing {Gatito, Moogle}. Sets can be empty; we call the empty set the null set and can write it as {} or using this symbol . I’ll stick with empty braces because one of the things I hate about mathematics is everybody’s insistence on writing in Greek.

So here is our new total division function:

totalDivide(a, b) if (b does not equal 0) output {a / b} otherwise output {}

If you use totalDivide to do your division, then you will never have to worry about the undefined behavior of division! So why didn’t Aristotle (or Archimedes or Yoda or whoever invented division) define division like this in the first place? Because it’s super annoying to deal with these sets. None of the other arithmetic functions are defined to take sets, so we’d have to constantly test that the division result did not produce the empty set, and extract the result from the set. In other words: while our division is now total, we still need to treat division by zero as a special case. Let us try to evaluate 2/2 + 2/2 and totalDivide(2,2) + totalDivide(2,2):

1: 2/2 + 2/2 2: 1 + 1 3: 2

Even showing all my work, that took only 3 lines.

1: let {1} = totalDivide(2,2) 2: let {1} = totalDivide(2,2) 3: 1 + 1 4: 2

Since you can’t add two sets, I had to evaluate totalDivide out of line, and extract the values and add them separately. Even this required my human ability to look at the denominator and see that it wasn’t zero for both cases. In other words, making division total made it much more complicated to work with, and it didn’t actually buy us anything. It’s slower. It’s easier to mess up. It has no real value. As humans, it’s fairly easy for us to look at the denominator, see that it’s zero, and just say “undefined.”

Cartons of Eggs

I’m sure many of you have a carton of eggs in your fridge. Go get me the 17th egg from your carton of eggs. Some of you will be able to do this, and some of you will not. Maybe you only have a 12 egg carton. Maybe you only have 4 eggs in your 18 egg carton, and the 17th egg is one of the ones that are missing. Maybe you’re vegan.

A basic sort of construct in programming is called an “array.” Basically, this is a collection of the same sort of things packed together in a region of memory on your computer. You can think of a carton of eggs as an array of eggs. The carton only contains one sort of thing: an egg. The eggs are all packed together right next to each other with nothing in between. There is some finite amount of eggs.

SAMSUNG DIGITAL CAMERA

If I told you “for each egg in the carton, take it out and crack it, and dump it in a bowl starting with the first egg”, you would be able to do this. If I told you “take the 7th egg and throw it at your neighbor’s house” you would be able to do this. In the first example, you would notice when you cracked the last egg. In the second example you would make sure that there was a 7th egg, and if there wasn’t you probably picked some other egg because your neighbor is probably a jerk who deserves to have his house egged. You did this unconsciously because you are a human who can react to dynamic situations. The computer can’t do this.

If you have some array that looks like this (array locations are separated by | bars | and * stars * are outside the array) ***|1|2|3|*** and you told the computer “for each location in the array, add 1 to the number, starting at the first location” it would set the first location to be 2, the second location to be 3, the third location to be 4. Then it would interpret the bits in the location of memory directly to the right of the third location as a number, and it would add 1 to this “number” thereby destroying the data in that location. It would do this forever because this is what you told the machine to do. Suppose that part of memory was involved in controlling the brakes in your 2010 era Toyota vehicle. This is obviously incredibly bad, so how do we prevent this?

The answer is that the programmer (hopefully) knows how big the array is and actually says “starting at location one, for the next 3 locations, add one to the number in the location”. But suppose the programmer messes up, and accidentally says “for the next 4 locations” and costs a multinational company billions of dollars? We could prevent this. There are programming languages that give us ways to prevent these situations. “High level” programming languages such as Java have built-in ways to tell how long an array is. They are also designed to prevent the programmer from telling the machine to write past the end of the array. In Java, the program will successfully write |2|3|4| and then it will crash, rather than corrupting the data outside of the array. This crash will be noticed in testing, and Toyota will save face. We also have “low level” programming languages such as C, which don’t do this. Why do we use low level programming languages? Let’s step through what these languages actually have the machine do for “starting at location one, for the next 3 locations, add one to the number in the location”: First the C program:

NOTE: location[some value] is shorthand for “the location identified by some value.” egg_carton[3] is the third egg in the carton. Additionally, you should read these as sequential instructions “first do this, then do that” Finally, these examples are greatly simplified for the purposes of this article.

1: counter = 1 2: location[counter] = 1 + 1 3: if (counter equals 3) terminate 4: counter = 2 5: location[counter] = 2 + 1 6: if (counter equals 3) terminate 7: counter = 3 8: location[count] = 3 + 1 9: if (counter equals 3) terminate

Very roughly speaking, this is what the computer does. The programmer will use a counter to keep track of their location in the array. After updating each location, they will test the counter to see if they should stop. If they keep going they will repeat this process until the stop condition is satisfied. The Java programmer would write mostly the same program, but the program that translates the Java code into machine code (called a compiler) will add some stuff:

1: counter = 1 2: if (counter greater than array length) crash 3: location[counter] = 1 + 1 4: if (counter equals 3) terminate 5: counter = 2 6: if (counter greater than array length) crash 7: location[counter] = 2 + 1 8: if (counter equals 3) terminate 9: counter = 3 10: if (counter greater than array length) crash 11: location[count] = 3 + 1 12: if (counter equals 3) terminate

As you can see, 3 extra lines were added. If you know for a fact that the array you are working with has a length that is greater than or equal to three, then this code is redundant.

For such a small array, this might not be a huge deal, but suppose the array was a billion elements. Suddenly an extra billion instructions were added. Your phone’s processor likely runs at 1-3 gigahertz, which means that it has an internal clock that ticks 1-3 billion times per second. The smallest amount of time that an instruction can take is one clock cycle, which means that in the best case scenario, the java program takes one entire second longer to complete. The fact of the matter is that “if (counter greater than array length) crash” definitely takes longer than one clock cycle to complete. For a game on your phone, this extra second may be acceptable. For the onboard computer in your car, it is definitely not. Imagine if your brakes took an extra second to engage after you push the pedal? Congressmen would get involved!

windows_xp_bsod

In Java, reading off the end of an array is defined. The language defines that if you attempt to do this, the program will crash (it actually does something similar but not the same, but this is outside the scope of this article). In order to enforce this definition, it inserts these extra instructions into the program that implement the functionality. In C, reading off the end of an array is undefined. Since C doesn’t care what happens when you read off the end of an array, it doesn’t add any code to your program. C assume you know what you’re doing, and have taken the necessary steps to ensure your program is correct. The result is that the C program is much faster than the Java program.

There are many such undefined behaviors in programming. For instance, your computer’s division function is partial just like the mathematical version. Java will test that the denominator isn’t zero, and crash if it is. C happily tells the machine to evaluate 8 / 0. Most processors will actually go into a failure state if you attempt to divide by zero, and most operating systems (such as Windows or Mac OSX) will crash your program to recover from the fault. However, there is no law that says this must happen. I could very well create a processor that sends lions to your house to punish you for trying to divide by zero. I could define x / 0 = 17. The C language committee would be perfectly fine with either solution; they just don’t care. This is why people often call languages such as C “unsafe.” This doesn’t mean that they are bad necessarily, just that their use requires caution. A chainsaw is unsafe, but it is a very powerful tool when used correctly. When used incorrectly, it will slice your face off.

What To Do

So, if defining every behavior is slow, but leaving it undefined is dangerous, what should we do? Well, the fact of the matter is that in most cases, the added overhead of adding these extra instructions is acceptable. In these cases, “safe” languages such as Java are preferred because they ensure program correctness. Some people will still write these sorts of programs in unsafe languages such as C (for instance, my own DMP Photobooth is implemented in C), but strictly speaking there are better options. This is part of the explanation for the phenomenon that “computers get faster every year, but [insert program] is just as slow as ever!” Since the performance of [insert program] we deemed to be “good enough”, this extra processing power is instead being devoted to program correctness. If you’ve ever used older versions of Windows, then you know that your programs not constantly crashing is a Good Thing.

windows_xp_bsod

This is fine and good for those programs, but what about the ones that cannot afford this luxury? These other programs fall into a few general categories, two of which we’ll call “real-time” and “big data.” These are buzzwords that you’ve likely heard before, “big data” programs are the programs that actually process one billion element arrays. An example of this sort of software would be software that is run by a financial company. Financial companies have billions of transactions per day, and these transactions need to post as quickly as possible. (suppose you deposit a check, you want those funds to be available as quickly as possible) These companies need all the speed they can get, and all those extra instructions dedicated to totality are holding up the show.

Meanwhile “real-time” applications have operations that absolutely must complete in a set amount of time. Suppose I’m flying a jet, and I push the button to raise a wing flap. That button triggers an operation in the program running on the flight computer, and if that operation doesn’t complete immediately (where “immediately” is some fixed, non-zero-but-really-small amount of time) then that program is not correct. In these cases, the programmer needs to have very precise control over what instructions are produced, and they need to make every instruction count. In these cases, redundant totality checks are a luxury that is not in the budget.

Real-time and big data programs need to be fast, so they are often implemented in unsafe languages, but that does not mean that invoking undefined behavior is OK. If a financial company sets your account balance to be check value / 0, you are not going to have a good day. If your car reads the braking strength from a location off to the right of the braking strength array, you are going to die. So, what do these sorts of programs do?

One very common method, often used in safety-critical software such as a car’s onboard computer is to employ strict coding standards. MISRA C is a set of guidelines for programming in C to help ensure program correctness. Such guidelines instruct the developer on how to program to avoid unsafe behavior. Enforcement of the guidelines is ensured by peer-review, software testing, and Static program analysis.

Static program analysis (or just static analysis) is the process of running a program on a codebase to check it for defects. For MISRA C, there exists tooling to ensure compliance with its guidelines. Static analysis can also be more general. Over the last year or so, I’ve been assisting with a research project at UCSD called Liquid Haskell. Simply put, Liquid Haskell provides the programmer with ways to specify requirements about the inputs and outputs of a piece of code. Liquid Haskell could ensure the correct usage of division by specifying a “precondition” that “the denominator must not equal zero.” (I believe that this actually comes for free if you use Liquid Haskell as part of its basic built-in checks) After specifying the precondition, the tool will check your codebase, find all uses of division, and ensure that you ensured that zero will never be used as the denominator.

It does this by determining where the denominator value came from. If the denominator is some literal (i.e. the number 7, and not some variable a that can take on multiple values), it will examine the literal and ensure it meets the precondition of division. If the number is an input to the current routine, it will ensure the routine has a precondition on that value that it not be zero. If the number is the output from some other routine, it verifies that the the routine that produced the value has, as a “postcondition”, that its result will never be zero. If the check passes for all usages of division, your use of division will be declared safe. If the check fails, it will tell you what usages were unsafe, and you will be able to fix it before your program goes live. The Haskell programming language is very safe to begin with, but a Haskell program verified by Liquid Haskell is practically Fort Knox!

The Human Factor

Humans are imperfect, we make mistakes. However, we make up for it in our ability to respond to dynamic situations. A human would never fail to grab the 259th egg from a 12 egg carton and crack it into a bowl; the human wouldn’t even try. The human can see that there is only 12 eggs without having to be told to do so, and will respond accordingly. Machines do not make mistakes, they do exactly what you tell them to, exactly how you told them to do it. If you tell the machine to grab the 259th egg and crack it into a bowl, it will reach it’s hand down, grab whatever is in the space 258 egg lengths to the right of the first egg, and smash it on the edge of a mixing bowl. You can only hope that nothing valuable was in that spot.

Most people don’t necessarily have a strong intuition for what “undefined behavior” is, but mathematicians and programmers everywhere fight this battle every day.

Of Semicolons and Sadness

Lately, I’ve been writing a lot of Haskell. If you’ve been following the blog, you likely are aware of this. However, I’ve been feeling the itch to try something different. I’ve been dabbling in things here and there, but for some reason I just can’t settle into a groove. It seems that Haskell has ruined programming for me.

Why? Is it the functional purity of Haskell? Is it the way that lazy evaluation changes the way you think about programming? Is it the sublime type system that catches whole classes of bugs at compile time? No, I’m afraid not. My issue is much more basic; semicolons and angle brackets.

Your Mission

…should you choose to accept it: write a function fun that does the following: takes an integer argument acc, and two arguments of the same generic type. This function should enforce that the two generic arguments cannot be changed. This function calls another function given that takes two arguments of the same generic type, that also cannot be changed. given does something, then returns an integer as a result. given can fail!

After calling given, if it failed, return acc - 1, if it succeeded, return its result + acc.

In C

C is a pretty simple language. I like simplicity (it’s the whole premise of this post, right?) but C’s simplicity turns into overly verbose code in the face of the requirements of fun. Here is the signature of given

// please ensure that lhs and rhs have the same type int given(int * result, void const * const lhs, void const * const rhs)

So, what’s going on here? In C, the dominant idiom for indicating success or failure is to return an error code, thus we return an int. However, this isn’t our return value! We use the out parameter result instead. Finally, we have lhs and rhs, which have the confusing types void const * const. These read as “constant pointer to constant void”, and ensures that neither the where the pointer points, or the value of what it points to can change. (thus documenting the requirement that it not change its arguments!) Finally, we have our comment asking the caller to pretty please, ensure lhs and rhs have the same type. There is no way in C to ensure two void * are actually the same type, so this is the best we can do… Now, let’s see my implementation of fun:

// please ensure that lhs and rhs have the same type int fun(int acc, void const * const lhs, void const * const rhs) { int res = 0; if (given(&res, lhs, rhs)) { return acc + res; } else { return acc - 1; } }

We have the same confusing sort of signature, except here the return value is our actual return value, and acc is not an out parameter. (which is really obvious, right?)

Inside of this function, we allocate res on the stack and initialize it to 0. Then we call given with the address of res, and lhs and rhs. Taking the address of a something requires a special symbol. This function is called within an if block because given returns 0 on failure and something on success.

Some of this noise can be reduced, but overall it’s pretty minimal. The use of void pointers and our type checking comment are troubling, but that’s not really the point of this post. However, the many semicolons, curly braces, and commas add a lot of clutter.

Noise level: Moderate
Strangeness: High
Confidence that there are no bugs in my implementation: Very Low

In C++

Next, let’s check out some C++. Here is the signature of given:

// throws an std::exception on failure! template <typename T> int given(const T & lhs, const T & rhs)

To me, this is kind of a wash from the C implementation. On one hand, we have this template <typename T> silliness, on the other hand we don’t have an out parameter, we only have one const per argument, and the compiler can actually enforce that lhs is of the same type as rhs. Next, the implementation:

template <typename T> int fun(int acc, const T & lhs, const T & rhs) { try { return acc + given<T>(lhs, rhs); } catch (std::exception & e) { return acc - 1; } }

Here we have that same yucky sort of signature, and we have these try and catch blocks. Personally, I don’t care for exceptions. However, they are a tried and true strategy accepted by the majority of the programming community, and who am I to dispute that? However, what I don’t like is the fact that we have no indication that given can throw. We had to read a comment. Additional weirdness here is the call to given: given<T>(lhs, rhs). Why do I have to do <T>?

The usual complaints about curly braces and semicolons apply here as well.

Noise Level: High
Strangeness: Low
Confidence that there are no bugs in my implementation: Low

In Rust

A newcomer to the scene is Rust. Rust is a low level systems programming language that has a very strong emphasis on safety. Excellent, right? Let’s see the signature of given:

fn given<T>(lhs: &T, rhs: &T) -> Option<i64>

… eew. We have twice as many angle brackets as the C++ implementation did, which is unfortunate. We’ve also added some colons. On the plus side, we have option types in rust, so no icky exceptions! Let’s see my implementation of fun:

fn fun<T>(acc: i64, lhs: &T, rhs: &T) -> i64 { let res = given(&lhs, &rhs); let ret = match res { Some(x) => acc + x, None => acc - 1, }; ret }

… and things get strange. Let’s go through this line by line:

fn fun<T>(acc: i64, lhs: &T, rhs: &T) -> i64: A bit noisier than I’d like, but perfectly reasonable.

let res = given(&lhs, &rhs);: Here we call given, and for some reason I have to put an & in front of lhs and rhs. At least rustc inferred the type for us!

let ret = match res {};: Yay, Rust has pattern matching! It’s a bit unfortunate about all those curly braces and semicolons though…

Some(x) => acc + x,: Nothing wrong here, in a world where commas and such are a thing…

None => acc - 1,: And here we have the last pattern of this match block. Why is there a comma there?! For reasons that science will never explain, in Rust, you have to put commas after the last entry in things like this! This oddity extends to structs and enums as well…

ret: Finally, we return. No, ret isn’t a keyword, it’s a variable that I bound in my pattern match. Notice the lack of a semicolon. In Rust, the last line in a function is the return value, and does not have a semicolon! However, rust has a return statement for early returns, and this does take a semicolon. Yes, you can use a return statement here, but according to the docs doing this makes you a bad person.

Noise Level: Very High
Strangeness: High
Confidence that there are no bugs in my implementation: Reasonable

In Haskell

Finally, let’s take a look at what I consider to be a reasonable syntax: Haskell. The signature of given is as follows:

given :: a -> a -> Maybe Int

Once you learn to read Haskell, you’ll see how simple this is. given is a function that takes an a, another a, and returns a Maybe Int. Next, let’s see my implementation of fun:

fun :: Int -> a -> a -> Int fun acc lhs rhs = case given lhs rhs of Just res -> acc + res _ -> acc - 1

Not counting the optional function signature, this is 3 lines. The first is our pattern match, which calls given. The second line handles success, and the third handles failure. There is no return statement here, the result of the expression is the return value. And unlike Rust, there’s no special case that breaks the mold.

Noise Level: Low
Strangeness: Low (If your definition of strange isn’t “doesn’t look like C”)
Confidence that there are no bugs in my implementation: High

Which Brings Me To My Point

Honestly, I believe that the reason for all of this is: “We’ll make our language look like C so that we don’t scare people away!” Personally, I believe that this is a sad state of affairs.

Strictly speaking, I do mean to criticize. C, C++, and Rust are all fine languages. They do what they set out to do, and do it well. Many people successfully use these languages every day! They are all languages worth knowing and languages worth using. However, there is no reason for all these glyphs and extra keywords!

My top feature for the new version of C++: Get rid of all uses of the angle bracket!

My top feature for the new version of Rust: Get rid of those trailing commas and the return statement!

Of course this will never happen, and for good reason: It will cause massive breakages in everybody’s codebases. But a man can dream…

Functional Doubly Linked Lists

It’s often been said that functional programming just isn’t cut out for certain tasks. File IO? Please… Databases? Forget about it!

I’ve always figured that the humble Doubly Linked list was on this list. After all, how do we implement these in C? An implementation would probably look something like this:

struct _DList { struct DList * next; struct DList * prev; void * element; } DList;

In this case, the element pointed to by prev and next have pointers to this element, if they aren’t NULL. The Doubly Linked list isn’t exactly a complicated structure, this is basically the way to do it. So, how would we do this in Haskell?

In the past, I’ve thought there were three answers to this question:

1) Use C pointers. This would involve use of unsafePerformIO, and you’d be a monster.

2) Use a singly linked list, and pretend it’s doubly linked. This would involve a “prev” function that just walks from the beginning of the list to the element before the current one. You’d be an even bigger monster than the guy who did option 1.

3) Don’t use a doubly linked list. To me, this is actually reasonable. In a world where arrays are basically always better, the only reason to use a list is because they’re very easy to deal with. If you need the level of complexity of a doubly linked list, you’re probably just better off with a different data structure.

But recently, I thought of a way to implement a doubly linked list in Haskell without being a bad person.

Preserved Here For Posterity

The implementation is actually quite simple. Here’s our type:

data DoublyLinkedList a = DList [a] [a] | Nil

Obviously, we have our Nil type for the empty list. We also have DList, which has two lists. Why two? On the left, we have the previous elements. This starts empty. On the right, we have our next elements. As we walk the list, we pop elements from the head of the right list to the head of the left list. The head of the right list is our current element. Let’s see some functions:

get :: DoublyLinkedList a -> a get (DList _ (x:xs)) = x

This function returns the “current” element; the element at the current position in the list.

next :: DoublyLinkedList a -> DoublyLinkedList a next (DList _ []) = undefined next (DList prev (c:next)) = DList (c:prev) next prev :: DoublyLinkedList a -> DoublyLinkedList a prev (DList [] _) = undefined prev (DList (c:prev) next) = DList prev (c:next)

These two functions move the cursor forward or backward. As you can see, when next is called, the current element is popped from the next list, and pushed onto the previous list. The opposite happens when prev is called. I’ve left the edge cases undefined, but a real implementation should do something sane here. (return a Maybe, throw an exception, etc…)

insert :: DoublyLinkedList a -> a -> DoublyLinkedList a insert Nil e = DList [] [e] insert (DList prev next) e = DList prev (e:next)

Elements are inserted in front of the current element.

makeDouble :: [a] -> DoublyLinkedList a makeDouble [] = Nil makeDouble l = DList [] l

…and it is trivial to convert a singly linked list into a doubly linked list. We can even add a pretty show instance!

instance (Show a) => Show (DoublyLinkedList a) where show Nil = "Nil" show (DList prev (c:next)) = (show $ reverse prev) ++ "*[" ++ (show c) ++ "]*" ++ show next

Obviously, we could go crazy and make a Monad instance and other things, but you get the idea. The final solution was very simple. One could even say simpler than the C version as you don’t have to worry about updating the next and prev pointers when adding elements. Sure, it took a bit of thought, but like many things in Haskell, the result was simple and elegant.

Getting Started With GLib in Emacs

Recently, I decided to start doing some C. In the past, I’ve used GLib in my C programs, and I’m a fan. I decided that I’d like to use GLib in my current endeavors. All that said, before I can use it, I have to be able to build it. Unfortunately, nothing in my life just works, so it took some configuring.

The Makefile

In my last post, I talked about creating a Makefile, and walked through it. I forgot one huge thing though: pkg-config!

Previously in DMP Photobooth, I used pkg-config to manage my library compiler flags. To that end, let’s make some changes to the Makefile I wrote previously. First, let’s refer back to what I wrote before:

COMPILE_FLAGS = -c -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL) LINK_FLAGS = -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL) LINKER_LIBS = -lSDL2 -ldl -lGL

It’s pretty straightforward. I have a compile flag set for compiling a .o, and for compiling a program. I also have a LINKER_LIBS variable to pass to the compile command. This isn’t part of the COMPLIE/LINK_FLAGS because the sources and object code being compiled must appear first or GCC complains. Now, let’s take a look at the new snippet:

COMPILE_FLAGS = -c -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL) \ $(shell pkg-config --cflags $(PKG_CONFIG_LIBS)) LINK_FLAGS = -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL) \ $(shell pkg-config --cflags $(PKG_CONFIG_LIBS)) PKG_CONFIG_LIBS = glib-2.0 gl sdl2 MANUAL_LIBS = -ldl LINKER_LIBS = $(MANUAL_LIBS) $(shell pkg-config --libs $(PKG_CONFIG_LIBS))

Things are getting just a bit more complicated now. You’ll notice there are three LIBS related variables. PKG_CONFIG_LIBS is the list of libraries to be passed to the pkg-config command. MANUAL_LIBS, as the name implies, is a list of manually configured -l strings. For the life of me, I couldn’t figure out what to pass to pkg-config to get it to spit out -ldl, so I’m forced to do it this way.

Regardless, LINKER_LIBS now contains the MANUAL_LIBS, and the output of $(shell pkg-config --libs $(PKG_CONFIG_LIBS)) which produces the necessary -l strings for all the PKG_CONFIG_LIBS.

On top of that, I’ve added the output of $(shell pkg-config --cflags $(PKG_CONFIG_LIBS)) to the COMPILE_FLAGS and LINK_FLAGS. This will ensure that if any pkg-config library needs special compiler flags, that they get used.

Great, now that’s done. A quick make, and everything seems to be working. We’re in business! …right?

Convincing Flycheck

If only it could be that easy. I created a new source and entered the following:

#include <glib.h>

Flycheck wasn’t convinced though; it put some red jaggies under this, and a quick mouse over of the error shows that flycheck doesn’t think that file exists. I began getting deja vu. After some googling, I determined that I can add arbitrary paths to flycheck-clang-include-path (I’m using the flycheck clang checker, if you’re using gcc this variable is going to be different. I’m guessing flycheck-gcc-include-path) to rectify the issue. To do this, enter:

M-x customize-variable [ENTER] flycheck-clang-include-path [ENTER]

This will get you a customize window for this variable. I added the following:

/usr/include/glib-2.0 /usr/lib/x86_64-linux-gnu/glib-2.0/include

…and things seem to be working fine. That said, I imagine if I get more involved in the GLib stack, I’m going to have to add all of these guys:

includes

Not a huge deal, but I’ll cross that bridge when I come to it.

Fun With Makefiles

Lately, I’ve been toying with the idea of trying my hand at some graphics programming. After spending the better part of yesterday trying to figure out how to even get started, I think I have a way ahead.

Building hello triangle using OpenGL is a fairly involved task. First, you need to settle on a graphics library. There are two choices here: OpenGL and DirectX. Obviously, I’ll be selecting OpenGL in order to avoid Microsoft vendor lock-in.

Next, you need a library to display a window for you. Sure, you could do it yourself, but then you’d get bogged down in a quagmire of platform specific issues. If you’ve been reading the blog, you know I don’t care for this sort of platform dependent nonsense, so I’ve tenatively settled on SDL 2. SDL is a cross platform multimedia library that handles sound, input, window creation, and the like. I plan to use this, in conjunction with an OpenGL context to do my work.

After you have that in order, you need an OpenGL Function Loader. Apparently the folks at Khronos were inspired by DMP Photobooth’s module system, there isn’t some opengl.h file you can just include and get your functions: you get to call dlopen and use dlsym to get function pointers. This wouldn’t be a huge issue if there were just a few functions, but there are thousands of them. In light of this, I’ve elected to go with GL3W for the time being. GL3W is a simple python script that generates a .c file containig the function pointers, and a .h to include.

All of this leads us to the topic of today’s post. How do we build this mess of libraries and random .c files?

We’ll Make it Work

The obvious answer here is that we need to use some sort of build system. Given my past experience with the abominations produced by NetBeans, I’ve elected to roll my own. Let’s take a look:

.DEFAULT_GOAL := all

First, we have the default goal. By default, the default goal is the first one in the file. However, I like to make things explicit. Here, we set the default goal to “all”, which builds the code for all targets.

Next, we define some variables:

CC = gcc COMPILE_FLAGS = -c -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL) LINK_FLAGS = -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL) OPTIMIZE_LEVEL = -Og LINKER_LIBS = -lSDL2 -ldl -lGL RM = rm -f UNIVERSAL = gl3w gl_includes.h

The first variable, CC is built-in, and defaults to gcc. Again, I’m redefining it here to be explicit. After that, I define COMPILE_FLAGS and LINK_FLAGS, which are the flags I want to pass when I’m compiling someing to be link at a future time, and when I’m compiling and linking respectively. I define OPTIMIZE_LEVEL separately, because I want to potentially change it, and I don’t want to have to worry about if the two are in sync.

LINKER_LIBS are the libraries I’m going to be using. RM is the rm command, with flags, to be used in the clean target. UNIVERSAL is a list of files and targets that all buid targets depend on.

all : chapter1 chapter2 chapter3 chapter4 chapter5 chapter6 chapter7 chapter8 \ chapter9 chapter10 chapter11 chapter12 chapter13 chapter14 chapter15 \ chapter16 chapter17 chapter1 : $(UNIVERSAL) chapter1.c @echo "Building chapter 1:" $(CC) -o chapter1 $(LINK_FLAGS) chapter1.c gl3w.o $(LINKER_LIBS) ... chapter17 : $(UNIVERSAL) @echo "Building chapter 17:"

Here we have the meat of our makefile. The tutorial I’m following has 17 chapters, and I’ll be building code from each. We have an “all” target that builds each chapter, and we have a target for each chapter that builds an executable. Each chapter target depends on UNIVERSAL and its own files.

gl3w : gl3w.c GL/gl3w.h GL/glcorearb.h $(CC) $(COMPILE_FLAGS) gl3w.c

Here we build the source files that GL3W produces. I’m compiling it into a .o file so that it can be linked into the code for the various chapters.

clean: @echo "Deleting .o files..." $(RM) *.o @echo "Deleting core..." $(RM) core @echo "Deleting chapters..." $(RM) chapter1 $(RM) chapter2 $(RM) chapter3 $(RM) chapter4 $(RM) chapter5 $(RM) chapter6 $(RM) chapter7 $(RM) chapter8 $(RM) chapter9 $(RM) chapter10 $(RM) chapter11 $(RM) chapter12 $(RM) chapter13 $(RM) chapter14 $(RM) chapter15 $(RM) chapter16 $(RM) chapter17

Finally, I have my clean target. Here we delete all the cruft that builds up in the build process.

It’s a simple makefile, but I feel it’ll make this process easier. I can just do the exercises and hopefully spend less time fiddling with gcc.

My Kingdom For a File Browser

As you may know, lately I’ve been trying to get used to using emacs. This has been going well, but there are some things I find myself missing from my old editors. One of them is the files list on the left side:

file_browser

Sure, I can use dired, Speedbar, or any number of other solutions. But none of these work exactly like my old file browser. In despair, I got used to the C-x C-f Do [tab] c [tab] [tab] d [tab] p [tab] [tab] ... workflow that autocompletion brings us. However, it never quite settled with me.

Lately, I’ve found myself using org-mode in a way that has been quieting the proverbial fire. I’ve been able to use org-mode as a project browser, let me show you how!

Some Background

As you may know, org-mode is kind of a catch-all tool for organizing, note taking, and keeping todo lists. You can create hierarchical trees of headings. Hierarchical trees… Sound like anything we know?

I’ve been trying to use org-mode primarily as a TODO list to keep track of all the stuff I have going on. For projects that I’m working on, I’ve been creating TODO lists for the various source files I’m writing. A few links later, and we have the beginning of a project browser.

* TODO programming project

One of the nice things about tracking your own project is that you can decide what the tree looks like. An embedded project might group C and assembly source files separately, where a Haskell project might group source files by module nesting level.

Let’s talk C. We can create a project.org file in the root of our project directory (where the Makefile is). Next, let’s create some headers:

* C * ASM * Unit Tests * Misc

Here, I’ve created 4 groups. C source goes in the C group, Assembly source goes in ASM, Unit test source files go in Unit Tests, and miscellaneous stuff like Makefiles, .gitignores, and READMEs go in Misc.

Next, let’s add some source files:

* C ** [[./main.c]] * ASM * Unit Tests * Misc

Here, I’ve added a link to main.c to the C group. At this point, this file doesn’t exist. If you click on main.c with org-mode enabled, the file will be opened in emacs, creating it if doesn’t exist! Sure, it’s manual, but I find it is not hard to keep up. If you commit your project.org to your version control system, all members of the project can keep it updated and in sync.

However, what the manual updating buys you is the freedom to organize it how you like! We can add tasks to the tree:

* C ** [[./main.c]] *** TODO implement main *** TODO file header *** TODO add to git * ASM * Unit Tests * Misc

We can add deadlines:

*C ** [[./main.c]] *** TODO implement main DEADLINE: <2015-02-13 Fri> *** TODO file header *** TODO add to git * ASM * Unit Tests * Misc

We can add commentary:

* C ** [[./main.c]] *** TODO implement main DEADLINE: <2015-02-13 Fri> - I tried to implement main, but printf is such a hard function to use! I'll revisit this after I've had four beers... *** TODO file header *** TODO add to git * ASM * Unit Tests * Misc

…and this can lead to more tasks!

* C ** [[./main.c]] *** TODO implement main DEADLINE: <2015-02-13 Fri> - I tried to implement main, but printf is such a hard function to use! I'll revisit this after I've had four beers... **** TODO drink four beers *** TODO file header *** TODO add to git * ASM * Unit Tests * Misc

Let’s see your silly little file browser window do that!

%d bloggers like this: