Archive | DMP Photo Booth RSS for this section

The Specter of Undefined Behavior

If you’ve ever spoken to a programmer, and really got them on a roll, they may have said the words “undefined behavior” to you. Since you speak English, you probably know what each of those words mean, and can imagine a reasonable meaning for them in that order. But then your programmer friends goes on about “null-pointer dereferencing” and “invariant violations” and you start thinking about cats or football or whatever because you are not a programmer.

I often find myself being asked what it is that I do. Since I’ve spent the last few years working on my Computer Science degree, and have spent much of that time involved in programming language research, I often find myself trying to explain this concept. Unfortunately, when put on the spot, I usually am only able to come up with the usual sort of explanation that programmers use among themselves: “If you invoke undefined behavior, anything can happen! Try to dereference a null pointer? Bam! Lions could emerge from your monitor and eat your family!” Strictly speaking, while I’m sure some compiler writer would implement this behavior if they could, it’s not a good explanation for a person who doesn’t already kind of understand the issues at play.

Today, I’d like to give an explanation of undefined behavior for a lay person. Using examples, I’ll give an intuitive understanding of what it is, and also why we tolerate it. Then I’ll talk about how we go about mitigating it.

Division By Zero

Here is one that most of us know. Tell me, what is 8 / 0? The answer of course is “division by zero is undefined.” In mathematics, there are two sorts of functions: total and partial. A total function is defined for all inputs. If you say a + b, this can be evaluated to some result no matter what you substitute for a and b. Addition is total. The same cannot be said for division. If you say a / b, this can be evaluated to some result no matter what you substitute for a and b unless you substitute b with 0. Division is not total.

If you go to the Wikipedia article for division by zero you’ll find some rationale for why division by zero is undefined. The short version is that if it were defined, then it could be mathematically proven that one equals two. This would of course imply that cats and dogs live in peace together and that pigs fly, and we can’t have that!

ti86_calculator_divbyzero

However, there is a way we can define division to be total that doesn’t have this issue. Instead of defining division to return a number, we could define division to return a set of numbers. You can think of a set as a collection of things. We write this as a list in curly braces. {this, is, a, set, of, words} I have two cats named Gatito and Moogle. I can have a set of cats by writing {Gatito, Moogle}. Sets can be empty; we call the empty set the null set and can write it as {} or using this symbol . I’ll stick with empty braces because one of the things I hate about mathematics is everybody’s insistence on writing in Greek.

So here is our new total division function:

totalDivide(a, b) if (b does not equal 0) output {a / b} otherwise output {}

If you use totalDivide to do your division, then you will never have to worry about the undefined behavior of division! So why didn’t Aristotle (or Archimedes or Yoda or whoever invented division) define division like this in the first place? Because it’s super annoying to deal with these sets. None of the other arithmetic functions are defined to take sets, so we’d have to constantly test that the division result did not produce the empty set, and extract the result from the set. In other words: while our division is now total, we still need to treat division by zero as a special case. Let us try to evaluate 2/2 + 2/2 and totalDivide(2,2) + totalDivide(2,2):

1: 2/2 + 2/2 2: 1 + 1 3: 2

Even showing all my work, that took only 3 lines.

1: let {1} = totalDivide(2,2) 2: let {1} = totalDivide(2,2) 3: 1 + 1 4: 2

Since you can’t add two sets, I had to evaluate totalDivide out of line, and extract the values and add them separately. Even this required my human ability to look at the denominator and see that it wasn’t zero for both cases. In other words, making division total made it much more complicated to work with, and it didn’t actually buy us anything. It’s slower. It’s easier to mess up. It has no real value. As humans, it’s fairly easy for us to look at the denominator, see that it’s zero, and just say “undefined.”

Cartons of Eggs

I’m sure many of you have a carton of eggs in your fridge. Go get me the 17th egg from your carton of eggs. Some of you will be able to do this, and some of you will not. Maybe you only have a 12 egg carton. Maybe you only have 4 eggs in your 18 egg carton, and the 17th egg is one of the ones that are missing. Maybe you’re vegan.

A basic sort of construct in programming is called an “array.” Basically, this is a collection of the same sort of things packed together in a region of memory on your computer. You can think of a carton of eggs as an array of eggs. The carton only contains one sort of thing: an egg. The eggs are all packed together right next to each other with nothing in between. There is some finite amount of eggs.

SAMSUNG DIGITAL CAMERA

If I told you “for each egg in the carton, take it out and crack it, and dump it in a bowl starting with the first egg”, you would be able to do this. If I told you “take the 7th egg and throw it at your neighbor’s house” you would be able to do this. In the first example, you would notice when you cracked the last egg. In the second example you would make sure that there was a 7th egg, and if there wasn’t you probably picked some other egg because your neighbor is probably a jerk who deserves to have his house egged. You did this unconsciously because you are a human who can react to dynamic situations. The computer can’t do this.

If you have some array that looks like this (array locations are separated by | bars | and * stars * are outside the array) ***|1|2|3|*** and you told the computer “for each location in the array, add 1 to the number, starting at the first location” it would set the first location to be 2, the second location to be 3, the third location to be 4. Then it would interpret the bits in the location of memory directly to the right of the third location as a number, and it would add 1 to this “number” thereby destroying the data in that location. It would do this forever because this is what you told the machine to do. Suppose that part of memory was involved in controlling the brakes in your 2010 era Toyota vehicle. This is obviously incredibly bad, so how do we prevent this?

The answer is that the programmer (hopefully) knows how big the array is and actually says “starting at location one, for the next 3 locations, add one to the number in the location”. But suppose the programmer messes up, and accidentally says “for the next 4 locations” and costs a multinational company billions of dollars? We could prevent this. There are programming languages that give us ways to prevent these situations. “High level” programming languages such as Java have built-in ways to tell how long an array is. They are also designed to prevent the programmer from telling the machine to write past the end of the array. In Java, the program will successfully write |2|3|4| and then it will crash, rather than corrupting the data outside of the array. This crash will be noticed in testing, and Toyota will save face. We also have “low level” programming languages such as C, which don’t do this. Why do we use low level programming languages? Let’s step through what these languages actually have the machine do for “starting at location one, for the next 3 locations, add one to the number in the location”: First the C program:

NOTE: location[some value] is shorthand for “the location identified by some value.” egg_carton[3] is the third egg in the carton. Additionally, you should read these as sequential instructions “first do this, then do that” Finally, these examples are greatly simplified for the purposes of this article.

1: counter = 1 2: location[counter] = 1 + 1 3: if (counter equals 3) terminate 4: counter = 2 5: location[counter] = 2 + 1 6: if (counter equals 3) terminate 7: counter = 3 8: location[count] = 3 + 1 9: if (counter equals 3) terminate

Very roughly speaking, this is what the computer does. The programmer will use a counter to keep track of their location in the array. After updating each location, they will test the counter to see if they should stop. If they keep going they will repeat this process until the stop condition is satisfied. The Java programmer would write mostly the same program, but the program that translates the Java code into machine code (called a compiler) will add some stuff:

1: counter = 1 2: if (counter greater than array length) crash 3: location[counter] = 1 + 1 4: if (counter equals 3) terminate 5: counter = 2 6: if (counter greater than array length) crash 7: location[counter] = 2 + 1 8: if (counter equals 3) terminate 9: counter = 3 10: if (counter greater than array length) crash 11: location[count] = 3 + 1 12: if (counter equals 3) terminate

As you can see, 3 extra lines were added. If you know for a fact that the array you are working with has a length that is greater than or equal to three, then this code is redundant.

For such a small array, this might not be a huge deal, but suppose the array was a billion elements. Suddenly an extra billion instructions were added. Your phone’s processor likely runs at 1-3 gigahertz, which means that it has an internal clock that ticks 1-3 billion times per second. The smallest amount of time that an instruction can take is one clock cycle, which means that in the best case scenario, the java program takes one entire second longer to complete. The fact of the matter is that “if (counter greater than array length) crash” definitely takes longer than one clock cycle to complete. For a game on your phone, this extra second may be acceptable. For the onboard computer in your car, it is definitely not. Imagine if your brakes took an extra second to engage after you push the pedal? Congressmen would get involved!

windows_xp_bsod

In Java, reading off the end of an array is defined. The language defines that if you attempt to do this, the program will crash (it actually does something similar but not the same, but this is outside the scope of this article). In order to enforce this definition, it inserts these extra instructions into the program that implement the functionality. In C, reading off the end of an array is undefined. Since C doesn’t care what happens when you read off the end of an array, it doesn’t add any code to your program. C assume you know what you’re doing, and have taken the necessary steps to ensure your program is correct. The result is that the C program is much faster than the Java program.

There are many such undefined behaviors in programming. For instance, your computer’s division function is partial just like the mathematical version. Java will test that the denominator isn’t zero, and crash if it is. C happily tells the machine to evaluate 8 / 0. Most processors will actually go into a failure state if you attempt to divide by zero, and most operating systems (such as Windows or Mac OSX) will crash your program to recover from the fault. However, there is no law that says this must happen. I could very well create a processor that sends lions to your house to punish you for trying to divide by zero. I could define x / 0 = 17. The C language committee would be perfectly fine with either solution; they just don’t care. This is why people often call languages such as C “unsafe.” This doesn’t mean that they are bad necessarily, just that their use requires caution. A chainsaw is unsafe, but it is a very powerful tool when used correctly. When used incorrectly, it will slice your face off.

What To Do

So, if defining every behavior is slow, but leaving it undefined is dangerous, what should we do? Well, the fact of the matter is that in most cases, the added overhead of adding these extra instructions is acceptable. In these cases, “safe” languages such as Java are preferred because they ensure program correctness. Some people will still write these sorts of programs in unsafe languages such as C (for instance, my own DMP Photobooth is implemented in C), but strictly speaking there are better options. This is part of the explanation for the phenomenon that “computers get faster every year, but [insert program] is just as slow as ever!” Since the performance of [insert program] we deemed to be “good enough”, this extra processing power is instead being devoted to program correctness. If you’ve ever used older versions of Windows, then you know that your programs not constantly crashing is a Good Thing.

windows_xp_bsod

This is fine and good for those programs, but what about the ones that cannot afford this luxury? These other programs fall into a few general categories, two of which we’ll call “real-time” and “big data.” These are buzzwords that you’ve likely heard before, “big data” programs are the programs that actually process one billion element arrays. An example of this sort of software would be software that is run by a financial company. Financial companies have billions of transactions per day, and these transactions need to post as quickly as possible. (suppose you deposit a check, you want those funds to be available as quickly as possible) These companies need all the speed they can get, and all those extra instructions dedicated to totality are holding up the show.

Meanwhile “real-time” applications have operations that absolutely must complete in a set amount of time. Suppose I’m flying a jet, and I push the button to raise a wing flap. That button triggers an operation in the program running on the flight computer, and if that operation doesn’t complete immediately (where “immediately” is some fixed, non-zero-but-really-small amount of time) then that program is not correct. In these cases, the programmer needs to have very precise control over what instructions are produced, and they need to make every instruction count. In these cases, redundant totality checks are a luxury that is not in the budget.

Real-time and big data programs need to be fast, so they are often implemented in unsafe languages, but that does not mean that invoking undefined behavior is OK. If a financial company sets your account balance to be check value / 0, you are not going to have a good day. If your car reads the braking strength from a location off to the right of the braking strength array, you are going to die. So, what do these sorts of programs do?

One very common method, often used in safety-critical software such as a car’s onboard computer is to employ strict coding standards. MISRA C is a set of guidelines for programming in C to help ensure program correctness. Such guidelines instruct the developer on how to program to avoid unsafe behavior. Enforcement of the guidelines is ensured by peer-review, software testing, and Static program analysis.

Static program analysis (or just static analysis) is the process of running a program on a codebase to check it for defects. For MISRA C, there exists tooling to ensure compliance with its guidelines. Static analysis can also be more general. Over the last year or so, I’ve been assisting with a research project at UCSD called Liquid Haskell. Simply put, Liquid Haskell provides the programmer with ways to specify requirements about the inputs and outputs of a piece of code. Liquid Haskell could ensure the correct usage of division by specifying a “precondition” that “the denominator must not equal zero.” (I believe that this actually comes for free if you use Liquid Haskell as part of its basic built-in checks) After specifying the precondition, the tool will check your codebase, find all uses of division, and ensure that you ensured that zero will never be used as the denominator.

It does this by determining where the denominator value came from. If the denominator is some literal (i.e. the number 7, and not some variable a that can take on multiple values), it will examine the literal and ensure it meets the precondition of division. If the number is an input to the current routine, it will ensure the routine has a precondition on that value that it not be zero. If the number is the output from some other routine, it verifies that the the routine that produced the value has, as a “postcondition”, that its result will never be zero. If the check passes for all usages of division, your use of division will be declared safe. If the check fails, it will tell you what usages were unsafe, and you will be able to fix it before your program goes live. The Haskell programming language is very safe to begin with, but a Haskell program verified by Liquid Haskell is practically Fort Knox!

The Human Factor

Humans are imperfect, we make mistakes. However, we make up for it in our ability to respond to dynamic situations. A human would never fail to grab the 259th egg from a 12 egg carton and crack it into a bowl; the human wouldn’t even try. The human can see that there is only 12 eggs without having to be told to do so, and will respond accordingly. Machines do not make mistakes, they do exactly what you tell them to, exactly how you told them to do it. If you tell the machine to grab the 259th egg and crack it into a bowl, it will reach it’s hand down, grab whatever is in the space 258 egg lengths to the right of the first egg, and smash it on the edge of a mixing bowl. You can only hope that nothing valuable was in that spot.

Most people don’t necessarily have a strong intuition for what “undefined behavior” is, but mathematicians and programmers everywhere fight this battle every day.

DMP Photobooth: On the Road to 2.0

It’s been a busy few months for me, and things have been quiet here. Quiet, but not forgotten; big things are in the works! What big things you ask?

Back in June, I reflected on DMP Photobooth. The completion of the photobooth was a major accomplishment for me: it worked perfectly (dead camera batteries and jammed printers aside) for a good six hours, and was a hit with the guests! However, like many things in life, it has problems. On my list of things to fix are a wonky UI, and an aggressively imperative code style.

A few months of reflection later, and I’ve had time to consider these issues. I thought about why I wrote DMP Photobooth. Aside from saving ~ $1,000 on renting a photobooth, it was really an opportunity for me to truly explore the C programming language.

While I had some experience with it, it was mostly in an academic capacity. Writing linked lists and trees and such is one thing, creating a real program is another entirely. DMP Photobooth gave me a chance to truly do something with it. Having implemented a non-trivial application in it has given me an appreciation of the strengths and weaknesses of the language. I’ve seen what’s easy, and what’s hard. I’ve made questionable design choices, and attempted to fix them. I’ve made good design choices that made my life easier. I can truly say that I’ve been there and done that.

But let’s be honest with ourselves: C wasn’t really an appropriate choice for this sort of program. Java, Python, or something along those lines really would have been the correct choice. But where’s the fun in that? However, now that the project is “complete” its time to find a way forward. C no longer has the allure for me that it once did, and since my customers include me and my wife, I feel confident that a re-write won’t lose me any business.

That’s right: DMP Photobooth is being completely re-written. Version 2.0 will be re-implemented in Haskell. However, this doesn’t mean that I have to completely walk away from all the work I did. Early versions will likely use the original Camera, Printer, and Trigger modules via FFI.

Why re-write? I thought about it for a while, I’ve wanted to do something with Haskell to cement my knowledge as I did with C, but I was lacking inspiration. Since DMP Photobooth was really a playground to explore C, why not do the same with Haskell? It seems like a logical choice, implementing a non-trivial application in Haskell will cement my knowledge in the same way it did in C. Additionally, trying to solve the same problem in a different way will give me an appreciation of the pros and cons of each approach.

Change in the Wind

So, surely this isn’t going to be a one-to-one port of DMP Photobooth to Haskell, right?

With this new version, I am making some fundamental changes to the program. Probably the biggest change is the elimination of runtime loading of modules. In 2.0, all modules will be set at compile time. I feel that the machinery of loading and operating modules at runtime was a bit wonky. Too much effort was put into “not restricting a module developer’s freedom”. In 2.0, modules will in place at compile time, and they will be treated as a first-class part of the program.

This isn’t to say that the module concept is going away. In fact, it is being expanded. There are now six total modules. The Camera, Trigger, and Printer modules return in 2.0; but the Photostrip, Persistence, and Interface modules are new in this version. The original modules will continue to work as they did, but let’s discuss the newcomers.

Interface

The interface is fairly straightforward. This module is responsible for the user’s interaction with the program through the computer. Triggering the capture process will still be the trigger module’s domain, but anything that appears on the computer monitor is the domain of the Interface module. All log messages generated by the program will be sent to the Interface module for presentation to the user. Additionally, the Core will provide callbacks to the Interface that the interface will be responsible for enabling the user to call these. There will no longer be a concept of “starting” or “stopping” the photobooth; the photobooth will start when the program starts, and it will stop when the program exits. However, the interface will likely be responsible for allowing a user to print a previously processed strip, alter the configuration, and close the program.

Photostrip

Another fairly self-explanatory module. This module encompasses all the functions that were previously defined in photo_strip.c. In 2.0, the use of temporary files will be limited, therefore all images will be passed in ByteStrings.

Persistence

Of the three new modules, this one is probably the most novel. This module’s responsibility is persisting data between invocations of the program. This will include at the very least program configuration, but it can also include things like log history or even photo sessions. All modules will be given hooks into the persistence module, so they can persist and restore anything that they see fit.

The functionality here closely resembles that of configuration.c. It will work in a similar way. I have defined the following data type:

data Property = IntProperty {propName :: String, intValue :: Integer} | StringProperty {propName :: String, stringValue :: String} | DoubleProperty {propName :: String, doubleValue :: Double} | BoolProperty {propName :: String, boolValue :: Bool} | ListProperty {propName :: String, listValue :: [Property]} deriving (Eq, Show)

The persistence module will be able to persist and restore this data type. The modules can store anything in this, including arbitrary types if they are instances of Show and Read. This list of types will probably change over time, but its a good start.

The Way Ahead

Its going to be a long road to 2.0, but as the project progresses, expect to see updates here. The codebase for DMP Photobooth has been uploaded to GitHub, and can be found here. Version 1.0’s repository has been renamed to dmp-photo-booth-legacy. I have no plans to take this version down, so it should remain available for the foreseeable future.

DMP Photo Booth 1.0

Well, the day has come and gone. DMP Photo Booth’s final test on June 21st went off without issue, and DMP Photo Booth has left Beta and is now considered “production ready”. The initial 1.0 release can be found on GitHub.

The significance of June 21st is the very reason DMP Photo Booth was created; the 21st is the day of my wedding. My wife wanted a photo booth for the reception. We looked into renting a photo booth, but it turns out that they run around $1,000. I turned to open source. Some quick googling turned up some options, but they were all personal projects or out of date. Sure I could get somebody else’s project working, but what’s the fun in that? I decided that we didn’t need to rent one, or download one, I could build it!

In late 2013, I set to work in earnest. I had a couple of months of downtime in school, and since I’m not currently working it was the perfect time. I decided I had three main objectives for this project: get some arduino experience, get some GTK+ experience, and do this all as portably as possible. I had initially decided to mostly ignore GLib and focus on GTK, but slowly I grew to appreciate GLib for what it is: the standard library that C never had. First I used GModule to handle shared libraries in a portable manner. Next I decided to use GLib primitives to keep from having to deal with cross-platform type wonkiness. Next, having grown tired of dealing with return codes, I refactored the project to use GLib’s exception replacement: GError.

Lessons Learned

It’s not all roses and puppies though. There are certainly things I’d do differently. DMP Photo Booth is developed in an Object Oriented style, passing opaque structs with “method” functions that operate on them. Each component of the program are organized into their own source file with file scoped globals scattered throughout. Said globals are protected by mutexes to create a semblance of thread safety. That said, threading issues have been a major thorn in my side. Long story short: I regret this design choice. While I still feel that this is the correct way to structure C code, and that if globals are required, this is the correct way to handle them; I feel that I should have made more of an effort to limit side effects. Recently, I’ve spent some time doing functional programming, and if I could do it again I’d try to write in a more functional style. Fortunately for me, this is something that a little refactoring could help with.

Additionally, one thing I thought would be a major help is something that began to be a major thorn in my side: NetBeans. As the size of the project grew, NetBeans got slower and slower. It seemed that I spent more time fiddling with IDE settings than actually coding. Even worse is that the IDE-generated makefile is so convoluted that it’s extremely difficult to modify by hand in a satisfying way. I’ve always coded with and IDE so I wouldn’t have even considered not using one, but then I spent some time with Haskell. One of Haskell’s “problems” is that it doesn’t have good IDE support. It doesn’t seem like any IDE really handles it well, so most people use Emacs. Personally, I haven’t really warmed up to Emacs, but GEdit has syntax highlighting for Haskell and a built-in terminal for GHCI. GEdit also has syntax highlighting for C. Next time, I will seriously consider using a lighter-weight text editor for a C project. All this said, I think NetBeans for Java remains the way to go.

What’s Next

Like any program, version 1.0 is just one of many versions. There certainly remains a lot of work to do with DMP Photo Booth. Some major items you are likely to see whenever I get around to working on DMP Photo Booth some more:

Options Dialog

I think anybody who has seen it will agree: the options dialog in DMP Photo Booth is bad. It’s poorly organized, and kind of wonky. Personally, I modify settings using the .rc file, which is telling. This is certainly a high-priority improvement.

Functional Refactor

Like I said above, the code could use a pass to limit side effects. Funtions need to have their side effects limited, and globals need to be eliminated unless absolutely necessary. However, C is not a functional language. While one could argue that function pointers enable functional programming in C, this is a very pedantic argument. I won’t be going crazy with functional programming techniques. There will be no Monads, or for loops being turned into mappings of function pointers.

Optional Module API

An idea I’ve had on the back burner for a while is an optional module API. This would be used for very specific quality-of-life things. For instance, a module could provide a GTK widget to be shown in the options dialog. Any module that doesn’t want to implement any or all of the optional API can just ignore it. The module loading function will gracefully handle the dlsym failure, just treating it as it is: declining to implement the API. I have no plans to change the current existing API, so all you module developers can rest easy!

User Interface Module

It occurred to me that it might be good to have a UI module. This would provide the UI, and wouldn’t be tied to the trigger/printer/camera module start/stop system. This module would be loaded at startup and unloaded on shutdown. This would allow the Photo Booth to use different widget toolkits: QT, Curses, Cocoa, WinForms, or whatever else. Under this scheme, the current GTK+ interface would be abstracted into the reference UI Module.

DMP Photo Booth: Release Candidate 1

It’s been a long time coming, but the day is almost here: the day DMP Photo Booth is officially released.

Following last week’s successful stress test, I’ve been doing some last-minute touch-up work. I’ve been preparing my laptop for the big day, and finding and squashing some bugs.

DMP Photo Booth RC1 using my fancy new dark GTK theme

DMP Photo Booth RC1 using my fancy new dark GTK theme

Now things are coming along, and I feel the time has come to proceed to RC1. This means that barring any major new show stoppers, DMP Photo Booth will proceed to version 1.0 on June 22.

You can find the latest release here. On that page you’ll find the latest source for DMP Photo Booth and the reference modules, as well as a pre-compiled version for Debian/AMD64. Just extract the tarball into a folder and double-click the executable and you’re off! It comes pre-configured with sane defaults.

Moving forward, I plan to work on a “GTK Trigger Module”. This will just show a window with a button you can click to trigger the photo session. I understand that not everybody feels like constructing an Arduino thingamabober, and that this is surely the only thing preventing DMP Photo Booth from going viral on a global scale. Hopefully this is done by 1.0, but if not it will likely make it into a version 1.0.1, to be released shortly after 1.0.

DMP Photo Booth: To The Test

It’s been a long year leading up to this, but last week DMP Photo Booth saw its first time out in the wild.

Last weekend my fiancée had her bachelorette party. Since it was No Boys Allowed, I wouldn’t be able to babysit the Photo Booth. Luckily for me the event went off largely without issue.

IMG_3212

I spent the week leading up to the event writing documentation. Channeling my past life as an IT professional, I wrote up an HTML page documenting the use of the Photo Booth and some common issues. This documentation will probably get uploaded to either this site or github soon. I need to strip out some stuff specifically relating to my computer first.

After that and a quick walkthrough the night before, it was go time. As the appointed hour arrived, I watched my phone for calls. The good news is that none came. The Photo Booth performed as advertised, with only a few minor difficulties caused by my computer. The only issue with the actual Photo Booth itself that was reported to me is that there is a slight delay between a picture being taken and the trigger counting down. An issue has been opened against this in Github.

The final stretch is here now. The wedding is on the 21st, and the Photo Booth must be complete by then. Honestly, if the day were tomorrow I’d be confident the Photo Booth would work. However, there is always room for polish. There remains bugs to be squashed, documentation to be finalized, and packaging to be done.

Yet Another Aeson Tutorial

Lately I’ve been trying to make sense of Aeson, a prominent JSON parsing library for Haskell. If you do some googling you’ll see two competing notions: 1) Aeson is a powerful library, but it’s documentation is terrible and 2) about 10,000 Aeson tutorials. Even the Aeson hackage page has a huge “How To Use This Library” section with several examples. These examples usually take the form of:

“Make a type for your JSON data!”

data Contrived = Contrived { field1 :: String , field2 :: String } deriving (Show)

“Write a FromJSON instance!”

instance FromJSON Contrived where parseJSON (Object v) = Contrived <$> v .: "field1" <*> v .: "field2"

“Yay!!”

I’ll spare you the long form, other people have done it already and I’m sure you’ve seen it already. What I quickly noticed was that this contrived example wasn’t quite cutting it. I’ve had some challenges that I’ve had to overcome. I’d like to share some of the wisdom I’ve accumulated on this subject.

Nested JSON

The first Problem I’ve run into is the nested JSON. Let’s take a look at an example:

{ "error": { "message": "Message describing the error", "type": "OAuthException", "code": 190 , "error_subcode": 460 } }

That is an example of an exception that can get returned by any Facebook Graph API call. You’ll notice that the exception data is actually contained in a nested JSON object. If passed to a parseJSON function, the only field retrievable by operator .: is “error”, which returns the JSON object. We could define two types and two instances for this like:

data EXTopLevel = EXTopLevel { getEx :: FBException } deriving (Show) data FBException = FBException { exMsg :: String , exType :: String , exCode :: Int , exSubCode :: Maybe Int } deriving (Show) instance FromJSON EXTopLevel where parseJSON (Object v) = EXTopLevel <$> v .: "error" instance FromJSON FBException where parseJSON (Object v) = FBException <$> v .: "message" <*> v .: "type" <*> v .: "code" <*> v .:? "error_subcode"

In this case, you could decode to a EXTopLevel, and call getEx to get the actual exception. However, it doesn’t take a doctor of computer science to see that this is silly. Nobody needs the top-level object, and this is a silly amount of boilerplate. The solution? We can use our friend the bind operator. Aeson Objects are instances of Monad, and it turns out that it’s bind function allows us to drill down into objects. We can re-implement that mess above simply as:

data FBException = FBException { exMsg :: String , exType :: String , exCode :: Int , exSubCode :: Maybe Int } deriving (Show) instance FromJSON FBException where parseJSON (Object v) = FBException <$> (e >>= (.: "message")) <*> (e >>= (.: "type")) <*> (e >>= (.: "code")) <*> (e >>= (.:? "error_subcode")) where e = (v .: "error")

Much better right? I thought so too.

Types With Multiple Constructors

The next problem I’d like to talk about is using types with multiple constructors. Let’s take a look at an example:

{ "value": "EVERYONE" }

When creating a new album of facebook, the API needs you to set a privacy on the album. This setting is set using this JSON object. This would seem to be a trivial case for Aeson:

data Privacy = Privacy { value :: String } deriving (Show) instance FromJSON Privacy where parseJSON (Object v) = Privacy <$> v .: "value" <*>

Unfortunately, the following is not a valid privacy setting:

"{ "value": "NOT_THE_NSA" }

However, our Privacy type would allow that. In reality, this should be an enumeration:

data Privacy = Everyone | AllFriends | FriendsOfFriends | Self

But how would you write a FromJSON instance for that? The method we’ve been using doesn’t work, and parseJSON takes and returns magical internal types that you can’t really do anything with. I was at a loss for a while, and even considered using the method I posted above. Finally, the answer hit me. Like many things in Haskell, the answer was stupidly simple; just define a function to create the privacy object, and use that in the parseJSON function instead of the type constructor! My solution looks like this:

instance FromJSON Privacy where parseJSON (Object v) = createPrivacy v .: "value" createPrivacy :: String -> Privacy createPrivacy "EVERYONE" = Everyone createPrivacy "ALL_FRIENDS" = AllFriends createPrivacy "FRIENDS_OF_FRIENDS" = FriendsOfFriends createPrivacy "SELF" = Self createPrivacy _ = error "Invalid privacy setting!"

If the parseJSON needs a singular function to create a type, and you have more than one type constructor for your type, you can wrap your type constructors in one function that picks and uses the correct one. Your type function doesn’t need to know about Aeson; Aeson magically turns it’s parser into whatever type your function calls for.

Doesn’t Play Nice With Others

For the last few weeks, I’ve been looking into making a Printer Module in Haskell. I must say, it’s been a pretty miserable experience. Not the Haskell part, that was ok. No, my issue is more basic. It seems that Haskell doesn’t like to share.

My plan was to build a module in Haskell to do the printer logic, then link that module as a library into C, which will be imported by the Core as normal. A preliminary look about the internet confirms that this is supported behavior. There are a few trivial examples peppered throughout the internet; so I set to work, confident that this was a solvable problem.

Giving Cabal A Shot

Cabal is Haskell’s package management program. In addition to this, it serves as Haskell’s answer to make. With a simple call to:

cabal init

You are presented with a series of questions about your package. After filling out the form (and selecting library), Cabal creates a Setup.hs file. Calling:

runhaskell Setup.hs configure runhaskell Setup.hs build

…produces a .a library for your package. Success, right? Unfortunately when you try to load that you get a linker error stating something to the effect of “can not be used when making a shared object; recompile with -fPIC“. After hours of research I have determined that this is because the Haskell’s libraries have not been compiled using the -fPIC, which prevents them from being used with a static library.

Trying GHC

The Glasgow Haskell Compiler can be used to compile libraries directly. Having given up on cabal, I decided to try to cut out the middle man and use GHC. After much tinkering, I came up with a Makefile that worked, which I will preserve here for posterity:

COMPILER=ghc HS_RTS=HSrts-ghc7.6.3 OUTPUT=dmp_printer_module.so all: "${COMPILER}" --make -no-hs-main -dynamic \ -l${HS_RTS} -shared -fPIC dmp_printer_module.c \ DmpPrinterModule.hs -o ${OUTPUT}

This makefile compiles any required Haskell scripts, as well as a C “glue” source file that initializes and finalizes the Haskell environment. More on what goes in that file can be found in the GHC documentation.

Cool, good to go, right? Wrong.

Couldn’t Find That Dyn Library

So, I started adding things to the module to make sure it doesn’t break. After adding some dependencies, and trying to recomplie, I start seeing this error:

Could not find module `[SOME_MODULE]' Perhaps you haven't installed the "dyn" libraries for package `[SOME_PACKAGE]'?

After more hours of research, it turns out that for a module to be used in a shared library, it must be compiled as one. Seems logical, but that would imply that all module developers have to go through this nightmare. And the developers of any dependencies they use have to have done so. And so on…

Since even Prelude hadn’t done so, I set off to figure this out. After poking about, it turns out that Debian provides a package ghc-dynamic, which provides the dyn libraries from Base. I installed it, and things were checking out. However, the dependencies I was using still did not work.

After some more reasearch, I found a suggestion that I re-install all my Cabal packages using the --enable-shared flag, which would provide me with my dyn libraries. I gave it a shot, but since my dependencies’ dependencies hadn’t done so, I got the same errors.

Some more research suggested that I could delete the .ghc folder in my home folder, then re-install all Cabal packages. This would force them to rebuild. However, I encountered the same issues.

The Man In Black Fled Across The Desert…

I’m beginning to feel a bit like Roland, ceaselessly chasing after the Dark Tower. Every time I get there, my journey starts over again. I clear one roadblock, and there’s another there waiting for me.

It seems like there isn’t any real interest in calling Haskell from C, and I must say that I am extremely disappointed. Calling C from Haskell works great, but when asked to share its toys with C, Haskell takes its ball and goes home.

I’m sure it’s possible to do meaningful work in Haskell, and call that from C. However, the amount of work I would have to do to attain that goal is not something I’m willing to accept. For this reason I am shelving Haskell for the time being. Maybe I’ll pick it again for some other project, but it’s not a good fit for DMP Photo Booth.

%d bloggers like this: