
Getting Started With GLib in Emacs

Recently, I decided to start doing some C. In the past, I’ve used GLib in my C programs, and I’m a fan. I decided that I’d like to use GLib in my current endeavors. All that said, before I can use it, I have to be able to build it. Unfortunately, nothing in my life just works, so it took some configuring.

The Makefile

In my last post, I talked about creating a Makefile, and walked through it. I forgot one huge thing though: pkg-config!

Previously in DMP Photobooth, I used pkg-config to manage my library compiler flags. To that end, let’s make some changes to the Makefile I wrote previously. First, let’s refer back to what I wrote before:

COMPILE_FLAGS = -c -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL)
LINK_FLAGS = -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL)
LINKER_LIBS = -lSDL2 -ldl -lGL

It’s pretty straightforward. I have one set of flags for compiling a .o, and one for compiling a program. I also have a LINKER_LIBS variable to pass to the compile command. This isn’t part of COMPILE_FLAGS or LINK_FLAGS because the sources and object code being compiled must appear first or GCC complains. Now, let’s take a look at the new snippet:

COMPILE_FLAGS = -c -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL) \
    $(shell pkg-config --cflags $(PKG_CONFIG_LIBS))
LINK_FLAGS = -g -Wall -Wextra -std=c11 $(OPTIMIZE_LEVEL) \
    $(shell pkg-config --cflags $(PKG_CONFIG_LIBS))
PKG_CONFIG_LIBS = glib-2.0 gl sdl2
MANUAL_LIBS = -ldl
LINKER_LIBS = $(MANUAL_LIBS) $(shell pkg-config --libs $(PKG_CONFIG_LIBS))

Things are getting just a bit more complicated now. You’ll notice there are three LIBS related variables. PKG_CONFIG_LIBS is the list of libraries to be passed to the pkg-config command. MANUAL_LIBS, as the name implies, is a list of manually configured -l strings. For the life of me, I couldn’t figure out what to pass to pkg-config to get it to spit out -ldl, so I’m forced to do it this way.

Regardless, LINKER_LIBS now contains the MANUAL_LIBS, and the output of $(shell pkg-config --libs $(PKG_CONFIG_LIBS)) which produces the necessary -l strings for all the PKG_CONFIG_LIBS.

On top of that, I’ve added the output of $(shell pkg-config --cflags $(PKG_CONFIG_LIBS)) to the COMPILE_FLAGS and LINK_FLAGS. This ensures that if any pkg-config library needs special compiler flags, they get used.

Great, now that’s done. A quick make, and everything seems to be working. We’re in business! …right?

Convincing Flycheck

If only it could be that easy. I created a new source and entered the following:

#include <glib.h>

Flycheck wasn’t convinced though; it put some red jaggies under this, and a quick mouse-over of the error showed that Flycheck doesn’t think that file exists. I began getting deja vu. After some googling, I determined that I can add arbitrary paths to flycheck-clang-include-path to rectify the issue. (I’m using the Flycheck clang checker; if you’re using gcc, this variable is going to be different. I’m guessing flycheck-gcc-include-path.) To do this, enter:

M-x customize-variable [ENTER] flycheck-clang-include-path [ENTER]

This will get you a customize window for this variable. I added the following:

/usr/include/glib-2.0
/usr/lib/x86_64-linux-gnu/glib-2.0/include

…and things seem to be working fine. That said, I imagine if I get more involved in the GLib stack, I’m going to have to add all of these guys:

[Image: includes]

Not a huge deal, but I’ll cross that bridge when I come to it.

DMP Photo Booth 1.0

Well, the day has come and gone. DMP Photo Booth’s final test on June 21st went off without issue, and DMP Photo Booth has left Beta and is now considered “production ready”. The initial 1.0 release can be found on GitHub.

The significance of June 21st is the very reason DMP Photo Booth was created; the 21st is the day of my wedding. My wife wanted a photo booth for the reception. We looked into renting a photo booth, but it turns out that they run around $1,000. I turned to open source. Some quick googling turned up some options, but they were all personal projects or out of date. Sure, I could get somebody else’s project working, but what’s the fun in that? I decided that we didn’t need to rent one or download one; I could build it!

In late 2013, I set to work in earnest. I had a couple of months of downtime in school, and since I’m not currently working it was the perfect time. I decided I had three main objectives for this project: get some Arduino experience, get some GTK+ experience, and do this all as portably as possible. I had initially decided to mostly ignore GLib and focus on GTK, but slowly I grew to appreciate GLib for what it is: the standard library that C never had. First I used GModule to handle shared libraries in a portable manner. Next I decided to use GLib primitives to keep from having to deal with cross-platform type wonkiness. Next, having grown tired of dealing with return codes, I refactored the project to use GLib’s exception replacement: GError.

Lessons Learned

It’s not all roses and puppies though. There are certainly things I’d do differently. DMP Photo Booth is developed in an Object Oriented style, passing opaque structs with “method” functions that operate on them. Each component of the program is organized into its own source file, with file-scoped globals scattered throughout. Said globals are protected by mutexes to create a semblance of thread safety. That said, threading issues have been a major thorn in my side. Long story short: I regret this design choice. While I still feel that this is the correct way to structure C code, and that if globals are required, this is the correct way to handle them, I feel that I should have made more of an effort to limit side effects. Recently, I’ve spent some time doing functional programming, and if I could do it again I’d try to write in a more functional style. Fortunately for me, this is something that a little refactoring could help with.
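To make the trade-off concrete, here is a minimal sketch of the two styles. It is not code from DMP Photo Booth: the counter and the function names are hypothetical, and it uses a POSIX mutex rather than GLib’s GMutex to stay dependency-free.

```c
#include <assert.h>
#include <pthread.h>

/* Style 1: a file-scoped global guarded by a mutex (hypothetical counter). */
static pthread_mutex_t strip_count_lock = PTHREAD_MUTEX_INITIALIZER;
static int strip_count = 0;

/* Every call changes hidden state, so the result depends on call history. */
int strips_record_printed(void)
{
    pthread_mutex_lock(&strip_count_lock);
    int current = ++strip_count;
    pthread_mutex_unlock(&strip_count_lock);
    return current;
}

/* Style 2: no hidden state. The caller owns the value, so there is nothing
 * to lock and the same input always gives the same output. */
int strips_after_print(int printed_so_far)
{
    assert(printed_so_far >= 0);
    return printed_so_far + 1;
}
```

The second function needs no locking at all, which is the appeal: the fewer places state can change, the fewer places threads can step on each other.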

Additionally, one thing I thought would be a major help is something that began to be a major thorn in my side: NetBeans. As the size of the project grew, NetBeans got slower and slower. It seemed that I spent more time fiddling with IDE settings than actually coding. Even worse is that the IDE-generated makefile is so convoluted that it’s extremely difficult to modify by hand in a satisfying way. I’ve always coded with an IDE so I wouldn’t have even considered not using one, but then I spent some time with Haskell. One of Haskell’s “problems” is that it doesn’t have good IDE support. It doesn’t seem like any IDE really handles it well, so most people use Emacs. Personally, I haven’t really warmed up to Emacs, but GEdit has syntax highlighting for Haskell and a built-in terminal for GHCi. GEdit also has syntax highlighting for C. Next time, I will seriously consider using a lighter-weight text editor for a C project. All this said, I think NetBeans for Java remains the way to go.

What’s Next

Like any program, version 1.0 is just one of many versions. There certainly remains a lot of work to do with DMP Photo Booth. Some major items you are likely to see whenever I get around to working on DMP Photo Booth some more:

Options Dialog

I think anybody who has seen it will agree: the options dialog in DMP Photo Booth is bad. It’s poorly organized, and kind of wonky. Personally, I modify settings using the .rc file, which is telling. This is certainly a high-priority improvement.

Functional Refactor

Like I said above, the code could use a pass to limit side effects. Functions need to have their side effects limited, and globals need to be eliminated unless absolutely necessary. However, C is not a functional language. While one could argue that function pointers enable functional programming in C, this is a very pedantic argument. I won’t be going crazy with functional programming techniques. There will be no Monads, or for loops being turned into mappings of function pointers.

Optional Module API

An idea I’ve had on the back burner for a while is an optional module API. This would be used for very specific quality-of-life things. For instance, a module could provide a GTK widget to be shown in the options dialog. Any module that doesn’t want to implement any or all of the optional API can just ignore it. The module loading function will gracefully handle the dlsym failure, just treating it as it is: declining to implement the API. I have no plans to change the current existing API, so all you module developers can rest easy!

User Interface Module

It occurred to me that it might be good to have a UI module. This would provide the UI, and wouldn’t be tied to the trigger/printer/camera module start/stop system. This module would be loaded at startup and unloaded on shutdown. This would allow the Photo Booth to use different widget toolkits: QT, Curses, Cocoa, WinForms, or whatever else. Under this scheme, the current GTK+ interface would be abstracted into the reference UI Module.

Forking A New Process Using GLib

These days we tend to think of concurrency in terms of spawning threads. Need to perform a long running calculation? Spawn a thread. However, there are other ways; we can fork and create a new process. Unfortunately for us, fork and threads don’t play nice together. How so, you ask? When you fork a new process, only the current thread is copied into the new process. If any other thread held a lock on a mutex, that mutex will never be unlocked in the new process. This includes mutexes held inside library functions such as malloc.

In light of this, you may be wondering why I’m wasting your time with this. No, this isn’t just a PSA, there is something sane you can do with fork in a multi-threaded world: you can call exec and friends. And it just so happens that GLib can help us with this. GLib provides us with Process Spawning facilities that integrate with GIOChannel and GMainLoop.

The first thing you may notice is that there isn’t actually a GLib equivalent to fork or exec. These two calls are combined into the g_*_spawn_* family of functions. This is because GLib itself spawns threads to perform work. By default, *all* GLib applications potentially have threads running and as such it is never safe to call fork without immediately calling exec.
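For reference, the fork-then-exec pattern that those functions wrap looks roughly like this in plain POSIX C. This is a sketch of the general idiom, not GLib’s implementation, and run_and_wait is a name made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, exec argv[0] immediately in the child, and return the child's exit
 * status, or -1 on failure. The child runs nothing but exec and _exit. */
int run_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;                 /* fork failed */

    if (pid == 0) {
        execvp(argv[0], argv);     /* only returns if exec failed */
        _exit(127);                /* leave without touching shared state */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child goes straight to exec, it never gets a chance to touch a mutex that some other thread was holding at the moment of the fork.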

Forking A New Process

First, let’s do some setup:

gchar * child_argv[] = {"[PROGRAM_TO_RUN]", "[ARGUMENTS]", NULL};

This is the command that will be executed (Your argv). Since this array is terminated by NULL, GLib is able to determine its length and we do not need an argc.
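Finding the length of a NULL-terminated vector is just a walk to the sentinel. A minimal sketch (count_args is a hypothetical helper; GLib ships g_strv_length for the same job):

```c
#include <assert.h>
#include <stddef.h>

/* Count the entries in a NULL-terminated argument vector. */
size_t count_args(char *const argv[])
{
    assert(argv != NULL);

    size_t n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}
```

Forget the trailing NULL and this loop runs off the end of the array, which is why the terminator is not optional.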

GPid pid;
gint stdout;
GError * error = NULL;

We’ll need these as well. Now, it’s time to start our process:

gboolean result = g_spawn_async_with_pipes (NULL, child_argv, NULL,
    G_SPAWN_DEFAULT, NULL, NULL, &pid, NULL, &stdout, NULL, &error);

Yeah… That one’s a doozy. Let’s go over all those arguments.

The first argument is the child’s working directory. If this is NULL, then the child inherits the parent’s working directory.

The second argument is the child’s argument vector. This is the command that will be executed.

The third argument is the child’s environment. Like the argv, this must be NULL-terminated. If NULL the child inherits the parent’s environment.

The fourth argument is the child’s spawn flags.

The fifth argument is a pointer to a GSpawnChildSetupFunc function, to be called just before exec. If NULL, then the process will fork and exec without additional setup.

The sixth argument is the gpointer to be passed to the GSpawnChildSetupFunc.

The seventh argument is a location to return the PID of the new process.

The eighth, ninth, and tenth arguments are return locations for the file descriptors of STDIN, STDOUT, and STDERR respectively.

The last argument is a return location for a GError if something goes wrong. This function returns FALSE if something goes wrong.
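To see what the STDOUT return location gives you, here is a rough POSIX-only sketch of the same idea: create a pipe, fork, point the child’s stdout at the pipe, exec, and read from the other end in the parent. It is an illustration under those assumptions, not what GLib does internally, and spawn_and_read_stdout is a made-up name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv and copy up to size - 1 bytes of its stdout into buf.
 * Returns the number of bytes stored, or -1 if the pipe or fork failed. */
ssize_t spawn_and_read_stdout(char *const argv[], char *buf, size_t size)
{
    assert(argv != NULL && argv[0] != NULL && buf != NULL && size > 0);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: make the pipe's write end our stdout, then exec. */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);
    }

    /* Parent: read until EOF or until the buffer is full. */
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < size - 1 &&
           (n = read(fds[0], buf + total, size - 1 - total)) > 0)
        total += (size_t) n;
    buf[total] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t) total;
}
```

The descriptor GLib hands back plays the role of fds[0] here: it is the parent’s read end of the child’s output.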

What Now

So you’ve got your fancy new process, what do you do with it?

Well, first let’s create some GIOChannels using our file descriptors:

GIOChannel * outch = g_io_channel_unix_new(stdout);

Next, we add callbacks:

GSource * stdout_source = g_io_create_watch(outch, G_IO_IN);
g_source_set_callback(stdout_source, (GSourceFunc) stdout_callback, outch, NULL);
g_source_attach(stdout_source, main_context);

GSource * stdout_abort = g_io_create_watch(outch, G_IO_ERR | G_IO_HUP | G_IO_NVAL);
g_source_set_callback(stdout_abort, (GSourceFunc) abort_callback, NULL, NULL);
g_source_attach(stdout_abort, main_context);

Here, I’ve created two sources: one that will be called when there’s data to be read, and one to be called when something goes wrong. The first call, to g_io_create_watch, creates a GSource that watches for a certain condition. The second call, to g_source_set_callback, tells the watch what function to call when the condition is met. For a source created by g_io_create_watch, GLib invokes the callback as a GIOFunc, so it should have the following signature (cast it to GSourceFunc when passing it to g_source_set_callback):

static gboolean callback(GIOChannel * source, GIOCondition condition, gpointer data)

The final call to g_source_attach attaches a source to a GMainContext. If NULL is passed to the second argument, then the default context is used.

…and that’s all there is to it! Your callbacks can operate on file descriptors using the g_io_channel_* family of functions, and when the abort callback is called, it can exit gracefully.

The Smallest Things…

For the last few weeks I’ve been banging my head against a problem. I need my Photo Booth application to actually take a photo. It seems like such a simple thing, but it has actually been one of the most difficult ones I’ve encountered. Like the Great PostScript Debacle and the Mystery of the GAsyncQueue Ref before it, I spent a good week banging my head against the wall. I even took a day off to write an entire Lua Camera Module just so I could shell out and call a command line utility to try to work around it.

The best part? If you’ve been following the blog, you know I’m kind of a crybaby about poor documentation. While libgphoto2 is certainly a repeat offender on this count, no amount of documentation could have prepared me for what was to come.

But first, let’s go over my now-working implementation.

Taking A Picture With Libgphoto2

To take a picture, we need 3 functions:

  • gint dmp_cm_camera_init()
  • gint dmp_cm_camera_finalize()
  • gint dmp_cm_camera_capture(gchar * location)

dmp_cm_camera_init

gint dmp_cm_camera_init()
{
    context = gp_context_new();
    gp_log_add_func(GP_LOG_ERROR, (GPLogFunc) dmp_cm_log_func, NULL);

    if (gp_camera_new(&camera) != GP_OK)
    {
        //error handling
    }

    if (gp_camera_init(camera, context) != GP_OK)
    {
        //error handling
    }

    return DMP_PB_SUCCESS;
}

There are two main structs: Camera and GPContext. A Camera represents, shockingly, a camera attached to the system. A GPContext represents work to be done: callback functions, data, and other things of that nature.

First we create a new context. Next we can add a log function to accept log messages from libgphoto2. In my experience, no matter what you do you will get a lot of useless garbage output from libgphoto2. For this reason, I recommend you don’t just let this spew to the console or some other user-facing output. At first, I was going to send this to the console queue, but I’ve since decided against using this feature. It is good to know about though in case you need it for troubleshooting.

After all of that is done, we need to create our camera object, and initialize libgphoto2.

dmp_cm_camera_finalize

gint dmp_cm_camera_finalize()
{
    gp_camera_unref(camera);
    gp_context_unref(context);
    return DMP_PB_SUCCESS;
}

Nothing particularly tricky there. We need to ensure we free our memory when we’re done, so we unref our camera and context. Having seen these two functions, you may be wondering to yourself: “Are we dealing with GObjects here?” Luckily for us, there is a simple test for this:

g_assert(G_IS_OBJECT(camera));

I’ll spare you the effort of running this test: the assertion fails. Too bad really, but it is what it is. Libgphoto2 just uses function names similar to GObject.

dmp_cm_camera_capture

gint dmp_cm_camera_capture(gchar * location)
{
    CameraFile * file;
    CameraFilePath camera_file_path;
    gint fd;
    CameraEventType event_type;
    void * event_data;

    if (gp_camera_capture(camera, GP_CAPTURE_IMAGE, &camera_file_path, context) != GP_OK)
    {
        //error handling
    }

    if ((fd = g_open(location, O_CREAT | O_WRONLY, 0644)) == -1)
    {
        //error handling
    }

    do
    {
        gp_camera_wait_for_event(camera, 1000, &event_type, &event_data, context);
        if (event_type == GP_EVENT_CAPTURE_COMPLETE) break;
    }
    while(event_type != GP_EVENT_TIMEOUT);

    if (gp_file_new_from_fd(&file, fd) != GP_OK)
    {
        //error handling
    }

    do
    {
        gp_camera_wait_for_event(camera, 1000, &event_type, &event_data, context);
    }
    while(event_type != GP_EVENT_TIMEOUT);

    if (gp_camera_file_get(camera, camera_file_path.folder, camera_file_path.name,
            GP_FILE_TYPE_NORMAL, file, context) != GP_OK)
    {
        //error handling
    }

    if (gp_camera_file_delete(camera, camera_file_path.folder, camera_file_path.name,
            context) != GP_OK)
    {
        //error handling
    }

    gp_file_free(file);

    do
    {
        gp_camera_wait_for_event(camera, 1000, &event_type, &event_data, context);
    }
    while(event_type != GP_EVENT_TIMEOUT);

    return DMP_PB_SUCCESS;
}

This function is where the meat of the process is. First we need to do some housekeeping. We create a CameraFile pointer to represent the actual image file, and a CameraFilePath struct to represent the path to the file. We also create an int for use as a file descriptor, and a CameraEventType and void pointer for our calls to gp_camera_wait_for_event.

Next we call gp_camera_capture, which triggers the camera to take a picture. After that is done, we’ll open a file descriptor to save the image. You’ll notice that the call to g_open is enclosed in parentheses. THIS STEP IS 100% MANDATORY. Don’t omit it, you’ll be sorry. More on this in a bit.

Next, we wait for the camera to finish working. The camera uses an event system; it will emit events when things happen. After releasing the shutter, the camera has other work to do before it is “done taking the picture”. If you try to do the next step before the camera is ready, libgphoto2 will spew garbage to your STDOUT and you’ll have to ctrl+c to fix it. To avoid this, we call gp_camera_wait_for_event in a loop until event_type is either GP_EVENT_CAPTURE_COMPLETE or GP_EVENT_TIMEOUT. Capture complete is obviously the event we care about, but it may have happened while we weren’t listening for it. In that case, we’ll settle for a timeout.

Next up is instantiating our CameraFile. We use the file descriptor that we just opened to call gp_file_new_from_fd. Unfortunately there is no gp_file_new_from_file_pointer, which means that this call is POSIX only, and there’s no portable substitute.

After creating our CameraFile, we download the image we just took by calling gp_camera_file_get and then delete the file from the camera using gp_camera_file_delete.

Finally we make sure no events are pending, then return.

Why Are You Yelling At Me?

Good question. The block in question of course is

if ((fd = g_open(location, O_CREAT | O_WRONLY, 0644)) == -1)
{
    //error handling
}

In that if statement, I’m assigning a value and testing the result in the condition. This operation is about a 2 out of 10 on the cleverness scale. It is tempting to omit the parentheses around (fd = g_open(location, O_CREAT | O_WRONLY, 0644)). However, if we do it here, things go off the rails. Not right away, of course, but a few function calls later we get to:

if (gp_camera_file_get(camera, camera_file_path.folder, camera_file_path.name,
        GP_FILE_TYPE_NORMAL, file, context) != GP_OK)
{
    //error handling
}

As soon as gp_camera_file_get(...) is evaluated, this is spewed to the console:

[Screenshot: mystic_runes]

…and you have no choice but to kill the process.

Why does this happen? I have no idea. Why does enclosing the call to g_open in parentheses fix it? Again, no idea. And it only happens here too. I just tried to modify the examples that come with libgphoto2 to reproduce the error and get that screenshot for this post, but it works fine there. Knowing my luck, if you download and build the program, it’ll work fine for you.

As long as it works, I guess…
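One plausible explanation, offered as a guess rather than something verified against this build, is plain C operator precedence: == binds more tightly than =, so without the inner parentheses fd is assigned the result of the comparison (0 when g_open succeeds) instead of the descriptor. Descriptor 0 is standard input, which is usually the terminal, so image data written to it would show up as garbage on the console. A minimal sketch of the difference, with a plain int standing in for the return value of g_open (the function names are made up for illustration):

```c
#include <assert.h>

/* Without the inner parentheses the statement parses as
 * fd = (open_result == -1), so fd ends up 0 or 1. */
int fd_without_parens(int open_result)
{
    assert(open_result >= -1);

    int fd;
    fd = open_result == -1;
    return fd;
}

/* With the parentheses the assignment happens first, and the comparison
 * is applied to the value that was stored in fd. */
int fd_with_parens(int open_result)
{
    assert(open_result >= -1);

    int fd;
    int failed = ((fd = open_result) == -1);
    (void) failed;                 /* only fd matters for this demo */
    return fd;
}
```

If a successful open returned descriptor 7, the first version leaves fd at 0 while the second leaves it at 7.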

DMP Camera Module: Shooting For The Moon

So there I was; several hours into my work on the Camera module. I may have mentioned this before, but lack of documentation is a pet peeve of mine. Unfortunately, sometimes it can’t be avoided. Take libgphoto2. If you click that link, you’ll get taken to a doxygen website. Seems promising, right? Go ahead and poke around, things start to look less rosy as you do. Unfortunately, this seems to be the gold standard of PTP libraries for Linux, so there’s really nothing for it. Right?

Maybe Not

After hours of frustration, I decided to try something crazy. I opened up a command prompt and entered:

gphoto2 --capture-image-and-download

And you know what? My camera took a picture and downloaded it to the current directory. Maybe that’s the answer I’m looking for. DMP Photo Booth doesn’t need to do anything fancy. It just needs to take a picture.

Now, I had been planning to provide modules that call out into Lua to allow people to implement modules in Lua. However, this was always a back-burner project. The sort of thing that happens after version 1.0 is released. But with implementing a libgphoto2 Camera Module seeming like So Much Work, maybe it was time to get on it. At least, for the Camera Module.

dmp_pb_lua_camera_module

So I committed and pushed my work on the Camera Module. I made a copy of it, and removed all the logic. After that, I committed it to the repository. It was officially official.

The first order of business was creating the Lua script loader. I needed an init, finalize, and is_initialized function for Lua, and a capture function. Let’s take a look:

(I’ve omitted error handling from these examples. If I didn’t they’d be 3 times as long and nobody wants to read that)

gint dmp_cm_lua_initialize()
{
    dmp_cm_state = luaL_newstate();
    luaL_openlibs(dmp_cm_state);

    luaL_loadfile(dmp_cm_state, DMP_CM_MODULE_SCRIPT);
    lua_pcall(dmp_cm_state, 0, 1, 0);
    lua_setglobal(dmp_cm_state, DMP_CM_NAMESPACE);

    lua_getglobal(dmp_cm_state, DMP_CM_NAMESPACE);
    lua_getfield(dmp_cm_state, -1, DMP_CM_MODULE);
    lua_pushcfunction(dmp_cm_state, dmp_cm_lua_console_write);
    lua_setfield(dmp_cm_state, -2, "console_write");

    lua_getglobal(dmp_cm_state, DMP_CM_NAMESPACE);
    lua_getfield(dmp_cm_state, -1, DMP_CM_MODULE);
    lua_getfield(dmp_cm_state, -1, "initialize");
    lua_pcall(dmp_cm_state, 0, 0, 0);

    is_initialized = TRUE;

    return DMP_PB_SUCCESS;
}

First up is the initialize function. First, I initialize Lua and open the standard library. The call to luaL_loadfile loads the script and pushes it onto the stack as a function, which is called by the subsequent call to lua_pcall.

If you’ve been following the blog, you may have noticed that I’m a fan of namespaces. I follow the GLib namespace style and use [NAMESPACE]::[APPLICATION/MODULE]::. I’ve decided that DMP Photo Booth modules implemented in Lua should do this as well. Lua doesn’t have actual namespaces as a language feature, but like most things, they can be approximated using tables. To that end, a Lua camera module script should return a table named dmp, which contains a table named cm. In a future version, these will likely be configurable. The module will return the dmp table, which is set as a global in the next call.

Next, we must register the console write callback. This is accomplished by getting the dmp.cm table, pushing the console write function, and setting it as a field in dmp.cm.

Next, we get dmp.cm.initialize, and call it.

gint dmp_cm_lua_capture(gchar * location)
{
    lua_getglobal(dmp_cm_state, DMP_CM_NAMESPACE);
    lua_getfield(dmp_cm_state, -1, DMP_CM_MODULE);
    lua_getfield(dmp_cm_state, -1, "capture");
    lua_pushstring(dmp_cm_state, location);
    lua_pcall(dmp_cm_state, 1, 0, 0);

    return DMP_PB_SUCCESS;
}

This is the basic method to call a function. First, get the dmp table, then get its cm field. Next, get the function from dmp.cm. After the function is on the stack, we push its arguments onto the stack, and finally we call it. The functions for finalize and is_initialized look strikingly similar, so I’ll spare you.

The Script

The script is extremely simple, thanks to Lua. I can print the whole thing here without editing it, it’s so small:

local dmp = {}
dmp.cm = {}

function dmp.cm.capture(location)
    os.execute("gphoto2 --capture-image-and-download " .. "--filename=" .. location)
end

function dmp.cm.initialize()
end

function dmp.cm.finalize()
end

function dmp.cm.is_initialized()
    return true
end

return dmp

In the first two lines, we create our dmp.cm namespace tables. Next we define our functions: capture, initialize, finalize, and is_initialized.

Finally, we return our namespace table for use within C. Of the four functions, only capture isn’t a placeholder. In capture, we fork and execute gphoto2, signaling our camera to capture.
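Lua’s os.execute hands the string to the system shell, so the two options need a space between them once concatenated. The same command built in C, as a sketch (build_capture_command is a hypothetical helper, and it does no shell quoting, so a location containing spaces would need extra care):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write the gphoto2 capture command for `location` into out.
 * Returns the command length, or -1 if it did not fit in `size` bytes. */
int build_capture_command(char *out, size_t size, const char *location)
{
    assert(out != NULL && location != NULL && size > 0);

    int n = snprintf(out, size,
                     "gphoto2 --capture-image-and-download --filename=%s",
                     location);
    return (n >= 0 && (size_t) n < size) ? n : -1;
}
```

The resulting string can then be passed to system(), which is essentially what os.execute does on the Lua side.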

How’s That Working Out For Me

Unfortunately, not so great. Well, the Lua module works perfectly. It loads, all functions call without a hitch. And a Lua script is a lot easier to implement than a C module. If only gphoto2 wasn’t so incredibly brittle.

The problem with a command line utility is that you have to count on it to work. Unfortunately, so many things can go wrong with gphoto2. So many errors, so many ways to get into an inconsistent state. Plus, my favorite part about all of this, is that all of this happens by magic! You can do the same thing twice and get different results! Take that Einstein!

No, it seems that my little foray into Lua has come to an end. The Module is live. However, work must re-start on dmp_pb_camera_module. Such a shame…

At The Gates Of Valhalla

In my post the other day, I talked about using Valgrind and GProf to debug DMP Photo Booth. As tends to be the way of most articles on the Internet, I didn’t actually spend any time talking about how to use said programs. You are, however, on notice that they are Good.

Well, I promised I’d tell you how to use them. I may be many things, but a liar I am not. Today, I’ll start with Valgrind.

Valgrind

Installation

First things first, we need to install. Once again, Ubuntu comes through for us. It’s just a simple matter of …

sudo apt-get install valgrind

… and we’re off to the races. Unfortunately for those of you in the audience running Windows, Valgrind is not available for Windows.

Valgrind Vs. GLib

You may remember in my last post a lot of talk about difficulties with GTK. Per the Gnome Wiki, the recommended way to launch Valgrind with a GLib/GTK application is:

G_DEBUG=resident-modules valgrind \
    --tool=memcheck \
    --leak-check=full \
    --leak-resolution=high \
    --num-callers=20 \
    --log-file=vgdump \
    [your-program]

Let’s talk about these options.

  • G_DEBUG=resident-modules
    • This is not actually an argument to Valgrind, but an environment variable. This is used by GLib. You need to use this if your application makes use of GModule. For instance, if you’re implementing a module-based Photo Booth application. If you don’t use this, your modules may get unloaded prematurely. If this doesn’t apply to your application, feel free to omit this.
  • valgrind
    • Calls valgrind
  • --tool=memcheck
    • Valgrind is actually a suite of tools. Memcheck is the tool that checks for memory leaks
  • --leak-check=full
    • Enables searching for leaks on exit
  • --leak-resolution=high
    • Sets the detail level of leak stack traces
  • --num-callers=20
    • This sets the depth of the stack traces. For instance, setting this to 20 will show: foo() called by bar() … and so on up to 20 times
  • --log-file=vgdump
    • The file to output to. In its current configuration, a file named vgdump will be placed in the current directory
  • [your-program]
    • Finally, your program. Place in whatever command you’d call to start your program.

…so you do all that and execute your program. Valgrind will monitor your program while you interact with it. Go ahead and run some test cases. It’ll be a little slow due to the monitoring, but that’s ok. Finally, when you’re done, exit your program and open the log file. The first thing you should notice is that it is long. Really long.

This is due to GTK. GTK doesn’t clean up after itself on exit, instead relying on the OS to clean up on process termination. While this behavior is considered to be fine by most, it makes this step difficult. On the Gnome Wiki, you’ll find a Suppression file that is used to mitigate some of this. My experience with this is that it doesn’t do much.

The best way I’ve found is to just search for “definitely lost”. These are most likely to be caused by your program. You can go line by line and check each possibly lost section, but I’ve found that this isn’t practical as 99/100 of these originate from gtk_main().
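If you want to see what a “definitely lost” entry looks like before wading through a GTK log, a deliberately leaky function is enough. A minimal sketch (leaky_length is made up for the example); build it with -g and call it from a small program run under the Valgrind command above:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy `text` into a heap buffer, measure it, and "forget" to free it. */
size_t leaky_length(const char *text)
{
    assert(text != NULL);

    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return 0;

    strcpy(copy, text);
    size_t len = strlen(copy);

    /* Missing free(copy): the only pointer to the block dies with this
     * stack frame, which is what Memcheck calls "definitely lost". */
    return len;
}
```

Memcheck’s stack trace for the lost block should end at the malloc call inside leaky_length, and that is the kind of entry worth chasing in your own code.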

DMP Photo Booth: Underwater

You’ve heard it before: “Premature optimization is the root of all Evil.” Capital Evil. So you go on about your day, arranging the ones and zeros in pretty Christmas tree shapes and suddenly the day arrives: your program is slow as molasses. What are you going to do now?

Last Monday was that day for me, and I’ve been underwater ever since. “Why is this happening to me?!” I thought. While not prematurely optimizing, I thought I did things right. I have no nested for loops. I’m not using an array when I need a list. Threads aren’t modifying the UI willy-nilly. Why has God forsaken me?

The Symptoms

I first noticed it while working on the printer module. After the program is open for some length of time, my whole computer begins to lag. Not just a little bit either; things completely fall apart. In the space of about 5 minutes, the computer becomes unusably slow. Killing the Photo Booth process doesn’t help; only physically shutting the computer off helps. Of course, the computer is so slow that I can’t use the shutdown option; I have to press The Button.

At this point, I feel some context is in order. I had been trying to figure out how to make my printer print on photo paper. Apparently printing is one of the areas where Linux still hasn’t caught up to Windows, so this was proving to be difficult. After printing a few strips, I realized that my low-res photo strips weren’t going to cut it, so I bumped the resolution from 100 pixels wide to 1000. It was then that I noticed things were off.

Ten years of troubleshooting experience kicked in: “what changed?” I thought. The obvious answer was the image size. Clearly my photo strip assembly algorithm was operating at O(n^n^n) or something. What can be done?

Doing It Wrong

I took a look at my assemble strips function. After poking around for a while, I zeroed in on something that had been bugging me for a while. I had been using a function MagickResetImagePage combined with MagickCoalesceImages to composite images over each other. I had decided to use these functions before I knew this operation was called “compositing”, and I had found them in a tutorial on making animated .gif files in MagickWand. At the time, I was never really happy with this implementation, so I went back to the API docs to see if there was a function with “composite” in its name. There was.

MagickCompositeImage is a lot more intuitive to use than MagickResetImagePage. It doesn’t have that Magickal formatting string that MagickResetImagePage uses, it just takes coordinates. Perhaps this was the solution to my problem. I refactored, and recompiled.

Still broke.

Measure, Don’t Guess

That old gem: I’m sure you’ve heard it too. I decided that maybe this was my best course of action. I decided it was time to learn how to use this Valgrind thing all the Cool Kids are talking about these days. For those of you not in the know, Valgrind is a utility that will tell you various things about your program. The most important/most well-known thing that it can do for you is identify memory leaks. Thinking that perhaps I had a memory leak, I installed Valgrind and got to work.

It turns out that GTK has more than a few memory leaks. Allegedly this is due to the fact that it doesn’t cleanup on exit, relying on the OS to free the memory on program termination. While the general consensus is that this is fine, it doesn’t help us. The folks at Gnome are aware of this, and there is even a Wiki page on ways to mitigate this. The cliff’s notes version of that page being: “Just search for ‘definitely lost'”.

Armed with this piece of wisdom, I set off. I ran the Photo Booth in Valgrind, and examined the results. Valgrind actually turned up some memory leaks, which I corrected. Maybe now we’re set!

Nope.

Breaking Out The Profiler

This is what they usually want you to do when they tell you to Measure. Unfortunately for me, NetBeans’ built-in profiler is only for Java. After some Google searching, I found gprof. It is a pretty bare-bones profiler. It does what it says and not much else, which is fine. I hooked my program into the profiler and got to work. The results? Nothing. My two GTK idle functions ran some 7 million times, returning basically immediately each time as expected. Every other function performed as expected.

What now?

Trying The Process Monitor

Having run through Valgrind and gprof and come out empty-handed, I was at a loss. I got into development because I wanted to fix my own broken code instead of mitigating somebody else’s, and fix it I will. Luckily I have 10 years of sysadmin experience to fall back on. I dusted off my process monitor and got to work.

I fired up DMP Photo Booth, and watched it in the process monitor. I pushed the button. I pushed it again. And again. Memory use rose and fell predictably as the strip was assembled, but CPU usage stayed relatively low. Then boom!

I tried again, this time doing literally nothing. Still my computer sputtered and died. I killed the process, but again it was too late.

But wait, isn’t the OS supposed to clean up after me when my process ends? Something fishy is going on.

Have I Mentioned That Threads Are Hard?

Having eliminated all other possibilities, I was forced to consider that I was having a threading issue. “But I was so careful!” I thought. Shortly thereafter I noticed it: I was getting random pthread mutex errors on my console. Clearly I had a threading issue on my hands. Was I spawning extra threads? Was something not releasing its lock? Was I being victimized by gremlins? I set a breakpoint on line one of main() and fired up my debugger. It was time to see just what was being done when nothing was being done.

So, I stepped through my program. Whenever I got to a g_thread_new call, I made sure the thread function was solid. Finally, I got to my g_idle_add calls. I had two of them, one to monitor the status indicators, and one to retrieve photo strip thumbnails. Both of these functions pop a result from a GAsyncQueue. These queues are fed by worker threads. I thought back to my profiler output and remembered how often these are called. Looking a few lines down I saw a call to g_timeout_add_seconds. This function basically adds an idle function, but one that is called at most once every X seconds. Maybe replacing the g_idle_add calls with g_timeout_add_seconds was my answer. I refactored and reran.

Nope.

Well, crud. “Are these functions even my problem?” I thought. I commented them out, recompiled and reran.

Fixed.

“So, what’s the difference?” I wondered. All three of these functions rely on the same basic behavior: pop from a GAsyncQueue some result placed there by a worker thread. I looked at the three threads: the thread that was working properly calls g_async_queue_ref/unref, and the two that don’t work do not take a reference, instead accessing the static global variable in their module. I refactored all thread functions that access a GAsyncQueue to take a reference and work on their local copy only. I recompiled, reran, and went to bed. 46,100 seconds later, everything was humming along just fine.

Wait, So I Just Had To Increment A Reference Count?

It certainly seemed odd. That’s like your car not starting if the headlights are out. Sure, they’re important, but the car should still start, right?

Looking through the source of GLib didn’t help. So far as I can tell, all that g_async_queue_ref does is increment the reference count and return a pointer. I turned to the documentation, which says “… Whenever another thread is creating a new reference of (that is, pointer to) the queue, it has to increase the reference count (using g_async_queue_ref()). Also, before removing this reference, the reference count has to be decreased (using g_async_queue_unref()). …” While not definitive, this certainly seems to indicate that taking a reference is important.
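
To make “increment the count and return a pointer” concrete, here is a toy reference-counted object in plain C11. It is not GLib's implementation; RefQueue and every function below are hypothetical names made up for this sketch. It does show the one thing a reference is good for: the object is freed only when the last holder lets go, so a thread holding its own reference cannot have the queue freed out from under it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted queue (not GLib's struct). */
typedef struct {
  atomic_int ref_count;
} RefQueue;

/* Number of RefQueue objects not yet freed; exists only so the
 * example can observe when the free actually happens. */
static atomic_int live_queues;

int ref_queue_live_count(void)
{
  return atomic_load(&live_queues);
}

/* A new object starts with a reference count of 1, owned by the caller. */
RefQueue *ref_queue_new(void)
{
  RefQueue *queue = malloc(sizeof *queue);
  assert(queue != NULL);
  atomic_init(&queue->ref_count, 1);
  atomic_fetch_add(&live_queues, 1);
  return queue;
}

/* Taking a reference: bump the count and hand the same pointer back. */
RefQueue *ref_queue_ref(RefQueue *queue)
{
  atomic_fetch_add(&queue->ref_count, 1);
  return queue;
}

/* Dropping a reference: whoever drops the last one frees the object. */
void ref_queue_unref(RefQueue *queue)
{
  /* atomic_fetch_sub returns the value held before the decrement. */
  if (atomic_fetch_sub(&queue->ref_count, 1) == 1) {
    atomic_fetch_sub(&live_queues, 1);
    free(queue);
  }
}
```

If the creator calls ref_queue_new and a worker calls ref_queue_ref, the creator's ref_queue_unref leaves the object alive, and it is only freed by the worker's ref_queue_unref. Without the worker's reference, the creator's unref would free the object while the worker still held a pointer to it. Whether that is what actually bit me, I can't say.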

Frankly, I’m not happy about this answer. This is just the sort of magic solution that I hate; it’s fixed, but I’m not sure why. For the time being, I won’t dwell on it. Moving forward, I’ll be sure that my threads take a reference of a GAsyncQueue before calling methods on it. At some point when all of this is said and done, perhaps I’ll investigate this mysterious reference count.

I have taken away from this a new appreciation of just how brittle threads are. Sure, they are powerful, but shooting yourself in the foot with a 50 cal hurts a lot more than with a 9 mm. I’ll have to be more careful.

It was also a good introduction to gprof and Valgrind. Expect blog posts on the usage of each of these tools soon!
