Marketable Skills (Random X11 Post)

It’s weird. I’ve known how to code for more than half my life, about 15 years now, yet I haven’t gotten to the point where my skills are actually marketable. I’ve learned a hell of a lot of theory: how to design algorithms, how to analyze and optimize them, and also a lot of abstract mathematics. I’ve gotten really good at the discipline of programming. Yet I haven’t gotten to the point where I can actually write anything useful, at least anything useful to the average user. I’ve never actually completed any of my large-scale projects, even though I had a lot of great ideas, and I never learned how to work with graphics, so my practical skills are basically limited to web development and writing command line apps.


I’m the ultimate underachiever. I’ve had so many great ideas, some of them million dollar ideas even, but I’ve never gotten around to actually implementing them. I don’t want to live like this anymore. I want to do something with my life that actually matters. I’m almost 30, way too old to be at this stage in my life. It’s time to stop thinking and start doing. So I’m now making a concerted effort to learn the skills necessary to make my existing programming skills marketable. Basically, that includes graphics programming, working with multimedia, and networking APIs, among possibly other things.

I thought I’d start by learning the X11 API. I figured that was a good starting point because it’s sort of the foundation for everything else graphical in Unix/Linux. It won’t make my coding skills marketable overnight, but at least I’ll get my feet wet with programming in a windowing environment, creating my own widgets and whatnot. To this end, I decided to use the RTFM approach. For those of you who don’t know, that means “Read the fucking manual!” I found what I believe to be the official reference manual for XLib. It’s an old book, but based on the minimal experience I have with programming in X11 in the past, the base library doesn’t seem to have changed all that much, so I’m sure it’s a perfectly good source for learning the X11 API.

The rest of this post is just going to be me regurgitating what I’ve learned so far, just because I’ve learned that summarizing new learning is one of the best ways to get it to solidify in your brain.

The first thing to understand about the X Window System is that it’s a client-server protocol. X11 clients communicate with the X display server by exchanging protocol messages over a connection, even if the client and server are on the same computer. (Between machines that connection is usually TCP/IP; on a single machine it is typically a local Unix domain socket instead.) If they’re on different computers, then the X server resides on the computer through which user I/O is occurring. Probably the best example of this is X11 forwarding through SSH. In this scenario, the X server resides on the computer that is acting as the SSH client, while the X clients reside on the computer that is acting as the SSH server. It’s confusing, but once you understand that there’s a role reversal between client and server, it’s not too hard to wrap your head around.
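To make the client/server split a bit more concrete: a client picks its server through a display name of the form host:display.screen, which XOpenDisplay( NULL ) takes from the DISPLAY environment variable. An empty host (as in ":0") means the local machine, while under SSH forwarding you will typically see something like "localhost:10.0". The sketch below just splits such a name into its parts. The DisplayName struct and parse_display_name() are my own hypothetical helpers, not part of XLib, and the parser ignores less common forms of the name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type, not part of XLib: the pieces of a display name. */
typedef struct {
    char host[256];  /* empty string means "the local machine" */
    int  display;    /* which X server on that host */
    int  screen;     /* which screen of that server (0 if omitted) */
} DisplayName;

/* Parse "host:display.screen". Returns 0 on success, -1 on malformed input. */
int parse_display_name(const char *name, DisplayName *out)
{
    const char *colon = strrchr(name, ':');
    if (colon == NULL || colon[1] == '\0')
        return -1;

    /* Everything before the last colon is the host part. */
    size_t host_len = (size_t)(colon - name);
    if (host_len >= sizeof out->host)
        return -1;
    memcpy(out->host, name, host_len);
    out->host[host_len] = '\0';

    /* The display number is mandatory. */
    char *end;
    long display = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || display < 0)
        return -1;
    out->display = (int)display;

    /* The screen number is optional and defaults to 0. */
    out->screen = 0;
    if (*end == '.') {
        const char *screen_text = end + 1;
        long screen = strtol(screen_text, &end, 10);
        if (end == screen_text || screen < 0)
            return -1;
        out->screen = (int)screen;
    }
    return *end == '\0' ? 0 : -1;
}
```

Calling parse_display_name( getenv( "DISPLAY" ), &d ) inside an SSH session with forwarding enabled is a quick way to see that the "server" your program talks to is not the machine the program runs on.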

When X starts, one of the first clients to run is usually the window manager. All clients to a given X server are able to examine and sometimes even modify the windows of other clients. This property of X is crucial for window managers, because it allows the window manager, which is really just another X11 client, to modify the windows of all other clients. A typical window manager does this by creating a frame window for each client’s top-level window and making the client’s window a child of that frame (this is called reparenting). The frame contains decorations such as the title bar and borders, and the client’s window moves whenever the frame moves. Another point to remember is that virtually everything you see in the X environment is a window. Scrollbars are windows. Titlebars are windows. Clicky buttons are windows. Drop-down menus are windows. Grasping this simple fact will greatly improve your understanding of X.

There are a number of graphics libraries built on top of XLib. These include such names as GTK+, Qt, Motif, etc. These libraries use XLib’s window creation abilities to create their own predefined widgets with their own look and feel. There are basically two kinds of libraries built on top of XLib: there are normal widget toolkits like GTK+ that provide a predefined set of widgets that you can use, and then there are widget toolkit intrinsics, which are lower-level libraries that allow you to more easily create your own widget libraries. The standard toolkit intrinsic is Xt. Motif and Xaw are two examples of widget toolkits that are built on top of Xt. GTK+ and Qt basically skip the toolkit intrinsic and provide a full abstraction layer with predefined widgets built directly on top of XLib. Programming using one of these widget libraries is considerably faster, but the resulting program drags in a much larger set of dependencies. So if you’re going for minimalism, you might want to consider coding directly in XLib.


There are several steps involved in initializing an XLib program. Even a basic window with the words “Hello, World!” written on it involves several function calls and dozens of lines of code. The first thing you must do is connect to the display server. This is done using the XOpenDisplay() function. Next you need to get the default screen for the display so you can tell the X server where to put your window. This is done using the DefaultScreen() macro. Notice that XLib functions all begin with a capital X, while XLib macros don’t. This is how you can distinguish between the two.

New windows are created with XCreateWindow() or XCreateSimpleWindow(). This is still not enough to display the window, however. There are two additional steps: first selecting the types of events the window will respond to (XSelectInput()) and then mapping the window, which draws it to the screen (XMapWindow()).

At this point it’s probably appropriate to explain the concept of the window hierarchy. Every screen in the X protocol has a tree structure consisting of all windows in that screen. The window at the top of the tree is called the root window. This window covers the entire screen, and everything else that goes on visually speaking happens within this window. Below the root window you have the top-level windows for the individual X11 clients. These are the windows that get decorated with titlebars and borders by the window manager. Below these windows you have further child windows, which include things like buttons, drop-down menus, etc. A window can only be visible where its parent is visible. If the parent window is resized so that the child window extends past the parent window’s border, the part of the child window that is outside the border is simply not drawn.
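The clipping rule is easy to model on paper. The following is only a conceptual sketch of it, not anything XLib exposes: Rect and visible_part() are made-up names, the child’s position is taken relative to the parent’s top-left corner (which is how X positions child windows), and window borders are ignored to keep things short.

```c
/* Hypothetical type: a rectangle in the parent's coordinate system. */
typedef struct { int x, y, width, height; } Rect;

static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Return the part of `child` that falls inside a parent of the given size.
   A result with width 0 and height 0 means nothing is visible. */
Rect visible_part(Rect child, int parent_width, int parent_height)
{
    int left   = max_int(child.x, 0);
    int top    = max_int(child.y, 0);
    int right  = min_int(child.x + child.width,  parent_width);
    int bottom = min_int(child.y + child.height, parent_height);

    Rect visible = { left, top, 0, 0 };
    if (right > left && bottom > top) {
        visible.width  = right - left;
        visible.height = bottom - top;
    }
    return visible;
}
```

For example, a 100x50 child placed at (150, 10) inside a 200x75 parent keeps only its left 50 pixels; the rest falls outside the parent and is never drawn.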

After drawing your initial windows, the next step is to send the program into an event loop. This is just an infinite loop that waits for an event to happen and then responds to it. Events are one of the main packet types used in the X protocol. They are initiated at the server side by the user and are then sent to the appropriate client, where they are queued in an event queue. During the event loop stage of its life cycle (which it spends the vast majority of its lifetime in) the XLib program blocks on this event queue until an event appears. It then pulls the event off the event queue, looks at its type, and then decides what to do with it based on the type. This would typically be implemented as a switch-case statement.

Events are encapsulated in the XEvent data type, which is actually a union type. The members of this union are all structs representing each of the different event types. All of these structs have a field called type, which is used to determine the type of event that took place and act accordingly.

One final concept that must be understood in order to properly comprehend the event loop is the concept of buffering in X11. The X protocol was built with networking in mind, so requests from the client to the server are buffered before being sent in order to reduce the number of packets and thus the congestion on the network. In order for the XMapWindow() function to actually take effect, the request buffer has to be flushed. There are three ways to do this: One way is to explicitly call a function to flush the buffer (either XFlush() or XSync()). The second way is to call a function that requires a reply from the server. The third way, which is the method used here, is to call a function that waits for an event on the event queue. To this end we call XNextEvent(), which has the double effect of flushing the request buffer as well as pulling the first event off the queue.

Typically the first event that comes off the event queue is an Expose event, that is, an XEvent whose type field is set to Expose. This event is sent to the client controlling a window whenever that window becomes visible. An Expose event also results any time part of a window that was previously obscured by another window is unobscured and brought to the front. The first Expose event is caught by a switch-case statement and is the program’s cue to start executing in its main phase of operation.

The following is a Hello World program for X11 that I wrote about a year ago when I was first attempting to learn XLib. It illustrates the steps needed to set up a window in X. I’ve modified it slightly so it uses switch-case instead of if statements so it can be more in line with the methodology I’m talking about here. I haven’t tested the modified version, but I don’t see why it wouldn’t work. Here it is:

// Hello World program in the X11 environment
// Use -lX11 to compile.

#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/Xos.h>

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

int main( int argc, char **argv ){

        // Setup: connect to the server named by $DISPLAY.
        Display *dpy = XOpenDisplay( NULL );
        if( dpy == NULL ){
                fprintf( stderr, "Cannot open display\n" );
                return 1;
        }
        int scr = DefaultScreen( dpy );
        // Arguments after the parent: x, y, width, height, border width, border color, background color.
        Window win = XCreateSimpleWindow( dpy, RootWindow( dpy, scr ), 0, 0, 200, 75, 1, BlackPixel( dpy, scr ), WhitePixel( dpy, scr ) );
        XSelectInput( dpy, win, ExposureMask );

        // Ask the window manager to send a ClientMessage when the window is closed.
        Atom wm_delete = XInternAtom( dpy, "WM_DELETE_WINDOW", False );
        XSetWMProtocols( dpy, win, &wm_delete, 1 );

        // Output:
        XMapWindow( dpy, win );
        XEvent e;
        while( true ){
                XNextEvent( dpy, &e );
                switch( e.type ){
                        case Expose: {
                                const char *s = "Hello, World!";
                                XDrawString( dpy, win, DefaultGC( dpy, scr ), 20, 20, s, (int) strlen( s ) );
                                break;
                        }
                        case ClientMessage:
                                if( (Atom) e.xclient.data.l[0] == wm_delete ) goto Cleanup;
                                break;
                }
        }

        Cleanup:
        XDestroyWindow( dpy, win );
        XCloseDisplay( dpy );

        return 0;
}

There are of course several other types of events that get sent in response to various user actions, provided the client has selected them with XSelectInput(). The ButtonPress and ButtonRelease events are sent to the client when a mouse button is pressed and released, respectively. Both of these event structs record the exact position the mouse pointer was at when the event occurred in the x and y fields (relative to the window the event is reported for) and in the fields x_root and y_root (relative to the root window). There are also analogous events for the keyboard: KeyPress and KeyRelease. These also have members x, y, x_root, and y_root, as well as an unsigned int member called keycode. The keycode identifies the physical key that was pressed; it is a server-dependent number rather than an ASCII code, and functions such as XLookupKeysym() or XLookupString() translate it into a key symbol or characters. A MotionNotify event is sent to the client when the user moves the mouse, as long as the client has selected pointer motion events.

I’m going to look at drawing now. There are basically two types of entities that can be drawn to: windows and pixmaps. A pixmap is an off-screen area of memory that holds an image that can be drawn to a window at any time. Windows and pixmaps both have color depth, which is the number of bits (called “planes” in XLib lingo) in the code for the color. Colors are managed using an entity called a colormap, which is an indexed table of RGB codes for colors. To select a color, the XLib program selects an index into the colormap. In order for a given pixmap to be copied to a given window, it must have the same color depth. Both the Window and Pixmap types can be used wherever the type Drawable is expected. (This OOP-like hierarchy is not implemented with struct-unions as with the event types: in XLib’s headers all three are typedefs for the same integer resource ID type, XID.) Functions that draw graphics always take a Drawable as an argument.

Functions that draw graphics to drawables are called graphics primitives. These include functions for drawing lines, arcs, polygons, and text. A picture is drawn by combining graphics primitives with a graphics context (GC). A graphics context is represented by a GC object. It includes rules for how graphics should be drawn, such as line thickness, fill color, the font to use for text, etc. Graphics primitives are combined with the graphics context and the colormap to compute the resulting colors of all the individual pixels in the drawable.

Every window in the X11 environment has a set of attributes, which is encapsulated in an XWindowAttributes structure. This structure is automatically populated with values when the window is created. You can’t use this structure directly. If you want to change a window’s attributes, you must first create an XSetWindowAttributes structure, populate it with the correct values, and then use the XChangeWindowAttributes() function with that structure as a parameter. You can query a window’s attributes with the XGetWindowAttributes() function.

Probably the most important attributes a window has are its width and height, collectively known as its geometry. You get the width and height with the XGetGeometry() function, which XGetWindowAttributes() uses as a backend. Of especial importance is the geometry of the root window, in other words, the width and height of the actual computer screen that the X windows are displayed on. You could get these by calling XGetGeometry() or XGetWindowAttributes() with the root window as the window argument, but there is a simpler way that doesn’t involve time-consuming client-server communications: the macros DisplayWidth() and DisplayHeight(), which give the width and height of a screen in pixels, and DisplayWidthMM() and DisplayHeightMM(), which give its width and height in millimeters. Each takes the display and a screen number. Dividing Display*() by Display*MM() tells you how many pixels fit in a millimeter along each axis, and dividing the width ratio by the height ratio tells you whether the pixels are perfectly square or oblong, and if so, how oblong they are. This lets you make sure geometric shapes on the screen don’t get warped by the display’s physical geometry.
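The arithmetic is simple enough to write down. This is a minimal sketch with function names I invented; the four numbers would come from DisplayWidth(), DisplayWidthMM(), DisplayHeight(), and DisplayHeightMM(). I return 0.0 when a millimeter size is missing, since I assume the reported physical size can’t always be trusted.

```c
/* Pixels per millimeter along one axis, or 0.0 if the size is unknown. */
double pixels_per_mm(int pixels, int millimeters)
{
    if (millimeters <= 0)
        return 0.0;
    return (double)pixels / (double)millimeters;
}

/* Horizontal density divided by vertical density.
   1.0 means square pixels; 0.0 means the ratio could not be computed. */
double pixel_aspect(int width_px, int width_mm, int height_px, int height_mm)
{
    double horizontal = pixels_per_mm(width_px, width_mm);
    double vertical   = pixels_per_mm(height_px, height_mm);
    if (horizontal == 0.0 || vertical == 0.0)
        return 0.0;
    return horizontal / vertical;
}
```

A 1920x1080 screen reported as 480mm x 270mm has 4 pixels per millimeter in both directions, so its aspect comes out as exactly 1.0. If the ratio were 2.0, a circle drawn with equal pixel radii would look twice as tall as it is wide.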

Another aspect of window geometry is the color depth. This becomes especially important when drawing pixmaps to windows, because the depth of the window and the depth of the pixmap must be the same. A pixmap can be stored in a file in raw binary form, meaning the index values for individual pixels are stored one after the other, left-to-right and top-to-bottom, with no zero-padding between them. Such a file starts with three integer values representing the width, height, and depth of the image; these are vital because this information cannot possibly be derived from the pixel data itself, so it must be stored separately. After the initial metadata comes the raw uncompressed pixel data. The format of such a file is very simple compared to, say, a JPEG. Sometimes pixmaps are stored in an ASCII format, where the alphabetical characters ‘a’ through ‘p’ are used to represent the values 0 through 15, or the sequences ‘aa’ through ‘pp’ represent the values 0 through 255. Storing pixmap files in this format makes them easier to edit by hand, but on the other hand it adds the extra step of translating them into raw binary form.

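To display an image file stored in a pixmap format, an XLib program first reads the bytes of the file into a char array, then feeds that array into a function that turns the data into a server-side pixmap that can be drawn on the screen. This function would be either XCreateBitmapFromData(), which creates a 1-bit-deep black-and-white pixmap (also called a bitmap), or XCreatePixmapFromBitmapData(), which creates a pixmap of a given depth. Note that both functions take 1-bit-per-pixel bitmap data: XCreatePixmapFromBitmapData() paints the 1 bits in a foreground color and the 0 bits in a background color, so the result still has only two colors.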
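Since both of those functions want bitmap data, it helps to see how such data is laid out. As far as I understand it, the layout is the one used by X bitmap (XBM) files: one bit per pixel, each row padded out to a whole byte, and the leftmost pixel of each byte in the least significant bit. The sketch below packs a plain array of 0/1 values into that layout. bitmap_row_bytes() and pack_bitmap() are hypothetical helpers of mine, not XLib functions.

```c
#include <stdlib.h>

/* Number of bytes one row occupies once it is padded to a whole byte. */
size_t bitmap_row_bytes(unsigned width)
{
    return (width + 7) / 8;
}

/* Pack width*height pixel values (zero or nonzero, one per byte, row by
   row) into bitmap data. Returns a malloc'd buffer the caller must free,
   or NULL if allocation fails. */
unsigned char *pack_bitmap(const unsigned char *pixels,
                           unsigned width, unsigned height)
{
    size_t row_bytes = bitmap_row_bytes(width);
    size_t total = row_bytes * height;
    unsigned char *bits = calloc(total ? total : 1, 1);
    if (bits == NULL)
        return NULL;

    for (unsigned y = 0; y < height; y++) {
        for (unsigned x = 0; x < width; x++) {
            if (pixels[(size_t)y * width + x]) {
                /* Pixel x of a row lives in byte x/8, bit x%8. */
                bits[y * row_bytes + x / 8] |= (unsigned char)(1u << (x % 8));
            }
        }
    }
    return bits;
}
```

The result can be handed to XCreateBitmapFromData( dpy, drawable, (char *) bits, width, height ). Note that a 10-pixel-wide image takes 2 bytes per row, not 10, which is easy to get wrong when sizing the buffer.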

The program below reads raw binary pixel data from a file and then uses that data to generate a Pixmap structure. It’s not very useful since it doesn’t actually do anything with that Pixmap, but it serves as a good illustration:

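#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/Xos.h>

int main( int argc, char **argv ){
        /* Setup: */
        int16_t width, height, depth;
        if( argc < 2 ){
                fprintf( stderr, "usage: %s FILE\n", argv[0] );
                return 1;
        }
        FILE *fp = fopen( argv[1], "rb" );
        if( fp == NULL ){
                perror( argv[1] );
                return 1;
        }
        Display *dpy = XOpenDisplay( NULL );
        if( dpy == NULL ){
                fprintf( stderr, "Cannot open display\n" );
                fclose( fp );
                return 1;
        }
        int s = DefaultScreen( dpy );
        int status = 1;
        char *pixel_data = NULL;
        /* Read metadata: three 2-byte integers in the machine's byte order. */
        if( fread( &width, 2, 1, fp ) != 1 || fread( &height, 2, 1, fp ) != 1 || fread( &depth, 2, 1, fp ) != 1 ) goto Cleanup;
        if( width <= 0 || height <= 0 || depth <= 0 ) goto Cleanup;
        /* XCreatePixmapFromBitmapData() expects 1 bit per pixel, each row padded to a whole byte. */
        size_t size = (size_t) ( ( width + 7 ) / 8 ) * (size_t) height;
        pixel_data = malloc( size );
        /* Read pixel data: */
        if( pixel_data == NULL || fread( pixel_data, 1, size, fp ) != size ) goto Cleanup;
        /* Generate Pixmap from read data (depth must be one the screen supports): */
        Pixmap pxm = XCreatePixmapFromBitmapData( dpy, RootWindow( dpy, s ), pixel_data,
        (unsigned int) width, (unsigned int) height, BlackPixel( dpy, s ), WhitePixel( dpy, s ), (unsigned int) depth );
        XFreePixmap( dpy, pxm );
        status = 0;
        /* Cleanup: */
        Cleanup:
        if( status != 0 ) fprintf( stderr, "Could not read pixmap data from %s\n", argv[1] );
        free( pixel_data );
        fclose( fp );
        XCloseDisplay( dpy );
        return status;
}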

