I think Apache needs no introduction to anyone with some knowledge of Linux, web servers, the internet or other digital-era inventions… Also, if you’re a little more tech-savvy, you might know that you can write Apache modules, which can do various things beyond the standard “give me the file” feature already built into Apache. The tutorial at http://threebit.net/tutorials/apache2_modules/tut1/tutorial1.html is a very good one: it introduces the basics of writing an Apache module, starting from almost zero. If you want to dig deeper, you can also read some chapters of the great book by Doug MacEachern and Lincoln Stein: “Writing Apache Modules with Perl and C”.
On the other hand, writing Apache modules is a pretty cumbersome task, and there are not too many resources available. I will try to share a few of my “Lessons Learned” on this blog, in case they help someone.
Lesson 1. – Creating a web server inside an Apache module.
Let’s suppose you are in the weird situation of needing a module that itself acts as a server (yes, don’t forget, you are already integrated into an existing server), so that clients can connect to it and exchange information. This might freak out a security expert, but let’s hope there isn’t one around to see this requirement. The first thing to take into consideration is that Apache doesn’t really like the standard BSD socket functions. They work, but in my experience they are unreliable inside a module, since the resources they allocate are not managed by the Apache runtime. (Yes, Apache manages its resources in so-called pools, meaning you don’t need to take care of memory management yourself if you decide to use them… and I highly recommend using these Apache pools. They’ll save you a lot of time.) To get your hands on the Apache Portable Runtime’s socket functions, please consult the following link: http://dev.ariel-networks.com/apr/apr-tutorial/html/apr-tutorial-13.html
This is the best that is out there right now, so just read it and understand it. But your problems are far from over. Remember the following: by default (the prefork setup), Apache serves each request from a separate process, because in the POSIX-flavored world of Unixes, working with processes was traditionally much easier than working with threads. So Apache gets the request from the remote browser and hands it to one of its child processes; that process has the configured modules loaded, does some work on the request and sends the result back to the client. Child processes come and go over the lifetime of the server. Let’s see what this means for your module: when Apache starts a new process, some callbacks are executed. Usually these register so-called hooks, which are then executed in the various phases of request processing. More details can be found at: http://httpd.apache.org/docs/2.0/developer/modules.html
Usually you would like to create the web server in the initialization phase of the module and keep it there for as long as Apache is up and running. But with the default configuration, a web server is created for each of the processes, and with some network programming background you will easily spot the bug: there can be only one web server listening on one port. Period. The solution is to configure Apache to run in a multi-threaded, single-process setup. This way you can create one web server, held in a global, static variable, that is shared between the various Apache threads.
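To see the “only one web server on one port” problem in isolation, here is a minimal sketch in plain C. It is not module code: it deliberately uses the standard BSD socket calls instead of the APR ones, purely so that it compiles without the Apache headers, and the function names (open_listener, bound_port, second_listener_errno) are made up for this illustration. It opens one listener on a loopback port chosen by the kernel, then tries to open a second listener on the same port, which is roughly what happens when a second Apache process initializes the same module.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a TCP listener on 127.0.0.1:port (port 0 lets the kernel choose).
 * Returns the socket descriptor, or -1 with errno set on failure. */
int open_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 8) < 0) {
        int saved = errno;      /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Return the local port a socket is bound to, or 0 on failure. */
unsigned short bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return 0;
    return ntohs(addr.sin_port);
}

/* Mimic two processes that both try to start the embedded server on the
 * same port. Returns the errno of the second attempt, 0 if the second
 * attempt unexpectedly succeeded, or -1 if even the first one failed. */
int second_listener_errno(void)
{
    int first = open_listener(0);
    if (first < 0)
        return -1;

    unsigned short port = bound_port(first);
    assert(port != 0);          /* a bound socket must report a real port */

    int second = open_listener(port);
    int err = (second < 0) ? errno : 0;

    if (second >= 0)
        close(second);
    close(first);
    return err;
}
```

Calling second_listener_errno() should give you EADDRINUSE: the first listener owns the port and the second one is refused. In a real module you would create the listener with the APR socket functions, but the limit of one listener per port is the same.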
Now you’re happy. You deploy your code, and it works flawlessly… for 3 or 4 hours. Then it suddenly crashes. Lots of debugging doesn’t bring you any closer to the solution. You read and re-read the code, put in lots of debug messages, and re-deploy. Great, it works for 7-8 hours, then it crashes again. Yup… in a multi-threaded setup, sharing a global object without synchronization is usually a bad idea. You add some cleverly placed mutexes, and finally your code works as it’s supposed to. Don’t forget to use Apache’s own mutex mechanism (the apr_thread_mutex_* functions of the Apache Portable Runtime).
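The idea of guarding the shared global with a mutex can be sketched in a few lines of C. Note the assumptions: this sketch uses plain POSIX threads (pthread_mutex_t) so that it compiles on its own, whereas inside a real module you would use the APR equivalents (apr_thread_mutex_create, apr_thread_mutex_lock, apr_thread_mutex_unlock) together with a pool. The shared_server_t struct, its connections counter and the WORKERS and ROUNDS constants are invented for this example and merely stand in for whatever state your embedded server keeps.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define WORKERS 8        /* hypothetical number of worker threads   */
#define ROUNDS  100000   /* hypothetical number of updates per thread */

/* Stand-in for the shared embedded server: one global object that
 * every worker thread touches. */
typedef struct {
    pthread_mutex_t lock;   /* guards every field below */
    long connections;       /* shared state mutated by all threads */
} shared_server_t;

static shared_server_t g_server = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Each worker updates the shared state only while holding the lock. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&g_server.lock);
        g_server.connections++;
        pthread_mutex_unlock(&g_server.lock);
    }
    return NULL;
}

/* Start WORKERS threads, wait for them, and return the final counter,
 * or -1 if not all threads could be started. */
long run_workers(void)
{
    pthread_t threads[WORKERS];
    int started = 0;

    g_server.connections = 0;   /* no worker is running yet */

    for (int i = 0; i < WORKERS; i++) {
        if (pthread_create(&threads[i], NULL, worker, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        int rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
        (void)rc;               /* silence unused warning under NDEBUG */
    }
    return (started == WORKERS) ? g_server.connections : -1;
}
```

With the lock in place, run_workers() returns exactly WORKERS * ROUNDS. If you remove the lock and unlock calls, the increments race with each other and the total may come up short, which is the same class of bug as the crash after a few hours described above.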