[hunchentoot-devel] Hunchentoot Request Processing

Sun Apr 17 16:25:38 UTC 2011

[Before diving in to try to "fix" some things in the way hunchentoot handles requests, I figured it would be good to try to more fully understand how hunchentoot handles incoming requests, so I decided to take some notes on the whole process. I thought these might be food for thought or discussion for the list, so here they are - cyrus]

Hunchentoot has a flexible and extensible model for managing connections between the hunchentoot server and its clients. In olden times, there was a hunchentoot SERVER class which received incoming requests and dispatched these requests to handlers that would produce the appropriate responses. In recent versions of hunchentoot, this functionality has been split into a few different classes, or families of classes. The ACCEPTOR and TASKMASTER classes (or, more precisely, they and their subclasses) now work together to listen for incoming connections and to dispatch them appropriately.

The logic of the TASKMASTER and ACCEPTOR classes is somewhat baroque, although this is probably not without good reason. Let's walk through a typical process of starting up hunchentoot. 

* Initializing the Server

1. Create an Acceptor and a Taskmaster

Usually, this is done in one fell swoop with a call such as:

    (make-instance 'hunchentoot:acceptor)

This creates an acceptor that listens for incoming connections on a given port, 80 by default. There's an important detail here that can usually be ignored by the user, which is that the acceptor needs a taskmaster to manage the process of actually receiving connections from the lower level networking infrastructure. The reference to the taskmaster is stored in the ACCEPTOR's TASKMASTER slot and the value of this slot is usually provided directly by the default-initargs section of the ACCEPTOR class definition which reads:

    :taskmaster (make-instance (cond (*supports-threads-p* 'one-thread-per-connection-taskmaster)
                                     (t 'single-threaded-taskmaster)))

So, when running on a suitably threaded lisp implementation, a ONE-THREAD-PER-CONNECTION-TASKMASTER is created, or on a single-threaded lisp a SINGLE-THREADED-TASKMASTER is created, and this value is stored in the TASKMASTER slot of the ACCEPTOR.
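
To see the default-initargs mechanism in isolation, here is a minimal sketch in plain CLOS. The TOY- class names and *TOY-SUPPORTS-THREADS-P* are made-up stand-ins for illustration only; this is not hunchentoot's own class definition, just the same pattern in miniature:

```lisp
;; Toy model only: every name here is a hypothetical stand-in, not
;; hunchentoot's own definition.
(defvar *toy-supports-threads-p* t
  "Stand-in for hunchentoot's *SUPPORTS-THREADS-P*.")

(defclass toy-taskmaster () ())
(defclass toy-single-threaded-taskmaster (toy-taskmaster) ())
(defclass toy-one-thread-per-connection-taskmaster (toy-taskmaster) ())

(defclass toy-acceptor ()
  ((port :initarg :port :reader toy-acceptor-port)
   (taskmaster :initarg :taskmaster :reader toy-acceptor-taskmaster))
  ;; A default initarg form is evaluated each time MAKE-INSTANCE is called
  ;; without that initarg, so every acceptor gets its own fresh taskmaster.
  (:default-initargs
   :port 80
   :taskmaster (make-instance
                (cond (*toy-supports-threads-p*
                       'toy-one-thread-per-connection-taskmaster)
                      (t 'toy-single-threaded-taskmaster)))))

;; The default can still be overridden by passing :TASKMASTER explicitly.
(defun make-toy-single-threaded-acceptor (&optional (port 8080))
  (make-instance 'toy-acceptor
                 :port port
                 :taskmaster (make-instance 'toy-single-threaded-taskmaster)))
```

The point of the sketch is that the user never has to mention a taskmaster unless they want a non-default one, in which case they supply the initarg themselves.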

2. ACCEPTOR / START method

Once the ACCEPTOR has been created and its TASKMASTER slot suitably initialized, the START method of the ACCEPTOR is called. START then calls the ACCEPTOR's START-LISTENING method.

3. ACCEPTOR / START-LISTENING method

The START-LISTENING method tells the underlying networking infrastructure to listen for incoming connections on a (single) port. On lispworks, the acceptor starts a "server" process with COMM:START-UP-SERVER and then stops this process, leaving it ready to be restarted in ACCEPT-CONNECTIONS. On non-lispworks lisps, this method sets the LISTEN-SOCKET slot to a USOCKET:STREAM-SERVER-USOCKET as returned by USOCKET:SOCKET-LISTEN.

4. Establish reciprocal connection from TASKMASTER back to ACCEPTOR

Back in the ACCEPTOR's START method, and now that the ACCEPTOR is listening for traffic, we establish the link from the TASKMASTER back to this ACCEPTOR (notice that there is a tight 1:1 coupling between ACCEPTORs and TASKMASTERs, although the decoupling of the two classes allows for subclassing of one independently from the other -- a good thing!).

5. TASKMASTER / EXECUTE-ACCEPTOR method

The TASKMASTER's EXECUTE-ACCEPTOR doesn't do a lot of work: it turns around and calls the ACCEPTOR's ACCEPT-CONNECTIONS method. The critical bit here is that the (particular subclass of) TASKMASTER is free to spawn a new thread in which to call ACCEPT-CONNECTIONS (as is done in the ONE-THREAD-PER-CONNECTION-TASKMASTER), or it can call it in the current thread, i.e. the one in which START was called (as in the SINGLE-THREADED-TASKMASTER).

6. ACCEPTOR / ACCEPT-CONNECTIONS

ACCEPT-CONNECTIONS either listens for input using USOCKET:WAIT-FOR-INPUT on non-lispworks lisps, or, on lispworks, wakes the server process using MP:PROCESS-UNSTOP. On non-lispworks lisps, once the stream is ready for input, ACCEPT-CONNECTIONS calls its TASKMASTER's HANDLE-INCOMING-CONNECTION method. On Lispworks, the ACCEPTOR's TASKMASTER's HANDLE-INCOMING-CONNECTION method is called from the callback provided to COMM:START-UP-SERVER. In either case, we transfer control back to the taskmaster once we have a connection ready to provide input.
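
The hand-offs in steps 2 through 6 can be sketched as a toy protocol. Everything below is a hypothetical model (the DEMO- names, the fake socket, the event log); no real socket is opened and no real thread is created, so it only shows who calls whom and in what order:

```lisp
;; Toy model of steps 2-6. All names are hypothetical; the "socket" is just
;; a keyword and thread creation is only recorded, not performed.
(defvar *demo-log* '() "Record of protocol calls, most recent first.")
(defun demo-note (event) (push event *demo-log*))

(defclass demo-taskmaster ()
  ((acceptor :accessor demo-taskmaster-acceptor :initform nil)))
(defclass demo-single-threaded-taskmaster (demo-taskmaster) ())
(defclass demo-threaded-taskmaster (demo-taskmaster) ())

(defclass demo-acceptor ()
  ((taskmaster :initarg :taskmaster :reader demo-acceptor-taskmaster)
   (listen-socket :accessor demo-acceptor-listen-socket :initform nil)))

(defgeneric demo-start-listening (acceptor))
(defgeneric demo-accept-connections (acceptor))
(defgeneric demo-execute-acceptor (taskmaster))

(defmethod demo-start-listening ((acceptor demo-acceptor))
  (demo-note :start-listening)
  (setf (demo-acceptor-listen-socket acceptor) :fake-listen-socket))

(defmethod demo-accept-connections ((acceptor demo-acceptor))
  (demo-note :accept-connections))

(defmethod demo-execute-acceptor ((taskmaster demo-single-threaded-taskmaster))
  ;; Runs the accept loop in the caller's own thread.
  (demo-note :execute-in-current-thread)
  (demo-accept-connections (demo-taskmaster-acceptor taskmaster)))

(defmethod demo-execute-acceptor ((taskmaster demo-threaded-taskmaster))
  ;; A real implementation would create a thread here; this sketch records
  ;; that decision and then simply runs the thunk inline.
  (demo-note :execute-in-new-thread)
  (funcall (lambda ()
             (demo-accept-connections (demo-taskmaster-acceptor taskmaster)))))

(defun demo-start (acceptor)
  (let ((taskmaster (demo-acceptor-taskmaster acceptor)))
    (demo-start-listening acceptor)                        ; step 3
    (setf (demo-taskmaster-acceptor taskmaster) acceptor)  ; step 4
    (demo-execute-acceptor taskmaster)                     ; steps 5 and 6
    acceptor))
```

Calling DEMO-START on an acceptor built with either taskmaster class and then looking at (reverse *demo-log*) shows the call order; the only difference between the two taskmasters is the EXECUTE-ACCEPTOR entry.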

* Processing Requests

7a. TASKMASTER / HANDLE-INCOMING-CONNECTION

Once again, the TASKMASTER comes into play, as we have different behavior for single- and multi-threaded taskmasters, and, of course, different behavior for lispworks and non-lispworks lisps.

[n.b. The reader conditionals for lispworks/non-lispworks seem a bit bogus to me. Is there any reason the bordeaux/usocket code couldn't be run on lispworks? If the bordeaux/usocket can run on lispworks, are there advantages to the lispworks-specific code? Could/should the lispworks specific code be split off into, say, LISPWORKS-ONE-THREAD-PER-CONNECTION-TASKMASTER and the usocket version into USOCKET-ONE-THREAD-PER-CONNECTION-TASKMASTER? Clearly, the lispworks-specific stuff shouldn't be compiled on non-lispworks implementations (and maybe vice versa), but it seems that a more flexible mechanism for deciding to use the lispworks-stuff or the usocket-stuff on lispworks would be a nice feature.]

7b. (with ONE-THREAD-PER-CONNECTION-TASKMASTER taskmasters) TASKMASTER / CREATE-REQUEST-HANDLER-THREAD

The ONE-THREAD-PER-CONNECTION-TASKMASTER's CREATE-REQUEST-HANDLER-THREAD spawns a new thread and calls its ACCEPTOR's PROCESS-CONNECTION method.

With SINGLE-THREADED-TASKMASTER, HANDLE-INCOMING-CONNECTION calls the ACCEPTOR's PROCESS-CONNECTION method directly.
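
The difference between the two taskmasters in steps 7a and 7b can be sketched as follows. The CONN- names and the SPAWN slot are hypothetical, invented for this illustration; the default spawn function just calls the thunk in the current thread, so the sketch runs on any lisp, threaded or not:

```lisp
;; Toy model of steps 7a/7b. All names are hypothetical stand-ins.
(defvar *conn-log* '() "Record of events, most recent first.")

(defclass conn-acceptor () ())

(defclass conn-taskmaster ()
  ((acceptor :initarg :acceptor :reader conn-taskmaster-acceptor)))
(defclass conn-single-threaded-taskmaster (conn-taskmaster) ())
(defclass conn-threaded-taskmaster (conn-taskmaster)
  ;; SPAWN takes a thunk. The default, #'FUNCALL, runs it inline; with a
  ;; thread library loaded one could pass something like
  ;; BORDEAUX-THREADS:MAKE-THREAD instead.
  ((spawn :initarg :spawn :initform #'funcall :reader conn-taskmaster-spawn)))

(defgeneric conn-process-connection (acceptor socket))
(defgeneric conn-handle-incoming-connection (taskmaster socket))
(defgeneric conn-create-request-handler-thread (taskmaster socket))

(defmethod conn-process-connection ((acceptor conn-acceptor) socket)
  (push (list :processed socket) *conn-log*))

(defmethod conn-handle-incoming-connection
    ((taskmaster conn-single-threaded-taskmaster) socket)
  ;; No thread: process the connection right here.
  (conn-process-connection (conn-taskmaster-acceptor taskmaster) socket))

(defmethod conn-handle-incoming-connection
    ((taskmaster conn-threaded-taskmaster) socket)
  (conn-create-request-handler-thread taskmaster socket))

(defmethod conn-create-request-handler-thread
    ((taskmaster conn-threaded-taskmaster) socket)
  (push (list :spawned socket) *conn-log*)
  (funcall (conn-taskmaster-spawn taskmaster)
           (lambda ()
             (conn-process-connection (conn-taskmaster-acceptor taskmaster)
                                      socket))))
```

Either way the connection ends up in the acceptor's PROCESS-CONNECTION; the taskmaster only decides in which thread that happens.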

8. ACCEPTOR / PROCESS-CONNECTION

Now control switches back to the ACCEPTOR which initializes the connection stream and calls PROCESS-REQUEST on a newly created instance of the ACCEPTOR's REQUEST-CLASS.

One important thing that happens in PROCESS-CONNECTION is that INITIALIZE-CONNECTION-STREAM is called with the acceptor and a USOCKET:SOCKET-STREAM (as provided by the compatibility function make-socket-stream). One reason that this is important is that this is the "hook" that the SSL-ACCEPTOR uses to make the connection stream a CL+SSL:SSL-SERVER-STREAM. In fact, this and ACCEPTOR-SSL-P are the only methods specialized by the SSL-ACCEPTOR. This seems to be a pretty clean place to try to break the SSL functionality off into another class that is not a subclass of ACCEPTOR. Whether or not this is a good idea remains to be seen.
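
The shape of that hook can be sketched in a few lines of CLOS. The HOOK- names are hypothetical, and the "SSL stream" is faked as a tagged list rather than a real CL+SSL stream; the sketch only shows how a subclass method can wrap the stream before the default method hands it back:

```lisp
;; Toy model of the stream-initialization hook. Names are hypothetical and
;; streams are represented by arbitrary objects, not real sockets.
(defclass hook-acceptor () ())
(defclass hook-ssl-acceptor (hook-acceptor) ())

(defgeneric hook-acceptor-ssl-p (acceptor)
  (:method ((acceptor hook-acceptor)) nil)
  (:method ((acceptor hook-ssl-acceptor)) t))

(defgeneric hook-initialize-connection-stream (acceptor stream))

(defmethod hook-initialize-connection-stream ((acceptor hook-acceptor) stream)
  ;; Default: a no-op that returns the stream unchanged.
  stream)

(defmethod hook-initialize-connection-stream ((acceptor hook-ssl-acceptor) stream)
  ;; Stand-in for wrapping the raw stream in an SSL server stream, then
  ;; letting the default method finish the job.
  (call-next-method acceptor (list :ssl-wrapped stream)))

(defun hook-process-connection (acceptor raw-stream)
  ;; The rest of request processing only ever sees the initialized stream.
  (let ((stream (hook-initialize-connection-stream acceptor raw-stream)))
    (list :serving stream)))
```

Because the rest of the toy PROCESS-CONNECTION never looks at the acceptor's class, the wrapping step is the only place where the two acceptors differ.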

Q: Why do we set *CLOSE-HUNCHENTOOT-STREAM* to T inside the loop here?

Q: If we created a new class to implement INITIALIZE-CONNECTION-STREAM, what would it be called?

9. REQUEST / PROCESS-REQUEST

Having initialized the connection stream (usually a no-op, except for SSL connections), we now move to the next task at hand, processing the actual request. This is done via an unspecialized method on the PROCESS-REQUEST generic function, which happens to live in the request.lisp file, but doesn't actually specialize on request.

PROCESS-REQUEST sets up some error handlers, calls HANDLE-REQUEST and then starts (and maybe finishes) sending the HTTP reply.
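
In outline, that is a handler wrapped around the call, followed by writing out whatever came back. The following is a deliberately simplified sketch with hypothetical SKETCH- names; the request is just a URI string and the "reply" is a status line plus a body, so it leaves out most of what the real PROCESS-REQUEST does:

```lisp
;; Simplified sketch of "set up handlers, call the handler, send the reply".
;; All names are hypothetical; requests are plain strings.
(defun sketch-handle-request (request)
  (if (string= request "/boom")
      (error "handler blew up")
      (format nil "You asked for ~A" request)))

(defun sketch-process-request (request &optional (out *standard-output*))
  (multiple-value-bind (status body)
      ;; Any ERROR signalled by the handler becomes a 500 reply instead of
      ;; unwinding out of request processing.
      (handler-case (values 200 (sketch-handle-request request))
        (error (condition)
          (values 500 (format nil "Error: ~A" condition))))
    ;; "Send" the reply: a status line followed by the body.
    (format out "~D~%~A~%" status body)
    status))
```

For example, (sketch-process-request "/index") writes a 200 reply to standard output, while (sketch-process-request "/boom") writes a 500 reply and returns normally.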

10. ACCEPTOR / HANDLE-REQUEST

The HANDLE-REQUEST method specializes on acceptor and request as the first two arguments and, with the appropriate error handling forms in place, turns around and calls the ACCEPTOR's ACCEPTOR-DISPATCH-REQUEST method.

11. ACCEPTOR / ACCEPTOR-DISPATCH-REQUEST

The ACCEPTOR-DISPATCH-REQUEST method is the one place in the new hunchentoot source where user-extensible request handling is demonstrated. The new, so-called, easy-handler extends the ACCEPTOR class as EASY-ACCEPTOR and defines an ACCEPTOR-DISPATCH-REQUEST specialized on EASY-ACCEPTOR as its first argument. This method is called in preference to the ACCEPTOR's version (for EASY-ACCEPTORs anyway) and, in this case, loops over the values in the *dispatch-table* special variable, funcalling each function on the list with the request as its single argument. The first dispatcher to return a handler (rather than NIL) wins, and that handler is then called to produce the reply.
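
A dispatch-table loop of this kind can be sketched without hunchentoot at all. In the toy version below the "request" is just a URI string, and *SKETCH-DISPATCH-TABLE*, SKETCH-PREFIX-DISPATCHER and SKETCH-DISPATCH are hypothetical names chosen for this illustration:

```lisp
;; Toy dispatch table: a list of functions that take a URI and return
;; either a zero-argument handler or NIL. All names are hypothetical.
(defvar *sketch-dispatch-table* '())

(defun sketch-prefix-dispatcher (prefix handler)
  "Return a dispatcher that yields HANDLER when the URI starts with PREFIX."
  (lambda (uri)
    (let ((mismatch (mismatch prefix uri)))
      ;; MISMATCH is NIL when the strings are equal, or the index in PREFIX
      ;; where they first differ; PREFIX matched fully if that index is its end.
      (when (or (null mismatch) (>= mismatch (length prefix)))
        handler))))

(defun sketch-dispatch (uri &optional (default (lambda () :not-found)))
  "Call the handler from the first dispatcher that accepts URI, else DEFAULT."
  (loop for dispatcher in *sketch-dispatch-table*
        for handler = (funcall dispatcher uri)
        when handler return (funcall handler)
        finally (return (funcall default))))

;; Example table: order matters, the first matching dispatcher wins.
(setf *sketch-dispatch-table*
      (list (sketch-prefix-dispatcher "/hello" (lambda () :hello))
            (sketch-prefix-dispatcher "/" (lambda () :root))))
```

With this table, (sketch-dispatch "/hello/world") is answered by the first dispatcher, and anything else starting with "/" falls through to the second. The DEFAULT argument plays the role of falling back to the plain ACCEPTOR's behavior when nothing in the table matches.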

* Some final comments

While this entire process is modularized, somewhat flexible and extensible, in practice it's a bit baroque and a bit challenging to figure out exactly where and how to extend the functionality of the core server. As a case in point, how is one supposed to create an acceptor that is both an easy-acceptor and works over SSL connections? Make a new ACCEPTOR subclass that extends both EASY-ACCEPTOR and SSL-ACCEPTOR? I don't see why that wouldn't work, but it also seems somehow wrong. It seems to me that there should be a way to compose these kinds of functionalities without triggering a combinatorial subclass explosion.
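
For what it's worth, the multiple-inheritance route does behave as one would hope in plain CLOS, at least in a toy model. The MIX- classes below are hypothetical stand-ins (not the hunchentoot classes); each "feature" subclass specializes a different generic function, and the combined class picks up both:

```lisp
;; Toy model of combining two acceptor "features" by multiple inheritance.
;; All names are hypothetical stand-ins.
(defclass mix-acceptor () ())
(defclass mix-easy-acceptor (mix-acceptor) ())
(defclass mix-ssl-acceptor (mix-acceptor) ())
(defclass mix-easy-ssl-acceptor (mix-easy-acceptor mix-ssl-acceptor) ())

(defgeneric mix-dispatch-request (acceptor request)
  (:method ((acceptor mix-acceptor) request)
    (list :default-dispatch request))
  (:method ((acceptor mix-easy-acceptor) request)
    (list :easy-dispatch request)))

(defgeneric mix-initialize-connection-stream (acceptor stream)
  (:method ((acceptor mix-acceptor) stream)
    stream)
  (:method ((acceptor mix-ssl-acceptor) stream)
    (call-next-method acceptor (list :ssl-wrapped stream))))

(defun mix-serve (acceptor raw-stream request)
  "Run both hooks and return what each one produced."
  (list (mix-initialize-connection-stream acceptor raw-stream)
        (mix-dispatch-request acceptor request)))
```

This works here because the two features specialize disjoint generic functions. The subclass-explosion worry still stands: every new pair of features needs its own combined class.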

The initialization of the connection stream and the actual process of handling the request seem like good candidates for further modularization. The challenge is to figure out a nice way to do this.

* Some ideas for new classes

1. REQUEST-PROCESSOR

2. SOCKET-CONNECTOR