Event-driven programming is an advanced way of organizing programs around I/O channels. It is best explained by an example: suppose you want to read from a pipeline, convert all arriving lowercase letters to their corresponding uppercase letters, and finally write the result into a second pipeline.
A conventional solution works as follows: A number of bytes is read from the input pipeline into a buffer, converted, and then written into the output pipeline. Because we do not know at the beginning how many bytes will arrive, we do not know how big the buffer must be to store all bytes; so we simply decide to repeat the whole read/convert/write cycle until the end of input is signaled.
In O'Caml code:
let buffer_length = 1024 in
let buffer = String.create buffer_length in
try
  while true do
    (* Read up to buffer_length bytes into the buffer: *)
    let n = Unix.read Unix.stdin buffer 0 buffer_length in
    (* If n = 0, the end of input is reached. Otherwise we have
     * read n bytes. *)
    if n = 0 then raise End_of_file;
    (* Convert: *)
    let buffer' = String.uppercase (String.sub buffer 0 n) in
    (* Write the buffer' contents: *)
    let m = ref 0 in
    while !m < n do
      m := !m + Unix.write Unix.stdout buffer' !m (n - !m)
    done
  done
with
  End_of_file -> ()
The input and output pipelines may be connected with arbitrary other pipeline endpoints, which may be arbitrarily slow. Because of this, two interesting phenomena can occur. First, it is possible that the Unix.read system call returns fewer than buffer_length bytes even though the end of the data stream is still far away. The reason might be that the pipeline works across a network connection, and that a network packet with fewer than buffer_length bytes has just arrived. In this case, the operating system may decide to forward this packet to the application as soon as possible (but it is free not to do so). The same can happen when Unix.write is called; this is why the inner while loop invokes Unix.write repeatedly until all bytes are actually written.
Nevertheless, Unix.read guarantees to read at least one byte (unless the end of the stream is reached), and Unix.write always writes at least one byte. But what happens if there is currently no byte available? In this case, the second phenomenon occurs: the program stops until at least one byte is available; this is called blocking.
Suppose the output pipeline is very fast and the input pipeline is rather slow. In this case, blocking slows the program down to the rate at which the input pipeline delivers data.
Now suppose both pipelines are slow. The program may block waiting for input while the output pipeline would accept data; or it may block waiting for the output side to become ready while input bytes have already arrived that cannot be read in because the program blocks. In these cases, the program runs much more slowly than it could if it reacted to I/O possibilities in an optimal way.
The operating system indicates the I/O possibilities through the Unix.select system call. It works as follows: we pass lists of file descriptors on which we want to react. Unix.select also blocks, but the program continues as soon as one of the file descriptors is ready to perform I/O. Furthermore, we can pass a timeout value.
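Unix.select can be tried out in isolation. The following self-contained sketch (using the Bytes type of newer OCaml versions in place of the mutable strings used in this text) creates a pipe, writes one byte into it, and asks select which descriptors are ready; since a byte is pending and the pipe buffer has room, both result lists come back non-empty:

```ocaml
(* Demonstration of Unix.select: create a pipe, make one byte pending
 * on the read end, then ask which descriptors are ready. select
 * returns immediately because both conditions already hold. *)
let () =
  let r, w = Unix.pipe () in
  ignore (Unix.write w (Bytes.of_string "x") 0 1);
  let readable, writable, _ =
    Unix.select [ r ] [ w ] [] 1.0 in
  assert (List.mem r readable);   (* one byte is pending *)
  assert (List.mem w writable);   (* the pipe buffer has room *)
  Unix.close r;
  Unix.close w;
  print_endline "select reported both descriptors ready"
```

If the pipe were empty, the 1.0 would act as a timeout in seconds: select would block for at most one second and then return an empty readable list.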
Here is the improved program:
let buffer_length = 1024 in
let in_buffer = String.create buffer_length in
let out_buffer = String.create buffer_length in
let out_buffer_length = ref 0 in
let end_of_stream = ref false in
let waiting_for_input = ref true in
let waiting_for_output = ref false in
while !waiting_for_input || !waiting_for_output do
  (* If !waiting_for_input, we are interested whether input arrives.
   * If !waiting_for_output, we are interested whether output is
   * possible. *)
  let (in_fd, out_fd, oob_fd) =
    Unix.select
      (if !waiting_for_input then [ Unix.stdin ] else [])
      (if !waiting_for_output then [ Unix.stdout ] else [])
      []
      (-.1.0)
  in
  (* If in_fd is non-empty, input is immediately possible and will
   * not block. *)
  if in_fd <> [] then begin
    (* How many bytes we can read in depends on the amount of
     * free space in the output buffer. *)
    let n = buffer_length - !out_buffer_length in
    assert (n > 0);
    let n' = Unix.read Unix.stdin in_buffer 0 n in
    end_of_stream := (n' = 0);
    (* Convert the bytes, and append them to the output buffer. *)
    let converted = String.uppercase (String.sub in_buffer 0 n') in
    String.blit converted 0 out_buffer !out_buffer_length n';
    out_buffer_length := !out_buffer_length + n'
  end;
  (* If out_fd is non-empty, output is immediately possible and
   * will not block. *)
  if out_fd <> [] then begin
    (* Try to write !out_buffer_length bytes. *)
    let n' = Unix.write Unix.stdout out_buffer 0 !out_buffer_length in
    (* Remove the written bytes from out_buffer: *)
    String.blit out_buffer n' out_buffer 0 (!out_buffer_length - n');
    out_buffer_length := !out_buffer_length - n'
  end;
  (* Now find out which event is interesting next: *)
  waiting_for_input :=                  (* Input is interesting if... *)
    not !end_of_stream &&               (* ...we are before the end *)
    !out_buffer_length < buffer_length; (* ...there is space in the out buffer *)
  waiting_for_output :=                 (* Output is interesting if... *)
    !out_buffer_length > 0              (* ...there is material to output *)
done
Most importantly, we must now track the states of the I/O connections ourselves. The variable end_of_stream stores whether the end of the input stream has been reached. waiting_for_input stores whether we are ready to accept input data; we can only accept input if there is space in the output buffer. The variable waiting_for_output indicates whether we have data to output. In the previous program, these states were implicitly encoded by the "program counter", i.e. by which statement was to be executed next: after Unix.read we knew that we had data to output; after Unix.write we knew that there was again space in the buffer. Now these states must be stored explicitly in variables, because the structure of the program no longer carries this information.
This program is already an example of event-driven programming. We have two possible events: "input arrived" and "output is possible". The Unix.select statement is the event source: it produces a sequence of events. There are two resources that cause the events, namely the two file descriptors. We have two event handlers: the statements after if in_fd <> [] then form the input event handler, and the statements after if out_fd <> [] then form the output event handler.
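The dispatch pattern just described can be sketched in a few lines of plain OCaml. The type and function names here (event, add_handler, add_event, run) are made up for illustration and are not the Equeue interface:

```ocaml
(* A toy event queue, illustrating the concepts only. Handlers return
 * true if they consumed the event, false otherwise. *)
type event = Input_arrived of string | Output_possible

let handlers : (event -> bool) list ref = ref []
let queue : event Queue.t = Queue.create ()

let add_handler h = handlers := !handlers @ [ h ]
let add_event e = Queue.add e queue

(* Dispatch every queued event to the first handler that accepts it. *)
let run () =
  while not (Queue.is_empty queue) do
    let e = Queue.take queue in
    ignore (List.exists (fun h -> h e) !handlers)
  done

let () =
  add_handler
    (function
     | Input_arrived s -> print_endline ("got input: " ^ s); true
     | _ -> false);
  add_handler
    (function
     | Output_possible -> print_endline "can write now"; true
     | _ -> false);
  add_event (Input_arrived "hello");
  add_event Output_possible;
  run ()
```

In the copying program above, the "event source" role is played by Unix.select instead of a pre-filled queue, but the handler-dispatch structure is the same.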
The Equeue module now provides these concepts as abstractions you can program with. It is a general-purpose event queue that allows you to specify an arbitrary event source, manages event handlers, and defines how events are delivered to the handlers that can process them. The Unixqueue module is a layer above Equeue that deals with file descriptor events. It already contains an event source that generates file descriptor events using the Unix.select system call, and it provides a way to manage file descriptor resources.
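As a taste of what this looks like, here is the toy dispatcher from above expressed with Equeue. This is a sketch only, assuming the core Equeue operations (Equeue.create, Equeue.add_handler, Equeue.add_event, Equeue.run, and the Equeue.Reject exception for declining an event); see the Equeue interface documentation for the authoritative signatures:

```ocaml
(* Sketch: dispatching two kinds of events through an Equeue event
 * system. Assumes the equeue package is installed. *)
type event = Input_arrived of string | Output_possible

let () =
  (* The source function is called whenever the queue runs empty; this
   * one adds nothing, so Equeue.run returns once all events are
   * processed. *)
  let esys = Equeue.create (fun _ -> ()) in
  (* A handler raises Equeue.Reject for events it does not handle, so
   * the event is passed on to the next handler. *)
  Equeue.add_handler esys
    (fun _ e ->
       match e with
       | Input_arrived s -> print_endline ("got input: " ^ s)
       | _ -> raise Equeue.Reject);
  Equeue.add_handler esys
    (fun _ e ->
       match e with
       | Output_possible -> print_endline "can write now"
       | _ -> raise Equeue.Reject);
  Equeue.add_event esys (Input_arrived "hello");
  Equeue.add_event esys Output_possible;
  Equeue.run esys
```

The event type is entirely user-defined here; Unixqueue, in contrast, fixes the event type to file descriptor events and supplies the select-based source itself.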
The Unixqueue abstraction in particular is an interesting link between the operating system and components offering services on file descriptors. For example, it is possible to create one event queue, attach several independent components to it, and run these components in parallel. Consider, for instance, an HTTP proxy. Such proxies accept connections and forward them to the service that can best deal with the arriving requests; these services are typically a disk cache, an HTTP client, and an FTP client. Using the Unixqueue model, you can realize this constellation by creating one event queue and attaching the services to it; the services can be programmed and tested independently, and they communicate either directly with the outer world or with other components only by putting events onto the queue and receiving events from it.