All filters (excluding the core filter that reads from the network and the core filter that writes to it) block at least once when invoked. An input filter blocks when it requests the bucket brigade from the upstream filter; an output filter blocks when it passes the bucket brigade to the next filter.

Input and output filters differ in the ways they acquire the bucket brigades (which include the data that they filter). Although the difference can't be seen when a streaming API is used, it's important to understand how things work underneath.

When an input filter is invoked, it first asks the upstream filter for the next bucket brigade (using the get_brigade( ) call). That upstream filter in turn asks the next upstream filter in the chain for the bucket brigade, and so on, until the last filter (called core_in), which reads from the network, is reached. The core_in filter reads a portion of the incoming data from the network socket, processes it, and sends it to its downstream filter, which processes the data and sends it to its own downstream filter, and so on, until the data reaches the very first filter that asked for it. (In reality, some other handler, such as the HTTP response handler or a protocol module, triggers the request for the bucket brigade, but for our discussion it's good enough to assume that it's the first filter that issues the get_brigade( ) call.)
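The pull-style control flow described above can be sketched with a small toy model. This is not the Apache or mod_perl API: the classes, the `log` list, and the transforms applied by each filter are hypothetical, and a "bucket brigade" is modelled as a plain list of byte strings. The sketch only aims to show that each filter's get_brigade call does not return until every upstream filter has finished.

```python
# Toy model of the input filter chain: NOT the Apache or mod_perl API.
# Each filter pulls data from its upstream neighbour and waits for the result.

class CoreIn:
    """Stands in for core_in, the last filter, which reads from the network."""

    def __init__(self, chunks, log):
        self.chunks = list(chunks)  # pretend these portions arrive on a socket
        self.log = log

    def get_brigade(self):
        self.log.append("core_in: read from socket")
        # A "brigade" is modelled as a list of byte-string "buckets".
        return [self.chunks.pop(0)] if self.chunks else []


class InputFilter:
    """A filter that transforms whatever its upstream filter hands back."""

    def __init__(self, name, transform, upstream, log):
        self.name = name
        self.transform = transform  # arbitrary transform, for illustration only
        self.upstream = upstream
        self.log = log

    def get_brigade(self):
        self.log.append(f"{self.name}: call upstream get_brigade")
        brigade = self.upstream.get_brigade()  # blocks until upstream is done
        self.log.append(f"{self.name}: filter data")
        return [self.transform(bucket) for bucket in brigade]


log = []
core_in = CoreIn([b"hello ", b"world"], log)
http_in = InputFilter("http_in", bytes.strip, core_in, log)
my_filter = InputFilter("my_filter", bytes.upper, http_in, log)

result = my_filter.get_brigade()
print(result)
print("\n".join(log))
```

The log shows control travelling up the chain to core_in first, and the data being filtered only on the way back down, so my_filter is the last to touch it.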

Figure 25-5 depicts a typical input filter chain data flow, in addition to the program control flow. The arrows show when the control is switched from one filter to another, and the black-headed arrows show the actual data flow. The diagram includes some pseudocode, both in Perl for the mod_perl filters and in C for the internal Apache filters. You don't have to understand C to understand this diagram. What's important to understand is that when input filters are invoked they first call each other via the get_brigade( ) call and then block (notice the brick walls in the diagram), waiting for the call to return. When this call returns, all upstream filters have already completed their filtering tasks.


Figure 25-5. mod_perl 2.0 input filter program control and data flow

As mentioned earlier, the streaming interface hides these details; however, the first call to $filter->read( ) will block, as underneath it performs the get_brigade( ) call.
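To illustrate why the first streaming read blocks, here is a toy model of a read-style wrapper around a brigade-based upstream filter. The `Upstream` and `StreamingFilter` classes and the buffering strategy are assumptions made for this sketch; the text does not describe how mod_perl buffers data internally, and none of this is the real API.

```python
# Toy model only: shows a read() call fetching a brigade on first use.

class Upstream:
    """Hypothetical upstream filter; counts how often a brigade is requested."""

    def __init__(self, brigades):
        self.brigades = list(brigades)
        self.calls = 0

    def get_brigade(self):
        self.calls += 1  # in a real chain, this is where the caller would block
        return self.brigades.pop(0) if self.brigades else []


class StreamingFilter:
    """Exposes read(size) and asks upstream for a brigade only when needed."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.buffer = b""

    def read(self, size):
        if not self.buffer:
            # Nothing buffered yet, so the brigade must be fetched first.
            self.buffer = b"".join(self.upstream.get_brigade())
        data, self.buffer = self.buffer[:size], self.buffer[size:]
        return data


upstream = Upstream([[b"abc", b"def"]])
f = StreamingFilter(upstream)

calls_before = upstream.calls        # no brigade requested yet
first = f.read(4)
calls_after_first = upstream.calls   # the first read triggered get_brigade
second = f.read(4)
calls_after_second = upstream.calls  # served from the buffer, no new request
print(first, second, calls_before, calls_after_first, calls_after_second)
```

In this model the second read is satisfied from already-fetched data, which is why only the first call has to wait on the upstream filter.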

Figure 25-5 shows a part of the actual input filter chain for an HTTP request. The ... indicates that there are more filters between the mod_perl filter and http_in.

Now let's look at what happens in the output filter chain. The first filter acquires the bucket brigades containing the response data from the content handler (or from another protocol handler if we aren't talking HTTP), applies any modifications, and passes the data to the next filter (using the pass_brigade( ) call). That filter in turn applies its modifications and sends the bucket brigade to the next filter, and so on, all the way down to the last filter (called core), which writes the data to the network via the socket to which the client is connected. Even though the output filters don't have to wait to acquire the bucket brigade (since the upstream filter passes it to them as an argument), they still block in a similar fashion to input filters, because they have to wait for the pass_brigade( ) call to return.
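The push-style flow of the output chain can be sketched in the same toy fashion. Again, this is not the Apache or mod_perl API: the class names, the filter names `first` and `second`, and their transforms are hypothetical. The point is only that each filter receives its brigade as an argument, yet still cannot finish until the downstream pass_brigade call returns.

```python
# Toy model of the output filter chain: NOT the Apache or mod_perl API.
# Each filter pushes data downstream and waits for that call to return.

class Core:
    """Stands in for the core output filter, which writes to the network."""

    def __init__(self, log):
        self.log = log
        self.sent = []  # pretend this list is the client socket

    def pass_brigade(self, brigade):
        self.log.append("core: write to socket")
        self.sent.extend(brigade)


class OutputFilter:
    """A filter that transforms a brigade and hands it to the next filter."""

    def __init__(self, name, transform, downstream, log):
        self.name = name
        self.transform = transform  # arbitrary transform, for illustration only
        self.downstream = downstream
        self.log = log

    def pass_brigade(self, brigade):
        self.log.append(f"{self.name}: filter data")
        filtered = [self.transform(bucket) for bucket in brigade]
        self.downstream.pass_brigade(filtered)  # blocks until downstream returns
        self.log.append(f"{self.name}: pass_brigade returned")


log = []
core = Core(log)
second = OutputFilter("second", lambda bucket: bucket + b"\n", core, log)
first = OutputFilter("first", bytes.upper, second, log)

# The content handler hands its response data to the first filter.
first.pass_brigade([b"hello"])
print(core.sent)
print("\n".join(log))
```

Compared with the input chain, the data is filtered on the way down and the filters merely unwind on the way back, with `first` being the last one to regain control.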

Figure 25-6 depicts a typical output filter chain data flow in addition to the program control flow. As in the input filter chain diagram, the arrows show the program control flow, and the black-headed arrows show the data flow. Again, the diagram uses Perl pseudocode for the mod_perl filter and C pseudocode for the Apache filters, and the brick walls represent the blocking. The diagram shows only part of the real HTTP response filter chain; ... stands for the omitted filters.


Figure 25-6. mod_perl 2.0 output filter program control and data flow