# Chapter 7: Filters and Bucket Brigades ## The Problem: Streaming Data A web server needs to handle data that: - May be too large to fit in memory - Arrives in chunks (network packets) - Needs transformation (compression, encryption) - Must be sent before it's fully received (streaming) Traditional approaches like "read everything into a buffer" don't scale. ## Apache's Solution: Bucket Brigades Apache uses a **bucket brigade** system - a linked list of data chunks that flows through a chain of filters. ```mermaid flowchart LR subgraph brigade["Bucket Brigade"] direction LR B1["GET "] e1@--> B2["/ HTTP"] e2@--> B3["/1.1\r\n"] e3@--> B4@{ shape: stadium, label: "EOS" } e1@{ animate: true } e2@{ animate: true } e3@{ animate: true } end brigade --> FC@{ shape: hex, label: "Filter Chain" } style B1 fill:#3498db,stroke:#2980b9,color:#000 style B2 fill:#3498db,stroke:#2980b9,color:#000 style B3 fill:#3498db,stroke:#2980b9,color:#000 style B4 fill:#e74c3c,stroke:#c0392b,color:#000 style FC fill:#f39c12,stroke:#e67e22,color:#000 ``` The key design principle is **zero-copy where possible**. A file bucket doesn't read file data into memory - it holds a file descriptor and reads on demand. A transient bucket points to existing memory without copying it. Only when data needs to outlive its original context does Apache copy it into a heap or pool bucket. This makes serving large files or proxying responses efficient: data flows through the filter chain without being fully materialized in memory. ## Buckets A bucket is a single chunk of data with a **type** that determines how its data is stored and accessed. Every bucket type implements the same vtable interface, defined in `srclib/apr-util/include/apr_buckets.h`: ```c // srclib/apr-util/include/apr_buckets.h struct apr_bucket_type_t { const char *name; int num_func; enum { APR_BUCKET_DATA = 0, // Actual content data APR_BUCKET_METADATA = 1 // Metadata (EOS, FLUSH, etc.) 
} is_metadata; void (*destroy)(void *data); apr_status_t (*read)(apr_bucket *b, const char **str, apr_size_t *len, apr_read_type_e block); apr_status_t (*setaside)(apr_bucket *e, apr_pool_t *pool); apr_status_t (*split)(apr_bucket *e, apr_size_t point); apr_status_t (*copy)(apr_bucket *e, apr_bucket **c); }; struct apr_bucket { APR_RING_ENTRY(apr_bucket) link; // Links to brigade ring const apr_bucket_type_t *type; // Vtable for this bucket apr_size_t length; // Data length (-1 if unknown) apr_off_t start; // Offset into backing data void *data; // Type-dependent private data void (*free)(void *e); // Deallocator for this bucket apr_bucket_alloc_t *list; // Freelist this came from }; ``` The {httpd}`apr_bucket_type_t::setaside` function is particularly important: it "morphs" a bucket from a short-lived type to a long-lived one. For example, when a transient bucket (pointing to stack data) needs to survive beyond the current function call, `setaside` copies the data to the heap and converts it to a heap bucket. This is how Apache achieves zero-copy in the common case while still handling lifetime mismatches safely. ### Data Buckets Data buckets carry actual content bytes. They differ in how the data is stored and who owns it: `````{tab-set} ````{tab-item} Heap Bucket **Heap Bucket:** Data is `malloc`'d on the heap. Use this when you've generated data that needs to outlive the current scope. The `free_func` is called when the last reference to this data is destroyed (multiple buckets can share the same backing data after a split). ```c apr_bucket *b = apr_bucket_heap_create(data, len, free_func, alloc); ``` ```` ````{tab-item} Pool Bucket **Pool Bucket:** Data lives in an APR pool. Use this when the data was allocated from a pool and you want the bucket's lifetime tied to that pool. If the pool is destroyed before the bucket, `setaside` automatically morphs it to a heap bucket. 
```c apr_bucket *b = apr_bucket_pool_create(data, len, pool, alloc); ``` ```` ````{tab-item} Transient Bucket **Transient Bucket:** A zero-copy reference to temporary data (e.g., a stack buffer or a buffer that will be reused). The data must be consumed or set aside before the next filter call, because the backing memory may disappear. This is the cheapest bucket to create - no copies, no allocations beyond the bucket struct itself. ```c apr_bucket *b = apr_bucket_transient_create(data, len, alloc); ``` ```` ````{tab-item} Immortal Bucket **Immortal Bucket:** A reference to permanent, read-only data like string constants or global buffers. Since the data will never be freed, `setaside` is a no-op. Use this for static content. ```c apr_bucket *b = apr_bucket_immortal_create("Hello", 5, alloc); ``` ```` ````` ### I/O Buckets These buckets represent data that comes from an external source. They have **unknown length** (`(apr_size_t)(-1)`) until read, and reading them may block: `````{tab-set} ````{tab-item} File Bucket **File Bucket:** Represents a range of bytes from a file on disk. The data is read into memory lazily, only when a downstream filter calls `apr_bucket_read()`. This is how Apache serves static files efficiently - `sendfile()` can even bypass userspace entirely. ```c apr_bucket *b = apr_bucket_file_create(file, offset, len, pool, alloc); ``` ```` ````{tab-item} Pipe Bucket **Pipe Bucket:** Data from a pipe (e.g., CGI script output). Can only be read once and in order. Cannot be split or copied. ```c apr_bucket *b = apr_bucket_pipe_create(pipe, alloc); ``` ```` ````{tab-item} Socket **Socket Bucket:** Data from a network socket. This is what the core input filter creates to represent incoming request data. Like pipe buckets, socket reads are sequential and may block. 
```c
apr_bucket *b = apr_bucket_socket_create(sock, alloc);
```
````
`````

### Metadata Buckets

Metadata buckets carry no data content ({httpd}`apr_bucket_type_t::is_metadata` = 1) - they are signals that control how the filter chain behaves:

**EOS (End-Of-Stream)** - marks the end of a response or request body. Every response must end with an EOS bucket. Filters use it to know when to finalize their processing (e.g., write a compression trailer, flush buffered content).

```c
apr_bucket *b = apr_bucket_eos_create(alloc);
```

**FLUSH** - tells downstream filters to flush any buffered data immediately. Used when partial data needs to reach the client before the response is complete (e.g., server-sent events, chunked streaming).

```c
apr_bucket *b = apr_bucket_flush_create(alloc);
```

### The EOS Bucket

The EOS (End-Of-Stream) bucket is critical:

- It marks the logical end of a response/request body
- Filters should pass it through (never consume or drop it)
- Handlers must send it to complete the response
- Without an EOS, the client will hang waiting for more data

## Bucket Brigades

A bucket brigade is a doubly-linked ring of buckets, implemented with APR's ring macros. The brigade itself is just a sentinel node - buckets are inserted, removed, and iterated using ring operations:

```c
// Create a brigade
apr_bucket_brigade *bb = apr_brigade_create(pool, bucket_alloc);

//...

// Insert bucket at end/front
//...
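// For example, with the standard APR ring macros
// ('b' being a bucket from one of the apr_bucket_*_create calls):
APR_BRIGADE_INSERT_TAIL(bb, b);   // append bucket b at the end
APR_BRIGADE_INSERT_HEAD(bb, b);   // insert bucket b at the front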
// Get first/last bucket
apr_bucket *first = APR_BRIGADE_FIRST(bb);
apr_bucket *last = APR_BRIGADE_LAST(bb);

// Iterate over buckets
for (apr_bucket *b = APR_BRIGADE_FIRST(bb);
     b != APR_BRIGADE_SENTINEL(bb);
     b = APR_BUCKET_NEXT(b)) {
    // Process bucket
}
```

### Reading Bucket Data

All bucket types expose their contents through a single `apr_bucket_read` call that supports both blocking and non-blocking modes:

```c
const char *data;
apr_size_t len;

// Read bucket contents
apr_status_t rv = apr_bucket_read(bucket, &data, &len, APR_BLOCK_READ);
if (rv == APR_SUCCESS) {
    // data points to len bytes
    // WARNING: data may be invalidated after bucket operations!
}

// Non-blocking read
rv = apr_bucket_read(bucket, &data, &len, APR_NONBLOCK_READ);
if (rv == APR_EAGAIN) {
    // Data not ready yet
}
```

### Brigade Operations

Brigades can be concatenated, split, flattened into a contiguous buffer, or cleaned up:

```c
// Concatenate: append bb2 to bb1
APR_BRIGADE_CONCAT(bb1, bb2);

// Prepend: insert bb2 at start of bb1
APR_BRIGADE_PREPEND(bb1, bb2);

// Split: move 'e' and everything after it to a new brigade
apr_bucket_brigade *rest = apr_brigade_split(bb, e);

// Flatten: copy all data to a buffer
char buffer[8192];
apr_size_t len = sizeof(buffer);   // in: buffer capacity, out: bytes copied
apr_brigade_flatten(bb, buffer, &len);

// Destroy: cleanup brigade and all buckets
apr_brigade_destroy(bb);

// Cleanup: remove all buckets but keep brigade
apr_brigade_cleanup(bb);
```

## Filters

Filters transform data as it flows through Apache. There are two directions:

```mermaid
flowchart TD
    subgraph input["INPUT FILTERS (request body)"]
        direction RL
        N1["Network
('CORE_IN' filter,
mod_core)"] e1@==> S1["SSL Decrypt
(mod_ssl)"] e2@==> D1["Decompress
(mod_deflate)"] e3@==> P1["Request Parser
('HTTP_IN' filter,
mod_http)"] e1@{ animate: true } e2@{ animate: true } e3@{ animate: true } end subgraph output["OUTPUT FILTERS (response body)"] direction LR H2["Handler"] e4@==> C2["Compress
(mod_deflate)"] e5@==> S2["SSL Encrypt
(mod_ssl)"] e6@==> N2["Network
('CORE' filter,
mod_core)"] e4@{ animate: true } e5@{ animate: true } e6@{ animate: true } end input ~~~ output style N1 fill:#e74c3c,stroke:#c0392b,color:#000 style S1 fill:#9b59b6,stroke:#8e44ad,color:#000 style D1 fill:#3498db,stroke:#2980b9,color:#000 style P1 fill:#2ecc71,stroke:#27ae60,color:#000 style H2 fill:#2ecc71,stroke:#27ae60,color:#000 style C2 fill:#3498db,stroke:#2980b9,color:#000 style S2 fill:#9b59b6,stroke:#8e44ad,color:#000 style N2 fill:#e74c3c,stroke:#c0392b,color:#000 ``` ```{note} A **handler** is the module function that generates the actual response content. It's triggered when a request matches a `SetHandler` or `AddHandler` directive in the Apache config (e.g., `SetHandler cgi-script` routes to mod_cgi). The handler hook was covered in [Chapter 6](06-hooks.md) - see also the official [handler documentation](https://httpd.apache.org/docs/current/handler.html) and [module development guide](https://httpd.apache.org/docs/2.4/developer/modguide.html). ``` ### Filter Types Filters are categorized into levels that determine their position in the chain. The type constants are defined in `include/util_filter.h` and represent a numeric ordering - lower numbers run closer to the handler, higher numbers run closer to the network: | Constant | Value | Description | |----------|-------|-------------| | {httpd}`AP_FTYPE_RESOURCE` | 10 | Content generators (`mod_include` SSI) | | {httpd}`AP_FTYPE_CONTENT_SET` | 20 | Content transformers (`mod_deflate`) | | {httpd}`AP_FTYPE_PROTOCOL` | 30 | Protocol framing (HTTP chunking) | | {httpd}`AP_FTYPE_TRANSCODE` | 40 | Charset/encoding conversion | | {httpd}`AP_FTYPE_CONNECTION` | 50 | Connection-level (`mod_ssl`) | | {httpd}`AP_FTYPE_NETWORK` | 60 | Actual I/O (core socket read/write) | For **output filters**, data flows from low to high - the handler's output passes through `RESOURCE` filters first, then `CONTENT_SET`, and so on until `NETWORK` actually writes to the socket. 
For **input filters**, the direction is reversed - the `NETWORK` filter reads raw bytes from the socket, and higher-level filters progressively decode and transform them before the handler sees the data. This layering ensures that content transformation (like gzip compression) always happens before protocol framing (like HTTP chunking), which always happens before encryption (SSL), which always happens before network I/O. The numeric values also allow fine-grained positioning: a filter can register at `AP_FTYPE_CONTENT_SET + 5` to run after other content-set filters. ### Registering a Filter Filters are registered globally during module initialization, specifying a name, callback function, and filter type level: ```c // Output filter registration ap_register_output_filter("MY_OUTPUT", // Filter name my_output_filter, // Function NULL, // Init function (optional) AP_FTYPE_CONTENT_SET); // Input filter registration ap_register_input_filter("MY_INPUT", my_input_filter, NULL, AP_FTYPE_CONTENT_SET); ``` ### Adding Filters to Request/Connection Once registered, filters are attached to individual requests or connections - request-scoped filters are removed after the response, while connection-scoped filters persist for the entire connection: ```c // Add output filter to request ap_add_output_filter("MY_OUTPUT", ctx, r, r->connection); // Add input filter to request ap_add_input_filter("MY_INPUT", ctx, r, r->connection); // Add to connection (lives for entire connection) ap_add_input_filter("SSL_IN", ctx, NULL, c); ap_add_output_filter("SSL_OUT", ctx, NULL, c); ``` ### Output Filter Implementation An output filter receives a bucket brigade, iterates through it, transforms data buckets while passing metadata through, then forwards the brigade to the next filter: ```c static apr_status_t my_output_filter(ap_filter_t *f, apr_bucket_brigade *bb) { request_rec *r = f->r; apr_bucket *b; // Iterate through buckets for (b = APR_BRIGADE_FIRST(bb); b != APR_BRIGADE_SENTINEL(bb); b = 
APR_BUCKET_NEXT(b)) {

        // Handle metadata buckets
        if (APR_BUCKET_IS_EOS(b)) {
            // End of stream - pass through
            break;
        }
        if (APR_BUCKET_IS_FLUSH(b)) {
            // Flush request - pass through
            continue;
        }
        if (APR_BUCKET_IS_METADATA(b)) {
            // Other metadata - pass through
            continue;
        }

        // Read data bucket
        const char *data;
        apr_size_t len;
        apr_status_t rv = apr_bucket_read(b, &data, &len, APR_BLOCK_READ);
        if (rv != APR_SUCCESS) {
            return rv;
        }

        // Transform data (example: uppercase)
        char *transformed = apr_palloc(r->pool, len);
        for (apr_size_t i = 0; i < len; i++) {
            transformed[i] = (char)toupper((unsigned char)data[i]);
        }

        // Replace bucket with transformed data
        apr_bucket *new_b = apr_bucket_heap_create(
            transformed, len, NULL, f->c->bucket_alloc);
        APR_BUCKET_INSERT_BEFORE(b, new_b);
        apr_bucket_delete(b);
        b = new_b;
    }

    // Pass to next filter
    return ap_pass_brigade(f->next, bb);
}
```

### Input Filter Implementation

Input filters are more complex because they handle read modes:

```c
static apr_status_t my_input_filter(ap_filter_t *f,
                                    apr_bucket_brigade *bb,
                                    ap_input_mode_t mode,
                                    apr_read_type_e block,
                                    apr_off_t readbytes) {
    my_filter_ctx *ctx = f->ctx;

    // Initialize context on first call
    if (!ctx) {
        ctx = f->ctx = apr_pcalloc(f->r->pool, sizeof(*ctx));
        ctx->bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
    }

    // Handle different read modes
    switch (mode) {
    case AP_MODE_GETLINE:
        // Read until newline
        return get_line_from_filters(f, bb, ctx);

    case AP_MODE_READBYTES:
        // Read up to readbytes
        return read_bytes_from_filters(f, bb, readbytes, ctx);

    case AP_MODE_SPECULATIVE:
        // Peek at data without consuming
        return speculative_read(f, bb, readbytes, ctx);

    case AP_MODE_EXHAUSTIVE:
        // Read all remaining data
        return exhaustive_read(f, bb, ctx);

    case AP_MODE_INIT:
        // Initialize
        return APR_SUCCESS;
    }

    return APR_ENOTIMPL;
}

// Helper: read from upstream filter
static apr_status_t get_upstream_data(ap_filter_t *f,
                                      apr_bucket_brigade *bb,
                                      apr_read_type_e block,
                                      apr_off_t readbytes) {
    return ap_get_brigade(f->next,
bb, AP_MODE_READBYTES,
                          block, readbytes);
}
```

### Input Mode Constants

Input filters must handle multiple read modes because different parts of HTTP processing need to read data differently. The HTTP request parser reads headers line-by-line (`GETLINE`), then reads the body in sized chunks (`READBYTES`). These modes are defined in `include/util_filter.h`:

| Mode | Description |
|------|-------------|
| {httpd}`AP_MODE_READBYTES` | Read up to N bytes (body data) |
| {httpd}`AP_MODE_GETLINE` | Read a line terminated by `\n` (header parsing) |
| {httpd}`AP_MODE_EATCRLF` | Consume leading CRLF without returning data |
| {httpd}`AP_MODE_SPECULATIVE` | Peek at data without consuming (lookahead) |
| {httpd}`AP_MODE_EXHAUSTIVE` | Read all remaining data |
| {httpd}`AP_MODE_INIT` | Initialize filter (one-time setup) |

The `SPECULATIVE` mode is particularly interesting - it lets a filter peek ahead without consuming the data. The HTTP/1.1 parser uses this to detect whether a pipelined request is waiting after the current one finishes.

## Filter Context

Filters are called repeatedly - once per brigade chunk - so they need persistent state across invocations. The `f->ctx` pointer stores a filter-allocated context struct, typically initialized on the first call:

```c
typedef struct {
    apr_bucket_brigade *bb;   // Buffered data
    int state;                // Current state
    apr_size_t bytes_read;    // Running total
    char *buffer;             // Work buffer
} my_filter_ctx;

static apr_status_t my_filter(ap_filter_t *f, apr_bucket_brigade *bb) {
    my_filter_ctx *ctx = f->ctx;

    if (!ctx) {
        // First call - initialize
        ctx = f->ctx = apr_pcalloc(f->r->pool, sizeof(*ctx));
        ctx->bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
        ctx->state = STATE_INITIAL;
    }

    // Use context...
    apr_off_t blen;
    apr_brigade_length(bb, 1, &blen);   // 1 = also read unknown-length buckets
    ctx->bytes_read += (apr_size_t)blen;

    // Process based on state
    switch (ctx->state) {
        case STATE_INITIAL:
            // ...
            break;
        case STATE_READING_BODY:
            // ...
break; } return ap_pass_brigade(f->next, bb); } ``` ## Common Filter Patterns ````{dropdown} Pass-Through Filter A filter that inspects a condition and then removes itself from the chain, forwarding data unchanged: ```c static apr_status_t passthrough_filter(ap_filter_t *f, apr_bucket_brigade *bb) { // Remove ourselves (only needed once) ap_remove_output_filter(f); // Just pass data to next filter return ap_pass_brigade(f->next, bb); } ``` ```` ````{dropdown} Accumulating Filter Buffers all incoming brigades until EOS arrives, then processes the complete data at once. Used when transformation requires seeing the entire content (e.g., computing a content hash): ```c // Collect all data before processing (e.g., for compression) static apr_status_t accumulating_filter(ap_filter_t *f, apr_bucket_brigade *bb) { accum_ctx *ctx = f->ctx; if (!ctx) { ctx = f->ctx = apr_pcalloc(f->r->pool, sizeof(*ctx)); ctx->bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc); } // Look for EOS apr_bucket *eos = NULL; for (apr_bucket *b = APR_BRIGADE_FIRST(bb); b != APR_BRIGADE_SENTINEL(bb); b = APR_BUCKET_NEXT(b)) { if (APR_BUCKET_IS_EOS(b)) { eos = b; break; } } // Accumulate data APR_BRIGADE_CONCAT(ctx->bb, bb); if (eos) { // Got all data - process it process_complete_data(ctx->bb); // Pass processed data return ap_pass_brigade(f->next, ctx->bb); } // More data coming return APR_SUCCESS; } ``` ```` ````{dropdown} Streaming Filter Processes each bucket as it arrives and passes data through immediately. 
Best for transformations that operate on chunks independently (e.g., character encoding, search-and-replace):

```c
// Process data chunk by chunk
static apr_status_t streaming_filter(ap_filter_t *f, apr_bucket_brigade *bb) {
    stream_ctx *ctx = f->ctx;
    if (!ctx) {
        ctx = f->ctx = apr_pcalloc(f->r->pool, sizeof(*ctx));
    }

    apr_bucket *b;
    apr_bucket *next;
    for (b = APR_BRIGADE_FIRST(bb);
         b != APR_BRIGADE_SENTINEL(bb);
         b = next) {
        next = APR_BUCKET_NEXT(b);

        if (!APR_BUCKET_IS_METADATA(b)) {
            const char *data;
            apr_size_t len;
            apr_status_t rv = apr_bucket_read(b, &data, &len, APR_BLOCK_READ);
            if (rv != APR_SUCCESS) {
                return rv;
            }

            // Transform in place or create new bucket
            transform_chunk(ctx, data, len);
        }
    }

    // Pass (possibly modified) brigade
    return ap_pass_brigade(f->next, bb);
}
```
````

## The Core Network Filters

At the bottom of every filter chain sit the **core network filters** (in `server/core_filters.c`). These are the only filters that actually touch the socket - everything above them works with bucket brigades in memory:

### Core Output Filter

The core output filter sits at the bottom of every output chain and performs the actual socket write:

```c
// server/core_filters.c
// Writes bucket data to the socket using writev/sendfile
ap_register_output_filter("CORE", ap_core_output_filter,
                          NULL, AP_FTYPE_NETWORK);
```

The core output filter is smart about I/O. It uses `writev()` to send multiple buckets in a single syscall and `sendfile()` for file buckets, sending file data directly from kernel space to the socket without copying through userspace.

### Core Input Filter

The core input filter reads raw bytes from the client socket into buckets, handling both blocking and non-blocking modes:

```c
// server/core_filters.c
// Reads from socket into buckets
ap_register_input_filter("CORE_IN", ap_core_input_filter,
                         NULL, AP_FTYPE_NETWORK);
```

The core input filter creates socket buckets that read from the client connection.
It handles both blocking and non-blocking reads, and implements the speculative mode needed by the HTTP parser.

### Fuzzing: Replacing the Network Layer

For fuzzing, we replace these core filters with our own that read from a memory buffer (the fuzzer input) and write to `/dev/null`. This is the fundamental trick that makes in-process fuzzing work - all the filters above the network layer operate identically, but instead of reading from a TCP socket, they read from the fuzzer's mutated input buffer. See the Harness Design guide for details on how this replacement works.

## Reading from Input Filters

Handlers read request bodies by pulling brigades from the input filter chain in a loop until they see an EOS bucket:

```c
static int my_handler(request_rec *r) {
    apr_bucket_brigade *bb = apr_brigade_create(r->pool,
                                                r->connection->bucket_alloc);

    // Read request body in chunks
    apr_status_t rv;
    int seen_eos = 0;
    do {
        rv = ap_get_brigade(r->input_filters, bb,
                            AP_MODE_READBYTES, APR_BLOCK_READ,
                            HUGE_STRING_LEN);
        if (rv != APR_SUCCESS) {
            return HTTP_INTERNAL_SERVER_ERROR;
        }

        // Process buckets
        for (apr_bucket *b = APR_BRIGADE_FIRST(bb);
             b != APR_BRIGADE_SENTINEL(bb);
             b = APR_BUCKET_NEXT(b)) {
            if (APR_BUCKET_IS_EOS(b)) {
                seen_eos = 1;
                break;
            }

            const char *data;
            apr_size_t len;
            rv = apr_bucket_read(b, &data, &len, APR_BLOCK_READ);
            if (rv != APR_SUCCESS) {
                return HTTP_INTERNAL_SERVER_ERROR;
            }

            // Process data...
        }
        apr_brigade_cleanup(bb);
    } while (!seen_eos);

    apr_brigade_destroy(bb);
    return OK;
}
```

## Writing to Output Filters

Handlers push response data into the output filter chain by creating a brigade, inserting data and EOS buckets, and calling {httpd}`ap_pass_brigade`:

```c
static int my_handler(request_rec *r) {
    apr_bucket_brigade *bb = apr_brigade_create(r->pool,
                                                r->connection->bucket_alloc);
    apr_bucket *b;

    // Set headers
    ap_set_content_type(r, "text/plain");

    // Create data bucket
    const char *content = "Hello, World!";
    b = apr_bucket_transient_create(content, strlen(content),
                                    r->connection->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, b);

    // Add EOS bucket
    b = apr_bucket_eos_create(r->connection->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, b);

    // Pass to output filters
    apr_status_t rv = ap_pass_brigade(r->output_filters, bb);
    if (rv != APR_SUCCESS) {
        return HTTP_INTERNAL_SERVER_ERROR;
    }
    return OK;
}

// Or use convenience functions:
static int simpler_handler(request_rec *r) {
    ap_set_content_type(r, "text/plain");

    // These internally create buckets
    ap_rputs("Hello, ", r);
    ap_rprintf(r, "World! (served at %" APR_TIME_T_FMT ")", r->request_time);

    // Send EOS
    // (done automatically when handler returns OK)
    return OK;
}
```

## The Complete Output Filter Chain

Here's a concrete example of what the output filter chain looks like for a typical HTTPS response with compression enabled:

```mermaid
%%{init: {"flowchart": {"curve": "basis", "nodeSpacing": 80, "rankSpacing": 30}}}%%
flowchart TD
    H["Handler
(ap_rprintf)"] e0@-->|ap_pass_brigade| D1{"RESOURCE?"} D1 e1@-->|transform| R1["mod_include
(SSI)"] e9@-->|ap_pass_brigade| D2{"CONTENT_SET?"} D1 e2@-->|no-op| D2 D2 e3@-->|compress| R2["mod_deflate
(gzip)"] e10@-->|ap_pass_brigade| D3{"PROTOCOL?"} D2 e4@-->|no-op| D3 D3 e5@-->|frame| R3["HTTP_HEADER
(chunking)"] e11@-->|ap_pass_brigade| D4{"CONNECTION?"} D3 e6@-->|no-op| D4 D4 e7@-->|encrypt| R4["mod_ssl
(TLS)"] e12@-->|ap_pass_brigade| M@{ shape: processes, label: "... down the filter chain ..." } D4 e8@-->|no-op| M M e13@-->|ap_pass_brigade| N["NETWORK
CORE"] e0@{ animate: true } e1@{ animate: true, curve: linear } e2@{ curve: stepAfter } e3@{ animate: true, curve: linear } e4@{ curve: stepAfter } e5@{ animate: true, curve: linear } e6@{ curve: stepAfter } e7@{ animate: true, curve: linear } e8@{ curve: stepAfter } e9@{ animate: true } e10@{ animate: true } e11@{ animate: true } e12@{ animate: true } e13@{ animate: true } style H fill:#2ecc71,stroke:#27ae60,color:#000 style D1 fill:#f39c12,stroke:#e67e22,color:#000 style D2 fill:#f39c12,stroke:#e67e22,color:#000 style D3 fill:#f39c12,stroke:#e67e22,color:#000 style D4 fill:#f39c12,stroke:#e67e22,color:#000 style R1 fill:#3498db,stroke:#2980b9,color:#000 style R2 fill:#3498db,stroke:#2980b9,color:#000 style R3 fill:#e67e22,stroke:#d35400,color:#000 style R4 fill:#9b59b6,stroke:#8e44ad,color:#000 style N fill:#e74c3c,stroke:#c0392b,color:#000 ``` Each arrow represents an `ap_pass_brigade(f->next, bb)` call. The brigade flows top down, with each filter potentially modifying, splitting, or buffering buckets before passing them on. The handler never needs to know about compression, chunking, or encryption - the filter chain handles it all transparently. 
## Summary Bucket brigades and filters are Apache's I/O abstraction: **Buckets:** - Chunks of data or metadata, each with a type-specific vtable - Data buckets: heap, pool, transient, immortal (differ in ownership/lifetime) - I/O buckets: file, pipe, socket (lazy/streaming reads) - Metadata buckets: EOS (end of stream), FLUSH (force downstream flush) - Zero-copy by default, with {httpd}`apr_bucket_type_t::setaside` for lifetime extension **Brigades:** - Doubly-linked ring of buckets (via {httpd}`APR_RING`) - Created per-request or per-filter invocation - Operations: insert, concat, split, flatten, cleanup **Filters:** - Transform data in chains, registered at specific type levels - Output: handler → `RESOURCE` → `CONTENT_SET` → `PROTOCOL` → `CONNECTION` → `NETWORK` - Input: `NETWORK` → `CONNECTION` → `PROTOCOL` → `CONTENT_SET` → `RESOURCE` → handler - Input filters handle multiple read modes (`READBYTES`, `GETLINE`, `SPECULATIVE`, etc.) **Key patterns:** - Pass-through: remove self and pass brigade unchanged - Accumulating: buffer all data until EOS, then process at once - Streaming: process each bucket as it arrives, pass immediately - Always pass EOS through - dropping it breaks the response **For fuzzing:** we replace the `NETWORK`-level core filters with custom ones that read from a memory buffer and discard output. Everything above the network layer - all the content filters, protocol framing, and module-specific transformations - runs exactly as it would in production.