Apache Internals: A Practical Guide

From Zero to Fuzzing

This guide is for developers with C and Linux experience who want to understand Apache HTTP Server’s internal architecture. By the end, you should know enough to build a fuzzing harness that exercises Apache’s request processing pipeline.


Table of Contents

Part 0x01: Foundations

  1. Introduction to Apache Architecture

  2. APR - Apache Portable Runtime

    • Why APR exists (portability layer between Apache and the OS)

    • Strings, arrays, tables, hash tables

    • File and network I/O abstractions

    • How APR relates to fuzzing (--with-included-apr)

  3. Memory Management and Pools

    • Why pools instead of malloc/free

    • Pool hierarchy (pconf → connection → request)

    • Pool API, cleanups, and subpools for loops

    • Pool debugging with ASan (--enable-pool-debug=yes)

Part 0x02: Core Systems

  4. The Configuration System

    • Configuration contexts (<Directory>, <Location>, .htaccess)

    • Directive types and the command table

    • Config creators, mergers, and the per-request merge flow

    • Module config vectors and runtime access

  5. MPM - Multi-Processing Modules

    • Prefork, Worker, Event MPMs and their trade-offs

    • Connection handling lifecycle

    • Scoreboard and worker status tracking

    • Thread safety considerations for modules

  6. The Hook System

    • What hooks are and how they work

    • Hook ordering constants and predecessor/successor lists

    • Return values (OK, DECLINED, DONE, HTTP_*)

    • Major request and connection hooks

    • Hook infrastructure: macros, ap_setup_prelinked_modules(), sorting

Part 0x03: I/O Architecture

  7. Filters and Bucket Brigades

    • Bucket types: data (heap, pool, transient, immortal), I/O (file, pipe, socket), metadata (EOS, FLUSH)

    • The apr_bucket_type_t vtable and zero-copy setaside morphing

    • Brigades as linked rings of buckets

    • Input vs output filters and the filter type hierarchy

    • Common patterns: pass-through, accumulating, streaming

  8. Request Processing Pipeline

    • Complete lifecycle from connection accept to pool cleanup

    • Each processing phase in detail (with source file references)

    • Directory walk and per-request config merge

    • Internal redirects, subrequests, and error handling

    • Fuzzing entry points and what each phase exercises

Part 0x04: Practical Application

  9. Module Anatomy

    • The module struct and STANDARD20_MODULE_STUFF

    • Complete annotated module template

    • Configuration directives (AP_INIT_* macros, ACCESS_CONF vs RSRC_CONF)

    • Adding filters and custom hooks to a module

    • Lifecycle hooks (child_init, post_config)

    • How to read a module’s source for fuzzing targets


How to Read This Guide

If you’re new to Apache: Start from Chapter 1 and read sequentially. Each chapter builds on the previous ones.

If you want to write a module: Focus on Chapters 1, 3, 4, 6, 7, and 9. These cover the essential concepts for module development.

If you want to understand the fuzzing harness: Read Chapters 5-8 first for context, then the harness design document (coming soon).

If you need a quick reference: Chapters are organized by topic, and each includes its own code examples. Jump to the topic you need.


Prerequisites

  • Solid C programming knowledge

  • Linux development experience

  • Familiarity with:

    • Makefiles

    • Shared libraries

    • Basic networking concepts

No prior Apache knowledge required.


Further Resources