On this page
- Purpose
- The MAC Address Chicken-and-Egg Dilemma
- A Stateful ARP Cache with Request Queue Management
- The Constraint Triangle
- Prerequisites & Tooling
- Knowledge Base
- Environment
- High-Level Architecture
- State Machine Diagram
- The Restaurant
- Implementation
- Data Structures
- Looking Up an Entry
- Queueing a Packet
- Inserting a MAC Mapping
- The Sweep Thread
- Under the Hood
- Memory Layout: Cache vs. Heap Allocations
- Thread Safety: The Lock Granularity Question
- Big-O Analysis
- Edge Cases & Pitfalls
- ARP Spoofing Attack
- Memory Leak on Cache Overflow
- The Thundering Herd
- Conclusion
- Skills Acquired
Purpose
The MAC Address Chicken-and-Egg Dilemma
You’re building a network router. A packet arrives destined for 192.168.2.2. Your routing table says “forward it out interface eth1.” Simple, right?
Wrong. Ethernet requires a MAC address to actually send the frame. You have an IP address but no MAC address. You need to:
- Ask “Who has 192.168.2.2?” via ARP broadcast
- Wait for the reply (which might take milliseconds… or never arrive)
- Meanwhile, 10 more packets arrive for the same destination
Do you:
- Drop all packets until ARP resolves? (Terrible performance)
- Send 10 duplicate ARP requests? (Wasteful)
- Queue packets and risk memory exhaustion? (Crash hazard)
A Stateful ARP Cache with Request Queue Management
The sr_arpcache.c module solves this with a request queue data structure that:
- Deduplicates ARP requests (one request per destination IP)
- Queues packets waiting for resolution
- Retries unanswered requests once per second, up to five attempts
- Sends ICMP errors when resolution fails after timeout
The Constraint Triangle
This isn’t just “store IP→MAC mappings.” We’re handling:
- Concurrency: A background thread sweeps the cache every second while packet handlers query it
- Memory Safety: Manual allocation/deallocation in C without leaking
- Network Protocol Semantics: RFC-compliant retry logic and failure modes
Prerequisites & Tooling
Knowledge Base
Required:
- C fundamentals (pointers, structs, malloc/free)
- Basic networking (what an IP/MAC address is)
- Linked list data structures

Helpful:
- Mutex/pthread basics
- Understanding of time-based state machines
Environment
From the project’s Makefile:
# Compiler: GCC with pthread support
gcc --version # 9.4.0+ recommended
# Build the router
cd router
make clean && make
# Dependencies (from Dockerfile)
sudo apt-get install build-essential
Key Headers Used:
#include <pthread.h> // Mutexes for thread safety
#include <time.h> // Timestamps for timeout logic
#include <netinet/in.h> // Network byte order (htonl, etc.)
High-Level Architecture
State Machine Diagram
stateDiagram-v2
[*] --> CacheMiss: Packet arrives
CacheMiss --> QueuePacket: No MAC found
QueuePacket --> SendARPRequest: First packet for this IP
QueuePacket --> AppendToQueue: Subsequent packets
SendARPRequest --> WaitingForReply: Set sent_time
WaitingForReply --> Resolved: ARP Reply arrives
WaitingForReply --> RetryRequest: 1s timeout (< 5 retries)
WaitingForReply --> SendICMPError: 5s total timeout
Resolved --> ForwardAllQueued: Dequeue packets
ForwardAllQueued --> [*]
SendICMPError --> DestroyRequest: Cleanup
DestroyRequest --> [*]
The Restaurant
Imagine a busy restaurant kitchen:
- Cache entries = Regular customers with known table numbers
- ARP requests = New walk-ins waiting for a table assignment
- Packet queue = Orders taken while waiting (you don’t throw away their appetizer order just because they don’t have a table yet!)
- Sweep thread = The host checking every second: “Has table 5 been assigned yet? Did guest X give up and leave?”
When table 5 is finally assigned (ARP reply arrives), the host rushes all queued orders for that guest to the kitchen.
Implementation
Data Structures
Goal: Understand how we represent cache state in memory.
// From sr_arpcache.h
/* Cached ARP entry: Resolved IP→MAC mapping */
struct sr_arpentry {
unsigned char mac[6]; // Ethernet hardware address
uint32_t ip; // IP address (network byte order)
time_t added; // Timestamp when entry was added
int valid; // 1 if entry is valid, 0 if expired
};
/* Packet waiting in queue */
struct sr_packet {
uint8_t *buf; // Full Ethernet frame (malloc'd)
unsigned int len; // Frame length in bytes
char *iface; // Output interface name (e.g., "eth1")
struct sr_packet *next; // Linked list pointer
};
/* ARP request waiting for reply */
struct sr_arpreq {
uint32_t ip; // Destination IP we're asking about
time_t sent; // Time of last ARP request sent
uint32_t times_sent; // Retry counter
struct sr_packet *packets; // Linked list of queued packets
struct sr_arpreq *next; // Linked list pointer
};
/* The cache itself */
struct sr_arpcache {
struct sr_arpentry entries[SR_ARPCACHE_SZ]; // Fixed-size array (100 entries)
struct sr_arpreq *requests; // Linked list of pending requests
pthread_mutex_t lock; // Protects concurrent access
pthread_t thread; // Background sweep thread
};
🔵 Deep Dive: Why Array for Entries, Linked List for Requests?
Entries (resolved mappings):
- Bounded size (max 100 neighbors in this simple topology)
- Frequent random access (“do I have MAC for 10.0.1.2?”)
- Array keeps entries contiguous and cache-friendly; the scan is still O(n), but with n capped at 100 it is effectively constant (a hash table would make it true O(1))
Requests (unresolved pending):
- Unbounded (attacker could send packets to 1000 nonexistent IPs)
- Mostly sequential access (sweep thread walks all requests)
- Linked list allows dynamic growth without reallocation
Looking Up an Entry
Goal: Check if we have a cached MAC for a given IP.
struct sr_arpentry *sr_arpcache_lookup(struct sr_arpcache *cache, uint32_t ip) {
pthread_mutex_lock(&(cache->lock)); // CRITICAL SECTION START
struct sr_arpentry *entry = NULL;
for (int i = 0; i < SR_ARPCACHE_SZ; i++) {
// Check if entry is valid AND matches target IP
if ((cache->entries[i].valid) &&
(cache->entries[i].ip == ip)) {
// Allocate a COPY (caller must free!)
entry = (struct sr_arpentry *) malloc(sizeof(struct sr_arpentry));
memcpy(entry, &(cache->entries[i]), sizeof(struct sr_arpentry));
break;
}
}
pthread_mutex_unlock(&(cache->lock)); // CRITICAL SECTION END
return entry; // NULL if not found
}
🔴 Danger: Memory Ownership Transfer
Notice we return a heap-allocated copy, not a pointer to the cache entry itself. Why?
Bad (Race Condition):
// WRONG: Caller holds pointer to cache entry
struct sr_arpentry *entry = &cache->entries[i];
pthread_mutex_unlock(&lock);
// Another thread could invalidate entry here!
send_packet(entry->mac); // Use-after-free bug!
Good (Current Implementation):
// Caller owns a private copy
struct sr_arpentry *entry = malloc(...);
pthread_mutex_unlock(&lock);
send_packet(entry->mac); // Safe: our data
free(entry); // Caller's responsibility
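The copy-out pattern can be demonstrated with a self-contained miniature (demo_entry, demo_lookup, and demo_insert are illustrative stand-ins, not names from sr_arpcache.h):

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for a cache entry, mirroring sr_arpentry's fields. */
struct demo_entry {
    unsigned char mac[6];
    uint32_t ip;
    int valid;
};

static struct demo_entry table[4];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Copy-out lookup: return a heap copy so the caller never holds a
 * pointer into the shared table after the lock is released. */
struct demo_entry *demo_lookup(uint32_t ip) {
    struct demo_entry *copy = NULL;
    pthread_mutex_lock(&table_lock);
    for (int i = 0; i < 4; i++) {
        if (table[i].valid && table[i].ip == ip) {
            copy = malloc(sizeof *copy);
            if (copy)
                memcpy(copy, &table[i], sizeof *copy);
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    return copy;   /* caller must free(); NULL if not found */
}

void demo_insert(uint32_t ip, const unsigned char mac[6]) {
    pthread_mutex_lock(&table_lock);
    table[0].ip = ip;
    memcpy(table[0].mac, mac, 6);
    table[0].valid = 1;
    pthread_mutex_unlock(&table_lock);
}
```

Even if another thread overwrites `table[0]` immediately after `demo_lookup` returns, the caller's copy stays intact.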
Queueing a Packet
Goal: Add packet to waiting queue, creating request if needed.
struct sr_arpreq *sr_arpcache_queuereq(struct sr_arpcache *cache,
uint32_t ip,
uint8_t *packet,
unsigned int packet_len,
char *iface)
{
pthread_mutex_lock(&(cache->lock));
// Search for existing request for this IP
struct sr_arpreq *req = cache->requests;
for (; req != NULL; req = req->next) {
if (req->ip == ip) {
break; // Found existing request!
}
}
// No existing request? Create a new one
if (req == NULL) {
req = (struct sr_arpreq *) calloc(1, sizeof(struct sr_arpreq));
req->ip = ip;
req->times_sent = 0;
req->sent = time(NULL); // Current timestamp
req->packets = NULL;
// Prepend to linked list
req->next = cache->requests;
cache->requests = req;
}
// Append this packet to the request's queue
struct sr_packet *new_pkt = (struct sr_packet *) malloc(sizeof(struct sr_packet));
new_pkt->buf = (uint8_t *) malloc(packet_len);
memcpy(new_pkt->buf, packet, packet_len); // Deep copy of packet data
new_pkt->len = packet_len;
new_pkt->iface = (char *) malloc(strlen(iface) + 1);
strcpy(new_pkt->iface, iface);
new_pkt->next = req->packets; // Prepend to list
req->packets = new_pkt;
pthread_mutex_unlock(&(cache->lock));
return req;
}
The Key Insight: Before creating a new ARP request, we search for an existing one with the same IP. This deduplication means:
- 100 packets to 192.168.2.2 → 1 ARP request
- Reduces broadcast storm risk
- Simplifies timeout logic (one timer per IP, not per packet)
Inserting a MAC Mapping
Naive Approach (Missing Functionality):
// INCOMPLETE: Just store the MAC
void add_to_cache(struct sr_arpcache *cache, uint32_t ip, unsigned char *mac) {
cache->entries[0].ip = ip;
memcpy(cache->entries[0].mac, mac, 6);
cache->entries[0].valid = 1;
// BUG: What about the queued packets waiting for this IP?!
}
Refined Solution (From Repo):
struct sr_arpreq *sr_arpcache_insert(struct sr_arpcache *cache,
unsigned char *mac,
uint32_t ip)
{
pthread_mutex_lock(&(cache->lock));
// Find empty slot in cache array
int i = 0;
for (; i < SR_ARPCACHE_SZ; i++) {
if (!(cache->entries[i].valid)) {
break;
}
}
// Cache full? Evict oldest entry (LRU approximation)
if (i == SR_ARPCACHE_SZ) {
time_t oldest = cache->entries[0].added;
int oldest_idx = 0;
for (int j = 1; j < SR_ARPCACHE_SZ; j++) {
if (cache->entries[j].added < oldest) {
oldest = cache->entries[j].added;
oldest_idx = j;
}
}
i = oldest_idx;
}
// Insert the new entry
memcpy(cache->entries[i].mac, mac, 6);
cache->entries[i].ip = ip;
cache->entries[i].added = time(NULL);
cache->entries[i].valid = 1;
// CRITICAL: Find and return the corresponding request (if it exists)
struct sr_arpreq *req = cache->requests, *prev = NULL;
for (; req != NULL; prev = req, req = req->next) {
if (req->ip == ip) {
// Remove from requests list
if (prev) {
prev->next = req->next;
} else {
cache->requests = req->next;
}
break;
}
}
pthread_mutex_unlock(&(cache->lock));
return req; // Caller must forward queued packets and free this!
}
Why Return the Request?
The caller needs to:
- Iterate through req->packets
- Rewrite each packet’s destination MAC (now known!)
- Send all queued packets
- Free the request structure
This separation of concerns keeps the cache module focused on state management, while the router module handles packet forwarding logic.
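The caller-side forwarding step can be sketched in isolation. This stand-alone fragment (struct pkt and patch_and_count are illustrative names, not the repo's) patches the newly learned MAC into each queued Ethernet frame, which is what the router module does before handing frames to the send routine:

```c
#include <stdint.h>
#include <string.h>

/* Minimal stand-in for sr_packet: a queued Ethernet frame. */
struct pkt {
    uint8_t *buf;            /* frame; bytes 0-5 are the dst MAC field */
    unsigned int len;
    struct pkt *next;
};

/* Write the now-known destination MAC into every queued frame and
 * return how many frames are ready to send. In the real router the
 * caller would also transmit each frame and then free the request. */
int patch_and_count(struct pkt *head, const unsigned char mac[6]) {
    int n = 0;
    for (struct pkt *p = head; p != NULL; p = p->next) {
        memcpy(p->buf, mac, 6);   /* dst MAC is the first Ethernet field */
        n++;
    }
    return n;
}
```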
The Sweep Thread
Goal: Every second, check all pending requests for timeouts.
void handle_arpreq(struct sr_instance *sr, struct sr_arpreq* req) {
time_t now = time(NULL);
// Has it been >= 1 second since last ARP request?
if (difftime(now, req->sent) >= 1.0) {
if (req->times_sent >= 5) {
// TIMEOUT: Send ICMP Host Unreachable for all queued packets
struct sr_packet *pkt = req->packets;
while (pkt) {
// Extract original packet's source IP
sr_ip_hdr_t *ip_hdr = (sr_ip_hdr_t *)(pkt->buf + sizeof(sr_ethernet_hdr_t));
// Send ICMP type 3 (Destination Unreachable), code 1 (Host Unreachable)
send_icmp_error_message(sr, pkt->iface, ip_hdr->ip_id,
(uint8_t*)ip_hdr,
ip_hdr->ip_src,
ICMP_DEST_UNREACHABLE);
pkt = pkt->next;
}
// Destroy the request (frees all queued packets)
sr_arpreq_destroy(&(sr->cache), req);
} else {
// RETRY: Send another ARP request
send_arp_request(sr, req->ip);
req->sent = now;
req->times_sent++;
}
}
}
void sr_arpcache_sweepreqs(struct sr_instance *sr) {
struct sr_arpreq *req = sr->cache.requests;
while (req) {
struct sr_arpreq *next = req->next; // Save next before possible destruction
handle_arpreq(sr, req);
req = next;
}
}
The State Machine in Action:
| Time | times_sent | Action |
|---|---|---|
| t=0 | 0 | ARP request sent |
| t=1s | 1 | Retry (request again) |
| t=2s | 2 | Retry |
| t=3s | 3 | Retry |
| t=4s | 4 | Retry |
| t=5s | 5 | TIMEOUT → Send ICMP errors, destroy request |
Note this is fixed-interval retry (each attempt waits exactly one second), not true exponential backoff, but it achieves the goal that matters here: capping total wait time at five seconds before failing over to ICMP.
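For contrast, a genuinely exponential schedule doubles the wait on each attempt. A minimal sketch (backoff_seconds is an illustrative helper, not part of the repo):

```c
#include <stdint.h>

/* Exponential backoff: wait base * 2^attempt seconds, capped so a
 * large attempt counter can't overflow the shift or produce an
 * absurd delay. */
unsigned int backoff_seconds(unsigned int base, unsigned int attempt,
                             unsigned int cap) {
    if (attempt > 16)                 /* clamp to avoid shift overflow */
        attempt = 16;
    unsigned long long wait = (unsigned long long)base << attempt;
    return wait > cap ? cap : (unsigned int)wait;
}
```

With base 1 s and a 60 s cap, attempts wait 1, 2, 4, 8, ... seconds, which is how TCP retransmission timers behave; for ARP, a short fixed interval is conventional because hosts on the local link normally answer within milliseconds.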
Under the Hood
Memory Layout: Cache vs. Heap Allocations
Cache Entries (Stack-ish):
struct sr_arpcache cache;
cache.entries[0...99] → Allocated once at init, roughly 2.4KB
                        (100 entries × ~24 bytes each, including padding)
- Lives in the router’s data segment (global)
- No fragmentation risk
- Cache-friendly (contiguous memory)
Pending Requests (Heap):
cache.requests → [req1] → [req2] → [req3] → NULL
↓ ↓ ↓
pkts pkts pkts
↓ ↓ ↓
[p1]→[p2] [p1] [p1]→[p2]→[p3]
- Each malloc() call fragments the heap slightly
- Worst case: 1000 fake IPs × 100 queued packets = 100K allocations
- Defense: Limit queue depth (production systems cap at 5-10 packets per request)
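A depth cap is simple to retrofit. This self-contained sketch (qreq, qpkt, enqueue_capped, and MAX_QUEUED are illustrative names, not the repo's) drops new packets once a per-request limit is reached:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_QUEUED 8   /* illustrative cap; production stacks use ~3-10 */

struct qpkt { uint8_t *buf; unsigned int len; struct qpkt *next; };
struct qreq { struct qpkt *packets; unsigned int depth; };

/* Enqueue with a hard depth cap: once the cap is hit, drop the new
 * packet instead of letting an attacker grow the heap without bound. */
int enqueue_capped(struct qreq *req, const uint8_t *data, unsigned int len) {
    if (req->depth >= MAX_QUEUED)
        return -1;                          /* dropped: queue full */
    struct qpkt *p = malloc(sizeof *p);
    if (!p)
        return -1;
    p->buf = malloc(len);
    if (!p->buf) { free(p); return -1; }
    memcpy(p->buf, data, len);
    p->len = len;
    p->next = req->packets;                 /* prepend, as in queuereq */
    req->packets = p;
    req->depth++;
    return 0;
}
```

Dropping the newest packet (tail drop) is the simplest policy; some stacks instead drop the oldest, on the theory that the sender has likely retransmitted it already.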
Thread Safety: The Lock Granularity Question
Current Implementation (Coarse-Grained):
pthread_mutex_lock(&cache->lock);
// Entire cache locked
for (int i = 0; i < 100; i++) { ... }
pthread_mutex_unlock(&cache->lock);
Alternative (Fine-Grained):
// Lock per cache entry?
for (int i = 0; i < 100; i++) {
pthread_mutex_lock(&cache->entries[i].lock);
if (cache->entries[i].ip == target_ip) { ... }
pthread_mutex_unlock(&cache->entries[i].lock);
}
Why Coarse Wins Here:
- Cache operations are fast (~100 array comparisons take < 1μs)
- Fine-grained locking adds overhead (100 lock/unlock pairs)
- Contention is low (one sweep thread + occasional packet lookups)
Rule of thumb: Use fine-grained locks when critical sections are long (>10ms) or contentious (100+ threads).
Big-O Analysis
| Operation | Time Complexity | Notes |
|---|---|---|
| Lookup | O(n) where n=100 | Linear scan, could be O(1) with hash table |
| Insert | O(n) twice | Scan for empty slot + scan for LRU |
| Queue packet | O(m) where m=# reqs | Linear scan to find matching request |
| Sweep | O(m × p) | m requests × p packets per request |
Optimization Opportunity: Replace the requests linked list with a hash table keyed by IP for O(1) queueing.
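The bucketing itself is a one-liner. A sketch of a multiplicative hash over the IPv4 address (req_bucket and NBUCKETS are illustrative; with a per-bucket request list, finding the pending request for an IP becomes O(1) expected instead of an O(m) scan):

```c
#include <stdint.h>

#define NBUCKETS 64   /* power of two, so the hash maps cleanly */

/* Multiplicative hash (Knuth's 2^32 / phi constant): multiply, then
 * keep the top 6 bits, yielding a bucket index in [0, 63]. */
static inline unsigned int req_bucket(uint32_t ip) {
    return (ip * 2654435761u) >> 26;
}
```

Using the high bits of the product (rather than `ip % NBUCKETS`) mixes in all bytes of the address, so hosts on the same /24 don't all collide into one bucket.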
Edge Cases & Pitfalls
ARP Spoofing Attack
Scenario: Attacker sends fake ARP reply: “192.168.2.2 is at DE:AD:BE:EF:00:00”
Current Code Vulnerability:
// No validation of ARP sender!
sr_arpcache_insert(cache, arp_reply->ar_sha, arp_reply->ar_sip);
Defense (Not Implemented):
// Only accept replies for IPs we actually requested
if (arp_reply->ar_sip != req->ip) {
return; // Ignore unsolicited replies
}
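The check above can be made concrete as a scan of the pending-request list. In this stand-alone sketch (struct pending and reply_is_solicited are illustrative stand-ins for the repo's sr_arpreq list), a reply is accepted only if we actually asked about that IP:

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal stand-in for the pending ARP request list. */
struct pending { uint32_t ip; struct pending *next; };

/* Accept an ARP reply only if an outstanding request exists for the
 * sender IP; unsolicited replies (the classic spoofing vector) are
 * ignored. */
int reply_is_solicited(const struct pending *reqs, uint32_t sender_ip) {
    for (; reqs != NULL; reqs = reqs->next)
        if (reqs->ip == sender_ip)
            return 1;
    return 0;
}
```

This doesn't stop an attacker who races the legitimate reply, but it eliminates gratuitous-reply poisoning, the cheapest form of the attack.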
Memory Leak on Cache Overflow
Scenario:
- Cache fills with 100 entries
- New ARP reply arrives for IP 10.0.1.2
- LRU eviction runs
- But what if queued packets referenced the old MAC?
Not a Problem Here Because: The code evicts cache entries (resolved mappings), not requests (pending). By the time we’re inserting a new MAC, the old request has already been destroyed (its packets were forwarded or timed out).
But Watch For: If you add negative caching (remembering “IP X doesn’t exist”), ensure you expire those too.
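Lazy expiry on lookup is the usual way to age out such entries. A sketch (neg_entry, neg_is_unreachable, and NEG_TTL are hypothetical names; the router doesn't implement negative caching):

```c
#include <stdint.h>
#include <time.h>

#define NEG_TTL 20   /* illustrative: forget "host unreachable" after 20 s */

struct neg_entry { uint32_t ip; time_t added; int valid; };

/* Check a negative-cache entry, expiring it lazily on lookup so a
 * host that comes online later isn't blackholed forever. */
int neg_is_unreachable(struct neg_entry *e, uint32_t ip, time_t now) {
    if (!e->valid || e->ip != ip)
        return 0;
    if (difftime(now, e->added) > NEG_TTL) {
        e->valid = 0;             /* entry aged out: drop it */
        return 0;
    }
    return 1;
}
```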
The Thundering Herd
Scenario:
- 1000 packets arrive for 192.168.2.99 (nonexistent host)
- All get queued
- After 5s timeout, we send 1000 ICMP errors simultaneously
- Outbound link gets flooded
Defense (Production Routers):
// Rate-limit ICMP generation
if (icmp_sent_this_second > 100) {
drop_silently();
}
RFC 1812 Section 4.3.2.8 requires rate-limiting ICMP to prevent amplification attacks.
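A minimal per-second budget is enough to satisfy that requirement. This self-contained sketch (icmp_limiter and icmp_allow are illustrative names; the threshold is arbitrary) resets the counter when the clock ticks over to a new second:

```c
#include <time.h>

/* Per-second ICMP budget: a sketch of RFC 1812-style rate limiting. */
struct icmp_limiter {
    time_t window;       /* which second the count applies to */
    unsigned int sent;   /* ICMP messages sent in that second */
    unsigned int limit;  /* maximum per second */
};

/* Returns 1 if an ICMP error may be sent now, 0 to drop silently. */
int icmp_allow(struct icmp_limiter *rl, time_t now) {
    if (now != rl->window) {         /* new second: reset the budget */
        rl->window = now;
        rl->sent = 0;
    }
    if (rl->sent >= rl->limit)
        return 0;
    rl->sent++;
    return 1;
}
```

A token bucket would smooth bursts better, but even this fixed window turns the 1000-error burst into at most `limit` messages per second.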
Conclusion
Skills Acquired
You’ve learned:
- Data Structure Design: When to use arrays vs. linked lists based on access patterns
- Thread Safety: Coarse vs. fine-grained locking tradeoffs
- State Machine Implementation: Time-based retries with timeout handling
- Memory Management: Deep copies, ownership transfer, preventing leaks
- Protocol Correctness: Implementing RFC-compliant ARP behavior
The Proficiency Marker: Most networking tutorials show “send ARP request, get reply, done.” You’ve implemented production-grade queuing logic that handles failures, retries, and resource limits—the same patterns used in Linux’s ARP stack, BSD’s routing code, and Cisco IOS.
Next Challenge: Implement ARP announcement on interface up (gratuitous ARP) or proxy ARP for transparent NAT scenarios.