Your Sensor Remembers Nothing: Building a Fixed-Memory Ring Buffer to Give Hardware a Short-Term History

What You Need to Know and Have Installed Before We Begin

Knowledge Base:

Basic C++ syntax: classes, constructors, member functions
A conceptual understanding of arrays and how indexing works
Familiarity with the template keyword (you don’t need to be an expert — we’ll explain every use)

Environment:

A C++17-compatible compiler (GCC 7+, Clang 5+, or MSVC 19.14+)
No external libraries required — this tutorial uses only <vector> and <cstddef> from the standard library

A Conveyor Belt That Never Gets Longer

A ring buffer is a fixed-size collection that overwrites its oldest element when full. Think of it as a sushi restaurant conveyor belt that holds exactly N plates. When a new plate comes out of the kitchen, the oldest plate at the far end falls off. You can look at any plate currently on the belt, but once something rolls off, it’s gone.

In the Sentinel Fall Detector, the CircularBuffer holds the last 50 sensor readings — exactly 500 milliseconds of history at 100 Hz. The fall detection algorithm needs to compare the user’s current posture against their posture half a second ago. A regular std::vector that grows unboundedly would eventually consume all available memory on a Raspberry Pi running 24/7. The ring buffer guarantees constant memory usage regardless of how long the system has been running.

In the diagram below, slots 0 and 1 contain previously written data. Slot 2 is where m_head currently points — the next write location. Slots 3 and 4 are empty (the buffer hasn’t filled yet). When push() is called, data goes into slot 2, and m_head advances to slot 3. Once m_head passes slot 4, it wraps back to slot 0, overwriting the oldest entry.

Writing a Ring Buffer From Scratch in 38 Lines

The class skeleton and constructor

We begin by defining a template class. The template <typename T> allows this buffer to hold any data type — sensor structs, integers, or anything else. The constructor pre-allocates a std::vector of the requested size and initializes our bookkeeping indices.

#include <vector>
#include <cstddef>

template <typename T>
class CircularBuffer {
public:
    // 'explicit' prevents accidental implicit conversions like:
    //   CircularBuffer<int> buf = 50;  // This would compile without 'explicit'
    explicit CircularBuffer(size_t size)
        : m_data(size),   // Pre-allocate the vector with 'size' default-constructed elements
          m_size(size),    // Remember the capacity
          m_head(0),       // Next write position starts at index 0
          m_count(0)       // No elements written yet
    {}

I use std::vector instead of a raw C array because the buffer size is determined at runtime (the constructor parameter). The vector’s internal memory lives on the heap, but the vector object itself — including m_head, m_count, and the pointer to that heap memory — lives wherever the CircularBuffer is placed, which in Sentinel’s case is as a member of ImuSensor (stack-allocated in main()).

The `push` method — writing data with wraparound

This is the core operation. We write the new item at the current m_head position, then advance m_head by one. The modulo operator (%) handles the wraparound: when m_head reaches the end of the array, it snaps back to 0.

    void push(const T& item) {
        m_data[m_head] = item;              // Overwrite whatever was at this position
        m_head = (m_head + 1) % m_size;     // Advance head, wrapping at the end
        if (m_count < m_size) m_count++;     // Track how many slots are actually filled
    }

The m_count guard matters: until the buffer has been completely filled for the first time, m_count is less than m_size. This prevents get_past() from reading uninitialized slots.

Reading the most recent entry

get_last() returns the element that was most recently pushed. Since m_head points to the next write location, the most recent write is one step behind it. The expression (m_head + m_size - 1) % m_size performs a backwards step with wraparound.

    T get_last() const {
        if (m_count == 0) return T{};  // Return a default-constructed T if buffer is empty
        return m_data[(m_head + m_size - 1) % m_size];
    }

T{} calls the default constructor. If T is a struct like SensorData with uninitialized double fields, the returned values will be zero. This is safe for Sentinel because the caller checks m_count before using the data — but in other contexts, returning a default object silently could mask bugs. An alternative is std::optional<T>.

Time-travel — reading N steps into the past

This is the method that makes the ring buffer useful for fall detection. get_past(49) retrieves the sensor reading from 49 pushes ago — 490 milliseconds in the past at 100 Hz.

    // Get item from N steps ago (0 = most recent, 1 = one before that, etc.)
    T get_past(size_t steps_ago) const {
        if (steps_ago >= m_count) return T{};  // Can't look further back than we have data
        size_t index = (m_head + m_size - 1 - steps_ago) % m_size;
        return m_data[index];
    }

The index formula works by starting at the most recent element (m_head - 1) and subtracting steps_ago additional positions, with + m_size to prevent unsigned integer underflow before the modulo.

Accessors and private members

    size_t count() const { return m_count; }  // How many elements have been written
    size_t size() const { return m_size; }     // Total capacity

private:
    std::vector<T> m_data;  // The underlying storage
    size_t m_size;          // Capacity
    size_t m_head;          // Index of the next write position
    size_t m_count;         // Number of valid elements (maxes out at m_size)
};

Why Not Just Use a `std::deque`? Memory Layout and Cache Performance

A std::deque supports efficient insertion and removal at both ends and could serve a similar purpose. Two important differences make the ring buffer the right choice here.

Memory layout. A std::vector stores its elements in a single contiguous block of memory. A std::deque stores elements in multiple non-contiguous chunks (typically 512-byte blocks). When the sensor loop iterates over recent history, contiguous memory means sequential cache-line fetches — the CPU prefetcher can predict the access pattern. Fragmented deque storage causes cache misses at chunk boundaries.

Allocation behavior. A deque dynamically allocates and deallocates chunk blocks as elements are added and removed. In a system running at 100 Hz indefinitely, this means heap allocations happening 100 times per second. The ring buffer allocates once at construction and never again — zero allocations during steady-state operation.

For a buffer of 50 SensorData structs (each approximately 88 bytes), the ring buffer uses a single 4,400-byte allocation. On a Raspberry Pi 5 with shared memory between CPU and GPU, minimizing heap fragmentation is not a theoretical concern.

Both push and get_past are O(1) — constant time regardless of buffer size. No shifts, no copies of the underlying array, no memory allocations after construction.

What Happens at the Boundaries

Empty buffer reads. Both get_last() and get_past() check m_count before accessing m_data. If the buffer is empty, they return T{}. In Sentinel, detectFall() separately guards against this by checking m_history.count() < 50 before calling get_past(49).

Unsigned integer underflow. The expression m_head - 1 - steps_ago could underflow for unsigned size_t if m_head is 0. Adding m_size before the subtraction guarantees the intermediate value stays positive. With m_head = 0, m_size = 50, steps_ago = 3: (0 + 50 - 1 - 3) % 50 = 46. Correct.

Thread safety. This buffer is not thread-safe. In Sentinel, all reads and writes happen on the same worker thread inside processSensorLoop(), so no synchronization is needed. If you were reading from a different thread, you’d need a mutex or a lock-free design.

The m_size member is redundant with m_data.size(). Since m_data is private, external code can’t call m_data.resize() — but in a teaching context, it’s worth noting that if they could diverge, the modulo arithmetic would produce out-of-bounds indices.

This 38-line class is the mechanism that gives the fall detection algorithm its memory of the past — enabling it to compare “now” against “500 milliseconds ago” without ever growing its memory footprint.

Your Sensor Remembers Nothing: Building a Fixed-Memory Ring Buffer to Give Hardware a Short-Term History

Technologies Used

What You Need to Know and Have Installed Before We Begin

A Conveyor Belt That Never Gets Longer

Writing a Ring Buffer From Scratch in 38 Lines

The class skeleton and constructor

The `push` method — writing data with wraparound

Reading the most recent entry

Time-travel — reading N steps into the past

Accessors and private members

Why Not Just Use a `std::deque`? Memory Layout and Cache Performance

What Happens at the Boundaries

Related Projects

Sentinel Fall Detector: Engineering Real-Time Safety With Sensor Fusion and Signal Processing

Your Sensor Remembers Nothing: Building a Fixed-Memory Ring Buffer to Give Hardware a Short-Term History

Technologies Used

What You Need to Know and Have Installed Before We Begin

A Conveyor Belt That Never Gets Longer

Writing a Ring Buffer From Scratch in 38 Lines

The class skeleton and constructor

The push method — writing data with wraparound

Reading the most recent entry

Time-travel — reading N steps into the past

Accessors and private members

Why Not Just Use a std::deque? Memory Layout and Cache Performance

What Happens at the Boundaries

Related Projects

Sentinel Fall Detector: Engineering Real-Time Safety With Sensor Fusion and Signal Processing

The `push` method — writing data with wraparound

Why Not Just Use a `std::deque`? Memory Layout and Cache Performance