On this page
- What You Need to Know and Have Installed Before We Begin
- The Conveyor Belt: How a Ring Buffer Thinks About Time
- Writing a Ring Buffer From Scratch in 38 Lines
- Step 1: The class skeleton and constructor
- Step 2: The push method — writing data with wraparound
- Step 3: Reading the most recent entry
- Step 4: Time-travel — reading N steps into the past
- Step 5: Accessors and private members
- Why Not Just Use a std::deque? Memory Layout and Cache Performance
- What Happens When the Buffer Is Empty, Overflows, or Gets Confused
- You Now Know How to Build a Zero-Allocation Sliding Window in C++
What You Need to Know and Have Installed Before We Begin
Knowledge Base:
- Basic C++ syntax: classes, constructors, member functions
- A conceptual understanding of arrays and how indexing works
- Familiarity with the
templatekeyword (you don’t need to be an expert — we’ll explain every use)
Environment:
- A C++17-compatible compiler (GCC 7+, Clang 5+, or MSVC 19.14+)
- No external libraries required — this tutorial uses only
<vector>and<cstddef>from the standard library
The Conveyor Belt: How a Ring Buffer Thinks About Time
A ring buffer is a fixed-size collection that overwrites its oldest element when full. Think of it as a conveyor belt at a sushi restaurant: the belt holds exactly N plates. When a new plate is added at the kitchen end, the oldest plate at the far end falls off. You can always look at any plate currently on the belt, but once it rolls off, it’s gone.
In the Sentinel Fall Detector, the CircularBuffer holds the last 50 sensor readings — exactly 500 milliseconds of history at 100 Hz. The fall detection algorithm needs to compare the user’s current posture against their posture half a second ago. A regular std::vector that grows unboundedly would eventually consume all available memory on a Raspberry Pi running 24/7. A ring buffer guarantees constant memory usage regardless of how long the system runs.
graph LR
subgraph "CircularBuffer (size=5)"
direction LR
S0["Slot 0"]
S1["Slot 1"]
S2["Slot 2"]
S3["Slot 3"]
S4["Slot 4"]
end
HEAD["m_head →"] --> S2
PUSH["push()"] --> S2
style S0 fill:#334155,color:#fff
style S1 fill:#334155,color:#fff
style S2 fill:#38bdf8,color:#000
style S3 fill:#1e293b,color:#555
style S4 fill:#1e293b,color:#555
In the diagram above, slots 0 and 1 contain previously written data. Slot 2 is where m_head currently points — the next write location. Slots 3 and 4 are empty (the buffer hasn’t filled yet). When push() is called, data goes into slot 2, and m_head advances to slot 3. Once m_head passes slot 4, it wraps back to slot 0, overwriting the oldest entry.
Writing a Ring Buffer From Scratch in 38 Lines
Step 1: The class skeleton and constructor
We begin by defining a template class. The template <typename T> allows this buffer to hold any data type — sensor structs, integers, or anything else. The constructor pre-allocates a std::vector of the requested size and initializes our bookkeeping indices.
#include <vector>
#include <cstddef>
template <typename T>
class CircularBuffer {
public:
// 'explicit' prevents accidental implicit conversions like:
// CircularBuffer<int> buf = 50; // This would compile without 'explicit'
explicit CircularBuffer(size_t size)
: m_data(size), // Pre-allocate the vector with 'size' default-constructed elements
m_size(size), // Remember the capacity
m_head(0), // Next write position starts at index 0
m_count(0) // No elements written yet
{}
🔵 Deep Dive: We use std::vector instead of a raw C array because the buffer size is determined at runtime (the constructor parameter). The vector’s internal memory lives on the heap, but the vector object itself — including m_head, m_count, and the pointer to that heap memory — lives wherever the CircularBuffer is placed, which in Sentinel’s case is as a member of ImuSensor (stack-allocated in main()).
Step 2: The push method — writing data with wraparound
This is the core operation. We write the new item at the current m_head position, then advance m_head by one. The modulo operator (%) handles the wraparound: when m_head reaches the end of the array, it snaps back to 0.
void push(const T& item) {
m_data[m_head] = item; // Overwrite whatever was at this position
m_head = (m_head + 1) % m_size; // Advance head, wrapping at the end
if (m_count < m_size) m_count++; // Track how many slots are actually filled
}
The m_count guard is important: until the buffer has been completely filled for the first time, m_count is less than m_size. This prevents the get_past() method from reading uninitialized slots.
Step 3: Reading the most recent entry
get_last() returns the element that was most recently pushed. Since m_head points to the next write location, the most recent write is one step behind it. The expression (m_head + m_size - 1) % m_size performs a backwards step with wraparound.
T get_last() const {
if (m_count == 0) return T{}; // Return a default-constructed T if buffer is empty
return m_data[(m_head + m_size - 1) % m_size];
}
🔴 Danger: T{} calls the default constructor. If T is a struct like SensorData with uninitialized double fields, the returned values will be zero. This is safe for Sentinel because the caller checks m_count before using the data — but in other contexts, returning a default object silently could mask bugs. An alternative is to return std::optional<T>.
Step 4: Time-travel — reading N steps into the past
This is the method that makes the ring buffer powerful for fall detection. get_past(49) retrieves the sensor reading from 49 pushes ago — 490 milliseconds in the past at 100 Hz.
// Get item from N steps ago (0 = most recent, 1 = one before that, etc.)
T get_past(size_t steps_ago) const {
if (steps_ago >= m_count) return T{}; // Can't look further back than we have data
size_t index = (m_head + m_size - 1 - steps_ago) % m_size;
return m_data[index];
}
The index formula works by starting at the most recent element (m_head - 1) and subtracting steps_ago additional positions, with + m_size to prevent unsigned integer underflow before the modulo.
Step 5: Accessors and private members
size_t count() const { return m_count; } // How many elements have been written
size_t size() const { return m_size; } // Total capacity
private:
std::vector<T> m_data; // The underlying storage
size_t m_size; // Capacity (redundant with m_data.size(), kept for clarity)
size_t m_head; // Index of the next write position
size_t m_count; // Number of valid elements (maxes out at m_size)
};
Why Not Just Use a std::deque? Memory Layout and Cache Performance
A std::deque supports efficient insertion and removal at both ends and could serve a similar purpose. However, there are two important differences in this context:
Memory layout. A std::vector stores its elements in a single contiguous block of memory. A std::deque stores elements in multiple non-contiguous chunks (typically 512-byte blocks). When the sensor loop iterates over recent history, contiguous memory means sequential cache-line fetches — the CPU prefetcher can predict the access pattern. Fragmented deque storage causes cache misses at chunk boundaries.
Allocation behavior. A deque dynamically allocates and deallocates chunk blocks as elements are added and removed. In a system running at 100 Hz indefinitely, this means heap allocations happening 100 times per second. The ring buffer allocates once at construction and never again — zero allocations during steady-state operation.
For a buffer of 50 SensorData structs (each approximately 88 bytes), the ring buffer uses a single 4,400-byte allocation. The std::deque would use at least 9 chunk allocations and a separate map array. On a Raspberry Pi 5 with shared memory between CPU and GPU, minimizing heap fragmentation is not a theoretical concern.
Time complexity: Both push and get_past are O(1) — constant time regardless of buffer size. There are no shifts, no copies of the underlying array, and no memory allocations after construction.
What Happens When the Buffer Is Empty, Overflows, or Gets Confused
Empty buffer reads. Both get_last() and get_past() check m_count before accessing m_data. If the buffer is empty, they return T{} — a default-constructed value. In Sentinel, detectFall() separately guards against this by checking m_history.count() < 50 before calling get_past(49).
Unsigned integer underflow. The expression m_head - 1 - steps_ago could underflow for unsigned size_t if m_head is 0. Adding m_size before the subtraction guarantees the intermediate value stays positive. For example, with m_head = 0, m_size = 50, steps_ago = 3: (0 + 50 - 1 - 3) % 50 = 46. Correct.
Thread safety. This buffer is not thread-safe. In Sentinel, all reads and writes happen on the same worker thread inside processSensorLoop(), so no synchronization is needed. If you were to read from a different thread, you would need a mutex or a lock-free design.
🔴 Danger: The m_size member is redundant with m_data.size(). If someone were to call m_data.resize() externally (not possible here since m_data is private, but worth noting in a teaching context), the two values would diverge, and the modulo arithmetic would produce out-of-bounds indices.
You Now Know How to Build a Zero-Allocation Sliding Window in C++
This tutorial covered a fundamental systems programming pattern: the ring buffer. You learned how modular arithmetic enables constant-time insertion and historical lookup without memory allocation, why contiguous memory layout matters for cache performance, and how template <typename T> lets a single implementation serve any data type. In the Sentinel codebase, this 38-line class is the mechanism that gives the fall detection algorithm a memory of the past — enabling it to compare “now” against “500 milliseconds ago” without ever growing its memory footprint.