
10 Milliseconds or Failure: Architecting a Real-Time Sensor Pipeline Across Two Threads Without Locking Up the UI

A deep dive into the engineering of a real-time IMU processing pipeline that runs on a worker thread while communicating with a Qt Quick UI on the main thread. Learn how to use `std::atomic` for lock-free shutdown signaling, Qt's queued signals for cross-thread communication, and self-correcting timing loops to maintain a 100 Hz sample rate without ever blocking the UI thread.

Published

Mon Apr 20 2026

Technologies Used

C++ IMU Qt
Advanced 31 minutes

A while(true) Loop and a UI Thread Can’t Share a CPU Peacefully

The Sentinel Fall Detector has two hard requirements that directly conflict with each other. It must read sensor data, run a Kalman filter, execute fall detection logic, and run FFT-based gait analysis, all within 10 milliseconds, 100 times per second, with as little jitter as possible. It must also render a Qt Quick dashboard with 3D rotations, animated status text, and a live force meter at a smooth frame rate.

If the sensor loop runs on the main (UI) thread, it blocks the Qt event loop every time it sleeps. The dashboard freezes. If anything in the UI blocks the sensor loop, readings are missed and the fall detection algorithm receives corrupted timing data.

imusensor.cpp solves this with a dedicated worker thread, std::atomic<bool> for lock-free shutdown signaling, and Qt’s queued signal mechanism for cross-thread communication — all without a single mutex. This tutorial examines how that works and, importantly, which shared state is safe to leave unprotected and why.

Before You Read This

Knowledge you need:

  • C++ classes, lambdas, member function pointers
  • Threading fundamentals: what a race condition is, why shared mutable state is dangerous
  • Basic Qt: signals and slots, QObject, Q_PROPERTY
  • The Kalman filter tutorial — ImuSensor calls into that code

Environment:

  • C++17 compiler
  • Qt 6 (Core, Gui, Quick)
  • Linux with I2C support
  • CMake 3.16+ with CMAKE_AUTOMOC ON

The Factory and the Showroom

The cleanest way to think about this architecture: ImuSensor is a factory (the worker thread) and a showroom (the UI thread) sharing the same building. The factory stamps out a product every 10 milliseconds. The showroom displays the latest product to visitors at its own pace. A conveyor belt (Qt’s queued signals) carries finished products from factory to showroom. The factory’s emergency stop button (std::atomic<bool>) can be read and written from either floor — no need to halt the line to check it.

sequenceDiagram
    participant Main as Main Thread (UI)
    participant Worker as Worker Thread (Sensor)
    participant I2C as I2C Bus (Hardware)

    Main->>Worker: QThread::create(lambda) → start()
    Note over Main: UI renders at display refresh rate

    loop Every 10ms
        Worker->>I2C: write(register address)
        I2C-->>Worker: read(12 bytes: gyro + accel)
        Worker->>Worker: Kalman filter update
        Worker->>Worker: detectFall()
        Worker->>Worker: FFT gait analysis (every 128 samples)
        Worker->>Worker: Write m_pitch, m_roll (every 2nd sample)
        Worker-->>Main: emit pitchChanged() [queued signal]
    end

    Main->>Worker: m_running.store(false) [atomic]
    Worker-->>Main: Thread exits → wait() → delete

Launching the Worker Thread Safely

The sensor loop is a while(m_running) loop with a 10ms sleep. It cannot run on the main thread. startSensing() creates a QThread with a lambda that captures this:

void ImuSensor::startSensing() {
    if (m_running) return;

    if (!initI2C()) {
        emit statusUpdated("Error: I2C Init Failed");
        return;
    }

    m_running = true;  // Set BEFORE thread starts
    m_workerThread = QThread::create([this]{ processSensorLoop(); });
    m_workerThread->start();
    emit statusUpdated("Monitoring (Advanced)...");
}

Setting m_running = true before calling start() is essential. If the thread could enter the loop before the flag was set, it would see false and exit immediately. Because the write is sequenced before start(), and starting a thread synchronizes with the beginning of the new thread's function, the worker is guaranteed to see true. Making the flag a std::atomic<bool> is what keeps the later write in stopSensing() well-defined: the main thread can store false while the worker is reading the flag, with no data race and no mutex.

The Timing Loop: Sleeping Only What’s Left

The loop measures its own execution time and sleeps only for the remaining budget. If processing takes 3ms, it sleeps 7ms. If it takes 9ms, it sleeps 1ms:

void ImuSensor::processSensorLoop() {
    while (m_running) {
        auto start = std::chrono::steady_clock::now();

        // I2C read, Kalman update, fall detection, FFT

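        auto end = std::chrono::steady_clock::now();
        // count() returns a 64-bit integer number of whole milliseconds
        auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
            end - start).count();
        int sleepTime = 10 - static_cast<int>(elapsed);  // remaining budget in ms
        if (sleepTime > 0) QThread::msleep(sleepTime);   // skip the sleep on overrun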
    }
}

I use std::chrono::steady_clock rather than system_clock because steady_clock is monotonic: it is unaffected when NTP or an administrator adjusts the wall clock. A backward jump with system_clock would produce a negative elapsed value, which makes sleepTime larger than 10 and stretches that cycle well past its budget. A forward jump would look like an overrun and skip the sleep. steady_clock is guaranteed never to go backward.

If processing overruns (takes 10ms or more), sleepTime is zero or negative and the guard skips the sleep entirely. The next iteration starts immediately. The Kalman filter cannot compensate for this, because DT is a constant: it assumes every sample arrived exactly 10ms after the previous one, so sustained overruns cause the filter's time model to drift from reality. Brief overruns from OS scheduling jitter are negligible. Sustained overruns from I2C bus stalls are a problem.
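One limitation of the loop above is that duration_cast truncates to whole milliseconds and each sleep is measured from the end of the previous iteration, so small errors can accumulate. A common alternative is to sleep until an absolute deadline that advances by exactly one period per iteration. The following is a minimal sketch of that idea, not the code from imusensor.cpp; runPeriodic and its parameters are hypothetical names, and it uses std::this_thread instead of QThread so it compiles without Qt.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: runs `work` `iterations` times on a fixed schedule
// and returns how many iterations finished after their deadline.
int runPeriodic(std::chrono::microseconds period, int iterations,
                const std::function<void()>& work) {
    using Clock = std::chrono::steady_clock;
    int overruns = 0;
    auto deadline = Clock::now() + period;   // end of the first slot

    for (int i = 0; i < iterations; ++i) {
        work();

        if (Clock::now() > deadline) {
            ++overruns;                      // slot missed: do not sleep
        } else {
            std::this_thread::sleep_until(deadline);
        }
        // Advance relative to the schedule, not to "now", so rounding
        // and wake-up latency do not accumulate from cycle to cycle.
        deadline += period;
    }
    return overruns;
}
```

With this shape, a single late iteration is followed by shorter sleeps until the loop is back on schedule, which keeps the average rate at the nominal period. Whether that catch-up behavior is desirable depends on how the filter treats DT, so treat it as a design choice rather than a drop-in fix.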

Reading Raw Bytes from the Sensor

The LSM6DSO32 delivers gyroscope and accelerometer data as raw 16-bit little-endian integers. Each axis requires two bytes, combined and then scaled to physical units:

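unsigned char reg = LSM_OUTX_L_G;
unsigned char data[12];
// Select the first output register, then burst-read gyro + accel
if (write(i2c_file, &reg, 1) == 1 && read(i2c_file, data, 12) == 12) {
    // Combine two bytes into a signed 16-bit integer (little-endian).
    // Unsigned bytes avoid left-shifting a negative char.
    int16_t gx_raw = static_cast<int16_t>((data[1] << 8) | data[0]);
    int16_t ax_raw = static_cast<int16_t>((data[7] << 8) | data[6]);
    // ... the remaining axes follow the same pattern

    // Convert to physical units
    // Gyro: 2000 dps range → 70 mdps/LSB → radians/second
    cur.gx = gx_raw * 0.070 * (M_PI / 180.0);

    // Accel: ±8g range → 0.244 mg/LSB → g
    cur.ax = ax_raw * 0.000244;
}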

Reading all 12 bytes in one burst is intentional. It is faster than six separate reads, and it keeps all six axes inside a single I2C transaction, so they are far more likely to come from the same sample than values gathered across multiple transactions.
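The byte handling is easy to get subtly wrong, so here is a small self-contained sketch of the decode step that can be tested without hardware. decodeLE16, decodeBurst, ImuSample, and the constant names are illustrative and not taken from imusensor.cpp. The byte layout (gyro X, Y, Z followed by accel X, Y, Z) is assumed from the 12-byte burst starting at the first gyro output register, and the scale factors are the ones quoted above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Combine two little-endian bytes into a signed 16-bit value.
// Working on unsigned bytes avoids shifting a negative value.
inline std::int16_t decodeLE16(std::uint8_t lo, std::uint8_t hi) {
    const std::uint16_t raw = static_cast<std::uint16_t>(
        (static_cast<std::uint16_t>(hi) << 8) | lo);
    return static_cast<std::int16_t>(raw);  // two's complement reinterpretation
}

constexpr double kPi = 3.14159265358979323846;
constexpr double kGyroRadPerLsb = 0.070 * (kPi / 180.0);  // 70 mdps/LSB
constexpr double kAccelGPerLsb = 0.000244;                // 0.244 mg/LSB

struct ImuSample {
    double gx, gy, gz;  // rad/s
    double ax, ay, az;  // g
};

// Decode one 12-byte burst: bytes 0-5 gyro, bytes 6-11 accel (assumed layout).
inline ImuSample decodeBurst(const std::uint8_t (&d)[12]) {
    ImuSample s{};
    s.gx = decodeLE16(d[0], d[1]) * kGyroRadPerLsb;
    s.gy = decodeLE16(d[2], d[3]) * kGyroRadPerLsb;
    s.gz = decodeLE16(d[4], d[5]) * kGyroRadPerLsb;
    s.ax = decodeLE16(d[6], d[7]) * kAccelGPerLsb;
    s.ay = decodeLE16(d[8], d[9]) * kAccelGPerLsb;
    s.az = decodeLE16(d[10], d[11]) * kAccelGPerLsb;
    return s;
}
```

Keeping the decode in a pure function like this means the sign handling and scaling can be unit-tested with hand-written byte arrays, while the I2C read stays a thin wrapper around it.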

Throttled UI Updates via Queued Signals

The sensor runs at 100 Hz. Updating the UI at 100 Hz is wasteful: most displays refresh at 60 Hz, and QML property bindings have overhead. A simple counter publishes every second sample, which throttles updates to 50 Hz:

if (m_analysisCounter++ % 2 == 0) {
    m_pitch = cur.pitch;
    m_roll = cur.roll;
    m_yaw += cur.gz * (DT * 2.0);
    m_totalAccel = cur.total_accel;

    emit pitchChanged();
    emit rollChanged();
    emit yawChanged();
    emit sensorUpdated();
}

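When a signal is emitted from a thread different from the receiver's thread, Qt's default connection type (Qt::AutoConnection) delivers it as a queued connection. Any signal arguments are copied into an event object and posted to the receiving thread's event queue. The worker thread is never blocked waiting for the UI to handle it: it posts and immediately continues. Note that the change signals here carry no arguments, so the UI fetches the new values by calling the property getters on the main thread.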
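One detail of the throttled block is that yaw is integrated only inside the publish branch, using DT * 2.0, so every other gz sample is skipped. A variation is to integrate on every sample and only gate the publishing. The sketch below shows that split in plain C++ so it runs without Qt; UiThrottle, onSample, and the constants are hypothetical names, with kDt assumed to match the 10ms sample period.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: integrates yaw at the full sample rate and tells
// the caller when it is time to copy values and emit the change signals.
struct UiThrottle {
    static constexpr double kDt = 0.01;  // assumed 100 Hz sample period (s)
    static constexpr int kDivisor = 2;   // publish every 2nd sample (50 Hz)

    double yaw = 0.0;
    unsigned counter = 0;

    // Call once per sensor sample with the Z gyro rate in rad/s.
    // Returns true on samples that should be published to the UI.
    bool onSample(double gz) {
        yaw += gz * kDt;                       // uses every sample
        return (counter++ % kDivisor) == 0;    // gates only the publish
    }
};
```

In the worker loop, the return value would replace the m_analysisCounter check, and the emit calls would stay inside the branch exactly as before.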

Shutting Down Without Deadlock

Stopping the system requires two threads to coordinate. The main thread writes the flag; the worker thread reads it and exits:

void ImuSensor::stopSensing() {
    m_running = false;  // Atomic write — worker sees this within 10ms
    if (m_workerThread) {
        m_workerThread->quit();  // No effect here: the lambda runs no event loop
        m_workerThread->wait();  // Block until thread exits
        delete m_workerThread;
        m_workerThread = nullptr;
    }
}

The wait() call blocks the main thread while the worker finishes its current cycle, which is normally at most about 10ms but can be longer if an I2C read stalls. This is acceptable during shutdown. Called in the middle of normal operation, the same block would stall the event loop and drop frames.
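The start and stop handshake can be reduced to a few lines and exercised outside Qt. This is a minimal sketch of the same pattern using std::thread in place of QThread; the Worker class and its members are illustrative names, and the loop body is a stand-in for the real sensor work.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

class Worker {
public:
    ~Worker() { stop(); }  // never destroy a joinable std::thread

    void start() {
        // exchange() sets the flag and reports whether it was already set,
        // so a second start() call is a no-op.
        if (running_.exchange(true)) return;
        thread_ = std::thread([this] { loop(); });
    }

    void stop() {
        running_.store(false);                   // request shutdown
        if (thread_.joinable()) thread_.join();  // block until the loop exits
    }

    int iterations() const { return iterations_.load(); }

private:
    void loop() {
        while (running_.load()) {
            iterations_.fetch_add(1);            // stand-in for sensor work
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> running_{false};
    std::atomic<int> iterations_{0};
    std::thread thread_;
};
```

The flag is set before the thread is created and cleared before the join, which mirrors the ordering in startSensing() and stopSensing().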

Which Race Conditions Actually Exist Here

m_running is safe. The main thread writes it and the worker reads it on every iteration. This is exactly what std::atomic is designed for. No mutex is needed, and the overhead is minimal: an atomic load typically costs a few nanoseconds, while even an uncontended mutex lock and unlock costs roughly an order of magnitude more.

m_pitch, m_roll, and friends are technically unsafe but practically fine. These double values are written by the worker and read by the main thread through Q_PROPERTY getters. Under the C++ memory model, an unsynchronized write and read of the same non-atomic object is a data race, which is formally undefined behavior, and the standard doesn't guarantee that double writes are atomic. In practice, on 64-bit ARM (the Raspberry Pi 5's Cortex-A76), naturally aligned 8-byte loads and stores are atomic at the hardware level, so a torn value should not occur on this target. On a platform without that property, the worst case is a torn read: the main thread reads m_pitch mid-write and gets a value from neither the old nor new sample. For a display property that updates 50 times per second, one corrupted frame is invisible. For fall detection decisions, the system never reads these properties: detectFall() operates on the local SensorData cur variable that lives entirely on the worker thread's stack.
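If the formal data race on the display values is a concern, one low-cost option is to declare them as std::atomic<double> and use relaxed ordering, since the UI only needs some recent value and no ordering relative to other data. This is a sketch of that option, not what imusensor.cpp does; SharedAngles and its member names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical holder for values written by the worker and read by the UI.
struct SharedAngles {
    std::atomic<double> pitch{0.0};
    std::atomic<double> roll{0.0};

    // Called on the worker thread.
    void publish(double p, double r) {
        pitch.store(p, std::memory_order_relaxed);
        roll.store(r, std::memory_order_relaxed);
    }

    // Called on the UI thread, for example from Q_PROPERTY getters.
    double readPitch() const { return pitch.load(std::memory_order_relaxed); }
    double readRoll() const { return roll.load(std::memory_order_relaxed); }
};
```

Each value is now individually well-defined, but pitch and roll are still not updated as a pair: a reader can see the new pitch with the old roll. For a dashboard that is usually acceptable; a consistent snapshot of several values would need a different mechanism.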

The static in detectFall() is a hidden assumption. The static int potentialFallTimeout = 0 inside detectFall() persists across calls, and a function-local static is shared by every caller rather than owned by one ImuSensor object. Because detectFall() is only ever called from the worker thread, this is safe. But if a unit test ever calls it from a second thread in parallel, you have shared mutable state without synchronization. A member variable would be the safer design.
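To make the member-variable suggestion concrete, here is a sketch of a detector that owns its timeout counter. The detection logic and thresholds are placeholders invented for illustration, since the body of detectFall() is not shown in this tutorial; only the idea of moving potentialFallTimeout from a function-local static into per-object state comes from the discussion above. FallDetector, update, and reset are hypothetical names.

```cpp
#include <cassert>

class FallDetector {
public:
    // Placeholder logic: a low-g dip arms a window, and a high-g spike
    // inside that window counts as a fall. Returns true on detection.
    bool update(double totalAccelG) {
        if (totalAccelG < kFreeFallG) {
            potentialFallTimeout_ = kWindowSamples;  // arm the window
            return false;
        }
        if (potentialFallTimeout_ > 0) {
            --potentialFallTimeout_;
            if (totalAccelG > kImpactG) {
                potentialFallTimeout_ = 0;           // disarm after a hit
                return true;
            }
        }
        return false;
    }

    void reset() { potentialFallTimeout_ = 0; }

private:
    // Illustrative values only, not tuned and not from the original code.
    static constexpr double kFreeFallG = 0.4;
    static constexpr double kImpactG = 3.0;
    static constexpr int kWindowSamples = 100;

    int potentialFallTimeout_ = 0;  // per-object state, no hidden sharing
};
```

Because the counter lives in the object, two detectors never influence each other, tests can construct a fresh instance per case, and reset() gives a clean state when sensing is stopped and restarted.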

The I2C file descriptor is safe as written. It’s opened in initI2C() (called from startSensing() on the main thread) before the worker thread starts. The open completes before start() is called, which establishes a happens-before relationship. If initI2C() were ever callable during active sensing, you’d have a race on the descriptor.

These patterns — dedicated worker threads with atomic shutdown flags, self-correcting timing loops, throttled UI updates via queued signals — appear in any system bridging deterministic processing with interactive display: robotics control, audio engines, game physics, industrial monitoring.
