C++ Standard Library Concurrency: Threads, Mutexes, Futures

The C++ Standard Library provides a portable concurrency model built on four main headers: <thread> for launching and managing threads, <mutex> for mutual exclusion, <condition_variable> for waiting on and signaling state changes between threads, and <future> for asynchronous task results. The <atomic> header complements these with atomic operations that are typically lock-free for small types (check is_lock_free() when it matters). Together they cover the range from low-level thread management to high-level asynchronous task dispatch.

In MSVC, code compiled with /clr (C++/CLI) cannot use <thread>, <mutex>, or <future> — these headers are blocked in that mode. Use the Windows thread pool APIs or the Concurrency Runtime (ConcRT) for managed/native interop scenarios.

`<thread>` — Thread Management

`std::thread`

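std::thread represents a single thread of execution. You pass a callable (function, lambda, functor, or pointer-to-member) plus any arguments to the constructor, and the thread starts immediately. The arguments are copied or moved into the new thread; wrap an argument in std::ref to pass it by reference.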

#include <thread>
#include <iostream>
#include <vector>

void worker(int id, int iterations)
{
    for (int i = 0; i < iterations; ++i)
        std::cout << "Thread " << id << " iteration " << i << "\n";
}

int main()
{
    // Launch two threads
    std::thread t1(worker, 1, 3);
    std::thread t2(worker, 2, 3);

    // Must join or detach before thread object is destroyed
    t1.join();  // wait for t1 to finish
    t2.join();  // wait for t2 to finish

    // Lambda thread
    std::thread t3([] {
        std::cout << "Lambda thread id: "
                  << std::this_thread::get_id() << "\n";
    });
    t3.join();

    // Share a local with the thread by capturing it by reference in the lambda
    int counter = 0;
    std::thread t4([&counter] { counter = 42; });
    t4.join();
    std::cout << "counter = " << counter << "\n"; // 42
}
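
The constructor copies (or moves) its arguments into the new thread, so a function that expects a reference needs std::ref, and a pointer-to-member callable takes the object as its first argument. The sketch below illustrates both cases; `Accumulator`, `fill`, `ref_demo`, and `member_demo` are hypothetical names introduced only for this example.

```cpp
#include <functional> // std::ref
#include <thread>

// Hypothetical type used only for this illustration.
struct Accumulator
{
    int total = 0;
    void add(int n) { total += n; }
};

// Free function that writes through a reference parameter.
void fill(int& out, int value) { out = value; }

int ref_demo()
{
    int result = 0;
    // Without std::ref the thread would try to bind `out` to its own copy,
    // which does not compile for a non-const lvalue reference parameter.
    std::thread t(fill, std::ref(result), 42);
    t.join();
    return result;
}

int member_demo()
{
    Accumulator acc;
    // Pointer-to-member callable: the next argument is the object to call it on.
    std::thread t(&Accumulator::add, &acc, 5);
    t.join();
    return acc.total;
}
```

In both functions the referenced object outlives the thread because join() is called before the function returns.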

If a std::thread object is destroyed while it is still joinable (neither joined nor detached), the program calls std::terminate. Always call join() or detach() before the thread object goes out of scope.
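
For code that cannot use C++20, one way to honor the join-or-detach rule even when an exception unwinds the stack is a small RAII wrapper. The following is a minimal sketch, not a standard facility: `JoiningThread` and `run_and_join` are hypothetical names chosen for this example.

```cpp
#include <thread>
#include <utility>

// Hypothetical helper (not part of the standard library): owns a std::thread
// and joins it in the destructor, so leaving scope never hits std::terminate.
class JoiningThread
{
    std::thread t_;

public:
    explicit JoiningThread(std::thread t) : t_(std::move(t)) {}

    JoiningThread(const JoiningThread&) = delete;
    JoiningThread& operator=(const JoiningThread&) = delete;

    ~JoiningThread()
    {
        if (t_.joinable())
            t_.join();
    }
};

// Starts a worker that doubles `value`; the guard joins before `out` is read.
int run_and_join(int value)
{
    int out = 0;
    {
        JoiningThread guard{std::thread([&out, value] { out = value * 2; })};
        // ... code here may throw; the guard still joins during unwinding ...
    } // ~JoiningThread joins here
    return out;
}
```

Joining in a destructor can block for as long as the worker runs, so this pattern suits workers that are known to finish.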

`std::jthread` — Automatically Joining Thread (C++20)

std::jthread extends std::thread with two important additions: its destructor requests a stop and then joins automatically (eliminating the risk of std::terminate from a forgotten join), and it supports cooperative cancellation via std::stop_token.

#include <thread>
#include <iostream>
#include <chrono>

int main()
{
    // jthread joins automatically when it goes out of scope
    std::jthread t([](std::stop_token stoken) {
        while (!stoken.stop_requested()) {
            std::cout << "Working...\n";
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
        }
        std::cout << "Stopped gracefully.\n";
    });

    std::this_thread::sleep_for(std::chrono::milliseconds(350));
    t.request_stop(); // signal the thread to stop
    // destructor joins automatically — no explicit join() needed
}

Thread Utility Functions

#include <thread>
#include <iostream>
#include <chrono>

int main()
{
    // Get current thread id
    auto id = std::this_thread::get_id();
    std::cout << "Main thread id: " << id << "\n";

    // Hint to the scheduler to yield
    std::this_thread::yield();

    // Sleep for a duration
    std::this_thread::sleep_for(std::chrono::milliseconds(50));

    // Sleep until a point in time
    auto wake = std::chrono::steady_clock::now() + std::chrono::seconds(1);
    std::this_thread::sleep_until(wake);

    // Hardware concurrency hint (returns 0 if the value cannot be determined)
    unsigned hw = std::thread::hardware_concurrency();
    std::cout << "Logical cores: " << hw << "\n";
}
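
Because hardware_concurrency() is only a hint and may return 0, code that sizes a group of threads from it usually needs a fallback. The sketch below shows one possible approach; `choose_worker_count`, `run_one_task_per_worker`, and the fallback value of 2 are assumptions made for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Picks a worker count, falling back to `fallback` when the hint is 0.
unsigned choose_worker_count(unsigned fallback = 2)
{
    const unsigned hw = std::thread::hardware_concurrency();
    return hw != 0 ? hw : fallback;
}

// Launches one thread per chosen worker and returns how many of them ran.
unsigned run_one_task_per_worker()
{
    std::atomic<unsigned> ran{0};
    std::vector<std::thread> pool;
    const unsigned n = choose_worker_count();
    for (unsigned i = 0; i < n; ++i)
        pool.emplace_back([&ran] { ran.fetch_add(1); });
    for (auto& t : pool)
        t.join();
    return ran.load();
}
```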

`<mutex>` — Mutual Exclusion

Mutex Types

| Type | Description |
| --- | --- |
| `std::mutex` | Non-recursive, non-timed basic mutex |
| `std::recursive_mutex` | Allows the same thread to lock multiple times |
| `std::timed_mutex` | Adds `try_lock_for()` and `try_lock_until()` |
| `std::recursive_timed_mutex` | Recursive + timed |
| `std::shared_mutex` (C++17, declared in `<shared_mutex>`) | Reader-writer lock (`lock_shared` / `lock`) |
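
The timed variants are useful when a thread should give up rather than wait indefinitely. As a minimal sketch, assuming a hypothetical shared resource guarded by `resource_mutex`, a std::unique_lock can be constructed with a timeout, which calls try_lock_for() internally:

```cpp
#include <chrono>
#include <mutex>
#include <thread>

std::timed_mutex resource_mutex; // hypothetical lock for some shared resource

// Tries to acquire the lock for up to `timeout`; returns whether it succeeded.
bool try_do_work(std::chrono::milliseconds timeout)
{
    std::unique_lock<std::timed_mutex> lock(resource_mutex, timeout);
    if (!lock.owns_lock())
        return false; // timed out, the caller can retry or report an error
    // ... critical section ...
    return true;
}

// Holds the lock on the calling thread and checks that another thread's
// short timed attempt fails instead of blocking forever.
bool times_out_while_held()
{
    std::unique_lock<std::timed_mutex> holder(resource_mutex);
    bool acquired = true;
    std::thread t([&acquired] {
        acquired = try_do_work(std::chrono::milliseconds(20));
    });
    t.join();
    return !acquired;
}
```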

RAII Lock Guards

Always prefer RAII wrappers over manual lock()/unlock() — they guarantee the mutex is released even if an exception is thrown.

The three examples below show, in order, lock_guard (simplest), unique_lock (flexible), and scoped_lock (C++17, multi-mutex).

#include <mutex>
#include <iostream>
#include <thread>
#include <vector>

std::mutex mtx;
int shared_counter = 0;

void increment(int n)
{
    for (int i = 0; i < n; ++i) {
        std::lock_guard<std::mutex> lock(mtx); // locked here
        ++shared_counter;
    } // lock released automatically here
}

int main()
{
    std::vector<std::thread> threads;
    for (int i = 0; i < 10; ++i)
        threads.emplace_back(increment, 1000);

    for (auto& t : threads) t.join();

    std::cout << "counter = " << shared_counter << "\n"; // 10000
}

#include <mutex>
#include <iostream>

std::mutex mtx;

void example()
{
    // unique_lock can be unlocked and re-locked manually
    std::unique_lock<std::mutex> lock(mtx);

    // Critical section
    std::cout << "Locked\n";

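    lock.unlock(); // release early if needed
    // ... do non-critical work ...
    lock.lock();   // re-acquire
    lock.unlock(); // release again before a second lock object takes mtx

    // Deferred locking: construct without locking, then lock later
    std::unique_lock<std::mutex> lock2(mtx, std::defer_lock);
    lock2.lock(); // safe only because `lock` no longer holds mtx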
}

#include <mutex>
#include <iostream>

std::mutex m1, m2;

void transfer()
{
    // Locks both mutexes with a deadlock-avoidance algorithm (as std::lock does)
    std::scoped_lock lock(m1, m2);
    std::cout << "Both locked\n";
} // both released here
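
To see why the multi-mutex form matters, consider two threads that lock the same pair of mutexes in opposite order. The sketch below assumes a hypothetical `Account` type; with two nested lock_guard objects the opposing transfers could deadlock, whereas std::scoped_lock acquires both mutexes safely regardless of argument order.

```cpp
#include <mutex>
#include <thread>

// Hypothetical account type for illustration.
struct Account
{
    std::mutex m;
    long balance = 0;
};

void transfer(Account& from, Account& to, long amount)
{
    if (&from == &to)
        return; // locking the same std::mutex twice is undefined behavior
    std::scoped_lock lock(from.m, to.m); // order of arguments does not matter
    from.balance -= amount;
    to.balance += amount;
}

// Runs transfers in both directions concurrently and returns the total.
long run_opposing_transfers()
{
    Account a, b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < 1000; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < 1000; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance; // money is conserved
}
```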

`std::shared_mutex` — Reader-Writer Lock (C++17)

#include <shared_mutex>
#include <thread>
#include <string>
#include <iostream>
#include <vector>

class Cache
{
    mutable std::shared_mutex rw_mutex; // mutable so the const read() can lock it
    std::string data;

public:
    // Multiple readers can hold the shared lock simultaneously
    std::string read() const
    {
        std::shared_lock<std::shared_mutex> lock(rw_mutex);
        return data;
    }

    // Only one writer at a time; blocks all readers
    void write(const std::string& value)
    {
        std::unique_lock<std::shared_mutex> lock(rw_mutex);
        data = value;
    }
};

int main()
{
    Cache cache;
    cache.write("initial");

    std::vector<std::thread> readers;
    for (int i = 0; i < 5; ++i)
        readers.emplace_back([&cache, i] {
            std::cout << "Reader " << i << ": " << cache.read() << "\n";
        });

    for (auto& t : readers) t.join();
}

`std::call_once` — One-Time Initialization

#include <mutex>
#include <iostream>
#include <thread>

std::once_flag init_flag;

void initialize()
{
    std::cout << "Initialized (called exactly once)\n";
}

void worker()
{
    std::call_once(init_flag, initialize);
    // ... do work ...
}

int main()
{
    std::thread t1(worker), t2(worker), t3(worker);
    t1.join(); t2.join(); t3.join();
}

`<condition_variable>` — Thread Synchronization

std::condition_variable allows threads to wait until a condition becomes true, avoiding busy-waiting.

#include <condition_variable>
#include <mutex>
#include <thread>
#include <queue>
#include <iostream>
#include <chrono>

// Producer-Consumer pattern using condition_variable
std::queue<int>              task_queue;
std::mutex                   queue_mutex;
std::condition_variable      cv;
bool                         done = false;

void producer(int n)
{
    for (int i = 0; i < n; ++i) {
        {
            std::lock_guard<std::mutex> lock(queue_mutex);
            task_queue.push(i);
            std::cout << "Produced: " << i << "\n";
        }
        cv.notify_one(); // wake one waiting consumer
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
    }
    {
        std::lock_guard<std::mutex> lock(queue_mutex);
        done = true;
    }
    cv.notify_all(); // wake all consumers so they can exit
}

void consumer(int id)
{
    while (true) {
        std::unique_lock<std::mutex> lock(queue_mutex);

        // Wait until there is work or we are done
        cv.wait(lock, [] { return !task_queue.empty() || done; });

        while (!task_queue.empty()) {
            int task = task_queue.front();
            task_queue.pop();
            lock.unlock();
            std::cout << "Consumer " << id << " processed: " << task << "\n";
            lock.lock();
        }

        if (done && task_queue.empty())
            return;
    }
}

int main()
{
    std::thread prod(producer, 8);
    std::thread cons1(consumer, 1);
    std::thread cons2(consumer, 2);

    prod.join();
    cons1.join();
    cons2.join();
}

Always use cv.wait(lock, predicate) rather than the predicate-less cv.wait(lock). The predicate form re-checks the condition after every wakeup, so it handles spurious wakeups (a thread may wake even though no notify call was made), and it returns immediately if the condition was already true before the wait started.
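
The same predicate style works with a timeout. As a minimal sketch, assuming a hypothetical boolean flag protected by `flag_mutex`, wait_for() with a predicate returns the predicate's value when it stops waiting, so a `false` result means the timeout expired with the condition still unmet:

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical shared state for this example.
std::mutex              flag_mutex;
std::condition_variable flag_cv;
bool                    flag_ready = false;

// Waits up to `timeout` for flag_ready; returns true if it became true.
bool wait_for_flag(std::chrono::milliseconds timeout)
{
    std::unique_lock<std::mutex> lock(flag_mutex);
    return flag_cv.wait_for(lock, timeout, [] { return flag_ready; });
}

// Sets the flag under the mutex, then notifies after releasing it.
void set_flag()
{
    {
        std::lock_guard<std::mutex> lock(flag_mutex);
        flag_ready = true;
    }
    flag_cv.notify_all();
}
```

Modifying the flag while holding the mutex is what prevents a notification from slipping in between a waiter's predicate check and its going to sleep.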

`<future>` — Asynchronous Tasks

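`std::async` — Asynchronous Function Calls

std::async runs a callable asynchronously (or lazily, depending on the launch policy) and returns a std::future through which you retrieve the result or any exception the callable threw. It is not fire-and-forget: the destructor of a future returned by std::async blocks until the task finishes, so keep the future and call get() or wait() when you need the result.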

#include <future>
#include <iostream>
#include <vector>
#include <numeric>

// A CPU-intensive function
double sum_range(std::vector<double>::iterator first,
                 std::vector<double>::iterator last)
{
    return std::accumulate(first, last, 0.0);
}

int main()
{
    std::vector<double> data(1'000'000, 1.0);
    auto mid = data.begin() + data.size() / 2;

    // Launch two halves in parallel
    auto f1 = std::async(std::launch::async, sum_range, data.begin(), mid);
    auto f2 = std::async(std::launch::async, sum_range, mid,          data.end());

    double total = f1.get() + f2.get(); // get() blocks until ready
    std::cout << "Total: " << total << "\n"; // 1000000
}

| Launch policy | Behavior |
| --- | --- |
| `std::launch::async` | Runs asynchronously, as if on a new thread |
| `std::launch::deferred` | Runs lazily in the calling thread when `get()` or `wait()` is called |
| `std::launch::async \| std::launch::deferred` | Implementation chooses (default) |
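
The difference between the policies can be observed by comparing thread ids. The following sketch uses two hypothetical helper functions; it relies on wait_for() with a zero timeout, which reports `std::future_status::deferred` for a task that has not been started.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Returns true if a deferred task runs on the thread that calls get().
bool deferred_runs_on_caller()
{
    auto fut = std::async(std::launch::deferred,
                          [] { return std::this_thread::get_id(); });

    // A deferred task has not started; wait_for reports that without blocking.
    if (fut.wait_for(std::chrono::seconds(0)) != std::future_status::deferred)
        return false;

    return fut.get() == std::this_thread::get_id(); // task runs here, inside get()
}

// Returns true if an async task runs on a different thread than the caller.
bool async_runs_elsewhere()
{
    auto fut = std::async(std::launch::async,
                          [] { return std::this_thread::get_id(); });
    return fut.get() != std::this_thread::get_id();
}
```

With the default policy either outcome is possible, so code that needs real parallelism should request `std::launch::async` explicitly.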

`std::promise` and `std::future`

std::promise gives you explicit control over when and what value a future receives:

#include <future>
#include <thread>
#include <iostream>
#include <stdexcept>

void compute(std::promise<int> prom, int x)
{
    try {
        if (x < 0)
            throw std::invalid_argument("x must be non-negative");

        int result = x * x;
        prom.set_value(result);      // fulfil the promise
    } catch (...) {
        prom.set_exception(std::current_exception()); // propagate exception
    }
}

int main()
{
    std::promise<int> prom;
    std::future<int>  fut = prom.get_future();

    std::thread t(compute, std::move(prom), 7);
    t.join();

    try {
        std::cout << "Result: " << fut.get() << "\n"; // 49
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << "\n";
    }
}

`std::shared_future`

A std::shared_future can be copied and shared across multiple threads — each thread can call get() independently:

#include <future>
#include <thread>
#include <iostream>
#include <vector>

int main()
{
    std::promise<int> prom;
    std::shared_future<int> sf = prom.get_future().share();

    std::vector<std::thread> threads;
    for (int i = 0; i < 5; ++i) {
        threads.emplace_back([sf, i] {
            int val = sf.get(); // all five threads block here
            std::cout << "Thread " << i << " got: " << val << "\n";
        });
    }

    prom.set_value(42); // unblocks all five threads at once

    for (auto& t : threads) t.join();
}

`std::packaged_task`

std::packaged_task wraps a callable so its return value is stored in a future:

#include <future>
#include <thread>
#include <iostream>

int main()
{
    std::packaged_task<int(int, int)> task([](int a, int b) { return a + b; });
    std::future<int> result = task.get_future();

    std::thread t(std::move(task), 10, 32);
    t.join();

    std::cout << "10 + 32 = " << result.get() << "\n"; // 42
}
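
Because a packaged_task separates creating the future from running the work, it is a natural building block for task queues. The sketch below is one possible arrangement under simple assumptions: `run_on_worker` is a hypothetical helper that runs a batch of jobs on a single worker thread and collects the results through their futures.

```cpp
#include <future>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: runs every job on one worker thread and returns the
// results in submission order.
std::vector<int> run_on_worker(std::vector<std::packaged_task<int()>> jobs)
{
    // Take the futures first; each one is tied to its task's shared state.
    std::vector<std::future<int>> futures;
    for (auto& job : jobs)
        futures.push_back(job.get_future());

    // packaged_task is move-only, so the whole vector is moved into the worker.
    std::thread worker([jobs = std::move(jobs)]() mutable {
        for (auto& job : jobs)
            job(); // stores the return value (or exception) in the future
    });
    worker.join();

    std::vector<int> results;
    for (auto& f : futures)
        results.push_back(f.get()); // rethrows here if the job threw
    return results;
}
```

The worker is joined before the results are read so that an exception rethrown by get() cannot leave a joinable thread behind.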

`<atomic>` — Lock-Free Atomic Operations

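std::atomic<T> wraps a value and guarantees that load, store, and read-modify-write operations are indivisible, so simple shared counters and flags need no mutex. Whether a given operation is actually lock-free depends on T and the platform; is_lock_free() reports it.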

#include <atomic>
#include <thread>
#include <vector>
#include <iostream>

std::atomic<int>  counter{0};
std::atomic<bool> ready{false};

void worker()
{
    // Spin-wait until ready
    while (!ready.load(std::memory_order_acquire))
        std::this_thread::yield();

    // Atomically increment
    counter.fetch_add(1, std::memory_order_relaxed);
}

int main()
{
    std::vector<std::thread> threads;
    for (int i = 0; i < 100; ++i)
        threads.emplace_back(worker);

    ready.store(true, std::memory_order_release); // release all workers

    for (auto& t : threads) t.join();
    std::cout << "counter = " << counter.load() << "\n"; // 100
}

Common `std::atomic` Operations

| Operation | Method | Notes |
| --- | --- | --- |
| Read | `load(order)` | Returns current value |
| Write | `store(val, order)` | Sets value |
| Add and get old value | `fetch_add(n, order)` | Returns previous value |
| Subtract and get old value | `fetch_sub(n, order)` | Returns previous value |
| Bitwise AND/OR/XOR | `fetch_and`, `fetch_or`, `fetch_xor` | Return previous value |
| Compare and swap | `compare_exchange_weak`, `compare_exchange_strong` | CAS loop pattern |
| Exchange | `exchange(val, order)` | Returns old value |
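
The CAS loop pattern mentioned in the table covers updates that have no dedicated fetch_ operation. As a minimal sketch, the hypothetical helper `atomic_store_max` below raises an atomic integer to a candidate value only if the candidate is larger; on failure, compare_exchange_weak writes the value it actually found into `current`, so the loop retries with fresh data.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Atomically raises `target` to `candidate` if `candidate` is larger.
void atomic_store_max(std::atomic<int>& target, int candidate)
{
    int current = target.load(std::memory_order_relaxed);
    // The weak form may fail spuriously, which the loop absorbs.
    while (current < candidate &&
           !target.compare_exchange_weak(current, candidate)) {
        // `current` now holds the latest value; the condition is re-evaluated
    }
}

// Lets n threads race to publish the values 1..n and returns the final maximum.
int max_of_concurrent_updates(int n)
{
    std::atomic<int> maximum{0};
    std::vector<std::thread> threads;
    for (int i = 1; i <= n; ++i)
        threads.emplace_back([&maximum, i] { atomic_store_max(maximum, i); });
    for (auto& t : threads)
        t.join();
    return maximum.load();
}
```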

Memory order quick reference

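| Memory order | Description |
| --- | --- |
| `memory_order_relaxed` | No synchronization or ordering; only atomicity is guaranteed |
| `memory_order_acquire` | For loads: later reads and writes cannot be reordered before the load, and everything written before the matching `release` store becomes visible |
| `memory_order_release` | For stores: earlier reads and writes cannot be reordered after the store, and they become visible to a thread that `acquire`-loads the stored value |
| `memory_order_acq_rel` | Combines `acquire` + `release` (for RMW operations) |
| `memory_order_seq_cst` | Sequentially consistent: acquire/release semantics plus a single total order over all `seq_cst` operations; the default and strongest guarantee |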

When in doubt, use the default memory_order_seq_cst. Weaker orderings only pay off in carefully benchmarked hot paths.
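
The classic use of the acquire/release pair is publishing ordinary, non-atomic data through an atomic flag. The sketch below illustrates that handoff; `payload`, `published`, and `publish_and_read` are hypothetical names for this example.

```cpp
#include <atomic>
#include <thread>

int               payload = 0;        // plain, non-atomic data
std::atomic<bool> published{false};   // flag that guards the payload

// Writes the payload, publishes it, and returns what the reader thread saw.
int publish_and_read()
{
    payload = 0;
    published.store(false);

    int seen = -1;
    std::thread reader([&seen] {
        // Acquire: once this load observes true, the write to payload that
        // preceded the release store is guaranteed to be visible.
        while (!published.load(std::memory_order_acquire))
            std::this_thread::yield();
        seen = payload;
    });

    payload = 42;                                     // 1. write the data
    published.store(true, std::memory_order_release); // 2. publish it

    reader.join();
    return seen;
}
```

If both operations on `published` used memory_order_relaxed instead, reading `payload` in the reader would be a data race.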

Complete Producer-Consumer Example

The following example combines std::thread, std::mutex, std::condition_variable, and std::atomic into a bounded queue producer-consumer pattern:

#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <queue>
#include <atomic>
#include <chrono>
#include <vector>

template <typename T>
class BoundedQueue
{
    std::queue<T>           q;
    std::mutex              mtx;
    std::condition_variable not_full;
    std::condition_variable not_empty;
    const size_t            capacity;

public:
    explicit BoundedQueue(size_t cap) : capacity(cap) {}

    void push(T item)
    {
        std::unique_lock<std::mutex> lock(mtx);
        not_full.wait(lock, [this] { return q.size() < capacity; });
        q.push(std::move(item));
        not_empty.notify_one();
    }

    T pop()
    {
        std::unique_lock<std::mutex> lock(mtx);
        not_empty.wait(lock, [this] { return !q.empty(); });
        T item = std::move(q.front());
        q.pop();
        not_full.notify_one();
        return item;
    }

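    // Non-blocking pop: returns false if the queue is empty right now.
    // Checking and popping under one lock avoids the check-then-pop race in
    // which two consumers both see one item and the loser blocks forever.
    bool try_pop(T& out)
    {
        std::lock_guard<std::mutex> lock(mtx);
        if (q.empty())
            return false;
        out = std::move(q.front());
        q.pop();
        not_full.notify_one();
        return true;
    }
};

int main()
{
    BoundedQueue<int> bq(5);
    std::atomic<bool> done{false};

    // Producer
    std::thread producer([&] {
        for (int i = 0; i < 20; ++i) {
            bq.push(i);
            std::cout << "Produced " << i << "\n";
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        }
        done = true; // set only after the last push
    });

    // Two consumers
    auto consumer_fn = [&](int id) {
        while (true) {
            // Read the flag before trying to pop: if it was already set and
            // the queue is empty afterwards, no further items can arrive.
            const bool finished = done;
            int v;
            if (bq.try_pop(v)) {
                std::cout << "Consumer " << id << " consumed " << v << "\n";
            } else if (finished) {
                return;
            } else {
                std::this_thread::yield();
            }
        }
    };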

    std::thread c1(consumer_fn, 1);
    std::thread c2(consumer_fn, 2);

    producer.join();
    c1.join();
    c2.join();

    std::cout << "All done.\n";
}

Concurrency Facilities at a Glance

<thread>

std::thread, std::jthread (C++20), std::this_thread::sleep_for, yield, get_id, hardware_concurrency

<mutex>

std::mutex, recursive_mutex, timed_mutex, recursive_timed_mutex; lock_guard, unique_lock, scoped_lock (C++17); call_once, once_flag. The reader-writer types shared_mutex (C++17) and shared_lock are declared in <shared_mutex>.

<condition_variable>

std::condition_variable (with unique_lock<mutex>), std::condition_variable_any (with any lockable), notify_one, notify_all, wait, wait_for, wait_until

<future>

std::async, std::future, std::shared_future, std::promise, std::packaged_task; future_status enum; std::launch bitmask

<atomic>

std::atomic<T>, std::atomic_flag, std::atomic_ref (C++20); std::memory_order; std::atomic_thread_fence, atomic_signal_fence

<semaphore> (C++20)

std::counting_semaphore, std::binary_semaphore. Often simpler than a mutex plus condition_variable when all you need is to count available resources.
