Interrupting a Blocked Thread in C++ on Linux

Building multi-threaded software in C++ has many pitfalls. One of them is ending your threads. When you want to cleanly shut down your application, how do you make sure all threads exit? C++20 added std::jthread, which helps you tell a thread that it should end. But how do you actually stop your threads?

There is no universal answer to this, of course. Whatever code is running in your threads must check for some exit condition and then exit. There is one especially difficult case, however: What if some of your threads are blocked in a system call? This article describes a strategy for ending such threads cleanly - as long as you are on Linux.

We start by illustrating the problem in the next two sections. If you’re just interested in the solution, you can skip ahead to the Interrupting the Blocking System Call section.

Basic Scenario Without System Calls

Consider this basic “classic” example, in which main() spawns a second thread (which just sleeps and prints something to the console) and then ends it after a while: [1]

class Thread {
public:
  void run() { m_thread = std::thread(&Thread::threadMain, this); }
  void stop() {
    m_stop = true;
    m_thread.join();
  }

private:
  void threadMain() {
    while (!m_stop) {
      std::this_thread::sleep_for(std::chrono::seconds{1});
      std::cout << "Thread idling…\n";
    }
    std::cout << "Thread stopped cleanly.\n";
  }
  std::thread m_thread;
  std::atomic<bool> m_stop = false;
};

int main(int argc, char **argv) {
  Thread t;
  t.run();
  std::this_thread::sleep_for(std::chrono::seconds{5});
  std::cout << "Requesting thread to stop.\n";
  t.stop();
  std::cout << "Exiting main()\n";
  return 0;
}

(You can also find all code from this article in the GitHub repository with the code snippets from my blog.)

This demonstrates the “classic” way: The thread’s main function regularly checks some flag for stopping (m_stop in this case). When the thread should be stopped, this flag is set and the thread is joined with join(). When you run this, you should see output like this:

Thread idling…
Thread idling…
Thread idling…
Thread idling…
Requesting thread to stop.
Thread idling…
Thread stopped cleanly.
Exiting main()

With C++20, std::jthread makes our life a little easier:

class Thread {
public:
  void run() {
    m_thread = std::jthread(
        [this](std::stop_token token) { this->threadMain(token); });
  }

private:
  void threadMain(std::stop_token token) {
    while (!token.stop_requested()) {
      std::this_thread::sleep_for(std::chrono::seconds{1});
      std::cout << "Thread idling…\n";
    }
    std::cout << "Thread stopped cleanly.\n";
  }
  std::jthread m_thread;
};

int main(int argc, char **argv) {
  Thread t;
  t.run();
  std::this_thread::sleep_for(std::chrono::seconds{5});
  return 0;
}

A couple of things to note here:

  • The std::atomic<bool> m_stop member is gone. It is replaced by a std::stop_token that std::jthread passes to the function executed inside the thread.
  • We don’t stop() the thread manually anymore. When a std::jthread object is destroyed (which happens at the end of main()), its destructor automatically requests a stop, which the thread observes through its std::stop_token, and then calls join() on the (now hopefully stopping…) thread.

More about std::jthread

std::jthread brings more to the table than just replacing our std::atomic<bool> m_stop and automatically requesting a stop and joining on destruction. Some of the other additions include:

  • Instead of having std::jthread’s destructor request the stop, you can of course also use std::jthread::request_stop().
  • You can wait on a std::stop_token to be ’notified’ (i.e., requested to stop), by using it with std::condition_variable_any.
  • Using std::stop_callback, you can attach a callback to a std::stop_token; the callback is called when a stop is requested.

Enter a Blocking System Call

In the previous examples, things were straightforward: We had our while(…) loop, which stopped looping when stopping was requested. Since the body of the while() loop always finished in a short time (ca. a second…), the thread stopped not long after stopping was requested.

Now let’s assume that the body of the while loop does something that blocks on a system call, say for example reading from stdin:

 1  class Thread {
 2  public:
 3    void run() { m_thread = std::thread(&Thread::threadMain, this); }
 4    void stop() {
 5      m_stop = true;
 6      m_thread.join();
 7    }
 8
 9  private:
10    void threadMain() {
11      while (!m_stop) {
12        std::this_thread::sleep_for(std::chrono::seconds{1});
13        char buffer[10] = {}; // zero-filled, so the output is always terminated
14        read(0, buffer, 9); // fd 0 is stdin
15        buffer[9] = '\0';
16        std::cout << "Read from stdin: " << buffer << "\n";
17      }
18      std::cout << "Thread stopped cleanly.\n";
19    }
20    std::thread m_thread;
21    std::atomic<bool> m_stop = false;
22  };
23
24  int main(int argc, char **argv) {
25    Thread t;
26    t.run();
27    std::this_thread::sleep_for(std::chrono::seconds{5});
28    std::cout << "Requesting thread stop\n";
29    t.stop();
30    return 0;
31  }

If you run this, after five seconds this should print “Requesting thread stop” to the console… and then nothing should happen. That’s because in line 14, the read() call blocks. The stop() method was called from main() and m_stop has been set to true - but since the body of the while loop never finishes, the stopping condition (m_stop) is never tested, the loop never ends, …

If you hit return on your console, the read() call will return, the thread should finish and the program should end.

However, we want to be able to cleanly finish the thread without depending on user interaction (or network activity, or whatever the syscall is blocked on).

Interrupting the Blocking System Call

The solution to our problem - at least on Linux [2] - is POSIX signals. In a POSIX-compatible OS, running programs can set custom functions as signal handlers for each signal. If a signal handler is set for a signal, say SIGUSR1, and the process receives that signal, the signal handler function is run.

Imagine your program happily running, perhaps in multiple threads, each with its own instruction counter. Now your program receives a signal. How is that signal handler executed? Is a new thread spawned to execute the handler? That would be difficult. Instead, the OS chooses one of the threads your program is running anyway. [3] That thread is paused in its execution, and instead it executes the signal handler. As soon as the signal handler finishes, the thread continues its usual execution.

There is one complication with this approach, which we will exploit: When the thread chosen to run the signal handler is currently executing a system call, that system call usually cannot be suspended in the same way that user-mode code can. Instead, there are two options: First, the system call can be restarted after the signal handler finishes, which effectively looks to your program as if the system call had been suspended and resumed. Second, the system call can be aborted, in which case it returns -1 and sets errno to EINTR.

The system call is aborted! That’s exactly what we want to provoke!
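To make the two options tangible, here is a small experiment, written as a sketch under a few assumptions: it uses a pipe instead of stdin so that it needs no user input, it uses SIGUSR1, and the names Outcome, noopHandler and readWhileSignalled are invented for this illustration. It already uses sigaction() and pthread_kill(), which are explained in the following sections.

```cpp
#include <atomic>
#include <cerrno>
#include <chrono>
#include <thread>

#include <pthread.h>
#include <signal.h>
#include <unistd.h>

enum class Outcome { Interrupted, Restarted, Failed };

extern "C" void noopHandler(int) {}

// Blocks a thread in read() on an empty pipe, signals it, and reports
// whether read() was aborted with EINTR or transparently restarted.
Outcome readWhileSignalled(bool restartSystemCalls) {
  struct sigaction action {};
  sigemptyset(&action.sa_mask);
  action.sa_handler = &noopHandler;
  action.sa_flags = restartSystemCalls ? SA_RESTART : 0;
  int fds[2];
  if (sigaction(SIGUSR1, &action, nullptr) != 0 || pipe(fds) != 0) {
    return Outcome::Failed;
  }

  std::atomic<bool> done{false};
  Outcome outcome = Outcome::Failed;
  std::thread reader([&] {
    char byte;
    const ssize_t n = read(fds[0], &byte, 1); // blocks: the pipe is empty
    if (n == 1) {
      outcome = Outcome::Restarted; // only returns once data arrives
    } else if (n < 0 && errno == EINTR) {
      outcome = Outcome::Interrupted;
    }
    done = true;
  });

  // Signal the reader a few times, in case it has not reached read() yet.
  for (int i = 0; i < 20 && !done; ++i) {
    std::this_thread::sleep_for(std::chrono::milliseconds{10});
    pthread_kill(reader.native_handle(), SIGUSR1);
  }

  // A restarted read() is still blocked here; one byte lets it finish.
  const char byte = 'x';
  const ssize_t written = write(fds[1], &byte, 1);
  (void)written;
  reader.join();
  close(fds[0]);
  close(fds[1]);
  return outcome;
}
```

On Linux, readWhileSignalled(false) should report Outcome::Interrupted, while readWhileSignalled(true) should report Outcome::Restarted, because in that mode the read() only returns once the byte is written to the pipe.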

Thus, we need to do three things:

  • We need to install a signal handler for some signal that’s not used for anything else (say SIGUSR1),
  • then we need to make sure our signal handling is in the “abort system call” mode instead of the “restart system call” mode,
  • and finally we need to get our program to handle a SIGUSR1 signal in the thread we want to interrupt.

Setting the Signal Handler

Historically, signal handlers were set with signal(). However, the semantics of signal() vary between systems, so sigaction() is the more portable function for setting signal handlers (and doing more, stay tuned).

Our signal handler does not need to do anything; we only want it to interrupt the system call. Thus, our signal handler can look like this:

void myHandler(int) {}

The sigaction() function requires a struct sigaction as input, which we need to set up and supply with the handler:

struct sigaction sigActionData{}; // zero-initialize all fields
sigemptyset(&sigActionData.sa_mask);
sigActionData.sa_handler = &myHandler;

(The sigemptyset call just specifies that no additional signals are blocked while the handler runs.)

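Now we still need to get it into the ‘abort system calls’ mode. With sigaction(), a system call is only restarted if the SA_RESTART flag is set, so we simply leave that flag out. (SA_INTERRUPT, which older code sets for this purpose, is obsolete and a no-op on Linux.)

sigActionData.sa_flags = 0; // no SA_RESTART: interrupted system calls fail with EINTR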

and finally install that as signal handler for SIGUSR1:

sigaction(SIGUSR1, &sigActionData, nullptr);
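The snippets above ignore the return value of sigaction(). As a hedged sketch, the setup could be wrapped in a helper that reports failure, plus a second helper that reads the installed disposition back. Both function names (installInterruptingHandler, restartsSystemCalls) and the handler name are hypothetical, and the sketch assumes that leaving out SA_RESTART is what puts the handler into the ‘abort system calls’ mode.

```cpp
#include <signal.h>

extern "C" void emptyHandler(int) {}

// Installs an empty handler for `signo` so that blocking system calls are
// aborted with EINTR instead of being restarted. Returns false on failure.
bool installInterruptingHandler(int signo) {
  struct sigaction action {};
  sigemptyset(&action.sa_mask); // block no additional signals in the handler
  action.sa_handler = &emptyHandler;
  action.sa_flags = 0; // deliberately without SA_RESTART
  return sigaction(signo, &action, nullptr) == 0;
}

// Queries the current disposition of `signo` without changing it.
bool restartsSystemCalls(int signo) {
  struct sigaction current {};
  if (sigaction(signo, nullptr, &current) != 0) {
    return false;
  }
  return (current.sa_flags & SA_RESTART) != 0;
}
```

After installInterruptingHandler(SIGUSR1) succeeds, restartsSystemCalls(SIGUSR1) should return false. Trying to install a handler for SIGKILL fails, since that signal cannot be caught.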

Sending the Signal

We have installed the signal handler in a way that will interrupt the system call. Now we need to handle a SIGUSR1 signal in the correct thread. While in general you have little control over which thread handles which signal, when sending the signal you can actually select which thread should handle the signal: That’s what pthread_kill() is for.

Note that I wrote that everything in this article should work on Linux. The following technique depends on the fact that the C++ standard library implementation uses POSIX threads (“pthreads”) to implement std::thread. That should be true for at least the GCC and LLVM implementations of the standard library. I can’t rule out that there’s a standard library implementation on Linux that doesn’t use POSIX threads.

With pthread_kill, we can specify the thread ID to which we want to send the signal. The thread ID can be retrieved from std::thread::native_handle() (see the caveat above). Thus, we rewrite our stop() method like this:

void stop() {
  m_stop = true;
  pthread_kill(m_thread.native_handle(), SIGUSR1);
  m_thread.join();
}

Thus, when stop() is called we first set the flag that will stop the while() loop, and then send the signal that will break the read() call. The while loop will then continue into the next iteration, check m_stop, and end.

Also, we previously kind of ignored errors that may be returned from our read(), which is always a bad idea, but since we are now actively provoking such an error, we should probably step up our game:

void threadMain() {
  while (!m_stop) {
    std::this_thread::sleep_for(std::chrono::seconds{1});
    char buffer[10];
    ssize_t result = read(0, buffer, 9); // fd 0 is stdin
    if (result >= 0) {
      buffer[result] = '\0'; // terminate after the bytes actually read
      std::cout << "Read from stdin: " << buffer << "\n";
    } else {
      std::cout << "read() returned error " << errno << "\n";
    }
  }
  std::cout << "Thread stopped cleanly.\n";
}
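Since EINTR is now an expected outcome, it can be useful to tell it apart from genuine errors. The following is one possible way to do that, as a sketch: the helper readUnlessStopped is hypothetical and not part of the article's repository. It treats EINTR as "check the stop flag again" and everything else as a real failure.

```cpp
#include <atomic>
#include <cerrno>
#include <cstddef>
#include <optional>
#include <string>

#include <unistd.h>

// Hypothetical helper: reads up to 9 bytes from `fd`.
// Returns the bytes read (an empty string on end-of-file), or std::nullopt
// if a stop was requested or read() failed with a genuine error.
std::optional<std::string> readUnlessStopped(int fd,
                                             const std::atomic<bool> &stop) {
  char buffer[9];
  while (!stop) {
    const ssize_t n = read(fd, buffer, sizeof buffer);
    if (n >= 0) {
      return std::string(buffer, static_cast<std::size_t>(n));
    }
    if (errno != EINTR) {
      return std::nullopt; // a real error, not our wake-up signal
    }
    // EINTR: a signal interrupted read(); the loop re-checks the stop flag.
  }
  return std::nullopt;
}
```

A thread main function could call this in its loop and end once it receives std::nullopt. A signal that arrives for an unrelated reason then simply leads to another read() attempt, instead of being reported as an error.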

You can see the whole working code example in the repository containing the code examples from my blog.

With this, if you execute the program and just wait for five seconds (until t.stop() is executed), you should see this console output:

Requesting thread stop
read() returned error 4
Thread stopped cleanly.

Hooray, we stopped the thread even though it was blocked in a system call!

Bonus: Doing it with std::jthread

The previous solution has the drawback that we again have to implement (and call!) our stop() manually - because stop() is responsible for sending the SIGUSR1 signal. This approach lacks the std::jthread feature of automatically cleaning up the thread on destruction. Can we transfer this solution to std::jthread?

Yes, we can: Using std::stop_callback, we can attach a callback to the std::stop_token used by the std::jthread. That callback is executed when a stop is requested, and in it we can send the signal:

class Thread {
public:
  void run() {
    m_thread = std::jthread(
        [this](std::stop_token token) { this->threadMain(token); });
  }

private:
  void threadMain(std::stop_token token) {

    // Register a stop callback that will send us the SIGUSR1 signal
    std::stop_callback callback(token, [this] {
      pthread_kill(m_thread.native_handle(), SIGUSR1);
    });

    while (!token.stop_requested()) {
      std::this_thread::sleep_for(std::chrono::seconds{1});
      char buffer[10];
      ssize_t result = read(0, buffer, 9); // fd 0 is stdin
      if (result >= 0) {
        buffer[result] = '\0'; // terminate after the bytes actually read
        std::cout << "Read from stdin: " << buffer << "\n";
      } else {
        std::cout << "read() returned error " << errno << "\n";
      }
    }
    std::cout << "Thread stopped cleanly.\n";
  }
  std::jthread m_thread;
};

int main(int argc, char **argv) {
  // Install the SIGUSR1 handler as shown above, then create and run a
  // Thread as in the earlier std::jthread example.
}

(You can find the full code example here on GitHub).

This will work just as the example with the manual stop() call above - but with the additional bonus that you cannot forget to manually stop your thread.


  1. Note that I’ve skipped header includes for brevity.

  2. And probably other POSIX-compliant systems, but not Microsoft Windows.

  3. You have limited control over which thread is chosen. As long as your system uses POSIX threads, you can use pthread_sigmask (the thread-aware counterpart of sigprocmask) to block signals from being handled in certain threads.
