Radish alpha
h
rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5
Radicle Heartwood Protocol & Stack
Radicle
Git
signals: Mutex can cause signals to be missed
Closed { reason: Solved } did:key:z6Mko5mi...bgMH opened 1 year ago

When radicle_signals::handler is called when a signal is delivered and when it locks the mutex, it’s possible that one or more subsequent signals are delivered that also call handler, while the first call of handler still holds the mutex, which causes the subsequent call(s) to handler to try to acquire the mutex but they can’t and so they just do nothing and so no notification is sent on the signals-notification channel. This prevents the desired processing of such subsequent signals from being done. E.g. if SIGWINCH or SIGHUP is delivered and then SIGTERM or SIGINT is delivered immediately following while the handling of the first signal still holds the lock, the program will not be shutdown, or the current TUI spinner or pager will not be aborted, etc, as was desired.

This can more easily be reproduced in a debugger by keeping the first call of handler stopped while it holds the lock and then sending subsequent signals. This is sufficient to prove it could happen in production.

The mutex is only needed for installing & uninstalling the static NOTIFY channel. The mutex isn’t necessary for handler to use the channel, because crossbeam_channel::Sender is Sync and would work if accessed directly and concurrently. But wrapping it in a mutex forces the need to acquire the lock to access it.

It might be possible to instead use std::sync::OnceLock which would fix this issue by avoiding mutual exclusion in the handler. OnceLock is only available in Rust versions >= 1.70, and since the rust-toolchain file of the Heartwood repository uses 1.77, I’m guessing that the requirement of at least 1.70 might be acceptable.

But using OnceLock (or anything that’s not like a mutex) would prevent being able to mutate the static NOTIFY, which would prevent being able to install a different channel after one had already been initially installed and then uninstalled. I don’t know if that use-case is needed, but it’s currently possible, and changing this might require redesigning the uses of radical_signals by radical_term if it needs that use-case. I’d need to review radical_term more.

Uninstalling by itself would still be used the same, with having OnceLock instead, since the uninstall function would still be able to do libc::signal(..., libc::SIG_DFL) just the same. With the signals’ handling reset so that handler isn’t used anymore, it wouldn’t matter that the Sender side of the channel is still held in the static NOTIFY. The Receiver side would still be dropped when desired, which would still disconnect the channel.

However, using OnceLock and a channel, or almost any other approach, would still not be async-signal-safe, as I described in issue#111cf90. Resolving that issue could also resolve this issue, so maybe it’s not worth it to implement a OnceLock (or whatever) approach that would not be kept.

z6Mko5mi...bgMH commented 1 year ago

This issue is fixed by patch 8e05702.