Radish alpha
h
rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5
Radicle Heartwood Protocol & Stack
Radicle
Git
signals: Use `signals_receipts` crate instead
Merged did:key:z6Mko5mi...bgMH opened 1 year ago

This fixes the following issues:

  • 111cf90 (handler is not async-signal-safe)
  • c8e0300 (libc::signal should not be used)
  • b91c6c3 (Mutex can cause signals to be missed)

The changes are all only internal to radicle-signals, and the API of radicle-signals is the same as before, and uses of it don’t need to be and aren’t changed.

See the commit’s message that describes its changes. See the comments in the new code. See the below:

All those issues are fixed by a different internal approach that:

  • Uses only atomic counters and a semaphore from within the signal handlers, ensuring async-signal-safety portably.
  • Uses libc::sigaction instead. With SA_RESTART, which preserves the same behavior as the prior use of libc::signal on all the OSs that https://radicle.xyz/ provides prebuilt binaries for, and which now ensures consistent behavior across all further OSs.
  • Does not access the mutex from within the signal handlers, and separates the management of the mutex’ed global state from the processing of signals receipts that sends the notifications over a channel.
  • Has an additional internal thread that blocks on the semaphore, and, upon wakes, takes the counters’ latest values, and, for those signals that have newly been delivered, sends notifications over the user-provided channel.

All of this approach is provided via a new external dependency, the signals_receipts crate, written by myself, that provides and manages it all in a very easy-to-use way. The justification for having a new external dependency is:

It was found that all the parts for this approach are better provided as libraries (crates), because: that significantly shortens and simplifies radicle-signals; they are entirely reusable for other various projects; their internal workings need to be encapsulated; and they’re not something that the Radicle team should be expected to take on the maintenance of. I, as the maintainer, expect the maintenance burden to be minimal and rare and my understanding to not fade too much, and I expect that any team member could reasonably familiarize with it all if needed, due to the medium-small size, limited, and already-complete nature of these crates and due to the extensive comments in them.

It was also found that no lesser approach could be truly robust and portable while still preserving the preexisting API and use cases of radicle-signals. The approach I settled on does achieve that and does preserve those.

Further information about the approach can be found in the crates’ READMEs and documentation: https://crates.io/crates/signals_receipts https://crates.io/crates/sem_safe

To my knowledge, the only alternative for comparable robustness and portability would be the signal_hook crate which has more use in the Rust ecosystem. After evaluating it, I decided against it because:

  1. It can’t fully uninstall the signal handlers at the OS-process level (it can only emulate that (without perfect fidelity, IIUC)), and not being able to do so would’ve been a more disruptive change to the preexisting, and perhaps future, uses of radicle-signals that have been doing, and might want to do, full uninstalling. Whereas, signals_receipts does fully uninstall.
  2. It doesn’t provide the ability to send over channels. Achieving that would’ve required an additional internal thread to do such, but having an ad-hoc approach to managing that, with the uninstalling and re-installing, would’ve been error-prone. Whereas, signals_receipts does provide the ability to send over channels, and it automatically manages its internal thread robustly.
  3. It has other substantial features that aren’t needed by radicle-signals, but those would’ve had to be carried in the object code just to use its iterator that builds on them. Whereas, signals_receipts carries only what’s needed.
  4. It uses the “self-pipe trick” which is a hack (albeit a reliable one). Whereas, signals_receipts uses semaphores which are more modern and are intended for signal handling.

I’ve confirmed this patch builds cleanly for all the currently supported platforms: x86_64-unknown-linux-musl aarch64-unknown-linux-musl x86_64-apple-darwin aarch64-apple-darwin As well as for: x86_64-unknown-linux-gnu

I have tested this patch in these ways:

  • The dependency crates, signals_receipts & sem_safe, that provide the new approach, have thorough tests and have been confirmed by me to work correctly on numerous POSIX OSs (see their READMEs). (Their portability is broader than Radicle’s current support, and so will assist in broadening its support.) I’ve also done thorough debugger stepping of their code paths.
  • I’ve conducted functional testing of the direct uses of radicle-signals, using the new approach of this patch, on Linux and macOS, x86-64.
  • I’ve done thorough debugger stepping of the code paths of the direct uses of radicle-signals, including of the various possibilities due to differences in thread-scheduling races.
  • I’ve spent considerable time reasoning about the dependency crates and the preexisting uses of radicle-signals.

The uses of radicle-signals that I tested are:

  • radicle-node: Simple use in main.rs & runtime.rs that only installs once with a bounded channel and never uninstalls.
  • radicle-term: Uses in spinner.rs & pager.rs that uninstall after installing with unbounded channels and that could be invoked multiple times within the same process and thus could do re-installing.
  • rad-cli-demo: Uses radicle-term spinner & pager.

The indirect uses of radicle-signals that I haven’t tested yet are:

  • radicle-cli: Uses radicle-term spinner & pager in numerous places that seemed equivalent enough to what I have tested.

The uses should be tested on AArch64 for at least the OSs that Radicle supports (Linux & macOS), which I haven’t done yet. There’s no reason to expect any issues with that, since OSs provide the same APIs and semantics for the relevant aspects of signal handling regardless of CPU architecture. I’d need to find access to such machines.

I can test the remaining uses, if there’s interest in accepting this contribution. Or, it might be good for someone from the team to test those as part of reviewing this.

If you have any questions, I’m happy to respond to them.

3 files changed +126 -82 c05434eb 47c785b9
modified Cargo.lock
@@ -214,6 +214,12 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9d297deb1925b89f2ccc13d7635fa0714f12c87adce1c75356b39ca9b7178567"

[[package]]
+
name = "base64"
+
version = "0.22.1"
+
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "72b3254f16251a8381aa12e40e3c4d2f0199f8c6508fbecb9d91f575e0fbb8c6"
+

+
[[package]]
name = "base64ct"
version = "1.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
@@ -646,9 +652,9 @@ checksum = "5443807d6dff69373d433ab9ef5378ad8df50ca6298caf15de6e52e24aaf54d5"

[[package]]
name = "errno"
-
version = "0.3.8"
+
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
-
checksum = "a258e46cdc063eb8519c00b9fc845fc47bcfca4130e2f08e88665ceda8474245"
+
checksum = "534c5cf6194dfab3db3242765c03bbe257cf92f22b38f6bc0c58d59108a820ba"
dependencies = [
 "libc",
 "windows-sys 0.52.0",
@@ -741,9 +747,9 @@ dependencies = [

[[package]]
name = "getrandom"
-
version = "0.2.14"
+
version = "0.2.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
-
checksum = "94b22e06ecb0110981051723910cbf0b5f5e09a2062dd7663334ee79a9d1286c"
+
checksum = "c4567c8db10ae91089c99af84c68c38da3ec2f087c3f82960bcdbf3656b6f4d7"
dependencies = [
 "cfg-if",
 "libc",
@@ -2106,10 +2112,11 @@ dependencies = [

[[package]]
name = "radicle-signals"
-
version = "0.10.0"
+
version = "0.11.0"
dependencies = [
 "crossbeam-channel",
 "libc",
+
 "signals_receipts",
]

[[package]]
@@ -2358,6 +2365,18 @@ dependencies = [
]

[[package]]
+
name = "sem_safe"
+
version = "0.2.0"
+
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "9110d660d39b5f65f10a978ce0dd5e0a08c50e1d5589cadfad9b9a5f221fdb14"
+
dependencies = [
+
 "base64 0.22.1",
+
 "errno",
+
 "getrandom",
+
 "libc",
+
]
+

+
[[package]]
name = "serde"
version = "1.0.198"
source = "registry+https://github.com/rust-lang/crates.io-index"
@@ -2429,6 +2448,18 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0fda2ff0d084019ba4d7c6f371c95d8fd75ce3524c3cb8fb653a3023f6323e64"

[[package]]
+
name = "signals_receipts"
+
version = "0.2.0"
+
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "74fa07c27e948f68f2f72241a189850526fbd3b428eee702884ba1883a39610e"
+
dependencies = [
+
 "cfg-if",
+
 "errno",
+
 "libc",
+
 "sem_safe",
+
]
+

+
[[package]]
name = "signature"
version = "1.6.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
modified radicle-signals/Cargo.toml
@@ -5,8 +5,9 @@ homepage = "https://radicle.xyz"
repository = "https://app.radicle.xyz/seeds/seed.radicle.xyz/rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5"
license = "MIT OR Apache-2.0"
edition = "2021"
-
version = "0.10.0"
+
version = "0.11.0"

[dependencies]
crossbeam-channel = { version = "0.5.6" }
libc = { version = "0.2" }
+
signals_receipts = { version = "0.2.0", features = ["channel_notify_facility"] }
modified radicle-signals/src/lib.rs
@@ -1,7 +1,12 @@
use std::io;
-
use std::sync::Mutex;

use crossbeam_channel as chan;
+
use signals_receipts::channel_notify_facility::{
+
    self, FinishError, InstallError, SendError, SignalsChannel as _, UninstallError,
+
};
+
use signals_receipts::SignalNumber;
+

+
use crate::channel_notify_facility_premade::SignalsChannel;

/// Operating system signal.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
@@ -16,10 +21,10 @@ pub enum Signal {
    WindowChanged,
}

-
impl TryFrom<i32> for Signal {
-
    type Error = i32;
+
impl TryFrom<SignalNumber> for Signal {
+
    type Error = SignalNumber;

-
    fn try_from(value: i32) -> Result<Self, Self::Error> {
+
    fn try_from(value: SignalNumber) -> Result<Self, Self::Error> {
        match value {
            libc::SIGTERM => Ok(Self::Terminate),
            libc::SIGINT => Ok(Self::Interrupt),
@@ -30,90 +35,97 @@ impl TryFrom<i32> for Signal {
    }
}

-
/// Signal notifications are sent via this channel.
-
static NOTIFY: Mutex<Option<chan::Sender<Signal>>> = Mutex::new(None);
-

-
/// A slice of signals to handle.
-
const SIGNALS: &[i32] = &[libc::SIGINT, libc::SIGTERM, libc::SIGHUP, libc::SIGWINCH];
+
// The signals of interest to handle.
+
signals_receipts::channel_notify_facility! {
+
    SIGINT
+
    SIGTERM
+
    SIGHUP
+
    SIGWINCH
+
}

-
/// Install global signal handlers.
+
/// Install global signal handlers, with notifications sent to the given
+
/// `notify` channel.
pub fn install(notify: chan::Sender<Signal>) -> io::Result<()> {
-
    if let Ok(mut channel) = NOTIFY.try_lock() {
-
        if channel.is_some() {
-
            return Err(io::Error::new(
-
                io::ErrorKind::AlreadyExists,
-
                "signal handler is already installed",
-
            ));
-
        }
-
        *channel = Some(notify);
+
    /// The sender type must implement the facility's trait.
+
    #[derive(Debug)]
+
    struct ChanSender(chan::Sender<Signal>);

-
        unsafe { _install() }?;
-
    } else {
-
        return Err(io::Error::new(
-
            io::ErrorKind::WouldBlock,
-
            "unable to install signal handler",
-
        ));
-
    }
-
    Ok(())
-
}
-

-
/// Uninstall global signal handlers.
-
pub fn uninstall() -> io::Result<()> {
-
    if let Ok(mut channel) = NOTIFY.try_lock() {
-
        if channel.is_none() {
-
            return Err(io::Error::new(
-
                io::ErrorKind::NotFound,
-
                "signal handler is already uninstalled",
-
            ));
+
    /// This also does our desired conversion from signal numbers to our
+
    /// `Signal` representation.
+
    impl channel_notify_facility::Sender for ChanSender {
+
        fn send(&self, sig_num: SignalNumber) -> Result<(), SendError> {
+
            if let Ok(sig) = sig_num.try_into() {
+
                self.0.send(sig).or(Err(SendError::Disconnected))
+
            } else {
+
                debug_assert!(false, "only called for recognized signal numbers");
+
                // Unrecognized signal numbers would be ignored, but
+
                // this never occurs.
+
                Err(SendError::Ignored)
+
            }
        }
-
        *channel = None;
-

-
        unsafe { _uninstall() }?;
-
    } else {
-
        return Err(io::Error::new(
-
            io::ErrorKind::WouldBlock,
-
            "unable to uninstall signal handler",
-
        ));
    }
-
    Ok(())
-
}

-
/// Install global signal handlers.
-
///
-
/// # Safety
-
///
-
/// Calls `libc` functions safely.
-
unsafe fn _install() -> io::Result<()> {
-
    for signal in SIGNALS {
-
        if libc::signal(*signal, handler as libc::sighandler_t) == libc::SIG_ERR {
-
            return Err(io::Error::last_os_error());
-
        }
-
    }
-
    Ok(())
+
    SignalsChannel::install_with_outside_channel(ChanSender(notify)).map_err(|e| match e {
+
        InstallError::AlreadyInstalled { unused_notify: _ } => io::Error::new(
+
            io::ErrorKind::AlreadyExists,
+
            "signal handling is already installed",
+
        ),
+
        _ => io::Error::other(e), // The error type is non-exhaustive.
+
    })
}

/// Uninstall global signal handlers.
///
-
/// # Safety
-
///
-
/// Calls `libc` functions safely.
-
unsafe fn _uninstall() -> io::Result<()> {
-
    for signal in SIGNALS {
-
        if libc::signal(*signal, libc::SIG_DFL) == libc::SIG_ERR {
-
            return Err(io::Error::last_os_error());
+
/// The caller must ensure that all `Receiver`s for the other end of the
+
/// channel are dropped to disconnect the channel, to ensure that the
+
/// internal "signals-receipt" thread wakes to clean-up (in case it's
+
/// blocked on sending on the channel).  Such dropping usually occurs
+
/// naturally when uninstalling, since the other end is no longer needed
+
/// then, and may be done after calling this.  Not doing so might
+
/// deadlock the "signals-receipt" thread.
+
pub fn uninstall() -> io::Result<()> {
+
    SignalsChannel::uninstall_with_outside_channel().map_err(|e| match e {
+
        UninstallError::AlreadyUninstalled => io::Error::new(
+
            io::ErrorKind::NotFound,
+
            "signal handling is already uninstalled",
+
        ),
+
        #[allow(clippy::unreachable)]
+
        UninstallError::WrongMethod => {
+
            // SAFETY: Impossible, because `SignalsChannel` is private
+
            // and so `SignalsChannel::install()` is never done.
+
            unreachable!()
        }
-
    }
-
    Ok(())
+
        _ => io::Error::other(e), // The error type is non-exhaustive.
+
    })
}

-
/// Called by `libc` when a signal is received.
-
extern "C" fn handler(sig: libc::c_int, _info: *mut libc::siginfo_t, _data: *mut libc::c_void) {
-
    let Ok(sig) = sig.try_into() else {
-
        return;
-
    };
-
    if let Ok(guard) = NOTIFY.try_lock() {
-
        if let Some(c) = &*guard {
-
            c.try_send(sig).ok();
+
/// Do [`uninstall()`], terminate the internal "signals-receipt"
+
/// thread, and wait for that thread to finish.
+
///
+
/// This is provided in case it's ever needed to completely clean-up the
+
/// facility to be like it hadn't been installed before.  It's
+
/// unnecessary to use this, just to uninstall the handling.  Usually,
+
/// only using `uninstall` is desirable so that the "signals-receipt"
+
/// thread is kept alive for faster reuse when re-installing the
+
/// handling.
+
///
+
/// The caller must ensure that all `Receiver`s for the other end of the
+
/// channel have **already** been dropped to disconnect the channel,
+
/// before calling this, to ensure that the "signals-receipt" thread
+
/// wakes (in case it's blocked on sending on the channel) to see that
+
/// it must finish.  If this is not done, this might deadlock.
+
pub fn finish() -> io::Result<()> {
+
    SignalsChannel::finish_with_outside_channel().map_err(|e| match e {
+
        FinishError::AlreadyFinished => io::Error::new(
+
            io::ErrorKind::NotFound,
+
            "signal-handling facility is already finished",
+
        ),
+
        #[allow(clippy::unreachable)]
+
        FinishError::WrongMethod => {
+
            // SAFETY: Impossible, because `SignalsChannel` is private
+
            // and so `SignalsChannel::install()` is never done.
+
            unreachable!()
        }
-
    }
+
        _ => io::Error::other(e), // The error type is non-exhaustive.
+
    })
}