api daemons: periodically unpark a tokio thread to ensure progress

The underlying issue seems to occur when the thread that runs the IO
driver is polling its own tasks. While that happens, the IO
driver/poller won't run and thus work stealing won't happen, meaning
that idle, parked threads stay parked even if there's pending work
they could do.

A promising solution for tokio is proposed in its issue tracker [0],
but it hasn't been implemented yet. So, as a stop gap, spawn a separate
thread that periodically spawns a no-op ready future in the runtime,
which unparks a worker in the aforementioned case and thus should
break the bogus idleness. Choose a 3s period for that without any
overly elaborate reasoning: our main goal is to ensure we accept
incoming connections, and 3s is well below an HTTP timeout and leaves
some room for high network latencies while not causing too many
additional wakeups on systems that are really idling.

[0]: https://github.com/tokio-rs/tokio/issues/4730#issuecomment-1147975074
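
The stop gap described above boils down to a plain OS thread that pokes the
runtime once per period. A minimal std-only sketch of that pattern follows.
The helper name `spawn_periodic_nudge`, the stop flag and the counter are
hypothetical additions for illustration and are not part of this commit; the
daemons loop forever and pass no closure, they call
`rt_handle.spawn(std::future::ready(()))` directly.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical helper: call `nudge` right away and then once per `period`
/// until `stop` is set. The daemons in this commit loop forever instead.
fn spawn_periodic_nudge<F>(period: Duration, stop: Arc<AtomicBool>, nudge: F) -> JoinHandle<()>
where
    F: Fn() + Send + 'static,
{
    thread::spawn(move || {
        while !stop.load(Ordering::Relaxed) {
            // In the daemons this is `rt_handle.spawn(std::future::ready(()))`,
            // which notifies a parked tokio worker.
            nudge();
            thread::sleep(period);
        }
    })
}

fn main() {
    let stop = Arc::new(AtomicBool::new(false));
    let count = Arc::new(AtomicUsize::new(0));
    let counter = Arc::clone(&count);

    // Stand-in for the no-op future: just count how often we nudged.
    let handle = spawn_periodic_nudge(Duration::from_millis(10), Arc::clone(&stop), move || {
        counter.fetch_add(1, Ordering::Relaxed);
    });

    thread::sleep(Duration::from_millis(100));
    stop.store(true, Ordering::Relaxed);
    handle.join().unwrap();
    println!("nudges sent: {}", count.load(Ordering::Relaxed));
}
```

Using a dedicated OS thread rather than a tokio timer task matters here: a
timer inside the runtime would depend on the very driver that is stuck.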

Link: https://github.com/tokio-rs/tokio/issues/4730
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht 2022-07-18 14:11:01 +02:00
parent 4f04ecb2f6
commit c2206e21e0
2 changed files with 22 additions and 0 deletions

@@ -170,6 +170,17 @@ async fn run() -> Result<(), Error> {
        bail!("unable to start daemon - {}", err);
    }
    // stop gap for https://github.com/tokio-rs/tokio/issues/4730 where the thread holding the
    // IO-driver may block progress completely if it starts polling its own tasks (blocks).
    // So, trigger a notify to parked threads, as we're immediately ready the woken up thread will
    // acquire the IO driver, if blocked, before going to sleep, which allows progress again
    // TODO: remove once tokio solves this at their level (see proposals in linked comments)
    let rt_handle = tokio::runtime::Handle::current();
    std::thread::spawn(move || loop {
        rt_handle.spawn(std::future::ready(()));
        std::thread::sleep(std::time::Duration::from_secs(3));
    });
    server.await?;
    log::info!("server shutting down, waiting for active workers to complete");
    proxmox_rest_server::last_worker_future().await?;

@@ -340,6 +340,17 @@ async fn run() -> Result<(), Error> {
        bail!("unable to start daemon - {}", err);
    }
    // stop gap for https://github.com/tokio-rs/tokio/issues/4730 where the thread holding the
    // IO-driver may block progress completely if it starts polling its own tasks (blocks).
    // So, trigger a notify to parked threads, as we're immediately ready the woken up thread will
    // acquire the IO driver, if blocked, before going to sleep, which allows progress again
    // TODO: remove once tokio solves this at their level (see proposals in linked comments)
    let rt_handle = tokio::runtime::Handle::current();
    std::thread::spawn(move || loop {
        rt_handle.spawn(std::future::ready(()));
        std::thread::sleep(Duration::from_secs(3));
    });
    start_task_scheduler();
    start_stat_generator();
    start_traffic_control_updater();