Stop issued in middle of handler can trigger deadlock? #1180

@vlovich

Discovered this as part of #1179. I think I also observe some kind of race condition where xitca occasionally never terminates the server even though a stop has been requested. It doesn't matter whether the stop is graceful or not. However, I believe the stop has to be issued while the handler is still in the middle of running.

My handler looks something like this:

    #[route("/tests/upload", method = post)]
    pub(super) async fn route_upload(
        ctx: &WebContext<'_, ServiceState>,
        Body(mut body): Body<RequestBody>,
    ) -> std::result::Result<WebResponse, xitca_web::error::Error<ServiceState>> {
        let mut received = 0;
        let mut chunk_id = 0;

        if let Some(state) = &ctx.state().test {
            state.notify("enter::route_upload").await;
        }

        while let Some(chunk) = body.next().await {
            match chunk {
                Ok(chunk) => {
                    if let Some(state) = &ctx.state().test {
                        state.notify(format!("route_upload::chunk-{chunk_id}")).await;
                    }
                    chunk_id += 1;
                    eprintln!("Read chunk {chunk_id:?} {:?} bytes", chunk.len());
                    received += chunk.len();
                }
                Err(e) => {
                    eprintln!("Chunk failed with error");
                    return Ok(WebResponse::builder()
                        .status(StatusCode::INTERNAL_SERVER_ERROR)
                        .body(ResponseBody::bytes(e.to_string()))
                        .unwrap())
                }
            }
        }

        eprintln!("Finished reading body");

        if let Some(state) = &ctx.state().test {
            state.notify("exit::route_upload").await;
        }

        return Ok(WebResponse::builder()
            .status(StatusCode::OK)
            .body(ResponseBody::bytes(format!("{received}")))
            .unwrap());
    }

state.test is a helper that lets me wait or send strings between the thread running the handler and the test harness.

In my test harness, the reqwest body is a wrapper over a tokio::sync::mpsc channel with a capacity of 1 message. I first wait for "enter::route_upload" to be sent from inside the handler. I then issue a graceful shutdown to the server handle. Then I send the first chunk into the mpsc channel, wait for it to be acked by the handler, send the second chunk, wait for the ack, and then drop the writer and wait for the "exit::route_upload" event. About 20% of the time the thread.join on the thread running xitca hangs, even though all futures have completed (the HTTP handler, the test harness, etc.). Running under a debugger, I see that the xitca worker threads are all still running and none have taken any steps to shut down.
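
To make the ordering of the steps above easier to follow, here is a minimal std-only sketch of the sequence. It is a model, not the actual reproduction: it uses neither xitca nor reqwest, a plain thread stands in for the server, an `AtomicBool` stands in for the stop request on the server handle, and `std::sync::mpsc` channels stand in for both the `state.test` helper and the 1-deep body channel. The name `run_sequence` is hypothetical. This model always terminates, so it only shows the ordering I expect, not the hang itself.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, sync_channel};
use std::sync::Arc;
use std::thread;

/// Runs the harness sequence against a stand-in "server" thread.
/// Returns the events the harness observed (in order) and the byte count
/// the stand-in handler received.
fn run_sequence() -> (Vec<String>, usize) {
    // Events from the handler to the harness (stands in for `state.notify`).
    let (event_tx, event_rx) = channel::<String>();
    // Request body: a bounded channel that buffers at most one chunk.
    let (body_tx, body_rx) = sync_channel::<Vec<u8>>(1);
    // Assumption: a flag stands in for the stop issued on the server handle.
    let stop = Arc::new(AtomicBool::new(false));

    let server_stop = Arc::clone(&stop);
    let server = thread::spawn(move || {
        // Stand-in handler: announce entry, then read until the writer is dropped.
        event_tx.send("enter::route_upload".to_string()).unwrap();
        let mut received = 0;
        for (chunk_id, chunk) in body_rx.iter().enumerate() {
            received += chunk.len();
            event_tx
                .send(format!("route_upload::chunk-{chunk_id}"))
                .unwrap();
        }
        event_tx.send("exit::route_upload".to_string()).unwrap();

        // Stand-in worker loop: once the handler is done, a worker that
        // honors the stop request exits instead of spinning forever.
        while !server_stop.load(Ordering::SeqCst) {
            thread::yield_now();
        }
        received
    });

    let mut seen = Vec::new();
    // Block until the next event arrives and check that it is the expected one.
    let mut expect = |name: &str| {
        let event = event_rx.recv().expect("handler thread ended early");
        assert_eq!(event, name);
        seen.push(event);
    };

    expect("enter::route_upload");
    // The stop is requested while the handler is still mid-request.
    stop.store(true, Ordering::SeqCst);
    body_tx.send(vec![0u8; 4]).unwrap();
    expect("route_upload::chunk-0");
    body_tx.send(vec![0u8; 6]).unwrap();
    expect("route_upload::chunk-1");
    // Dropping the writer ends the body stream.
    drop(body_tx);
    expect("exit::route_upload");

    // In the real test, this join is the call that hangs about 20% of the time.
    let received = server.join().unwrap();
    (seen, received)
}

fn main() {
    let (events, received) = run_sequence();
    println!("events: {events:?}, received: {received} bytes");
}
```

The key point is that the stop is requested after "enter::route_upload" and before the first chunk is sent, so the handler is guaranteed to be in flight when the stop arrives.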
