
h1 servers can shutdown connections with pending buffered data on filled sockets #4018

Open
deven96 wants to merge 1 commit into hyperium:master from deven96:diretnan/fix-shutdown-with-unflushed-body

Conversation


@deven96 deven96 commented Feb 13, 2026

This focuses on a recurrent production issue, present from 0.14.x up through the latest 1.8.x, where a response ends up as 200 but the body length does not match the Content-Length header that was sent, because the server shuts down the connection early.

It's a bit difficult to strace all the relevant syscalls, as the issue disappears under full tracing; however, by restricting the trace to recvfrom, sendto, shutdown, and epoll_ctl I was able to capture a request that exhibits this behavior, and I see the following:

// Request read begins at this point ✅ 
1977315 1770641958.064243 recvfrom(20, "POST /transform HTTP/1.1 ...", 8192,...) = 1027
1977315 1770641958.067062 recvfrom(20, "\xff\xd8...", 16384,...) = 16384
...
1977315 1770641958.631908 recvfrom(20, "....", 57031, ....) = 62
// Request body is complete at this point ✅
....
// Response write begins here ✅ 
1977315 1770641958.930401 sendto(20, "HTTP/1.1 200 OK\r\nContent-Length: 14896092\r\n ...", 14896450,...) = 219264
1977315 1770641958.930539 shutdown(20, SHUT_WR) = 0 // Socket shutdown just 138 microseconds after first chunk was sent! ⚠️ 
1977315 1770641958.930601 epoll_ctl(8, EPOLL_CTL_DEL, 20, NULL) = 0 
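
For reference, a minimal server shape that can produce a trace like the one above might look like the following. This is a sketch under my own assumptions, not the actual production service: it assumes hyper 1.x with hyper-util and http-body-util, and the route, bind address, and payload size are illustrative (the size matches the trace's Content-Length). The client has to read slowly enough that the body cannot fit in the socket send buffer:

```rust
use std::convert::Infallible;

use bytes::Bytes;
use http_body_util::Full;
use hyper::{body::Incoming, server::conn::http1, service::service_fn, Request, Response};
use hyper_util::rt::TokioIo;
use tokio::net::TcpListener;

// Respond with a body far larger than the socket send buffer, so the
// first sendto() cannot flush everything in one call.
async fn transform(_req: Request<Incoming>) -> Result<Response<Full<Bytes>>, Infallible> {
    // Hypothetical payload sized to match the Content-Length in the trace.
    let body = Bytes::from(vec![b'x'; 14_896_092]);
    Ok(Response::new(Full::new(body)))
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let listener = TcpListener::bind("127.0.0.1:3000").await?;
    loop {
        let (stream, _) = listener.accept().await?;
        tokio::spawn(async move {
            // With a slow-reading client, the connection can be shut down
            // while the tail of the body is still buffered.
            if let Err(e) = http1::Builder::new()
                .serve_connection(TokioIo::new(stream), service_fn(transform))
                .await
            {
                eprintln!("connection error: {e}");
            }
        });
    }
}
```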

We had never seen a shutdown that quick (138 microseconds after the first chunk in the trace above), so I did some digging into hyper and realized that, if all the right pieces align, we can get into a state where an h1 server shuts down a connection that still has unflushed data.
The issue resides in the logic of poll_loop; step by step it looks like this (a small runnable model of the control flow follows the list):

  • The socket send buffer defaults to 212_922 bytes
  • A request comes in and is read fully
  • A body (X) larger than the socket buffer is created and buffered via Full::new(Bytes::new())
  • On poll_write, Full hands all of its data to the internal buffer at once, so encoder.is_eof() is true
  • State transitions to Writing::Closed
  • Within the poll_flush loop:
      • The first iteration attempts to write X to the socket but only successfully writes 212_922 bytes
      • The second iteration returns Pending because the reader is slow and the socket is full
  • poll_flush returns Pending
  • poll_loop ignores the result of poll_flush, sees !wants_read_again() since the request has been fully read, and exits
  • is_done() returns true since reading is done and writing is closed
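
To make that sequence concrete, here is a self-contained toy model of the control flow. It is my own simplification for illustration, not hyper's actual types or code; the field names mirror the description above, and `Waker::noop()` requires Rust 1.85+:

```rust
use std::task::{Context, Poll, Waker};

/// Toy model of the h1 connection state (illustrative, not hyper's types).
struct Conn {
    read_done: bool,    // request fully read
    write_closed: bool, // encoder hit EOF, state is Writing::Closed
    buffered: usize,    // body bytes still sitting in the write buffer
}

impl Conn {
    fn poll_flush(&mut self, _cx: &mut Context<'_>) -> Poll<()> {
        // The socket accepted only part of the body; the rest stays
        // buffered and the flush reports Pending.
        if self.buffered > 0 { Poll::Pending } else { Poll::Ready(()) }
    }

    fn wants_read_again(&self) -> bool { !self.read_done }

    fn is_done(&self) -> bool { self.read_done && self.write_closed }

    /// The bug: the Pending from poll_flush is dropped, so the loop exits,
    /// is_done() reports true, and the caller proceeds to shutdown(SHUT_WR)
    /// with `buffered` bytes never written.
    fn poll_loop(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        loop {
            let _ = self.poll_flush(cx); // result ignored
            if !self.wants_read_again() {
                return Poll::Ready(());
            }
        }
    }
}

fn main() {
    let mut cx = Context::from_waker(Waker::noop());
    // 14_677_186 = 14_896_450 - 219_264, the unwritten remainder in the trace.
    let mut conn = Conn { read_done: true, write_closed: true, buffered: 14_677_186 };
    assert!(conn.poll_loop(&mut cx).is_ready());
    assert!(conn.is_done() && conn.buffered > 0); // "done", yet bytes remain
}
```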

The fix I am proposing is to propagate pending flushes within the poll loop, giving the buffer a chance to be fully flushed. The included test simply asserts that shutdown is not called while flushes are pending.
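
On the toy model above, the shape of the proposed fix is roughly the following (again a sketch of the idea, not the actual patch):

```rust
impl Conn {
    /// Proposed shape: surface Pending from poll_flush so the task is
    /// woken and polled again once the socket is writable, instead of
    /// declaring the connection done with bytes still buffered.
    fn poll_loop_fixed(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        loop {
            if self.poll_flush(cx).is_pending() {
                // Exit and wait to be polled again rather than spinning.
                return Poll::Pending;
            }
            if !self.wants_read_again() {
                return Poll::Ready(());
            }
        }
    }
}
```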

I did notice a previously yanked change here that modified the poll loop to achieve something similar, but this version exits the poll loop and waits to be polled again rather than spinning and exhausting the CPU.

@deven96 deven96 changed the title h1 servers can be shutdown connections with pending buffered data h1 servers can shutdown connections with pending buffered data Feb 13, 2026
@deven96 deven96 changed the title h1 servers can shutdown connections with pending buffered data h1 servers can shutdown connections with pending buffered data on blocked sockets Feb 13, 2026
@deven96 deven96 changed the title h1 servers can shutdown connections with pending buffered data on blocked sockets h1 servers can shutdown connections with pending buffered data on filled sockets Feb 13, 2026
