proxmox/proxmox-rest-server
Max Carrara f6bacbb58f fix #5105: rest-server: connection: overhaul TLS handshake check logic
On rare occasions, the TLS "client hello" message [1] is delayed after
a connection with the server was established, which causes HTTPS
requests to fail before TLS was even negotiated. In these cases, the
server would incorrectly respond with "HTTP/1.1 400 Bad Request"
instead of closing the connection (or similar).

The reasons for the "client hello" being delayed seem to vary; one
user noticed that the issue went away completely after they turned off
UFW [2]. Another user noticed (during private correspondence) that the
issue only appeared when connecting to their PBS instance via WAN, but
not from within their VPN. In the WAN case a firewall was also
present. The same user kindly provided tcpdumps and strace logs on
request.

The issue was finally reproduced with the following Python script:

  import socket
  import time

  HOST: str = ...
  PORT: int = ...

  with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
      sock.connect((HOST, PORT))
      time.sleep(1.5) # simulate firewall / proxy / etc. delay
      sock.sendall(b"\x16\x03\x01\x02\x00")
      data = sock.recv(256)
      print(data)

The additional delay before sending the first 5 bytes of the "client
hello" message causes the handshake checking logic to incorrectly fall
back to plain HTTP.

All of this is fixed by the following:

  1. Increase the timeout duration to 10 seconds (from 1)
  2. Instead of falling back to plain HTTP, refuse to accept the
     connection if the TLS handshake wasn't initiated before the
     timeout limit is reached
  3. Only accept plain HTTP if the first 5 bytes do not correspond to
     a TLS handshake fragment [3]
  4. Do not take the last number of bytes that were in the buffer into
     account; instead, only perform the actual handshake check if
     5 bytes are in the peek buffer using some of tokio's low-level
     functionality

Regarding 1.: This should be generous enough for any client to be able
to initiate a TLS handshake, despite its surrounding circumstances.

Regarding 4.: While this is not 100% related to the issue, peeking into
the buffer in this manner should ensure that our implementation here
remains correct, even if the kernel's underlying behaviour regarding
edge-triggering is changed [4]. At the same time, there's no need for
busy-waiting and continuously yielding to the event loop anymore.

[1]: https://www.rfc-editor.org/rfc/rfc8446.html#section-4.1.2
[2]: https://forum.proxmox.com/threads/disable-default-http-redirects-on-8007.142312/post-675352
[3]: https://www.rfc-editor.org/rfc/rfc8446.html#section-5.1
[4]: https://lwn.net/Articles/864947/

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
2024-07-10 12:22:17 +02:00
..
debian rest-server: bump to 0.5.3-1 2024-06-20 14:06:12 +02:00
examples rest-server: update example to new ApiConfig 2023-03-02 16:14:04 +01:00
src fix #5105: rest-server: connection: overhaul TLS handshake check logic 2024-07-10 12:22:17 +02:00
Cargo.toml rest-server: bump to 0.5.3-1 2024-06-20 14:06:12 +02:00