Commit f737984 ("fix #4816: do not disconnect twice if client sends no
data") introduced a 'disconnected' flag in the request state to avoid
duplicate calls to client_do_disconnect() for a given client. The flag
is only set and checked in the on_error callback of the handle,
however. Do this more centrally at the beginning of the
client_do_disconnect() function itself to catch all callers and code
paths that could lead to a duplicate call. For example, while not
currently known to cause issues, the on_eof handler might re-enter the
function.
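For illustration, a minimal sketch of such a centralized guard
(surrounding teardown code elided):

    sub client_do_disconnect {
        my ($self, $reqstate) = @_;
        # catch all callers and code paths that could lead to a
        # duplicate call, e.g. re-entry via the on_eof handler
        return if $reqstate->{disconnected};
        $reqstate->{disconnected} = 1;
        ...; # actual teardown of the connection
    }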
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250408142014.86344-4-f.ebner@proxmox.com
Commit f737984 ("fix #4816: do not disconnect twice if client sends no
data") introduced a 'disconnected' flag in the request state to avoid
duplicate calls to client_do_disconnect() for a given client. This
works, except in the case where client_do_disconnect() enters the
on_error callback itself. To fix this, set the 'disconnected' flag
before calling client_do_disconnect().
This was exposed by commit 07e56cc ("fix unexpected EOF for client
when closing TLS session") which introduced a call to stoptls() in
client_do_disconnect(). The documentation [0] mentions for stoptls():
> This method may invoke callbacks (and therefore the handle might be
> destroyed after it returns).
Indeed, the on_error callback might get invoked and lead to a
"detected empty handle" error message as reported in the community
forum [1].
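A hedged sketch of the resulting order in the error path (names as
above, surrounding handler code elided):

    # mark as disconnected *before* the call, since stoptls() inside
    # client_do_disconnect() may invoke the on_error callback itself
    $reqstate->{disconnected} = 1;
    $self->client_do_disconnect($reqstate);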
[0]: https://metacpan.org/pod/AnyEvent::Handle#$handle-%3Estoptls
[1]: https://forum.proxmox.com/threads/164744/
Fixes: 07e56cc ("fix unexpected EOF for client when closing TLS session")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250408142014.86344-3-f.ebner@proxmox.com
Commit 07e56cc ("fix unexpected EOF for client when closing TLS
session") added a call to stoptls() before the call to shutdown() for
the handle's file descriptor. However, the documentation for
AnyEvent[0] mentions for stoptls():
> This method may invoke callbacks (and therefore the handle might be
> destroyed after it returns).
Therefore, it is necessary to check that the handle is still defined
before calling shutdown(). Otherwise, this can result in a warning:
> Can't use an undefined value as a symbol reference at
> /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 150.
as reported in the community forum [1].
The debug print message for closing the file handle is split up,
because part of it relies on the file handle to be defined.
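A minimal sketch of the added check, assuming the handle lives in
$reqstate->{hdl} as an AnyEvent::Handle:

    $reqstate->{hdl}->stoptls();
    # stoptls() may invoke callbacks that destroy the handle, so
    # re-check that it still exists before touching its fd
    if (my $hdl = $reqstate->{hdl}) {
        shutdown($hdl->{fh}, 1);
    }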
[0]: https://metacpan.org/pod/AnyEvent::Handle#$handle-%3Estoptls
[1]: https://forum.proxmox.com/threads/164744/
Fixes: 07e56cc ("fix unexpected EOF for client when closing TLS session")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250408142014.86344-2-f.ebner@proxmox.com
When pve-http-server initiates the closure of a TLS session, it does not
send a TLS close notify, resulting in an unexpected EOF error on systems
with recent crypto policies. This can break functionality with other
applications, such as Foreman[0].
This behavior can be observed in the following cases:
* client uses HTTP/1.0 (no keepalive; server closes connection)
* client sends no data for 5 sec (timeout; server closes connection)
* server responds with 400 (no keepalive; server closes connection)
This patch sends the TLS close notify prior to socket teardown,
resulting in clean closure of TLS connections and no client error.
It also moves shutdown() to after the clearing of handlers, because
stoptls() must come before shutdown() but also triggers on_drain(),
which calls client_do_disconnect() again. The extra call to
client_do_disconnect() is avoided inside accept_connections() by commit
f737984, but perhaps clearing the handlers prior to shutdown() will
avoid it in all cases.
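A rough sketch of the resulting teardown order (handler names per
AnyEvent::Handle; this is an illustration, not the exact diff):

    # send the TLS close notify before tearing down the socket
    $hdl->stoptls();
    # clear the handlers before shutdown(), so that callbacks
    # triggered by the teardown (e.g. on_drain) cannot call
    # client_do_disconnect() a second time
    $hdl->on_drain(undef);
    $hdl->on_read(undef);
    $hdl->on_eof(undef);
    shutdown($hdl->{fh}, 1);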
[0]: https://github.com/theforeman/foreman_fog_proxmox/issues/325
Signed-off-by: Rob Rozestraten <admin@truthsolo.net>
Link: https://lore.proxmox.com/mailman.798.1741211145.293.pve-devel@lists.proxmox.com
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In some situations, e.g., when one has a large resource mapping, the
UI can generate a request that is bigger than the current limit of
64KiB.
Our files in pmxcfs can grow up to 1 MiB, so theoretically, a single
mapping can grow to that size. In practice, a single entry will be
much smaller; in #6230, a user has a mapping of around 130 KiB.
Increase the limit to 512 KiB so we have a bit of headroom left.
We also have to increase the 'rbuf_max' size here, otherwise the
request will fail (since the buffer is too small for the request).
Since the post limit and rbuf_max are tightly coupled, reflect that
in the code by setting rbuf_max to the sum of the post size and the
maximum header size.
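A minimal sketch of that coupling (the header limit value is assumed
for illustration):

    my $limit_max_post = 512 * 1024; # new POST body limit
    my $limit_max_headers = 8 * 1024; # max header size, value assumed
    # the read buffer must fit the whole request: body plus headers
    my $rbuf_max = $limit_max_post + $limit_max_headers;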
A short benchmark shows that it only slightly impacts performance for
the same amount of data (but that could be runtime variance too):
I used a 4 node virtualized cluster, benchmarked with oha[0] with
these options:
ADDR=<IP> oha --insecure -H $COOKIE -H $CSRFTOKEN -D bodyfile \
-m "PUT" -T "application/x-www-form-urlencoded" -n 3000 -c 50 \
--disable-keepalive --latency-correction \
"https://$ADDR:8006/api2/json/cluster/mapping/pci/test"
So 3000 requests, 50 of them in parallel. I also restarted pveproxy
and pvedaemon in between runs, and took the RSS values at around 50%
of the benchmark's runtime.
                    average time  requests/s  pvedaemon rss  pveproxy rss
old with 60k body   3.0067s       16.3487     140M-155M      141M-170M
new with 60k body   3.0865s       15.7623     140M-155M      141M-171M
new with 180k body  8.3834s       5.8934      140M-158M      141M-181M
Using a bigger body size had a large impact on the time, but that's
IMHO expected. RSS is not impacted much either; it only grows when
issuing many requests with a larger request size, which should also
be expected.
0: https://github.com/hatoo/oha
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[TL: fix wrapping the benchmark command here]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The HTTP status code 501 is meant to be 'Not Implemented'[0], but that
clearly does not fit here as the default error when we encounter a
problem while handling an API request or upload.
So instead use '500' (HTTP_INTERNAL_SERVER_ERROR) which we already use
in other places where it fits.
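A minimal sketch of the change (error helper signature assumed):

    # 500 (internal server error) instead of 501 (not implemented)
    $self->error($reqstate, 500, $err);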
0: https://datatracker.ietf.org/doc/html/rfc9110#name-501-not-implemented
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
client_do_disconnect expects to be called exactly once per connection, since it
takes care of closing and unsetting the handle corresponding to the connection.
to find bugs in our connection handling, it will log "detected empty handle" if
it is called for a request/connection that no longer has a handle.
the edge case of opening a connection without sending any data leads to the
error callback being called twice:
  Dec 04 09:37:02 xxx pveproxy[175235]: err (): Connection timed out
this is the (5 second) timeout triggering
  Dec 04 09:37:02 xxx pveproxy[175235]: err (1): Broken pipe
this is AnyEvent trying to drain the buffer while the connection is already
closed
as soon as a single byte of traffic is sent, only the timeout will trigger.
there is no guarantee that the on_error callback is only called once (in fact,
it's possible to return from it for non-fatal errors and continue processing
the connection).
if there are further reports of empty handles with this in place, other
on_error callbacks might need similar logic - but it should only be added if
the triggering conditions are clear and deemed safe. the additional logging is
only cosmetic after all, but might point out an actual issue in our connection
handling code.
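For illustration, a sketch of the guard as described (callback
signature per AnyEvent::Handle; handle setup elided):

    on_error => sub {
        my ($hdl, $fatal, $message) = @_;
        # the callback can fire twice for one connection (timeout,
        # then broken pipe), so only ever disconnect once
        return if $reqstate->{disconnected};
        $reqstate->{disconnected} = 1;
        $self->client_do_disconnect($reqstate);
    },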
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
similar to what we do for the extjs formatter, put the error message or
status message in the 'message' property of the return object.
This way client libraries can extract the error without having to parse
the HTTP status reason phrase (which is not possible in all http
libraries, e.g. hyperium's http rust crate).
This should not be a breaking change, since it just adds a (semi) new
field to the return value.
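A hedged sketch of the resulting structure in the JSON formatter
(variable names assumed):

    my $res = {
        data => $data,
        # expose the error or status message directly, so clients
        # need not parse the HTTP reason phrase
        message => $message,
    };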
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
when an api call with the 'extjs' formatter fails, the intention is that
the api call succeeds but contains the error in the inner structure
('error'/'status' property). When the api call fails with a raised
exception (e.g. PVE::APIClient::Exception), the '$res->{message}' field
is an object instead of a string.
Currently we directly assign the message to the resulting struct, which
we then try to convert to json. Since the message was an object,
`to_json` fails with 'encountered object' and the whole api call returns
a 501 error (since `handle_api2_request` returns that by default if
anything dies there, which is IMHO not correct but a different issue.)
By 'stringifying' the message, we avoid the error in `to_json` and the
api call can succeed.
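A minimal sketch of the stringification (using the field named above):

    # force stringification: the message may be an exception object
    # (e.g. PVE::APIClient::Exception), which to_json() rejects
    $res->{message} = "$message";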
partially fixes #6051: it improves the error message in PDM when trying
to remote migrate and the source cannot correctly resolve the target
remote. (We use PVE::APIClient there to query the remote).
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The API server proxies HTTP requests in two cases:
- between cluster nodes (pveproxy->pveproxy)
- between daemons on one node for protected API endpoints
(pveproxy->pvedaemon)
The API server uses AnyEvent::HTTP for proxying, with unfortunate
settings for connection reuse (details below). With these settings,
long-running synchronous API requests on the proxy destination's side
can cause unrelated proxied requests to fail with a misleading HTTP
599 "Too many redirections" error response. In order to avoid these
errors, improve the connection reuse settings.
In more detail:
Per default, AnyEvent::HTTP reuses previously-opened connections for
requests with idempotent HTTP verbs, e.g. GET/PUT/DELETE [1]. However,
when trying to reuse a previously-opened connection, it can happen
that the destination unexpectedly closes the connection. In case of
idempotent requests, AnyEvent::HTTP's http_request will retry by
recursively calling itself. Since the API server disallows recursion
by passing `recurse => 0` to http_request initially, the recursive
call fails with "HTTP 599 Too many redirections".
This can happen both for pveproxy->pveproxy and pveproxy->pvedaemon,
as connection reuse is enabled in both cases. Connection reuse being
enabled in the pveproxy->pvedaemon case was likely not intended: A
comment mentions that "keep alive for localhost is not worth it", but
only sets `keepalive => 0` and not `persistent => 0`. This setting
switches from HTTP/1.1 persistent connections to HTTP/1.0-style
keep-alive connections, but still allows connection reuse.
The destination unexpectedly closing the connection can be due to
unfortunate timing, but it becomes much more likely in case of
long-running synchronous requests. An example sequence:
1) A pveproxy worker P1 handles a protected request R1 and proxies it
to a pvedaemon worker D1, opening a pveproxy worker->pvedaemon
worker connection C1. The pvedaemon worker D1 is relatively fast
(<1s) in handling R1. P1 saves connection C1 for later reuse.
2) A different pveproxy worker P2 handles a protected request R2 and
proxies it to the same pvedaemon worker D1, opening a new pveproxy
worker->pvedaemon connection C2. Handling this request takes a long
time (>5s), for example because it queries a slow storage. While
the request is being handled, the pvedaemon worker D1 cannot do
anything else.
3) Since pvedaemon worker D1 sets a timeout of 5s when accepting
connections and it did not see anything on connection C1 for >5s
(because it was busy handling R2), it closes the connection C1.
4) pveproxy worker P1 handles a protected idempotent request R3. Since
the request is idempotent, it tries to reuse connection C1. But C1
was just closed by D1, so P1 fails request R3 with HTTP 599 as
described above.
In addition, AnyEvent::HTTP's default of reusing connections for all
idempotent HTTP verbs is problematic in our case, as not all PUT
requests of the PVE API are actually idempotent, e.g. /sendkey [2].
To fix the issues above, improve the connection reuse settings:
a) Actually disable connection reuse for pveproxy->pvedaemon requests,
by passing `persistent => 0`.
b) For pveproxy->pveproxy requests, enable connection reuse for GET
requests only, as these should be actually idempotent.
c) If connection reuse is enabled, allow one retry by passing `recurse
=> 1`, to avoid the HTTP 599 errors.
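A rough sketch of the resulting http_request settings (parameter
names per the AnyEvent::HTTP documentation [1]; the surrounding proxy
code and the $proxy_is_local flag are assumptions):

    # b) for pveproxy->pveproxy, only reuse connections for GET
    my $persistent = !$proxy_is_local && $method eq 'GET' ? 1 : 0;
    AnyEvent::HTTP::http_request(
        $method => $url,
        keepalive => 0, # HTTP/1.0-style keep-alive stays disabled
        persistent => $persistent, # a) no reuse towards pvedaemon
        recurse => $persistent ? 1 : 0, # c) one retry when reusing
        sub { ... }, # handle the proxied response here
    );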
With a) and b), the API server will reuse connections less often,
which can theoretically result in a performance drop. To gain
confidence that the performance impact is tolerable, here are the
results of a simple benchmark.
The benchmark runs hey [3] against a virtual 3-node PVE cluster, with
or without the patch applied. It performs 10000 requests in 2 worker
threads to `PUT $HTTP_NODE:8006/api2/json/nodes/$PROXY_NODE/config`
with a JSON payload that sets a 32KiB ASCII `description`. The
shortened hey invocation:
hey -H "$TOKEN" -m PUT -T application/json -D payload.json \
--disable-keepalive -n 10000 -c 2 "$URL"
The endpoint was chosen because it performs little work (locks and
writes a config file), it is protected (to test behavior change a)),
and it is a PUT endpoint (to test behavior change b)).
The command is run two times:
- With $HTTP_NODE == $PROXY_NODE for pveproxy->pvedaemon proxying
- With $HTTP_NODE != $PROXY_NODE for pveproxy->pveproxy->pvedaemon
proxying
For each invocation, we record the response times.
Without this patch:
$HTTP_NODE == $PROXY_NODE
Slowest: 0.0215 secs
Fastest: 0.0061 secs
Average: 0.0090 secs
0.006 [1] |
0.008 [2409] |■■■■■■■■■■■■■■■■■■■■■■■■
0.009 [4065] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.011 [1781] |■■■■■■■■■■■■■■■■■■
0.012 [1024] |■■■■■■■■■■
0.014 [414] |■■■■
0.015 [196] |■■
0.017 [85] |■
0.018 [21] |
0.020 [2] |
0.022 [2] |
$HTTP_NODE != $PROXY_NODE
Slowest: 0.0584 secs
Fastest: 0.0075 secs
Average: 0.0105 secs
0.007 [1] |
0.013 [8445] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.018 [1482] |■■■■■■■
0.023 [56] |
0.028 [5] |
0.033 [1] |
0.038 [0] |
0.043 [0] |
0.048 [0] |
0.053 [5] |
0.058 [5] |
With this patch:
$HTTP_NODE == $PROXY_NODE
Slowest: 0.0194 secs
Fastest: 0.0062 secs
Average: 0.0088 secs
0.006 [1] |
0.007 [1980] |■■■■■■■■■■■■■■■■■■■
0.009 [4134] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.010 [1874] |■■■■■■■■■■■■■■■■■■
0.011 [1406] |■■■■■■■■■■■■■■
0.013 [482] |■■■■■
0.014 [93] |■
0.015 [16] |
0.017 [5] |
0.018 [4] |
0.019 [5] |
$HTTP_NODE != $PROXY_NODE
Slowest: 0.0369 secs
Fastest: 0.0091 secs
Average: 0.0121 secs
0.009 [1] |
0.012 [5711] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.015 [3392] |■■■■■■■■■■■■■■■■■■■■■■■■
0.017 [794] |■■■■■■
0.020 [79] |■
0.023 [16] |
0.026 [3] |
0.029 [2] |
0.031 [0] |
0.034 [1] |
0.037 [1] |
Comparing the averages, there is
- little difference when $HTTP_NODE == $PROXY_NODE (0.009s vs
0.0088s). So for pveproxy->pvedaemon proxying, the effect of
disabling connection reuse seems negligible.
- ~15% overhead when $HTTP_NODE != $PROXY_NODE (0.0105s vs 0.0121s).
Such an increase for pveproxy->pveproxy->pvedaemon proxying is not
nothing, but in real-world workloads I'd expect the response time
for non-idempotent requests to be dominated by other factors.
[1] https://metacpan.org/pod/AnyEvent::HTTP#persistent-=%3E-$boolean
[2] https://pve.proxmox.com/pve-docs/api-viewer/index.html#/nodes/{node}/qemu/{vmid}/sendkey
[3] https://github.com/rakyll/hey
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
There were some changes w.r.t. allowing downloads in responses,
making that a bit stricter; the package versions before the break are
not compatible with that stricter behavior.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this was only used by PMG's HttpServer and for non-API file responses. all of
those got dropped there in favour of always returning an object like
    {
        data => {
            download => {
                [download info here]
            },
            [..],
        },
        [..],
    }
in case of PMG, or passing in a download hash in case of APIServer internal
calls.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
only a few API endpoints should allow downloads, mark them explicitly and
forbid downloading for the rest.
Fixes: 6d832db ("allow 'download' to be passed from API handler")
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Add support for compressing the body of responses with
`Content-Encoding: deflate` following [RFC9110]. Note that in this
context `deflate` is actually a "zlib" data format as defined in
[RFC1950].
To preserve the current behavior we prefer `Content-Encoding: gzip`
whenever `gzip` is listed as one of the encodings in the
`Accept-Encoding` header and the data should be compressed.
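A hedged sketch of the negotiation (header parsing simplified to a
plain regex for illustration):

    my $encoding;
    if ($accept_encoding =~ /\bgzip\b/) {
        $encoding = 'gzip'; # preserve current behavior: prefer gzip
    } elsif ($accept_encoding =~ /\bdeflate\b/) {
        # 'deflate' here actually means the zlib format of RFC 1950
        $encoding = 'deflate';
    }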
[RFC9110] https://www.rfc-editor.org/rfc/rfc9110#name-deflate-coding
[RFC1950] https://www.rfc-editor.org/rfc/rfc1950
Suggested-by: Lukas Wagner <l.wagner@proxmox.com>
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Tested-by: Folke Gleumes <f.gleumes@proxmox.com>
ALLOW_FROM/DENY_FROM accept any syntax understood by Net::IP. However,
if an IP range like "10.1.1.1-10.1.1.3" is configured, a confusing
Perl warning is printed to the syslog on a match:
Use of uninitialized value in concatenation (.) or string at [...]
The reason is that we use Net::IP::prefix to prepare a debug message,
but this returns undef if a range was specified. To avoid the warning,
use Net::IP::print to obtain a string representation instead.
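A small illustration of the difference, using Net::IP directly:

    use Net::IP;
    my $ip = Net::IP->new('10.1.1.1-10.1.1.3');
    print $ip->prefix(); # undef for ranges -> uninitialized warning
    print $ip->print();  # proper string representation, also for ranges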
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
set_min/max_proto_version is recommended upstream nowadays, and it seems to be
required for some reason if *only* TLS v1.3 is supposed to be enabled.
querying via get_options gives us the union of
- system-wide openssl defaults
- our internal SSL defaults
- flags configured by the user via /etc/default/pveproxy
note that by default only 1.2 and 1.3 are enabled in the first place, so
disabling either leaves a single version being set as min and max.
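A rough sketch for the TLS v1.3-only case, assuming Net::SSLeay and
an existing context $ctx:

    # with only one version left enabled, min and max are the same
    Net::SSLeay::CTX_set_min_proto_version($ctx, Net::SSLeay::TLS1_3_VERSION());
    Net::SSLeay::CTX_set_max_proto_version($ctx, Net::SSLeay::TLS1_3_VERSION());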
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
OpenSSL as packaged in Debian bookworm now ships a compat symlink for
the "combined" CA certificates file (CAfile) as managed by
update-ca-certificates. This symlink is in addition to the CApath
one that has been around for a while. The new symlink in turn gets
picked up by openssl-using code that uses the default values for the
trust store.
Every TLS context initialization now reads the full combined file,
even if no TLS is actually employed on a connection. We do such an
initialization for every proxied connection (where our HTTP server is
the client).
By specifying an explicit CA path (that is identical to the default
one), the old behaviour of looking up each CA certificate
individually, and only when needed, is enabled again.
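A hedged sketch of pinning the path explicitly, assuming
AnyEvent::TLS and Debian's default CApath:

    my $tls = AnyEvent::TLS->new(
        verify => 1,
        # identical to the system default CApath; avoids reading the
        # full combined CAfile on every TLS context initialization
        ca_path => '/etc/ssl/certs',
    );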
For an API endpoint where HTTP request handling is the bottleneck
(as opposed to the actual API handler), this improves performance of
proxied requests to be back in line with unproxied ones handled
directly by the unprivileged daemon. For all proxied requests, CPU
usage is decreased as well.
The default CAfile and CApath contain the same certificates, so there
should be no change in trusted certificates. Additionally,
certificate fingerprints are pinned in this context and verified
against the cache of pinned fingerprints.
Reported-by: Roland Kletzing <roland.kletzing@cybercon.de>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
When installing AnyEvent::AIO (via the package libanyevent-aio-perl),
the worker forks of our daemons using AnyEvent would consume 100% CPU
while trying to do an epoll_wait that no one read from. It was not
really clear which part of the code set that fd up.
Reading the documentation of the related perl modules, it became
clear that the issue was with AnyEvent::IO. By default this uses
AnyEvent::AIO (if installed) which in turn uses IO::AIO which
explicitly says it uses pthreads and is not really fork compatible
(which we rely heavy upon).
It seems that IO::AIO sets up some fds with epoll in the END handler
of its library (or earlier, but sends data to them in the END
handler), so that using 'exit' instead of 'POSIX::_exit' (which we do
in PVE::Daemon) creates the observed behavior.
Interestingly, we did not use any of AnyEvent::IO's functionality, so
we can safely remove it. Even if we had used it in the past, without
AnyEvent::AIO the IO would not have been async anyway (the pure Perl
implementation doesn't do async IO). My best guess is that we wanted
to use it, but noticed that we couldn't, and forgot to remove the use
statement. (This is indicated by a comment saying that aio_load is
not async unless IO::AIO is used.)
This only occurs now, since bookworm is the first Debian release to
package the library.
If we ever want to use AnyEvent::AIO, there are probably two other
ways that could fix it:
* replace our 'exit()' calls with 'POSIX::_exit()', which seems to
fix it, but other side effects are currently unknown
* use 'IO::AIO::reinit()' after forking, which also seems to fix it,
but perldoc says it 'is not an operation supported by any
standards, but happens to work on GNU/LINUX and some newer BSD
systems'
With this fix, one can safely install 'libanyevent-aio-perl' and
'libperl-languageserver-perl' (the only user of it AFAICS) on a
Proxmox VE or Proxmox Mail Gateway system.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In case the actual request body is empty, it seems no Content-Type
header is set by browsers.
Tested on a VM by stopping and starting a container via the GUI
(/api2/extjs/nodes/<nodename>/lxc/<vmid>/status/stop)
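A minimal sketch of the guard (parse_content_type() is a hypothetical
helper used for illustration):

    # only parse the header if the browser actually sent one
    my $ctype_header = $request->header('Content-Type');
    if (defined($ctype_header)) {
        my ($ctype, $boundary) = parse_content_type($ctype_header);
    }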
fixes f398a3d94b
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
since there is no other way to get an array parameter when using
x-www-form-urlencoded content type
the previous format with \0 separated strings (known as '-alist' format)
should not be used anymore (in favor of the now supported arrays)
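For illustration, the same array parameter in both encodings (the
parameter name is hypothetical):

    new, repeated keys:  nets=10.0.0.0%2F8&nets=192.168.0.0%2F16
    old '-alist' format: nets-alist=10.0.0.0%2F8%00192.168.0.0%2F16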
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
instead of always trying to encode them as x-www-form-urlencoded
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This prohibits the cookie from being sent along in cross-site
sub-requests or when the user navigates to a different site.
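For illustration, a resulting cookie could look like this (cookie
name and the exact SameSite value are assumed from the described
semantics):

    Set-Cookie: PVEAuthCookie=<ticket>; path=/; secure; SameSite=Strict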
Signed-off-by: Max Carrara <m.carrara@proxmox.com>
Since v5.13, URI::Escape handles the 'unsafe characters' parameter
differently than before, i.e., enforcing what is documented [0]:
    The set is specified as a string that can be used in a regular
    expression character class (between [ ]).
So, the leading/trailing [] were never supposed to be there.
Note that since v5.15 we could also pass a qr// regex object.
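A small illustration with a shortened, default-like character set:

    use URI::Escape;
    my $str = 'some/value';
    # old: relied on pre-v5.13 leniency, brackets were ignored
    uri_escape($str, '[^A-Za-z0-9\-_.~]');
    # fixed: only the character-class content between [ ]
    uri_escape($str, '^A-Za-z0-9\-_.~');
    # since URI::Escape v5.15, a regex object also works:
    uri_escape($str, qr/[^A-Za-z0-9\-_.~]/);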
[0]: 1a4ed66802
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[ T: Add details and mention regex objects ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>