Since v5.13, URI::Escape handles the 'unsafe characters' parameter
differently than before, i.e., enforcing what is documented [0]:
The set is specified as a string that can be used in a regular
expression character class (between [ ]).
So, the leading/trailing [] were never supposed to be there.
Note that since v5.15 we could also pass a qr// regex object.
[0]: 1a4ed66802
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[ T: Add details and mention regex objects ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
commas can be used in two ways, quoting Perl Best Practices (PBP):
> The comma actually has two distinct roles in Perl. In a scalar
> context, it is (as those former C programmers expect) a sequencing
> operator: “do this, then do that”. But in a list context, such as
> the argument list of a print, the comma is a list separator, not
> technically an operator at all.
-- PBP, page 69
And the separating variant is called a "junior semicolon" by PBP.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This is an internal parameter and we pass the actual internal one
around via the $reqstate variable, so avoid confusion and return a
clear error if a POST request sets this query parameter.
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Suggested-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Until now, we calculated the MD5 hash of any uploaded file during the
upload, regardless of whether the user chose to provide a hash sum
and algorithm. The hash was only logged in the syslog.
As the user can provide a hash algorithm and a checksum when
uploading a file, which gets automatically checked (after the
upload), this is not needed anymore. Instead, the file name is
logged.
Depending on the speed of the network and the cpu, upload speed or
CPU usage might improve: All tests were made by uploading a 3.6GB iso
from the PVE host to a local VM. First line is with md5, second
without.
no networklimit
multipart upload complete (size: 3826831360B time: 20.310s rate: 179.69MiB/s md5sum: 8c651682056205967d530697c98d98c3)
multipart upload complete (size: 3826831360B time: 16.169s rate: 225.72MiB/s filename: ubuntu-22.04.1-desktop-amd64.iso)
125MB/s network
In this test, pveproxy worker used x % CPU during the upload. As you can see, the reduced CPU usage is noticable in slower networks.
~75% CPU: multipart upload complete (size: 3826831360B time: 30.764s rate: 118.63MiB/s md5sum: 8c651682056205967d530697c98d98c3)
~60% CPU: multipart upload complete (size: 3826831360B time: 30.763s rate: 118.64MiB/s filename: ubuntu-22.04.1-desktop-amd64.iso)
qemu64 cpu, no network limit
multipart upload complete (size: 3826831360B time: 46.113s rate: 79.14MiB/s md5sum: 8c651682056205967d530697c98d98c3)
multipart upload complete (size: 3826831360B time: 41.492s rate: 87.96MiB/s filename: ubuntu-22.04.1-desktop-amd64.iso)
qemu64, -aes, 1 core, 0.7 cpu
multipart upload complete (size: 3826831360B time: 79.875s rate: 45.69MiB/s md5sum: 8c651682056205967d530697c98d98c3)
multipart upload complete (size: 3826831360B time: 66.364s rate: 54.99MiB/s filename: ubuntu-22.04.1-desktop-amd64.iso)
Signed-off-by: Matthias Heiserer <m.heiserer@proxmox.com>
[ T: reflow text-width and slightly add to subject ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
As reported in the forum, multipart requests are parsed incorrectly if
the file part header contains *only* Content-Disposition, but no other
fields (in particular, no Content-Type). As a result, uploaded files
are mangled: In most cases, an additional carriage return and line
feed (\r\n) is prepended to the file contents.
As an example, consider the following file part (with explicit \r\n
for clarity):
Content-Disposition: form-data; name=...; filename=...\r\n
Content-Type: application/x-iso9660-image\r\n
\r\n
file contents...
The current parsing code for file parts roughly works as follows:
1) Consume the Content-Disposition field including the trailing \r\n
2) Consume and ignore everything up to and including the next \r\n\r\n
3) Read the file contents
This works fine in the example above. However, it has a bug in case
Content-Disposition is the *only* header field:
Content-Disposition: form-data; name=...; filename=...\r\n
\r\n
file contents...
Now, step 1 already consumes the first half of the \r\n\r\n sequence
that marks the end of the part headers. As a result, step 3 starts
reading the file at a wrong offset:
- If the remaining contents of the read buffer (currently sized 16KiB)
contain \r\n\r\n, step 2 consumes everything up to and including
this marker and step 3 starts reading file contents there. As a
result, the uploaded file is truncated at its beginning.
- Otherwise, step 2 is a noop and step 3 considers the remaining
second half of the \r\n\r\n marker to be part of the file contents.
As a result, the uploaded file is prepended with an extra \r\n.
To fix this, modify step 1 to *not* consume the trailing \r\n. This
keeps the \r\n\r\n marker intact, no matter whether additional header
fields are present or not.
Fixes: 3e3faddb4a
Link: https://forum.proxmox.com/threads/125411/
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
Allow HTTP connections up until the request's header has been
parsed and processed. If no TLS handshake has been completed
beforehand, the server now responds with either a
'301 Moved Permanently' or a '308 Permanent Redirect' as noted in the
MDN web docs[1].
This is done after the header was parsed; for the redirect to work,
the `Host` header field of the request is used to create the
`Location` field of the response. This makes redirections independent
of how the server is accessed (e.g. via IP, localhost, FQDN, ...)
possible.
Upon redirection the client is immediately disconnected; otherwise,
they would have to wait for the connection to time out until
they may reconnect via TLS again.
[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/301
Signed-off-by: Max Carrara <m.carrara@proxmox.com>
The part responsible for authentication and subsequent request
handling is moved into the new `authenticate_and_handle_request`
subroutine.
If `authenticate_and_handle_request` doesn't return early, it returns
`1` for further control flow purposes.
Some minor things are formatted or renamed for readability's sake.
Signed-off-by: Max Carrara <m.carrara@proxmox.com>
The code concerned with processing the request's header in
`unshift_read_header` is moved into the new `process_header`
subroutine.
If `process_header` doesn't return early, it returns `1` for further
control flow purposes.
Some minor things are formatted or renamed for readability's sake.
Signed-off-by: Max Carrara <m.carrara@proxmox.com>
makes it rather harder to read and now unnecessary
Signed-off-by: John Hollowell <jhollowe@johnhollowell.com>
[ T: resolve merge conflict and add commit message ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In commit 0fbcbc2 ("fix #3990: multipart upload: rework to fix
uploading small files") a breaking change was added which now
requires the file's multipart part to have a `Content-Type` even
though the content type is never used. It is just included to consume
those bytes so phase 2 (dumping the file contents into the file) can
continue.
Avoid this overly strict and unused requirement.
Signed-off-by: John Hollowell <jhollowe@johnhollowell.com>
[ T: resolve merge conflict, add telling commit message ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Allow upload without trailing newline, even though this is not
compliant with RFC 1521.
RFC 1521 mandates that the close-delimiter ends in a newline:
'close-delimiter := "--" boundary "--" CRLF'
However, some software (e.g. postman) sends their request without a
trailing newline, which resulted in failing uploads.
Signed-off-by: Matthias Heiserer <m.heiserer@proxmox.com>
Reviewed-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
Tested-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
Reported in the forum:
https://forum.proxmox.com/threads/image-upload-fails-after-upgrading-from-7-1-to-7-3.119051/#post-516517
When additional headers existed in the request body, the upload failed.
With this patch, all additional headers get ignored.
Example: The following upload would fail because no headers were
expected after Content-Disposition.
```
--EPIHyQJFC5ftgoXHMe8-Jc6E7FqA4oMb0QBfOTz
Content-Disposition: form-data; name="content"
Content-Type: text/plain; charset=ISO-8859-1
iso
```
would fail. These headers now also get ignored, as we don't use them.
Also, upload now works when the Content-Disposition header isn't the
first, i.e.:
```
--XVH95dt1-A3J8mWiLCmHCW4roSC7-gBntjATBy--
Content-Type: text/plain; charset=ISO-8859-1
Content-Disposition: form-data; name="content"
```
Fixed upload was tested using
* Curl
* GUI
* Apache HttpClient 5
Signed-off-by: Matthias Heiserer <m.heiserer@proxmox.com>
Reviewed-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
Tested-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
Currently, if a file starts with a newline, it gets removed and the
upload succeeds (provided no hash is given).
Signed-off-by: Matthias Heiserer <m.heiserer@proxmox.com>
Reviewed-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
Tested-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
Some fields (e.g. filename) can contain spaces, but our 'trim'
function, would only return the value until the first whitespace
character instead of removing leading/trailing white space. This lead
to passing the wrong filename to the API call (e.g. 'foo' instead of
'foo (1).iso'), which would then reject it because of the 'wrong'
extension.
Fix this by just using the battle proven trim from pve-common.
Fixes: 0fbcbc2 ("fix #3990: multipart upload: rework to fix uploading small files")
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Separate the flow into first getting the length and data reference
and only then handle writing/digest in a common way
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
and improve the boundary variable helper name by adding a _re postfix
and using snake case everywhere.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
== The problem
Upload of files smaller than ~16kb failed.
This was because the code assumed that it would be called
several times, and each time would do a certain action.
When the whole file was in the buffer, this failed because
the function would try parssing the first part in the payload and
then return, instead of parsing the rest of the available data.
== Why not just modifying the current code a bit?
The code had a lot of nested control statements and a
non-intuitive control flow (phase 0->2->1->1->1 and so on).
The way the phases and buffer content were checked made it
rather difficult to just fix a few lines.
== What was changed
* Part headers are parsed with a single regex line each,
which improves code readability.
* Parsing the content is done in order, so even if the whole data is in the buffer,
it can be read in one go. Files of arbitrary sizes can be uploaded.
== Tested with
* Uploaded 0B, 1B, 14KB, 16KB, 1GB, 10GB, 20GB files
* Tested with all checksums and without
* Tested on firefox, chromium, and pvesh
I didn't do any fuzzing or automated upload testing.
== Drawbacks & Potential issues
* Part headers are hardcoded, adding new ones requries modifying this file
== does not fix
* upload can still time out
Signed-off-by: Matthias Heiserer <m.heiserer@proxmox.com>
Tested-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
Acknowledging the Content-Disposition header makes it possible for the
backend to tell the browser whether a file should be downloaded,
rather than displayed inline, and what it's default name should be.
Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
While $self->error will immediately send out a 4xx or 5xx response
anyhow its still good to cover against possible side effects (e.g.,
from future code in that branch) on the server and return directly.
Note that this is mostly for completeness sake, we already have
another check that covers this one for relevant cases in commit
580d540ea9.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We don't expect any userinfo in the authority and t o avoid that this
allows some leverage in doing weird things later its better to error
out early on such requests.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Ensures that no external request can control streaming on proxying
requests as safety net for when we'd have another issue in the
request handling part.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
We implicitly assume that to be the case when assembling the target
URL, so assert it explicitly as it's user controlled input.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>