file upload: don't calculate MD5, log file name instead

Until now, we calculated the MD5 hash of any uploaded file during the
upload, regardless of whether the user chose to provide a hash sum
and algorithm. The hash was only logged in the syslog.

As the user can provide a hash algorithm and a checksum when
uploading a file, which gets automatically checked (after the
upload), this is not needed anymore. Instead, the file name is
logged.

Depending on the speed of the network and the cpu, upload speed or
CPU usage might improve: All tests were made by uploading a 3.6GB iso
from the PVE host to a local VM. First line is with md5, second
without.

no networklimit
multipart upload complete (size: 3826831360B time: 20.310s rate: 179.69MiB/s md5sum: 8c651682056205967d530697c98d98c3)
multipart upload complete (size: 3826831360B time: 16.169s rate: 225.72MiB/s filename: ubuntu-22.04.1-desktop-amd64.iso)

125MB/s network
In this test, pveproxy worker used x % CPU during the upload. As you can see, the reduced CPU usage is noticable in slower networks.
~75% CPU: multipart upload complete (size: 3826831360B time: 30.764s rate: 118.63MiB/s md5sum: 8c651682056205967d530697c98d98c3)
~60% CPU: multipart upload complete (size: 3826831360B time: 30.763s rate: 118.64MiB/s filename: ubuntu-22.04.1-desktop-amd64.iso)

qemu64 cpu, no network limit
multipart upload complete (size: 3826831360B time: 46.113s rate: 79.14MiB/s md5sum: 8c651682056205967d530697c98d98c3)
multipart upload complete (size: 3826831360B time: 41.492s rate: 87.96MiB/s filename: ubuntu-22.04.1-desktop-amd64.iso)

qemu64, -aes, 1 core, 0.7 cpu
multipart upload complete (size: 3826831360B time: 79.875s rate: 45.69MiB/s md5sum: 8c651682056205967d530697c98d98c3)
multipart upload complete (size: 3826831360B time: 66.364s rate: 54.99MiB/s filename: ubuntu-22.04.1-desktop-amd64.iso)

Signed-off-by: Matthias Heiserer <m.heiserer@proxmox.com>
 [ T: reflow text-width and slightly add to subject ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This commit is contained in:
Matthias Heiserer 2023-04-12 16:22:48 +02:00 committed by Thomas Lamprecht
parent c4a08453ed
commit a2a3d17be8

View File

@ -1245,15 +1245,14 @@ sub file_upload_multipart {
if ($write_length > 0) {
syswrite($rstate->{outfh}, $data) == $write_length or die "write to temporary file failed - $!\n";
$rstate->{bytes} += $write_length;
$rstate->{ctx}->add($data);
}
}
if ($rstate->{phase} == 100) { # Phase 100 - transfer finished
$rstate->{md5sum} = $rstate->{ctx}->hexdigest;
my $elapsed = tv_interval($rstate->{starttime});
syslog('info', "multipart upload complete (size: %dB time: %.3fs rate: %.2fMiB/s md5sum: %s)",
$rstate->{bytes}, $elapsed, $rstate->{bytes} / ($elapsed * 1024 * 1024), $rstate->{md5sum}
syslog('info', "multipart upload complete (size: %dB time: %.3fs rate: %.2fMiB/s filename: %s)",
$rstate->{bytes}, $elapsed, $rstate->{bytes} / ($elapsed * 1024 * 1024),
$rstate->{params}->{filename}
);
$self->handle_api2_request($reqstate, $auth, $method, $path, $rstate);
}
@ -1563,7 +1562,6 @@ sub authenticate_and_handle_request {
my $state = {
size => $len,
boundary => $boundary,
ctx => Digest::MD5->new,
boundlen => $boundlen,
maxheader => 2048 + $boundlen, # should be large enough
params => decode_urlencoded($request->url->query()),