In the following commit we will make use of std::sync::LazyLock,
which was introduced in Rust 1.80.
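For reference, a minimal usage example (the static name, env variable and
fallback path here are made up for illustration):

    use std::sync::LazyLock;

    // initialized once, on first access, without needing lazy_static! or once_cell
    static CONFIG_DIR: LazyLock<String> = LazyLock::new(|| {
        std::env::var("PBS_CONFIG_DIR").unwrap_or_else(|_| "/etc/proxmox-backup".to_string())
    });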
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Otherwise, proxmox-daily-update panics when attempting to send a
notification for any newly available updates:
"context for proxmox-notify has not been set yet"
Reported on our community forum:
https://forum.proxmox.com/threads/152429/
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Some systemd code got split out from proxmox-sys and was left there
re-exported with a deprecation marker. Use the newer crate instead;
the workspace already depends on proxmox-systemd anyway.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Some systemd code got split out from proxmox-sys and was left there
re-exported with a deprecation marker. Use the newer crate instead;
the workspace already depends on proxmox-systemd anyway.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Some systemd code got split out from proxmox-sys and was left there
re-exported with a deprecation marker. Use the newer crate instead;
the workspace already depends on proxmox-systemd anyway.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
by combining the compression calls from the encrypted and unencrypted
paths and deciding on the header magic in a single place.
No functional changes intended, besides reusing the same buffer for
compression.
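A rough sketch of what deciding the magic in one place can look like
(the constant names come from pbs-datastore's file_formats module, the
surrounding function is purely illustrative):

    // illustrative only: one place decides which blob magic to use, based on
    // whether the payload is encrypted and/or actually got compressed
    fn blob_magic(encrypted: bool, compressed: bool) -> [u8; 8] {
        match (encrypted, compressed) {
            (true, true) => ENCR_COMPR_BLOB_MAGIC_1_0,
            (true, false) => ENCRYPTED_BLOB_MAGIC_1_0,
            (false, true) => COMPRESSED_BLOB_MAGIC_1_0,
            (false, false) => UNCOMPRESSED_BLOB_MAGIC_1_0,
        }
    }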
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Increase the zstd compression throughput by not using the
`zstd::stream::copy_encode` method: it seems to use an internal buffer
size of 32 KiB [0], copies at least one extra time into the target
buffer, and might have some additional (allocation and/or syscall)
overhead. Due to the amount of wrappers and indirections it's a bit
hard to tell for sure. In any case, reduced throughput can be observed
when the target and source storage as well as the network are all fast
enough that the operations involved in creating chunks, like
compression, become the bottleneck.
Instead, use the lower-level `zstd_safe::compress`, which avoids (big)
allocations since we provide the target buffer.
In case of a compression error, just return the uncompressed data;
there's nothing else we can do, and saving uncompressed data is better
than having none. Additionally, log any such error, except for the one
signaling that the target buffer is too small.
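A minimal sketch of that logic (illustrative only; the function name is
made up and the actual patch also handles the blob header and
encryption):

    fn compress_or_store_plain(data: &[u8]) -> Vec<u8> {
        // reserve the maximum size up front; zstd_safe::compress writes
        // directly into this buffer, no intermediate 32 KiB buffer involved
        let mut target = Vec::with_capacity(data.len());
        match zstd_safe::compress(&mut target, data, 1 /* compression level */) {
            Ok(size) if size < data.len() => target, // compression paid off
            Ok(_) => data.to_vec(),                  // not smaller, store as-is
            Err(err) => {
                // saving uncompressed data is better than having none; the real
                // patch skips logging when the error just means "buffer too small"
                eprintln!("zstd compression failed ({err}), storing uncompressed");
                data.to_vec()
            }
        }
    }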
Some benchmarks on my machine (Intel i7-12700K with DDR5-4800 memory
on an ASUS Prime Z690-A motherboard), from a tmpfs to a datastore on
tmpfs:
Type               without patches (MiB/s)   with patches (MiB/s)
.img file          ~614                      ~767
pxar one big file  ~657                      ~807
pxar small files   ~576                      ~627
On average, the new approach is faster by a factor of ~1.19.
Note that the new approach should not have any measurable negative
impact, e.g., on (peak) memory usage. That is because we already
reserved a vector with the maximum data size (data length + header
length) and thus did not have to add a new buffer; rather, we actually
removed the buffer that the high-level zstd wrapper crate used
internally.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We want to check whether the zstd error code is 'Destination buffer
too small' (dstSize_tooSmall), but currently there is no practical API
for that which is also public. So we introduce a helper that uses
zstd's internal logic to determine the error.
Since this is not guaranteed to be a stable API, add a test for it so
we catch such a change early during the build. This should be fine as
long as the zstd behavior only changes with e.g. major Debian
upgrades, which is normally the only time the zstd version is updated.
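A sketch of what such a helper and its guard test could look like
(assuming zstd_safe hands back the raw size_t error value; the numeric
error code is exactly the internal zstd detail the test is meant to pin
down):

    // zstd encodes errors in its size_t return values as the wrapped negation
    // of the error enum value; dstSize_tooSmall is 70 in zstd's zstd_errors.h
    fn zstd_error_is_dst_size_too_small(err: usize) -> bool {
        err == 0usize.wrapping_sub(70)
    }

    #[test]
    fn zstd_dst_size_too_small_is_detected() {
        // an 8 byte destination can never hold a zstd frame, so compressing
        // anything into it must fail with exactly the error we check for
        let mut target = Vec::with_capacity(8);
        let err = zstd_safe::compress(&mut target, &[0u8; 1024], 1).unwrap_err();
        assert!(zstd_error_is_dst_size_too_small(err));
    }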
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[ TL: re-order fn, rename test and reword comments ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This is leftover code that is not currently used outside of its own
tests.
Should we need it again, we can just revert this commit.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Add an External Metrics page to the PBS documentation. Most of it is
copied from the PVE documentation, minus the Graphite part.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
with the default 8k input buffer size, the client will spend most of its time
polling instead of reading/chunking/uploading.
tested with a 16G random data file from tmpfs to a fresh datastore backed by
tmpfs, without encryption.
stock:
  Time (mean ± σ):     36.064 s ± 0.655 s    [User: 21.079 s, System: 26.415 s]
  Range (min … max):   35.663 s … 36.819 s    3 runs
patched:
  Time (mean ± σ):     23.591 s ± 0.807 s    [User: 16.532 s, System: 18.629 s]
  Range (min … max):   22.663 s … 24.125 s    3 runs
Summary
  patched ran
    1.53 ± 0.06 times faster than stock
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
by dropping the print-per-chunk and making the input buffer size configurable
(8k is the default when using `new()`).
this allows benchmarking various input buffer sizes. basically the same code is
used for image-based backups in proxmox-backup-client, just the reading and
chunking part. looking at the flame graphs, the smaller input buffer sizes
clearly show most of the time being spent polling, instead of reading+copying
(or reading and scanning and copying).
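a rough sketch of such a read loop with a configurable input buffer (plain std
I/O only; the chunk-boundary scanning of the real code is left out):

    use std::io::Read;
    use std::time::Instant;

    fn bench_read(path: &str, buffer_size: usize) -> std::io::Result<()> {
        let mut file = std::fs::File::open(path)?;
        let mut buf = vec![0u8; buffer_size]; // the configurable input buffer
        let mut total = 0u64;
        let start = Instant::now();
        loop {
            let read = file.read(&mut buf)?;
            if read == 0 {
                break;
            }
            // chunk boundary scanning / copying into chunks would happen here
            total += read as u64;
        }
        let mib = total as f64 / (1024.0 * 1024.0);
        println!("{buffer_size}: {:.2} MiB/s", mib / start.elapsed().as_secs_f64());
        Ok(())
    }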
for a fixed chunk size stream with a 16G input file on tmpfs:
fixed 1M ran
1.06 ± 0.17 times faster than fixed 4M
1.22 ± 0.11 times faster than fixed 16M
1.25 ± 0.09 times faster than fixed 512k
1.31 ± 0.10 times faster than fixed 256k
1.55 ± 0.13 times faster than fixed 128k
1.92 ± 0.15 times faster than fixed 64k
3.09 ± 0.31 times faster than fixed 32k
4.76 ± 0.32 times faster than fixed 16k
8.08 ± 0.59 times faster than fixed 8k
(from 15.275s down to 1.890s)
dynamic chunk stream, same input:
dynamic 4M ran
1.01 ± 0.03 times faster than dynamic 1M
1.03 ± 0.03 times faster than dynamic 16M
1.06 ± 0.04 times faster than dynamic 512k
1.07 ± 0.03 times faster than dynamic 128k
1.12 ± 0.03 times faster than dynamic 64k
1.15 ± 0.20 times faster than dynamic 256k
1.23 ± 0.03 times faster than dynamic 32k
1.47 ± 0.04 times faster than dynamic 16k
1.92 ± 0.05 times faster than dynamic 8k
(from 26.5s down to 13.772s)
same input file on ext4 on LVM on CT2000P5PSSD8 (with caches dropped for each run):
fixed 4M ran
1.06 ± 0.02 times faster than fixed 16M
1.10 ± 0.01 times faster than fixed 1M
1.12 ± 0.01 times faster than fixed 512k
1.15 ± 0.02 times faster than fixed 128k
1.17 ± 0.01 times faster than fixed 256k
1.22 ± 0.02 times faster than fixed 64k
1.55 ± 0.05 times faster than fixed 32k
2.00 ± 0.07 times faster than fixed 16k
3.01 ± 0.15 times faster than fixed 8k
(from 19.807s down to 6.574s)
dynamic 4M ran
1.04 ± 0.02 times faster than dynamic 512k
1.04 ± 0.02 times faster than dynamic 128k
1.04 ± 0.02 times faster than dynamic 16M
1.06 ± 0.02 times faster than dynamic 1M
1.06 ± 0.02 times faster than dynamic 256k
1.08 ± 0.02 times faster than dynamic 64k
1.16 ± 0.02 times faster than dynamic 32k
1.34 ± 0.03 times faster than dynamic 16k
1.70 ± 0.04 times faster than dynamic 8k
(from 31.184s down to 18.378s)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
`cargo build` and `cargo install` pick up different config files. By symlinking
the wrapper config into a place with higher precedence than the one in the
top-level git repo dir, we ensure the package build actually picks up the
desired config instead of the one intended for quick dev builds.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
The root namespace is displayed as an empty string when used in a
format string. Distinguish it and explicitly write out the root
namespace in the sync info message shown in the sync job's task log.
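A sketch of the distinction (the helper and the is_root() style check are
assumptions for illustration, not the exact patch):

    // explicitly name the root namespace instead of letting it render as an
    // empty string in the sync task log message
    fn namespace_text(ns: &pbs_api_types::BackupNamespace) -> String {
        if ns.is_root() {
            "root namespace".to_string()
        } else {
            format!("namespace '{ns}'")
        }
    }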
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Describe the `pull` direction of the sync operation more precisely,
before also adding a `push` direction as a synchronization operation.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>