We are going to add more credentials so it makes sense to have a common
helper to get the secrets.
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
otherwise tokens without comments can no longer be created as the api
will reject the additional `delete` parameter. this bug was introduced
by commit:
3fdf876: api: token: make comment deletable
Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
group or namespace removal can fail if the old locking mechanism is
still in use, as it is unsafe to properly clean up in that scenario.
return an error message that explains how to rectify that situation.
Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
[TL: address simple merge conflict and fine tune message to admins]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this is only needed for removing the group if the last snapshot is
removed, ignore locking failures, as the user can't do anything to
rectify the situation anymore.
log the locking error for debugging purposes, though.
Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
[TL: line-wrap comment at 100cc and fix bullet-point indentation]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
See commit 27dd7377 ("fix #3935: datastore/api/backup: move datastore
locking to '/run'") for details, as I'll bump PBS now we can fixate
the version and drop the safety-net "reminder" from d/rules again.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
To reduce the number of atimes updates, keep track of the recently
marked chunks in phase 1 of garbage to avoid multiple atime updates
via expensive utimensat() calls.
Recently touched chunks are tracked by storing the chunk digests in
an LRU cache of fixed capacity. By inserting a digest, the chunk will
be the most recently touched one and if already present in the cache
before insert, the atime update can be skipped. The cache capacity of
1024 * 1024 was chosen as compromise between required memory usage
and the size of an index file referencing a 4 TiB fixed size chunked
image (with 4MiB chunk size).
The previous change to iterate over the datastore contents using the
datastore's iterator helps for increased cache hits, as subsequent
snapshots are most likely to share common chunks.
Basic benchmarking:
Number of utimensat calls shows significatn reduction:
unpatched: 31591944
patched: 1495136
Total GC runtime shows significatn reduction (average of 3 runs):
unpatched: 155.4 ± 3.5 s
patched: 22.8 ± 0.5 s
VmPeak measured via /proc/self/status before and after
`mark_used_chunks` (proxmox-backup-proxy was restarted in between
for normalization, average of 3 runs):
unpatched before: 1196028 ± 0 kB
unpatched after: 1196028 ± 0 kB
unpatched before: 1163337 ± 28317 kB
unpatched after: 1330906 ± 29280 kB
delta: 167569 kB
Dependence on the cache capacity:
capacity runtime[s] VmPeakDiff[kB]
1*1024 66.221 0
10*1024 36.164 0
100*1024 23.141 0
1024*1024 22.188 101060
10*1024*1024 23.178 689660
100*1024*1024 25.135 5507292
Description of the PBS host and datastore:
CPU: Intel Xeon E5-2620
Datastore backing storage: ZFS RAID 10 with 3 mirrors of 2x
ST16000NM001G, mirror of 2x SAMSUNG_MZ1LB1T9HALS as special
Namespaces: 45
Groups: 182
Snapshots: 3184
Index files: 6875
Deduplication factor: 44.54
Original data usage: 120.742 TiB
On-Disk usage: 2.711 TiB (2.25%)
On-Disk chunks: 1494727
Average chunk size: 1.902 MiB
Distribution of snapshots (binned by month):
2023-11 11
2023-12 16
2024-01 30
2024-02 38
2024-03 17
2024-04 37
2024-05 17
2024-06 59
2024-07 99
2024-08 96
2024-09 115
2024-10 35
2024-11 42
2024-12 37
2025-01 162
2025-02 489
2025-03 1884
Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5331
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Instead of iterating over all index files found in the datastore in
an unstructured manner, use the datastore iterators to logically
iterate over them as other datastore operations will.
This allows to better distinguish index files in unexpected locations
from ones in their expected location, warning the user of unexpected
ones to allow to act on possible missconfigurations. Further, this
will allow to integrate marking of snapshots with missing chunks as
incomplete/corrupt more easily and helps improve cache hits when
introducing LRU caching to avoid multiple atime updates in phase 1 of
garbage collection.
This now iterates twice over the index files, as indices in
unexpected locations are still considered by generating the list of
all index files to be found in the datastore and removing regular
index files from that list, leaving unexpected ones behind.
Further, align terminology by renaming the `list_images` method to
a more fitting `list_index_files` and the variable names accordingly.
This will reduce possible confusion since throughout the codebase and
in the documentation files referencing the data chunks are referred
to as index files. The term image on the other hand is associated
with virtual machine images and other large binary data stored as
fixed-size chunks.
Basic benchmarking:
Total GC runtime shows no significatn change (average of 3 runs):
unpatched: 155.4 ± 2.6 s
patched: 155.4 ± 3.5 s
VmPeak measured via /proc/self/status before and after
`mark_used_chunks` (proxmox-backup-proxy was restarted in between
for normalization, no changes for all 3 runs):
unpatched before: 1196032 kB
unpatched after: 1196032 kB
patched before: 1196028 kB
patched after: 1196028 kB
List image shows a slight increase due to the switch to a HashSet
(average of 3 runs):
unpatched: 64.2 ± 8.4 ms
patched: 72.8 ± 3.7 ms
Description of the PBS host and datastore:
CPU: Intel Xeon E5-2620
Datastore backing storage: ZFS RAID 10 with 3 mirrors of 2x
ST16000NM001G, mirror of 2x SAMSUNG_MZ1LB1T9HALS as special
Namespaces: 45
Groups: 182
Snapshots: 3184
Index files: 6875
Deduplication factor: 44.54
Original data usage: 120.742 TiB
On-Disk usage: 2.711 TiB (2.25%)
On-Disk chunks: 1494727
Average chunk size: 1.902 MiB
Distribution of snapshots (binned by month):
2023-11 11
2023-12 16
2024-01 30
2024-02 38
2024-03 17
2024-04 37
2024-05 17
2024-06 59
2024-07 99
2024-08 96
2024-09 115
2024-10 35
2024-11 42
2024-12 37
2025-01 162
2025-02 489
2025-03 1884
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Refactor the archive type and index file reader opening with its
error handling into a helper method for better reusability.
This allows to use the same logic for both, expected image paths
and unexpected image paths when iterating trough the datastore
in a hierarchical manner.
Improve error handling by switching to anyhow's error context.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Until now errors are shown ignoring the anyhow error context. In
order to allow the garbage collection to return additional error
context, format the error including the context as single line.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Add a boolean return type to LruCache::insert(), telling if the node
was already present in the cache or if it was newly inserted.
This will allow to use the LRU cache for garbage collection, where
it is required to skip atime updates for chunks already marked in
use.
That improves phase 1 garbage collection performance by avoiding,
multiple atime updates.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Currently, the only way to delete a comment on a token is to set it to
just spaces. Since we trim it in the endpoint, it gets deleted as a
side effect. This allows the comment to be deleted properly.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Using a single thread for reading is not optimal in some cases, e.g.
when the underlying storage can handle reads from multiple threads in
parallel.
We use the ParallelHandler to handle the actual reads. Make the
sync_channel buffer size depend on the number of threads so we have
space for two chunks per thread. (But keep the minimum to 3 like
before).
How this impacts the backup speed largely depends on the underlying
storage and how the backup is laid out on it.
I benchmarked the following setups:
* Setup A: relatively spread out backup on a virtualized pbs on single HDDs
* Setup B: mostly sequential chunks on a virtualized pbs on single HDDs
* Setup C: backup on virtualized pbs on a fast NVME
* Setup D: backup on bare metal pbs with ZFS in a RAID10 with 6 HDDs
and 2 fast special devices in a mirror
(values are reported in MB/s as seen in the task log, caches were
cleared between runs, backups were bigger than the memory available)
setup 1 thread 2 threads 4 threads 8 threads
A 55 70 80 95
B 110 89 100 108
C 294 294 294 294
D 118 180 300 300
So there are cases where multiple read threads speed up the tape backup
(dramatically). On the other hand there are situations where reading
from a single thread is actually faster, probably because we can read
from the HDD sequentially.
I left the default value of '1' to not change the default behavior.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[TL: update comment about mpsc buffer size for clarity and drop
commented-out debug-code]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In preparation for skipping over snapshots when synchronizing with
encrypted/verified only flags set. In these cases, the manifest has
to be fetched from the remote and it's status checked. If the
snapshot should be skipped, the snapshot directory including the
temporary manifest file has to be cleaned up, given the snapshot
directory has been newly created. By reorganizing the current
snapshot pull logic, this can be achieved more easily.
The `corrupt` flag will be set to `false` in the snapshot
prefiltering, so the previous explicit distinction for newly created
snapshot directories must not be preserved.
No functional changes intended.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
The template files for this one have simply been copied from PVE,
including the HTML template.
In PBS we actually don't provide any HTML templates for any other type
of notification, so especially with the template override mechanism on
the horizon, it's probably better to remove this template until we
also provide an HTML version for the other types as well.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
This commit adds a separate type for the data passed to this type of
notification template. Also we make sure that we do not expose any
non-primitive types to the template renderer, any data needed in the
template is mapped into the new dedicated template data type.
This ensures that any changes in types defined in other places do not
leak into the template rendering process by accident.
These changes are also preparation for allowing user-overrides for
notification templates.
This commit also tries to unify the style and naming of template
variables.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
This commit adds a separate type for the data passed to this type of
notification template. Also we make sure that we do not expose any
non-primitive types to the template renderer, any data needed in the
template is mapped into the new dedicated template data type.
This ensures that any changes in types defined in other places do not
leak into the template rendering process by accident.
These changes are also preparation for allowing user-overrides for
notification templates.
This commit also tries to unify the style and naming of template
variables.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
This commit adds a separate type for the data passed to this type of
notification template. Also we make sure that we do not expose any
non-primitive types to the template renderer, any data needed in the
template is mapped into the new dedicated template data type.
This ensures that any changes in types defined in other places do not
leak into the template rendering process by accident.
These changes are also preparation for allowing user-overrides for
notification templates.
This commit also tries to unify the style and naming of template
variables.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
This commit adds a separate type for the data passed to this type of
notification template. Also we make sure that we do not expose any
non-primitive types to the template renderer, any data needed in the
template is mapped into the new dedicated template data type.
This ensures that any changes in types defined in other places do not
leak into the template rendering process by accident.
These changes are also preparation for allowing user-overrides for
notification templates.
This commit also tries to unify the style and naming of template
variables.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
This commit adds a separate type for the data passed to this type of
notification template. Also we make sure that we do not expose any
non-primitive types to the template renderer, any data needed in the
template is mapped into the new dedicated template data type.
This ensures that any changes in types defined in other places do not
leak into the template rendering process by accident.
These changes are also preparation for allowing user-overrides for
notification templates.
This commit also tries to unify the style and naming of template
variables.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
This commit adds a separate type for the data passed to this type of
notification template. Also we make sure that we do not expose any
non-primitive types to the template renderer, any data needed in the
template is mapped into the new dedicated template data type.
This ensures that any changes in types defined in other places do not
leak into the template rendering process by accident.
These changes are also preparation for allowing user-overrides for
notification templates.
This commit also tries to unify the style and naming of template
variables.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
This commit adds a separate type for the data passed to this type of
notification template. Also we make sure that we do not expose any
non-primitive types to the template renderer, any data needed in the
template is mapped into the new dedicated template data type.
This ensures that any changes in types defined in other places do not
leak into the template rendering process by accident.
These changes are also preparation for allowing user-overrides for
notification templates.
This commit also tries to unify the style and naming of template
variables.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
This commit adds a separate type for the data passed to this type of
notification template. Also we make sure that we do not expose any
non-primitive types to the template renderer, any data needed in the
template is mapped into the new dedicated template data type.
This ensures that any changes in types defined in other places do not
leak into the template rendering process by accident.
These changes are also preparation for allowing user-overrides for
notification templates.
This commit also tries to unify the style and naming of template
variables.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
The next commit is going to add a separate submodule for notification
template data types.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Empty backup groups are not visible in the API or GUI. This led to a
confusing issue where users were unable to create a group because it
already existed and was still owned by another user. Resolve this
issue by removing the group if its last snapshot is removed.
Also fixes an issue where removing a group used the non-atomic
`remove_dir_all()` function when destroying a group unconditionally.
This could lead to two different threads suddenly holding a lock to
the same group. Make sure that the new locking mechanism is used,
which prevents that, before removing the group. This is also a bit
more conservative now, as it specifically removes the owner file and
group directory separately to avoid accidentally removing snapshots in
case we made an oversight.
Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
when two clients change the owner of a backup store, a race condition
arose. add locking to avoid this.
Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
adds double stat'ing and removes directory hierarchy to bring manifest
locking in-line with other locks used by the BackupDir trait.
if the old locking mechanism is still supposed to be used, this still
falls back to the previous lock file. however, we already add double
stat'ing since it is trivial to do here and should only provide better
safety when it comes to removing locks.
Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
to avoid issues when removing a group or snapshot directory where two
threads hold a lock to the same directory, move locking to the tmpfs
backed '/run' directory. also adds double stat'ing to make it possible
to remove locks without certain race condition issues.
this new mechanism is only employed when we can be sure, that a reboot
has occured so that all processes are using the new locking mechanism.
otherwise, two separate process could assume they have exclusive
rights to a group or snapshot.
bumps the rust version to 1.81 so we can use `std::fs::exists` without
issue.
Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[TL: drop unused format_err import]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
to avoid duplicate code, add helpers for locking groups and snapshots
to the BackupGroup and BackupDir traits respectively and refactor
existing code to use them.
this also adapts error handling by adding relevant context to each
locking helper call site. otherwise, we might loose valuable
information useful for debugging. note, however, that users that
relied on specific error messages will break.
Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Since we are exposing functions now to get the password and encryption
password this should be private.
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Allows to load credentials passed down by systemd. A possible use-case
is safely storing the server's password in a file encrypted by the
systems TPM, e.g. via
```
systemd-ask-password -n | systemd-creds encrypt --name=proxmox-backup-client.password - my-api-token.cred
```
which then can be used via
```
systemd-run --pipe --wait --property=LoadCredentialEncrypted=proxmox-backup-client.password:my-api-token.cred \
proxmox-backup-client ...
```
or from inside a service.
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Adapt the description for the backup specification to use
`archive-name` and `type` over `label` and `ext`, to be in line with
the terminology used in the documentation.
Further, explicitley describe the `path` as `source-path` to be less
ambigouos.
In order to avoid formatting issues in the man pages because of line
breaks after a hyphen, show the backup specification description in
multiple lines.
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Currently if any `?`/`bail!` happens between mounting and completing
the creation process unmounting will be skipped. Adding this guard
solves that problem and makes it easier to add things in the future
without having to worry about a disk not being unmounted in case of a
failed creation.
Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Tested-by: Christian Ebner <c.ebner@proxmox.com>
fixes the clippy warning on types T implementing Copy:
```
warning: using `clone` on type `T` which implements the `Copy` trait
```
followed by formatting fixups via `cargo fmt`.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Drop the pub scope for `DataStore`s `list_images` method.
This method is only used to generate a list of index files found in
the datastore for iteration during garbage collection. There are no
other call sites and this is intended to only be used within the
module itself. Allows to be more flexible for future method signature
adaptions.
No functional changes.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>