Fixes the cargo doc lint:
```
warning: unclosed HTML tag `uuid`
--> pbs-datastore/src/datastore.rs:60:41
|
60 | /// - could not stat /dev/disk/by-uuid/<uuid>
| ^^^^^^
|
= note: `#[warn(rustdoc::invalid_html_tags)]` on by default
warning: unclosed HTML tag `uuid`
--> pbs-datastore/src/datastore.rs:61:26
|
61 | /// - /dev/disk/by-uuid/<uuid> is not a block device
```
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
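A typical fix for this rustdoc lint (the actual patch may use escaping instead) is to wrap the placeholder in backticks so it is rendered as inline code rather than parsed as an HTML tag:
```rust
/// - could not stat `/dev/disk/by-uuid/<uuid>`
/// - `/dev/disk/by-uuid/<uuid>` is not a block device
fn example() {}
```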
Fixes the cargo doc lint:
```
warning: this URL is not a hyperlink
--> pbs-datastore/src/data_blob.rs:555:5
|
555 | /// https://github.com/facebook/zstd/blob/dev/lib/common/error_private.h
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: bare URLs are not automatically turned into clickable links
= note: `#[warn(rustdoc::bare_urls)]` on by default
help: use an automatic link instead
|
555 | /// <https://github.com/facebook/zstd/blob/dev/lib/common/error_private.h>
| + +
```
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
We don't want lingering file descriptors to survive any fork + exec,
like the one performed by the reload code from the proxmox-daemon crate
we're using for the rest-server(s), as that can have serious side effects
and even cause hangs.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
[ TL: Reword commit message ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
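A minimal sketch of the idea, assuming an already-open descriptor is marked close-on-exec via libc (the actual change may instead pass O_CLOEXEC when opening the files):
```rust
use std::os::unix::io::RawFd;

// Mark a file descriptor close-on-exec so it does not leak into fork + exec.
fn set_cloexec(fd: RawFd) -> std::io::Result<()> {
    let res = unsafe { libc::fcntl(fd, libc::F_SETFD, libc::FD_CLOEXEC) };
    if res == -1 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(())
}
```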
```
warning: field assignment outside of initializer for an instance created with Default::default()
--> pbs-datastore/src/chunker.rs:431:5
|
431 | ctx.total = buffer.len() as u64;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
note: consider initializing the variable with `chunker::Context { total: buffer.len() as u64, ..Default::default() }` and removing relevant reassignments
--> pbs-datastore/src/chunker.rs:430:5
|
430 | let mut ctx = Context::default();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#field_reassign_with_default
= note: `#[warn(clippy::field_reassign_with_default)]` on by default
```
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
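The fix the lint asks for looks roughly like this (struct and field names here are illustrative, not the exact chunker code):
```rust
#[derive(Default)]
struct Context {
    total: u64,
    offset: u64,
}

// Before: create a default value, then reassign one field.
// let mut ctx = Context::default();
// ctx.total = buffer.len() as u64;

// After: initialize the field directly, as the lint suggests.
fn make_context(buffer: &[u8]) -> Context {
    Context {
        total: buffer.len() as u64,
        ..Default::default()
    }
}
```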
Fixes the clippy lint:
```
warning: empty line after doc comment
--> src/tape/pool_writer/mod.rs:441:5
|
441 | / /// updated.
442 | |
| |_
...
448 | / pub fn append_snapshot_archive(
449 | | &mut self,
450 | | snapshot_reader: &SnapshotReader,
451 | | ) -> Result<(bool, usize), Error> {
| |_____________________________________- the comment documents this method
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#empty_line_after_doc_comments
= help: if the empty line is unintentional remove it
help: if the documentation should include the empty line include it in the comment
|
442 | ///
|
```
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Fixes the question_mark clippy lint:
```
warning: this `let...else` may be rewritten with the `?` operator
--> pbs-datastore/src/datastore.rs:101:5
|
101 | / let Some(ref device_uuid) = config.backing_device else {
102 | | return None;
103 | | };
| |______^ help: replace it with: `let ref device_uuid = config.backing_device?;`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#question_mark
= note: `#[warn(clippy::question_mark)]` on by default
```
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
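Applied, the fix collapses the `let ... else` block into a single `?` on the Option (the config type and field here are simplified assumptions):
```rust
struct DataStoreConfig {
    backing_device: Option<String>,
}

fn device_uuid(config: &DataStoreConfig) -> Option<&str> {
    // `?` returns None early, replacing the explicit `let ... else { return None };`
    let device_uuid = config.backing_device.as_deref()?;
    Some(device_uuid)
}
```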
these are particularly problematic since GC will walk the whole datastore tree
on the file system, and will thus pick up indices (but not chunks!) from nested
directories that are ignored in other code paths that use our regular
iterators..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the current phrase leads to clumsy log messages such as:
> datastore 'store' is in datastore is being unmounted
this commit re-phrases that to:
> datastore 'store' is unavailable: datastore is being unmounted
Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
Data deletion is only possible if the datastore is mounted; we won't
attempt to mount it just for the purpose of deleting data.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
... at a specific location. Also adds two additional functions to
get the mount status, and ensuring a removable datastore is mounted.
Co-authored-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
... and add MaintenanceType::Delete to it. We also want to clear any
cache entries if we are deleting the datastore, not just if it is marked
as offline.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Add helper functions to retrieve the verify_state from the manifest of a
snapshot. Replace all the manual "verify_state" parsing with the helper
function.
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
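A rough sketch of such a helper, assuming the state lives under a "verify_state" key in the manifest's unprotected JSON; the type and field names below are placeholders, not the real pbs-datastore types:
```rust
use anyhow::Error;
use serde::Deserialize;
use serde_json::Value;

#[derive(Deserialize)]
pub struct SnapshotVerifyState {
    pub state: String,
    pub upid: String,
}

/// Return the verify state stored in the manifest, if any.
pub fn verify_state(unprotected: &Value) -> Result<Option<SnapshotVerifyState>, Error> {
    match unprotected.get("verify_state") {
        Some(value) => Ok(Some(serde_json::from_value(value.clone())?)),
        None => Ok(None),
    }
}
```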
Instead of using the plain String or slices of it for archive names,
use the dedicated api type and its methods to parse and check for
archive type based on archive filename extension.
Thereby, keeping the checks and mappings in the api type and
restricting function parameters by the narrower wrapper type to reduce
potential misuse.
Further, instead of declaring and using the archive name constants
throughout the codebase, use the `BackupArchiveName` helpers to
generate the archive names for manifest, client logs and encryption
keys.
This allows for easy archive name comparisons using the same
`BackupArchiveName` type, at the cost of some extra allocations and
avoids the currently present double constant declaration of
`CATALOG_NAME`.
A positive ergonomic side effect of this is that commands now also
accept the archive type extension optionally, when passing the archive
name.
E.g.
```
proxmox-backup-client restore <snapshot> <name>.pxar.didx <target>
```
is equal to
```
proxmox-backup-client restore <snapshot> <name>.pxar <target>
```
The previously default mapping of any archive name extension to a blob
has been dropped in favor of consistent mapping by the api type
helpers.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
FG: use LazyLock for constant archive names
FG: add missing import
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Moving the `ArchiveType` to avoid crate dependencies on
`pbs-datastore`.
In preparation for introducing a dedicated `BackupArchiveName` api
type, allowing to set the corresponding archive type variant when
parsing the archive name based on its filename.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Permissions are stored in the lower 9 bits (rwxrwxrwx),
so we have to mask `st_mode` with 0o777.
The datastore root dir is created with 755, the `.chunks` dir and its
contents with 750 and the `.lock` file with 644; this changes the
expected permissions accordingly.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Fixes: 6e101ff757 ("fix #5439: allow to reuse existing datastore")
Reviewed-By: Gabriel Goller <g.goller@proxmox.com>
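A small illustration of the masking (not the actual test code): `st_mode` also carries the file type bits, so only the lower 9 permission bits are compared against the expected value.
```rust
use std::os::unix::fs::MetadataExt;

/// Compare only the permission bits of a file's mode against `expected`,
/// e.g. 0o755 for the datastore root or 0o750 for `.chunks`.
fn has_permissions(metadata: &std::fs::Metadata, expected: u32) -> bool {
    metadata.mode() & 0o777 == expected
}
```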
Add and optionally expose the backup group delete statistics by adding the
return type to the corresponding REST API endpoints.
Clients can opt into the new behaviour by setting the new `error-on-protected`
flag to `false` when calling the api endpoints, which results in removal not
erroring out when encountering protected snapshots.
The default value for the flag remains `true` for now, to remain backwards
compatible with older clients expecting this behaviour.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
FG: reworded commit message slightly
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
To correctly account also for the number of deleted backup groups, in
preparation to correctly return the delete statistics when removing
contents via the REST API.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
In preparation for the delete stats to be exposed as return type to
the backup group delete api endpoint.
Also, rename the private field `unremoved_protected` to a better
fitting `protected_snapshots` to be in line with the method names.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
removable datastores will have a PBS-managed mountpoint as path, direct
access to the field needs to be replaced with a helper that can account
for this.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Disallow creating datastores in non-empty directories. Allow adding
existing datastores via a 'reuse-datastore' checkmark. This only checks
if all the necessary directories (.chunks + subdirectories and .lock)
exist and have the correct permissions. Note that the reuse-datastore
path does not open the datastore, so that we don't drop the
ProcessLocker of an existing datastore.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
Perform the conversion from pxar file entries to catalog entry
attributes by implementing `TryFrom<&FileEntry<T>>` for
`DirEntryAttribute` and use that.
Allows the reuse for the catalog shell, when using the split pxar
archive instead of the catalog.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
we currently use the behavior of zstd that is not part of the public
api, so this is at risk to be changed without notice.
There is a public api that we could use, but it's only available
with zstd_sys >= 2.0.9, which at this time, is not yet packaged for/by
us.
Add a comment that we can use the public api for this once the
new version of the crate becomes available.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by combining the compression call from both encrypted and unencrypted
paths and deciding on the header magic at one site.
No functional changes intended, besides reusing the same buffer for
compression.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Increase the zstd compression throughput by not using the
`zstd::stream::copy_encode` method, because it seems it uses an
internal buffer size of 32 KiB [0], copies at least once extra in the
target buffer and might have some additional (allocation and/or
syscall) overhead. Due to the amount of wrappers and indirections it's
a bit hard to tell for sure. In any case, reduced throughput can be
observed if the target and source storage and the network are all so
fast that the operations involved in creating chunks, like compression,
become the bottleneck.
Instead use the lower-level `zstd_safe::compress` which avoids (big)
allocations, since we provide the target buffer.
In case of a compression error just return the uncompressed data,
there's nothing we can do and saving uncompressed data is better than
having none. Additionally, log any such error besides the one for the
target buffer being too small.
Some benchmarks on my machine (Intel i7-12700K with DDR5-4800 memory
using a ASUS Prime Z690-A motherboard) from a tmpfs to a datastore on
tmpfs:
Type                 without patches (MiB/s)   with patches (MiB/s)
.img file            ~614                      ~767
pxar one big file    ~657                      ~807
pxar small files     ~576                      ~627
On average, the new approach is faster by a factor of ~1.19.
Note that the new approach should not have a measurable negative
impact, e.g. (peak) memory usage wise. That is because we always
reserved a vector with max-data-size (data length + header length) and
thus did not have to add a new buffer, rather we actually removed the
buffer that the high-level zstd wrapper crate used internally.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
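A minimal sketch of the approach, with buffer sizing and error handling simplified (the real code writes into the blob's pre-allocated buffer after the header):
```rust
// Compress `data` into a caller-provided buffer via the lower-level
// zstd_safe API; on any compression error fall back to the uncompressed data.
fn compress_or_fallback(data: &[u8]) -> Vec<u8> {
    // worst-case assumption: compression does not help, so reserve data.len()
    let mut buf = vec![0u8; data.len()];
    match zstd_safe::compress(&mut buf[..], data, 1) {
        Ok(size) if size < data.len() => {
            buf.truncate(size);
            buf
        }
        // error (e.g. destination buffer too small) or no gain: keep raw data
        _ => data.to_vec(),
    }
}
```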
We want to check that the error code of zstd is not 'Destination buffer
is too small' (dstSize_tooSmall), but currently there is no practical API
that is also public. So we introduce a helper that uses the internal
logic of zstd to determine the error.
Since this is not guaranteed to be a stable api, add a test for that
so we catch that error early on build. This should be fine, as long as
the zstd behavior only changes with e.g. major debian upgrades, which
is normally the only time where the zstd version is updated.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[ TL: re-order fn, rename test and reword comments ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
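The shape of such a test might look like the sketch below; the real test additionally compares the returned error code against the helper's expected dstSize_tooSmall value, which is omitted here:
```rust
#[test]
fn zstd_dst_size_too_small_is_detected() {
    let data = vec![0u8; 16 * 1024];
    let mut tiny = [0u8; 1];
    // compressing into a 1-byte buffer must fail with dstSize_tooSmall
    let err = zstd_safe::compress(&mut tiny[..], &data, 1)
        .expect_err("compression into an undersized buffer should fail");
    // the real test would now assert that the helper classifies `err`
    // as the "destination buffer too small" error code
    let _ = err;
}
```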
This is leftover code that is not currently used outside of its own
tests.
Should we need it again, we can just revert this commit.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Commit ea584a75 "move more api types for the client" deprecated
the `archive_type` function in favor of the associated function
`ArchiveType::from_path`.
Replace all remaining callers of the deprecated function with its
replacement.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
[WB: and remove the deprecated function]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
When getting the `full_path` of a snapshot we did not use the cached
time string. By using it we avoid a call to the super-slow libc strftime.
This yields a minor performance improvement of circa 7%, i.e. ~100 ms
on my datastore with ~5000 snapshots.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
... if `.chunks/` is not available (deleted/moved), ChunkStore::open
fails, but only after the active operations on the datastore were already
updated, so no reference that could be dropped is returned. This leads to
the operations counter only ever increasing. Now the counter is only
updated when a reference is actually returned, not before.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Implement the Chunker trait for a dedicated payload stream chunker,
which extends the regular chunker by the option to suggest boundaries
to be used over the hash-based boundaries whenever possible.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Add the Chunker trait and move the current Chunker to ChunkerImpl to
implement the trait instead. This allows to use different chunker
implementations by dynamic dispatch and is in preparation for
implementing a dedicated payload chunker.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
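A hedged sketch of the trait split described here; the method signature is an assumption (the real trait in pbs-datastore may e.g. take an extra context parameter):
```rust
pub trait Chunker {
    /// Scan `data` and return the offset of the next chunk boundary,
    /// or 0 if no boundary was found in this buffer.
    fn scan(&mut self, data: &[u8]) -> usize;
}

/// The previous concrete chunker, now just one implementation of the trait.
#[derive(Default)]
pub struct ChunkerImpl {
    // rolling-hash state, chunk size bounds, ... (elided)
}

impl Chunker for ChunkerImpl {
    fn scan(&mut self, data: &[u8]) -> usize {
        // hash-based boundary search (elided); 0 means "no boundary yet"
        let _ = data;
        0
    }
}

// Callers can then hold a `Box<dyn Chunker>` and swap in e.g. a dedicated
// payload chunker without changing the chunk stream code.
```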
When forcing a boundary, the internal chunker state is not in sync
with the chunk stream anymore. The reset method therefore allows
to reset the internal state.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
In preparation for injecting reused payload chunks in payload streams
for regular files with unchanged metadata. Allows to get the digest
of a dynamic index entry to construct a reusable dynamic entry from
it.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Maintenance mode Delete locks the datastore. It must not be possible to go
back to normal modes, because the datastore may be in an undefined state.
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
No functional change intended: in preparation for including the
statistics of removed vanished groups and snapshots in a sync job's task
log output.
Instead of returning a boolean value showing whether all of the
snapshots of the group have been removed, return an instance of
`BackupGroupDeleteStats`, containing the count of deleted and
protected snapshots, the latter not having been removed from the
group.
The `removed_all` method is introduced as replacement for the previous
boolean return value and can be used to check if all snapshots have
been removed. If there are no protected snapshots, the group is
considered to be deleted.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
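A rough sketch of what such a statistics type could look like; field and method names follow the commit messages above, the exact layout is an assumption:
```rust
#[derive(Default)]
pub struct BackupGroupDeleteStats {
    removed_snapshots: usize,
    protected_snapshots: usize,
}

impl BackupGroupDeleteStats {
    /// The group counts as deleted if no protected snapshots were left behind.
    pub fn removed_all(&self) -> bool {
        self.protected_snapshots == 0
    }

    pub fn increment_removed_snapshots(&mut self) {
        self.removed_snapshots += 1;
    }

    pub fn increment_protected_snapshots(&mut self) {
        self.protected_snapshots += 1;
    }
}
```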
We keep a DataStore cache, so ChunkStores and lock files are kept by
the proxy process and don't have to be reopened every time. However,
for specific maintenance modes, e.g. 'offline', our process should not
keep files in that datastore open. This clears the cache entry of a
datastore if it is in a specific maintenance mode and the last task
finished, which also drops any files still open by the process.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Reviewed-by: Gabriel Goller <g.goller@proxmox.com>
Tested-by: Gabriel Goller <g.goller@proxmox.com>
By this it becomes clear that the error stems from a parsing error when
getting the backup group owner.
See also: https://forum.proxmox.com/threads/139482/
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
... making the pull logic independent from the actual source
using two traits.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
Tested-by: Lukas Wagner <l.wagner@proxmox.com>
Tested-by: Gabriel Goller <g.goller@proxmox.com>
Fixed a few rustdoc warnings. Converted some 'html'-links to
intra-doc-links and surrounded paths with '`'.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
Added lifetime to `find` function. We need this lifetime
because of the `impl MatchList` and 'anonymous lifetimes in
`impl Trait` are unstable'.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
When walking through a datastore on a GC run, it can
happen that a snapshot is deleted right before the walk reaches it.
For example:
- read dir entry for group
- walk entries (snapshots)
- snapshot X is removed/pruned
- walking reaches snapshot X, but ENOENT
Previously we bailed here, now we just ignore it.
Backups that are just created (where an atomic rename from
tmpdir happens, which might trigger an ENOENT error) are
not a problem here, the GC handles them separately.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
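Conceptually the change boils down to something like this simplified, std-only sketch (not the actual GC code):
```rust
use std::io;
use std::path::Path;

/// Walk one snapshot directory; a snapshot that vanished (was pruned) between
/// listing the group and reaching it is skipped instead of failing the GC run.
fn walk_snapshot(snapshot_path: &Path) -> io::Result<()> {
    match std::fs::read_dir(snapshot_path) {
        Ok(entries) => {
            for entry in entries {
                let _entry = entry?; // mark index files found in this snapshot
            }
            Ok(())
        }
        // ENOENT: removed/pruned concurrently, ignore and continue
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}
```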
this sounded like we need to skip lost+found to avoid pruning too many chunks,
while the opposite is true - it's safe to skip lost+found on EPERM without
pruning too many chunks, but this is not the case for all EPERM situations..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Simply pull out the inner IO error and the affected path first.
Clean up style-wise a bit while touching this anyway, but no semantic
change intended.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Passed a closure with the `stat()` function call to `matches()`. This
will traverse through all patterns and try to match using the path only; if a
`file_mode` is needed, it will run the closure. This means that if we exclude
a file with the `MatchType::ANY_FILE_TYPE`, we will skip it without running
`stat()` on it. As we updated the `matches()` function, we also updated all the
invocations of it.
Added `pathpatterns` crate to local overrides in cargo.toml.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
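The idea in isolation, as a simplified sketch (the real `matches()` signature in the pathpatterns crate differs):
```rust
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Only run stat() when a pattern actually needs the file mode; patterns that
/// match on the path alone never trigger the closure.
fn matches_with_lazy_stat(path: &Path, needs_file_mode: bool) -> std::io::Result<bool> {
    let get_file_mode = || std::fs::metadata(path).map(|m| m.mode());
    if needs_file_mode {
        let _mode = get_file_mode()?;
    }
    Ok(true)
}
```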
the wrong info here was rather misleading, especially when errors
just talk about "blobs" while the actual problem is with a chunk.
chunks did originally have their own magic values, but that got removed in
4ee8f53d07 "remove DataChunk file format - use DataBlob instead"
back in 2019, before the 0.1.1 release(!)
Reported-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
very unexpected and unreachable is probably fine here, but it's not
really winning us anything, so avoid the panic-potential and just
bail out.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
previously when marking used chunks the namespace wasn't taken into
account and valid snapshots were marked as "strange paths". this led
to a line in the log of a gc job such as this:
found (and marked) 2 index files outside of expected directory scheme
which some users perceived as an error. parse the namespace too and
only mark the path as strange if parsing the namespace and/or backup
dir fails.
Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
these were previously called out in a comment, but should now be handled (as
much as they can be).
the performance impact shouldn't be too bad, since we only look at the magic 8
bytes at the start of the existing chunk (we already did a stat on it, so that
might even be prefetched already by storage), and only if there is a size
mismatch and encryption is enabled.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
[ T: fold in "just to be sure" touch_chunk calls ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
instead of
Error: unable to open snapshot directory "/full/path/to/snapshot" for locking - ENOENT: No such file or directory
this will now print
Error: Snapshot vm/800/2023-01-16T12:28:11Z does not exist.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
besides harmonizing versions, the only global change is that the tokio-io
feature of pxar is now implied since it's the default anyway, instead of being
spelled out.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
pbs-buildcfg is the only one that needs to inherit the version as well, since
it stores it in the compiled crate.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
instead of hardcoding the default deep inside the code. This makes it
much easier to see what the actual default is.
The first instance of ChunkOrder::None was only for the test case, where
the ordering does not matter.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
fixups for DatastoreFSyncLevel:
* use derive for Default
* add some more derives (Clone, Copy)
chunk store:
* drop to_owned for chunk_dir_path
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
currently, we don't (f)sync on chunk insertion (or at any point after
that), which can lead to broken chunks in case of e.g. an unexpected
powerloss. To fix that, offer a tuning option for datastores that
controls the level of syncs it does:
* None (default): same as current state, no (f)syncs done at any point
* Filesystem: at the end of a backup, the datastore issues
a syncfs(2) to the filesystem of the datastore
* File: issues an fsync on each chunk as it gets inserted
(using our 'replace_file' helper) and an fsync on the directory handle
a small benchmark showed the following (times in mm:ss):
setup: virtual pbs, 4 cores, 8GiB memory, ext4 on spinner
size                 none    filesystem   file
2GiB (fits in ram)   00:13   00:41        01:00
33GiB                05:21   05:31        13:45
so if the backup fits in memory, there is a large difference between all
of the modes (expected), but as soon as it exceeds the memory size,
the difference between not syncing and syncing the fs at the end becomes
much smaller.
I also tested on an NVMe, but there the syncs basically made no difference.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
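Sketched as an enum, with the variant names taken from the description above (the api-macro/serde attributes of the real type are omitted):
```rust
#[derive(Clone, Copy, Default)]
pub enum DatastoreFSyncLevel {
    /// No (f)syncs at any point (previous behavior).
    #[default]
    None,
    /// Issue a syncfs(2) on the datastore's filesystem at the end of a backup.
    Filesystem,
    /// fsync each inserted chunk and the containing directory handle.
    File,
}
```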
previously the BackupGroup trait used the datastore's
namespace_path() method to construct a base path. this would result in
it returning an absolute path equivalent to full_group_path(). use
the namespace's path() method instead to get a relative path, in line
with backup_dir's relative_path().
Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
the remaining ones are:
- type complexity
- fns with many arguments
- new() without default()
- false positives for redundant closures (where closure returns a static
value)
- expected vs actual length check without match/cmp
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
In the API we want to iterate over all backup groups
belonging to a particular type at least once, and iterating
through *everything* and simply "skipping" over every single
entry from another type makes no sense given that the groups
are organized into subdirectories based on their type.
Let's have an `.iter_backup_type()` method which returns an
iterator, named `ListGroupsType`, over all the groups of a specific
type, and factor the type-level iterator out of `ListGroups` for
reuse.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
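A hedged sketch of that factorization; `BackupGroup` is a stand-in here and the real iterators wrap directory read handles rather than vectors:
```rust
use std::vec::IntoIter;

pub struct BackupGroup; // stand-in for the real type

/// Iterates the backup groups of a single type (one `<type>/` subdirectory).
pub struct ListGroupsType {
    groups: IntoIter<BackupGroup>,
}

impl Iterator for ListGroupsType {
    type Item = BackupGroup;
    fn next(&mut self) -> Option<Self::Item> {
        self.groups.next()
    }
}

/// Iterates all groups by draining one `ListGroupsType` per backup type.
pub struct ListGroups {
    types: Vec<ListGroupsType>,
}

impl Iterator for ListGroups {
    type Item = BackupGroup;
    fn next(&mut self) -> Option<Self::Item> {
        loop {
            let current = self.types.last_mut()?;
            if let Some(group) = current.next() {
                return Some(group);
            }
            self.types.pop();
        }
    }
}
```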