A newer than the Luminous version is shipped with buster, and our
ceph repos are on Nautilus (14.2) in PVE 6.
Allows to drop a check for really old ceph versions (< 10, so
Infernalis and older).
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
dmesg: libceph: bad option at 'conf=/etc/pve/ceph.conf'
After the upgrade to PVE 6 with Ceph Luminous, the mount.ceph helper
doesn't understand the conf= option yet. And the CephFS mount with the
kernel client fails. After upgrading to Ceph Nautilus the option exists
in the mount.ceph helper.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
the '.*' was greedy, also consuming all but one digits of the real percentage
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
switch to \s* instead of .*?, to prevent mis-interpreting potential
strings like '< 50%' or '0-50%'
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
ZFS supports the -p flag in the list command since a few years now.
Let us use the real byte values and avoid the error prone calculation
from human readable numbers that can lead to incorrect numbers if the
reported human readable value is a rounded number.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Getting the volume sizes as byte values instead of converted to human
readable units helps to avoid rounding errors in the further processing
if the volume size is more on the odd side.
The `zfs list` command supports the -p(arseable) flag since a few years
now.
When returning the size in bytes there is no calculation performed and
thus we need to explicitly cast the size to an integer before returning
it.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
to guess a valid volname for a targetstorage of a different type.
This makes it possible to migrate raw volumes between 'dir' and 'lvm'
storages.
It is only used when the storage type for the source storage X
and target storage Y differ and should work as long as Y uses
the standard naming scheme (VMID/vm-VMID-name.fmt respectively vm-VMID-name).
If it doesn't, we get an invalid name and fail, which is the old
behavior (except if X and Y have different types but the same
non-standard naming-scheme, where the old behavior did work).
The original name is preserved, except for a possible extension
and it's also checked if the format is valid for the target storage.
Example: mylvm:vm-123-disk-4 <-> mydir:123/vm-123-disk-4.raw
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
and also return the ID of the allocated volume. This option
allows plugins to choose a new name if there is a collision.
In storage_migrate, the API version of the receiving side is checked.
In Storage.pm's volume_import, when a plugin returns 'undef',
it can be assumed that the import with the requested volid was
successful (it should've died otherwise) and so volid is returned.
This is done for backwards compatibility with foreign plugins.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Instead of relying on list_volumes of Plugin.pm (which filters by
the content types set in the config), use our own to always
show the luns of an iscsi.
This makes sense here, since we need it to show the luns when using
it as base storage for LVM (where we have content type 'none' set).
It does not interfere with the rest of the GUI, since on e.g. disk
creation, we already filter the storages in the dropdown by content
type, iow. an iscsi storage used this way still does not show up
when trying to create a disk.
This also shows the luns now in the 'Content' tab, but this is also
OK, since the user cannot actually do anything there with the luns.
(Besides looking at them)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
With the option valid_target_formats it's possible
to let the caller specify possible formats for the target
of an operation.
[0]: If the option is not set, assume that every format is valid.
In most cases the format of the the target and the format
of the source will agree (and therefore assumption [0] is
not actually assuming very much and ensures backwards
compatability). But when cloning a volume on a storage
using Plugin.pm's implementation (e.g. directory based
storages), the result is always a qcow2 image.
When cloning containers, the new option can be used to detect
that qcow2 is not valid and hence the clone feature is not
available.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
a small grammar fix, and we now return ctime of all files, as
remaining storages are planned for the future omit this hint
completely.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this way the content listing api also returns the vmid on content
listings which, among other things, is useful for the gui for
filtering
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Do not pollute top-level private directory, use "storage" folder but
with backward compatibility.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1. Avoids the error
qemu-img: The new size must be a multiple of 512
for qcow2 disks.
2. Because volume_import expects disk sizes to be a multiple of 1 KiB.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
As /etc/pve/priv is already pretty polluted, having a
"<storage-id>.pw" file there smells like it could make problems in
the future.
So let the pbs pw file generator use /etc/pve/priv/storages as base
path.
Other storage should move also to that path in the future, if they
save such secrets anywhere in /etc/pve.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
since we redirect the output to our (insecure) socket, logfunc is only
used for STDERR anyway, so we might as well make it explicit on the
caller side.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
One the first write bringing the unit file in existence we can just
start it, after that we need to tell systemd that we want to actively
reload it.
While this is slightly shaky due to the fact that we do not check all
paths where such a unit could reside, it is something we can do
because earlier one couldn't have a unit/overwrite anyway (from
procfs mountinfo generated one do not support that) and does adding
such override ones from now on should work.
Also note that we can only get here in the "user does no weird stuff"
case when "cephfs_is_mounted" actively tells that there is no cephfs
mounted at the $mountpoint - at which time we can safely re-write the
potential updated unit file, reload and mount again.
So let's make our life a bit easier here until a user actually
complains about a rational issue for this, maybe we have PVE 7.0 then
and can get rid of that anyway :)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This fixes a potential races where fuse get's unmouted to late in the
shutdown process, i.e., at a time where network was down and it could
not talk to any MDS or monitor anymore.
We could fix it the same way we did once with the kernel based mount,
i.e., adding _netdev, but doing so would require to switch over from
"ceph-fuse" to "mount.fuse.ceph" which has better compatibility with
the common mount tool API.
As that helper exists we can reuse the newer systemd_netmount
ephemeral unit generator, only some options differ in name between
fuse and kernel variant.
So besides solving a potential issue we get a more unified handling
of those two cases.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
commit 54e0b0034b introduced the
"_netdev" option, for PVE 5.3. The systemd generator then correctly
resolved that in the following resulting order-dependencies:
> Wants=network-online.target
> Before=umount.target remote-fs.target
> After=remote-fs-pre.target system.slice network.target network-online.target -.mount
This worked well and all were happy. With the current systemd in 6.0
we sometimes get the local-fs ones there generated too. This is a
fallout from a try to better handling nested mount hierachies, where
a .mount unit needs to be mounter or unmounted, before or after,
respectively, the parent mount was processed. It seems that sometime
that glitches and thus a "RequireMountFor=/mnt/pve" gets thrown in
and result sometimes in the local-fs order constraints being added.
The issue now is, that one must not have ordering depends to all,
local-fs, local-fs-pre, remote-fs, remote-fs-pre, as that gets you a
ordering cycle. Systemd tries to solve that cycle by randomly
dropping one constraint and retrying. By luck this is a not so
important unit, and all goes on well. Most of the time one isn't that
lucky and something important gets dropped, for example:
> Jan 24 18:43:05 prod1 systemd[1]: sysinit.target: Found ordering cycle on systemd-timesyncd.service/stop
> Jan 24 18:43:05 prod1 systemd[1]: sysinit.target: Found dependency on systemd-tmpfiles-setup.service/stop
> Jan 24 18:43:05 prod1 systemd[1]: sysinit.target: Found dependency on local-fs.target/stop
> Jan 24 18:43:05 prod1 systemd[1]: sysinit.target: Found dependency on mnt-pve-cephfs.mount/stop
> Jan 24 18:43:05 prod1 systemd[1]: sysinit.target: Found dependency on remote-fs-pre.target/stop
> Jan 24 18:43:05 prod1 systemd[1]: sysinit.target: Found dependency on rbdmap.service/stop
> Jan 24 18:43:05 prod1 systemd[1]: sysinit.target: Found dependency on sysinit.target/stop
> Jan 24 18:43:05 prod1 systemd[1]: sysinit.target: Job remote-fs-pre.target/stop deleted to break ordering cycle starting with sysinit.target/stop
Then, most of the time the host reboot hangs for ~10 minutes, often
showing scapegoat units like the pve-ha-lrm being the cause of the
hang (even if no HA is configure >.<).
This behavior is fixed with newer systemd versions, e.g., the v244
from buster-backports, but that is not a real option for us for now.
So until 7.0 we generate the unit with the correct dependencies
directly in the ephemeral /run/ tmpfs backed systemd/system path and
start it.
While FUSE gets only the local-fs ordering constraint, it seems to cope
very well regarding such symptoms. But it _is_ racy and probably only
works due to systemd stopping it early as it has not much ordering
constraints at all.. It should be moved in the future nonetheless, as
there's a mount.fuse.ceph helper that should be not an issue.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This is but a hack, but we have no general helper/tools module here
and I do not want to do versioned dependencies for this fast-tracked
bugfix to pve-common, so I'll have to live with the shame for now.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
when listing volumes, otherwise an empty hash can be persisted into the
current worker's $vmlist, which could cause issues at various other API
endpoints.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
We can use 'list_images' to get the desired volume IDs in
'find_free_diskname' for most plugins. For the two LVM plugins, 'list_images'
potentially skips untagged volumes, so we keep the custom version. For the
RBD plugin, 'list_images' is much more costly than the custom version, so we
keep the custom version.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
we need to unprotect more snapshots than just the base one, since we
allow linked clones of regular VM snapshots. unprotection will only work
if no linked clones exist anymore.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
The size is required to be a multiple of volblocksize. Make sure
that the requirement is always met, so ZFS won't complain when we do
things like 'qm resize 102 scsi1 +0.01G'.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Letting LVM set the meta-data size internally was not a good idea, as
it produces really small metadata LVs. Adapts the same logic as the
installer.
Signed-off-by: Tim Marx <t.marx@proxmox.com>
Reviewed-By: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Dominik Csapak <d.csapak@proxmox.com>
Those come normally from virtual devices, like a IPMI disk, if no
media is attached. They spam the log really often on operations like
migrate, and are quite scare-mongering. So filter them out.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
it does not work:
disable RBD image features this kernel RBD drivers is not compatible with: fast-diff,object-map,deep-flatten
clone failed: could not disable krbd-incompatible image features 'fast-diff,object-map,deep-flatten' for rbd image: vm-123123123-disk-0@test: rbd: snapshot name specified for a command that doesn't use it
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
since 'pvesm export' and 'pvesm import' are connected via a pipe and
SSH, a fatal error in the former can lead to no valid header being
written to the pipe. handle this more gracefully by printing an easier
to understand error message, instead of uninitialized warnings with no
context.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Modern kernel, like 5.3, support all those features ('fast-diff',
'object-map', 'deep-flatten'), so we do not want to disable them
there. 5.0 already supports exclusive-locks, so no need to disable
exclusive locking there.
Further, we also want to profit from new features available, so let's
enable those which can be enabled "live" (i.e., after image creation)
if their available.
While we could also parse the kernel information directly from:
/sys/module/libceph/parameters/supported_features
there's not much advantage to that, features cannot be disabled with
KConfig, their also very dependent of the kernel version booted.
So for us it's enough to check that one.
This only affects container and VMs backed by a storage with KRBD
explicitly enabled. But as the enabling and disabling happens
transparently, it has no effect on the running guest.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The bugfix for #2317 introduced a kind of odd API behavior, where
each volume was returned twice from our API if a storage has both
'rootdir' & 'images' content types enabled. To give the content type
of the volume an actual meaning, it is now inferred from the
associated guest, if there's no guest or we don't have an owner for
that volume we default to 'images'.
At the volume level, there is no option to list volumes based on
content types, since the volumes do not know what type they are
actually used for.
Signed-off-by: Tim Marx <t.marx@proxmox.com>
When adding a zfspool storage with 'pvesm add' the mount point is now
added automatically to the storage configuration if it can be
determined. path() does not assume the default mountpoint anymore,
fixing 2085.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
When working with several ZFS over iSCSI / LIO storages, we might lookup
between them with less than 15 sec interval.
Previously, the cache of the previous storage was used, which was breaking
disk move for example
Signed-off-by: Daniel Berteaud <daniel@firewall-services.com>
The common ZFSPlugin was missing volume name parsing
in a few places. This was not a problem for standard
volumes, but broke functionnalities (like resize,
snapshot, rollback) with linked clones as the name of
the zvol must be extracted from the entry in the config
(remove base-X-disk-Y prefix)
Signed-off-by: Daniel Berteaud <daniel@firewall-services.com>
In the default config, emulate_tpu is set to 0, which disables
unmap support. Once enabled, trim can run from guest to reclaim free
space.
Signed-off-by: Daniel Berteaud <daniel@firewall-services.com>
It's not needed, LIO sees the new size automatically.
And it was broken anyway. Partially fix#2335
Signed-off-by: Daniel Berteaud <daniel@firewall-services.com>
Using the json output, as suggested by Thomas, we now die if the decoding
fails and, if not, all return values are set to the corresponding decoded
values. That should prevent any unforeseen null size values, except if
qemu-img info reports it, which we then consider as valid.
Signed-off-by: Tim Marx <t.marx@proxmox.com>
Migration with --targetstorage was broken because of this.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
$1 and $2 get set to undef from the vmid filter regex, so we have to do
the name/format regex after, else we get errors like:
'use of unitiialized value $1[...]'
and the listing is empty
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The patch uses the value from the field 'stored' if it is available.
In Ceph 14.2.2 the storage calculation changed to a per pool basis. This
introduced an additional field 'stored' that holds the amount of data
that has been written to the pool. While the field 'used' now has the
data after replication for the pool.
The new calculation will be used only if all OSDs are running with the
on-disk format introduced by Ceph 14.2.2.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
To maintain full (backwards) compatibility, leave the type name as
'iso' - this makes this patch work without changing every consumer of
storage APIs.
Note that currently these files can only be attached as a CDROM/DVD
drive, so USB-only images can be uploaded but might not work in VMs.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
and actually do that not just for creating zvols, but also when
activating them. this should fix a range of issues/races that sometimes
occured on bootup, snapshot rollback or similar operations.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
plugins can still override list_volumes if they want separate methods to
list rootdir and images content.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Previously, the web GUI timed out when removing content (e.g. backup) took
too long. Doing the main part of the API DELETE call in a fork_worker solves
this.
Signed-off-by: Dominic Jäger <d.jaeger@proxmox.com>
since list_volumes is only supposed to be called with filtered content
types, this should ensure that get_subdir is only called for plugins
that have a defined 'path' property, like the old code in
PVE::Storage::template_list did.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
we can only do this here, since the ceph cluster is not aware of
osd encryption, only the local node is (via ceph-volume and lv tags)
this way, we are able to show an 'encrypted' flag in the disk gui at least
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
previously ceph included a udev rule to populate
/dev/disk/by-parttypeuuid/
but not anymore, so we now use 'lsblk --json -o path,parttype' to
get a mapping between parttype uuid and partition
fix the test by simulating empty lsblk output
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
To allow getting closer to finally drop "pvecm mtunnel".
Code parts taken from pipe_socket_to_command
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[regex fixup]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
checking '$server:$subdir' is too strict to work in all cirumcstances,
e.g. adding/removing a monitor would mean that it is not the same
anymore, same if one is adding/removing the ports from the config
check only if the subdir is the same and if it is a cephfs
this way, it still returns true if someone changes the config
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
since we write only the mon_host config beginning with nautilus,
we have to get the monitor ips from there as well
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
; is the beginning of a comment, but in some configuration settings
it is also valid syntax, e.g. for mon_host it is a valid
seperator for hosts (sigh ...)
only remove lines when it starts with a ';'
since we remove all comments anyway any time we write the ceph conf
it should not really matter for our users
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we want a consistent config has, regardless of how the user or a tool
adds it to the config, so we map ' ' and '-' to '_' in the keys
this way we can always access the correct key without trying multiple
times
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
don't require any specific file types, if something is here which can
be requested to delete over API/CLI it either comes from an API/CLI
operation and should be thus OK to delete, else a user caused the
creation of the special file and it either works and all is good or
the user gets notified as we check if unlink succeeded anyway
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Symlinks with a non-existing target fail Perls '-f' test and were thus
not deleteable via the API (failing with '$path does not exist').
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Broken symlinks (and other files without a size) will now show up as 0
byte instead of causing a format validation error in the API.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Since ceph-fuse is called directly in the CephFS storage plugin, which
can not process the _netdev option, mounting the CephFS storage fails
when fuse is set in the storage.cfg.
This patch moves the _netdev option into the else part of the if fuse is
set statement. _netdev is only added if the CephFS kernel client mounts
the storage.
It seems _netdev is not needed anyway for the fuse mount, as the
connection is closed, once the fuse process gets killed on shutdown.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
simply chmod the temp file before copying to the "correct" permission
mode, where all users with access to the directory can read the file,
to mirror the behavior one gets for a apl_download call.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
as already announced over two months ago[0], remove the unofficial
SheepDog plugin now completely. Besides that it was never fully
supported in Proxmox VE one of its main developer and ex-maintainer
declared it as abandoned[1], and thus just let's remove it, git
allows to resurrect it any time if a wonder happens anyway.
[0]: https://pve.proxmox.com/pipermail/pve-user/2019-March/170497.html
[1]: http://lists.wpkg.org/pipermail/sheepdog/2019-March/068449.html
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we will use this for adding a partition to a disk when using a device
for ceph osd db/wal which already has partitions on it
first we search for the highest partition number, then add the partition
and search for the resulting device (we cannot assume to simply
append the number, e.g. from /dev/nvme0n1 we get /dev/nvme0n1pX)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we now expect the first parameter to be either a string with a single
disk, or an array ref with a list of disks
this way we can get the info of multiple disks simultaneously while
not iterating over all disks
this will be used to get the info for osd/db/wal disk
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Less reading and the own name for the variable should helps to grasp
more quickly what it should contain
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
ceph-volume creates osds/journal/etc. on LVM instead of partitions,
so to detect them, we have to parse the lv_tags of the LVs and
match them with the underlying device
also add tests for this detection
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
With this, users can select disks with LVM on it for journal/db devices,
which will be necessary for ceph nautilus, since there
journals/db/wal will be put on an LV
of course when creating an osd, we have to detect if that
is ok (probably based on the vg name on it)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the delete parameter get's injected by the SectionConfigs
updateSchem, but we need to handle it ourself in the code
This makes the following possible:
pvesm set STORAGEID --delete property
Also the API equivalent is now possible. Adapted from the HA
managers Resource update API call.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Adds a fallback to 'Plugin::path' in the default implementation of
'map_volume' to simplify a common case of calling 'map_volume' followed
by a defined-check and a call to path if it is not. The path is now
always returned if the plugin in question does not override
'map_volume'.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
The underlying issue is that a zpool can get imported only once, so
we first check if it's in `zpool list`, and thus imported, and only
if it does not shows up there we try to import it.
But, this can race with either:
* parallel running activate_storage call, through CLI/API/daemon
* a zpool import from an admin (a bit unlikely, but hey that's the
thing with race conditions ;))
So refactor the "is pool imported" check into a closure, and call it
addditionally if the import failed, and silent the error if the pool
is now listed, and thus imported. This makes it a little bit nicer to
read too, IMO.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Vzdump log files were not deleted when a backup was deleted.
Consequently, the folder continuously filled with .log files.
Now they get deleted after the backup is removed.
Signed-off-by: Dominic Jäger <d.jaeger@proxmox.com>
Since zfsutils are not a hard dependency of our stack it is possible to not have
`zpool` available.
Checking for existance of `zpool` before calling it suppresses spurious warnings
in the logs (e.g. when creating Ceph OSDs or accessing the 'Disk' Tab in the
GUI).
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
If one of the storages passed in $storage_list was not defined
get_bandwidth_limit died (see [0], of an occurence of this).
This patch changes the behavior to ignore undef as storage instead.
[0] https://pve.proxmox.com/pipermail/pve-devel/2019-April/036515.html
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
during storage activation.
for pools that don't get imported at boot (e.g. because their vdevs are
not available when zfs-import-*.service runs) it is fatal to include
them in the cachefile, for those that do get imported at boot this code
should never run anyway as they are already imported.
in any case, a fallback to import without cachefile is the safe variant.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
fixes the 'cannot create 'nvme/foo': volume size must be a multiple of
volume block size' error by always rounding the size up to the next 1M
boundary. this is a workaround until
https://github.com/zfsonlinux/zfs/issues/8541 is solved.
the current manpage says 128k is the maximum blocksize, but a local test
showed that values up to 1M are allowed. it might be possible to
increase it even further (see f1512ee61).
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
Passing 'undef' as '$storage_list' led to a warning about using an
uninitialized value as array_ref.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
the test would read the real device and if one is an iscsi device
it would fail, move the test code to a sub and mock it in the tests
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
When trying to create a qcow2 disk image with a size larger than available on the
storage, this will fail.
As qemu-img does not clean up the disk afterwards, it needs to be deleted
explicitly. Further, the vmid folder is cleaned up once it is empty.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
we only allow '-' '_' and '.' int storage-ids and names,
and we do not need to escape '_' and '.' (see man 5 systemd.unit)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
while the commit message tells it nicely a comment should add
additional info for people just giving this a quick look
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(custom) templates might contain sensitive data, so require at least
read access on the underlying storage to access ISO and template files.
the same permissions are already needed for listing them, so this is
unlikely to cause fallout.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
will be used to contain files which can be executed as hookscripts or
contain custom cloud-init configs
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
use the port where the main glusterfs daemon listens on as ping port,
this one is also used by QEMU as default.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Remove directories if they are empty, which can happen if all images
from a VM got deleted, e.g., after destroying said VM.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
LVMPlugin->volume_import (used by storage_migrate on either offline
migration with local disks, or online migration with storage-only
referenced disks) passed 'conv=sparse' to `dd`. This can lead to
data-corruption, if the target volume is not zero-initialized.
dropping the sparse argument completely would fix the problem, but
breaks keeping data sparse for LvmThinPlugin.
This patch moves the dd out into (LVM*) plugin specific sub so that
each can control the parameters.
Steps for reproducing the issue:
* create a cluster with (at least) 2 nodes A and B, with a free
disk-device (/dev/sdx)
* write a recognizable pattern to /dev/sdx on B:
`dd if=/dev/zero bs=10M | tr '\000' '\255' | dd of=/dev/sdb bs=10M`
(would be grateful for alternatives to the dd| tr| dd)
* on both A and B create a lvm-vg (pvcreate, vgcreate)
* add it as _not_ shared storage, which is available on nodes A and B
* create a small guest on A
* fill a file in the guest with zeros
`dd if=/dev/zero of=/zerofil bs=10M`
* stop the guest, migrate it to B
* start the guest - check that the file `/zerofil` contains `ad`
instead of `00`
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
From `man 8 lvchange`:
--refresh
If the logical volume is active, reload its metadata. This is not
necessary in normal operation, but may be useful ... if you're doing
clustering manually without a clustered lock manager.
Fixes migration in a shared LVM (iscsi) setup, where a disk gets resized on one
node A and the guest is afterwards migrated to another node B: B still presents
the old size to the guest, leading to data corruption.
It is necessary to run `lvchange` twice because the options `-ay` and
`--refresh` are mutually exclusive.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Without volume_size_info a Storage plugin falls back to the Implementation
in PVE/Storage/Plugin.pm, which relies on `qemu-img info`.
`qemu-img info` returns wrong results on a node in the case of shared volume
groups (e.g. when sharing disks via iSCSI), if a disk was resized on another
node (it lseeks to the end of the block-device, and this yields the old size).
Using lvs directly fixes the issue, since the LVM metadata gets updated when
invoking lvs.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Else, if a general MON section existed in the ceph.conf, we added a
undefined entry and a cephfs storage can't be mounted anymore.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
With this we can use cfs_read_file/cfs_write_file and cfs_lock_file.
Code for writing is mostly copied from pve-managers CephTools.pm,
with the addition of mgr sections.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
it is not a storage plugin, and it makes more sense to have it
top-level, but there we cannot name it CephTools because of the
existing ones in pve-manager
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
`nvmeX` devices nodes are apparently allocated independently
from their namespace block devices `nvmeXnY` and therefore
they are not strictly related by name. For instance:
$ readlink /sys/block/nvme0n1/device
../../nvme1
$ readlink /sys/block/nvme1n1/device
../../nvme0
Here /dev/nvme0n1 is the first namespace of /dev/nvme1 while
/dev/nvme1n1 is the first namespace of /dev/nvme0.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
As we mount this manually and thus systemd doesn't know about any
dependency for cephFS mounts, this got umounted only at the last
stage of shutdown, where network wasn't active anymore.
But, CephFS needs to be connected to an active MDS for a clean
unmount so without network this mount would delay shutdown for quite
a bit, until after some minutes systemd gave up and forced unmount.
So tell systemd that this mount requires network, which can be done
with the '_netdev'[0] mount option, that lucky for us can be also
passed to a mount call and isn't only available for fstab.
with this a mount gets, among others:
> Wants=network-online.target
> Before=umount.target remote-fs.target
> After=remote-fs-pre.target system.slice network.target network-online.target -.mount
Which does the trick for us.
[0]: https://www.freedesktop.org/software/systemd/man/systemd.mount.html#_netdev
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Change to a cleaner sub command interface grouping all scan commands.
Alias to old command names for backward compatibility
Best viewed with the ignore whitespace/indent change '-w' flag from
git
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
With the addition of the map/unmap_volume() methods we made
an (actually unnecessary) API version bump.
All current users of these methods fall back to path() when
they return undef, so plugins implementing version 1 are
in fact compatible currently. (In fact, the default
Plugin::map_volume() could fall back to it on its own, but
doesn't currently).
For now let's just allow plugins older plugins to also be
loaded by introducing an API age variable. With it, if we
have a reason to break older plugins, we can have a
deprecation period during which older plugins cause a
warning instead of refusing to load altogether.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This is important for shared LVM storages. As with deletes and
creates of images, as else we may have not the up-to-date metadata
and extents may get reused if another node created an image during
the same time, for example.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
as described in #862:
> I experienced a problem with ISCSI portal when using a hostname and
> not IP.
> The GUI resolves the hostname to an IP and writes it to storage.cfg.
> As my setup requires hostnames, i needed to change the config
> manually back to the hostname which is working fine.
>
> Why is this conversion done? If I enter a hostname, i want to have a
> hostname. If i enter an IP address i want to have an IP address.
This makes sense to me, a feature of using domains is that they
are/should be resolved when actually using (i.e., connecting to them)
so resolving it once on add does not seems like a good idea (if I do
not miss something - as this is a classic "imported from SVN" I do
not have any rationale to look at).
So save the work and pass it as is.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This allows to request a mapped device/path explicitly, regardles of
the storage option, eg. krbd option in the RBDplugin.
Bump of the storage ABI => 2
Co-authored-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
this was not installed, had references to non-existing modules
(e.g., PVE::ReplicationTools) and the things it probably was intended
for are done in pve-manager, which has full access to all pve perl
libs and API methods (from a dependency POV)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
to have a clear method name for this. check_XYZ suggests also that we
return true if the check was OK, but we don't.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Accessing a non-existing 'format' key in plugindata (e.g., in LvmThinPlugin),
created it by autovivication, thus breaking the fallback to the default value
'raw' upon any following access.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
$features is actually an array reference, so use it as one.
This broke creation and migration of disks on rbd storages
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This was newly introduced and is only used once, so having a
wantarray return mechanism, without ever using it or knowing for sure
if this may help with reuse of the method is not ideal.
Make the sub a module private one just returning the vm disk number
or explicit undef. Pass it the $suffix variable, to avoid recomputing
it every time called by out caller's loop.
If there's re-use potential in the future we can actually decide what
makes sense to return.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
else we can only have MAX_VOLUMES_PER_GUEST-1 disk per VMID,
not tragic but possible confusing
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Non conforming image names are not ignored anymore by the new rbd_ls
implementation, this patch adds the old behaviour.
This fix is a temporary workaround and should be removed, once the new
image name parser is ready.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
since ceph changed the plain output format for 12.2.8
we have to change the code anyway, and when were at it,
we can change it to the (hopefully) more robust json output
Co-authored-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the syntax for creating a pool with a single disk is
not the same as for mirror, so let the user select it
explicitely
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
lvm_find_free_diskname only checked for existing volumes starting with 'vm-',
and not with 'base-'.
Unify implementation with other Plugins.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
move two loop bodies from
if (condition) {
...
}
too
next if !condition;
...
to save an indentation level
rename variables to a bit shorter version, i.e.:
s/oneTarget/target/
s/oneTpg/tpg/
and a comment rewording
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Introducing LIO/targetcli support allowing to use recent linux
distributions as iSCSI targets for ZFS volumes.
In order for this to work, two preconditions have to be met:
1. the portal has to be set up correctly using targetcli
2. the initiator has to be authorized to connect to the target
based on the initiator's InitiatorName
When adding a LIO iSCSI target, a new "LIO target portal group" field needs
to be correctly populated in the "Add: ZFS over iSCSI" popup, containing the
fitting "LIO target portal group" name (typically something like 'tpg1').
Signed-Off-By: Udo Rader <udo.rader@bestsolution.at>
Tested-by: Stoiko Ivanov <s.ivanov@proxmox.com>
creates/lists systemd mount units for /mnt/pve/.*
filetypes allowed are ext4 and xfs for now
mount with /dev/disk/by-uuid
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
like the LVM API, but return an array for the list,
because we do not have nested data here
and create a vg and thin lv with the name given and use the full size
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if no vg is given, give back all thinpools from all vgs
if verbose is 1, then give back the information about the thinpools
(like size and free)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>