Prevent an error when disabling features of an rbd image whose flags are
already disabled. This aborted CT/VM cloning halfway through, leaving a
leftover rbd image but no vmid.conf for it.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Takes an operation, an optional requested bandwidth
limit override, and a list of storages involved in the
operation, and caps the requested bandwidth at the global
and storage-specific limits unless the user has permission
to override those.
This means:
* Global limits apply to all users without Sys.Modify on /
  (users with Sys.Modify can change datacenter.cfg options via the API anyway).
* Storage specific limits apply to users without
Datastore.Allocate access on /storage/X for any involved
storage X.
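Roughly how a caller would use such a helper (function name, argument order
and units are assumptions for illustration, not taken from this patch):

    use PVE::Storage;

    # request 100 MiB/s for a restore touching two storages; the helper returns
    # the requested value clamped to the global and per-storage limits, unless
    # the user is allowed to override them (units assumed to be KiB/s)
    my $bwlimit = PVE::Storage::get_bandwidth_limit(
        'restore',               # operation
        ['local', 'local-lvm'],  # storages involved
        100 * 1024,              # requested override
    );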
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
we will use this in the gui to figure out if we have to show
a size selector, a file selector, which formats are available, etc.
we have to include this data even for storages that are not active,
else we cannot show the correct fields
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this will be used in the gui to determine whether we need to select
something from the storage when using it for an image
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
in get_disks, when called with a parameter like 'cciss/cXdY', we replace
the '/' with '!' so that we can properly poll the information
about it from /sys/block/
but we have to replace the '!' with '/' again in our result list,
because the caller does not know anything about this conversion and
fails since the original dev is not in the list
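A minimal sketch of the two conversions (illustrative):

    # /dev/cciss/c0d0 shows up as /sys/block/cciss!c0d0, so convert for the
    # sysfs lookup ...
    (my $sysfs_name = $dev) =~ s|/|!|g;

    # ... and convert back before putting the name into the result list, so
    # the caller finds the device it originally asked about
    (my $result_name = $sysfs_name) =~ s|!|/|g;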
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We otherwise use the long options everywhere in the plugin.
This will build the following command:
iscsiadm --mode session --sid 1 --rescan
Rescanning session [sid: 1, target: xxx, portal: yyy]
preserve the old behaviour of selecting auth_supported based on the
existence of the keyring, but limit it to external clusters.
this allows switching 'auth XXX required' in the pveceph-managed
ceph.conf while still automatically copying the keyring when adding a
storage.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
add /etc/pve/ceph.conf to commands / option strings instead
of the monitor list provided via the 'monhost' option.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to allow differentiating between user-created external RBD storage
entries (WITH monhost), and those created and managed by pveceph
(without).
making monhost non-fixed allows easily opting into the managed behaviour via
'pvesm set STORAGE -delete monhost', but is also helpful for external clusters
(e.g., after adding or removing a monitor you need to update the monhost
parameter)
adapt description accordingly.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
This makes is_mountpoint behave more like export(5)'s `mountpoint`
property.
Given a directory storage with the properties:

    path /a/b/c
    is_mountpoint $value

$value = yes
    Same as before, /a/b/c must be mounted.
$value = no (or not set)
    Same as before, no effect.
$value = /a/b
    New: /a/b must be mounted (as opposed to /a/b/c).
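A rough sketch of the corresponding activate_storage() check (the
path_is_mounted helper is assumed here for illustration):

    my $mp = $scfg->{is_mountpoint};
    if (defined($mp) && $mp ne 'no') {
        # 'yes' keeps the old behaviour (the storage path itself must be
        # mounted), any other value names the mount point to check instead
        my $mountpoint = ($mp eq 'yes') ? $scfg->{path} : $mp;
        die "unable to activate storage '$storeid' - '$mountpoint' is not mounted\n"
            if !path_is_mounted($mountpoint);    # assumed helper
    }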
Accommodates the changes in 44ae567 and d40e27d by
reordering checks to allow proper filtering
of disabled storages. Also reorders two checks to
prevent autovivification, which resulted in disabled
storages always showing up in the output.
this patch adds information about bluestore/db/wal to the disklist,
and we set the journal count only when we have at least one journal on
the disk
also adapt the regression tests
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Add column names at the top of the output; this makes it easier to
understand what each column means.
Use leading spaces on the percentage column so that it lines up.
Replace the 1/0 in the active column with the actual status
(active, inactive, disabled).
Show N/A if a storage is disabled.
Use $res->{total} instead of calculating a sum of used and available.
Remove the wrong rounding: to display 2 digits of the
fractional part we would need to add 0.005, not 0.5, so the
result was quite wrong depending on the storage size.
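A tiny numeric illustration of the rounding problem (values made up):

    my $ratio = $used / $total;                      # say 0.4321
    my $ok  = int(($ratio + 0.005) * 100) / 100;     # 0.43 - correct 2-digit truncation
    my $bad = int(($ratio + 0.5)   * 100) / 100;     # 0.93 - off by roughly 0.5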
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
in the Storage/Status API call we have an 'enabled' param which had no
effect because storage_info only returned enabled storages either way.
This also affected `pvesm status`, which uses the Storage/Status API
call.
So push disabled storages to the info array as well, but only activate
them and get their status when they are enabled.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Ceph changed the ceph version output.
Full output of 'ceph --version':
Luminous: 'ceph version 12.1.0 (262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)'
Jewel:    'ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)'
Wrap the -d test with the run_or_get_killed sub; this test
can make pvestatd hang on I/O wait when an nfsd process is stopped.
This might help with other file-based storages too, for instance
directory storages on unplugged USB devices.
This replaces the path-based and lvm/thin special cases in
storage_migrate with the zfspool case, which is generic enough:
it already uses import/export and no longer depends
directly on zfs.
All of them have a `+size` prefix to show that they're not
"pure raw" or "pure tar" streams, because some storages may
need to know in advance how much space to allocate.
The formats are explained in comments.
PVE::Storage::Plugin now has default implementations for
these for non-incremental streams exporting the current
state (rather than a snapshot).
To use the qcow2 or vmdk formats, $with_snapshots must be true;
otherwise raw/tar will be used, where $with_snapshots must
be false.
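A sketch of how a caller might pick one of these stream formats
($file_format and $is_subvol are illustrative variables):

    my $format;
    if ($with_snapshots) {
        # qcow2/vmdk streams carry the internal snapshots along
        $format = $file_format eq 'vmdk' ? 'vmdk+size' : 'qcow2+size';
    } else {
        # plain images go out as raw+size, subvolumes as tar+size
        $format = $is_subvol ? 'tar+size' : 'raw+size';
    }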
this caused the webinterface to sort alphabetically instead of numerically
when sorting by image size
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The volume_snapshot call was missing the condition for when to
create a snapshot. Make the whole logic easier to follow
with a $migration_snapshot boolean.
Also get rid of the remote `pvesm free -snapshot` call by
using import's new -delete-snapshot parameter.
This deletes a snapshot on *success*, done directly in the
CLI handler, as the rollback/delete on failure is already
happening inside the plugin's import method.
It is possible to synchronise a volume to another node at a defined interval,
so if a node fails there will be a copy of a VM's volumes
on another node.
With this copy it is possible to start the VM on that node.
since we allow vm-ID-whatever when allocating images, we
should also include those when listing them.
note: '@' is reserved for snapshots in ceph, so it is safe to
skip lines including an '@' in the image name.
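A sketch of the broadened listing loop (regex and surrounding variables are
illustrative, not copied from the patch):

    foreach my $image (@rbd_ls_output) {
        # '@' is reserved for snapshots in ceph, skip those lines entirely
        next if $image =~ /\@/;
        # accept any vm-<ID>-* or base-<ID>-* name, not only the -disk-N pattern
        next if $image !~ m/^(?:vm|base)-(\d+)-/;
        push @$res, { volid => "$storeid:$image", vmid => $1, format => 'raw' };
    }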
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
with more than a few images, 'rbd ls -l' gets rather slow
compared to a simple 'rbd ls'. since we only need to check
existing image names for finding a free one, the latter is
sufficient.
example with ~400 rbd images:
$ time rbd ls -p ceph-vm > /dev/null
real 0m0.027s
user 0m0.012s
sys 0m0.008s
$ time rbd ls -l -p ceph-vm > /dev/null
real 0m5.250s
user 0m1.632s
sys 0m0.584s
a linked clone of two disks on the same setup accordingly
also shows a massive speedup:
$ time qm clone 1000 10000 -snap test
create linked clone of drive scsi0 (ceph-vm:vm-1000-disk-2)
clone vm-1000-disk-2: vm-1000-disk-2 snapname test to
vm-10000-disk-1
create linked clone of drive scsi1 (ceph-vm:vm-1000-disk-1)
clone vm-1000-disk-1: vm-1000-disk-1 snapname test to
vm-10000-disk-2
real 0m11.157s
user 0m3.752s
sys 0m1.308s
$ time qm clone 1000 10000 -snap test
create linked clone of drive scsi1 (ceph-vm:vm-1000-disk-1)
clone vm-1000-disk-1: vm-1000-disk-1 snapname test to
vm-10000-disk-1
create linked clone of drive scsi0 (ceph-vm:vm-1000-disk-2)
clone vm-1000-disk-2: vm-1000-disk-2 snapname test to
vm-10000-disk-2
real 0m0.872s
user 0m0.652s
sys 0m0.096s
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
With krbd we resize the volume and tell QemuServer to notify the running QEMU
with zero $size by returning undef.
Signed-off-by: Dmitry Petuhov <mityapetuhov@gmail.com>
there was still a point where we got the wrong string
on createosd we get the devpath (/dev/cciss/c0d0)
but need the info from get_disks, which looks in /sys/block
where it needs to be cciss!c0d0
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
otherwise there are situations where snapshots are left
behind for already sent volumes. also include more warnings.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this works around a bug, where qemu does not align the qcow2 file
when using the filesystem directly, and the gluster blockdriver
refuses to read from it
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the old code was way too broad here, this fixes at least the
following issues:
- importing of other/unconfigured zpools by "import -a"
- possible false positives if a pool name is a substring of
another pool name because of "list" without pool name,
potentially skipping activation for such pools
- not noticing failure to activate in activate_storage
because the success of "zpool import -a" does not tell us
anything about the pool we actually wanted to import
checking specifically for the pool to be activated when
calling "zpool list" gets rid of the second issue, and
trying to import only that pool fixes the other two.
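Roughly the resulting activation logic (run_command is PVE::Tools::run_command;
the exact invocations are sketched here, not copied from the patch):

    my $pool = $scfg->{pool};

    # ask for this specific pool instead of grepping the full 'zpool list' output
    my $imported = eval {
        run_command(['zpool', 'list', $pool], outfunc => sub {});
        1;
    };

    if (!$imported) {
        # import only the pool we actually want; a failure now propagates
        # instead of being hidden by a successful 'zpool import -a'
        run_command(['zpool', 'import', $pool]);
    }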
we want this, because the model in /sys/block/<device>/device/model
is limited to 16 characters
and since the model is not always in the udevadm output (nvme),
also read the model from the model file as fallback
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
since we iterate over the entries in /sys/block
it makes sense to use this path
this should fix #1099,
because udevadm does not take
-n cciss!c0d0 (because it only looks in /dev for this)
but takes
-p /sys/block/cciss!c0d0
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
without this, having an efidisk on a ceph storage
prevents creating another disk on the same
ceph storage, because it will not be detected
and we try to allocate one with the same name
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
refactored the wear level parsing into its
own function, where we can now define a
vendor <-> attribute id
mapping
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
instead of parsing the output of smart in two places,
give get_smart_data a flag if we only want health
this fixes a bug (not on the bugtracker), where
an ssd with disabled smart had an empty string as health
in the gui
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the smart checks are only needed for the API call(s) that
list all disks and their status, but get_disks is also used
in disk usage checks and in the Ceph code, where the smart
status is completely irrelevant.
drop the implicit skipping of smart checks if $disk is set,
since we have an explicit parameter for this now.
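Call sites then look roughly like this (the parameter name is an assumption):

    # disk-listing API: we do want the (slow) smartctl queries here
    my $all_disks = PVE::Diskmanage::get_disks(undef, 0);

    # disk-usage checks / ceph code: skip smartctl entirely
    my $disk_info = PVE::Diskmanage::get_disks($devpath, 1);  # $nosmart = 1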
because we never ever want to die in get_disks because of a
single disk, but the nodes/xyz/disks/smart API path is
allowed to fail if a disk device is unsupported by smartctl
or something else goes wrong.
While the mkdir option deals with the case where we don't
want to clobber a mount point with directories (like ZFS,
gluster or NFS), putting a directory storage directly onto a
mount point is still risky:
If the path exists - which it usually does even if not
mounted - the storage will be considered successfully
activated, but empty (or with unexpected content). Some
operations will then lead to unexpected problems: the
free_disk operation for instance only warns if the disk does
not exist, but does not throw an error. In this case the
configuration might be updated without the real disk being
deleted. Once it's mounted back in, later operations which
check existing disks which are not part of the current VM
configuration (like migration) might error unexpectedly.
This adds an 'is_mountpoint' option to directory storages
which assumes the directory is an externally managed mount
point (eg. fstab or zfs) and changes activate_storage() to
throw an error if the path is not mounted.
So far this only prevented the creation of the toplevel
directory. This does not cover all problem cases,
particularly when said directory is supposed to be a mount
point, including NFS and glusterfs besides ZFS.
The directory based storages we have already use mkpath
whenever they need to create files, and for actions on files
which are supposed to exist it's fine if it errors out.
So it should also be safe to skip the creation of standard
subdirectories in activate_storage().
Additionally NFS and glusterfs storages should also accept
the mkdir option as they otherwise may exhibit similar
issues, eg. when an NFS storage is mounted onto a directory
inside a ZFS subvolume.
since the rbd images themselves are named differently than
the volumes in our config files, we need to recreate this
information from the parent relation in the ceph metadata,
otherwise list_images() might return wrong volume names/IDs
since list_images is used by PVE::Storage::vdisk_free() to
check for children still referencing a base image, the
wrong volume ID means RBDPlugin->parse_volname() does not
detect the base image of linked clones and the check fails.
this is thankfully mitigated by the protected status of the
base snapshot, but creates a rather confusing error message.
scenario (VM 701 is a linked clone of template VM 700):
$ qm config 700 | grep virtio0:
virtio0: ceph_qemu:base-700-disk-1,size=2G
$ qm config 701 | grep virtio0:
virtio0: ceph_qemu:base-700-disk-1/vm-701-disk-1,size=2G
before (pvesm list reports wrong volume ID, check fails):
$ pvesm list ceph_qemu
ceph_qemu:base-700-disk-1 raw 2147483648 700
ceph_qemu:vm-701-disk-1 raw 2147483648 701
$ pvesm free ceph_qemu:base-700-disk-1
snap_unprotect: can't unprotect; at least 1 child(ren) in pool rbd
rbd unprotect base-700-disk-1 snap '__base__' error: snap_unprotect: can't unprotect; at least 1 child(ren) in pool rbd
after (correct volume ID, check works as intended):
$ pvesm list ceph_qemu
ceph_qemu:base-700-disk-1 raw 2147483648 700
ceph_qemu:base-700-disk-1/vm-701-disk-1 raw 2147483648 701
$ pvesm free ceph_qemu:base-700-disk-1
base volume 'base-700-disk-1' is still in use (use by 'base-700-disk-1/vm-701-disk-1')
since smartctl uses the return value to encode
disk health status (such as failure in the past)
we cannot die there, but have to parse the returncode
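A sketch of what parsing the return code amounts to (bit semantics per
smartctl(8); the exact mask treated as fatal is an assumption):

    # smartctl encodes its result in an exit-code bitmask: low bits signal real
    # invocation errors (bad arguments, device could not be opened), higher
    # bits describe the disk's health history (e.g. failures in the past)
    my $returncode = run_command($cmd, noerr => 1, outfunc => $parser);
    die "smartctl failed\n" if $returncode & 0b011;   # only bits 0 and 1 are fatal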
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
adds a new class (intended to be used under nodes in pve-manager)
which adds three api calls: list, smart and init
list being a general list of the available disks with info
smart being a call to get the smart data from a given device
init being a call to write a gpt header to an unused disk
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this adds the functions for listing the disks (mostly copied from
the ceph code), checking if a disk is a valid blockdevice, if it
is used/in a zfs pool/as an lvm pv, and an init function (just to add a gpt header;
this is important if one wants to use a fresh disk for ceph journals)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
If you want to use different ceph storages, sometimes they have different
settings such as ms_nocrc = true (there are also others).
The client needs to specify these special options to be able to connect.
This patch allows creating a ceph config file for each storeid in
/etc/pve/priv/ceph/$storeid.conf
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
The PVE team cannot support specialized vendor-specific storage
plugins because of a lack of hardware. But we can allow users to
add their own plugins for their storages without the need to rewrite any
PVE code, and thus ease PVE updates for them.
The idea of this patch is to add the folder /usr/share/perl5/PVE/Storage/Custom
where a user can place his plugins; PVE will automatically load
them on start, or warn if it could not and continue. Maybe we could
even load all plugins (except PVE::Storage::Plugin itself) this way,
because current storage plugins are not really plugins if they
need to be explicitly loaded in PVE code :-).
Custom plugins MUST have an api() method returning the version for which
they were designed. If the API changes on the PVE side, the module is simply
not registered and a warning message is printed to the log, so the user has
to update the module. Until the module is updated, the corresponding storage
will just disappear from PVE, so an API change shall not cause any data
damage.
This approach works (with some limitations) if the plugin works in
the generic PVE way: full control of the volume lifecycle. It will not
currently work for custom plugins like iSCSI, which need to select
pre-existing volumes. Maybe someone will add a more flexible way for
pve-manager to select input elements for storage plugins to address
this.
Currently tested with my NetApp plugin.
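A minimal skeleton of such a custom plugin (module and type names are just
examples):

    package PVE::Storage::Custom::MyPlugin;  # under /usr/share/perl5/PVE/Storage/Custom/

    use strict;
    use warnings;

    use base qw(PVE::Storage::Plugin);

    # required: report the plugin API version this module was written for, so
    # pve-storage can refuse to register it after an incompatible API change
    sub api {
        return 1;   # version number purely illustrative
    }

    sub type {
        return 'myplugin';
    }

    # ... plus the usual Plugin methods (plugindata, properties, options, ...)

    1;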
Signed-off-by: Dmitry Petuhov <mityapetuhov@gmail.com>
ssh(1) mentions that compression is only desirable on slow
connections.
since migration from cluster node to cluster node needs a
fast network anyway, we can drop the compression for
a speed improvement
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This way we get parameter verification on monitor addresses
as well as the ability to pass multiple `--monhost`
arguments to `pvesm add`.
Since our '-list' schemas default to using commas we now
need to properly support these, so all uses of the monhost
property now replace any of semicolon, space or comma with
the currently required separator character.
This should fix the issues reported by Alwin Antreich on the
pve-user list.
Since this schema supports both ipv6+port notations we need
to make sure we convert to the bracket enclosed variant.
Added a helper for this.
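A sketch of the normalization plus the new bracket helper (helper name and
regex are assumptions for illustration):

    # accept ';', ',' or whitespace in the stored monhost property and re-join
    # with whatever separator the consumer needs (',' shown here)
    my @monhostlist = PVE::Tools::split_list($scfg->{monhost});
    my $hostlist = join(',', map { add_ipv6_brackets($_) } @monhostlist);

    sub add_ipv6_brackets {    # helper name assumed
        my ($addr) = @_;
        # wrap bare ipv6 addresses in brackets so an appended ':port' stays unambiguous
        return "[$addr]" if $addr =~ /^[0-9a-fA-F:]+$/ && $addr =~ /:.*:/;
        return $addr;
    }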
"ceph version" retrieves the version from the cluster (i.e.,
from the queried monitor), but what is needed here is the
local ceph version, which is returned by "ceph --version".
Ideally we don't need this, but with the directory
storage this is a user-supplied field which gets returned
by the storage's path() method, which is used in various
external command calls.
By default a directory storage creates its path. In some
cases this can be undesired, mostly when storages have
nested paths (eg. a dir storage on a ZFS path or in an NFS
share, or inside custom mount points).
As a simple fix to this the 'mkdir' option (default ON)
can now be used to disable this behavior.
otherwise mapping those images will fail. disabling the
features only needs to be done once per image, so it makes
sense to do this when creating the images.
unfortunately, the command does not work in hammer, so
it needs a version check for jewel or higher.
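Roughly what happens after image creation now (the exact feature list and the
version helper are assumptions, not taken from this patch):

    # jewel (10.x) and newer understand 'rbd feature disable'; hammer does not,
    # hence the version check
    my ($major) = ceph_version() =~ /^(\d+)/;   # assumed helper, e.g. returns '10.2.7'
    if ($major >= 10) {
        run_command(['rbd', 'feature', 'disable', "$pool/$name",
                     'deep-flatten', 'fast-diff', 'object-map', 'exclusive-lock']);
    }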
extract_vzdump_config_tar is an adapted combination
of tar_archive_search_conf() and the first part of
recover_config(), both from PVE::LXC::Create.
a compressed vma backup file needs special error
handling because vma exits as soon as it has found the config
file, which the decompressors used treat as an error.
create_base() uses '-ky' to prevent base images from being
activated by default, similar to snapshots. This means we
need to activate them like snapshots with the '-K' option.
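Concretely, activation then goes through lvchange with the skip-flag override
(surrounding code illustrative):

    # base LVs are created with '-ky' (activation skip), so activate them like
    # snapshots, ignoring that flag via '-K' (--ignoreactivationskip)
    run_command(['/sbin/lvchange', '-ay', '-K', "$vg/$lvname"]);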
this patch adds an lvmthin scan to the api, so that we can get a list
of thinpools for a specific vg via an api call
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The parse_proc_mounts change made the glusterfs is_mounted
check fail (causing it to be shown as inactive on the GUI).
The NFS check was stricter (not allowing a trailing / in the
source anymore).
vdisk_alloc comes in with an umask of 0037, which means the
.subvol dir has permissions 0740, which means that the root
directory of containers has permissions 0740, essentially
preventing the users inside a container from accessing
anything.
I converted several zfs_request($class, ...) calls to $class->zfs_request(...)
calls in ZFSPoolPlugin.pm and removed a superfluous $class parameter in
ZFSPlugin.pm.
Fixes #816
Signed-off-by: Phillip Schichtel <phillip.public@schich.tel>
This makes no sense because it should always be exclusive.
Also, RBD checks this itself.
LVM has no possibility to use lvchange.
For DRBD this feature is not implemented.
Replace possibly-dangerous characters in uploaded filenames
with underscores, this includes spaces, colons, commas,
equal signs and any byte >= 128. Previously only spaces were
turned into underscores.
Also shell_quote the destination for scp.
Use '--' for some shell commands for safety.
Use brackets around the scp destination for ipv6 support.
Avoid world-readable disk files being created as suggested
in #416 by setting an umask to strip world permissions as
well as group write/exec permissions before calling
alloc_image.
rpcinfo from rpcbind-0.2.1 in debian doesn't support ipv6 addresses.
At the same time the used command only actually tests for
portmapper/rpcbind availability, not for NFS directly.
Storage::scan_nfs uses /sbin/showmount to get a list of NFS exports from a
server and happily accepts ipv6 addresses. It is also more specific to NFS.
Replacing the rpcinfo call with showmount here means checking explicitly
for NFS and supporting IPv6 without the need for an updated rpcbind
package.
NFS needs brackets around ipv6 addresses.
Also: nfs_is_mounted needs to quote the variables. This becomes apparent
when ipv6 addresses are used as then the address would otherwise be
treated as a character class, causing the check to always fail.
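Why the quoting matters (illustrative):

    # the mount source gets interpolated into a regex when scanning the mount
    # table; an unquoted ipv6 source like '[fd00::1]:/export' turns into a
    # character class and the check never matches
    my $source = "$server:$export";
    my $is_mounted = $mountdata =~ /^\Q$source\E\s/m;  # \Q...\E quotes the metacharacters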
While POSIX's gethostbyname(3) does support ipv6, perl's gethostbyname
usually returns wrong results for names, or no results for ipv6
addresses. Since we provide a getaddrinfo helper already, we now use
that instead.
It is better to check if a VM is running in QemuServer than in Storage.
For the Storage there is no difference whether it is running or not.
Signed-off-by: Wolfgang Link <w.link@proxmox.com>
we need to escape ":" used to defined mon ports
"10.5.0.11:6789; 10.5.0.12:6789; 10.5.0.13:6789"
->
"10.5.0.11\:6789; 10.5.0.12\:6789; 10.5.0.13\:6789"
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
add method volume_rollback_is_possible and refactor
Improve error handling
If a snapshot cannot be rolled back to, catch it before the vm is locked
and shut down. This is the case if zfs has a younger snapshot.
Signed-off-by: Wolfgang Link <w.link@proxmox.com>
Turned out it makes no sense to duplicate DirPlugin features. So I
also changed the name to make it less confusing. With this plugin we can
only create zvols inside a zfs pool.
Currently vmstate snapshots with rbd get the wrong name,
because rbd alloc_image doesn't care if $name is provided
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Always use a custom error sub to get the real errors out of the rbd command
instead of the typical:
2014-02-06 11:20:20.187190 7f3b6c37c760 -1 librbd: removing snapshot from header failed: (16) Device or resource busy
before:
rbd: snapshot 'abc' is protected from removal.
TASK ERROR: rbd snapshot vm-173-disk-1' error: 2014-02-06 11:06:02.438336 7f6f4ac92760 -1 librbd: removing snapshot from header failed: (16) Device or resource busy
now:
TASK ERROR: rbd: snapshot 'abc' is protected from removal.
Signed-off-by: Stefan Priebe <s.priebe@profihost.ag>
The current code would only accept zvols like POOL/vm-123-disk-1.
However, using POOL/DataSet/vm-123-disk-1 allows setting specific
properties at the POOL/DataSet level (like compression, etc.) which
would be inherited by any zvol created under such a DataSet.
This allows more flexibility in zfs/zvol management.
Signed-off-by: Pablo Ruiz García <pablo.ruiz@gmail.com>
- collie command is now 'dog'
- KB size is now k
- snapshot rollback needs the force -f flag, to avoid a confirmation prompt
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
a forum user reported slow qcow2 volume creation with preallocated metadata
http://forum.proxmox.com/threads/17471-GlusterFS-amp-Proxmox-Future-amp-QCOW2-Issues
(note that I can't reproduce it with qemu 1.7)
But the redhat bugzilla has an entry about a possible problem when the volume
is created through the mount point.
https://bugzilla.redhat.com/show_bug.cgi?id=895830
So it's better to manage it through the gluster block driver directly.
(We only need the mount point to create directories and list image files)
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
example of storage.cfg:

zfs: omnios
    blocksize 8k
    target iqn.2010-09.org.openindiana:target1
    pool pool1
    iscsiprovider comstar
    portal 192.168.0.1
    sudo 1 (optional)
    content images
note for fast ssh login:
on the solaris host, in /etc/ssh/sshd_config:
    LookupClientHostnames no
    VerifyReverseMapping no
    GSSAPIAuthentication no

note for nexenta:
    rm /root/.bash_profile
to avoid going into the nmc console by default
Signed-off-by: Michael Rasmussen <mir@datanom.net>
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
If a plugin overrides the path() method to return optimized settings for qemu,
it can now still use the generic methods from PVE::Storage::Plugin which work
on file system paths (for example the glusterfs plugin).
We also do this for LVM. Else I get:
> qm rescan --vmid 100
Use of uninitialized value $owner in string ne at /usr/share/perl5/PVE/Storage/NexentaPlugin.pm line 356.
So we can't bring up the iscsi storage
This patch is based on the patch submitted by Alexandre, but we only
suppress error messages when there are no active sessions. Other errors still
trigger an exception.
pool is now optional, default value is 'rbd';
username is now optional, default value is 'admin';
auth_supported option is removed and is autodetected.
auth = cephx if the private key exists
auth = none if the private key does not exist
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
- rename volume
- take snapshot '__base__'
- protect the snapshot
Fix: the volume_snapshot sub needs a $running parameter,
to know whether it needs to use the rbd command or a qmp command to take the snapshot.
for now, I pass undef, as it should always be offline.
(But we need to verify somewhere that the vm is not running,
because taking a snapshot with the rbd command on a running vm can break it.)
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
This is an implementation for file-based storage types.
changes compared to patches from Alexandre:
* use correct locking
* private find_free_diskname() with bug fixes
* changed names of new methods
* always refer to base volumes in volume names
Example volume names:
local:6000/base-6000-disk-9.raw
local:6000/base-6000-disk-9.raw/7000/vm-7000-disk-9.qcow2
local:6000/base-6000-disk-9.raw/7000/base-7000-disk-10.qcow2
remove all uncommented sleep calls (will add them later if required). Use the
new nexenta_request() syntax. Also removed strange eval{} sections which
hide errors.