Allow specifying a separate cluster network when initializing Ceph.
The Ceph docs[0] suggest a possible performance increase and enhanced
security in environments where the public network serves not fully
trusted peers, which could otherwise provoke a DoS against the
cluster traffic[0].
Make this optional, but if it is passed, `network` is required too.
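For illustration, the split roughly corresponds to ceph.conf entries
like the following (both subnets are placeholders):

    [global]
         public network = 192.168.0.0/24
         cluster network = 10.10.10.0/24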
[0]: http://docs.ceph.com/docs/luminous/rados/configuration/network-config-ref/
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Allow creating a new CephFS instance and allow listing existing ones.
As deletion requires coordination between the active MDS and all
standby MDS next in line, this needs a bit more work. One could mark
the MDS cluster down and stop the active MDS; that should work, but
as destroying is quite a sensitive operation and not often needed in
production, I deemed it better to only document this for now and
leave API endpoints for it to the future.
For index/list I slightly transform the result of a RADOS `fs ls`
monitor command; this allows relatively easy display of a CephFS and
its backing metadata and data pools in a GUI.
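A rough sketch of that transformation, assuming `fs ls` returns one
entry per CephFS with 'name', 'metadata_pool' and 'data_pools' fields
(the exact result structure shown here is only illustrative):

    my $fs_list = $rados->mon_command({ prefix => 'fs ls' });
    my $res = [];
    for my $fs (@$fs_list) {
        push @$res, {
            name => $fs->{name},                    # CephFS instance name
            metadata_pool => $fs->{metadata_pool},  # backing metadata pool
            data_pool => $fs->{data_pools}->[0],    # first backing data pool
        };
    }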
While for now it is not enabled by default and marked as
experimental, this API is designed to host multiple CephFS instances
- we may not need this at all, but I did not want to limit us early.
Anybody who wants to experiment can use it after setting the
respective ceph.conf options.
When encountering errors, try to roll back. As we verified at the
beginning that we did not reuse existing pools, destroy the ones we
created.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Co-authored-by: Alwin Antreich <a.antreich@proxmox.com>
Allow creating, listing and destroying a Ceph Metadata Server (MDS)
over the API and the CLI `pveceph` tool.
Besides setting up the local systemd service template and the MDS
data directory, we also add a reference to the MDS in ceph.conf.
We note the backing host (node) of the respective MDS and set
'mds standby for name' = 'pve' so that the PVE-created MDS form a
single group. If we decide to add integration for rank/path-specific
MDS (possibly useful for a CephFS under quite a bit of load), this
may serve as a starting point.
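The resulting ceph.conf section looks roughly like this (the MDS id
and host name are placeholders, and further keys may be added):

    [mds.somenode]
         host = somenode
         mds standby for name = pve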
On create, check early if a reference already exists in ceph.conf
and abort in that case. If we only notice existing data directories
later on, we abort but do not remove them; they could well be from an
older manual setup, where it is possibly dangerous to just remove
them. Let the user handle that case themselves.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Co-authored-by: Alwin Antreich <a.antreich@proxmox.com>
We will reuse this in the future, e.g., when creating a data and
metadata pool for CephFS.
Allow passing a $rados object (to reuse it, as initializing one is
not that cheap), but also create it if it is undefined, for
convenience.
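A minimal sketch of the intended calling convention (the helper and
parameter names are only illustrative):

    sub create_pool {
        my ($pool, $param, $rados) = @_;

        # reuse a passed-in connection; otherwise create one on demand,
        # as setting up a new RADOS connection is not that cheap
        $rados //= PVE::RADOS->new();

        # ... pool creation via $rados->mon_command(...) follows here
    }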
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Most of this was imported by just copying, without verifying whether
all of it is actually required. Some of it lost its purpose as we
reused more from our existing module code base (e.g., pve-common) but
was never actually removed.
As this file contains two Perl modules, take a bit of caution when
reviewing this: some things are used in one module but not the other,
so simply grep'ing may give false positives.
Also add the PVE::API2::Storage use which was missing here.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Most of this was imported by just copying, without verifying whether
all of it is actually required. Some of it lost its purpose as we
reused more from our existing module code base (e.g., pve-common) but
was never actually removed.
As this file contains two Perl modules, take a bit of caution when
reviewing this: some things are used in one module but not the other,
so simply grep'ing may give false positives.
Also include the missing IO::File use.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This patch removes the separate storage entries for CT & VM that
point to the same Ceph pool. Instead, only one entry is made, as we
can now actively map/unmap volumes in pve-container.
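The resulting storage.cfg then only needs a single entry per pool,
roughly like the following (storage id and pool name are
placeholders):

    rbd: cephpool
         pool cephpool
         content images,rootdir
         krbd 0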
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This allows the disk to be reused as a Ceph disk by zeroing the first
200M of the destroyed disk. Disks are iterated separately from
partitions to prevent duplicate wipes.
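A sketch of the wipe step described here, assuming it is done via
PVE::Tools::run_command (the device path variable is a placeholder):

    # zero the first 200M so leftover signatures do not prevent
    # re-using the device as a Ceph disk
    PVE::Tools::run_command([
        'dd', 'if=/dev/zero', "of=$devpath", 'bs=1M', 'count=200', 'conv=fdatasync',
    ]);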
Signed-off-by: David Limbeck <d.limbeck@proxmox.com>
Btrfs has been deprecated since Luminous and will no longer be
tested. If btrfs is used, you have to add an extra parameter to
ceph.conf to allow ceph-disk to activate btrfs OSDs. Our default
config does not set this.
From Luminous release note [1]:
"We no longer test the FileStore ceph-osd backend in combination with
btrfs. We recommend against using btrfs. If you are using
btrfs-based OSDs and want to upgrade to luminous you will need to
add the following to your ceph.conf:
enable experimental unrecoverable data corrupting features = btrfs
The code is mature and unlikely to change, but we are only
continuing to test the Jewel stable branch against btrfs. We
recommend moving these OSDs to FileStore with XFS or BlueStore."
[1] https://ceph.com/releases/v12-2-0-luminous-released/
If a CIDR gets passed to Net::IP, it is expected not to point into
the middle of a subnet, i.e., 192.168.1.12/24 is *not* OK but
192.168.1.0/24 would be OK.
As the network/interfaces file also accepts CIDR notation for the
'address' param (now also for IPv4), this led to problems in our
monitor IP detection code, which used the interfaces file and Net::IP
to find an address from the Ceph public network.
So switch to our newer helper PVE::Network::get_local_ip_from_cidr to
get all configured and ready (= up) IPs from this network.
Also handle the case where multiple addresses are returned: add a
parameter to allow specifying one of them and ask the user to do so.
If no public network is configured and no mon-address parameter was
passed, we fall back to the remote node IP of the node, as was done
previously. We expect that users only override mon-address if they
know what they are doing, and omit checks here.
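A condensed sketch of the resulting lookup logic (variable and
parameter names are illustrative):

    my $pubnet = $cfg->{global}->{'public network'};
    my $monaddr = $param->{'mon-address'};

    if (!$monaddr && $pubnet) {
        my $ips = PVE::Network::get_local_ip_from_cidr($pubnet);
        die "multiple IPs configured in public network '$pubnet', " .
            "please pass one explicitly via 'mon-address'\n" if scalar(@$ips) > 1;
        $monaddr = $ips->[0];
    }

    # fall back to the node's remote IP, as was done previously
    $monaddr //= PVE::Cluster::remote_node_ip($nodename);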
With this we also have to send '0' from the frontend when the
bluestore checkbox is not checked.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
While OSD units should only be runtime-enabled and disappear on
reboots, this serves as an additional safeguard to ensure no leftover
units can exist.
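A minimal sketch of such an explicit cleanup, assuming it runs when
an OSD gets destroyed (unit name pattern and helper usage are
assumptions):

    # make sure no enabled per-OSD unit is left behind
    PVE::Tools::run_command(['systemctl', 'disable', "ceph-osd\@$osdid.service"]);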
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
vdisk_list can potentially take very long, and we don't want
the API request to time out.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Introduce a new API parameter 'add_storages'. If set, one storage
each is configured using the created pool:
- for containers using KRBD
- for VMs using librbd
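The resulting entries in storage.cfg look roughly like this (storage
ids and pool name are placeholders):

    rbd: mypool_vm
         pool mypool
         content images
         krbd 0

    rbd: mypool_ct
         pool mypool
         content rootdir
         krbd 1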
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Add a version check to ceph init to require Luminous or higher, and
fix #1481: check the existence of the ceph binaries before use.
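A minimal sketch of the kind of existence check meant here (binary
path and error message are only illustrative):

    my $ceph_bin = '/usr/bin/ceph-mon';
    die "binary '$ceph_bin' not installed, install the ceph package first\n"
        if ! -x $ceph_bin;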
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
In the GUI this is already the default, so make it the default in the
backend as well (also, 2/1 is really bad as a default).
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This adds information about bluestore (which devices are used and
whether it is bluestore or filestore) to show in the GUI.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We get the names in the backend, return them as an additional field
in the API call, and use them in the grid.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This patch does a few things:
1. introduce a new API call /nodes/nodename/ceph/rules which gets us
   a list of CRUSH rules
2. introduce a new CephRuleSelector, a simple combobox fed with the
   data from the ceph/rules API call
3. use this in the create pool window
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Since Ceph 12.1.1 the (deprecated) parameter 'crush_ruleset' is
removed and replaced with 'crush_rule'. While changing this, switch
from integer to string so that we can later use the names of the
rules instead of the id (for now there seems to be a bug where you
can only use the name and not the id).
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We now have to remove 5 types of partitions:
data/metadata
journal
block
block.db
block.wal
This patch fixes the detection of block/block.db/block.wal and
generalizes it.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We reuse the 'journal_dev' parameter for bluestore's block.db and add
a new parameter 'wal_dev' for bluestore's write-ahead log. If only
journal_dev is given, use it for both the DB and the WAL.
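A minimal sketch of the described fallback (parameter handling is
simplified):

    my $db_dev  = $param->{journal_dev};         # reused for bluestore's block.db
    my $wal_dev = $param->{wal_dev} // $db_dev;  # fall back to the db device for the WAL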
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>