Commit Graph

46 Commits

Author SHA1 Message Date
Fabian Ebner
cffeb11592 api: ceph: create osd: set correct partition type
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-11 21:50:51 +01:00
Fabian Ebner
5161a0c2f0 partially fix #2285: api: ceph: create osd: allow using partitions
Note that this does not only allow partitions to be used, but for DB
and WAL disks, one more type of disk, that wasn't allowed before.
Namely, GPT-partitioned disks with any partitions detected as used.
The reason is get_disks' behavior:
  * Without $include_partitions=1, the disk will have the same usage
    as it's first used partition, and thus wasn't allowed. (Except in
    the case that usage was LVM, where the check was bypassed, but
    luckily OSD creation just failed later because no Ceph volume
    group would be detected).
  * With $include_partitions=1, the disk will have usage 'partitions'
    and thus be allowed.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-11 21:50:38 +01:00
Fabian Ebner
46b1ccc357 api: ceph: create osd: set correct parttype for DB/WAL
The get_ceph_journals function in pve-storage uses this information.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-11 21:50:33 +01:00
Dominik Csapak
d3eed3b4a8 api: ceph: fix getting ceph versions
Since commit: 8a3a300b ("ceph services: drop broadcasting legacy
version pmxcfs KV")

The 'ceph-version' kv is not broadcasted anymore, so we should not
query it, instead use get_ceph_versions

Also drop the other legacy keys for the versions

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-11-10 14:36:22 +01:00
Fabian Ebner
45d602f212 api: ceph: create osd: work around udev bug
There is a udev bug [0] which can ultimately lead to the udev database
for certain devices not being actively updated. The Diskmanage package
relies upon lsblk for certain info, and lsblk queries the udev
database. Ensure the information is updated by manually calling
'udevadm trigger' for the changed devices.

Without the fix, and a bit of bad luck, a cleaned up disk could still
show up as an 'LVM2_member' for example.

[0]: https://github.com/systemd/systemd/issues/18525

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-09-30 18:12:58 +02:00
Fabian Ebner
683a3563e7 api: check: create osd: use wipe_blockdev from the Diskmanage package
which is mostly a copy of the wipe_disks helper with the difference
that it also uses wipefs on the device and its partitions.

Remove the wipe_disks helper as no users remain.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-09-30 18:12:58 +02:00
Fabian Ebner
e256595683 api: ceph: create osd: re-check disk requirements after fork/lock
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-09-30 18:12:58 +02:00
Fabian Ebner
596bb7b11a api: ceph: osd: create: rename size parameters
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-06-09 11:29:34 +02:00
Thomas Lamprecht
d7a63207a3 ceph: osd_belongs_to_node: only check tree-entries of type host, refactor
We want to check explicitly for type host, so filter for that first
and create a hash map for easier usage afterwards.

Drop the error when there's no tree, as either RADOS error'd on bad
command already, or there really is no tree (but RADOS worked OK), in
which case we simply return that the OSD did not belong to this node.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-20 18:06:07 +02:00
Dominic Jäger
220173e9c6 Fix #2053: OSD destroy only on specified node
Allow destroying only OSDs that belong to the node that has been specified in
the API path.

So if
 - OSD 1 belongs to node A and
 - OSD 2 belongs to node B
then
 - pvesh delete nodes/A/ceph/osd/1 is allowed but
 - pvesh delete nodes/A/ceph/osd/2 is not

Destroying an OSD via GUI automatically inserts the correct node
into the API path.

pveceph automatically insert the local node into the API call, too.
Consequently, it can now only destroy local OSDs (fix #2053).
 - pveceph osd destroy 1 is allowed on node A but
 - pveceph osd destroy 2 is not

Signed-off-by: Dominic Jäger <d.jaeger@proxmox.com>
2021-04-20 16:42:12 +02:00
Stoiko Ivanov
c92fc8a1e8 api2: osd destroy: untaint device before pvremove
We get the device list from ceph-volume lvm list, and decode the json
output, which at that point is tainted (perlsec (1)).
Untaint it here before calling, because it is currently the only
call-site using the information in a problematic way (run_command).
(the only other call-site being in pve5to6)

Alternatively we could untaint while reading the information, but then
should only return a small subset of the ceph-volume output.

The issue is most likely due to
cb9db10c1a9855cf40ff13e81f9dd97d6a9b2698 in pve-common ('run_command:
improve performance for logging and long lines'),

Tested on a virtual testsetup by creating OSDs with second DB disk,
and destroying it via GUI (did not manage to get the error without the
DB disk)

Reported via our community forum:
https://forum.proxmox.com/threads/insecure-dependency-in-exec-during-osd-destroy.79574/

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2020-11-24 23:37:33 +01:00
Stoiko Ivanov
259b557cf4 api2: osd destroy: fix error function
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2020-11-24 23:37:33 +01:00
Alwin Antreich
2184098ed3 Allow setting device class on osd create
In some situations Ceph's auto-detection doesn't recognize the device
class correctly. The option allows to set it directly on osd create,
instead of altering it afterwards. This way the cluster doesn't need to
shift data back and forth unnecessarily.

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
2020-07-24 10:26:11 +02:00
Alwin Antreich
e25dda254c Make PVE6 compatible with supported ceph versions
Luminous, Nautilus and Octopus. In Octopus the mon_status was dropped.
Also the ceph status was cleaned up and doesn't provide the mgrmap and
monmap.

The rados queries used in the ceph status API endpoints (cluster / node)
were factored out and merged to one place.

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
2020-06-03 14:23:38 +02:00
Dominik Csapak
a0ef509a66 ceph: do not check ips if no network is configured
the network and the cluster network are optional in the ceph config
and with 'pveceph init', so only check if we have an ip address
from those networks if it is actually configured

otherwise, the createosd call dies with an 'ip' error message
even if it would work

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-03-04 15:38:09 +01:00
Thomas Lamprecht
a05349ab35 followup: add a bit of context to error message
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-12-16 15:38:50 +01:00
Aaron Lauterer
05bd76ac0e API: OSD: Fix #2496 Check OSD Network
It's possible to have a situation where the cluster network (used for
inter-OSD traffic) is not configured on a node. The OSD can still be
created but can't communicate.

This check will abort the creation if there is no IP within the subnet
of the cluster network present on the node. If there is no dedicated
cluster network the public network is used as a failsafe even though
this situation should not occur.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2019-12-16 15:12:18 +01:00
Dominik Csapak
385df8382d fix #2341: ceph: osd create: allow db/wal on partioned disks
It was intended that for partitioned disks, we create one and use it.
Instead the code died always when the disk was used and not of type 'LVM'

We now check correctly the 2 cases:
* used for partitions and has gpt
* used and lvm

The remaining api call handles those two cases correctly

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-08-22 14:09:20 +02:00
Thomas Lamprecht
cead98bd69 api/osd: opinionated code cleanup of list
among others: reduce use of sub-hash as index for another hash

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-07-22 16:25:07 +02:00
Dominik Csapak
69ad2e539e ceph: osd list: add hostversions to the host nodes
we want to improve the version hints in the osd tree gui and need
the version at the host nodes

we could (and want to) workaround it in the gui to have that
info for both versions of the api call

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-07-22 15:52:07 +02:00
Thomas Lamprecht
67d8218fbd fix #2292: ceph osd create: use size parameter for db/wal
commit 970f96fdbb did not account for
getting the correct size parameter from the api call, so we ignored
it always resulting in uses not be able to set an explicit db/wal
size

Originally-fixed-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-07-19 11:05:49 +02:00
Thomas Lamprecht
9cc5ac9e75 api/ceph: code cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-07-11 14:16:11 +02:00
Dominik Csapak
b7701301a8 api/ceph: add osd scrub api call
can be called to (deep) scrub a specific osd

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-07-11 14:16:06 +02:00
Dominik Csapak
217dde83f0 ceph: osd: use get-or-create to create a bootstrap-osd key on demand
if for some reason the cluster does not have this key, generate it

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-07-04 09:57:50 +02:00
Dominik Csapak
7712a4e151 ceph: osd create: check for auth before getting bootstrap key
we do not need it if auth is 'none'

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-07-04 09:57:50 +02:00
Dominik Csapak
4ce045788a ceph: osd create: add encrypted as parameter
uses cpeh-volumes --dmcrypt parameter to encrypt the osd

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-11 12:58:24 +02:00
Thomas Lamprecht
970f96fdbb api osd create: followup code cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-06-06 13:43:32 +02:00
Dominik Csapak
45d45a63cd ceph: make ceph osd create api more readable
The aim of this patch is to reorder/rework the code of the api call
so that it gets more readable

it adds comments of what/why something is done, removes
code duplication between db/wal checks/creation

There are two changes in behaviour:
* when a device is given more than once via the api,
  the user gets a parameter exception for the db or wal
  with the information that the explicit defined devices must be
  different

* we check the usage for db/wal before the worker, so that the user
  gets instant feedback if a device is already in use
  (this is more for api users than for gui users, since we do those
  checks there also)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-06 12:41:14 +02:00
Dominik Csapak
3d7b3992dd ceph: osd create: add missing gpt check
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-05 18:23:49 +02:00
Dominik Csapak
ab62d137e1 ceph: osd create: round size down to the next kib
since the size of an LV can only be a multiple of 512b, we round
down to the next kib

we then have to mulitply it by 1024 for the partition, since
append_partition expects bytes and not kib

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-05 18:23:49 +02:00
Thomas Lamprecht
b32e925587 api: osd destroy: try to remove PVs directly on the fly
no point in first building a list if we can just remove it directly
afterwards, it's eval-ed anyway and $osd_list did not get touched
in-between.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-06-05 10:50:54 +02:00
Thomas Lamprecht
5ebb945c3c api: osd destroy: pull out cleanup param
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-06-05 10:49:53 +02:00
Dominik Csapak
9b44d03dad ceph: osd: rework osd destroy to work with ceph-volume
with this, osd destruction is left to ceph-volume if the osd was created
with ceph-volume, else our old code remains mostly the same since
we want to be able to destroy upgraded osds

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-05 10:43:12 +02:00
Thomas Lamprecht
0154e79558 followup: api: osd create: code cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-06-05 10:01:45 +02:00
Thomas Lamprecht
0e5f83badc api: osd create: use verbose_description and document defaults directly
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-06-05 10:01:01 +02:00
Thomas Lamprecht
afa09e02c7 fixup: ceph osd create: also put real UUID when adding a lv
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-06-05 09:46:38 +02:00
Dominik Csapak
7783f755e3 ceph: osd: rework creation with ceph-volume
this completely rewrites the ceph os creation api call using ceph-volume
since ceph-disk is not available anymore

breaking changes:
no filestore anymore, journal_dev -> db_dev

it is now possible to give a specific size for db/wal, default
is to read from ceph db/config and fallback is
10% of osd for block.db and 1% of osd for block.wal

the reason is that ceph-volume does not autocreate those itself
(like ceph-disk) but you have to create it yourself

if the db/wal device has an lvm on it with naming scheme 'ceph-UUID'
it uses that and creates a new lv

if we detect partitions, we create a new partition at the end

if the disk is not used at all, we create a pv/vg/lv for it

it is not possible to create osds on luminous with this api call anymore,
anyone needing this has to use ceph-disk directly

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-05 08:55:28 +02:00
Dominik Csapak
e02970235d gui/ceph: show versions in osd overview
and highlight the not current osds

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-05-31 15:45:48 +02:00
Thomas Lamprecht
78c2d7f781 api: ceph/osd: conciser metadata array to hash mapping
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-04-24 10:24:21 +00:00
Thomas Lamprecht
de6ad72f23 followup: refactor & code cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-04-24 10:22:58 +00:00
Dominik Csapak
91564b7267 adapt osd api call for ceph nautilus
ceph nautilus changed the structure of 'pg dump osds'
they moved the data one level below

parse both new and old format, and bail if it returns anything else

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-04-24 10:06:28 +00:00
Thomas Lamprecht
017bb1a8bd minor typo fix and code cleanups
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-02-08 15:26:44 +01:00
Alwin Antreich
3c6aa3f47e api osd/destroy: use ProcFSTools to iterate mounts
Instead of opening proc/mounts through IO::File directly for parsing,
the patch uses ProcFSTools. This way it also takes care of eventual
decoding.

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
2019-02-08 15:26:00 +01:00
Thomas Lamprecht
e45cc727e6 follouwp: code cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-02-08 15:26:00 +01:00
Alwin Antreich
b436dca874 Fix #2051: preserve DB/WAL disk on destroy
When destroying an OSD over API or CLI, e.g. by executing:

'pveceph osd destroy <num> --cleanup'

all disks associated with the OSD got wiped with dd, which included
any shared and by others still in use ones, e.g., separate disks with
DB/WAL.

The patch changes 'wipe_disks' to wipe the partition instead of the
whole disk.

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-02-08 14:20:23 +01:00
Dominik Csapak
79fa41a2b8 ceph: move API2::CephOSD to API2::Ceph::OSD
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2018-12-20 09:44:01 +01:00