Update pveceph

* Combine sections from the wiki
* add section for avoiding RAID controllers
* correct command line for bluestore DB device creation
* minor rewording

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Alwin Antreich 2018-06-26 17:17:09 +02:00 committed by Thomas Lamprecht
parent 6a8897ca46
commit a474ca1f74


endif::manvolnum[]

[thumbnail="gui-ceph-status.png"]
{pve} unifies your compute and storage systems, i.e. you can use the same
physical nodes within a cluster for both computing (processing VMs and
containers) and replicated storage. The traditional silos of compute and
storage resources can be wrapped up into a single hyper-converged appliance.
Separate storage networks (SANs) and connections via network attached storages
(NAS) disappear. With the integration of Ceph, an open source software-defined
storage platform, {pve} has the ability to run and manage Ceph storage directly
on the hypervisor nodes.
Ceph is a distributed object store and file system designed to provide
excellent performance, reliability and scalability.
.Some of the advantages of Ceph are:
- Easy setup and management with CLI and GUI support on Proxmox VE
- Thin provisioning
- Snapshots support
- Self healing
- No single point of failure
- Scalable to the exabyte level
- Setup pools with different performance and redundancy characteristics
- Data is replicated, making it fault tolerant
- Runs on economical commodity hardware
- No need for hardware RAID controllers
- Easy management
- Open source
For small to mid sized deployments, it is possible to install a Ceph server for
RADOS Block Devices (RBD) directly on your {pve} cluster nodes, see
xref:ceph_rados_block_devices[Ceph RADOS Block Devices (RBD)]. Recent
hardware has plenty of CPU power and RAM, so running storage services
and VMs on the same node is possible.

To simplify management, we provide 'pveceph' - a tool to install and
manage {ceph} services on {pve} nodes.

.Ceph consists of a couple of Daemons footnote:[Ceph intro http://docs.ceph.com/docs/master/start/intro/], for use as an RBD storage:
- Ceph Monitor (ceph-mon)
- Ceph Manager (ceph-mgr)
- Ceph OSD (ceph-osd; Object Storage Daemon)
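
On a {pve} node each of these daemons runs as a systemd unit, so their state
can also be checked individually. A small example, assuming the monitor
instance is named after the node (the default when it is created via
'pveceph'):

[source,bash]
----
# each daemon instance has its own unit; monitors are named after the node
systemctl status ceph-mon@$(hostname).service
----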

Precondition
------------

To build a Proxmox Ceph Cluster there should be at least three (preferably
identical) servers for the setup.
A 10Gb network, exclusively used for Ceph, is recommended. A meshed network
setup is also an option if there are no 10Gb switches available, see our wiki
article footnote:[Full Mesh Network for Ceph {webwiki-url}Full_Mesh_Network_for_Ceph_Server].
Check also the recommendations from
http://docs.ceph.com/docs/luminous/start/hardware-recommendations/[Ceph's website].
.Avoid RAID
RAID controllers are built for storage virtualisation, combining independent
disks to form one or more logical units. Their caching methods, algorithms
(RAID modes, incl. JBOD) and read/write optimisations are targeted at those
logical units and not at Ceph.

WARNING: Avoid RAID controllers; use a host bus adapter (HBA) instead.
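
To get an idea of what sits between the disks and the host, you can list the
storage controllers; this is only a quick heuristic and the output depends on
the hardware:

[source,bash]
----
# entries like "RAID bus controller" point to a RAID card, while
# "Serial Attached SCSI controller" is typically a plain HBA
lspci | grep -Ei 'raid|sas|sata'
----
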
Installation of Ceph Packages
-----------------------------
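
On each node the Ceph packages can then be installed through 'pveceph', for
example:

[source,bash]
----
pveceph install
----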

After installation of packages, you need to create an initial Ceph
configuration on just one node, based on your network (`10.10.10.0/24`
in the following example) dedicated for Ceph:

[source,bash]
----
pveceph init --network 10.10.10.0/24
----

This creates an initial configuration at `/etc/pve/ceph.conf`. That file is
automatically distributed to all {pve} nodes by using
xref:chapter_pmxcfs[pmxcfs]. The command also creates a symbolic link
from `/etc/ceph/ceph.conf` pointing to that file. So you can simply run Ceph
commands without the need to specify a configuration file.
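
For example, querying the cluster status then works from any node, without
further arguments:

[source,bash]
----
ceph -s    # picks up /etc/ceph/ceph.conf automatically
----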

Creating Ceph Monitors
----------------------

The Ceph Monitor (MON)
footnote:[Ceph Monitor http://docs.ceph.com/docs/luminous/start/intro/]
maintains a master copy of the cluster map. For high availability you need to
have at least 3 monitors.
On each node where you want to place a monitor (three monitors are recommended),
create it by using the 'Ceph -> Monitor' tab in the GUI or run:
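
[source,bash]
----
pveceph createmon
----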

This will also install the needed Ceph Manager ('ceph-mgr') by default. If you
do not want to install a manager, specify the '-exclude-manager' option.

Creating Ceph Manager
----------------------
The Manager daemon runs alongside the monitors, providing an interface for
monitoring the cluster. Since the Ceph luminous release the
ceph-mgr footnote:[Ceph Manager http://docs.ceph.com/docs/luminous/mgr/] daemon
is required. During monitor installation the ceph manager will be installed as
well.
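
If you skipped the manager during monitor creation, or want an additional
standby manager on another node, it can be created manually with the matching
'pveceph' subcommand:

[source,bash]
----
pveceph createmgr
----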

Creating Ceph OSDs
------------------

via GUI or via CLI as follows:

[source,bash]
----
pveceph createosd /dev/sd[X]
----

TIP: We recommend a Ceph cluster size starting with 12 OSDs, distributed evenly
among your (at least three) nodes, i.e. 4 OSDs on each node.

If the disk was used before (e.g. for ZFS, RAID or another OSD), remove the
partition table, boot sector and any OSD leftovers first. The following
commands should be sufficient:

[source,bash]
----
dd if=/dev/zero of=/dev/sd[X] bs=1M count=200
ceph-disk zap /dev/sd[X]
----
WARNING: The above commands will destroy data on the disk!
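
Before running these destructive commands, it is worth double-checking that
`/dev/sd[X]` really is the intended disk, for example by inspecting its size,
model and partitions:

[source,bash]
----
# print name, size, type and model of the disk and its partitions
lsblk -o NAME,SIZE,TYPE,MODEL /dev/sd[X]
----
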
Ceph Bluestore
~~~~~~~~~~~~~~
Starting with the Ceph Kraken release, a new Ceph OSD storage type was
introduced, the so called Bluestore
footnote:[Ceph Bluestore http://ceph.com/community/new-luminous-bluestore/].
This is the default when creating OSDs in Ceph luminous.

[source,bash]
----
pveceph createosd /dev/sd[X]
----

NOTE: In order to select a disk in the GUI, to be more failsafe, the disk needs
to have a GPT footnoteref:[GPT, GPT partition table
https://en.wikipedia.org/wiki/GUID_Partition_Table] partition table. You can
create this with `gdisk /dev/sd(x)`. If there is no GPT, you cannot select the
disk as DB/WAL.
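
If you prefer a non-interactive way, `sgdisk` (shipped alongside `gdisk`) can
write a fresh GPT in one step; like the commands above, this destroys the
existing partition table:

[source,bash]
----
# clear any existing partition table and create a new, empty GPT
sgdisk --clear /dev/sd[X]
----
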
If you want to use a separate DB/WAL device for your OSDs, you can specify it
through the '-journal_dev' option. The WAL is placed with the DB, if not
specified separately.
[source,bash]
----
pveceph createosd /dev/sd[X] -journal_dev /dev/sd[Y]
----

NOTE: The DB stores BlueStore's internal metadata and the WAL is BlueStore's
internal journal or write-ahead log.

NOTE: The default number of PGs works for 2-6 disks. Ceph throws a
"HEALTH_WARNING" if you have too few or too many PGs in your cluster.
It is advised to calculate the PG number depending on your setup; you can find
the formula and the PG calculator footnote:[PG calculator
http://ceph.com/pgcalc/] online. While PGs can be increased later on, they can
never be decreased.
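
The rule of thumb behind the calculator is roughly '(number of OSDs * 100) /
pool size (replica count)', rounded up to the next power of two. For example,
with 12 OSDs and a replica count of 3:

[source,bash]
----
echo $(( 12 * 100 / 3 ))   # 400 -> rounded up to the next power of two: 512
----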

You can create pools through the command line or on the GUI on each PVE host
under 'Ceph -> Pools'.