import ceph reef 18.2.4

Author: Thomas Lamprecht <t.lamprecht@proxmox.com>
Date:   2024-07-25 18:23:05 +02:00
Commit: f38dd50b34 (parent: e9fe820e7f)

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

1399 changed files with 133003 additions and 58787 deletions

CMakeLists.txt

@@ -1,7 +1,7 @@
 cmake_minimum_required(VERSION 3.16)
 project(ceph
-  VERSION 18.2.2
+  VERSION 18.2.4
   LANGUAGES CXX C ASM)
 cmake_policy(SET CMP0028 NEW)
@@ -247,6 +247,15 @@ set(HAVE_LIBURING ${WITH_LIBURING})
 CMAKE_DEPENDENT_OPTION(WITH_SYSTEM_LIBURING "Require and build with system liburing" OFF
   "HAVE_LIBAIO;WITH_BLUESTORE" OFF)
+if(WITH_LIBURING)
+  if(WITH_SYSTEM_LIBURING)
+    find_package(uring REQUIRED)
+  else()
+    include(Builduring)
+    build_uring()
+  endif()
+endif()
 CMAKE_DEPENDENT_OPTION(WITH_BLUESTORE_PMEM "Enable PMDK libraries" OFF
   "WITH_BLUESTORE" OFF)
 if(WITH_BLUESTORE_PMEM)
@@ -679,7 +688,7 @@ if(WITH_SYSTEM_NPM)
   message(FATAL_ERROR "Can't find npm.")
 endif()
 endif()
-set(DASHBOARD_FRONTEND_LANGS "" CACHE STRING
+set(DASHBOARD_FRONTEND_LANGS "ALL" CACHE STRING
   "List of comma separated ceph-dashboard frontend languages to build. \
   Use value `ALL` to build all languages")
 CMAKE_DEPENDENT_OPTION(WITH_MGR_ROOK_CLIENT "Enable the mgr's Rook support" ON
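Both CMake changes above are configure-time knobs; a minimal sketch of how they might be exercised (the build directory and the language codes are illustrative, not part of the commit):

    # prefer a system liburing instead of the newly wired-up bundled build
    cmake -DWITH_LIBURING=ON -DWITH_SYSTEM_LIBURING=ON ..
    # override the new "ALL" default and build only selected dashboard translations
    cmake -DDASHBOARD_FRONTEND_LANGS="de,fr" ..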

PendingReleaseNotes

@@ -1,3 +1,17 @@
+>=18.2.2
+--------
+
+* RBD: When diffing against the beginning of time (`fromsnapname == NULL`) in
+  fast-diff mode (`whole_object == true` with `fast-diff` image feature enabled
+  and valid), diff-iterate is now guaranteed to execute locally if exclusive
+  lock is available. This brings a dramatic performance improvement for QEMU
+  live disk synchronization and backup use cases.
+* RADOS: `get_pool_is_selfmanaged_snaps_mode` C++ API has been deprecated
+  due to being prone to false negative results. Its safer replacement is
+  `pool_is_in_selfmanaged_snaps_mode`.
+* RBD: The option ``--image-id`` has been added to the `rbd children` CLI command,
+  so it can be run for images in the trash.
+
 >=19.0.0
 * RGW: S3 multipart uploads using Server-Side Encryption now replicate correctly in
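A rough CLI illustration of the two RBD notes above; the pool, image, and id values are placeholders:

    # whole-object diff against the beginning of time (no --from-snap), the case
    # that now runs locally when the exclusive lock is available
    rbd diff --whole-object rbd/vm-100-disk-0
    # find a trashed parent's id, then list its clones by id
    rbd trash ls rbd
    rbd children --pool rbd --image-id 86f6e2c3a415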
@@ -47,6 +61,52 @@
   affected and to clean them up accordingly.
 * mgr/snap-schedule: For clusters with multiple CephFS file systems, all the
   snap-schedule commands now expect the '--fs' argument.
+* RGW: Fixed an S3 Object Lock bug with PutObjectRetention requests that specify
+  a RetainUntilDate after the year 2106. This date was truncated to 32 bits when
+  stored, so a much earlier date was used for object lock enforcement. This does
+  not affect PutBucketObjectLockConfiguration where a duration is given in Days.
+  The RetainUntilDate encoding is fixed for new PutObjectRetention requests, but
+  cannot repair the dates of existing object locks. Such objects can be identified
+  with a HeadObject request based on the x-amz-object-lock-retain-until-date
+  response header.
+* RADOS: `get_pool_is_selfmanaged_snaps_mode` C++ API has been deprecated
+  due to being prone to false negative results. Its safer replacement is
+  `pool_is_in_selfmanaged_snaps_mode`.
+* RADOS: For bug 62338 (https://tracker.ceph.com/issues/62338), we did not choose
+  to condition the fix on a server flag in order to simplify backporting. As
+  a result, in rare cases it may be possible for a PG to flip between two acting
+  sets while an upgrade to a version with the fix is in progress. If you observe
+  this behavior, you should be able to work around it by completing the upgrade or
+  by disabling async recovery by setting osd_async_recovery_min_cost to a very
+  large value on all OSDs until the upgrade is complete:
+  ``ceph config set osd osd_async_recovery_min_cost 1099511627776``
+* RADOS: A detailed version of the `balancer status` CLI command in the balancer
+  module is now available. Users may run `ceph balancer status detail` to see more
+  details about which PGs were updated in the balancer's last optimization.
+  See https://docs.ceph.com/en/latest/rados/operations/balancer/ for more information.
+* CephFS: For clusters with multiple CephFS file systems, all the snap-schedule
+  commands now expect the '--fs' argument.
+* CephFS: The period specifier ``m`` now implies minutes and the period specifier
+  ``M`` now implies months. This has been made consistent with the rest
+  of the system.
+* CephFS: Full support for subvolumes and subvolume groups is now available
+  for the snap_schedule Manager module.
+* CephFS: The `subvolume snapshot clone` command now depends on the config option
+  `snapshot_clone_no_wait` which is used to reject the clone operation when
+  all the cloner threads are busy. This config option is enabled by default, which means
+  that if no cloner threads are free, the clone request errors out with EAGAIN.
+  The value of the config option can be fetched by using:
+  `ceph config get mgr mgr/volumes/snapshot_clone_no_wait`
+  and it can be disabled by using:
+  `ceph config set mgr mgr/volumes/snapshot_clone_no_wait false`
+* CephFS: Fixes to the implementation of the ``root_squash`` mechanism enabled
+  via cephx ``mds`` caps on a client credential require a new client feature
+  bit, ``client_mds_auth_caps``. Clients using credentials with ``root_squash``
+  without this feature will trigger the MDS to raise a HEALTH_ERR on the
+  cluster, MDS_CLIENTS_BROKEN_ROOTSQUASH. See the documentation on this warning
+  and the new feature bit for more information.
+
 >=18.0.0
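The commands referenced in the notes above, collected as a sketch (the file system name, path, and schedule are placeholders):

    # more detail on which PGs the balancer touched in its last optimization
    ceph balancer status detail
    # snap-schedule commands now need --fs when more than one file system exists
    ceph fs snap-schedule add /volumes/grp/subvol 1h --fs cephfs_a
    # temporary workaround for the acting-set flapping described for bug 62338
    ceph config set osd osd_async_recovery_min_cost 1099511627776
    # inspect or relax the new clone-rejection behaviour
    ceph config get mgr mgr/volumes/snapshot_clone_no_wait
    ceph config set mgr mgr/volumes/snapshot_clone_no_wait false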
@@ -54,6 +114,10 @@
   mirroring policies between RGW and AWS, you may wish to set
   "rgw policy reject invalid principals" to "false". This affects only newly set
   policies, not policies that are already in place.
+* The CephFS automatic metadata load (sometimes called "default") balancer is
+  now disabled by default. The new file system flag `balance_automate`
+  can be used to toggle it on or off. It can be enabled or disabled via
+  `ceph fs set <fs_name> balance_automate <bool>`.
 * RGW's default backend for `rgw_enable_ops_log` changed from RADOS to file.
   The default value of `rgw_ops_log_rados` is now false, and `rgw_ops_log_file_path`
   defaults to "/var/log/ceph/ops-log-$cluster-$name.log".
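For example, the new flag and the ops-log backend can be flipped back as follows; the file system name is a placeholder, and `global` is used here only because it is the simplest config target that RGW daemons will pick up:

    # re-enable the automatic metadata balancer on one file system
    ceph fs set cephfs balance_automate true
    # keep logging RGW ops to RADOS instead of the new file backend
    ceph config set global rgw_ops_log_rados true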
@@ -226,6 +290,11 @@
   than the number mentioned against the config tunable `mds_max_snaps_per_dir`
   so that a new snapshot can be created and retained during the next schedule
   run.
+* `ceph config dump --format <json|xml>` output will display the localized
+  option names instead of their normalized versions. For example,
+  "mgr/prometheus/x/server_port" will be displayed instead of
+  "mgr/prometheus/server_port". This matches the output of the non pretty-print
+  formatted version of the command.
 >=17.2.1
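A quick way to see the change described above is to compare the two output modes:

    # the table output and the JSON/XML output now agree on localized names
    # such as mgr/prometheus/x/server_port
    ceph config dump
    ceph config dump --format json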
@@ -291,3 +360,14 @@ Relevant tracker: https://tracker.ceph.com/issues/55715
   request from client(s). This can be useful during some recovery situations
   where it's desirable to bring MDS up but have no client workload.
   Relevant tracker: https://tracker.ceph.com/issues/57090
+* New MDSMap field `max_xattr_size` which can be set using the `fs set` command.
+  This MDSMap field allows configuring the maximum size allowed for the full
+  key/value set of a file system's extended attributes. It effectively replaces
+  the old per-MDS `max_xattr_pairs_size` setting, which is now dropped.
+  Relevant tracker: https://tracker.ceph.com/issues/55725
+* Introduced a new file system flag `refuse_standby_for_another_fs` that can be
+  set using the `fs set` command. This flag prevents using a standby for another
+  file system (join_fs = X) when a standby for the current file system is not available.
+  Relevant tracker: https://tracker.ceph.com/issues/61599
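Both flags are applied with `fs set`; the values below are only illustrative:

    ceph fs set cephfs max_xattr_size 65536
    ceph fs set cephfs refuse_standby_for_another_fs true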

admin/doc-requirements.txt

@@ -1,4 +1,4 @@
-Sphinx == 4.5.0
+Sphinx == 5.0.2
 git+https://github.com/ceph/sphinx-ditaa.git@py3#egg=sphinx-ditaa
 git+https://github.com/vlasovskikh/funcparserlib.git
 breathe >= 4.20.0,!=4.33


@@ -1,6 +1,6 @@
 ceph-menv
-Environment assistant for use in conjuction with multiple ceph vstart (or more accurately mstart) clusters. Eliminates the need to specify the cluster that is being used with each and every command. Can provide a shell prompt feedback about the currently used cluster.
+Environment assistant for use in conjunction with multiple Ceph vstart (or more accurately mstart) clusters. Eliminates the need to specify the cluster that is being used with each and every command. Can provide a shell prompt feedback about the currently used cluster.
 Usage:

ceph.spec

@@ -35,8 +35,8 @@
 %else
 %bcond_with rbd_rwl_cache
 %endif
-%if 0%{?fedora} || 0%{?rhel}
-%if 0%{?rhel} < 9
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
+%if 0%{?rhel} < 9 || 0%{?openEuler}
 %bcond_with system_pmdk
 %else
 %ifarch s390x aarch64
@@ -93,7 +93,7 @@
 %endif
 %endif
 %bcond_with seastar
-%if 0%{?suse_version}
+%if 0%{?suse_version} || 0%{?openEuler}
 %bcond_with jaeger
 %else
 %bcond_without jaeger
@@ -112,7 +112,7 @@
 # this is tracked in https://bugzilla.redhat.com/2152265
 %bcond_with system_arrow
 %endif
-%if 0%{?fedora} || 0%{?suse_version} || 0%{?rhel} >= 8
+%if 0%{?fedora} || 0%{?suse_version} || 0%{?rhel} >= 8 || 0%{?openEuler}
 %global weak_deps 1
 %endif
 %if %{with selinux}
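The new guards all key off the `openEuler` distro macro; a quick way to check how they will evaluate on a given build host (assumes rpm is installed):

    # prints "openEuler" only where the %openEuler macro is defined, i.e. where the
    # spec now disables system_pmdk and jaeger and enables weak_deps
    rpm --eval '%{?openEuler:openEuler}%{!?openEuler:other distro}'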
@@ -170,7 +170,7 @@
 # main package definition
 #################################################################################
 Name: ceph
-Version: 18.2.2
+Version: 18.2.4
 Release: 0%{?dist}
 %if 0%{?fedora} || 0%{?rhel}
 Epoch: 2
@@ -186,7 +186,7 @@ License: LGPL-2.1 and LGPL-3.0 and CC-BY-SA-3.0 and GPL-2.0 and BSL-1.0 and BSD-
 Group: System/Filesystems
 %endif
 URL: http://ceph.com/
-Source0: %{?_remote_tarball_prefix}ceph-18.2.2.tar.bz2
+Source0: %{?_remote_tarball_prefix}ceph-18.2.4.tar.bz2
 %if 0%{?suse_version}
 # _insert_obs_source_lines_here
 ExclusiveArch: x86_64 aarch64 ppc64le s390x
@@ -211,7 +211,7 @@ BuildRequires: selinux-policy-devel
 BuildRequires: gperf
 BuildRequires: cmake > 3.5
 BuildRequires: fuse-devel
-%if 0%{?fedora} || 0%{?suse_version} > 1500 || 0%{?rhel} == 9
+%if 0%{?fedora} || 0%{?suse_version} > 1500 || 0%{?rhel} == 9 || 0%{?openEuler}
 BuildRequires: gcc-c++ >= 11
 %endif
 %if 0%{?suse_version} == 1500
@@ -222,12 +222,12 @@ BuildRequires: %{gts_prefix}-gcc-c++
 BuildRequires: %{gts_prefix}-build
 BuildRequires: %{gts_prefix}-libatomic-devel
 %endif
-%if 0%{?fedora} || 0%{?rhel} == 9
+%if 0%{?fedora} || 0%{?rhel} == 9 || 0%{?openEuler}
 BuildRequires: libatomic
 %endif
 %if 0%{with tcmalloc}
 # libprofiler did not build on ppc64le until 2.7.90
-%if 0%{?fedora} || 0%{?rhel} >= 8
+%if 0%{?fedora} || 0%{?rhel} >= 8 || 0%{?openEuler}
 BuildRequires: gperftools-devel >= 2.7.90
 %endif
 %if 0%{?rhel} && 0%{?rhel} < 8
@@ -379,7 +379,7 @@ BuildRequires: liblz4-devel >= 1.7
 BuildRequires: golang-github-prometheus-prometheus
 BuildRequires: jsonnet
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 Requires: systemd
 BuildRequires: boost-random
 BuildRequires: nss-devel
@@ -401,7 +401,7 @@ BuildRequires: lz4-devel >= 1.7
 # distro-conditional make check dependencies
 %if 0%{with make_check}
 BuildRequires: golang
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 BuildRequires: golang-github-prometheus
 BuildRequires: libtool-ltdl-devel
 BuildRequires: xmlsec1
@@ -412,7 +412,6 @@ BuildRequires: xmlsec1-nss
 BuildRequires: xmlsec1-openssl
 BuildRequires: xmlsec1-openssl-devel
 BuildRequires: python%{python3_pkgversion}-cherrypy
-BuildRequires: python%{python3_pkgversion}-jwt
 BuildRequires: python%{python3_pkgversion}-routes
 BuildRequires: python%{python3_pkgversion}-scipy
 BuildRequires: python%{python3_pkgversion}-werkzeug
@@ -425,7 +424,6 @@ BuildRequires: libxmlsec1-1
 BuildRequires: libxmlsec1-nss1
 BuildRequires: libxmlsec1-openssl1
 BuildRequires: python%{python3_pkgversion}-CherryPy
-BuildRequires: python%{python3_pkgversion}-PyJWT
 BuildRequires: python%{python3_pkgversion}-Routes
 BuildRequires: python%{python3_pkgversion}-Werkzeug
 BuildRequires: python%{python3_pkgversion}-numpy-devel
@@ -435,7 +433,7 @@ BuildRequires: xmlsec1-openssl-devel
 %endif
 # lttng and babeltrace for rbd-replay-prep
 %if %{with lttng}
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 BuildRequires: lttng-ust-devel
 BuildRequires: libbabeltrace-devel
 %endif
@@ -447,15 +445,18 @@ BuildRequires: babeltrace-devel
 %if 0%{?suse_version}
 BuildRequires: libexpat-devel
 %endif
-%if 0%{?rhel} || 0%{?fedora}
+%if 0%{?rhel} || 0%{?fedora} || 0%{?openEuler}
 BuildRequires: expat-devel
 %endif
 #hardened-cc1
 %if 0%{?fedora} || 0%{?rhel}
 BuildRequires: redhat-rpm-config
 %endif
+%if 0%{?openEuler}
+BuildRequires: openEuler-rpm-config
+%endif
 %if 0%{with seastar}
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 BuildRequires: cryptopp-devel
 BuildRequires: numactl-devel
 %endif
@@ -543,7 +544,7 @@ Requires: python%{python3_pkgversion}-cephfs = %{_epoch_prefix}%{version}-%{rele
 Requires: python%{python3_pkgversion}-rgw = %{_epoch_prefix}%{version}-%{release}
 Requires: python%{python3_pkgversion}-ceph-argparse = %{_epoch_prefix}%{version}-%{release}
 Requires: python%{python3_pkgversion}-ceph-common = %{_epoch_prefix}%{version}-%{release}
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 Requires: python%{python3_pkgversion}-prettytable
 %endif
 %if 0%{?suse_version}
@@ -615,9 +616,8 @@ Requires: ceph-mgr = %{_epoch_prefix}%{version}-%{release}
 Requires: ceph-grafana-dashboards = %{_epoch_prefix}%{version}-%{release}
 Requires: ceph-prometheus-alerts = %{_epoch_prefix}%{version}-%{release}
 Requires: python%{python3_pkgversion}-setuptools
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 Requires: python%{python3_pkgversion}-cherrypy
-Requires: python%{python3_pkgversion}-jwt
 Requires: python%{python3_pkgversion}-routes
 Requires: python%{python3_pkgversion}-werkzeug
 %if 0%{?weak_deps}
@@ -626,7 +626,6 @@ Recommends: python%{python3_pkgversion}-saml
 %endif
 %if 0%{?suse_version}
 Requires: python%{python3_pkgversion}-CherryPy
-Requires: python%{python3_pkgversion}-PyJWT
 Requires: python%{python3_pkgversion}-Routes
 Requires: python%{python3_pkgversion}-Werkzeug
 Recommends: python%{python3_pkgversion}-python3-saml
@@ -645,7 +644,7 @@ Group: System/Filesystems
 %endif
 Requires: ceph-mgr = %{_epoch_prefix}%{version}-%{release}
 Requires: python%{python3_pkgversion}-numpy
-%if 0%{?fedora} || 0%{?suse_version}
+%if 0%{?fedora} || 0%{?suse_version} || 0%{?openEuler}
 Requires: python%{python3_pkgversion}-scikit-learn
 %endif
 Requires: python3-scipy
@@ -665,7 +664,7 @@ Requires: python%{python3_pkgversion}-pyOpenSSL
 Requires: python%{python3_pkgversion}-requests
 Requires: python%{python3_pkgversion}-dateutil
 Requires: python%{python3_pkgversion}-setuptools
-%if 0%{?fedora} || 0%{?rhel} >= 8
+%if 0%{?fedora} || 0%{?rhel} >= 8 || 0%{?openEuler}
 Requires: python%{python3_pkgversion}-cherrypy
 Requires: python%{python3_pkgversion}-pyyaml
 Requires: python%{python3_pkgversion}-werkzeug
@@ -722,7 +721,7 @@ Requires: openssh
 Requires: python%{python3_pkgversion}-CherryPy
 Requires: python%{python3_pkgversion}-Jinja2
 %endif
-%if 0%{?rhel} || 0%{?fedora}
+%if 0%{?rhel} || 0%{?fedora} || 0%{?openEuler}
 Requires: openssh-clients
 Requires: python%{python3_pkgversion}-cherrypy
 Requires: python%{python3_pkgversion}-jinja2
@@ -814,7 +813,7 @@ Requires: ceph-selinux = %{_epoch_prefix}%{version}-%{release}
 %endif
 Requires: librados2 = %{_epoch_prefix}%{version}-%{release}
 Requires: librgw2 = %{_epoch_prefix}%{version}-%{release}
-%if 0%{?rhel} || 0%{?fedora}
+%if 0%{?rhel} || 0%{?fedora} || 0%{?openEuler}
 Requires: mailcap
 %endif
 %if 0%{?weak_deps}
@@ -894,6 +893,7 @@ Requires: parted
 Requires: util-linux
 Requires: xfsprogs
 Requires: python%{python3_pkgversion}-setuptools
+Requires: python%{python3_pkgversion}-packaging
 Requires: python%{python3_pkgversion}-ceph-common = %{_epoch_prefix}%{version}-%{release}
 %description volume
 This package contains a tool to deploy OSD with different devices like
@@ -905,7 +905,7 @@ Summary: RADOS distributed object store client library
 %if 0%{?suse_version}
 Group: System/Libraries
 %endif
-%if 0%{?rhel} || 0%{?fedora}
+%if 0%{?rhel} || 0%{?fedora} || 0%{?openEuler}
 Obsoletes: ceph-libs < %{_epoch_prefix}%{version}-%{release}
 %endif
 %description -n librados2
@@ -1052,7 +1052,7 @@ Requires: librados2 = %{_epoch_prefix}%{version}-%{release}
 %if 0%{?suse_version}
 Requires(post): coreutils
 %endif
-%if 0%{?rhel} || 0%{?fedora}
+%if 0%{?rhel} || 0%{?fedora} || 0%{?openEuler}
 Obsoletes: ceph-libs < %{_epoch_prefix}%{version}-%{release}
 %endif
 %description -n librbd1
@@ -1096,7 +1096,7 @@ Summary: Ceph distributed file system client library
 Group: System/Libraries
 %endif
 Obsoletes: libcephfs1 < %{_epoch_prefix}%{version}-%{release}
-%if 0%{?rhel} || 0%{?fedora}
+%if 0%{?rhel} || 0%{?fedora} || 0%{?openEuler}
 Obsoletes: ceph-libs < %{_epoch_prefix}%{version}-%{release}
 Obsoletes: ceph-libcephfs
 %endif
@@ -1149,7 +1149,7 @@ descriptions, and submitting the command to the appropriate daemon.
 %package -n python%{python3_pkgversion}-ceph-common
 Summary: Python 3 utility libraries for Ceph
-%if 0%{?fedora} || 0%{?rhel} >= 8
+%if 0%{?fedora} || 0%{?rhel} >= 8 || 0%{?openEuler}
 Requires: python%{python3_pkgversion}-pyyaml
 %endif
 %if 0%{?suse_version}
@@ -1288,11 +1288,20 @@ Group: System/Monitoring
 %description mib
 This package provides a Ceph MIB for SNMP traps.
+
+%package node-proxy
+Summary: hw monitoring agent for Ceph
+BuildArch: noarch
+%if 0%{?suse_version}
+Group: System/Monitoring
+%endif
+%description node-proxy
+This package provides a Ceph hardware monitoring agent.
 #################################################################################
 # common
 #################################################################################
 %prep
-%autosetup -p1 -n ceph-18.2.2
+%autosetup -p1 -n ceph-18.2.4
 %build
 # Disable lto on systems that do not support symver attribute
@@ -1467,7 +1476,7 @@ install -m 0755 %{buildroot}%{_bindir}/crimson-osd %{buildroot}%{_bindir}/ceph-o
 %endif
 install -m 0644 -D src/etc-rbdmap %{buildroot}%{_sysconfdir}/ceph/rbdmap
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 install -m 0644 -D etc/sysconfig/ceph %{buildroot}%{_sysconfdir}/sysconfig/ceph
 %endif
 %if 0%{?suse_version}
@@ -1501,7 +1510,7 @@ install -m 0644 -D udev/50-rbd.rules %{buildroot}%{_udevrulesdir}/50-rbd.rules
 # sudoers.d
 install -m 0440 -D sudoers.d/ceph-smartctl %{buildroot}%{_sysconfdir}/sudoers.d/ceph-smartctl
-%if 0%{?rhel} >= 8
+%if 0%{?rhel} >= 8 || 0%{?openEuler}
 pathfix.py -pni "%{__python3} %{py3_shbang_opts}" %{buildroot}%{_bindir}/*
 pathfix.py -pni "%{__python3} %{py3_shbang_opts}" %{buildroot}%{_sbindir}/*
 %endif
@@ -1538,7 +1547,7 @@ install -m 644 -D -t %{buildroot}%{_datadir}/snmp/mibs monitoring/snmp/CEPH-MIB.
 %fdupes %{buildroot}%{_prefix}
 %endif
-%if 0%{?rhel} == 8
+%if 0%{?rhel} == 8 || 0%{?openEuler}
 %py_byte_compile %{__python3} %{buildroot}%{python3_sitelib}
 %endif
@@ -1581,7 +1590,7 @@ rm -rf %{_vpath_builddir}
 %{_libdir}/libosd_tp.so*
 %endif
 %config(noreplace) %{_sysconfdir}/logrotate.d/ceph
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %config(noreplace) %{_sysconfdir}/sysconfig/ceph
 %endif
 %if 0%{?suse_version}
@@ -1614,7 +1623,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset ceph.target ceph-crash.service >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post ceph.target ceph-crash.service
 %endif
 if [ $1 -eq 1 ] ; then
@@ -1625,7 +1634,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun ceph.target ceph-crash.service
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun ceph.target ceph-crash.service
 %endif
@@ -1722,7 +1731,7 @@ exit 0
 %pre common
 CEPH_GROUP_ID=167
 CEPH_USER_ID=167
-%if 0%{?rhel} || 0%{?fedora}
+%if 0%{?rhel} || 0%{?fedora} || 0%{?openEuler}
 /usr/sbin/groupadd ceph -g $CEPH_GROUP_ID -o -r 2>/dev/null || :
 /usr/sbin/useradd ceph -u $CEPH_USER_ID -o -r -g ceph -s /sbin/nologin -c "Ceph daemons" -d %{_localstatedir}/lib/ceph 2>/dev/null || :
 %endif
@@ -1768,7 +1777,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset ceph-mds@\*.service ceph-mds.target >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post ceph-mds@\*.service ceph-mds.target
 %endif
 if [ $1 -eq 1 ] ; then
@@ -1779,7 +1788,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun ceph-mds@\*.service ceph-mds.target
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun ceph-mds@\*.service ceph-mds.target
 %endif
@@ -1813,7 +1822,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset ceph-mgr@\*.service ceph-mgr.target >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post ceph-mgr@\*.service ceph-mgr.target
 %endif
 if [ $1 -eq 1 ] ; then
@@ -1824,7 +1833,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun ceph-mgr@\*.service ceph-mgr.target
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun ceph-mgr@\*.service ceph-mgr.target
 %endif
@@ -1953,7 +1962,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset ceph-mon@\*.service ceph-mon.target >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post ceph-mon@\*.service ceph-mon.target
 %endif
 if [ $1 -eq 1 ] ; then
@@ -1964,7 +1973,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun ceph-mon@\*.service ceph-mon.target
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun ceph-mon@\*.service ceph-mon.target
 %endif
@@ -2002,7 +2011,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset cephfs-mirror@\*.service cephfs-mirror.target >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post cephfs-mirror@\*.service cephfs-mirror.target
 %endif
 if [ $1 -eq 1 ] ; then
@@ -2013,7 +2022,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun cephfs-mirror@\*.service cephfs-mirror.target
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun cephfs-mirror@\*.service cephfs-mirror.target
 %endif
@@ -2033,6 +2042,7 @@ fi
 %files -n ceph-exporter
 %{_bindir}/ceph-exporter
+%{_unitdir}/ceph-exporter.service
 %files -n rbd-fuse
 %{_bindir}/rbd-fuse
@@ -2050,7 +2060,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset ceph-rbd-mirror@\*.service ceph-rbd-mirror.target >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post ceph-rbd-mirror@\*.service ceph-rbd-mirror.target
 %endif
 if [ $1 -eq 1 ] ; then
@@ -2061,7 +2071,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun ceph-rbd-mirror@\*.service ceph-rbd-mirror.target
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun ceph-rbd-mirror@\*.service ceph-rbd-mirror.target
 %endif
@@ -2091,7 +2101,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset ceph-immutable-object-cache@\*.service ceph-immutable-object-cache.target >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post ceph-immutable-object-cache@\*.service ceph-immutable-object-cache.target
 %endif
 if [ $1 -eq 1 ] ; then
@@ -2102,7 +2112,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun ceph-immutable-object-cache@\*.service ceph-immutable-object-cache.target
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun ceph-immutable-object-cache@\*.service ceph-immutable-object-cache.target
 %endif
@@ -2145,7 +2155,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset ceph-radosgw@\*.service ceph-radosgw.target >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post ceph-radosgw@\*.service ceph-radosgw.target
 %endif
 if [ $1 -eq 1 ] ; then
@@ -2156,7 +2166,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun ceph-radosgw@\*.service ceph-radosgw.target
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun ceph-radosgw@\*.service ceph-radosgw.target
 %endif
@@ -2196,7 +2206,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset ceph-osd@\*.service ceph-osd.target >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post ceph-osd@\*.service ceph-osd.target
 %endif
 if [ $1 -eq 1 ] ; then
@@ -2212,7 +2222,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun ceph-osd@\*.service ceph-osd.target
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun ceph-osd@\*.service ceph-osd.target
 %endif
@@ -2251,7 +2261,7 @@ if [ $1 -eq 1 ] ; then
 /usr/bin/systemctl preset ceph-volume@\*.service >/dev/null 2>&1 || :
 fi
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_post ceph-volume@\*.service
 %endif
@@ -2259,7 +2269,7 @@ fi
 %if 0%{?suse_version}
 %service_del_preun ceph-volume@\*.service
 %endif
-%if 0%{?fedora} || 0%{?rhel}
+%if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
 %systemd_preun ceph-volume@\*.service
 %endif
@@ -2620,4 +2630,10 @@ exit 0
 %attr(0755,root,root) %dir %{_datadir}/snmp
 %{_datadir}/snmp/mibs
+
+%files node-proxy
+%{_sbindir}/ceph-node-proxy
+%dir %{python3_sitelib}/ceph_node_proxy
+%{python3_sitelib}/ceph_node_proxy/*
+%{python3_sitelib}/ceph_node_proxy-*
 %changelog

ceph.spec.in

(The capture continues with the diff for ceph.spec.in, which repeats the same hunks shown above for ceph.spec: the openEuler conditionals, the removal of the python3-jwt/PyJWT dependencies, the python3-packaging and ceph-exporter.service additions, and the new node-proxy subpackage. The hard-coded version, tarball, and %autosetup bumps are absent, as those only exist in the generated ceph.spec. The capture breaks off partway through this second file.)
%endif %endif
@ -2196,7 +2206,7 @@ if [ $1 -eq 1 ] ; then
/usr/bin/systemctl preset ceph-osd@\*.service ceph-osd.target >/dev/null 2>&1 || : /usr/bin/systemctl preset ceph-osd@\*.service ceph-osd.target >/dev/null 2>&1 || :
fi fi
%endif %endif
%if 0%{?fedora} || 0%{?rhel} %if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
%systemd_post ceph-osd@\*.service ceph-osd.target %systemd_post ceph-osd@\*.service ceph-osd.target
%endif %endif
if [ $1 -eq 1 ] ; then if [ $1 -eq 1 ] ; then
@ -2212,7 +2222,7 @@ fi
%if 0%{?suse_version} %if 0%{?suse_version}
%service_del_preun ceph-osd@\*.service ceph-osd.target %service_del_preun ceph-osd@\*.service ceph-osd.target
%endif %endif
%if 0%{?fedora} || 0%{?rhel} %if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
%systemd_preun ceph-osd@\*.service ceph-osd.target %systemd_preun ceph-osd@\*.service ceph-osd.target
%endif %endif
@ -2251,7 +2261,7 @@ if [ $1 -eq 1 ] ; then
/usr/bin/systemctl preset ceph-volume@\*.service >/dev/null 2>&1 || : /usr/bin/systemctl preset ceph-volume@\*.service >/dev/null 2>&1 || :
fi fi
%endif %endif
%if 0%{?fedora} || 0%{?rhel} %if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
%systemd_post ceph-volume@\*.service %systemd_post ceph-volume@\*.service
%endif %endif
@ -2259,7 +2269,7 @@ fi
%if 0%{?suse_version} %if 0%{?suse_version}
%service_del_preun ceph-volume@\*.service %service_del_preun ceph-volume@\*.service
%endif %endif
%if 0%{?fedora} || 0%{?rhel} %if 0%{?fedora} || 0%{?rhel} || 0%{?openEuler}
%systemd_preun ceph-volume@\*.service %systemd_preun ceph-volume@\*.service
%endif %endif
@ -2620,4 +2630,10 @@ exit 0
%attr(0755,root,root) %dir %{_datadir}/snmp %attr(0755,root,root) %dir %{_datadir}/snmp
%{_datadir}/snmp/mibs %{_datadir}/snmp/mibs
%files node-proxy
%{_sbindir}/ceph-node-proxy
%dir %{python3_sitelib}/ceph_node_proxy
%{python3_sitelib}/ceph_node_proxy/*
%{python3_sitelib}/ceph_node_proxy-*
%changelog %changelog

View File

@ -1,7 +1,13 @@
ceph (18.2.2-1jammy) jammy; urgency=medium ceph (18.2.4-1jammy) jammy; urgency=medium
-- Jenkins Build Slave User <jenkins-build@braggi10.front.sepia.ceph.com> Mon, 04 Mar 2024 20:27:31 +0000 -- Jenkins Build Slave User <jenkins-build@braggi02.front.sepia.ceph.com> Fri, 12 Jul 2024 15:42:34 +0000
ceph (18.2.4-1) stable; urgency=medium
* New upstream release
-- Ceph Release Team <ceph-maintainers@ceph.io> Fri, 12 Jul 2024 09:57:18 -0400
ceph (18.2.2-1) stable; urgency=medium ceph (18.2.2-1) stable; urgency=medium

View File

@ -86,6 +86,9 @@ function(build_arrow)
else() else()
list(APPEND arrow_CMAKE_ARGS -DCMAKE_BUILD_TYPE=Release) list(APPEND arrow_CMAKE_ARGS -DCMAKE_BUILD_TYPE=Release)
endif() endif()
# don't add -Werror or debug package builds fail with:
#warning _FORTIFY_SOURCE requires compiling with optimization (-O)
list(APPEND arrow_CMAKE_ARGS -DBUILD_WARNING_LEVEL=PRODUCTION)
# we use an external project and copy the sources to bin directory to ensure # we use an external project and copy the sources to bin directory to ensure
# that object files are built outside of the source tree. # that object files are built outside of the source tree.

View File

@ -11,6 +11,13 @@ function(build_rocksdb)
-DCMAKE_TOOLCHAIN_FILE=${CMAKE_TOOLCHAIN_FILE}) -DCMAKE_TOOLCHAIN_FILE=${CMAKE_TOOLCHAIN_FILE})
endif() endif()
list(APPEND rocksdb_CMAKE_ARGS -DWITH_LIBURING=${WITH_LIBURING})
if(WITH_LIBURING)
list(APPEND rocksdb_CMAKE_ARGS -During_INCLUDE_DIR=${URING_INCLUDE_DIR})
list(APPEND rocksdb_CMAKE_ARGS -During_LIBRARIES=${URING_LIBRARY_DIR})
list(APPEND rocksdb_INTERFACE_LINK_LIBRARIES uring::uring)
endif()
if(ALLOCATOR STREQUAL "jemalloc") if(ALLOCATOR STREQUAL "jemalloc")
list(APPEND rocksdb_CMAKE_ARGS -DWITH_JEMALLOC=ON) list(APPEND rocksdb_CMAKE_ARGS -DWITH_JEMALLOC=ON)
list(APPEND rocksdb_INTERFACE_LINK_LIBRARIES JeMalloc::JeMalloc) list(APPEND rocksdb_INTERFACE_LINK_LIBRARIES JeMalloc::JeMalloc)
@ -52,12 +59,13 @@ function(build_rocksdb)
endif() endif()
include(CheckCXXCompilerFlag) include(CheckCXXCompilerFlag)
check_cxx_compiler_flag("-Wno-deprecated-copy" HAS_WARNING_DEPRECATED_COPY) check_cxx_compiler_flag("-Wno-deprecated-copy" HAS_WARNING_DEPRECATED_COPY)
set(rocksdb_CXX_FLAGS "${CMAKE_CXX_FLAGS}")
if(HAS_WARNING_DEPRECATED_COPY) if(HAS_WARNING_DEPRECATED_COPY)
set(rocksdb_CXX_FLAGS -Wno-deprecated-copy) string(APPEND rocksdb_CXX_FLAGS " -Wno-deprecated-copy")
endif() endif()
check_cxx_compiler_flag("-Wno-pessimizing-move" HAS_WARNING_PESSIMIZING_MOVE) check_cxx_compiler_flag("-Wno-pessimizing-move" HAS_WARNING_PESSIMIZING_MOVE)
if(HAS_WARNING_PESSIMIZING_MOVE) if(HAS_WARNING_PESSIMIZING_MOVE)
set(rocksdb_CXX_FLAGS "${rocksdb_CXX_FLAGS} -Wno-pessimizing-move") string(APPEND rocksdb_CXX_FLAGS " -Wno-pessimizing-move")
endif() endif()
if(rocksdb_CXX_FLAGS) if(rocksdb_CXX_FLAGS)
list(APPEND rocksdb_CMAKE_ARGS -DCMAKE_CXX_FLAGS='${rocksdb_CXX_FLAGS}') list(APPEND rocksdb_CMAKE_ARGS -DCMAKE_CXX_FLAGS='${rocksdb_CXX_FLAGS}')
@ -84,6 +92,9 @@ function(build_rocksdb)
INSTALL_COMMAND "" INSTALL_COMMAND ""
LIST_SEPARATOR !) LIST_SEPARATOR !)
# make sure all the link libraries are built first
add_dependencies(rocksdb_ext ${rocksdb_INTERFACE_LINK_LIBRARIES})
add_library(RocksDB::RocksDB STATIC IMPORTED) add_library(RocksDB::RocksDB STATIC IMPORTED)
add_dependencies(RocksDB::RocksDB rocksdb_ext) add_dependencies(RocksDB::RocksDB rocksdb_ext)
set(rocksdb_INCLUDE_DIR "${rocksdb_SOURCE_DIR}/include") set(rocksdb_INCLUDE_DIR "${rocksdb_SOURCE_DIR}/include")

View File

@ -32,6 +32,8 @@ function(build_uring)
ExternalProject_Get_Property(liburing_ext source_dir) ExternalProject_Get_Property(liburing_ext source_dir)
set(URING_INCLUDE_DIR "${source_dir}/src/include") set(URING_INCLUDE_DIR "${source_dir}/src/include")
set(URING_LIBRARY_DIR "${source_dir}/src") set(URING_LIBRARY_DIR "${source_dir}/src")
set(URING_INCLUDE_DIR ${URING_INCLUDE_DIR} PARENT_SCOPE)
set(URING_LIBRARY_DIR ${URING_LIBRARY_DIR} PARENT_SCOPE)
add_library(uring::uring STATIC IMPORTED GLOBAL) add_library(uring::uring STATIC IMPORTED GLOBAL)
add_dependencies(uring::uring liburing_ext) add_dependencies(uring::uring liburing_ext)

View File

@ -0,0 +1,2 @@
lib/systemd/system/ceph-exporter*
usr/bin/ceph-exporter

View File

@ -1,3 +1,4 @@
bcrypt
pyOpenSSL pyOpenSSL
cephfs cephfs
ceph-argparse ceph-argparse

View File

@ -91,7 +91,6 @@ Build-Depends: automake,
python3-all-dev, python3-all-dev,
python3-cherrypy3, python3-cherrypy3,
python3-natsort, python3-natsort,
python3-jwt <pkg.ceph.check>,
python3-pecan <pkg.ceph.check>, python3-pecan <pkg.ceph.check>,
python3-bcrypt <pkg.ceph.check>, python3-bcrypt <pkg.ceph.check>,
tox <pkg.ceph.check>, tox <pkg.ceph.check>,
@ -353,6 +352,30 @@ Description: debugging symbols for ceph-mgr
. .
This package contains the debugging symbols for ceph-mgr. This package contains the debugging symbols for ceph-mgr.
Package: ceph-exporter
Architecture: linux-any
Depends: ceph-base (= ${binary:Version}),
Description: metrics exporter for the ceph distributed storage system
Ceph is a massively scalable, open-source, distributed
storage system that runs on commodity hardware and delivers object,
block and file system storage.
.
This package contains the metrics exporter daemon, which is used to expose
the performance metrics.
Package: ceph-exporter-dbg
Architecture: linux-any
Section: debug
Priority: extra
Depends: ceph-exporter (= ${binary:Version}),
${misc:Depends},
Description: debugging symbols for ceph-exporter
Ceph is a massively scalable, open-source, distributed
storage system that runs on commodity hardware and delivers object,
block and file system storage.
.
This package contains the debugging symbols for ceph-exporter.
Package: ceph-mon Package: ceph-mon
Architecture: linux-any Architecture: linux-any
Depends: ceph-base (= ${binary:Version}), Depends: ceph-base (= ${binary:Version}),

View File

@ -105,6 +105,7 @@ override_dh_strip:
dh_strip -pceph-mds --dbg-package=ceph-mds-dbg dh_strip -pceph-mds --dbg-package=ceph-mds-dbg
dh_strip -pceph-fuse --dbg-package=ceph-fuse-dbg dh_strip -pceph-fuse --dbg-package=ceph-fuse-dbg
dh_strip -pceph-mgr --dbg-package=ceph-mgr-dbg dh_strip -pceph-mgr --dbg-package=ceph-mgr-dbg
dh_strip -pceph-exporter --dbg-package=ceph-exporter-dbg
dh_strip -pceph-mon --dbg-package=ceph-mon-dbg dh_strip -pceph-mon --dbg-package=ceph-mon-dbg
dh_strip -pceph-osd --dbg-package=ceph-osd-dbg dh_strip -pceph-osd --dbg-package=ceph-osd-dbg
dh_strip -pceph-base --dbg-package=ceph-base-dbg dh_strip -pceph-base --dbg-package=ceph-base-dbg

ceph/doc/_static/js/pgcalc.js vendored Normal file
View File

@ -0,0 +1,357 @@
var _____WB$wombat$assign$function_____ = function(name) {return (self._wb_wombat && self._wb_wombat.local_init && self._wb_wombat.local_init(name)) || self[name]; };
if (!self.__WB_pmw) { self.__WB_pmw = function(obj) { this.__WB_source = obj; return this; } }
{
let window = _____WB$wombat$assign$function_____("window");
let self = _____WB$wombat$assign$function_____("self");
let document = _____WB$wombat$assign$function_____("document");
let location = _____WB$wombat$assign$function_____("location");
let top = _____WB$wombat$assign$function_____("top");
let parent = _____WB$wombat$assign$function_____("parent");
let frames = _____WB$wombat$assign$function_____("frames");
let opener = _____WB$wombat$assign$function_____("opener");
var pow2belowThreshold = 0.25
var key_values={};
key_values['poolName'] ={'name':'Pool Name','default':'newPool','description': 'Name of the pool in question. Typical pool names are included below.', 'width':'30%; text-align: left'};
key_values['size'] ={'name':'Size','default': 3, 'description': 'Number of replicas the pool will have. Default value of 3 is pre-filled.', 'width':'10%', 'global':1};
key_values['osdNum'] ={'name':'OSD #','default': 100, 'description': 'Number of OSDs which this Pool will have PGs in. Typically, this is the entire Cluster OSD count, but could be less based on CRUSH rules. (e.g. Separate SSD and SATA disk sets)', 'width':'10%', 'global':1};
key_values['percData'] ={'name':'%Data', 'default': 5, 'description': 'This value represents the approximate percentage of data which will be contained in this pool for that specific OSD set. Examples are pre-filled below for guidance.','width':'10%'};
key_values['targPGsPerOSD'] ={'name':'Target PGs per OSD', 'default':100, 'description': 'This value should be populated based on the following guidance:', 'width':'10%', 'global':1, 'options': [ ['100','If the cluster OSD count is not expected to increase in the foreseeable future.'], ['200', 'If the cluster OSD count is expected to increase (up to double the size) in the foreseeable future.']]}
var notes ={
'totalPerc':'<b>"Total Data Percentage"</b> below table should be a multiple of 100%.',
'totalPGs':'<b>"Total PG Count"</b> below table will be the count of Primary PG copies. However, when calculating total PGs per OSD average, you must include all copies.',
'noDecrease':'It\'s also important to know that the PG count can be increased, but <b>NEVER</b> decreased without destroying / recreating the pool. However, increasing the PG Count of a pool is one of the most impactful events in a Ceph Cluster, and should be avoided for production clusters if possible.',
};
var presetTables={};
presetTables['All-in-One']=[
{ 'poolName' : 'rbd', 'size' : '3', 'osdNum' : '100', 'percData' : '100', 'targPGsPerOSD' : '100'},
];
presetTables['OpenStack']=[
{ 'poolName' : 'cinder-backup', 'size' : '3', 'osdNum' : '100', 'percData' : '25', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'cinder-volumes', 'size' : '3', 'osdNum' : '100', 'percData' : '53', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'ephemeral-vms', 'size' : '3', 'osdNum' : '100', 'percData' : '15', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'glance-images', 'size' : '3', 'osdNum' : '100', 'percData' : '7', 'targPGsPerOSD' : '100'},
];
presetTables['OpenStack w RGW - Jewel and later']=[
{ 'poolName' : '.rgw.root', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.control', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.data.root', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.gc', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.log', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.intent-log', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.meta', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.usage', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.users.keys', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.users.email', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.users.swift', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.users.uid', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.buckets.extra', 'size' : '3', 'osdNum' : '100', 'percData' : '1.0', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.buckets.index', 'size' : '3', 'osdNum' : '100', 'percData' : '3.0', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.buckets.data', 'size' : '3', 'osdNum' : '100', 'percData' : '19', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'cinder-backup', 'size' : '3', 'osdNum' : '100', 'percData' : '18', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'cinder-volumes', 'size' : '3', 'osdNum' : '100', 'percData' : '42.8', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'ephemeral-vms', 'size' : '3', 'osdNum' : '100', 'percData' : '10', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'glance-images', 'size' : '3', 'osdNum' : '100', 'percData' : '5', 'targPGsPerOSD' : '100'},
];
presetTables['Rados Gateway Only - Jewel and later']=[
{ 'poolName' : '.rgw.root', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.control', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.data.root', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.gc', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.log', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.intent-log', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.meta', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.usage', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.users.keys', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.users.email', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.users.swift', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.users.uid', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.buckets.extra', 'size' : '3', 'osdNum' : '100', 'percData' : '1.0', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.buckets.index', 'size' : '3', 'osdNum' : '100', 'percData' : '3.0', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'default.rgw.buckets.data', 'size' : '3', 'osdNum' : '100', 'percData' : '94.8', 'targPGsPerOSD' : '100'},
];
presetTables['OpenStack w RGW - Infernalis and earlier']=[
{ 'poolName' : '.intent-log', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.log', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.buckets', 'size' : '3', 'osdNum' : '100', 'percData' : '18', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.buckets.extra', 'size' : '3', 'osdNum' : '100', 'percData' : '1.0', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.buckets.index', 'size' : '3', 'osdNum' : '100', 'percData' : '3.0', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.control', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.gc', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.root', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.usage', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.users', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.users.email', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.users.swift', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.users.uid', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'cinder-backup', 'size' : '3', 'osdNum' : '100', 'percData' : '19', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'cinder-volumes', 'size' : '3', 'osdNum' : '100', 'percData' : '42.9', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'ephemeral-vms', 'size' : '3', 'osdNum' : '100', 'percData' : '10', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'glance-images', 'size' : '3', 'osdNum' : '100', 'percData' : '5', 'targPGsPerOSD' : '100'},
];
presetTables['Rados Gateway Only - Infernalis and earlier']=[
{ 'poolName' : '.intent-log', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.log', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.buckets', 'size' : '3', 'osdNum' : '100', 'percData' : '94.9', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.buckets.extra', 'size' : '3', 'osdNum' : '100', 'percData' : '1.0', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.buckets.index', 'size' : '3', 'osdNum' : '100', 'percData' : '3.0', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.control', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.gc', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.rgw.root', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.usage', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.users', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.users.email', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.users.swift', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
{ 'poolName' : '.users.uid', 'size' : '3', 'osdNum' : '100', 'percData' : '0.1', 'targPGsPerOSD' : '100'},
];
presetTables['RBD and libRados']=[
{ 'poolName' : 'rbd', 'size' : '3', 'osdNum' : '100', 'percData' : '75', 'targPGsPerOSD' : '100'},
{ 'poolName' : 'myObjects', 'size' : '3', 'osdNum' : '100', 'percData' : '25', 'targPGsPerOSD' : '100'},
];
$(function() {
$("#presetType").on("change",changePreset);
$("#btnAddPool").on("click",addPool);
$("#btnGenCommands").on("click",generateCommands);
$.each(presetTables,function(index,value) {
selIndex='';
if ( index == 'OpenStack w RGW - Jewel and later' )
selIndex=' selected';
$("#presetType").append("<option value=\""+index+"\""+selIndex+">"+index+"</option>");
});
changePreset();
$("#beforeTable").html("<fieldset id='keyFieldset'><legend>Key</legend><dl class='table-display' id='keyDL'></dl></fieldset>");
$.each(key_values, function(index, value) {
pre='';
post='';
if ('global' in value) {
pre='<a href="javascript://" onClick="globalChange(\''+index+'\');" title="Change the \''+value['name']+'\' parameter globally">';
post='</a>'
}
var dlAdd="<dt id='dt_"+index+"'>"+pre+value['name']+post+"</dt><dd id='dd_"+index+"'>"+value['description'];
if ( 'options' in value ) {
dlAdd+="<dl class='sub-table'>";
$.each(value['options'], function (subIndex, subValue) {
dlAdd+="<dt><a href=\"javascript://\" onClick=\"massUpdate('"+index+"','"+subValue[0]+"');\" title=\"Set all '"+value['name']+"' fields to '"+subValue[0]+"'.\">"+subValue[0]+"</a></dt><dd>"+subValue[1]+"</dd>";
});
dlAdd+="</dl>";
}
dlAdd+="</dd>";
$("#keyDL").append(dlAdd);
});
$("#afterTable").html("<fieldset id='notesFieldset'><legend>Notes</legend><ul id='notesUL'>\n<ul></fieldset>");
$.each(notes,function(index, value) {
$("#notesUL").append("\t<li id=\"li_"+index+"\">"+value+"</li>\n");
});
});
function changePreset() {
resetTable();
fillTable($("#presetType").val());
}
function resetTable() {
$("#pgsperpool").html("");
$("#pgsperpool").append("<tr id='headerRow'>\n</tr>\n");
$("#headerRow").append("\t<th>&nbsp;</th>\n");
var fieldCount=0;
var percDataIndex=0;
$.each(key_values, function(index, value) {
fieldCount++;
pre='';
post='';
var widthAdd='';
if ( index == 'percData' )
percDataIndex=fieldCount;
if ('width' in value)
widthAdd=' style=\'width: '+value['width']+'\'';
if ('global' in value) {
pre='<a href="javascript://" onClick="globalChange(\''+index+'\');" title="Change the \''+value['name']+'\' parameter globally">';
post='</a>'
}
$("#headerRow").append("\t<th"+widthAdd+">"+pre+value['name']+post+"</th>\n");
});
percDataIndex++;
$("#headerRow").append("\t<th class='center'>Suggested PG Count</th>\n");
$("#pgsperpool").append("<tr id='totalRow'><td colspan='"+percDataIndex+"' id='percTotal' style='text-align: right; margin-right: 10px;'><strong>Total Data Percentage:</strong> <span id='percTotalValue'>0</span>%</td><td>&nbsp;</td><td id='pgTotal' class='bold pgcount' style='text-align: right;'>PG Total Count: <span id='pgTotalValue'>0</span></td></tr>");
}
function nearestPow2( aSize ){
var tmp=Math.pow(2, Math.round(Math.log(aSize)/Math.log(2)));
if(tmp<(aSize*(1-pow2belowThreshold)))
tmp*=2;
return tmp;
}
function globalChange(field) {
dialogHTML='<div title="Change \''+key_values[field]['name']+'\' Globally"><form>';
dialogHTML+='<label for="value">New '+key_values[field]['name']+' value:</label><br />\n';
dialogHTML+='<input type="text" name="globalValue" id="globalValue" value="'+$("#row0_"+field+"_input").val()+'" style="text-align: right;"/>';
dialogHTML+='<input type="hidden" name="globalField" id="globalField" value="'+field+'"/>';
dialogHTML+='<input type="submit" tabindex="-1" style="position:absolute; top:-1000px">';
dialogHTML+='</form>';
globalDialog=$(dialogHTML).dialog({
autoOpen: true,
width: 350,
show: 'fold',
hide: 'fold',
modal: true,
buttons: {
"Update Value": function() { massUpdate($("#globalField").val(),$("#globalValue").val()); globalDialog.dialog("close"); setTimeout(function() { globalDialog.dialog("destroy"); }, 1000); },
"Cancel": function() { globalDialog.dialog("close"); setTimeout(function() { globalDialog.dialog("destroy"); }, 1000); }
}
});
}
var rowCount=0;
function fillTable(presetType) {
rowCount=0;
$.each(presetTables[presetType], function(index,value) {
addTableRow(value);
});
}
function addPool() {
dialogHTML='<div title="Add Pool"><form>';
$.each(key_values, function(index,value) {
dialogHTML+='<br /><label for="new'+index+'">'+value['name']+':</label><br />\n';
classAdd='right';
if ( index == 'poolName' )
classAdd='left';
dialogHTML+='<input type="text" name="new'+index+'" id="new'+index+'" value="'+value['default']+'" class="'+classAdd+'"/><br />';
});
dialogHTML+='<input type="submit" tabindex="-1" style="position:absolute; top:-1000px">';
dialogHTML+='</form>';
addPoolDialog=$(dialogHTML).dialog({
autoOpen: true,
width: 350,
show: 'fold',
hide: 'fold',
modal: true,
buttons: {
"Add Pool": function() {
var newPoolValues={};
$.each(key_values,function(index,value) {
newPoolValues[index]=$("#new"+index).val();
});
addTableRow(newPoolValues);
addPoolDialog.dialog("close");
setTimeout(function() { addPoolDialog.dialog("destroy"); }, 1000); },
"Cancel": function() { addPoolDialog.dialog("close"); setTimeout(function() { addPoolDialog.dialog("destroy"); }, 1000); }
}
});
// addTableRow({'poolName':'newPool','size':3, 'osdNum':100,'targPGsPerOSD': 100, 'percData':0});
}
function addTableRow(rowValues) {
rowAdd="<tr id='row"+rowCount+"'>\n";
rowAdd+="\t<td width='15px' class='inputColor'><a href='javascript://' title='Remove Pool' onClick='$(\"#row"+rowCount+"\").remove();updateTotals();'><span class='ui-icon ui-icon-trash'></span></a></td>\n";
$.each(key_values, function(index,value) {
classAdd=' center';
modifier='';
if ( index == 'percData' ) {
classAdd='" style="text-align: right;';
// modifier=' %';
} else if ( index == 'poolName' )
classAdd=' left';
rowAdd+="\t<td id=\"row"+rowCount+"_"+index+"\"><input type=\"text\" class=\"inputColor "+index+classAdd+"\" id=\"row"+rowCount+"_"+index+"_input\" value=\""+rowValues[index]+"\" onFocus=\"focusMe("+rowCount+",'"+index+"');\" onKeyUp=\"keyMe("+rowCount+",'"+index+"');\" onBlur=\"blurMe("+rowCount+",'"+index+"');\">"+modifier+"</td>\n";
});
rowAdd+="\t<td id=\"row"+rowCount+"_pgCount\" class='pgcount' style='text-align: right;'>0</td></tr>";
$("#totalRow").before(rowAdd);
updatePGCount(rowCount);
$("[id$='percData_input']").each(function() { var fieldVal=parseFloat($(this).val()); $(this).val(fieldVal.toFixed(2)); });
rowCount++;
}
function updatePGCount(rowID) {
if(rowID==-1) {
for(var i=0;i<rowCount;i++) {
updatePGCount(i);
}
} else {
minValue=nearestPow2(Math.floor($("#row"+rowID+"_osdNum_input").val()/$("#row"+rowID+"_size_input").val())+1);
if(minValue<$("#row"+rowID+"_osdNum_input").val())
minValue*=2;
calcValue=nearestPow2(Math.floor(($("#row"+rowID+"_targPGsPerOSD_input").val()*$("#row"+rowID+"_osdNum_input").val()*$("#row"+rowID+"_percData_input").val())/(100*$("#row"+rowID+"_size_input").val())));
if(minValue>calcValue)
$("#row"+rowID+"_pgCount").html(minValue);
else
$("#row"+rowID+"_pgCount").html(calcValue);
}
updateTotals();
}
function focusMe(rowID,field) {
$("#row"+rowID+"_"+field+"_input").toggleClass('inputColor');
$("#row"+rowID+"_"+field+"_input").toggleClass('highlightColor');
$("#dt_"+field).toggleClass('highlightColor');
$("#dd_"+field).toggleClass('highlightColor');
updatePGCount(rowID);
}
function blurMe(rowID,field) {
focusMe(rowID,field);
$("[id$='percData_input']").each(function() { var fieldVal=parseFloat($(this).val()); $(this).val(fieldVal.toFixed(2)); });
}
function keyMe(rowID,field) {
updatePGCount(rowID);
}
function massUpdate(field,value) {
$("[id$='_"+field+"_input']").val(value);
key_values[field]['default']=value;
updatePGCount(-1);
}
function updateTotals() {
var totalPerc=0;
var totalPGs=0;
$("[id$='percData_input']").each(function() {
totalPerc+=parseFloat($(this).val());
if ( parseFloat($(this).val()) > 100 )
$(this).addClass('ui-state-error');
else
$(this).removeClass('ui-state-error');
});
$("[id$='_pgCount']").each(function() {
totalPGs+=parseInt($(this).html());
});
$("#percTotalValue").html(totalPerc.toFixed(2));
$("#pgTotalValue").html(totalPGs);
if(parseFloat(totalPerc.toFixed(2)) % 100 != 0) {
$("#percTotalValue").addClass('ui-state-error');
$("#li_totalPerc").addClass('ui-state-error');
} else {
$("#percTotalValue").removeClass('ui-state-error');
$("#li_totalPerc").removeClass('ui-state-error');
}
$("#commandCode").html("");
}
function generateCommands() {
outputCommands="## Note: The 'while' loops below pause between pools to allow all\n\
## PGs to be created. This is a safety mechanism to prevent\n\
## saturating the Monitor nodes.\n\
## -------------------------------------------------------------------\n\n";
for(i=0;i<rowCount;i++) {
console.log(i);
outputCommands+="ceph osd pool create "+$("#row"+i+"_poolName_input").val()+" "+$("#row"+i+"_pgCount").html()+"\n";
outputCommands+="ceph osd pool set "+$("#row"+i+"_poolName_input").val()+" size "+$("#row"+i+"_size_input").val()+"\n";
outputCommands+="while [ $(ceph -s | grep creating -c) -gt 0 ]; do echo -n .;sleep 1; done\n\n";
}
window.location.href = "data:application/download," + encodeURIComponent(outputCommands);
}
}
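The sizing rule that the calculator above implements can be restated compactly. The Python sketch below is our own paraphrase of ``nearestPow2`` and ``updatePGCount`` and is not part of the vendored file:

.. code-block:: python

    import math

    POW2_BELOW_THRESHOLD = 0.25      # same constant as pow2belowThreshold

    def nearest_pow2(n):
        # Round to the nearest power of two, bumping up if the result would
        # fall more than 25% below the requested value.
        p = 2 ** round(math.log2(n))
        if p < n * (1 - POW2_BELOW_THRESHOLD):
            p *= 2
        return p

    def suggested_pg_count(osd_num, size, perc_data, target_pgs_per_osd=100):
        # Per-pool minimum derived from the OSD count and replica size.
        minimum = nearest_pow2(osd_num // size + 1)
        if minimum < osd_num:
            minimum *= 2
        # (target PGs per OSD * OSD count * data share) / replicas, as a power of two.
        calculated = nearest_pow2(
            math.floor(target_pgs_per_osd * osd_num * perc_data / (100 * size)))
        return max(minimum, calculated)

    # Example: 100 OSDs, 3x replication, one pool holding 100% of the data.
    print(suggested_pg_count(100, 3, 100))   # -> 4096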

View File

@ -19,9 +19,14 @@ The Ceph Storage Cluster
======================== ========================
Ceph provides an infinitely scalable :term:`Ceph Storage Cluster` based upon Ceph provides an infinitely scalable :term:`Ceph Storage Cluster` based upon
:abbr:`RADOS (Reliable Autonomic Distributed Object Store)`, which you can read :abbr:`RADOS (Reliable Autonomic Distributed Object Store)`, a reliable,
about in `RADOS - A Scalable, Reliable Storage Service for Petabyte-scale distributed storage service that uses the intelligence in each of its nodes to
Storage Clusters`_. secure the data it stores and to provide that data to :term:`client`\s. See
Sage Weil's "`The RADOS Object Store
<https://ceph.io/en/news/blog/2009/the-rados-distributed-object-store/>`_" blog
post for a brief explanation of RADOS and see `RADOS - A Scalable, Reliable
Storage Service for Petabyte-scale Storage Clusters`_ for an exhaustive
explanation of :term:`RADOS`.
A Ceph Storage Cluster consists of multiple types of daemons: A Ceph Storage Cluster consists of multiple types of daemons:
@ -33,9 +38,8 @@ A Ceph Storage Cluster consists of multiple types of daemons:
.. _arch_monitor: .. _arch_monitor:
Ceph Monitors maintain the master copy of the cluster map, which they provide Ceph Monitors maintain the master copy of the cluster map, which they provide
to Ceph clients. Provisioning multiple monitors within the Ceph cluster ensures to Ceph clients. The existence of multiple monitors in the Ceph cluster ensures
availability in the event that one of the monitor daemons or its host fails. availability if one of the monitor daemons or its host fails.
The Ceph monitor provides copies of the cluster map to storage cluster clients.
A Ceph OSD Daemon checks its own state and the state of other OSDs and reports A Ceph OSD Daemon checks its own state and the state of other OSDs and reports
back to monitors. back to monitors.
@ -47,10 +51,11 @@ A Ceph Metadata Server (MDS) manages file metadata when CephFS is used to
provide file services. provide file services.
Storage cluster clients and :term:`Ceph OSD Daemon`\s use the CRUSH algorithm Storage cluster clients and :term:`Ceph OSD Daemon`\s use the CRUSH algorithm
to compute information about data location. This means that clients and OSDs to compute information about the location of data. Use of the CRUSH algorithm
are not bottlenecked by a central lookup table. Ceph's high-level features means that clients and OSDs are not bottlenecked by a central lookup table.
include a native interface to the Ceph Storage Cluster via ``librados``, and a Ceph's high-level features include a native interface to the Ceph Storage
number of service interfaces built on top of ``librados``. Cluster via ``librados``, and a number of service interfaces built on top of
``librados``.
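As a concrete illustration of the ``librados`` native interface mentioned above, the short Python sketch below writes and reads one object. It is not part of this change; the pool name ``mypool`` and object name are placeholders.

.. code-block:: python

    # Minimal librados sketch (illustrative only): the client fetches the
    # cluster map from the Monitors, then CRUSH determines object placement.
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('mypool')      # pool must already exist
        try:
            ioctx.write_full('greeting', b'hello ceph')
            print(ioctx.read('greeting'))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()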
Storing Data Storing Data
------------ ------------
@ -128,8 +133,8 @@ massive scale by distributing the work to all the OSD daemons in the cluster
and all the clients that communicate with them. CRUSH uses intelligent data and all the clients that communicate with them. CRUSH uses intelligent data
replication to ensure resiliency, which is better suited to hyper-scale replication to ensure resiliency, which is better suited to hyper-scale
storage. The following sections provide additional details on how CRUSH works. storage. The following sections provide additional details on how CRUSH works.
For a detailed discussion of CRUSH, see `CRUSH - Controlled, Scalable, For an in-depth, academic discussion of CRUSH, see `CRUSH - Controlled,
Decentralized Placement of Replicated Data`_. Scalable, Decentralized Placement of Replicated Data`_.
.. index:: architecture; cluster map .. index:: architecture; cluster map
@ -587,7 +592,7 @@ cluster map, the client doesn't know anything about object locations.**
**Object locations must be computed.** **Object locations must be computed.**
The client requies only the object ID and the name of the pool in order to The client requires only the object ID and the name of the pool in order to
compute the object location. compute the object location.
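To see a computed mapping for yourself, the cluster can report it from just these two inputs. The hedged Python sketch below simply drives the ``ceph osd map`` CLI command; the pool name ``liverpool`` and object name ``john`` are placeholders.

.. code-block:: python

    import subprocess

    # Show which PG and OSDs an object maps to, given only a pool name and
    # an object name.
    result = subprocess.run(['ceph', 'osd', 'map', 'liverpool', 'john'],
                            check=True, capture_output=True, text=True)
    print(result.stdout)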
Ceph stores data in named pools (for example, "liverpool"). When a client Ceph stores data in named pools (for example, "liverpool"). When a client
@ -1589,7 +1594,8 @@ typically deploy a Ceph Block Device with the ``rbd`` network storage driver in
QEMU/KVM, where the host machine uses ``librbd`` to provide a block device QEMU/KVM, where the host machine uses ``librbd`` to provide a block device
service to the guest. Many cloud computing stacks use ``libvirt`` to integrate service to the guest. Many cloud computing stacks use ``libvirt`` to integrate
with hypervisors. You can use thin-provisioned Ceph Block Devices with QEMU and with hypervisors. You can use thin-provisioned Ceph Block Devices with QEMU and
``libvirt`` to support OpenStack and CloudStack among other solutions. ``libvirt`` to support OpenStack, OpenNebula and CloudStack
among other solutions.
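To make the thin-provisioning point concrete, here is a hedged Python sketch using the ``rbd`` binding; the pool ``rbd`` and image ``vm-disk`` are assumed names, not values from this commit.

.. code-block:: python

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')                      # assumes a pool named "rbd"
    try:
        rbd.RBD().create(ioctx, 'vm-disk', 10 * 1024**3)   # 10 GiB image, thin-provisioned
        image = rbd.Image(ioctx, 'vm-disk')
        try:
            image.write(b'\0' * 4096, 0)                   # space is allocated on first write
        finally:
            image.close()
    finally:
        ioctx.close()
        cluster.shutdown()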
While we do not provide ``librbd`` support with other hypervisors at this time, While we do not provide ``librbd`` support with other hypervisors at this time,
you may also use Ceph Block Device kernel objects to provide a block device to a you may also use Ceph Block Device kernel objects to provide a block device to a

View File

@ -22,20 +22,20 @@ Preparation
#. Make sure that the ``cephadm`` command line tool is available on each host #. Make sure that the ``cephadm`` command line tool is available on each host
in the existing cluster. See :ref:`get-cephadm` to learn how. in the existing cluster. See :ref:`get-cephadm` to learn how.
#. Prepare each host for use by ``cephadm`` by running this command: #. Prepare each host for use by ``cephadm`` by running this command on that host:
.. prompt:: bash # .. prompt:: bash #
cephadm prepare-host cephadm prepare-host
#. Choose a version of Ceph to use for the conversion. This procedure will work #. Choose a version of Ceph to use for the conversion. This procedure will work
with any release of Ceph that is Octopus (15.2.z) or later, inclusive. The with any release of Ceph that is Octopus (15.2.z) or later. The
latest stable release of Ceph is the default. You might be upgrading from an latest stable release of Ceph is the default. You might be upgrading from an
earlier Ceph release at the same time that you're performing this earlier Ceph release at the same time that you're performing this
conversion; if you are upgrading from an earlier release, make sure to conversion. If you are upgrading from an earlier release, make sure to
follow any upgrade-related instructions for that release. follow any upgrade-related instructions for that release.
Pass the image to cephadm with the following command: Pass the Ceph container image to cephadm with the following command:
.. prompt:: bash # .. prompt:: bash #
@ -50,25 +50,27 @@ Preparation
cephadm ls cephadm ls
Before starting the conversion process, ``cephadm ls`` shows all existing Before starting the conversion process, ``cephadm ls`` reports all existing
daemons to have a style of ``legacy``. As the adoption process progresses, daemons with the style ``legacy``. As the adoption process progresses,
adopted daemons will appear with a style of ``cephadm:v1``. adopted daemons will appear with the style ``cephadm:v1``.
Adoption process Adoption process
---------------- ----------------
#. Make sure that the ceph configuration has been migrated to use the cluster #. Make sure that the ceph configuration has been migrated to use the cluster's
config database. If the ``/etc/ceph/ceph.conf`` is identical on each host, central config database. If ``/etc/ceph/ceph.conf`` is identical on all
then the following command can be run on one single host and will affect all hosts, then the following command can be run on one host and will take
hosts: effect for all hosts:
.. prompt:: bash # .. prompt:: bash #
ceph config assimilate-conf -i /etc/ceph/ceph.conf ceph config assimilate-conf -i /etc/ceph/ceph.conf
If there are configuration variations between hosts, you will need to repeat If there are configuration variations between hosts, you will need to repeat
this command on each host. During this adoption process, view the cluster's this command on each host, taking care that if there are conflicting option
settings across hosts, the values from the last host will be used. During this
adoption process, view the cluster's central
configuration to confirm that it is complete by running the following configuration to confirm that it is complete by running the following
command: command:
@ -76,36 +78,36 @@ Adoption process
ceph config dump ceph config dump
#. Adopt each monitor: #. Adopt each Monitor:
.. prompt:: bash # .. prompt:: bash #
cephadm adopt --style legacy --name mon.<hostname> cephadm adopt --style legacy --name mon.<hostname>
Each legacy monitor should stop, quickly restart as a cephadm Each legacy Monitor will stop, quickly restart as a cephadm
container, and rejoin the quorum. container, and rejoin the quorum.
#. Adopt each manager: #. Adopt each Manager:
.. prompt:: bash # .. prompt:: bash #
cephadm adopt --style legacy --name mgr.<hostname> cephadm adopt --style legacy --name mgr.<hostname>
#. Enable cephadm: #. Enable cephadm orchestration:
.. prompt:: bash # .. prompt:: bash #
ceph mgr module enable cephadm ceph mgr module enable cephadm
ceph orch set backend cephadm ceph orch set backend cephadm
#. Generate an SSH key: #. Generate an SSH key for cephadm:
.. prompt:: bash # .. prompt:: bash #
ceph cephadm generate-key ceph cephadm generate-key
ceph cephadm get-pub-key > ~/ceph.pub ceph cephadm get-pub-key > ~/ceph.pub
#. Install the cluster SSH key on each host in the cluster: #. Install the cephadm SSH key on each host in the cluster:
.. prompt:: bash # .. prompt:: bash #
@ -118,9 +120,10 @@ Adoption process
SSH keys. SSH keys.
.. note:: .. note::
It is also possible to have cephadm use a non-root user to SSH It is also possible to arrange for cephadm to use a non-root user to SSH
into cluster hosts. This user needs to have passwordless sudo access. into cluster hosts. This user needs to have passwordless sudo access.
Use ``ceph cephadm set-user <user>`` and copy the SSH key to that user. Use ``ceph cephadm set-user <user>`` and copy the SSH key to that user's
home directory on each host.
See :ref:`cephadm-ssh-user`. See :ref:`cephadm-ssh-user`.
#. Tell cephadm which hosts to manage: #. Tell cephadm which hosts to manage:
@ -129,10 +132,10 @@ Adoption process
ceph orch host add <hostname> [ip-address] ceph orch host add <hostname> [ip-address]
This will perform a ``cephadm check-host`` on each host before adding it; This will run ``cephadm check-host`` on each host before adding it.
this check ensures that the host is functioning properly. The IP address This check ensures that the host is functioning properly. The IP address
argument is recommended; if not provided, then the host name will be resolved argument is recommended. If the address is not provided, then the host name
via DNS. will be resolved via DNS.
#. Verify that the adopted monitor and manager daemons are visible: #. Verify that the adopted monitor and manager daemons are visible:
@ -153,8 +156,8 @@ Adoption process
cephadm adopt --style legacy --name osd.1 cephadm adopt --style legacy --name osd.1
cephadm adopt --style legacy --name osd.2 cephadm adopt --style legacy --name osd.2
#. Redeploy MDS daemons by telling cephadm how many daemons to run for #. Redeploy CephFS MDS daemons (if deployed) by telling cephadm how many daemons to run for
each file system. List file systems by name with the command ``ceph fs each file system. List CephFS file systems by name with the command ``ceph fs
ls``. Run the following command on the master nodes to redeploy the MDS ls``. Run the following command on the master nodes to redeploy the MDS
daemons: daemons:
@ -189,19 +192,19 @@ Adoption process
systemctl stop ceph-mds.target systemctl stop ceph-mds.target
rm -rf /var/lib/ceph/mds/ceph-* rm -rf /var/lib/ceph/mds/ceph-*
#. Redeploy RGW daemons. Cephadm manages RGW daemons by zone. For each #. Redeploy Ceph Object Gateway RGW daemons if deployed. Cephadm manages RGW
zone, deploy new RGW daemons with cephadm: daemons by zone. For each zone, deploy new RGW daemons with cephadm:
.. prompt:: bash # .. prompt:: bash #
ceph orch apply rgw <svc_id> [--realm=<realm>] [--zone=<zone>] [--port=<port>] [--ssl] [--placement=<placement>] ceph orch apply rgw <svc_id> [--realm=<realm>] [--zone=<zone>] [--port=<port>] [--ssl] [--placement=<placement>]
where *<placement>* can be a simple daemon count, or a list of where *<placement>* can be a simple daemon count, or a list of
specific hosts (see :ref:`orchestrator-cli-placement-spec`), and the specific hosts (see :ref:`orchestrator-cli-placement-spec`). The
zone and realm arguments are needed only for a multisite setup. zone and realm arguments are needed only for a multisite setup.
After the daemons have started and you have confirmed that they are After the daemons have started and you have confirmed that they are
functioning, stop and remove the old, legacy daemons: functioning, stop and remove the legacy daemons:
.. prompt:: bash # .. prompt:: bash #

View File

@ -1,36 +1,36 @@
======================= =======================
Basic Ceph Client Setup Basic Ceph Client Setup
======================= =======================
Client machines require some basic configuration to interact with Client hosts require basic configuration to interact with
Ceph clusters. This section describes how to configure a client machine Ceph clusters. This section describes how to perform this configuration.
so that it can interact with a Ceph cluster.
.. note:: .. note::
Most client machines need to install only the `ceph-common` package Most client hosts need to install only the ``ceph-common`` package
and its dependencies. Such a setup supplies the basic `ceph` and and its dependencies. Such an installation supplies the basic ``ceph`` and
`rados` commands, as well as other commands including `mount.ceph` ``rados`` commands, as well as other commands including ``mount.ceph``
and `rbd`. and ``rbd``.
Config File Setup Config File Setup
================= =================
Client machines usually require smaller configuration files (here Client hosts usually require smaller configuration files (here
sometimes called "config files") than do full-fledged cluster members. sometimes called "config files") than do back-end cluster hosts.
To generate a minimal config file, log into a host that has been To generate a minimal config file, log into a host that has been
configured as a client or that is running a cluster daemon, and then run the following command: configured as a client or that is running a cluster daemon, then
run the following command:
.. prompt:: bash # .. prompt:: bash #
ceph config generate-minimal-conf ceph config generate-minimal-conf
This command generates a minimal config file that tells the client how This command generates a minimal config file that tells the client how
to reach the Ceph monitors. The contents of this file should usually to reach the Ceph Monitors. This file should usually
be installed in ``/etc/ceph/ceph.conf``. be copied to ``/etc/ceph/ceph.conf`` on each client host.
Keyring Setup Keyring Setup
============= =============
Most Ceph clusters run with authentication enabled. This means that Most Ceph clusters run with authentication enabled. This means that
the client needs keys in order to communicate with the machines in the the client needs keys in order to communicate with Ceph daemons.
cluster. To generate a keyring file with credentials for `client.fs`, To generate a keyring file with credentials for ``client.fs``,
log into a running cluster member and run the following command: log into a running cluster member and run the following command:
.. prompt:: bash $ .. prompt:: bash $
@ -40,6 +40,10 @@ log into an running cluster member and run the following command:
The resulting output is directed into a keyring file, typically The resulting output is directed into a keyring file, typically
``/etc/ceph/ceph.keyring``. ``/etc/ceph/ceph.keyring``.
To gain a broader understanding of client keyring distribution and administration, you should read :ref:`client_keyrings_and_configs`. To gain a broader understanding of client keyring distribution and administration,
you should read :ref:`client_keyrings_and_configs`.
To see an example that explains how to distribute ``ceph.conf`` configuration files to hosts that are tagged with the ``bare_config`` label, you should read the section called "Distributing ceph.conf to hosts tagged with bare_config" in the section called :ref:`etc_ceph_conf_distribution`. To see an example that explains how to distribute ``ceph.conf`` configuration
files to hosts that are tagged with the ``bare_config`` label, you should read
the subsection named "Distributing ceph.conf to hosts tagged with bare_config"
under the heading :ref:`etc_ceph_conf_distribution`.
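One hedged way to confirm that the generated config file and keyring work together is to open a connection from Python as ``client.fs``; the paths below follow the defaults used in this section.

.. code-block:: python

    import rados

    # Connect using the minimal conf and the keyring generated above.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf',
                          name='client.fs',
                          conf={'keyring': '/etc/ceph/ceph.keyring'})
    cluster.connect()
    print(cluster.get_fsid())        # succeeds only if authentication worked
    cluster.shutdown()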

View File

@ -30,8 +30,8 @@ This table shows which version pairs are expected to work or not work together:
.. note:: .. note::
While not all podman versions have been actively tested against While not all Podman versions have been actively tested against
all Ceph versions, there are no known issues with using podman all Ceph versions, there are no known issues with using Podman
version 3.0 or greater with Ceph Quincy and later releases. version 3.0 or greater with Ceph Quincy and later releases.
.. warning:: .. warning::

View File

@ -74,9 +74,9 @@ To add each new host to the cluster, perform two steps:
ceph orch host add host2 10.10.0.102 ceph orch host add host2 10.10.0.102
ceph orch host add host3 10.10.0.103 ceph orch host add host3 10.10.0.103
It is best to explicitly provide the host IP address. If an IP is It is best to explicitly provide the host IP address. If an address is
not provided, then the host name will be immediately resolved via not provided, then the host name will be immediately resolved via
DNS and that IP will be used. DNS and the result will be used.
One or more labels can also be included to immediately label the One or more labels can also be included to immediately label the
new host. For example, by default the ``_admin`` label will make new host. For example, by default the ``_admin`` label will make
@ -104,7 +104,7 @@ To drain all daemons from a host, run a command of the following form:
The ``_no_schedule`` and ``_no_conf_keyring`` labels will be applied to the The ``_no_schedule`` and ``_no_conf_keyring`` labels will be applied to the
host. See :ref:`cephadm-special-host-labels`. host. See :ref:`cephadm-special-host-labels`.
If you only want to drain daemons but leave managed ceph conf and keyring If you want to drain daemons but leave managed `ceph.conf` and keyring
files on the host, you may pass the ``--keep-conf-keyring`` flag to the files on the host, you may pass the ``--keep-conf-keyring`` flag to the
drain command. drain command.
@ -115,7 +115,8 @@ drain command.
This will apply the ``_no_schedule`` label to the host but not the This will apply the ``_no_schedule`` label to the host but not the
``_no_conf_keyring`` label. ``_no_conf_keyring`` label.
All OSDs on the host will be scheduled to be removed. You can check the progress of the OSD removal operation with the following command: All OSDs on the host will be scheduled to be removed. You can check
progress of the OSD removal operation with the following command:
.. prompt:: bash # .. prompt:: bash #
@ -148,7 +149,7 @@ cluster by running the following command:
Offline host removal Offline host removal
-------------------- --------------------
Even if a host is offline and can not be recovered, it can be removed from the If a host is offline and can not be recovered, it can be removed from the
cluster by running a command of the following form: cluster by running a command of the following form:
.. prompt:: bash # .. prompt:: bash #
@ -250,8 +251,8 @@ Rescanning Host Devices
======================= =======================
Some servers and external enclosures may not register device removal or insertion with the Some servers and external enclosures may not register device removal or insertion with the
kernel. In these scenarios, you'll need to perform a host rescan. A rescan is typically kernel. In these scenarios, you'll need to perform a device rescan on the appropriate host.
non-disruptive, and can be performed with the following CLI command: A rescan is typically non-disruptive, and can be performed with the following CLI command:
.. prompt:: bash # .. prompt:: bash #
@ -314,19 +315,43 @@ create a new CRUSH host located in the specified hierarchy.
.. note:: .. note::
The ``location`` attribute will only affect the initial CRUSH location. Subsequent The ``location`` attribute will only affect the initial CRUSH location.
changes of the ``location`` property will be ignored. Also, removing a host will not remove Subsequent changes of the ``location`` property will be ignored. Also,
any CRUSH buckets. removing a host will not remove an associated CRUSH bucket unless the
``--rm-crush-entry`` flag is provided to the ``orch host rm`` command.
See also :ref:`crush_map_default_types`. See also :ref:`crush_map_default_types`.
Removing a host from the CRUSH map
==================================
The ``ceph orch host rm`` command has support for removing the associated host bucket
from the CRUSH map. This is done by providing the ``--rm-crush-entry`` flag.
.. prompt:: bash [ceph:root@host1/]#
ceph orch host rm host1 --rm-crush-entry
When this flag is specified, cephadm will attempt to remove the host bucket
from the CRUSH map as part of the host removal process. Note that if
it fails to do so, cephadm will report the failure and the host will remain under
cephadm control.
.. note::
Removal from the CRUSH map will fail if there are OSDs deployed on the
host. If you would like to remove all the host's OSDs as well, please start
by using the ``ceph orch host drain`` command to do so. Once the OSDs
have been removed, then you may direct cephadm to remove the CRUSH bucket
along with the host using the ``--rm-crush-entry`` flag.
OS Tuning Profiles OS Tuning Profiles
================== ==================
Cephadm can be used to manage operating-system-tuning profiles that apply sets Cephadm can be used to manage operating system tuning profiles that apply
of sysctl settings to sets of hosts. ``sysctl`` settings to sets of hosts.
Create a YAML spec file in the following format: To do so, create a YAML spec file in the following format:
.. code-block:: yaml .. code-block:: yaml
@ -345,18 +370,21 @@ Apply the tuning profile with the following command:
ceph orch tuned-profile apply -i <tuned-profile-file-name> ceph orch tuned-profile apply -i <tuned-profile-file-name>
This profile is written to ``/etc/sysctl.d/`` on each host that matches the This profile is written to a file under ``/etc/sysctl.d/`` on each host
hosts specified in the placement block of the yaml, and ``sysctl --system`` is specified in the ``placement`` block, then ``sysctl --system`` is
run on the host. run on the host.
.. note:: .. note::
The exact filename that the profile is written to within ``/etc/sysctl.d/`` The exact filename that the profile is written to within ``/etc/sysctl.d/``
is ``<profile-name>-cephadm-tuned-profile.conf``, where ``<profile-name>`` is is ``<profile-name>-cephadm-tuned-profile.conf``, where ``<profile-name>`` is
the ``profile_name`` setting that you specify in the YAML spec. Because the ``profile_name`` setting that you specify in the YAML spec. We suggest
naming these profiles following the usual ``sysctl.d`` `NN-xxxxx` convention. Because
sysctl settings are applied in lexicographical order (sorted by the filename sysctl settings are applied in lexicographical order (sorted by the filename
in which the setting is specified), you may want to set the ``profile_name`` in which the setting is specified), you may want to carefully choose
in your spec so that it is applied before or after other conf files. the ``profile_name`` in your spec so that it is applied before or after other
conf files. Careful selection ensures that values supplied here override or
do not override those in other ``sysctl.d`` files as desired.
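The ordering behaviour described in this note can be sketched in a few lines of Python. This is a simplification of what ``sysctl --system`` does (it ignores the other ``sysctl.d`` search paths) and is intended only to show why the file name prefix matters:

.. code-block:: python

    import glob

    # Later files (lexically) win when they set the same key, so a profile
    # written as "99-foo-cephadm-tuned-profile.conf" overrides "10-defaults.conf".
    effective = {}
    for path in sorted(glob.glob('/etc/sysctl.d/*.conf')):
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith(('#', ';')):
                    continue
                key, _, value = line.partition('=')
                effective[key.strip()] = value.strip()
    print(effective)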
.. note::

@@ -365,7 +393,7 @@ run on the host.

.. note::

   Applying tuning profiles is idempotent when the ``--no-overwrite`` option is
   passed. Moreover, if the ``--no-overwrite`` option is passed, existing
   profiles with the same name are not overwritten.
@@ -525,7 +553,7 @@ There are two ways to customize this configuration for your environment:

We do *not recommend* this approach. The path name must be
visible to *any* mgr daemon, and cephadm runs all daemons as
containers. That means that the file must either be placed
inside a customized container image for your deployment, or
manually distributed to the mgr data directory
(``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
@@ -578,8 +606,8 @@ Note that ``man hostname`` recommends ``hostname`` to return the bare
host name:

   The FQDN (Fully Qualified Domain Name) of the system is the
   name that the resolver(3) returns for the host name, for example
   ``ursula.example.com``. It is usually the short hostname followed by the DNS
   domain name (the part after the first dot). You can check the FQDN
   using ``hostname --fqdn`` or the domain name using ``dnsdomainname``.
@@ -4,7 +4,7 @@

Deploying a new Ceph cluster
============================

Cephadm creates a new Ceph cluster by bootstrapping a single
host, expanding the cluster to encompass any additional hosts, and
then deploying the needed services.
@@ -18,7 +18,7 @@ Requirements

- Python 3
- Systemd
- Podman or Docker for running containers
- Time synchronization (such as Chrony or the legacy ``ntpd``)
- LVM2 for provisioning storage devices

Any modern Linux distribution should be sufficient. Dependencies
@@ -45,6 +45,13 @@ There are two ways to install ``cephadm``:

Choose either the distribution-specific method or the curl-based method. Do
not attempt to use both these methods on one system.

.. note:: Recent versions of cephadm are distributed as an executable compiled
   from source code. Unlike for earlier versions of Ceph it is no longer
   sufficient to copy a single script from Ceph's git tree and run it. If you
   wish to run cephadm using a development version you should create your own
   build of cephadm. See :ref:`compiling-cephadm` for details on how to create
   your own standalone cephadm executable.

.. _cephadm_install_distros:

distribution-specific installations
@@ -85,9 +92,9 @@ that case, you can install cephadm directly. For example:

curl-based installation
-----------------------

* First, determine what version of Ceph you wish to install. You can use the releases
  page to find the `latest active releases <https://docs.ceph.com/en/latest/releases/#active-releases>`_.
  For example, we might find that ``18.2.1`` is the latest
  active release.

* Use ``curl`` to fetch a build of cephadm for that release.
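
  As a sketch of what this looks like (the download URL path is an assumption;
  adjust the release string to match the version chosen above):

  .. prompt:: bash #

     curl --remote-name --location https://download.ceph.com/rpm-18.2.1/el9/noarch/cephadm
     chmod +x cephadm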
@@ -113,7 +120,7 @@ curl-based installation

* If you encounter any issues with running cephadm due to errors including
  the message ``bad interpreter``, then you may not have Python or
  the correct version of Python installed. The cephadm tool requires Python 3.6
  or later. You can manually run cephadm with a particular version of Python by
  prefixing the command with your installed Python version. For example:

  .. prompt:: bash #

@@ -121,6 +128,11 @@ curl-based installation

     python3.8 ./cephadm <arguments...>

* Although the standalone cephadm is sufficient to bootstrap a cluster, it is
  best to have the ``cephadm`` command installed on the host. To install
  the packages that provide the ``cephadm`` command, run the following
  commands (see the sketch below):
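
  A rough sketch of those commands (the ``add-repo``/``install`` subcommands
  are assumed to be available in your downloaded cephadm; substitute the
  release you are installing):

  .. prompt:: bash #

     ./cephadm add-repo --release reef
     ./cephadm install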
.. _cephadm_update:

update cephadm

@@ -166,7 +178,7 @@ What to know before you bootstrap
The first step in creating a new Ceph cluster is running the ``cephadm
bootstrap`` command on the Ceph cluster's first host. The act of running the
``cephadm bootstrap`` command on the Ceph cluster's first host creates the Ceph
cluster's first Monitor daemon.

You must pass the IP address of the Ceph cluster's first host to the ``ceph
bootstrap`` command, so you'll need to know the IP address of that host.
@@ -187,13 +199,13 @@ Run the ``ceph bootstrap`` command:

This command will:

* Create a Monitor and a Manager daemon for the new cluster on the local
  host.
* Generate a new SSH key for the Ceph cluster and add it to the root
  user's ``/root/.ssh/authorized_keys`` file.
* Write a copy of the public key to ``/etc/ceph/ceph.pub``.
* Write a minimal configuration file to ``/etc/ceph/ceph.conf``. This
  file is needed to communicate with Ceph daemons.
* Write a copy of the ``client.admin`` administrative (privileged!)
  secret key to ``/etc/ceph/ceph.client.admin.keyring``.
* Add the ``_admin`` label to the bootstrap host. By default, any host

@@ -205,7 +217,7 @@ This command will:
Further information about cephadm bootstrap
-------------------------------------------

The default bootstrap process will work for most users. But if you'd like
to know more about ``cephadm bootstrap`` right away, read the list below.

Also, you can run ``cephadm bootstrap -h`` to see all of ``cephadm``'s
@@ -216,15 +228,15 @@ available options.

  journald. If you want Ceph to write traditional log files to ``/var/log/ceph/$fsid``,
  use the ``--log-to-file`` option during bootstrap.

* Larger Ceph clusters perform best when (external to the Ceph cluster)
  public network traffic is separated from (internal to the Ceph cluster)
  cluster traffic. The internal cluster traffic handles replication, recovery,
  and heartbeats between OSD daemons. You can define the :ref:`cluster
  network<cluster-network>` by supplying the ``--cluster-network`` option to the ``bootstrap``
  subcommand. This parameter must be a subnet in CIDR notation (for example
  ``10.90.90.0/24`` or ``fe80::/64``).

* ``cephadm bootstrap`` writes to ``/etc/ceph`` the files needed to access
  the new cluster. This central location makes it possible for Ceph
  packages installed on the host (e.g., packages that give access to the
  cephadm command line interface) to find these files.
@@ -245,12 +257,12 @@ available options.

      EOF
      $ ./cephadm bootstrap --config initial-ceph.conf ...

* The ``--ssh-user *<user>*`` option makes it possible to designate which SSH
  user cephadm will use to connect to hosts. The associated SSH key will be
  added to ``/home/*<user>*/.ssh/authorized_keys``. The user that you
  designate with this option must have passwordless sudo access.
* If you are using a container image from a registry that requires
  login, you may add the argument:

  * ``--registry-json <path to json file>``

@@ -261,7 +273,7 @@ available options.

  Cephadm will attempt to log in to this registry so it can pull your container
  and then store the login info in its config database. Other hosts added to
  the cluster will then also be able to make use of the authenticated container registry.
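
  For reference, a registry JSON file of this kind typically looks like the
  following sketch (field names assumed to follow the conventional
  ``url``/``username``/``password`` layout; all values are placeholders):

  .. code-block:: json

     {
       "url": "registry.example.com",
       "username": "myregistryusername",
       "password": "myregistrypassword"
     }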
* See :ref:`cephadm-deployment-scenarios` for additional examples for using ``cephadm bootstrap``.
@@ -326,7 +338,7 @@ Add all hosts to the cluster by following the instructions in

By default, a ``ceph.conf`` file and a copy of the ``client.admin`` keyring are
maintained in ``/etc/ceph`` on all hosts that have the ``_admin`` label. This
label is initially applied only to the bootstrap host. We recommend
that one or more other hosts be given the ``_admin`` label so that the Ceph CLI
(for example, via ``cephadm shell``) is easily accessible on multiple hosts. To add
the ``_admin`` label to additional host(s), run a command of the following form:
@@ -339,9 +351,10 @@ the ``_admin`` label to additional host(s), run a command of the following form:

Adding additional MONs
======================

A typical Ceph cluster has three or five Monitor daemons spread
across different hosts. We recommend deploying five
Monitors if there are five or more nodes in your cluster. Most clusters do not
benefit from seven or more Monitors.

Please follow :ref:`deploy_additional_monitors` to deploy additional MONs.
@@ -366,12 +379,12 @@ See :ref:`osd_autotune`.

To deploy hyperconverged Ceph with TripleO, please refer to the TripleO documentation: `Scenario: Deploy Hyperconverged Ceph <https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/cephadm.html#scenario-deploy-hyperconverged-ceph>`_

In other cases where the cluster hardware is not exclusively used by Ceph (converged infrastructure),
reduce the memory consumption of Ceph like so:

.. prompt:: bash #

   # converged only:
   ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.2

Then enable memory autotuning:
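
A minimal sketch of that step (the option name is the standard OSD memory
autotuning flag; confirm it against your release before applying):

.. prompt:: bash #

   ceph config set osd osd_memory_target_autotune true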
@@ -400,9 +413,11 @@ Different deployment scenarios

Single host
-----------

To deploy a Ceph cluster running on a single host, use the
``--single-host-defaults`` flag when bootstrapping. For use cases, see
:ref:`one-node-cluster`. Such clusters are generally not suitable for
production.

The ``--single-host-defaults`` flag sets the following configuration options::
@@ -419,8 +434,8 @@ Deployment in an isolated environment
-------------------------------------

You might need to install cephadm in an environment that is not connected
directly to the Internet (an "isolated" or "airgapped"
environment). This requires the use of a custom container registry. Either
of two kinds of custom container registry can be used in this scenario: (1) a
Podman-based or Docker-based insecure registry, or (2) a secure registry.
@@ -569,9 +584,9 @@ in order to have cephadm use them for SSHing between cluster hosts

Note that this setup does not require installing the corresponding public key
from the private key passed to bootstrap on other nodes. In fact, cephadm will
reject the ``--ssh-public-key`` argument when passed along with ``--ssh-signed-cert``.
This is not because having the public key breaks anything, but rather because it is not at all needed
and helps the bootstrap command determine whether the user wants the CA-signed
key setup or standard public key authentication. What this means is that SSH key rotation
would simply be a matter of getting another key signed by the same CA and providing
cephadm with the new private key and signed cert. No additional distribution of
keys to cluster nodes is needed after the initial setup of the CA key as a trusted key,
@@ -328,15 +328,15 @@ You can disable this health warning by running the following command:

Cluster Configuration Checks
----------------------------

Cephadm periodically scans each host in the cluster in order
to understand the state of the OS, disks, network interfaces, etc. This information can
then be analyzed for consistency across the hosts in the cluster to
identify any configuration anomalies.

Enabling Cluster Configuration Checks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These configuration checks are an **optional** feature, and are enabled
by running the following command:

.. prompt:: bash #
@@ -346,7 +346,7 @@ by running the following command:

States Returned by Cluster Configuration Checks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Configuration checks are triggered after each host scan. The
cephadm log entries will show the current state and outcome of the
configuration checks as follows:
@@ -383,14 +383,14 @@ To list all the configuration checks and their current states, run the following

   # ceph cephadm config-check ls
   NAME             HEALTHCHECK                      STATUS   DESCRIPTION
   kernel_security  CEPHADM_CHECK_KERNEL_LSM         enabled  check that SELINUX/Apparmor profiles are consistent across cluster hosts
   os_subscription  CEPHADM_CHECK_SUBSCRIPTION       enabled  check that subscription states are consistent for all cluster hosts
   public_network   CEPHADM_CHECK_PUBLIC_MEMBERSHIP  enabled  check that all hosts have a network interface on the Ceph public_network
   osd_mtu_size     CEPHADM_CHECK_MTU                enabled  check that OSD hosts share a common MTU setting
   osd_linkspeed    CEPHADM_CHECK_LINKSPEED          enabled  check that OSD hosts share a common network link speed
   network_missing  CEPHADM_CHECK_NETWORK_MISSING    enabled  check that the cluster/public networks as defined exist on the Ceph hosts
   ceph_release     CEPHADM_CHECK_CEPH_RELEASE       enabled  check for Ceph version consistency: all Ceph daemons should be the same release unless upgrade is in progress
   kernel_version   CEPHADM_CHECK_KERNEL_VERSION     enabled  checks that the maj.min version of the kernel is consistent across Ceph hosts

The name of each configuration check can be used to enable or disable a specific check by running a command of the following form:
::
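
A brief sketch of what such a command might look like (the ``enable`` and
``disable`` subcommand names are assumptions based on the listing above; check
``ceph cephadm config-check --help`` for the exact form on your release):

.. prompt:: bash #

   ceph cephadm config-check disable kernel_security
   ceph cephadm config-check enable kernel_security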
@@ -414,29 +414,29 @@ flagged as an anomaly and a healthcheck (WARNING) state raised.

CEPHADM_CHECK_SUBSCRIPTION
~~~~~~~~~~~~~~~~~~~~~~~~~~
This check relates to the status of OS vendor subscription. This check is
performed only for hosts using RHEL and helps to confirm that all hosts are
covered by an active subscription, which ensures that patches and updates are
available.

CEPHADM_CHECK_PUBLIC_MEMBERSHIP
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All members of the cluster should have a network interface configured on at least one of the
public network subnets. Hosts that are not on the public network will rely on
routing, which may affect performance.
CEPHADM_CHECK_MTU
~~~~~~~~~~~~~~~~~
The MTU of the network interfaces on OSD hosts can be a key factor in consistent performance. This
check examines hosts that are running OSD services to ensure that the MTU is
configured consistently within the cluster. This is done by determining
the MTU setting that the majority of hosts use. Any anomalies result in a
health check.

CEPHADM_CHECK_LINKSPEED
~~~~~~~~~~~~~~~~~~~~~~~
This check is similar to the MTU check. Link speed consistency is a factor in
consistent cluster performance, as is the MTU of the OSD node network interfaces.
This check determines the link speed shared by the majority of OSD hosts, and a
health check is run for any hosts that are set at a lower link speed rate.
@@ -448,15 +448,14 @@ a health check is raised.

CEPHADM_CHECK_CEPH_RELEASE
~~~~~~~~~~~~~~~~~~~~~~~~~~
Under normal operations, the Ceph cluster runs daemons that are of the same Ceph
release (for example, Reef). This check determines the active release for each daemon, and
reports any anomalies as a healthcheck. *This check is bypassed if an upgrade
is in progress.*

CEPHADM_CHECK_KERNEL_VERSION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The OS kernel version (maj.min) is checked for consistency across hosts.
The kernel version of the majority of the hosts is used as the basis for
identifying anomalies.
@@ -357,7 +357,9 @@ Or in YAML:

Placement by pattern matching
-----------------------------

Daemons can be placed on hosts using a host pattern as well.
By default, the host pattern is matched using fnmatch which supports
UNIX shell-style wildcards (see https://docs.python.org/3/library/fnmatch.html):

.. prompt:: bash #
@@ -385,6 +387,26 @@ Or in YAML:

    placement:
      host_pattern: "*"

The host pattern also has support for using a regex. To use a regex, you
must either add "regex: " to the start of the pattern when using the
command line, or specify a ``pattern_type`` field to be "regex"
when using YAML.

On the command line:

.. prompt:: bash #

   ceph orch apply prometheus --placement='regex:FOO[0-9]|BAR[0-9]'

In YAML:

.. code-block:: yaml

   service_type: prometheus
   placement:
     host_pattern:
       pattern: 'FOO[0-9]|BAR[0-9]'
       pattern_type: regex
Changing the number of daemons
------------------------------
@@ -83,6 +83,37 @@ steps below:

      ceph orch apply grafana

Enabling security for the monitoring stack
-------------------------------------------

By default, in a cephadm-managed cluster, the monitoring components are set up and configured without enabling security measures.
While this suffices for certain deployments, others with strict security needs may find it necessary to protect the
monitoring stack against unauthorized access. In such cases, cephadm relies on a specific configuration parameter,
``mgr/cephadm/secure_monitoring_stack``, which toggles the security settings for all monitoring components. To activate security
measures, set this option to ``true`` with a command of the following form:

.. prompt:: bash #

   ceph config set mgr mgr/cephadm/secure_monitoring_stack true

This change will trigger a sequence of reconfigurations across all monitoring daemons, typically requiring
a few minutes until all components are fully operational. The updated secure configuration includes the following modifications:

#. Prometheus: basic authentication is required to access the web portal and TLS is enabled for secure communication.
#. Alertmanager: basic authentication is required to access the web portal and TLS is enabled for secure communication.
#. Node Exporter: TLS is enabled for secure communication.
#. Grafana: TLS is enabled and authentication is required to access the datasource information.
In this secure setup, users will need to set up authentication
(username/password) for both Prometheus and Alertmanager. By default the
username and password are set to ``admin``/``admin``. The user can change these
values with the commands ``ceph orch prometheus set-credentials`` and ``ceph
orch alertmanager set-credentials`` respectively. These commands offer the
flexibility to input the username/password either as parameters or via a JSON
file, which enhances security. Additionally, Cephadm provides the commands
``orch prometheus get-credentials`` and ``orch alertmanager get-credentials`` to
retrieve the current credentials.
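
An illustrative sketch only (the exact argument handling is an assumption;
consult ``ceph orch prometheus set-credentials --help`` on your release):

.. prompt:: bash #

   # assumed form: username and password supplied as parameters
   ceph orch prometheus set-credentials myuser mypassword
   # read back what is currently stored
   ceph orch prometheus get-credentials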
.. _cephadm-monitoring-centralized-logs:

Centralized Logging in Ceph
@@ -15,7 +15,7 @@ Deploying NFS ganesha
=====================

Cephadm deploys an NFS Ganesha daemon (or a set of daemons). The configuration for
NFS is stored in the ``.nfs`` pool and exports are managed via the
``ceph nfs export ...`` commands and via the dashboard.

To deploy an NFS Ganesha gateway, run the following command:
@@ -232,7 +232,7 @@ Remove an OSD

Removing an OSD from a cluster involves two steps:

#. evacuating all placement groups (PGs) from the OSD
#. removing the PG-free OSD from the cluster

The following command performs these two steps:
@@ -246,6 +246,7 @@ It is a yaml format file with the following properties:

    virtual_interface_networks: [ ... ] # optional: list of CIDR networks
    use_keepalived_multicast: <bool> # optional: Default is False.
    vrrp_interface_network: <string>/<string> # optional: ex: 192.168.20.0/24
    health_check_interval: <string> # optional: Default is 2s.
    ssl_cert: |   # optional: SSL certificate and key
      -----BEGIN CERTIFICATE-----
      ...
@@ -273,6 +274,7 @@ It is a yaml format file with the following properties:

    monitor_port: <integer> # ex: 1967, used by haproxy for load balancer status
    virtual_interface_networks: [ ... ] # optional: list of CIDR networks
    first_virtual_router_id: <integer> # optional: default 50
    health_check_interval: <string> # optional: Default is 2s.
    ssl_cert: |   # optional: SSL certificate and key
      -----BEGIN CERTIFICATE-----
      ...
@@ -321,6 +323,9 @@ where the properties of this service specification are:

    keepalived will have different virtual_router_id. In the case of using ``virtual_ips_list``,
    each IP will create its own virtual router. So the first one will have ``first_virtual_router_id``,
    second one will have ``first_virtual_router_id`` + 1, etc. Valid values go from 1 to 255.
* ``health_check_interval``
    Default is 2 seconds. This parameter sets the interval between the health checks
    performed by haproxy against the backend servers.
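
For illustration only (the service names, host, ports, and virtual IP below
are assumed placeholders), a spec that overrides the interval might look like
this:

.. code-block:: yaml

   service_type: ingress
   service_id: rgw.myrgw
   placement:
     hosts:
       - host1
   spec:
     backend_service: rgw.myrgw
     virtual_ip: 192.168.20.10/24
     frontend_port: 8080
     monitor_port: 1967
     health_check_interval: 5s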
.. _ingress-virtual-ip:
@@ -32,7 +32,7 @@ completely by running the following commands:

   ceph orch set backend ''
   ceph mgr module disable cephadm

These commands disable all ``ceph orch ...`` CLI commands. All
previously deployed daemon containers continue to run and will start just as
they were before you ran these commands.

@@ -56,7 +56,7 @@ following form:

   ceph orch ls --service_name=<service-name> --format yaml

This will return information in the following form:

.. code-block:: yaml
@@ -252,16 +252,17 @@ For more detail on operations of this kind, see

Accessing the Admin Socket
--------------------------

Each Ceph daemon provides an admin socket that allows runtime option setting and statistic reading. See
:ref:`rados-monitoring-using-admin-socket`.

#. To access the admin socket, enter the daemon container on the host::

    [root@mon1 ~]# cephadm enter --name <daemon-name>

#. Run a command of the following forms to see the admin socket's configuration and other available actions::

    [ceph: root@mon1 /]# ceph --admin-daemon /var/run/ceph/ceph-<daemon-name>.asok config show
    [ceph: root@mon1 /]# ceph --admin-daemon /var/run/ceph/ceph-<daemon-name>.asok help
Running Various Ceph Tools
--------------------------------

@@ -444,11 +445,11 @@ Running repeated debugging sessions

When using ``cephadm shell``, as in the example above, any changes made to the
container that is spawned by the shell command are ephemeral. After the shell
session exits, the files that were downloaded and installed cease to be
available. You can simply re-run the same commands every time ``cephadm shell``
is invoked, but to save time and resources you can create a new container image
and use it for repeated debugging sessions.

In the following example, we create a simple file that constructs the
container image. The command below uses podman but it is expected to work
correctly even if ``podman`` is replaced with ``docker``::
@@ -463,14 +464,14 @@ correctly even if ``podman`` is replaced with ``docker``::

The above file creates a new local image named ``ceph:debugging``. This image
can be used on the same machine that built it. The image can also be pushed to
a container repository or saved and copied to a node that is running other Ceph
containers. See the ``podman`` or ``docker`` documentation for more
information about the container workflow.

After the image has been built, it can be used to initiate repeat debugging
sessions. By using an image in this way, you avoid the trouble of having to
re-install the debug tools and the debuginfo packages every time you need to
run a debug session. To debug a core file using this image, in the same way as
previously described, run:

.. prompt:: bash #
@@ -2,7 +2,7 @@
Upgrading Ceph
==============

Cephadm can safely upgrade Ceph from one point release to the next. For
example, you can upgrade from v15.2.0 (the first Octopus release) to the next
point release, v15.2.1.
@@ -137,25 +137,25 @@ UPGRADE_NO_STANDBY_MGR
----------------------

This alert (``UPGRADE_NO_STANDBY_MGR``) means that Ceph does not detect an
active standby Manager daemon. In order to proceed with the upgrade, Ceph
requires an active standby Manager daemon (which you can think of in this
context as "a second manager").

You can ensure that Cephadm is configured to run two (or more) Managers by
running the following command:

.. prompt:: bash #

   ceph orch apply mgr 2 # or more

You can check the status of existing Manager daemons by running the following
command:

.. prompt:: bash #

   ceph orch ps --daemon-type mgr

If an existing Manager daemon has stopped, you can try to restart it by running the
following command:

.. prompt:: bash #
@@ -183,7 +183,7 @@ Using customized container images
=================================

For most users, upgrading requires nothing more complicated than specifying the
Ceph version to which to upgrade. In such cases, cephadm locates the specific
Ceph container image to use by combining the ``container_image_base``
configuration option (default: ``docker.io/ceph/ceph``) with a tag of
``vX.Y.Z``.
@@ -1,11 +1,13 @@
.. _cephfs_add_remote_mds:

.. warning:: The material on this page is to be used only for manually setting
   up a Ceph cluster. If you intend to use an automated tool such as
   :doc:`/cephadm/index` to set up a Ceph cluster, do not use the
   instructions on this page.

.. note:: If you are certain that you know what you are doing and you intend to
   manually deploy MDS daemons, see :doc:`/cephadm/services/mds/` before
   proceeding.

============================
Deploying Metadata Servers
@@ -258,31 +258,47 @@ Clients that are missing newly added features will be evicted automatically.

Here are the current CephFS features and the first release in which they appeared:

+----------------------------+--------------+-----------------+
| Feature                    | Ceph release | Upstream Kernel |
+============================+==============+=================+
| jewel                      | jewel        | 4.5             |
+----------------------------+--------------+-----------------+
| kraken                     | kraken       | 4.13            |
+----------------------------+--------------+-----------------+
| luminous                   | luminous     | 4.13            |
+----------------------------+--------------+-----------------+
| mimic                      | mimic        | 4.19            |
+----------------------------+--------------+-----------------+
| reply_encoding             | nautilus     | 5.1             |
+----------------------------+--------------+-----------------+
| reclaim_client             | nautilus     | N/A             |
+----------------------------+--------------+-----------------+
| lazy_caps_wanted           | nautilus     | 5.1             |
+----------------------------+--------------+-----------------+
| multi_reconnect            | nautilus     | 5.1             |
+----------------------------+--------------+-----------------+
| deleg_ino                  | octopus      | 5.6             |
+----------------------------+--------------+-----------------+
| metric_collect             | pacific      | N/A             |
+----------------------------+--------------+-----------------+
| alternate_name             | pacific      | 6.5             |
+----------------------------+--------------+-----------------+
| notify_session_state       | quincy       | 5.19            |
+----------------------------+--------------+-----------------+
| op_getvxattr               | quincy       | 6.0             |
+----------------------------+--------------+-----------------+
| 32bits_retry_fwd           | reef         | 6.6             |
+----------------------------+--------------+-----------------+
| new_snaprealm_info         | reef         | UNKNOWN         |
+----------------------------+--------------+-----------------+
| has_owner_uidgid           | reef         | 6.6             |
+----------------------------+--------------+-----------------+
| client_mds_auth_caps       | squid+bp     | PLANNED         |
+----------------------------+--------------+-----------------+

..
   Comment: use `git describe --tags --abbrev=0 <commit>` to lookup release

CephFS Feature Descriptions
@@ -340,6 +356,15 @@ Clients can send performance metric to MDS if MDS support this feature.

Clients can set and understand "alternate names" for directory entries. This is
to be used for encrypted file name support.

::

    client_mds_auth_caps

To effectively implement ``root_squash`` in a client's ``mds`` caps, the client
must understand that it is enforcing ``root_squash`` and other cap metadata.
Clients without this feature are in danger of dropping updates to files. It is
recommended to set this feature bit.

Global settings
---------------
@@ -47,4 +47,4 @@ client cache.

    | MDSs | -=-------> | OSDs |
    +---------------------+                 +--------------------+

.. _Architecture: ../../architecture
@@ -93,6 +93,15 @@ providing high-availability.

.. note:: Deploying a single mirror daemon is recommended. Running multiple
   daemons is untested.

The following file types are supported by the mirroring:

- Regular files (-)
- Directory files (d)
- Symbolic link files (l)

Other file types are ignored by the mirroring, so they won't be
available on a successfully synchronized peer.

The mirroring module is disabled by default. To enable the mirroring module,
run the following command:
@@ -63,6 +63,62 @@ By default, `cephfs-top` uses `client.fstop` user to connect to a Ceph cluster::

    $ ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r'
    $ cephfs-top

Description of Fields
---------------------

1. chit : Cap hit
   Percentage of file capability hits over total number of caps

2. dlease : Dentry lease
   Percentage of dentry leases handed out over the total dentry lease requests

3. ofiles : Opened files
   Number of opened files

4. oicaps : Pinned caps
   Number of pinned caps

5. oinodes : Opened inodes
   Number of opened inodes

6. rtio : Total size of read IOs
   Number of bytes read in input/output operations generated by all processes

7. wtio : Total size of write IOs
   Number of bytes written in input/output operations generated by all processes

8. raio : Average size of read IOs
   Mean of the number of bytes read in input/output operations generated by all
   processes over total IO done

9. waio : Average size of write IOs
   Mean of the number of bytes written in input/output operations generated by all
   processes over total IO done

10. rsp : Read speed
    Speed of read IOs with respect to the duration since the last refresh of clients

11. wsp : Write speed
    Speed of write IOs with respect to the duration since the last refresh of clients

12. rlatavg : Average read latency
    Mean value of the read latencies

13. rlatsd : Standard deviation (variance) for read latency
    Dispersion of the metric for the read latency relative to its mean

14. wlatavg : Average write latency
    Mean value of the write latencies

15. wlatsd : Standard deviation (variance) for write latency
    Dispersion of the metric for the write latency relative to its mean

16. mlatavg : Average metadata latency
    Mean value of the metadata latencies

17. mlatsd : Standard deviation (variance) for metadata latency
    Dispersion of the metric for the metadata latency relative to its mean

Command-Line Options
--------------------
@@ -259,3 +259,121 @@ Following is an example of enabling root_squash in a filesystem except within

    caps mds = "allow rw fsname=a root_squash, allow rw fsname=a path=/volumes"
    caps mon = "allow r fsname=a"
    caps osd = "allow rw tag cephfs data=a"
Updating Capabilities using ``fs authorize``
============================================

Since the Reef release, ``fs authorize`` can not only be used to create a
new client with caps for a CephFS, but it can also be used to add new caps
(for another CephFS or another path in the same FS) to an already existing
client.

Let's say we run the following and create a new client::

    $ ceph fs authorize a client.x / rw
    [client.x]
        key = AQAOtSVk9WWtIhAAJ3gSpsjwfIQ0gQ6vfSx/0w==
    $ ceph auth get client.x
    [client.x]
        key = AQAOtSVk9WWtIhAAJ3gSpsjwfIQ0gQ6vfSx/0w==
        caps mds = "allow rw fsname=a"
        caps mon = "allow r fsname=a"
        caps osd = "allow rw tag cephfs data=a"

Previously, running ``fs authorize a client.x / rw`` a second time used to
print an error message. After Reef, it instead prints a message saying that
there is no update::

    $ ./bin/ceph fs authorize a client.x / rw
    no update for caps of client.x
Adding New Caps Using ``fs authorize``
--------------------------------------

Users can now add caps for another path in the same CephFS::

    $ ceph fs authorize a client.x /dir1 rw
    updated caps for client.x
    $ ceph auth get client.x
    [client.x]
        key = AQAOtSVk9WWtIhAAJ3gSpsjwfIQ0gQ6vfSx/0w==
        caps mds = "allow r fsname=a, allow rw fsname=a path=some/dir"
        caps mon = "allow r fsname=a"
        caps osd = "allow rw tag cephfs data=a"

And even add caps for another CephFS on the Ceph cluster::

    $ ceph fs authorize b client.x / rw
    updated caps for client.x
    $ ceph auth get client.x
    [client.x]
        key = AQD6tiVk0uJdARAABMaQuLRotxTi3Qdj47FkBA==
        caps mds = "allow rw fsname=a, allow rw fsname=b"
        caps mon = "allow r fsname=a, allow r fsname=b"
        caps osd = "allow rw tag cephfs data=a, allow rw tag cephfs data=b"
Changing rw permissions in caps
-------------------------------

It's not possible to modify caps by running ``fs authorize``, except for the
case when read/write permissions have to be changed. This is because
``fs authorize`` would otherwise become ambiguous. For example, a user runs ``fs authorize
cephfs1 client.x /dir1 rw`` to create a client and then runs ``fs authorize
cephfs1 client.x /dir2 rw`` (notice ``/dir1`` is changed to ``/dir2``).
Running the second command could be interpreted as changing ``/dir1`` to ``/dir2``
in the current cap, or it could be interpreted as authorizing the client with a
new cap for path ``/dir2``. As seen in previous sections, the second
interpretation is chosen, and therefore it's impossible to update any part of
a granted capability except its read/write permissions. Following is how read/write
permissions for ``client.x`` (that was created above) can be changed::

    $ ceph fs authorize a client.x / r
    [client.x]
        key = AQBBKjBkIFhBDBAA6q5PmDDWaZtYjd+jafeVUQ==
    $ ceph auth get client.x
    [client.x]
        key = AQBBKjBkIFhBDBAA6q5PmDDWaZtYjd+jafeVUQ==
        caps mds = "allow r fsname=a"
        caps mon = "allow r fsname=a"
        caps osd = "allow r tag cephfs data=a"
``fs authorize`` never deducts any part of caps
-----------------------------------------------

It's not possible to remove caps issued to a client by running ``fs
authorize`` again. For example, if a client cap has ``root_squash`` applied
on a certain CephFS, running ``fs authorize`` again for the same CephFS but
without ``root_squash`` will not lead to any update; the client caps will
remain unchanged::

    $ ceph fs authorize a client.x / rw root_squash
    [client.x]
        key = AQD61CVkcA1QCRAAd0XYqPbHvcc+lpUAuc6Vcw==
    $ ceph auth get client.x
    [client.x]
        key = AQD61CVkcA1QCRAAd0XYqPbHvcc+lpUAuc6Vcw==
        caps mds = "allow rw fsname=a root_squash"
        caps mon = "allow r fsname=a"
        caps osd = "allow rw tag cephfs data=a"
    $ ceph fs authorize a client.x / rw
    [client.x]
        key = AQD61CVkcA1QCRAAd0XYqPbHvcc+lpUAuc6Vcw==
    no update was performed for caps of client.x. caps of client.x remains unchanged.

And if a client already has caps for FS name ``a`` and path ``dir1``,
running ``fs authorize`` again for FS name ``a`` but path ``dir2`` will grant a
new cap for ``dir2`` instead of modifying the caps the client already holds::

    $ ceph fs authorize a client.x /dir1 rw
    $ ceph auth get client.x
    [client.x]
        key = AQC1tyVknMt+JxAAp0pVnbZGbSr/nJrmkMNKqA==
        caps mds = "allow rw fsname=a path=/dir1"
        caps mon = "allow r fsname=a"
        caps osd = "allow rw tag cephfs data=a"
    $ ceph fs authorize a client.x /dir2 rw
    updated caps for client.x
    $ ceph auth get client.x
    [client.x]
        key = AQC1tyVknMt+JxAAp0pVnbZGbSr/nJrmkMNKqA==
        caps mds = "allow rw fsname=a path=dir1, allow rw fsname=a path=dir2"
        caps mon = "allow r fsname=a"
        caps osd = "allow rw tag cephfs data=a"
@@ -15,7 +15,7 @@ Advanced: Metadata repair tools

file system before attempting to repair it.

If you do not have access to professional support for your cluster,
consult the ceph-users mailing list or the #ceph IRC/Slack channel.

Journal export
@@ -501,10 +501,14 @@ To initiate a clone operation use::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name>

.. note:: The ``subvolume snapshot clone`` command depends upon the config option ``snapshot_clone_no_wait``, described below.

If a snapshot (source subvolume) is a part of non-default group, the group name needs to be specified::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name> --group_name <subvol_group_name>

Cloned subvolumes can be a part of a different group than the source snapshot (by default, cloned subvolumes are created in default group). To clone to a particular group use::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name> --target_group_name <subvol_group_name>
@@ -513,13 +517,15 @@ Similar to specifying a pool layout when creating a subvolume, pool layout can b

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name> --pool_layout <pool_layout>

To check the status of a clone operation use:

.. prompt:: bash #

   ceph fs clone status <vol_name> <clone_name> [--group_name <group_name>]

A clone can be in one of the following states:
@ -616,6 +622,31 @@ On successful cancellation, the cloned subvolume is moved to the ``canceled`` st
.. note:: The canceled cloned may be deleted by supplying the ``--force`` option to the `fs subvolume rm` command. .. note:: The canceled cloned may be deleted by supplying the ``--force`` option to the `fs subvolume rm` command.
Configurables
~~~~~~~~~~~~~
Configure the maximum number of concurrent clone operations. The default is 4:
.. prompt:: bash #
ceph config set mgr mgr/volumes/max_concurrent_clones <value>
Configure the snapshot_clone_no_wait option :
The ``snapshot_clone_no_wait`` config option is used to reject clone creation requests when cloner threads
(which can be configured using above option i.e. ``max_concurrent_clones``) are not available.
It is enabled by default i.e. the value set is True, whereas it can be configured by using below command.
.. prompt:: bash #
ceph config set mgr mgr/volumes/snapshot_clone_no_wait <bool>
The current value of ``snapshot_clone_no_wait`` can be fetched with the following command:
.. prompt:: bash #
ceph config get mgr mgr/volumes/snapshot_clone_no_wait
.. _subvol-pinning: .. _subvol-pinning:

View File

@ -130,7 +130,9 @@ other daemons, please see :ref:`health-checks`.
from properly cleaning up resources used by client requests. This message from properly cleaning up resources used by client requests. This message
appears if a client appears to have more than ``max_completed_requests`` appears if a client appears to have more than ``max_completed_requests``
(default 100000) requests that are complete on the MDS side but haven't (default 100000) requests that are complete on the MDS side but haven't
yet been accounted for in the client's *oldest tid* value. yet been accounted for in the client's *oldest tid* value. The last tid
used by the MDS to trim completed client requests (or flush) is included
in the output of the `session ls` (or `client ls`) command as a debug aid.
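For example, the per-session completed-request state can be inspected with a query along these lines (illustrative only; ``<fs_name>`` is a placeholder and the exact output fields may vary by release)::

    ceph tell mds.<fs_name>:0 session ls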
``MDS_DAMAGE`` ``MDS_DAMAGE``
-------------- --------------
@ -238,3 +240,32 @@ other daemons, please see :ref:`health-checks`.
Description Description
All MDS ranks are unavailable resulting in the file system to be completely All MDS ranks are unavailable resulting in the file system to be completely
offline. offline.
``MDS_CLIENTS_LAGGY``
----------------------------
Message
"Client *ID* is laggy; not evicted because some OSD(s) is/are laggy"
Description
If one or more OSDs are laggy (due to conditions such as a network cut-off),
clients may become laggy as well (sessions may go idle or may be unable to
flush dirty data for cap revokes). If ``defer_client_eviction_on_laggy_osds``
is set to true (the default), client eviction will not take place and this
health warning will be generated instead.
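If evicting laggy clients is preferred over receiving this warning, the behaviour can be changed by disabling the option (illustrative only; this assumes the option is applied cluster-wide at the ``mds`` level)::

    ceph config set mds defer_client_eviction_on_laggy_osds false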
``MDS_CLIENTS_BROKEN_ROOTSQUASH``
---------------------------------
Message
"X client(s) with broken root_squash implementation (MDS_CLIENTS_BROKEN_ROOTSQUASH)"
Description
A bug was discovered in root_squash that could cause changes made by a client
restricted with root_squash caps to be lost. The fix required a change to the
protocol, and a client upgrade is required.
This is a HEALTH_ERR warning because of the danger of inconsistency and lost
data. It is recommended to either upgrade your clients, discontinue using
root_squash in the interim, or silence the warning if desired.
To evict and permanently block broken clients from connecting to the
cluster, set the ``required_client_feature`` bit ``client_mds_auth_caps``.
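For example, based on the feature bit named above, a command along these lines could be used (illustrative only; ``<fs_name>`` is a placeholder)::

    ceph fs required_client_features <fs_name> add client_mds_auth_caps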

View File

@ -116,7 +116,7 @@ The mechanism provided for this purpose is called an ``export pin``, an
extended attribute of directories. The name of this extended attribute is extended attribute of directories. The name of this extended attribute is
``ceph.dir.pin``. Users can set this attribute using standard commands: ``ceph.dir.pin``. Users can set this attribute using standard commands:
:: .. prompt:: bash #
setfattr -n ceph.dir.pin -v 2 path/to/dir setfattr -n ceph.dir.pin -v 2 path/to/dir
@ -128,7 +128,7 @@ pin. In this way, setting the export pin on a directory affects all of its
children. However, the parents pin can be overridden by setting the child children. However, the parents pin can be overridden by setting the child
directory's export pin. For example: directory's export pin. For example:
:: .. prompt:: bash #
mkdir -p a/b mkdir -p a/b
# "a" and "a/b" both start without an export pin set # "a" and "a/b" both start without an export pin set
@ -173,7 +173,7 @@ immediate children across a range of MDS ranks. The canonical example use-case
would be the ``/home`` directory: we want every user's home directory to be would be the ``/home`` directory: we want every user's home directory to be
spread across the entire MDS cluster. This can be set via: spread across the entire MDS cluster. This can be set via:
:: .. prompt:: bash #
setfattr -n ceph.dir.pin.distributed -v 1 /cephfs/home setfattr -n ceph.dir.pin.distributed -v 1 /cephfs/home
@ -183,7 +183,7 @@ may be ephemerally pinned. This is set through the extended attribute
``ceph.dir.pin.random`` with the value set to the percentage of directories ``ceph.dir.pin.random`` with the value set to the percentage of directories
that should be pinned. For example: that should be pinned. For example:
:: .. prompt:: bash #
setfattr -n ceph.dir.pin.random -v 0.5 /cephfs/tmp setfattr -n ceph.dir.pin.random -v 0.5 /cephfs/tmp
@ -205,7 +205,7 @@ Ephemeral pins may override parent export pins and vice versa. What determines
which policy is followed is the rule of the closest parent: if a closer parent which policy is followed is the rule of the closest parent: if a closer parent
directory has a conflicting policy, use that one instead. For example: directory has a conflicting policy, use that one instead. For example:
:: .. prompt:: bash #
mkdir -p foo/bar1/baz foo/bar2 mkdir -p foo/bar1/baz foo/bar2
setfattr -n ceph.dir.pin -v 0 foo setfattr -n ceph.dir.pin -v 0 foo
@ -217,7 +217,7 @@ directory will obey the pin on ``foo`` normally.
For the reverse situation: For the reverse situation:
:: .. prompt:: bash #
mkdir -p home/{patrick,john} mkdir -p home/{patrick,john}
setfattr -n ceph.dir.pin.distributed -v 1 home setfattr -n ceph.dir.pin.distributed -v 1 home
@ -229,7 +229,8 @@ because its export pin overrides the policy on ``home``.
To remove a partitioning policy, remove the respective extended attribute To remove a partitioning policy, remove the respective extended attribute
or set the value to 0. or set the value to 0.
.. code::bash .. prompt:: bash #
$ setfattr -n ceph.dir.pin.distributed -v 0 home $ setfattr -n ceph.dir.pin.distributed -v 0 home
# or # or
$ setfattr -x ceph.dir.pin.distributed home $ setfattr -x ceph.dir.pin.distributed home
@ -237,10 +238,36 @@ or set the value to 0.
For export pins, remove the extended attribute or set the extended attribute For export pins, remove the extended attribute or set the extended attribute
value to `-1`. value to `-1`.
.. code::bash .. prompt:: bash #
$ setfattr -n ceph.dir.pin -v -1 home $ setfattr -n ceph.dir.pin -v -1 home
Dynamic Subtree Partitioning
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CephFS has long had a dynamic metadata balancer (sometimes called the "default
balancer") which can split or merge subtrees while placing them on "colder" MDS
ranks. Moving the metadata around can improve overall file system throughput
and cache size.
However, the balancer has suffered from problems with efficiency and performance,
so it is turned off by default. This avoids a situation in which an administrator
"turns on multimds" by increasing the ``max_mds`` setting and then finds that the
balancer has made a mess of the cluster's performance (reverting is straightforward
but can take time).
The setting to turn on the balancer is:
.. prompt:: bash #
ceph fs set <fs_name> balance_automate true
Turning on the balancer should only be done with appropriate configuration,
such as with the ``bal_rank_mask`` setting (described below). Careful
monitoring of the file system performance and MDS is advised.
Dynamic subtree partitioning with Balancer on specific ranks Dynamic subtree partitioning with Balancer on specific ranks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -260,27 +287,27 @@ static pinned subtrees.
This option can be configured with the ``ceph fs set`` command. For example: This option can be configured with the ``ceph fs set`` command. For example:
:: .. prompt:: bash #
ceph fs set <fs_name> bal_rank_mask <hex> ceph fs set <fs_name> bal_rank_mask <hex>
Each bitfield of the ``<hex>`` number represents a dedicated rank. If the ``<hex>`` is Each bitfield of the ``<hex>`` number represents a dedicated rank. If the ``<hex>`` is
set to ``0x3``, the balancer runs on active ``0`` and ``1`` ranks. For example: set to ``0x3``, the balancer runs on active ``0`` and ``1`` ranks. For example:
:: .. prompt:: bash #
ceph fs set <fs_name> bal_rank_mask 0x3 ceph fs set <fs_name> bal_rank_mask 0x3
If the ``bal_rank_mask`` is set to ``-1`` or ``all``, all active ranks are masked If the ``bal_rank_mask`` is set to ``-1`` or ``all``, all active ranks are masked
and utilized by the balancer. As an example: and utilized by the balancer. As an example:
:: .. prompt:: bash #
ceph fs set <fs_name> bal_rank_mask -1 ceph fs set <fs_name> bal_rank_mask -1
On the other hand, if the balancer needs to be disabled, On the other hand, if the balancer needs to be disabled,
the ``bal_rank_mask`` should be set to ``0x0``. For example: the ``bal_rank_mask`` should be set to ``0x0``. For example:
:: .. prompt:: bash #
ceph fs set <fs_name> bal_rank_mask 0x0 ceph fs set <fs_name> bal_rank_mask 0x0

View File

@ -21,6 +21,14 @@ value::
setfattr -n ceph.quota.max_bytes -v 100000000 /some/dir # 100 MB setfattr -n ceph.quota.max_bytes -v 100000000 /some/dir # 100 MB
setfattr -n ceph.quota.max_files -v 10000 /some/dir # 10,000 files setfattr -n ceph.quota.max_files -v 10000 /some/dir # 10,000 files
``ceph.quota.max_bytes`` can also be set using human-friendly units::
setfattr -n ceph.quota.max_bytes -v 100K /some/dir # 100 KiB
setfattr -n ceph.quota.max_bytes -v 5Gi /some/dir # 5 GiB
.. note:: Values are strictly interpreted as IEC units even when SI units
   are given as input; for example, ``1K`` is interpreted as 1024 bytes.
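As a quick check of the unit handling described in the note above, the stored value can be read back with ``getfattr`` (illustrative only; ``/some/dir`` is a placeholder path)::

    setfattr -n ceph.quota.max_bytes -v 1K /some/dir
    getfattr -n ceph.quota.max_bytes /some/dir    # expected to report 1024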
To view quota limit:: To view quota limit::
$ getfattr -n ceph.quota.max_bytes /some/dir $ getfattr -n ceph.quota.max_bytes /some/dir

View File

@ -30,9 +30,9 @@ assumed to be keyword arguments too.
Snapshot schedules are identified by path, their repeat interval and their start Snapshot schedules are identified by path, their repeat interval and their start
time. The time. The
repeat interval defines the time between two subsequent snapshots. It is repeat interval defines the time between two subsequent snapshots. It is
specified by a number and a period multiplier, one of `h(our)`, `d(ay)` and specified by a number and a period multiplier, one of `h(our)`, `d(ay)`,
`w(eek)`. E.g. a repeat interval of `12h` specifies one snapshot every 12 `w(eek)`, `M(onth)` and `Y(ear)`. E.g. a repeat interval of `12h` specifies one
hours. snapshot every 12 hours.
The start time is specified as a time string (more details about passing times The start time is specified as a time string (more details about passing times
below). By default below). By default
the start time is last midnight. So when a snapshot schedule with repeat the start time is last midnight. So when a snapshot schedule with repeat
@ -52,8 +52,8 @@ space or concatenated pairs of `<number><time period>`.
The semantics are that a spec will ensure `<number>` snapshots are kept that are The semantics are that a spec will ensure `<number>` snapshots are kept that are
at least `<time period>` apart. For Example `7d` means the user wants to keep 7 at least `<time period>` apart. For Example `7d` means the user wants to keep 7
snapshots that are at least one day (but potentially longer) apart from each other. snapshots that are at least one day (but potentially longer) apart from each other.
The following time periods are recognized: `h(our), d(ay), w(eek), m(onth), The following time periods are recognized: `h(our)`, `d(ay)`, `w(eek)`, `M(onth)`,
y(ear)` and `n`. The latter is a special modifier where e.g. `10n` means keep `Y(ear)` and `n`. The latter is a special modifier where e.g. `10n` means keep
the last 10 snapshots regardless of timing, the last 10 snapshots regardless of timing,
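As an illustration of how a repeat interval and a retention spec fit together, a schedule and a matching retention policy might be added along these lines (illustrative only; ``/some/path`` is a placeholder)::

    ceph fs snap-schedule add /some/path 12h
    ceph fs snap-schedule retention add /some/path 7d4w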
All subcommands take optional `fs` argument to specify paths in All subcommands take optional `fs` argument to specify paths in

View File

@ -118,10 +118,16 @@ enforces this affinity.
When failing over MDS daemons, a cluster's monitors will prefer standby daemons with When failing over MDS daemons, a cluster's monitors will prefer standby daemons with
``mds_join_fs`` equal to the file system ``name`` with the failed ``rank``. If no ``mds_join_fs`` equal to the file system ``name`` with the failed ``rank``. If no
standby exists with ``mds_join_fs`` equal to the file system ``name``, it will standby exists with ``mds_join_fs`` equal to the file system ``name``, it will
choose an unqualified standby (no setting for ``mds_join_fs``) for the replacement, choose an unqualified standby (no setting for ``mds_join_fs``) for the replacement.
or any other available standby, as a last resort. Note, this does not change the As a last resort, a standby for another filesystem will be chosen, although this
behavior that ``standby-replay`` daemons are always selected before behavior can be disabled:
other standbys.
::
ceph fs set <fs name> refuse_standby_for_another_fs true
Note, configuring MDS file system affinity does not change the behavior that
``standby-replay`` daemons are always selected before other standbys.
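As an illustration of configuring the affinity itself, ``mds_join_fs`` is set per MDS daemon (illustrative only; ``mds.a`` and ``cephfs`` are placeholder names)::

    ceph config set mds.a mds_join_fs cephfs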
Even further, the monitors will regularly examine the CephFS file systems even when Even further, the monitors will regularly examine the CephFS file systems even when
stable to check if a standby with stronger affinity is available to replace an stable to check if a standby with stronger affinity is available to replace an

View File

@ -401,3 +401,64 @@ own copy of the cephadm "binary" use the script located at
``./src/cephadm/build.py [output]``. ``./src/cephadm/build.py [output]``.
.. _Python Zip Application: https://peps.python.org/pep-0441/ .. _Python Zip Application: https://peps.python.org/pep-0441/
You can pass a limited set of version metadata values to be stored in the
compiled cephadm. These options can be passed to the build script with
the ``--set-version-var`` or ``-S`` option. The values should take the form
``KEY=VALUE`` and valid keys include:
* ``CEPH_GIT_VER``
* ``CEPH_GIT_NICE_VER``
* ``CEPH_RELEASE``
* ``CEPH_RELEASE_NAME``
* ``CEPH_RELEASE_TYPE``
Example: ``./src/cephadm/build.py -SCEPH_GIT_VER=$(git rev-parse HEAD) -SCEPH_GIT_NICE_VER=$(git describe) /tmp/cephadm``
Typically these values will be passed to build.py by other, higher-level build
tools, such as cmake.
The compiled version of the binary may include a curated set of dependencies
within the zipapp. The tool used to fetch the bundled dependencies can be
Python's ``pip``, locally installed RPMs, or bundled dependencies can be
disabled. To select the mode for bundled dependencies use the
``--bundled-dependencies`` or ``-B`` option with a value of ``pip``, ``rpm``,
or ``none``.
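For instance, based on the options described above, a zipapp with pip-bundled dependencies could be produced with something like the following (illustrative only; ``/tmp/cephadm`` is a placeholder output path)::

    ./src/cephadm/build.py --bundled-dependencies pip /tmp/cephadm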
The compiled cephadm zipapp file retains metadata about how it was built. This
can be displayed by running ``cephadm version --verbose``. The command will
emit a JSON formatted object showing version metadata (if available), a list of
the bundled dependencies generated by the build script (if bundled dependencies
were enabled), and a summary of the top-level contents of the zipapp. Example::
$ ./cephadm version --verbose
{
"name": "cephadm",
"ceph_git_nice_ver": "18.0.0-6867-g6a1df2d0b01",
"ceph_git_ver": "6a1df2d0b01da581bfef3357940e1e88d5ce70ce",
"ceph_release_name": "reef",
"ceph_release_type": "dev",
"bundled_packages": [
{
"name": "Jinja2",
"version": "3.1.2",
"package_source": "pip",
"requirements_entry": "Jinja2 == 3.1.2"
},
{
"name": "MarkupSafe",
"version": "2.1.3",
"package_source": "pip",
"requirements_entry": "MarkupSafe == 2.1.3"
}
],
"zip_root_entries": [
"Jinja2-3.1.2-py3.9.egg-info",
"MarkupSafe-2.1.3-py3.9.egg-info",
"__main__.py",
"__main__.pyc",
"_cephadmmeta",
"cephadmlib",
"jinja2",
"markupsafe"
]
}

View File

@ -148,7 +148,7 @@ options. By default, ``log-to-stdout`` is enabled, and ``--log-to-syslog`` is di
vstart.sh vstart.sh
--------- ---------
The following options aree handy when using ``vstart.sh``, The following options can be used with ``vstart.sh``.
``--crimson`` ``--crimson``
Start ``crimson-osd`` instead of ``ceph-osd``. Start ``crimson-osd`` instead of ``ceph-osd``.
@ -195,9 +195,6 @@ The following options aree handy when using ``vstart.sh``,
Valid types include ``HDD``, ``SSD``(default), ``ZNS``, and ``RANDOM_BLOCK_SSD`` Valid types include ``HDD``, ``SSD``(default), ``ZNS``, and ``RANDOM_BLOCK_SSD``
Note secondary devices should not be faster than the main device. Note secondary devices should not be faster than the main device.
``--seastore``
Use SeaStore as the object store backend.
To start a cluster with a single Crimson node, run:: To start a cluster with a single Crimson node, run::
$ MGR=1 MON=1 OSD=1 MDS=0 RGW=0 ../src/vstart.sh -n -x \ $ MGR=1 MON=1 OSD=1 MDS=0 RGW=0 ../src/vstart.sh -n -x \

View File

@ -1,3 +1,5 @@
.. _crimson_dev_doc:
=============================== ===============================
Crimson developer documentation Crimson developer documentation
=============================== ===============================

View File

@ -13,20 +13,18 @@ following table shows all the leads and their nicks on `GitHub`_:
.. _github: https://github.com/ .. _github: https://github.com/
========= ================ ============= ========= ================== =============
Scope Lead GitHub nick Scope Lead GitHub nick
========= ================ ============= ========= ================== =============
Ceph Sage Weil liewegas RADOS Radoslaw Zarzynski rzarzynski
RADOS Neha Ojha neha-ojha RGW Casey Bodley cbodley
RGW Yehuda Sadeh yehudasa
RGW Matt Benjamin mattbenjamin RGW Matt Benjamin mattbenjamin
RBD Ilya Dryomov dis RBD Ilya Dryomov dis
CephFS Venky Shankar vshankar CephFS Venky Shankar vshankar
Dashboard Ernesto Puerta epuertat Dashboard Nizamudeen A nizamial09
MON Joao Luis jecluis
Build/Ops Ken Dreyer ktdreyer Build/Ops Ken Dreyer ktdreyer
Docs Zac Dover zdover23 Docs Zac Dover zdover23
========= ================ ============= ========= ================== =============
The Ceph-specific acronyms in the table are explained in The Ceph-specific acronyms in the table are explained in
:doc:`/architecture`. :doc:`/architecture`.

View File

@ -209,6 +209,15 @@ For example: for the above test ID, the path is::
This method can be used to view the log more quickly than would be possible through a browser. This method can be used to view the log more quickly than would be possible through a browser.
In addition to ``teuthology.log``, some other files are included for debugging
purposes:
* ``unit_test_summary.yaml``: Provides a summary of all unit test failures.
Generated (optionally) when the ``unit_test_scan`` configuration option is
used in the job's YAML file.
* ``valgrind.yaml``: Summarizes any Valgrind errors that may occur.
.. note:: To access archives more conveniently, ``/a/`` has been symbolically .. note:: To access archives more conveniently, ``/a/`` has been symbolically
linked to ``/ceph/teuthology-archive/``. For instance, to access the previous linked to ``/ceph/teuthology-archive/``. For instance, to access the previous
example, we can use something like:: example, we can use something like::

View File

@ -2,10 +2,14 @@
Ceph Internals Ceph Internals
================ ================
.. note:: If you're looking for how to use Ceph as a library from your .. note:: For information on how to use Ceph as a library (from your own
own software, please see :doc:`/api/index`. software), see :doc:`/api/index`.
You can start a development mode Ceph cluster, after compiling the source, with:: Starting a Development-mode Ceph Cluster
----------------------------------------
Compile the source and then run the following commands to start a
development-mode Ceph cluster::
cd build cd build
OSD=3 MON=3 MGR=3 ../src/vstart.sh -n -x OSD=3 MON=3 MGR=3 ../src/vstart.sh -n -x

View File

@ -218,6 +218,8 @@ we may want to exploit.
The dedup-tool needs to be updated to use ``LIST_SNAPS`` to discover The dedup-tool needs to be updated to use ``LIST_SNAPS`` to discover
clones as part of leak detection. clones as part of leak detection.
.. _osd-make-writeable:
An important question is how we deal with the fact that many clones An important question is how we deal with the fact that many clones
will frequently have references to the same backing chunks at the same will frequently have references to the same backing chunks at the same
offset. In particular, ``make_writeable`` will generally create a clone offset. In particular, ``make_writeable`` will generally create a clone

View File

@ -23,12 +23,11 @@ The difference between *pool snaps* and *self managed snaps* from the
OSD's point of view lies in whether the *SnapContext* comes to the OSD OSD's point of view lies in whether the *SnapContext* comes to the OSD
via the client's MOSDOp or via the most recent OSDMap. via the client's MOSDOp or via the most recent OSDMap.
See OSD::make_writeable See :ref:`manifest.rst <osd-make-writeable>` for more information.
Ondisk Structures Ondisk Structures
----------------- -----------------
Each object has in the PG collection a *head* object (or *snapdir*, which we Each object has in the PG collection a *head* object and possibly a set of *clone* objects.
will come to shortly) and possibly a set of *clone* objects.
Each hobject_t has a snap field. For the *head* (the only writeable version Each hobject_t has a snap field. For the *head* (the only writeable version
of an object), the snap field is set to CEPH_NOSNAP. For the *clones*, the of an object), the snap field is set to CEPH_NOSNAP. For the *clones*, the
snap field is set to the *seq* of the *SnapContext* at their creation. snap field is set to the *seq* of the *SnapContext* at their creation.
@ -47,8 +46,12 @@ The *head* object contains a *SnapSet* encoded in an attribute, which tracks
3. Overlapping intervals between clones for tracking space usage 3. Overlapping intervals between clones for tracking space usage
4. Clone size 4. Clone size
If the *head* is deleted while there are still clones, a *snapdir* object The *head* can't be deleted while there are still clones. Instead, it is
is created instead to house the *SnapSet*. marked as whiteout (``object_info_t::FLAG_WHITEOUT``) in order to house the
*SnapSet* contained in it.
In that case, the *head* object no longer logically exists.
See: should_whiteout()
Additionally, the *object_info_t* on each clone includes a vector of snaps Additionally, the *object_info_t* on each clone includes a vector of snaps
for which clone is defined. for which clone is defined.
@ -126,3 +129,111 @@ up to 8 prefixes need to be checked to determine all hobjects in a particular
snap for a particular PG. Upon split, the prefixes to check on the parent snap for a particular PG. Upon split, the prefixes to check on the parent
are adjusted such that only the objects remaining in the PG will be visible. are adjusted such that only the objects remaining in the PG will be visible.
The children will immediately have the correct mapping. The children will immediately have the correct mapping.
clone_overlap
-------------
Each SnapSet attached to the *head* object contains the overlapping intervals
between clone objects for optimizing space.
The overlapping intervals are stored in the ``clone_overlap`` map; each element in the
map stores the snap ID and the corresponding overlap with the next newest clone.
See the following example using a 4 byte object:
+--------+---------+
| object | content |
+========+=========+
| head | [AAAA] |
+--------+---------+
listsnaps output is as follows:
+---------+-------+------+---------+
| cloneid | snaps | size | overlap |
+=========+=======+======+=========+
| head | - | 4 | |
+---------+-------+------+---------+
After taking a snapshot (ID 1) and re-writing the first 2 bytes of the object,
the clone created will overlap with the new *head* object in its last 2 bytes.
+------------+---------+
| object | content |
+============+=========+
| head | [BBAA] |
+------------+---------+
| clone ID 1 | [AAAA] |
+------------+---------+
+---------+-------+------+---------+
| cloneid | snaps | size | overlap |
+=========+=======+======+=========+
| 1 | 1 | 4 | [2~2] |
+---------+-------+------+---------+
| head | - | 4 | |
+---------+-------+------+---------+
After taking another snapshot (ID 2) and this time re-writing only the first byte of the object,
the new clone (ID 2) will overlap with the new *head* object in its last 3 bytes,
while the oldest clone (ID 1) will overlap with the newest clone in its last 2 bytes.
+------------+---------+
| object | content |
+============+=========+
| head | [CBAA] |
+------------+---------+
| clone ID 2 | [BBAA] |
+------------+---------+
| clone ID 1 | [AAAA] |
+------------+---------+
+---------+-------+------+---------+
| cloneid | snaps | size | overlap |
+=========+=======+======+=========+
| 1 | 1 | 4 | [2~2] |
+---------+-------+------+---------+
| 2 | 2 | 4 | [1~3] |
+---------+-------+------+---------+
| head | - | 4 | |
+---------+-------+------+---------+
If the *head* object is completely re-written (all 4 bytes),
the only overlap that will remain will be between the two clones.
+------------+---------+
| object | content |
+============+=========+
| head | [DDDD] |
+------------+---------+
| clone ID 2 | [BBAA] |
+------------+---------+
| clone ID 1 | [AAAA] |
+------------+---------+
+---------+-------+------+---------+
| cloneid | snaps | size | overlap |
+=========+=======+======+=========+
| 1 | 1 | 4 | [2~2] |
+---------+-------+------+---------+
| 2 | 2 | 4 | |
+---------+-------+------+---------+
| head | - | 4 | |
+---------+-------+------+---------+
Lastly, after the last snap (ID 2) is removed and snaptrim kicks in,
no overlapping intervals will remain:
+------------+---------+
| object | content |
+============+=========+
| head | [DDDD] |
+------------+---------+
| clone ID 1 | [AAAA] |
+------------+---------+
+---------+-------+------+---------+
| cloneid | snaps | size | overlap |
+=========+=======+======+=========+
| 1 | 1 | 4 | |
+---------+-------+------+---------+
| head | - | 4 | |
+---------+-------+------+---------+
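Output in the format of the tables above can be obtained with the ``rados`` CLI against a throw-away pool (illustrative only; the pool name ``test``, the object name ``obj``, and the input files are placeholders; a pool snapshot is used purely for demonstration, and with a full overwrite, as here, the overlap column stays empty)::

    rados -p test put obj /tmp/data       # write the initial object contents
    rados -p test mksnap snap1            # take a pool snapshot
    rados -p test put obj /tmp/newdata    # overwrite the object; a clone preserving the old contents is created
    rados -p test listsnaps obj           # prints the cloneid / snaps / size / overlap columns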

View File

@ -6,92 +6,87 @@ Concepts
-------- --------
*Peering* *Peering*
the process of bringing all of the OSDs that store the process of bringing all of the OSDs that store a Placement Group (PG)
a Placement Group (PG) into agreement about the state into agreement about the state of all of the objects in that PG and all of
of all of the objects (and their metadata) in that PG. the metadata associated with those objects. Two OSDs can agree on the state
Note that agreeing on the state does not mean that of the objects in the placement group yet still may not necessarily have the
they all have the latest contents. latest contents.
*Acting set* *Acting set*
the ordered list of OSDs who are (or were as of some epoch) the ordered list of OSDs that are (or were as of some epoch) responsible for
responsible for a particular PG. a particular PG.
*Up set* *Up set*
the ordered list of OSDs responsible for a particular PG for the ordered list of OSDs responsible for a particular PG for a particular
a particular epoch according to CRUSH. Normally this epoch, according to CRUSH. This is the same as the *acting set* except when
is the same as the *acting set*, except when the *acting set* has been the *acting set* has been explicitly overridden via *PG temp* in the OSDMap.
explicitly overridden via *PG temp* in the OSDMap.
*PG temp* *PG temp*
a temporary placement group acting set used while backfilling the a temporary placement group acting set that is used while backfilling the
primary osd. Let say acting is [0,1,2] and we are primary OSD. Assume that the acting set is ``[0,1,2]`` and we are
active+clean. Something happens and acting is now [3,1,2]. osd 3 is ``active+clean``. Now assume that something happens and the acting set
empty and can't serve reads although it is the primary. osd.3 will becomes ``[3,1,2]``. Under these circumstances, OSD ``3`` is empty and can't
see that and request a *PG temp* of [1,2,3] to the monitors using a serve reads even though it is the primary. ``osd.3`` will respond by
MOSDPGTemp message so that osd.1 temporarily becomes the requesting a *PG temp* of ``[1,2,3]`` to the monitors using a ``MOSDPGTemp``
primary. It will select osd.3 as a backfill peer and continue to message, and ``osd.1`` will become the primary temporarily. ``osd.1`` will
serve reads and writes while osd.3 is backfilled. When backfilling select ``osd.3`` as a backfill peer and will continue to serve reads and
is complete, *PG temp* is discarded and the acting set changes back writes while ``osd.3`` is backfilled. When backfilling is complete, *PG
to [3,1,2] and osd.3 becomes the primary. temp* is discarded. The acting set changes back to ``[3,1,2]`` and ``osd.3``
becomes the primary.
*current interval* or *past interval* *current interval* or *past interval*
a sequence of OSD map epochs during which the *acting set* and *up a sequence of OSD map epochs during which the *acting set* and the *up
set* for particular PG do not change set* for particular PG do not change.
*primary* *primary*
the (by convention first) member of the *acting set*, the member of the *acting set* that is responsible for coordinating peering.
who is responsible for coordination peering, and is The only OSD that accepts client-initiated writes to the objects in a
the only OSD that will accept client initiated placement group. By convention, the primary is the first member of the
writes to objects in a placement group. *acting set*.
*replica* *replica*
a non-primary OSD in the *acting set* for a placement group a non-primary OSD in the *acting set* of a placement group. A replica has
(and who has been recognized as such and *activated* by the primary). been recognized as a non-primary OSD and has been *activated* by the
primary.
*stray* *stray*
an OSD who is not a member of the current *acting set*, but an OSD that is not a member of the current *acting set* and has not yet been
has not yet been told that it can delete its copies of a told to delete its copies of a particular placement group.
particular placement group.
*recovery* *recovery*
ensuring that copies of all of the objects in a PG the process of ensuring that copies of all of the objects in a PG are on all
are on all of the OSDs in the *acting set*. Once of the OSDs in the *acting set*. After *peering* has been performed, the
*peering* has been performed, the primary can start primary can begin accepting write operations and *recovery* can proceed in
accepting write operations, and *recovery* can proceed the background.
in the background.
*PG info* *PG info*
basic metadata about the PG's creation epoch, the version basic metadata about the PG's creation epoch, the version for the most
for the most recent write to the PG, *last epoch started*, *last recent write to the PG, the *last epoch started*, the *last epoch clean*,
epoch clean*, and the beginning of the *current interval*. Any and the beginning of the *current interval*. Any inter-OSD communication
inter-OSD communication about PGs includes the *PG info*, such that about PGs includes the *PG info*, such that any OSD that knows a PG exists
any OSD that knows a PG exists (or once existed) also has a lower (or once existed) and also has a lower bound on *last epoch clean* or *last
bound on *last epoch clean* or *last epoch started*. epoch started*.
*PG log* *PG log*
a list of recent updates made to objects in a PG. a list of recent updates made to objects in a PG. These logs can be
Note that these logs can be truncated after all OSDs truncated after all OSDs in the *acting set* have acknowledged the changes.
in the *acting set* have acknowledged up to a certain
point.
*missing set* *missing set*
Each OSD notes update log entries and if they imply updates to the set of all objects that have not yet had their contents updated to match
the contents of an object, adds that object to a list of needed the log entries. The missing set is collated by each OSD. Missing sets are
updates. This list is called the *missing set* for that <OSD,PG>. kept track of on an ``<OSD,PG>`` basis.
*Authoritative History* *Authoritative History*
a complete, and fully ordered set of operations that, if a complete and fully-ordered set of operations that bring an OSD's copy of a
performed, would bring an OSD's copy of a Placement Group Placement Group up to date.
up to date.
*epoch* *epoch*
a (monotonically increasing) OSD map version number a (monotonically increasing) OSD map version number.
*last epoch start* *last epoch start*
the last epoch at which all nodes in the *acting set* the last epoch at which all nodes in the *acting set* for a given placement
for a particular placement group agreed on an group agreed on an *authoritative history*. At the start of the last epoch,
*authoritative history*. At this point, *peering* is *peering* is deemed to have been successful.
deemed to have been successful.
*up_thru* *up_thru*
before a primary can successfully complete the *peering* process, before a primary can successfully complete the *peering* process,
@ -107,10 +102,9 @@ Concepts
- *acting set* = [B] (B restarts, A does not) - *acting set* = [B] (B restarts, A does not)
*last epoch clean* *last epoch clean*
the last epoch at which all nodes in the *acting set* the last epoch at which all nodes in the *acting set* for a given placement
for a particular placement group were completely group were completely up to date (this includes both the PG's logs and the
up to date (both PG logs and object contents). PG's object contents). At this point, *recovery* is deemed to have been
At this point, *recovery* is deemed to have been
completed. completed.
Description of the Peering Process Description of the Peering Process

View File

@ -213,10 +213,24 @@
Ceph cluster. See :ref:`the "Cluster Map" section of the Ceph cluster. See :ref:`the "Cluster Map" section of the
Architecture document<architecture_cluster_map>` for details. Architecture document<architecture_cluster_map>` for details.
Crimson
A next-generation OSD architecture whose core aim is the
reduction of latency costs incurred due to cross-core
communications. A re-design of the OSD that reduces lock
contention by reducing communication between shards in the data
path. Crimson improves upon the performance of classic Ceph
OSDs by eliminating reliance on thread pools. See `Crimson:
Next-generation Ceph OSD for Multi-core Scalability
<https://ceph.io/en/news/blog/2023/crimson-multi-core-scalability/>`_.
See the :ref:`Crimson developer
documentation<crimson_dev_doc>`.
CRUSH CRUSH
**C**\ontrolled **R**\eplication **U**\nder **S**\calable **C**\ontrolled **R**\eplication **U**\nder **S**\calable
**H**\ashing. The algorithm that Ceph uses to compute object **H**\ashing. The algorithm that Ceph uses to compute object
storage locations. storage locations. See `CRUSH: Controlled, Scalable,
Decentralized Placement of Replicated Data
<https://ceph.com/assets/pdfs/weil-crush-sc06.pdf>`_.
CRUSH rule CRUSH rule
The CRUSH data placement rule that applies to a particular The CRUSH data placement rule that applies to a particular
@ -255,17 +269,31 @@
Hybrid OSD Hybrid OSD
Refers to an OSD that has both HDD and SSD drives. Refers to an OSD that has both HDD and SSD drives.
librados
An API that can be used to create a custom interface to a Ceph
storage cluster. ``librados`` makes it possible to interact
with Ceph Monitors and with OSDs. See :ref:`Introduction to
librados <librados-intro>`. See :ref:`librados (Python)
<librados-python>`.
LVM tags LVM tags
**L**\ogical **V**\olume **M**\anager tags. Extensible metadata **L**\ogical **V**\olume **M**\anager tags. Extensible metadata
for LVM volumes and groups. They are used to store for LVM volumes and groups. They are used to store
Ceph-specific information about devices and its relationship Ceph-specific information about devices and its relationship
with OSDs. with OSDs.
:ref:`MDS<cephfs_add_remote_mds>` MDS
The Ceph **M**\eta\ **D**\ata **S**\erver daemon. Also referred The Ceph **M**\eta\ **D**\ata **S**\erver daemon. Also referred
to as "ceph-mds". The Ceph metadata server daemon must be to as "ceph-mds". The Ceph metadata server daemon must be
running in any Ceph cluster that runs the CephFS file system. running in any Ceph cluster that runs the CephFS file system.
The MDS stores all filesystem metadata. The MDS stores all filesystem metadata. :term:`Client`\s work
together with either a single MDS or a group of MDSes to
maintain a distributed metadata cache that is required by
CephFS.
See :ref:`Deploying Metadata Servers<cephfs_add_remote_mds>`.
See the :ref:`ceph-mds man page<ceph_mds_man>`.
MGR MGR
The Ceph manager software, which collects all the state from The Ceph manager software, which collects all the state from
@ -274,12 +302,30 @@
:ref:`MON<arch_monitor>` :ref:`MON<arch_monitor>`
The Ceph monitor software. The Ceph monitor software.
Monitor Store
The persistent storage that is used by the Monitor. This
includes the Monitor's RocksDB and all related files in
``/var/lib/ceph``.
Node Node
See :term:`Ceph Node`. See :term:`Ceph Node`.
Object Storage Device Object Storage Device
See :term:`OSD`. See :term:`OSD`.
OMAP
"object map". A key-value store (a database) that is used to
reduce the time it takes to read data from and to write to the
Ceph cluster. RGW bucket indexes are stored as OMAPs.
Erasure-coded pools cannot store RADOS OMAP data structures.
Run the command ``ceph osd df`` to see your OMAPs.
See Eleanor Cawthon's 2012 paper `A Distributed Key-Value Store
using Ceph
<https://ceph.io/assets/pdfs/CawthonKeyValueStore.pdf>`_ (17
pages).
OSD OSD
Probably :term:`Ceph OSD`, but not necessarily. Sometimes Probably :term:`Ceph OSD`, but not necessarily. Sometimes
(especially in older correspondence, and especially in (especially in older correspondence, and especially in
@ -291,18 +337,19 @@
mid-2010s to insist that "OSD" should refer to "Object Storage mid-2010s to insist that "OSD" should refer to "Object Storage
Device", so it is important to know which meaning is intended. Device", so it is important to know which meaning is intended.
OSD fsid OSD FSID
This is a unique identifier used to identify an OSD. It is The OSD fsid is a unique identifier that is used to identify an
found in the OSD path in a file called ``osd_fsid``. The OSD. It is found in the OSD path in a file called ``osd_fsid``.
term ``fsid`` is used interchangeably with ``uuid`` The term ``FSID`` is used interchangeably with ``UUID``.
OSD id OSD ID
The integer that defines an OSD. It is generated by the The OSD ID is an integer unique to each OSD (each OSD has a unique
monitors during the creation of each OSD. OSD ID). Each OSD id is generated by the monitors during the
creation of its associated OSD.
OSD uuid OSD UUID
This is the unique identifier of an OSD. This term is used The OSD UUID is the unique identifier of an OSD. This term is
interchangeably with ``fsid`` used interchangeably with ``FSID``.
Period Period
In the context of :term:`RGW`, a period is the configuration In the context of :term:`RGW`, a period is the configuration

View File

@ -0,0 +1,183 @@
.. _hardware-monitoring:
Hardware monitoring
===================
`node-proxy` is the internal name of the running agent that inventories a machine's hardware, reports the various statuses, and enables the operator to perform certain actions.
It gathers details from the RedFish API, then processes and pushes the data to the agent endpoint in the Ceph manager daemon.
.. graphviz::
digraph G {
node [shape=record];
mgr [label="{<mgr> ceph manager}"];
dashboard [label="<dashboard> ceph dashboard"];
agent [label="<agent> agent"];
redfish [label="<redfish> redfish"];
agent -> redfish [label=" 1." color=green];
agent -> mgr [label=" 2." color=orange];
dashboard:dashboard -> mgr [label=" 3."color=lightgreen];
node [shape=plaintext];
legend [label=<<table border="0" cellborder="1" cellspacing="0">
<tr><td bgcolor="lightgrey">Legend</td></tr>
<tr><td align="center">1. Collects data from redfish API</td></tr>
<tr><td align="left">2. Pushes data to ceph mgr</td></tr>
<tr><td align="left">3. Query ceph mgr</td></tr>
</table>>];
}
Limitations
-----------
For the time being, the `node-proxy` agent relies on the RedFish API.
This means that both the `node-proxy` agent and the `ceph-mgr` daemon must be able to access the out-of-band network in order to work.
Deploying the agent
-------------------
| The first step is to provide the out-of-band management tool credentials.
| This can be done when adding the host with a service spec file:
.. code-block:: bash
# cat host.yml
---
service_type: host
hostname: node-10
addr: 10.10.10.10
oob:
addr: 20.20.20.10
username: admin
password: p@ssword
Apply the spec:
.. code-block:: bash
# ceph orch apply -i host.yml
Added host 'node-10' with addr '10.10.10.10'
Deploy the agent:
.. code-block:: bash
# ceph config set mgr mgr/cephadm/hw_monitoring true
CLI
---
| **orch** **hardware** **status** [hostname] [--category CATEGORY] [--format plain | json]
supported categories are:
* summary (default)
* memory
* storage
* processors
* network
* power
* fans
* firmwares
* criticals
Examples
********
hardware health statuses summary
++++++++++++++++++++++++++++++++
.. code-block:: bash
# ceph orch hardware status
+------------+---------+-----+-----+--------+-------+------+
| HOST | STORAGE | CPU | NET | MEMORY | POWER | FANS |
+------------+---------+-----+-----+--------+-------+------+
| node-10 | ok | ok | ok | ok | ok | ok |
+------------+---------+-----+-----+--------+-------+------+
storage devices report
++++++++++++++++++++++
.. code-block:: bash
# ceph orch hardware status IBM-Ceph-1 --category storage
+------------+--------------------------------------------------------+------------------+----------------+----------+----------------+--------+---------+
| HOST | NAME | MODEL | SIZE | PROTOCOL | SN | STATUS | STATE |
+------------+--------------------------------------------------------+------------------+----------------+----------+----------------+--------+---------+
| node-10 | Disk 8 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT99QLL | OK | Enabled |
| node-10 | Disk 10 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT98ZYX | OK | Enabled |
| node-10 | Disk 11 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT98ZWB | OK | Enabled |
| node-10 | Disk 9 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT98ZC9 | OK | Enabled |
| node-10 | Disk 3 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT9903Y | OK | Enabled |
| node-10 | Disk 1 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT9901E | OK | Enabled |
| node-10 | Disk 7 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT98ZQJ | OK | Enabled |
| node-10 | Disk 2 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT99PA2 | OK | Enabled |
| node-10 | Disk 4 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT99PFG | OK | Enabled |
| node-10 | Disk 0 in Backplane 0 of Storage Controller in Slot 2 | MZ7L33T8HBNAAD3 | 3840755981824 | SATA | S6M5NE0T800539 | OK | Enabled |
| node-10 | Disk 1 in Backplane 0 of Storage Controller in Slot 2 | MZ7L33T8HBNAAD3 | 3840755981824 | SATA | S6M5NE0T800554 | OK | Enabled |
| node-10 | Disk 6 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT98ZER | OK | Enabled |
| node-10 | Disk 0 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT98ZEJ | OK | Enabled |
| node-10 | Disk 5 in Backplane 1 of Storage Controller in Slot 2 | ST20000NM008D-3D | 20000588955136 | SATA | ZVT99QMH | OK | Enabled |
| node-10 | Disk 0 on AHCI Controller in SL 6 | MTFDDAV240TDU | 240057409536 | SATA | 22373BB1E0F8 | OK | Enabled |
| node-10 | Disk 1 on AHCI Controller in SL 6 | MTFDDAV240TDU | 240057409536 | SATA | 22373BB1E0D5 | OK | Enabled |
+------------+--------------------------------------------------------+------------------+----------------+----------+----------------+--------+---------+
firmwares details
+++++++++++++++++
.. code-block:: bash
# ceph orch hardware status node-10 --category firmwares
+------------+----------------------------------------------------------------------------+--------------------------------------------------------------+----------------------+-------------+--------+
| HOST | COMPONENT | NAME | DATE | VERSION | STATUS |
+------------+----------------------------------------------------------------------------+--------------------------------------------------------------+----------------------+-------------+--------+
| node-10 | current-107649-7.03__raid.backplane.firmware.0 | Backplane 0 | 2022-12-05T00:00:00Z | 7.03 | OK |
... omitted output ...
| node-10 | previous-25227-6.10.30.20__idrac.embedded.1-1 | Integrated Remote Access Controller | 00:00:00Z | 6.10.30.20 | OK |
+------------+----------------------------------------------------------------------------+--------------------------------------------------------------+----------------------+-------------+--------+
hardware critical warnings report
+++++++++++++++++++++++++++++++++
.. code-block:: bash
# ceph orch hardware status --category criticals
+------------+-----------+------------+----------+-----------------+
| HOST | COMPONENT | NAME | STATUS | STATE |
+------------+-----------+------------+----------+-----------------+
| node-10 | power | PS2 Status | critical | unplugged |
+------------+-----------+------------+----------+-----------------+
Developers
-----------
.. py:currentmodule:: cephadm.agent
.. autoclass:: NodeProxyEndpoint
.. automethod:: NodeProxyEndpoint.__init__
.. automethod:: NodeProxyEndpoint.oob
.. automethod:: NodeProxyEndpoint.data
.. automethod:: NodeProxyEndpoint.fullreport
.. automethod:: NodeProxyEndpoint.summary
.. automethod:: NodeProxyEndpoint.criticals
.. automethod:: NodeProxyEndpoint.memory
.. automethod:: NodeProxyEndpoint.storage
.. automethod:: NodeProxyEndpoint.network
.. automethod:: NodeProxyEndpoint.power
.. automethod:: NodeProxyEndpoint.processors
.. automethod:: NodeProxyEndpoint.fans
.. automethod:: NodeProxyEndpoint.firmwares
.. automethod:: NodeProxyEndpoint.led

View File

@ -118,8 +118,9 @@ about Ceph, see our `Architecture`_ section.
governance governance
foundation foundation
ceph-volume/index ceph-volume/index
releases/general Ceph Releases (general) <https://docs.ceph.com/en/latest/releases/general/>
releases/index Ceph Releases (index) <https://docs.ceph.com/en/latest/releases/>
security/index security/index
hardware-monitoring/index
Glossary <glossary> Glossary <glossary>
Tracing <jaegertracing/index> Tracing <jaegertracing/index>

View File

@ -98,59 +98,7 @@ repository.
Updating Submodules Updating Submodules
------------------- -------------------
#. Determine whether your submodules are out of date: If your submodules are out of date, run the following commands:
.. prompt:: bash $
git status
A. If your submodules are up to date
If your submodules are up to date, the following console output will
appear:
::
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean
If you see this console output, then your submodules are up to date.
You do not need this procedure.
B. If your submodules are not up to date
If your submodules are not up to date, you will see a message that
includes a list of "untracked files". The example here shows such a
list, which was generated from a real situation in which the
submodules were no longer current. Your list of files will not be the
same as this list of files, but this list is provided as an example.
If in your case any untracked files are listed, then you should
continue to the next step of this procedure.
::
On branch main
Your branch is up to date with 'origin/main'.
Untracked files:
(use "git add <file>..." to include in what will be committed)
src/pybind/cephfs/build/
src/pybind/cephfs/cephfs.c
src/pybind/cephfs/cephfs.egg-info/
src/pybind/rados/build/
src/pybind/rados/rados.c
src/pybind/rados/rados.egg-info/
src/pybind/rbd/build/
src/pybind/rbd/rbd.c
src/pybind/rbd/rbd.egg-info/
src/pybind/rgw/build/
src/pybind/rgw/rgw.c
src/pybind/rgw/rgw.egg-info/
nothing added to commit but untracked files present (use "git add" to track)
#. If your submodules are out of date, run the following commands:
.. prompt:: bash $ .. prompt:: bash $
@ -158,24 +106,10 @@ Updating Submodules
git clean -fdx git clean -fdx
git submodule foreach git clean -fdx git submodule foreach git clean -fdx
If you still have problems with a submodule directory, use ``rm -rf If you still have problems with a submodule directory, use ``rm -rf [directory
[directory name]`` to remove the directory. Then run ``git submodule update name]`` to remove the directory. Then run ``git submodule update --init
--init --recursive`` again. --recursive --progress`` again.
#. Run ``git status`` again:
.. prompt:: bash $
git status
Your submodules are up to date if you see the following message:
::
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean
Choose a Branch Choose a Branch
=============== ===============

View File

@ -251,6 +251,17 @@ openSUSE Tumbleweed
The newest major release of Ceph is already available through the normal Tumbleweed repositories. The newest major release of Ceph is already available through the normal Tumbleweed repositories.
There's no need to add another package repository manually. There's no need to add another package repository manually.
openEuler
^^^^^^^^^
Two major versions are supported in the normal openEuler repositories: Ceph 12.2.8 in the openEuler-20.03-LTS series and Ceph 16.2.7 in the openEuler-22.03-LTS series. There's no need to add another package repository manually.
You can install Ceph by running the following command:
.. prompt:: bash $
sudo yum -y install ceph
You can also download packages manually from https://repo.openeuler.org/openEuler-{release}/everything/{arch}/Packages/.
Ceph Development Packages Ceph Development Packages
------------------------- -------------------------

View File

@ -9,9 +9,8 @@ There are multiple ways to install Ceph.
Recommended methods Recommended methods
~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~
:ref:`Cephadm <cephadm_deploying_new_cluster>` installs and manages a Ceph :ref:`Cephadm <cephadm_deploying_new_cluster>` is a tool that can be used to
cluster that uses containers and systemd and is tightly integrated with the CLI install and manage a Ceph cluster.
and dashboard GUI.
* cephadm supports only Octopus and newer releases. * cephadm supports only Octopus and newer releases.
* cephadm is fully integrated with the orchestration API and fully supports the * cephadm is fully integrated with the orchestration API and fully supports the
@ -59,6 +58,8 @@ tool that can be used to quickly deploy clusters. It is deprecated.
`github.com/openstack/puppet-ceph <https://github.com/openstack/puppet-ceph>`_ installs Ceph via Puppet. `github.com/openstack/puppet-ceph <https://github.com/openstack/puppet-ceph>`_ installs Ceph via Puppet.
`OpenNebula HCI clusters <https://docs.opennebula.io/stable/provision_clusters/hci_clusters/overview.html>`_ deploys Ceph on various cloud platforms.
Ceph can also be :ref:`installed manually <install-manual>`. Ceph can also be :ref:`installed manually <install-manual>`.

View File

@ -461,6 +461,52 @@ In the below instructions, ``{id}`` is an arbitrary name, such as the hostname o
#. Now you are ready to `create a Ceph file system`_. #. Now you are ready to `create a Ceph file system`_.
Manually Installing RADOSGW
===========================
For a more involved discussion of the procedure presented here, see `this
thread on the ceph-users mailing list
<https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/LB3YRIKAPOHXYCW7MKLVUJPYWYRQVARU/>`_.
#. Install ``radosgw`` packages on the nodes that will be the RGW nodes.
#. From a monitor or from a node with admin privileges, run a command of the
following form:
.. prompt:: bash #
ceph auth get-or-create client.short-hostname-of-rgw mon 'allow rw' osd 'allow rwx'
#. On one of the RGW nodes, do the following:
a. Create a ``ceph-user``-owned directory. For example:
.. prompt:: bash #
install -d -o ceph -g ceph /var/lib/ceph/radosgw/ceph-$(hostname -s)
b. Enter the directory just created and create a ``keyring`` file:
.. prompt:: bash #
touch /var/lib/ceph/radosgw/ceph-$(hostname -s)/keyring
Use a command similar to this one to put the key from the earlier ``ceph
auth get-or-create`` step in the ``keyring`` file. Use your preferred
editor:
.. prompt:: bash #
$EDITOR /var/lib/ceph/radosgw/ceph-$(hostname -s)/keyring
c. Repeat these steps on every RGW node.
#. Start the RADOSGW service by running the following command:
.. prompt:: bash #
systemctl start ceph-radosgw@$(hostname -s).service
Summary Summary
======= =======

View File

@ -1,5 +1,7 @@
:orphan: :orphan:
.. _ceph_mds_man:
========================================= =========================================
ceph-mds -- ceph metadata server daemon ceph-mds -- ceph metadata server daemon
========================================= =========================================

View File

@ -244,45 +244,56 @@ Procedure
Manipulating the Object Map Key Manipulating the Object Map Key
------------------------------- -------------------------------
Use the **ceph-objectstore-tool** utility to change the object map (OMAP) key. You need to provide the data path, the placement group identifier (PG ID), the object, and the key in the OMAP. Use the **ceph-objectstore-tool** utility to change the object map (OMAP) key.
Note Provide the data path, the placement group identifier (PG ID), the object, and
the key in the OMAP.
Prerequisites Prerequisites
^^^^^^^^^^^^^
* Having root access to the Ceph OSD node. * Having root access to the Ceph OSD node.
* Stopping the ceph-osd daemon. * Stopping the ceph-osd daemon.
Procedure Commands
^^^^^^^^
Get the object map key: Run the commands in this section as ``root`` on an OSD node.
Syntax:: * **Getting the object map key**
Syntax:
.. code-block:: ini
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID $OBJECT get-omap $KEY > $OBJECT_MAP_FILE_NAME ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID $OBJECT get-omap $KEY > $OBJECT_MAP_FILE_NAME
Example:: Example::
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' get-omap "" > zone_info.default.omap.txt ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' get-omap "" > zone_info.default.omap.txt
Set the object map key: * **Setting the object map key**
Syntax:: Syntax:
.. code-block:: ini
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID $OBJECT set-omap $KEY < $OBJECT_MAP_FILE_NAME ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID $OBJECT set-omap $KEY < $OBJECT_MAP_FILE_NAME
Example:: Example::
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' set-omap "" < zone_info.default.omap.txt ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' set-omap "" < zone_info.default.omap.txt
Remove the object map key: * **Removing the object map key**
Syntax:: Syntax:
.. code-block:: ini
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID $OBJECT rm-omap $KEY ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID $OBJECT rm-omap $KEY
Example:: Example::
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' rm-omap "" ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' rm-omap ""
Listing an Object's Attributes Listing an Object's Attributes

View File

@ -18,14 +18,16 @@ Synopsis
Description Description
=========== ===========
**ceph-osd** is the object storage daemon for the Ceph distributed file **ceph-osd** is the **o**\bject **s**\torage **d**\aemon for the Ceph
system. It is responsible for storing objects on a local file system distributed file system. It manages data on local storage with redundancy and
and providing access to them over the network. provides access to that data over the network.
The datapath argument should be a directory on a xfs file system For Filestore-backed clusters, the argument of the ``--osd-data datapath``
where the object data resides. The journal is optional, and is only option (which is ``datapath`` in this example) should be a directory on an XFS
useful performance-wise when it resides on a different disk than file system where the object data resides. The journal is optional. The journal
datapath with low latency (ideally, an NVRAM device). improves performance only when it resides on a different disk than the disk
specified by ``datapath`` . The storage medium on which the journal is stored
should be a low-latency medium (ideally, an SSD device).
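For illustration only, a Filestore OSD whose journal lives on a separate low-latency device might be started as follows; the OSD id, data path, and journal device are placeholders::
    ceph-osd -i 0 --osd-data /var/lib/ceph/osd/ceph-0 --osd-journal /dev/nvme0n1p1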
Options Options

View File

@ -56,7 +56,7 @@ Options
.. code:: bash .. code:: bash
[build]$ python3 -m venv venv && source venv/bin/activate && pip3 install cmd2 [build]$ python3 -m venv venv && source venv/bin/activate && pip3 install cmd2 colorama
[build]$ source vstart_environment.sh && source venv/bin/activate && python3 ../src/tools/cephfs/shell/cephfs-shell [build]$ source vstart_environment.sh && source venv/bin/activate && python3 ../src/tools/cephfs/shell/cephfs-shell
Commands Commands

View File

@ -199,6 +199,50 @@ Advanced
option is enabled, a namespace operation may complete before the MDS option is enabled, a namespace operation may complete before the MDS
replies, if it has sufficient capabilities to do so. replies, if it has sufficient capabilities to do so.
:command:`crush_location=x`
Specify the location of the client in terms of CRUSH hierarchy (since 5.8).
This is a set of key-value pairs separated from each other by '|', with
keys separated from values by ':'. Note that '|' may need to be quoted
or escaped to avoid it being interpreted as a pipe by the shell. The key
is the bucket type name (e.g. rack, datacenter or region with default
bucket types) and the value is the bucket name. For example, to indicate
that the client is local to rack "myrack", data center "mydc" and region
"myregion"::
crush_location=rack:myrack|datacenter:mydc|region:myregion
Each key-value pair stands on its own: "myrack" doesn't need to reside in
"mydc", which in turn doesn't need to reside in "myregion". The location
is not a path to the root of the hierarchy but rather a set of nodes that
are matched independently. "Multipath" locations are supported, so it is
possible to indicate locality for multiple parallel hierarchies::
crush_location=rack:myrack1|rack:myrack2|datacenter:mydc
:command:`read_from_replica=<no|balance|localize>`
- ``no``: Disable replica reads, always pick the primary OSD (since 5.8, default).
- ``balance``: When a replicated pool receives a read request, pick a random
OSD from the PG's acting set to serve it (since 5.8).
This mode is safe for general use only since Octopus (i.e. after "ceph osd
require-osd-release octopus"). Otherwise it should be limited to read-only
workloads such as snapshots.
- ``localize``: When a replicated pool receives a read request, pick the most
local OSD to serve it (since 5.8). The locality metric is calculated against
the location of the client given with crush_location; a match with the
lowest-valued bucket type wins. For example, an OSD in a matching rack
is closer than an OSD in a matching data center, which in turn is closer
than an OSD in a matching region.
This mode is safe for general use only since Octopus (i.e. after "ceph osd
require-osd-release octopus"). Otherwise it should be limited to read-only
workloads such as snapshots.
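For illustration, both options can be supplied together when mapping an image with the kernel client; the pool and image names below are placeholders, and ``|`` must be quoted so the shell does not treat it as a pipe::
    rbd map -o 'crush_location=rack:myrack|datacenter:mydc,read_from_replica=localize' mypool/myimage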
Examples Examples
======== ========

View File

@ -333,7 +333,7 @@ Commands
be specified. be specified.
:command:`flatten` [--encryption-format *encryption-format* --encryption-passphrase-file *passphrase-file*]... *image-spec* :command:`flatten` [--encryption-format *encryption-format* --encryption-passphrase-file *passphrase-file*]... *image-spec*
If image is a clone, copy all shared blocks from the parent snapshot and If the image is a clone, copy all shared blocks from the parent snapshot and
make the child independent of the parent, severing the link between make the child independent of the parent, severing the link between
parent snap and child. The parent snapshot can be unprotected and parent snap and child. The parent snapshot can be unprotected and
deleted if it has no further dependent clones. deleted if it has no further dependent clones.
@ -390,7 +390,7 @@ Commands
Set metadata key with the value. They will be displayed in `image-meta list`. Set metadata key with the value. They will be displayed in `image-meta list`.
:command:`import` [--export-format *format (1 or 2)*] [--image-format *format-id*] [--object-size *size-in-B/K/M*] [--stripe-unit *size-in-B/K/M* --stripe-count *num*] [--image-feature *feature-name*]... [--image-shared] *src-path* [*image-spec*] :command:`import` [--export-format *format (1 or 2)*] [--image-format *format-id*] [--object-size *size-in-B/K/M*] [--stripe-unit *size-in-B/K/M* --stripe-count *num*] [--image-feature *feature-name*]... [--image-shared] *src-path* [*image-spec*]
Create a new image and imports its data from path (use - for Create a new image and import its data from path (use - for
stdin). The import operation will try to create sparse rbd images stdin). The import operation will try to create sparse rbd images
if possible. For import from stdin, the sparsification unit is if possible. For import from stdin, the sparsification unit is
the data block size of the destination image (object size). the data block size of the destination image (object size).
@ -402,14 +402,14 @@ Commands
of image, but also the snapshots and other properties, such as image_order, features. of image, but also the snapshots and other properties, such as image_order, features.
:command:`import-diff` *src-path* *image-spec* :command:`import-diff` *src-path* *image-spec*
Import an incremental diff of an image and applies it to the current image. If the diff Import an incremental diff of an image and apply it to the current image. If the diff
was generated relative to a start snapshot, we verify that snapshot already exists before was generated relative to a start snapshot, we verify that snapshot already exists before
continuing. If there was an end snapshot we verify it does not already exist before continuing. If there was an end snapshot we verify it does not already exist before
applying the changes, and create the snapshot when we are done. applying the changes, and create the snapshot when we are done.
:command:`info` *image-spec* | *snap-spec* :command:`info` *image-spec* | *snap-spec*
Will dump information (such as size and object size) about a specific rbd image. Will dump information (such as size and object size) about a specific rbd image.
If image is a clone, information about its parent is also displayed. If the image is a clone, information about its parent is also displayed.
If a snapshot is specified, whether it is protected is shown as well. If a snapshot is specified, whether it is protected is shown as well.
:command:`journal client disconnect` *journal-spec* :command:`journal client disconnect` *journal-spec*
@ -472,7 +472,7 @@ Commands
the destination image are lost. the destination image are lost.
:command:`migration commit` *image-spec* :command:`migration commit` *image-spec*
Commit image migration. This step is run after a successful migration Commit image migration. This step is run after successful migration
prepare and migration execute steps and removes the source image data. prepare and migration execute steps and removes the source image data.
:command:`migration execute` *image-spec* :command:`migration execute` *image-spec*
@ -499,14 +499,12 @@ Commands
:command:`mirror image disable` [--force] *image-spec* :command:`mirror image disable` [--force] *image-spec*
Disable RBD mirroring for an image. If the mirroring is Disable RBD mirroring for an image. If the mirroring is
configured in ``image`` mode for the image's pool, then it configured in ``image`` mode for the image's pool, then it
can be explicitly disabled mirroring for each image within must be disabled for each image individually.
the pool.
:command:`mirror image enable` *image-spec* *mode* :command:`mirror image enable` *image-spec* *mode*
Enable RBD mirroring for an image. If the mirroring is Enable RBD mirroring for an image. If the mirroring is
configured in ``image`` mode for the image's pool, then it configured in ``image`` mode for the image's pool, then it
can be explicitly enabled mirroring for each image within must be enabled for each image individually.
the pool.
The mirror image mode can either be ``journal`` (default) or The mirror image mode can either be ``journal`` (default) or
``snapshot``. The ``journal`` mode requires the RBD journaling ``snapshot``. The ``journal`` mode requires the RBD journaling
@ -523,7 +521,7 @@ Commands
:command:`mirror pool demote` [*pool-name*] :command:`mirror pool demote` [*pool-name*]
Demote all primary images within a pool to non-primary. Demote all primary images within a pool to non-primary.
Every mirroring enabled image will demoted in the pool. Every mirror-enabled image in the pool will be demoted.
:command:`mirror pool disable` [*pool-name*] :command:`mirror pool disable` [*pool-name*]
Disable RBD mirroring by default within a pool. When mirroring Disable RBD mirroring by default within a pool. When mirroring
@ -551,7 +549,7 @@ Commands
The default for *remote client name* is "client.admin". The default for *remote client name* is "client.admin".
This requires mirroring mode is enabled. This requires mirroring to be enabled on the pool.
:command:`mirror pool peer remove` [*pool-name*] *uuid* :command:`mirror pool peer remove` [*pool-name*] *uuid*
Remove a mirroring peer from a pool. The peer uuid is available Remove a mirroring peer from a pool. The peer uuid is available
@ -564,12 +562,12 @@ Commands
:command:`mirror pool promote` [--force] [*pool-name*] :command:`mirror pool promote` [--force] [*pool-name*]
Promote all non-primary images within a pool to primary. Promote all non-primary images within a pool to primary.
Every mirroring enabled image will promoted in the pool. Every mirror-enabled image in the pool will be promoted.
:command:`mirror pool status` [--verbose] [*pool-name*] :command:`mirror pool status` [--verbose] [*pool-name*]
Show status for all mirrored images in the pool. Show status for all mirrored images in the pool.
With --verbose, also show additionally output status With ``--verbose``, show additional output status
details for every mirroring image in the pool. details for every mirror-enabled image in the pool.
:command:`mirror snapshot schedule add` [-p | --pool *pool*] [--namespace *namespace*] [--image *image*] *interval* [*start-time*] :command:`mirror snapshot schedule add` [-p | --pool *pool*] [--namespace *namespace*] [--image *image*] *interval* [*start-time*]
Add mirror snapshot schedule. Add mirror snapshot schedule.
@ -603,7 +601,7 @@ Commands
specified to rebuild an invalid object map for a snapshot. specified to rebuild an invalid object map for a snapshot.
:command:`pool init` [*pool-name*] [--force] :command:`pool init` [*pool-name*] [--force]
Initialize pool for use by RBD. Newly created pools must initialized Initialize pool for use by RBD. Newly created pools must be initialized
prior to use. prior to use.
:command:`resize` (-s | --size *size-in-M/G/T*) [--allow-shrink] [--encryption-format *encryption-format* --encryption-passphrase-file *passphrase-file*]... *image-spec* :command:`resize` (-s | --size *size-in-M/G/T*) [--allow-shrink] [--encryption-format *encryption-format* --encryption-passphrase-file *passphrase-file*]... *image-spec*
@ -615,7 +613,7 @@ Commands
snapshots, this fails and nothing is deleted. snapshots, this fails and nothing is deleted.
:command:`snap create` *snap-spec* :command:`snap create` *snap-spec*
Create a new snapshot. Requires the snapshot name parameter specified. Create a new snapshot. Requires the snapshot name parameter to be specified.
:command:`snap limit clear` *image-spec* :command:`snap limit clear` *image-spec*
Remove any previously set limit on the number of snapshots allowed on Remove any previously set limit on the number of snapshots allowed on
@ -625,7 +623,7 @@ Commands
Set a limit for the number of snapshots allowed on an image. Set a limit for the number of snapshots allowed on an image.
:command:`snap ls` *image-spec* :command:`snap ls` *image-spec*
Dump the list of snapshots inside a specific image. Dump the list of snapshots of a specific image.
:command:`snap protect` *snap-spec* :command:`snap protect` *snap-spec*
Protect a snapshot from deletion, so that clones can be made of it Protect a snapshot from deletion, so that clones can be made of it
@ -668,9 +666,11 @@ Commands
:command:`trash ls` [*pool-name*] :command:`trash ls` [*pool-name*]
List all entries from trash. List all entries from trash.
:command:`trash mv` *image-spec* :command:`trash mv` [--expires-at <expires-at>] *image-spec*
Move an image to the trash. Images, even ones actively in-use by Move an image to the trash. Images, even ones actively in-use by
clones, can be moved to the trash and deleted at a later time. clones, can be moved to the trash and deleted at a later time. Use
``--expires-at`` to set the expiration time of an image after which
it's allowed to be removed.
:command:`trash purge` [*pool-name*] :command:`trash purge` [*pool-name*]
Remove all expired images from trash. Remove all expired images from trash.
@ -678,9 +678,9 @@ Commands
:command:`trash restore` *image-id* :command:`trash restore` *image-id*
Restore an image from trash. Restore an image from trash.
:command:`trash rm` *image-id* :command:`trash rm` [--force] *image-id*
Delete an image from trash. If image deferment time has not expired Delete an image from trash. If the image deferment time has not expired
you can not removed it unless use force. But an actively in-use by clones it can be removed using ``--force``. An image that is actively in-use by clones
or has snapshots cannot be removed. or has snapshots cannot be removed.
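As a hedged illustration of the two trash commands above (the pool, image name, image id, and timestamp format are placeholders)::
    rbd trash mv --expires-at "2025-01-01 00:00:00" mypool/myimage
    rbd trash ls mypool
    rbd trash rm --force mypool/<image-id>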
:command:`trash purge schedule add` [-p | --pool *pool*] [--namespace *namespace*] *interval* [*start-time*] :command:`trash purge schedule add` [-p | --pool *pool*] [--namespace *namespace*] *interval* [*start-time*]

View File

@ -568,6 +568,9 @@ If the NFS service is running on a non-standard port number:
.. note:: Only NFS v4.0+ is supported. .. note:: Only NFS v4.0+ is supported.
.. note:: As of this writing (01 Jan 2024), no version of Microsoft Windows
supports mounting an NFS v4.x export natively.
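For example, a Linux client might mount such an export by giving the NFS version and the non-standard port explicitly; the host name, port number, pseudo-path, and mount point below are placeholders::
    mount -t nfs -o nfsvers=4.1,port=12049 ganesha.example.com:/cephfs /mnt/cephfs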
Troubleshooting Troubleshooting
=============== ===============

View File

@ -151,3 +151,96 @@ ceph-mgr and check the logs.
With logging set to debug for the manager the module will print various logging With logging set to debug for the manager the module will print various logging
lines prefixed with *mgr[zabbix]* for easy filtering. lines prefixed with *mgr[zabbix]* for easy filtering.
Installing zabbix-agent 2
-------------------------
*The procedures that explain the installation of Zabbix 2 were developed by John Jasen.*
Follow the instructions in the sections :ref:`mgr_zabbix_2_nodes`,
:ref:`mgr_zabbix_2_cluster`, and :ref:`mgr_zabbix_2_server` to install a Zabbix
server to monitor your Ceph cluster.
.. _mgr_zabbix_2_nodes:
Ceph MGR Nodes
^^^^^^^^^^^^^^
#. Download an appropriate Zabbix release from https://www.zabbix.com/download
or install a package from the Zabbix repositories.
#. Use your package manager to remove any other Zabbix agents.
#. Install ``zabbix-agent 2`` using the instructions at
https://www.zabbix.com/download.
#. Edit ``/etc/zabbix/zabbix-agent2.conf``. Add your Zabbix monitoring servers
and your localhost to the ``Servers`` line of ``zabbix-agent2.conf``::
Server=127.0.0.1,zabbix2.example.com,zabbix1.example.com
#. Start or restart the ``zabbix-agent2`` agent:
.. prompt:: bash #
systemctl restart zabbix-agent2
.. _mgr_zabbix_2_cluster:
Ceph Cluster
^^^^^^^^^^^^
#. Enable the ``restful`` module:
.. prompt:: bash #
ceph mgr module enable restful
#. Generate a self-signed certificate. This step is optional:
.. prompt:: bash #
restful create-self-signed-cert
#. Create an API user called ``zabbix-monitor``:
.. prompt:: bash #
ceph restful create-key zabbix-monitor
The output of this command, an API key, will look something like this::
a4bb2019-XXXX-YYYY-ZZZZ-abcdefghij
#. Save the generated API key. It will be necessary later.
#. Test API access by using ``zabbix-get``:
.. note:: This step is optional.
.. prompt:: bash #
zabbix_get -s 127.0.0.1 -k ceph.ping["${CEPH.CONNSTRING}","${CEPH.USER}","${CEPH.API.KEY}"]
Example:
.. prompt:: bash #
zabbix_get -s 127.0.0.1 -k ceph.ping["https://localhost:8003","zabbix-monitor","a4bb2019-XXXX-YYYY-ZZZZ-abcdefghij"]
.. note:: You may need to install ``zabbix-get`` via your package manager.
.. _mgr_zabbix_2_server:
Zabbix Server
^^^^^^^^^^^^^
#. Create a host for the Ceph monitoring servers.
#. Add the template ``Ceph by Zabbix agent 2`` to the host.
#. Inform the host of the keys:
#. Go to “Macros” on the host.
#. Show “Inherited and host macros”.
#. Change ``{$CEPH.API.KEY}`` and ``{$CEPH.USER}`` to the values provided
under ``ceph restful create-key``, above. Example::
{$CEPH.API.KEY} a4bb2019-XXXX-YYYY-ZZZZ-abcdefghij
{$CEPH.USER} zabbix-monitor
#. Update the host. Within a few cycles, data will populate the server.

View File

@ -470,5 +470,8 @@ Useful queries
rate(ceph_rbd_read_latency_sum[30s]) / rate(ceph_rbd_read_latency_count[30s]) * on (instance) group_left (ceph_daemon) ceph_rgw_metadata rate(ceph_rbd_read_latency_sum[30s]) / rate(ceph_rbd_read_latency_count[30s]) * on (instance) group_left (ceph_daemon) ceph_rgw_metadata
Hardware monitoring
===================
See :ref:`hardware-monitoring`

View File

@ -1,3 +1,5 @@
.. _librados-intro:
========================== ==========================
Introduction to librados Introduction to librados
========================== ==========================

View File

@ -1,3 +1,5 @@
.. _librados-python:
=================== ===================
Librados (Python) Librados (Python)
=================== ===================

View File

@ -358,7 +358,7 @@ OSD and run the following command:
ceph-bluestore-tool \ ceph-bluestore-tool \
--path <data path> \ --path <data path> \
--sharding="m(3) p(3,0-12) o(3,0-13)=block_cache={type=binned_lru} l p" \ --sharding="m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} l p" \
reshard reshard
.. confval:: bluestore_rocksdb_cf .. confval:: bluestore_rocksdb_cf

View File

@ -123,11 +123,10 @@ OSD host, run the following commands:
ssh {osd-host} ssh {osd-host}
sudo mkdir /var/lib/ceph/osd/ceph-{osd-number} sudo mkdir /var/lib/ceph/osd/ceph-{osd-number}
The ``osd_data`` path ought to lead to a mount point that has mounted on it a The ``osd_data`` path must lead to a device that is not shared with the
device that is distinct from the device that contains the operating system and operating system. To use a device other than the device that contains the
the daemons. To use a device distinct from the device that contains the
operating system and the daemons, prepare it for use with Ceph and mount it on operating system and the daemons, prepare it for use with Ceph and mount it on
the directory you just created by running the following commands: the directory you just created by running commands of the following form:
.. prompt:: bash $ .. prompt:: bash $

View File

@ -151,7 +151,7 @@ generates a catalog of all objects in each placement group and compares each
primary object to its replicas, ensuring that no objects are missing or primary object to its replicas, ensuring that no objects are missing or
mismatched. Light scrubbing checks the object size and attributes, and is mismatched. Light scrubbing checks the object size and attributes, and is
usually done daily. Deep scrubbing reads the data and uses checksums to ensure usually done daily. Deep scrubbing reads the data and uses checksums to ensure
data integrity, and is usually done weekly. The freqeuncies of both light data integrity, and is usually done weekly. The frequencies of both light
scrubbing and deep scrubbing are determined by the cluster's configuration, scrubbing and deep scrubbing are determined by the cluster's configuration,
which is fully under your control and subject to the settings explained below which is fully under your control and subject to the settings explained below
in this section. in this section.
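For example, the relevant intervals can be adjusted through the central configuration database; the values below (in seconds) are only illustrative::
    ceph config set osd osd_scrub_min_interval 86400     # light scrub a PG no more than once a day
    ceph config set osd osd_scrub_max_interval 604800    # force a light scrub at least weekly
    ceph config set osd osd_deep_scrub_interval 604800   # deep scrub roughly weekly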

View File

@ -6,12 +6,41 @@
.. index:: pools; configuration .. index:: pools; configuration
Ceph uses default values to determine how many placement groups (PGs) will be The number of placement groups that the CRUSH algorithm assigns to each pool is
assigned to each pool. We recommend overriding some of the defaults. determined by the values of variables in the centralized configuration database
Specifically, we recommend setting a pool's replica size and overriding the in the monitor cluster.
default number of placement groups. You can set these values when running
`pool`_ commands. You can also override the defaults by adding new ones in the Both containerized deployments of Ceph (deployments made using ``cephadm`` or
``[global]`` section of your Ceph configuration file. Rook) and non-containerized deployments of Ceph rely on the values in the
central configuration database in the monitor cluster to assign placement
groups to pools.
Example Commands
----------------
To see the value of the variable that governs the number of placement groups in a given pool, run a command of the following form:
.. prompt:: bash
ceph config get osd osd_pool_default_pg_num
To set the value of the variable that governs the number of placement groups in a given pool, run a command of the following form:
.. prompt:: bash
ceph config set osd osd_pool_default_pg_num 128
Manual Tuning
-------------
In some cases, it might be advisable to override some of the defaults. For
example, you might determine that it is wise to set a pool's replica size and
to override the default number of placement groups in the pool. You can set
these values when running `pool`_ commands.
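A hedged sketch of such an override for a single pool follows; the pool name and numbers are placeholders, and with the autoscaler enabled the ``pg_num`` value may subsequently be adjusted automatically::
    ceph osd pool set mypool size 3        # replica count
    ceph osd pool get mypool pg_num        # inspect the current value
    ceph osd pool set mypool pg_num 128    # manual override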
See Also
--------
See :ref:`pg-autoscaler`.
.. literalinclude:: pool-pg.conf .. literalinclude:: pool-pg.conf

View File

@ -344,12 +344,13 @@ addresses, repeat this process.
Changing a Monitor's IP address (Advanced Method) Changing a Monitor's IP address (Advanced Method)
------------------------------------------------- -------------------------------------------------
There are cases in which the method outlined in :ref"`<Changing a Monitor's IP There are cases in which the method outlined in
Address (Preferred Method)> operations_add_or_rm_mons_changing_mon_ip` cannot :ref:`operations_add_or_rm_mons_changing_mon_ip` cannot be used. For example,
be used. For example, it might be necessary to move the cluster's monitors to a it might be necessary to move the cluster's monitors to a different network, to
different network, to a different part of the datacenter, or to a different a different part of the datacenter, or to a different datacenter altogether. It
datacenter altogether. It is still possible to change the monitors' IP is still possible to change the monitors' IP addresses, but a different method
addresses, but a different method must be used. must be used.
For such cases, a new monitor map with updated IP addresses for every monitor For such cases, a new monitor map with updated IP addresses for every monitor
in the cluster must be generated and injected on each monitor. Although this in the cluster must be generated and injected on each monitor. Although this
@ -357,11 +358,11 @@ method is not particularly easy, such a major migration is unlikely to be a
routine task. As stated at the beginning of this section, existing monitors are routine task. As stated at the beginning of this section, existing monitors are
not supposed to change their IP addresses. not supposed to change their IP addresses.
Continue with the monitor configuration in the example from :ref"`<Changing a Continue with the monitor configuration in the example from
Monitor's IP Address (Preferred Method)> :ref:`operations_add_or_rm_mons_changing_mon_ip`. Suppose that all of the
operations_add_or_rm_mons_changing_mon_ip` . Suppose that all of the monitors monitors are to be moved from the ``10.0.0.x`` range to the ``10.1.0.x`` range,
are to be moved from the ``10.0.0.x`` range to the ``10.1.0.x`` range, and that and that these networks are unable to communicate. Carry out the following
these networks are unable to communicate. Carry out the following procedure: procedure:
#. Retrieve the monitor map (``{tmp}`` is the path to the retrieved monitor #. Retrieve the monitor map (``{tmp}`` is the path to the retrieved monitor
map, and ``{filename}`` is the name of the file that contains the retrieved map, and ``{filename}`` is the name of the file that contains the retrieved
@ -448,7 +449,135 @@ and inject the modified monitor map into each new monitor.
Migration to the new location is now complete. The monitors should operate Migration to the new location is now complete. The monitors should operate
successfully. successfully.
Using cephadm to change the public network
==========================================
Overview
--------
The procedure in this overview section provides only the broad outlines of
using ``cephadm`` to change the public network.
#. Create backups of all keyrings, configuration files, and the current monmap, as sketched after this list.
#. Stop the cluster and disable ``ceph.target`` to prevent the daemons from
starting.
#. Move the servers and power them on.
#. Change the network setup as desired.
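A minimal sketch of the backup in the first step, assuming a cephadm deployment (the backup directory is arbitrary)::
    mkdir -p /root/net-change-backup
    cp -a /etc/ceph /root/net-change-backup/               # configs and keyrings
    ceph mon getmap -o /root/net-change-backup/monmap.bin  # current monmap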
Example Procedure
-----------------
.. note:: In this procedure, the "old network" has addresses of the form
``10.10.10.0/24`` and the "new network" has addresses of the form
``192.168.160.0/24``.
#. Enter the shell of the first monitor:
.. prompt:: bash #
cephadm shell --name mon.reef1
#. Extract the current monmap from ``mon.reef1``:
.. prompt:: bash #
ceph-mon -i reef1 --extract-monmap monmap
#. Print the content of the monmap:
.. prompt:: bash #
monmaptool --print monmap
::
monmaptool: monmap file monmap
epoch 5
fsid 2851404a-d09a-11ee-9aaa-fa163e2de51a
last_changed 2024-02-21T09:32:18.292040+0000
created 2024-02-21T09:18:27.136371+0000
min_mon_release 18 (reef)
election_strategy: 1
0: [v2:10.10.10.11:3300/0,v1:10.10.10.11:6789/0] mon.reef1
1: [v2:10.10.10.12:3300/0,v1:10.10.10.12:6789/0] mon.reef2
2: [v2:10.10.10.13:3300/0,v1:10.10.10.13:6789/0] mon.reef3
#. Remove monitors with old addresses:
.. prompt:: bash #
monmaptool --rm reef1 --rm reef2 --rm reef3 monmap
#. Add monitors with new addresses:
.. prompt:: bash #
monmaptool --addv reef1 [v2:192.168.160.11:3300/0,v1:192.168.160.11:6789/0] --addv reef2 [v2:192.168.160.12:3300/0,v1:192.168.160.12:6789/0] --addv reef3 [v2:192.168.160.13:3300/0,v1:192.168.160.13:6789/0] monmap
#. Verify that the changes to the monmap have been made successfully:
.. prompt:: bash #
monmaptool --print monmap
::
monmaptool: monmap file monmap
epoch 4
fsid 2851404a-d09a-11ee-9aaa-fa163e2de51a
last_changed 2024-02-21T09:32:18.292040+0000
created 2024-02-21T09:18:27.136371+0000
min_mon_release 18 (reef)
election_strategy: 1
0: [v2:192.168.160.11:3300/0,v1:192.168.160.11:6789/0] mon.reef1
1: [v2:192.168.160.12:3300/0,v1:192.168.160.12:6789/0] mon.reef2
2: [v2:192.168.160.13:3300/0,v1:192.168.160.13:6789/0] mon.reef3
#. Inject the new monmap into the Ceph cluster:
.. prompt:: bash #
ceph-mon -i reef1 --inject-monmap monmap
#. Repeat the steps above for all other monitors in the cluster.
#. Update ``/var/lib/ceph/{FSID}/mon.{MON}/config``.
#. Start the monitors.
#. Update the ceph ``public_network``:
.. prompt:: bash #
ceph config set mon public_network 192.168.160.0/24
#. Update the configuration files of the managers
(``/var/lib/ceph/{FSID}/mgr.{mgr}/config``) and start them. Orchestrator
will now be available, but it will attempt to connect to the old network
because the host list contains the old addresses.
#. Update the host addresses by running commands of the following form:
.. prompt:: bash #
ceph orch host set-addr reef1 192.168.160.11
ceph orch host set-addr reef2 192.168.160.12
ceph orch host set-addr reef3 192.168.160.13
#. Wait a few minutes for the orchestrator to connect to each host.
#. Reconfigure the OSDs so that their config files are automatically updated:
.. prompt:: bash #
ceph orch reconfig osd
*The above procedure was developed by Eugen Block and was successfully tested
in February 2024 on Ceph version 18.2.1 (Reef).*
.. _Manual Deployment: ../../../install/manual-deployment .. _Manual Deployment: ../../../install/manual-deployment
.. _Monitor Bootstrap: ../../../dev/mon-bootstrap .. _Monitor Bootstrap: ../../../dev/mon-bootstrap

View File

@ -474,27 +474,25 @@ following command:
ceph tell mds.{mds-id} config set {setting} {value} ceph tell mds.{mds-id} config set {setting} {value}
Example: Example: to enable debug messages, run the following command:
.. prompt:: bash $ .. prompt:: bash $
ceph tell mds.0 config set debug_ms 1 ceph tell mds.0 config set debug_ms 1
To enable debug messages, run the following command: To display the status of all metadata servers, run the following command:
.. prompt:: bash $ .. prompt:: bash $
ceph mds stat ceph mds stat
To display the status of all metadata servers, run the following command: To mark the active metadata server as failed (and to trigger failover to a
standby if a standby is present), run the following command:
.. prompt:: bash $ .. prompt:: bash $
ceph mds fail 0 ceph mds fail 0
To mark the active metadata server as failed (and to trigger failover to a
standby if a standby is present), run the following command:
.. todo:: ``ceph mds`` subcommands missing docs: set, dump, getmap, stop, setmap .. todo:: ``ceph mds`` subcommands missing docs: set, dump, getmap, stop, setmap

View File

@ -57,53 +57,62 @@ case for most clusters), its CRUSH location can be specified as follows::
``pod``, ``pdu``, ``rack``, ``chassis``, and ``host``. These defined ``pod``, ``pdu``, ``rack``, ``chassis``, and ``host``. These defined
types suffice for nearly all clusters, but can be customized by types suffice for nearly all clusters, but can be customized by
modifying the CRUSH map. modifying the CRUSH map.
#. Not all keys need to be specified. For example, by default, Ceph
automatically sets an ``OSD``'s location as ``root=default
host=HOSTNAME`` (as determined by the output of ``hostname -s``).
The CRUSH location for an OSD can be modified by adding the ``crush location`` The CRUSH location for an OSD can be set by adding the ``crush_location``
option in ``ceph.conf``. When this option has been added, every time the OSD option in ``ceph.conf``, for example::
crush_location = root=default row=a rack=a2 chassis=a2a host=a2a1
When this option has been added, every time the OSD
starts it verifies that it is in the correct location in the CRUSH map and starts it verifies that it is in the correct location in the CRUSH map and
moves itself if it is not. To disable this automatic CRUSH map management, add moves itself if it is not. To disable this automatic CRUSH map management, add
the following to the ``ceph.conf`` configuration file in the ``[osd]`` the following to the ``ceph.conf`` configuration file in the ``[osd]``
section:: section::
osd crush update on start = false osd_crush_update_on_start = false
Note that this action is unnecessary in most cases. Note that this action is unnecessary in most cases.
If the ``crush_location`` is not set explicitly,
a default of ``root=default host=HOSTNAME`` is used for ``OSD``s,
where the hostname is determined by the output of the ``hostname -s`` command.
.. note:: If you switch from this default to an explicitly set ``crush_location``,
do not forget to include ``root=default`` because existing CRUSH rules refer to it.
Custom location hooks Custom location hooks
--------------------- ---------------------
A custom location hook can be used to generate a more complete CRUSH location A custom location hook can be used to generate a more complete CRUSH location,
on startup. The CRUSH location is determined by, in order of preference: on startup.
#. A ``crush location`` option in ``ceph.conf`` This is useful when some location fields are not known at the time
#. A default of ``root=default host=HOSTNAME`` where the hostname is determined ``ceph.conf`` is written (for example, fields ``rack`` or ``datacenter``
by the output of the ``hostname -s`` command when deploying a single configuration across multiple datacenters).
A script can be written to provide additional location fields (for example, If configured, executed, and parsed successfully, the hook's output replaces
``rack`` or ``datacenter``) and the hook can be enabled via the following any previously set CRUSH location.
config option::
crush location hook = /path/to/customized-ceph-crush-location The hook hook can be enabled in ``ceph.conf`` by providing a path to an
executable file (often a script), example::
crush_location_hook = /path/to/customized-ceph-crush-location
This hook is passed several arguments (see below). The hook outputs a single This hook is passed several arguments (see below). The hook outputs a single
line to ``stdout`` that contains the CRUSH location description. The output line to ``stdout`` that contains the CRUSH location description. The arguments
resembles the following::: resemble the following::
--cluster CLUSTER --id ID --type TYPE --cluster CLUSTER --id ID --type TYPE
Here the cluster name is typically ``ceph``, the ``id`` is the daemon Here the cluster name is typically ``ceph``, the ``id`` is the daemon
identifier or (in the case of OSDs) the OSD number, and the daemon type is identifier or (in the case of OSDs) the OSD number, and the daemon type is
``osd``, ``mds, ``mgr``, or ``mon``. ``osd``, ``mds``, ``mgr``, or ``mon``.
For example, a simple hook that specifies a rack location via a value in the For example, a simple hook that specifies a rack location via a value in the
file ``/etc/rack`` might be as follows:: file ``/etc/rack`` (assuming it contains no spaces) might be as follows::
#!/bin/sh #!/bin/sh
echo "host=$(hostname -s) rack=$(cat /etc/rack) root=default" echo "root=default rack=$(cat /etc/rack) host=$(hostname -s)"
CRUSH structure CRUSH structure

View File

@ -96,7 +96,9 @@ Where:
``--force`` ``--force``
:Description: Override an existing profile by the same name, and allow :Description: Override an existing profile by the same name, and allow
setting a non-4K-aligned stripe_unit. setting a non-4K-aligned stripe_unit. Overriding an existing
profile can be dangerous, and thus ``--yes-i-really-mean-it``
must be used as well.
:Type: String :Type: String
:Required: No. :Required: No.
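For illustration, overriding an existing profile therefore looks something like this (the profile name and parameters are placeholders)::
    ceph osd erasure-code-profile set myprofile k=4 m=2 crush-failure-domain=host --force --yes-i-really-mean-it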

View File

@ -179,6 +179,8 @@ This can be enabled only on a pool residing on BlueStore OSDs, since
BlueStore's checksumming is used during deep scrubs to detect bitrot BlueStore's checksumming is used during deep scrubs to detect bitrot
or other corruption. Using Filestore with EC overwrites is not only or other corruption. Using Filestore with EC overwrites is not only
unsafe, but it also results in lower performance compared to BlueStore. unsafe, but it also results in lower performance compared to BlueStore.
Moreover, Filestore is deprecated and any Filestore OSDs in your cluster
should be migrated to BlueStore.
Erasure-coded pools do not support omap, so to use them with RBD and Erasure-coded pools do not support omap, so to use them with RBD and
CephFS you must instruct them to store their data in an EC pool and CephFS you must instruct them to store their data in an EC pool and
@ -192,6 +194,182 @@ erasure-coded pool as the ``--data-pool`` during image creation:
For CephFS, an erasure-coded pool can be set as the default data pool during For CephFS, an erasure-coded pool can be set as the default data pool during
file system creation or via `file layouts <../../../cephfs/file-layouts>`_. file system creation or via `file layouts <../../../cephfs/file-layouts>`_.
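To recap the full workflow in one place, a hedged sketch with placeholder pool, image, and directory names::
    ceph osd pool set ec_pool allow_ec_overwrites true               # required for RBD and CephFS data
    rbd create --size 10G --data-pool ec_pool rbd_pool/my_image      # image data in the EC pool
    setfattr -n ceph.dir.layout.pool -v ec_pool /mnt/cephfs/ec_dir   # CephFS file layout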
Erasure-coded pool overhead
---------------------------
The overhead factor (space amplification) of an erasure-coded pool
is `(k+m) / k`. For a 4,2 profile, the overhead is
thus 1.5, which means that 1.5 GiB of underlying storage are used to store
1 GiB of user data. Contrast with default three-way replication, with
which the overhead factor is 3.0. Do not mistake erasure coding for a free
lunch: there is a significant performance tradeoff, especially when using HDDs
and when performing cluster recovery or backfill.
Below is a table showing the overhead factors for various values of `k` and `m`.
As `m` increases above 2, the incremental capacity overhead gain quickly
experiences diminishing returns but the performance impact grows proportionally.
We recommend that you do not choose a profile with `k` > 4 or `m` > 2 until
and unless you fully understand the ramifications, including the number of
failure domains your cluster topology must contain. If you choose `m=1`,
expect data unavailability during maintenance and data loss if component
failures overlap.
.. list-table:: Erasure coding overhead
:widths: 4 4 4 4 4 4 4 4 4 4 4 4
:header-rows: 1
:stub-columns: 1
* -
- m=1
- m=2
- m=3
- m=4
- m=5
- m=6
- m=7
- m=8
- m=9
- m=10
- m=11
* - k=1
- 2.00
- 3.00
- 4.00
- 5.00
- 6.00
- 7.00
- 8.00
- 9.00
- 10.00
- 11.00
- 12.00
* - k=2
- 1.50
- 2.00
- 2.50
- 3.00
- 3.50
- 4.00
- 4.50
- 5.00
- 5.50
- 6.00
- 6.50
* - k=3
- 1.33
- 1.67
- 2.00
- 2.33
- 2.67
- 3.00
- 3.33
- 3.67
- 4.00
- 4.33
- 4.67
* - k=4
- 1.25
- 1.50
- 1.75
- 2.00
- 2.25
- 2.50
- 2.75
- 3.00
- 3.25
- 3.50
- 3.75
* - k=5
- 1.20
- 1.40
- 1.60
- 1.80
- 2.00
- 2.20
- 2.40
- 2.60
- 2.80
- 3.00
- 3.20
* - k=6
- 1.17
- 1.33
- 1.50
- 1.67
- 1.83
- 2.00
- 2.17
- 2.33
- 2.50
- 2.67
- 2.83
* - k=7
- 1.14
- 1.29
- 1.43
- 1.57
- 1.71
- 1.86
- 2.00
- 2.14
- 2.29
- 2.43
- 2.57
* - k=8
- 1.13
- 1.25
- 1.38
- 1.50
- 1.63
- 1.75
- 1.88
- 2.00
- 2.13
- 2.25
- 2.38
* - k=9
- 1.11
- 1.22
- 1.33
- 1.44
- 1.56
- 1.67
- 1.78
- 1.89
- 2.00
- 2.11
- 2.22
* - k=10
- 1.10
- 1.20
- 1.30
- 1.40
- 1.50
- 1.60
- 1.70
- 1.80
- 1.90
- 2.00
- 2.10
* - k=11
- 1.09
- 1.18
- 1.27
- 1.36
- 1.45
- 1.55
- 1.64
- 1.73
- 1.82
- 1.91
- 2.00
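The table values follow directly from the formula above; a small shell sketch for arbitrary profiles, where the ``k``, ``m``, and raw-capacity figures are placeholders::
    k=4; m=2; raw_tib=100
    awk -v k=$k -v m=$m 'BEGIN { printf "overhead factor: %.2f\n", (k+m)/k }'
    awk -v k=$k -v m=$m -v r=$raw_tib 'BEGIN { printf "usable TiB from %d TiB raw: %.1f\n", r, r*k/(k+m) }'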
Erasure-coded pools and cache tiering Erasure-coded pools and cache tiering
------------------------------------- -------------------------------------

View File

@ -21,6 +21,7 @@ and, monitoring an operating cluster.
monitoring-osd-pg monitoring-osd-pg
user-management user-management
pg-repair pg-repair
pgcalc/index
.. raw:: html .. raw:: html

View File

@ -517,6 +517,8 @@ multiple monitors are running to ensure proper functioning of your Ceph
cluster. Check monitor status regularly in order to ensure that all of the cluster. Check monitor status regularly in order to ensure that all of the
monitors are running. monitors are running.
.. _display-mon-map:
To display the monitor map, run the following command: To display the monitor map, run the following command:
.. prompt:: bash $ .. prompt:: bash $

View File

@ -0,0 +1,68 @@
.. _pgcalc:
=======
PG Calc
=======
.. raw:: html
<link rel="stylesheet" id="wp-job-manager-job-listings-css" href="https://web.archive.org/web/20230614135557cs_/https://old.ceph.com/wp-content/plugins/wp-job-manager/assets/dist/css/job-listings.css" type="text/css" media="all"/>
<link rel="stylesheet" id="ceph/googlefont-css" href="https://web.archive.org/web/20230614135557cs_/https://fonts.googleapis.com/css?family=Raleway%3A300%2C400%2C700&amp;ver=5.7.2" type="text/css" media="all"/>
<link rel="stylesheet" id="Stylesheet-css" href="https://web.archive.org/web/20230614135557cs_/https://old.ceph.com/wp-content/themes/cephTheme/Resources/Styles/style.min.css" type="text/css" media="all"/>
<link rel="stylesheet" id="tablepress-default-css" href="https://web.archive.org/web/20230614135557cs_/https://old.ceph.com/wp-content/plugins/tablepress/css/default.min.css" type="text/css" media="all"/>
<link rel="stylesheet" id="jetpack_css-css" href="https://web.archive.org/web/20230614135557cs_/https://old.ceph.com/wp-content/plugins/jetpack/css/jetpack.css" type="text/css" media="all"/>
<script type="text/javascript" src="https://web.archive.org/web/20230614135557js_/https://old.ceph.com/wp-content/themes/cephTheme/foundation_framework/js/vendor/jquery.js" id="jquery-js"></script>
<link rel="stylesheet" href="https://web.archive.org/web/20230614135557cs_/https://ajax.googleapis.com/ajax/libs/jqueryui/1.11.2/themes/smoothness/jquery-ui.css"/>
<link rel="stylesheet" href="https://web.archive.org/web/20230614135557cs_/https://old.ceph.com/pgcalc_assets/pgcalc.css"/>
<script src="https://ajax.googleapis.com/ajax/libs/jqueryui/1.11.2/jquery-ui.min.js"></script>
<script src="../../../_static/js/pgcalc.js"></script>
<div id="pgcalcdiv">
<div id="instructions">
<h2>Ceph PGs per Pool Calculator</h2><br/><fieldset><legend>Instructions</legend>
<ol>
<li>Confirm your understanding of the fields by reading through the Key below.</li>
<li>Select a <b>"Ceph Use Case"</b> from the drop down menu.</li>
<li>Adjust the values in the <span class="inputColor addBorder" style="font-weight: bold;">"Green"</span> shaded fields below.<br/>
<b>Tip:</b> Headers can be clicked to change the value throughout the table.</li>
<li>You will see the Suggested PG Count update based on your inputs.</li>
<li>Click the <b>"Add Pool"</b> button to create a new line for a new pool.</li>
<li>Click the <span class="ui-icon ui-icon-trash" style="display:inline-block;"></span> icon to delete the specific Pool.</li>
<li>For more details on the logic used and some important details, see the area below the table.</li>
<li>Once all values have been adjusted, click the <b>"Generate Commands"</b> button to get the pool creation commands.</li>
</ol></fieldset>
</div>
<div id="beforeTable"></div>
<br/>
<p class="validateTips">&nbsp;</p>
<label for="presetType">Ceph Use Case Selector:</label><br/><select id="presetType"></select><button style="margin-left: 200px;" id="btnAddPool" type="button">Add Pool</button><button type="button" id="btnGenCommands" download="commands.txt">Generate Commands</button>
<div id="pgsPerPoolTable">
<table id="pgsperpool">
</table>
</div> <!-- id = pgsPerPoolTable -->
<br/>
<div id="afterTable"></div>
<div id="countLogic"><fieldset><legend>Logic behind Suggested PG Count</legend>
<br/>
<div class="upperFormula">( Target PGs per OSD ) x ( OSD # ) x ( %Data )</div>
<div class="lowerFormula">( Size )</div>
<ol id="countLogicList">
<li>If the value of the above calculation is less than the value of <b>( OSD# ) / ( Size )</b>, then the value is updated to the value of <b>( OSD# ) / ( Size )</b>. This is to ensure even load / data distribution by allocating at least one Primary or Secondary PG to every OSD for every Pool.</li>
<li>The output value is then rounded to the <b>nearest power of 2</b>.<br/><b>Tip:</b> The nearest power of 2 provides a marginal improvement in efficiency of the <a href="https://web.archive.org/web/20230614135557/http://ceph.com/docs/master/rados/operations/crush-map/" title="CRUSH Map Details">CRUSH</a> algorithm.</li>
<li>If the nearest power of 2 is more than <b>25%</b> below the original value, the next higher power of 2 is used.</li>
</ol>
<b>Objective</b>
<ul><li>The objective of this calculation and the target ranges noted in the &quot;Key&quot; section above are to ensure that there are sufficient Placement Groups for even data distribution throughout the cluster, while not going high enough on the PG per OSD ratio to cause problems during Recovery and/or Backfill operations.</li></ul>
<b>Effects of empty or non-active pools:</b>
<ul>
<li>Empty or otherwise non-active pools should not be considered helpful toward even data distribution throughout the cluster.</li>
<li>However, the PGs associated with these empty / non-active pools still consume memory and CPU overhead.</li>
</ul>
</fieldset>
</div>
<div id="commands" title="Pool Creation Commands"><code><pre id="commandCode"></pre></code></div>
</div>

View File

@ -4,6 +4,21 @@
Placement Groups Placement Groups
================== ==================
Placement groups (PGs) are subsets of each logical Ceph pool. Placement groups
perform the function of placing objects (as a group) into OSDs. Ceph manages
data internally at placement-group granularity: this scales better than would
managing individual RADOS objects. A cluster that has a larger number of
placement groups (for example, 150 per OSD) is better balanced than an
otherwise identical cluster with a smaller number of placement groups.
Ceph's internal RADOS objects are each mapped to a specific placement group,
and each placement group belongs to exactly one Ceph pool.
See Sage Weil's blog post `New in Nautilus: PG merging and autotuning
<https://ceph.io/en/news/blog/2019/new-in-nautilus-pg-merging-and-autotuning/>`_
for more information about the relationship of placement groups to pools and to
objects.
.. _pg-autoscaler: .. _pg-autoscaler:
Autoscaling placement groups Autoscaling placement groups
@ -131,11 +146,11 @@ The output will resemble the following::
if a ``pg_num`` change is in progress, the current number of PGs that the if a ``pg_num`` change is in progress, the current number of PGs that the
pool is working towards. pool is working towards.
- **NEW PG_NUM** (if present) is the value that the system is recommending the - **NEW PG_NUM** (if present) is the value that the system recommends that the
``pg_num`` of the pool to be changed to. It is always a power of 2, and it is ``pg_num`` of the pool should be. It is always a power of two, and it
present only if the recommended value varies from the current value by more is present only if the recommended value varies from the current value by
than the default factor of ``3``. To adjust this factor (in the following more than the default factor of ``3``. To adjust this multiple (in the
example, it is changed to ``2``), run the following command: following example, it is changed to ``2``), run the following command:
.. prompt:: bash # .. prompt:: bash #
@ -168,7 +183,6 @@ The output will resemble the following::
.. prompt:: bash # .. prompt:: bash #
ceph osd pool set .mgr crush_rule replicated-ssd ceph osd pool set .mgr crush_rule replicated-ssd
ceph osd pool set pool 1 crush_rule to replicated-ssd
This intervention will result in a small amount of backfill, but This intervention will result in a small amount of backfill, but
typically this traffic completes quickly. typically this traffic completes quickly.
@ -626,15 +640,14 @@ pools, each with 512 PGs on 10 OSDs, the OSDs will have to handle ~50,000 PGs
each. This cluster will require significantly more resources and significantly each. This cluster will require significantly more resources and significantly
more time for peering. more time for peering.
For determining the optimal number of PGs per OSD, we recommend the `PGCalc`_
tool.
.. _setting the number of placement groups: .. _setting the number of placement groups:
Setting the Number of PGs Setting the Number of PGs
========================= =========================
For help determining the optimal number of PGs per OSD, see :ref:`PG Calc <pgcalc>`.
Setting the initial number of PGs in a pool must be done at the time you create Setting the initial number of PGs in a pool must be done at the time you create
the pool. See `Create a Pool`_ for details. the pool. See `Create a Pool`_ for details.
@ -894,4 +907,3 @@ about it entirely (if it is too new to have a previous version). To mark the
.. _Create a Pool: ../pools#createpool .. _Create a Pool: ../pools#createpool
.. _Mapping PGs to OSDs: ../../../architecture#mapping-pgs-to-osds .. _Mapping PGs to OSDs: ../../../architecture#mapping-pgs-to-osds
.. _pgcalc: https://old.ceph.com/pgcalc/

View File

@ -18,15 +18,17 @@ Pools provide:
<../erasure-code>`_, resilience is defined as the number of coding chunks <../erasure-code>`_, resilience is defined as the number of coding chunks
(for example, ``m = 2`` in the default **erasure code profile**). (for example, ``m = 2`` in the default **erasure code profile**).
- **Placement Groups**: You can set the number of placement groups (PGs) for - **Placement Groups**: The :ref:`autoscaler <pg-autoscaler>` sets the number
the pool. In a typical configuration, the target number of PGs is of placement groups (PGs) for the pool. In a typical configuration, the
approximately one hundred PGs per OSD. This provides reasonable balancing target number of PGs is approximately one hundred and fifty PGs per OSD. This
without consuming excessive computing resources. When setting up multiple provides reasonable balancing without consuming excessive computing
pools, be careful to set an appropriate number of PGs for each pool and for resources. When setting up multiple pools, set an appropriate number of PGs
the cluster as a whole. Each PG belongs to a specific pool: when multiple for each pool and for the cluster as a whole. Each PG belongs to a specific
pools use the same OSDs, make sure that the **sum** of PG replicas per OSD is pool: when multiple pools use the same OSDs, make sure that the **sum** of PG
in the desired PG-per-OSD target range. To calculate an appropriate number of replicas per OSD is in the desired PG-per-OSD target range. See :ref:`Setting
PGs for your pools, use the `pgcalc`_ tool. the Number of Placement Groups <setting the number of placement groups>` for
instructions on how to manually set the number of placement groups per pool
(this procedure works only when the autoscaler is not used).
- **CRUSH Rules**: When data is stored in a pool, the placement of the object - **CRUSH Rules**: When data is stored in a pool, the placement of the object
and its replicas (or chunks, in the case of erasure-coded pools) in your and its replicas (or chunks, in the case of erasure-coded pools) in your
@ -94,19 +96,12 @@ To get even more information, you can execute this command with the ``--format``
Creating a Pool Creating a Pool
=============== ===============
Before creating a pool, consult `Pool, PG and CRUSH Config Reference`_. Your Before creating a pool, consult `Pool, PG and CRUSH Config Reference`_. The
Ceph configuration file contains a setting (namely, ``pg_num``) that determines Ceph central configuration database in the monitor cluster contains a setting
the number of PGs. However, this setting's default value is NOT appropriate (namely, ``pg_num``) that determines the number of PGs per pool when a pool has
for most systems. In most cases, you should override this default value when been created and no per-pool value has been specified. It is possible to change
creating your pool. For details on PG numbers, see `setting the number of this value from its default. For more on the subject of setting the number of
placement groups`_ PGs per pool, see `setting the number of placement groups`_.
For example:
.. prompt:: bash $
osd_pool_default_pg_num = 128
osd_pool_default_pgp_num = 128
.. note:: In Luminous and later releases, each pool must be associated with the .. note:: In Luminous and later releases, each pool must be associated with the
application that will be using the pool. For more information, see application that will be using the pool. For more information, see
@ -742,8 +737,6 @@ Managing pools that are flagged with ``--bulk``
=============================================== ===============================================
See :ref:`managing_bulk_flagged_pools`. See :ref:`managing_bulk_flagged_pools`.
.. _pgcalc: https://old.ceph.com/pgcalc/
.. _Pool, PG and CRUSH Config Reference: ../../configuration/pool-pg-config-ref .. _Pool, PG and CRUSH Config Reference: ../../configuration/pool-pg-config-ref
.. _Bloom Filter: https://en.wikipedia.org/wiki/Bloom_filter .. _Bloom Filter: https://en.wikipedia.org/wiki/Bloom_filter
.. _setting the number of placement groups: ../placement-groups#set-the-number-of-placement-groups .. _setting the number of placement groups: ../placement-groups#set-the-number-of-placement-groups

View File

@ -121,8 +121,6 @@ your CRUSH map. This procedure shows how to do this.
rule stretch_rule { rule stretch_rule {
id 1 id 1
min_size 1
max_size 10
type replicated type replicated
step take site1 step take site1
step chooseleaf firstn 2 type host step chooseleaf firstn 2 type host
@ -141,11 +139,15 @@ your CRUSH map. This procedure shows how to do this.
#. Run the monitors in connectivity mode. See `Changing Monitor Elections`_. #. Run the monitors in connectivity mode. See `Changing Monitor Elections`_.
.. prompt:: bash $
ceph mon set election_strategy connectivity
#. Command the cluster to enter stretch mode. In this example, ``mon.e`` is the #. Command the cluster to enter stretch mode. In this example, ``mon.e`` is the
tiebreaker monitor and we are splitting across data centers. The tiebreaker tiebreaker monitor and we are splitting across data centers. The tiebreaker
monitor must be assigned a data center that is neither ``site1`` nor monitor must be assigned a data center that is neither ``site1`` nor
``site2``. For this purpose you can create another data-center bucket named ``site2``. This data center **should not** be defined in your CRUSH map, here
``site3`` in your CRUSH and place ``mon.e`` there: we are placing ``mon.e`` in a virtual data center called ``site3``:
.. prompt:: bash $ .. prompt:: bash $

View File

@ -175,17 +175,19 @@ For each subsystem, there is a logging level for its output logs (a so-called
"log level") and a logging level for its in-memory logs (a so-called "memory "log level") and a logging level for its in-memory logs (a so-called "memory
level"). Different values may be set for these two logging levels in each level"). Different values may be set for these two logging levels in each
subsystem. Ceph's logging levels operate on a scale of ``1`` to ``20``, where subsystem. Ceph's logging levels operate on a scale of ``1`` to ``20``, where
``1`` is terse and ``20`` is verbose [#f1]_. As a general rule, the in-memory ``1`` is terse and ``20`` is verbose. In certain rare cases, there are logging
logs are not sent to the output log unless one or more of the following levels that can take a value greater than 20. The resulting logs are extremely
conditions obtain: verbose.
- a fatal signal is raised or The in-memory logs are not sent to the output log unless one or more of the
- an ``assert`` in source code is triggered or following conditions are true:
- upon requested. Please consult `document on admin socket
<http://docs.ceph.com/en/latest/man/8/ceph/#daemon>`_ for more details.
.. warning :: - a fatal signal has been raised or
.. [#f1] In certain rare cases, there are logging levels that can take a value greater than 20. The resulting logs are extremely verbose. - an assertion within Ceph code has been triggered or
- the sending of in-memory logs to the output log has been manually triggered.
Consult `the portion of the "Ceph Administration Tool documentation
that provides an example of how to submit admin socket commands
<http://docs.ceph.com/en/latest/man/8/ceph/#daemon>`_ for more detail.
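The third condition can be triggered by hand through the admin socket; for example (the daemon name is a placeholder)::
    ceph daemon osd.0 log dump    # write the daemon's in-memory log to its log file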
Log levels and memory levels can be set either together or separately. If a Log levels and memory levels can be set either together or separately. If a
subsystem is assigned a single value, then that value determines both the log subsystem is assigned a single value, then that value determines both the log

View File

@ -85,23 +85,27 @@ Using the monitor's admin socket
================================ ================================
A monitor's admin socket allows you to interact directly with a specific daemon A monitor's admin socket allows you to interact directly with a specific daemon
by using a Unix socket file. This file is found in the monitor's ``run`` by using a Unix socket file. This socket file is found in the monitor's ``run``
directory. The admin socket's default directory is directory.
``/var/run/ceph/ceph-mon.ID.asok``, but this can be overridden and the admin
socket might be elsewhere, especially if your cluster's daemons are deployed in The admin socket's default directory is ``/var/run/ceph/ceph-mon.ID.asok``. It
containers. If you cannot find it, either check your ``ceph.conf`` for an is possible to override the admin socket's default location. If the default
alternative path or run the following command: location has been overridden, then the admin socket will be elsewhere. This is
often the case when a cluster's daemons are deployed in containers.
To find the directory of the admin socket, check either your ``ceph.conf`` for
an alternative path or run the following command:
.. prompt:: bash $ .. prompt:: bash $
ceph-conf --name mon.ID --show-config-value admin_socket ceph-conf --name mon.ID --show-config-value admin_socket
The admin socket is available for use only when the monitor daemon is running. The admin socket is available for use only when the Monitor daemon is running.
Whenever the monitor has been properly shut down, the admin socket is removed. Every time the Monitor is properly shut down, the admin socket is removed. If
However, if the monitor is not running and the admin socket persists, it is the Monitor is not running and yet the admin socket persists, it is likely that
likely that the monitor has been improperly shut down. In any case, if the the Monitor has been improperly shut down. If the Monitor is not running, it
monitor is not running, it will be impossible to use the admin socket, and the will be impossible to use the admin socket, and the ``ceph`` command is likely
``ceph`` command is likely to return ``Error 111: Connection Refused``. to return ``Error 111: Connection Refused``.
To access the admin socket, run a ``ceph tell`` command of the following form To access the admin socket, run a ``ceph tell`` command of the following form
(specifying the daemon that you are interested in): (specifying the daemon that you are interested in):
@ -110,7 +114,7 @@ To access the admin socket, run a ``ceph tell`` command of the following form
ceph tell mon.<id> mon_status ceph tell mon.<id> mon_status
This command passes a ``help`` command to the specific running monitor daemon This command passes a ``mon_status`` command to the specified running Monitor daemon
``<id>`` via its admin socket. If you know the full path to the admin socket ``<id>`` via its admin socket. If you know the full path to the admin socket
file, this can be done more directly by running the following command: file, this can be done more directly by running the following command:
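The following is a sketch of that more direct form. It assumes the default admin socket location and a Monitor named ``a``:

.. prompt:: bash

   ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok mon_status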
@ -127,10 +131,11 @@ and ``quorum_status``.
Understanding mon_status Understanding mon_status
======================== ========================
The status of the monitor (as reported by the ``ceph tell mon.X mon_status`` The status of a Monitor (as reported by the ``ceph tell mon.X mon_status``
command) can always be obtained via the admin socket. This command outputs a command) can be obtained via the admin socket. The ``ceph tell mon.X
great deal of information about the monitor (including the information found in mon_status`` command outputs a great deal of information about the monitor
the output of the ``quorum_status`` command). (including the information found in the output of the ``quorum_status``
command).
To understand this command's output, let us consider the following example, in To understand this command's output, let us consider the following example, in
which we see the output of ``ceph tell mon.c mon_status``:: which we see the output of ``ceph tell mon.c mon_status``::
@ -160,29 +165,34 @@ which we see the output of ``ceph tell mon.c mon_status``::
"name": "c", "name": "c",
"addr": "127.0.0.1:6795\/0"}]}} "addr": "127.0.0.1:6795\/0"}]}}
It is clear that there are three monitors in the monmap (*a*, *b*, and *c*), This output reports that there are three monitors in the monmap (*a*, *b*, and
the quorum is formed by only two monitors, and *c* is in the quorum as a *c*), that quorum is formed by only two monitors, and that *c* is in quorum as
*peon*. a *peon*.
**Which monitor is out of the quorum?** **Which monitor is out of quorum?**
The answer is **a** (that is, ``mon.a``). The answer is **a** (that is, ``mon.a``). ``mon.a`` is out of quorum.
**Why?** **How do we know, in this example, that mon.a is out of quorum?**
When the ``quorum`` set is examined, there are clearly two monitors in the We know that ``mon.a`` is out of quorum because it has rank 0, and rank 0 does
set: *1* and *2*. But these are not monitor names. They are monitor ranks, as not appear in the ``quorum`` set, as explained below.
established in the current ``monmap``. The ``quorum`` set does not include
the monitor that has rank 0, and according to the ``monmap`` that monitor is If we examine the ``quorum`` set, we can see that there are clearly two
``mon.a``. monitors in the set: *1* and *2*. But these are not monitor names. They are
monitor ranks, as established in the current ``monmap``. The ``quorum`` set
does not include the monitor that has rank 0, and according to the ``monmap``
that monitor is ``mon.a``.
**How are monitor ranks determined?** **How are monitor ranks determined?**
Monitor ranks are calculated (or recalculated) whenever monitors are added or Monitor ranks are calculated (or recalculated) whenever monitors are added to
removed. The calculation of ranks follows a simple rule: the **greater** the or removed from the cluster. The calculation of ranks follows a simple rule:
``IP:PORT`` combination, the **lower** the rank. In this case, because the **greater** the ``IP:PORT`` combination, the **lower** the rank. In this
``127.0.0.1:6789`` is lower than the other two ``IP:PORT`` combinations, case, because ``127.0.0.1:6789`` (``mon.a``) is numerically less than the
``mon.a`` has the highest rank: namely, rank 0. other two ``IP:PORT`` combinations (which are ``127.0.0.1:6790`` for "Monitor
b" and ``127.0.0.1:6795`` for "Monitor c"), ``mon.a`` has the highest rank:
namely, rank 0.
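To see the ranks and ``IP:PORT`` combinations recorded in the current ``monmap``, the monmap can be printed directly. A minimal sketch, assuming the cluster is reachable:

.. prompt:: bash

   # each line of the output pairs a rank with an address and a Monitor name
   ceph mon dump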
Most Common Monitor Issues Most Common Monitor Issues
@ -250,14 +260,15 @@ detail`` returns a message similar to the following::
Monitors at a wrong address. ``mon_status`` outputs the ``monmap`` that is Monitors at a wrong address. ``mon_status`` outputs the ``monmap`` that is
known to the monitor: determine whether the other Monitors' locations as known to the monitor: determine whether the other Monitors' locations as
specified in the ``monmap`` match the locations of the Monitors in the specified in the ``monmap`` match the locations of the Monitors in the
network. If they do not, see `Recovering a Monitor's Broken monmap`_. network. If they do not, see :ref:`Recovering a Monitor's Broken monmap
If the locations of the Monitors as specified in the ``monmap`` match the <rados_troubleshooting_troubleshooting_mon_recovering_broken_monmap>`. If
locations of the Monitors in the network, then the persistent the locations of the Monitors as specified in the ``monmap`` match the
``probing`` state could be related to severe clock skews amongst the monitor locations of the Monitors in the network, then the persistent ``probing``
nodes. See `Clock Skews`_. If the information in `Clock Skews`_ does not state could be related to severe clock skews among the monitor nodes. See
bring the Monitor out of the ``probing`` state, then prepare your system logs `Clock Skews`_. If the information in `Clock Skews`_ does not bring the
and ask the Ceph community for help. See `Preparing your logs`_ for Monitor out of the ``probing`` state, then prepare your system logs and ask
information about the proper preparation of logs. the Ceph community for help. See `Preparing your logs`_ for information about
the proper preparation of logs.
**What does it mean when a Monitor's state is ``electing``?** **What does it mean when a Monitor's state is ``electing``?**
@ -314,13 +325,16 @@ detail`` returns a message similar to the following::
substantiate it. See `Preparing your logs`_ for information about the substantiate it. See `Preparing your logs`_ for information about the
proper preparation of logs. proper preparation of logs.
.. _rados_troubleshooting_troubleshooting_mon_recovering_broken_monmap:
Recovering a Monitor's Broken ``monmap`` Recovering a Monitor's Broken "monmap"
---------------------------------------- --------------------------------------
This is how a ``monmap`` usually looks, depending on the number of A monmap can be retrieved by using a command of the form ``ceph tell mon.c
monitors:: mon_status``, as described in :ref:`Understanding mon_status
<rados_troubleshoting_troubleshooting_mon_understanding_mon_status>`.
Here is an example of a ``monmap``::
epoch 3 epoch 3
fsid 5c4e9d53-e2e1-478a-8061-f543f8be4cf8 fsid 5c4e9d53-e2e1-478a-8061-f543f8be4cf8
@ -330,60 +344,63 @@ monitors::
1: 127.0.0.1:6790/0 mon.b 1: 127.0.0.1:6790/0 mon.b
2: 127.0.0.1:6795/0 mon.c 2: 127.0.0.1:6795/0 mon.c
This may not be what you have however. For instance, in some versions of This ``monmap`` is in working order, but your ``monmap`` might not be in
early Cuttlefish there was a bug that could cause your ``monmap`` working order. The ``monmap`` in a given node might be outdated because the
to be nullified. Completely filled with zeros. This means that not even node was down for a long time, during which the cluster's Monitors changed.
``monmaptool`` would be able to make sense of cold, hard, inscrutable zeros.
It's also possible to end up with a monitor with a severely outdated monmap,
notably if the node has been down for months while you fight with your vendor's
TAC. The subject ``ceph-mon`` daemon might be unable to find the surviving
monitors (e.g., say ``mon.c`` is down; you add a new monitor ``mon.d``,
then remove ``mon.a``, then add a new monitor ``mon.e`` and remove
``mon.b``; you will end up with a totally different monmap from the one
``mon.c`` knows).
In this situation you have two possible solutions: There are two ways to update a Monitor's outdated ``monmap``:
Scrap the monitor and redeploy A. **Scrap the monitor and redeploy.**
You should only take this route if you are positive that you won't Do this only if you are certain that you will not lose the information kept
lose the information kept by that monitor; that you have other monitors by the Monitor that you scrap. Make sure that you have other Monitors in
and that they are running just fine so that your new monitor is able good condition, so that the new Monitor will be able to synchronize with
to synchronize from the remaining monitors. Keep in mind that destroying the surviving Monitors. Remember that destroying a Monitor can lead to data
a monitor, if there are no other copies of its contents, may lead to loss if there are no other copies of the Monitor's contents.
loss of data.
Inject a monmap into the monitor B. **Inject a monmap into the monitor.**
These are the basic steps: It is possible to fix a Monitor that has an outdated ``monmap`` by
retrieving an up-to-date ``monmap`` from surviving Monitors in the cluster
Retrieve the ``monmap`` from the surviving monitors and inject it into the and injecting it into the Monitor that has a corrupted or missing
monitor whose ``monmap`` is corrupted or lost. ``monmap``.
Implement this solution by carrying out the following procedure: Implement this solution by carrying out the following procedure:
1. Is there a quorum of monitors? If so, retrieve the ``monmap`` from the #. Retrieve the ``monmap`` in one of the two following ways:
quorum::
$ ceph mon getmap -o /tmp/monmap a. **IF THERE IS A QUORUM OF MONITORS:**
2. If there is no quorum, then retrieve the ``monmap`` directly from another Retrieve the ``monmap`` from the quorum:
monitor that has been stopped (in this example, the other monitor has
the ID ``ID-FOO``)::
$ ceph-mon -i ID-FOO --extract-monmap /tmp/monmap .. prompt:: bash
3. Stop the monitor you are going to inject the monmap into. ceph mon getmap -o /tmp/monmap
4. Inject the monmap:: b. **IF THERE IS NO QUORUM OF MONITORS:**
$ ceph-mon -i ID --inject-monmap /tmp/monmap Retrieve the ``monmap`` directly from a Monitor that has been stopped
:
5. Start the monitor .. prompt:: bash
.. warning:: Injecting ``monmaps`` can cause serious problems because doing ceph-mon -i ID-FOO --extract-monmap /tmp/monmap
so will overwrite the latest existing ``monmap`` stored on the monitor. Be
careful! In this example, the ID of the stopped Monitor is ``ID-FOO``.
#. Stop the Monitor into which the ``monmap`` will be injected.
#. Inject the monmap into the stopped Monitor:
.. prompt:: bash
ceph-mon -i ID --inject-monmap /tmp/monmap
#. Start the Monitor.
.. warning:: Injecting a ``monmap`` into a Monitor can cause serious
problems. Injecting a ``monmap`` overwrites the latest existing
``monmap`` stored on the monitor. Be careful!
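Before injecting a retrieved ``monmap``, it can be worth sanity-checking its contents. A sketch, assuming the file was saved to ``/tmp/monmap`` as in the procedure above:

.. prompt:: bash

   # print the epoch, fsid, and per-rank addresses contained in the file
   monmaptool --print /tmp/monmap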
Clock Skews Clock Skews
----------- -----------
@ -464,12 +481,13 @@ Clock Skew Questions and Answers
Client Can't Connect or Mount Client Can't Connect or Mount
----------------------------- -----------------------------
Check your IP tables. Some operating-system install utilities add a ``REJECT`` If a client can't connect to the cluster or mount, check your iptables. Some
rule to ``iptables``. ``iptables`` rules will reject all clients other than operating-system install utilities add a ``REJECT`` rule to ``iptables``.
``ssh`` that try to connect to the host. If your monitor host's IP tables have ``iptables`` rules will reject all clients other than ``ssh`` that try to
a ``REJECT`` rule in place, clients that are connecting from a separate node connect to the host. If your monitor host's iptables have a ``REJECT`` rule in
will fail and will raise a timeout error. Any ``iptables`` rules that reject place, clients that connect from a separate node will fail, and this will raise
clients trying to connect to Ceph daemons must be addressed. For example:: a timeout error. Look for ``iptables`` rules that reject clients that are
trying to connect to Ceph daemons. For example::
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
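One possible remedy, shown here only as a sketch, is to insert an ``ACCEPT`` rule for the Ceph ports ahead of the ``REJECT`` rule. The ports below (``3300`` and ``6789`` for Monitors, ``6800-7300`` for other daemons) are the defaults; adjust them to your own deployment and network policy:

.. prompt:: bash

   # run as root on the Monitor host; place the rule before the REJECT rule
   iptables -I INPUT -p tcp -m multiport --dports 3300,6789,6800:7300 -j ACCEPT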
@ -487,9 +505,9 @@ Monitor Store Failures
Symptoms of store corruption Symptoms of store corruption
---------------------------- ----------------------------
Ceph monitors store the :term:`Cluster Map` in a key-value store. If key-value Ceph Monitors maintain the :term:`Cluster Map` in a key-value store. If
store corruption causes a monitor to fail, then the monitor log might contain key-value store corruption causes a Monitor to fail, then the Monitor log might
one of the following error messages:: contain one of the following error messages::
Corruption: error in middle of record Corruption: error in middle of record
@ -500,10 +518,10 @@ or::
Recovery using healthy monitor(s) Recovery using healthy monitor(s)
--------------------------------- ---------------------------------
If there are surviving monitors, we can always :ref:`replace If the cluster contains surviving Monitors, the corrupted Monitor can be
<adding-and-removing-monitors>` the corrupted monitor with a new one. After the :ref:`replaced <adding-and-removing-monitors>` with a new Monitor. After the
new monitor boots, it will synchronize with a healthy peer. After the new new Monitor boots, it will synchronize with a healthy peer. After the new
monitor is fully synchronized, it will be able to serve clients. Monitor is fully synchronized, it will be able to serve clients.
.. _mon-store-recovery-using-osds: .. _mon-store-recovery-using-osds:
@ -511,15 +529,14 @@ Recovery using OSDs
------------------- -------------------
Even if all monitors fail at the same time, it is possible to recover the Even if all monitors fail at the same time, it is possible to recover the
monitor store by using information stored in OSDs. You are encouraged to deploy Monitor store by using information that is stored in OSDs. You are encouraged
at least three (and preferably five) monitors in a Ceph cluster. In such a to deploy at least three (and preferably five) Monitors in a Ceph cluster. In
deployment, complete monitor failure is unlikely. However, unplanned power loss such a deployment, complete Monitor failure is unlikely. However, unplanned
in a data center whose disk settings or filesystem settings are improperly power loss in a data center whose disk settings or filesystem settings are
configured could cause the underlying filesystem to fail and this could kill improperly configured could cause the underlying filesystem to fail and this
all of the monitors. In such a case, data in the OSDs can be used to recover could kill all of the monitors. In such a case, data in the OSDs can be used to
the monitors. The following is such a script and can be used to recover the recover the Monitors. The following is a script that can be used in such a case
monitors: to recover the Monitors:
.. code-block:: bash .. code-block:: bash
@ -572,10 +589,10 @@ monitors:
This script performs the following steps: This script performs the following steps:
#. Collects the map from each OSD host. #. Collect the map from each OSD host.
#. Rebuilds the store. #. Rebuild the store.
#. Fills the entities in the keyring file with appropriate capabilities. #. Fill the entities in the keyring file with appropriate capabilities.
#. Replaces the corrupted store on ``mon.foo`` with the recovered copy. #. Replace the corrupted store on ``mon.foo`` with the recovered copy.
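The script body is not reproduced here. At its core, the rebuild relies on ``ceph-objectstore-tool`` and ``ceph-monstore-tool``; the following is only a sketch of those two calls, with placeholder OSD, store, and keyring paths:

.. prompt:: bash

   # on each OSD host, run once per OSD, accumulating data into one store directory
   ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op update-mon-db --mon-store-path /tmp/mon-store
   # then rebuild the Monitor store from the accumulated data
   ceph-monstore-tool /tmp/mon-store rebuild -- --keyring /path/to/admin.keyring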
Known limitations Known limitations
@ -587,19 +604,18 @@ The above recovery tool is unable to recover the following information:
auth add`` command are recovered from the OSD's copy, and the auth add`` command are recovered from the OSD's copy, and the
``client.admin`` keyring is imported using ``ceph-monstore-tool``. However, ``client.admin`` keyring is imported using ``ceph-monstore-tool``. However,
the MDS keyrings and all other keyrings will be missing in the recovered the MDS keyrings and all other keyrings will be missing in the recovered
monitor store. You might need to manually re-add them. Monitor store. It might be necessary to manually re-add them.
- **Creating pools**: If any RADOS pools were in the process of being created, - **Creating pools**: If any RADOS pools were in the process of being created,
that state is lost. The recovery tool operates on the assumption that all that state is lost. The recovery tool operates on the assumption that all
pools have already been created. If there are PGs that are stuck in the pools have already been created. If there are PGs that are stuck in the
'unknown' state after the recovery for a partially created pool, you can ``unknown`` state after the recovery for a partially created pool, you can
force creation of the *empty* PG by running the ``ceph osd force-create-pg`` force creation of the *empty* PG by running the ``ceph osd force-create-pg``
command. Note that this will create an *empty* PG, so take this action only command. This creates an *empty* PG, so take this action only if you are
if you know the pool is empty. certain that the pool is empty.
- **MDS Maps**: The MDS maps are lost. - **MDS Maps**: The MDS maps are lost.
Everything Failed! Now What? Everything Failed! Now What?
============================ ============================
@ -611,16 +627,20 @@ irc.oftc.net), or at ``dev@ceph.io`` and ``ceph-users@lists.ceph.com``. Make
sure that you have prepared your logs and that you have them ready upon sure that you have prepared your logs and that you have them ready upon
request. request.
See https://ceph.io/en/community/connect/ for current (as of October 2023) The upstream Ceph Slack workspace can be joined at this address:
information on getting in contact with the upstream Ceph community. https://ceph-storage.slack.com/
See https://ceph.io/en/community/connect/ for current (as of December 2023)
information on getting in contact with the upstream Ceph community.
Preparing your logs Preparing your logs
------------------- -------------------
The default location for monitor logs is ``/var/log/ceph/ceph-mon.FOO.log*``. The default location for Monitor logs is ``/var/log/ceph/ceph-mon.FOO.log*``.
However, if they are not there, you can find their current location by running It is possible that the location of the Monitor logs has been changed from the
the following command: default. If so, find the current location of the Monitor logs by running the
following command:
.. prompt:: bash .. prompt:: bash
@ -631,21 +651,21 @@ cluster's configuration files. If Ceph is using the default debug levels, then
your logs might be missing important information that would help the upstream your logs might be missing important information that would help the upstream
Ceph community address your issue. Ceph community address your issue.
To make sure your monitor logs contain relevant information, you can raise Raise debug levels to make sure that your Monitor logs contain relevant
debug levels. Here we are interested in information from the monitors. As with information. Here we are interested in information from the Monitors. As with
other components, the monitors have different parts that output their debug other components, the Monitors have different parts that output their debug
information on different subsystems. information on different subsystems.
If you are an experienced Ceph troubleshooter, we recommend raising the debug If you are an experienced Ceph troubleshooter, we recommend raising the debug
levels of the most relevant subsystems. Of course, this approach might not be levels of the most relevant subsystems. This approach might not be easy for
easy for beginners. In most cases, however, enough information to address the beginners. In most cases, however, enough information to address the issue will
issue will be secured if the following debug levels are entered:: be logged if the following debug levels are entered::
debug_mon = 10 debug_mon = 10
debug_ms = 1 debug_ms = 1
Sometimes these debug levels do not yield enough information. In such cases, Sometimes these debug levels do not yield enough information. In such cases,
members of the upstream Ceph community might ask you to make additional changes members of the upstream Ceph community will ask you to make additional changes
to these or to other debug levels. In any case, it is better for us to receive to these or to other debug levels. In any case, it is better for us to receive
at least some useful information than to receive an empty log. at least some useful information than to receive an empty log.
@ -653,10 +673,12 @@ at least some useful information than to receive an empty log.
Do I need to restart a monitor to adjust debug levels? Do I need to restart a monitor to adjust debug levels?
------------------------------------------------------ ------------------------------------------------------
No, restarting a monitor is not necessary. Debug levels may be adjusted by No. It is not necessary to restart a Monitor when adjusting its debug levels.
using two different methods, depending on whether or not there is a quorum:
There is a quorum There are two different methods for adjusting debug levels. One method is used
when there is quorum. The other is used when there is no quorum.
**Adjusting debug levels when there is a quorum**
Either inject the debug option into the specific monitor that needs to Either inject the debug option into the specific monitor that needs to
be debugged:: be debugged::
@ -668,17 +690,19 @@ There is a quorum
ceph tell mon.* config set debug_mon 10/10 ceph tell mon.* config set debug_mon 10/10
There is no quorum **Adjusting debug levels when there is no quorum**
Use the admin socket of the specific monitor that needs to be debugged Use the admin socket of the specific monitor that needs to be debugged
and directly adjust the monitor's configuration options:: and directly adjust the monitor's configuration options::
ceph daemon mon.FOO config set debug_mon 10/10 ceph daemon mon.FOO config set debug_mon 10/10
**Returning debug levels to their default values**
To return the debug levels to their default values, run the above commands To return the debug levels to their default values, run the above commands
using the debug level ``1/10`` rather than ``10/10``. To check a monitor's using the debug level ``1/10`` rather than the debug level ``10/10``. To check
current values, use the admin socket and run either of the following commands: a Monitor's current values, use the admin socket and run either of the
following commands:
.. prompt:: bash .. prompt:: bash
@ -695,17 +719,17 @@ or:
I Reproduced the problem with appropriate debug levels. Now what? I Reproduced the problem with appropriate debug levels. Now what?
----------------------------------------------------------------- -----------------------------------------------------------------
We prefer that you send us only the portions of your logs that are relevant to Send the upstream Ceph community only the portions of your logs that are
your monitor problems. Of course, it might not be easy for you to determine relevant to your Monitor problems. Because it might not be easy for you to
which portions are relevant so we are willing to accept complete and determine which portions are relevant, the upstream Ceph community accepts
unabridged logs. However, we request that you avoid sending logs containing complete and unabridged logs. But don't send logs containing hundreds of
hundreds of thousands of lines with no additional clarifying information. One thousands of lines with no additional clarifying information. One common-sense
common-sense way of making our task easier is to write down the current time way to help the Ceph community help you is to write down the current time and
and date when you are reproducing the problem and then extract portions of your date when you are reproducing the problem and then extract portions of your
logs based on that information. logs based on that information.
Finally, reach out to us on the mailing lists or IRC or Slack, or by filing a Contact the upstream Ceph community on the mailing lists or IRC or Slack, or by
new issue on the `tracker`_. filing a new issue on the `tracker`_.
.. _tracker: http://tracker.ceph.com/projects/ceph/issues/new .. _tracker: http://tracker.ceph.com/projects/ceph/issues/new

View File

@ -2,25 +2,23 @@
Admin Guide Admin Guide
============= =============
Once you have your Ceph Object Storage service up and running, you may After the Ceph Object Storage service is up and running, it can be administered
administer the service with user management, access controls, quotas with user management, access controls, quotas, and usage tracking.
and usage tracking among other features.
User Management User Management
=============== ===============
Ceph Object Storage user management refers to users of the Ceph Object Storage Ceph Object Storage user management refers only to users of the Ceph Object
service (i.e., not the Ceph Object Gateway as a user of the Ceph Storage Storage service and not to the Ceph Object Gateway as a user of the Ceph
Cluster). You must create a user, access key and secret to enable end users to Storage Cluster. Create a user, access key, and secret key to enable end users
interact with Ceph Object Gateway services. to interact with Ceph Object Gateway services.
There are two user types: There are two types of user:
- **User:** The term 'user' reflects a user of the S3 interface. - **User:** The term "user" refers to a user of the S3 interface.
- **Subuser:** The term 'subuser' reflects a user of the Swift interface. A subuser - **Subuser:** The term "subuser" refers to a user of the Swift interface. A
is associated to a user . subuser is associated with a user.
.. ditaa:: .. ditaa::
+---------+ +---------+
@ -31,22 +29,28 @@ There are two user types:
+-----+ Subuser | +-----+ Subuser |
+-----------+ +-----------+
You can create, modify, view, suspend and remove users and subusers. In addition Users and subusers can be created, modified, viewed, suspended and removed.
to user and subuser IDs, you may add a display name and an email address for a Display names and email addresses can be added to user
user. You can specify a key and secret, or generate a key and secret profiles. Keys and secrets can either be specified or generated automatically.
automatically. When generating or specifying keys, note that user IDs correspond When generating or specifying keys, remember that user IDs correspond to S3 key
to an S3 key type and subuser IDs correspond to a swift key type. Swift keys types and subuser IDs correspond to Swift key types.
also have access levels of ``read``, ``write``, ``readwrite`` and ``full``.
Swift keys have access levels of ``read``, ``write``, ``readwrite`` and
``full``.
Create a User Create a User
------------- -------------
To create a user (S3 interface), execute the following:: To create a user (S3 interface), run a command of the following form:
.. prompt:: bash
radosgw-admin user create --uid={username} --display-name="{display-name}" [--email={email}] radosgw-admin user create --uid={username} --display-name="{display-name}" [--email={email}]
For example:: For example:
.. prompt:: bash
radosgw-admin user create --uid=johndoe --display-name="John Doe" --email=john@example.com radosgw-admin user create --uid=johndoe --display-name="John Doe" --email=john@example.com
@ -75,32 +79,37 @@ For example::
"max_objects": -1}, "max_objects": -1},
"temp_url_keys": []} "temp_url_keys": []}
Creating a user also creates an ``access_key`` and ``secret_key`` entry for use The creation of a user entails the creation of an ``access_key`` and a
with any S3 API-compatible client. ``secret_key`` entry, which can be used with any S3 API-compatible client.
.. important:: Check the key output. Sometimes ``radosgw-admin`` .. important:: Check the key output. Sometimes ``radosgw-admin`` generates a
generates a JSON escape (``\``) character, and some clients JSON escape (``\``) character, and some clients do not know how to handle
do not know how to handle JSON escape characters. Remedies include JSON escape characters. Remedies include removing the JSON escape character
removing the JSON escape character (``\``), encapsulating the string (``\``), encapsulating the string in quotes, regenerating the key and
in quotes, regenerating the key and ensuring that it ensuring that it does not have a JSON escape character, or specifying the
does not have a JSON escape character or specify the key and secret key and secret manually.
manually.
Create a Subuser Create a Subuser
---------------- ----------------
To create a subuser (Swift interface) for the user, you must specify the user ID To create a subuser (a user of the Swift interface) for the user, specify the
(``--uid={username}``), a subuser ID and the access level for the subuser. :: user ID (``--uid={username}``), a subuser ID, and the subuser's access level:
.. prompt:: bash
radosgw-admin subuser create --uid={uid} --subuser={uid} --access=[ read | write | readwrite | full ] radosgw-admin subuser create --uid={uid} --subuser={uid} --access=[ read | write | readwrite | full ]
For example:: For example:
.. prompt:: bash
radosgw-admin subuser create --uid=johndoe --subuser=johndoe:swift --access=full radosgw-admin subuser create --uid=johndoe --subuser=johndoe:swift --access=full
.. note:: ``full`` is not ``readwrite``, as it also includes the access control policy. .. note:: ``full`` is not the same as ``readwrite``. The ``full`` access level
includes ``read`` and ``write``, but it also includes the access control
policy.
.. code-block:: javascript .. code-block:: javascript
@ -133,100 +142,126 @@ For example::
Get User Info Get User Info
------------- -------------
To get information about a user, you must specify ``user info`` and the user ID To get information about a user, specify ``user info`` and the user ID
(``--uid={username}``) . :: (``--uid={username}``). Use a command of the following form:
.. prompt:: bash
radosgw-admin user info --uid=johndoe radosgw-admin user info --uid=johndoe
Modify User Info Modify User Info
---------------- ----------------
To modify information about a user, you must specify the user ID (``--uid={username}``) To modify information about a user, specify the user ID (``--uid={username}``)
and the attributes you want to modify. Typical modifications are to keys and secrets, and the attributes that you want to modify. Typical modifications are made to
email addresses, display names and access levels. For example:: keys and secrets, email addresses, display names, and access levels. Use a
command of the following form:
.. prompt:: bash
radosgw-admin user modify --uid=johndoe --display-name="John E. Doe" radosgw-admin user modify --uid=johndoe --display-name="John E. Doe"
To modify subuser values, specify ``subuser modify``, user ID and the subuser ID. For example:: To modify subuser values, specify ``subuser modify``, user ID and the subuser
ID. Use a command of the following form:
.. prompt:: bash
radosgw-admin subuser modify --uid=johndoe --subuser=johndoe:swift --access=full radosgw-admin subuser modify --uid=johndoe --subuser=johndoe:swift --access=full
User Enable/Suspend User Suspend
------------------- ------------
When you create a user, the user is enabled by default. However, you may suspend When a user is created, the user is enabled by default. However, it is possible
user privileges and re-enable them at a later time. To suspend a user, specify to suspend user privileges and to re-enable them at a later time. To suspend a
``user suspend`` and the user ID. :: user, specify ``user suspend`` and the user ID in a command of the following
form:
.. prompt:: bash
radosgw-admin user suspend --uid=johndoe radosgw-admin user suspend --uid=johndoe
To re-enable a suspended user, specify ``user enable`` and the user ID. :: User Enable
-----------
To re-enable a suspended user, provide ``user enable`` and specify the user ID
in a command of the following form:
.. prompt:: bash
radosgw-admin user enable --uid=johndoe radosgw-admin user enable --uid=johndoe
.. note:: Disabling the user disables the subuser. .. note:: Disabling the user also disables any subusers.
Remove a User Remove a User
------------- -------------
When you remove a user, the user and subuser are removed from the system. When you remove a user, you also remove any subusers associated with the user.
However, you may remove just the subuser if you wish. To remove a user (and
subuser), specify ``user rm`` and the user ID. :: It is possible to remove a subuser without removing its associated user. This
is covered in the section called :ref:`Remove a Subuser <radosgw-admin-remove-a-subuser>`.
To remove a user and any subusers associated with it, use the ``user rm``
command and provide the user ID of the user to be removed. Use a command of the
following form:
.. prompt:: bash
radosgw-admin user rm --uid=johndoe radosgw-admin user rm --uid=johndoe
To remove the subuser only, specify ``subuser rm`` and the subuser ID. ::
radosgw-admin subuser rm --subuser=johndoe:swift
Options include: Options include:
- **Purge Data:** The ``--purge-data`` option purges all data associated - **Purge Data:** The ``--purge-data`` option purges all data associated
to the UID. with the UID.
- **Purge Keys:** The ``--purge-keys`` option purges all keys associated - **Purge Keys:** The ``--purge-keys`` option purges all keys associated
to the UID. with the UID.
.. _radosgw-admin-remove-a-subuser:
Remove a Subuser Remove a Subuser
---------------- ----------------
When you remove a sub user, you are removing access to the Swift interface. Removing a subuser removes access to the Swift interface or to S3. The user
The user will remain in the system. To remove the subuser, specify associated with the removed subuser remains in the system after the subuser's
``subuser rm`` and the subuser ID. :: removal.
To remove the subuser, use the command ``subuser rm`` and provide the subuser
ID of the subuser to be removed. Use a command of the following form:
.. prompt:: bash
radosgw-admin subuser rm --subuser=johndoe:swift radosgw-admin subuser rm --subuser=johndoe:swift
Options include: Options include:
- **Purge Keys:** The ``--purge-keys`` option purges all keys associated - **Purge Keys:** The ``--purge-keys`` option purges all keys associated
to the UID. with the UID.
Add / Remove a Key Add or Remove a Key
------------------------ --------------------
Both users and subusers require the key to access the S3 or Swift interface. To Both users and subusers require a key to access the S3 or Swift interface. To
use S3, the user needs a key pair which is composed of an access key and a use S3, the user needs a key pair which is composed of an access key and a
secret key. On the other hand, to use Swift, the user typically needs a secret secret key. To use Swift, the user needs a secret key (password), which is used
key (password), and use it together with the associated user ID. You may create together with its associated user ID. You can create a key and either specify
a key and either specify or generate the access key and/or secret key. You may or generate the access key or secret key. You can also remove a key. Options
also remove a key. Options include: include:
- ``--key-type=<type>`` specifies the key type. The options are: s3, swift - ``--key-type=<type>`` specifies the key type. The options are: ``s3``, ``swift``
- ``--access-key=<key>`` manually specifies an S3 access key. - ``--access-key=<key>`` manually specifies an S3 access key.
- ``--secret-key=<key>`` manually specifies a S3 secret key or a Swift secret key. - ``--secret-key=<key>`` manually specifies an S3 secret key or a Swift secret key.
- ``--gen-access-key`` automatically generates a random S3 access key. - ``--gen-access-key`` automatically generates a random S3 access key.
- ``--gen-secret`` automatically generates a random S3 secret key or a random Swift secret key. - ``--gen-secret`` automatically generates a random S3 secret key or a random Swift secret key.
An example how to add a specified S3 key pair for a user. :: Adding S3 keys
~~~~~~~~~~~~~~
To add a specific S3 key pair for a user, run a command of the following form:
.. prompt:: bash
radosgw-admin key create --uid=foo --key-type=s3 --access-key fooAccessKey --secret-key fooSecretKey radosgw-admin key create --uid=foo --key-type=s3 --access-key fooAccessKey --secret-key fooSecretKey
@ -243,9 +278,15 @@ An example how to add a specified S3 key pair for a user. ::
"secret_key": "fooSecretKey"}], "secret_key": "fooSecretKey"}],
} }
Note that you may create multiple S3 key pairs for a user. .. note:: You can create multiple S3 key pairs for a user.
To attach a specified swift secret key for a subuser. :: Adding Swift secret keys
~~~~~~~~~~~~~~~~~~~~~~~~
To attach a specific Swift secret key for a subuser, run a command of the
following form:
.. prompt:: bash
radosgw-admin key create --subuser=foo:bar --key-type=swift --secret-key barSecret radosgw-admin key create --subuser=foo:bar --key-type=swift --secret-key barSecret
@ -263,9 +304,16 @@ To attach a specified swift secret key for a subuser. ::
{ "user": "foo:bar", { "user": "foo:bar",
"secret_key": "asfghjghghmgm"}]} "secret_key": "asfghjghghmgm"}]}
Note that a subuser can have only one swift secret key. .. note:: A subuser can have only one Swift secret key.
Subusers can also be used with S3 APIs if the subuser is associated with a S3 key pair. :: Associating subusers with S3 key pairs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Subusers can also be used with S3 APIs if the subuser is associated with an S3
key pair. To associate a subuser with an S3 key pair, run a command of the
following form:
.. prompt:: bash
radosgw-admin key create --subuser=foo:bar --key-type=s3 --access-key barAccessKey --secret-key barSecretKey radosgw-admin key create --subuser=foo:bar --key-type=s3 --access-key barAccessKey --secret-key barSecretKey
@ -286,49 +334,70 @@ Subusers can also be used with S3 APIs if the subuser is associated with a S3 ke
} }
To remove a S3 key pair, specify the access key. :: Removing S3 key pairs
~~~~~~~~~~~~~~~~~~~~~
To remove an S3 key pair, specify the access key to be removed. Run a command of the following form:
.. prompt:: bash
radosgw-admin key rm --uid=foo --key-type=s3 --access-key=fooAccessKey radosgw-admin key rm --uid=foo --key-type=s3 --access-key=fooAccessKey
To remove the swift secret key. :: Removing Swift secret keys
~~~~~~~~~~~~~~~~~~~~~~~~~~
To remove a Swift secret key, run a command of the following form:
.. prompt:: bash
radosgw-admin key rm --subuser=foo:bar --key-type=swift radosgw-admin key rm --subuser=foo:bar --key-type=swift
Add / Remove Admin Capabilities Add or Remove Admin Capabilities
------------------------------- --------------------------------
The Ceph Storage Cluster provides an administrative API that enables users to The Ceph Storage Cluster provides an administrative API that enables users to
execute administrative functions via the REST API. By default, users do NOT have execute administrative functions via the REST API. By default, users do NOT
access to this API. To enable a user to exercise administrative functionality, have access to this API. To enable a user to exercise administrative
provide the user with administrative capabilities. functionality, provide the user with administrative capabilities.
To add administrative capabilities to a user, execute the following:: To add administrative capabilities to a user, run a command of the following
form:
.. prompt:: bash
radosgw-admin caps add --uid={uid} --caps={caps} radosgw-admin caps add --uid={uid} --caps={caps}
You can add read, write or all capabilities to users, buckets, metadata and You can add read, write or all capabilities to users, buckets, metadata and
usage (utilization). For example:: usage (utilization). To do this, use a command-line option of the following
form:
--caps="[users|buckets|metadata|usage|zone|amz-cache|info|bilog|mdlog|datalog|user-policy|oidc-provider|roles|ratelimit]=[*|read|write|read, write]" .. prompt:: bash
For example:: --caps="[users|buckets|metadata|usage|zone|amz-cache|info|bilog|mdlog|datalog|user-policy|oidc-provider|roles|ratelimit]=[\*|read|write|read, write]"
For example:
.. prompt:: bash
radosgw-admin caps add --uid=johndoe --caps="users=*;buckets=*" radosgw-admin caps add --uid=johndoe --caps="users=*;buckets=*"
To remove administrative capabilities from a user, run a command of the
following form:
To remove administrative capabilities from a user, execute the following:: .. prompt:: bash
radosgw-admin caps rm --uid=johndoe --caps={caps} radosgw-admin caps rm --uid=johndoe --caps={caps}
Quota Management Quota Management
================ ================
The Ceph Object Gateway enables you to set quotas on users and buckets owned by The Ceph Object Gateway makes it possible for you to set quotas on users and
users. Quotas include the maximum number of objects in a bucket and the maximum buckets owned by users. Quotas include the maximum number of objects in a
storage size a bucket can hold. bucket and the maximum storage size a bucket can hold.
- **Bucket:** The ``--bucket`` option allows you to specify a quota for - **Bucket:** The ``--bucket`` option allows you to specify a quota for
buckets the user owns. buckets the user owns.
@ -337,38 +406,47 @@ storage size a bucket can hold.
the maximum number of objects. A negative value disables this setting. the maximum number of objects. A negative value disables this setting.
- **Maximum Size:** The ``--max-size`` option allows you to specify a quota - **Maximum Size:** The ``--max-size`` option allows you to specify a quota
size in B/K/M/G/T, where B is the default. A negative value disables this setting. size in B/K/M/G/T, where B is the default. A negative value disables this
setting.
- **Quota Scope:** The ``--quota-scope`` option sets the scope for the quota. - **Quota Scope:** The ``--quota-scope`` option sets the scope for the quota.
The options are ``bucket`` and ``user``. Bucket quotas apply to buckets a The options are ``bucket`` and ``user``. Bucket quotas apply to each bucket
user owns. User quotas apply to a user. owned by the user. User Quotas are summed across all buckets owned by the
user.
Set User Quota Set User Quota
-------------- --------------
Before you enable a quota, you must first set the quota parameters. Before you enable a quota, you must first set the quota parameters.
For example:: To set quota parameters, run a command of the following form:
.. prompt:: bash
radosgw-admin quota set --quota-scope=user --uid=<uid> [--max-objects=<num objects>] [--max-size=<max size>] radosgw-admin quota set --quota-scope=user --uid=<uid> [--max-objects=<num objects>] [--max-size=<max size>]
For example:: For example:
.. prompt:: bash
radosgw-admin quota set --quota-scope=user --uid=johndoe --max-objects=1024 --max-size=1024B radosgw-admin quota set --quota-scope=user --uid=johndoe --max-objects=1024 --max-size=1024B
Passing a negative value as an argument of ``--max-objects`` or ``--max-size``
A negative value for num objects and / or max size means that the disables the given quota attribute.
specific quota attribute check is disabled.
Enable/Disable User Quota Enabling and Disabling User Quota
------------------------- ---------------------------------
Once you set a user quota, you may enable it. For example:: After a user quota is set, it must be enabled in order to take effect. To enable a user quota, run a command of the following form:
.. prompt:: bash
radosgw-admin quota enable --quota-scope=user --uid=<uid> radosgw-admin quota enable --quota-scope=user --uid=<uid>
You may disable an enabled user quota. For example:: To disable an enabled user quota, run a command of the following form:
.. prompt:: bash
radosgw-admin quota disable --quota-scope=user --uid=<uid> radosgw-admin quota disable --quota-scope=user --uid=<uid>
@ -377,22 +455,30 @@ Set Bucket Quota
---------------- ----------------
Bucket quotas apply to the buckets owned by the specified ``uid``. They are Bucket quotas apply to the buckets owned by the specified ``uid``. They are
independent of the user. :: independent of the user. To set a bucket quota, run a command of the following
form:
.. prompt:: bash
radosgw-admin quota set --uid=<uid> --quota-scope=bucket [--max-objects=<num objects>] [--max-size=<max size] radosgw-admin quota set --uid=<uid> --quota-scope=bucket [--max-objects=<num objects>] [--max-size=<max size>]
A negative value for num objects and / or max size means that the A negative value for ``--max-objects`` or ``--max-size`` means that the
specific quota attribute check is disabled. specific quota attribute is disabled.
Enable/Disable Bucket Quota Enabling and Disabling Bucket Quota
--------------------------- -----------------------------------
Once you set a bucket quota, you may enable it. For example:: After a bucket quota has been set, it must be enabled in order to take effect.
To enable a bucket quota, run a command of the following form:
.. prompt:: bash
radosgw-admin quota enable --quota-scope=bucket --uid=<uid> radosgw-admin quota enable --quota-scope=bucket --uid=<uid>
You may disable an enabled bucket quota. For example:: To disable an enabled bucket quota, run a command of the following form:
.. prompt:: bash
radosgw-admin quota disable --quota-scope=bucket --uid=<uid> radosgw-admin quota disable --quota-scope=bucket --uid=<uid>
@ -400,9 +486,11 @@ You may disable an enabled bucket quota. For example::
Get Quota Settings Get Quota Settings
------------------ ------------------
You may access each user's quota settings via the user information You can access each user's quota settings via the user information
API. To read user quota setting information with the CLI interface, API. To read user quota setting information with the CLI interface,
execute the following:: run a command of the following form:
.. prompt:: bash
radosgw-admin user info --uid=<uid> radosgw-admin user info --uid=<uid>
@ -410,9 +498,12 @@ execute the following::
Update Quota Stats Update Quota Stats
------------------ ------------------
Quota stats get updated asynchronously. You can update quota Quota stats are updated asynchronously. To force an update and retrieve the
statistics for all users and all buckets manually to retrieve latest quota statistics for all users and all buckets, run a command of the
the latest quota stats. :: following form:
.. prompt:: bash
radosgw-admin user stats --uid=<uid> --sync-stats radosgw-admin user stats --uid=<uid> --sync-stats
@ -421,69 +512,90 @@ the latest quota stats. ::
Get User Usage Stats Get User Usage Stats
-------------------- --------------------
To see how much of the quota a user has consumed, execute the following:: To see how much of a quota a user has consumed, run a command of the following
form:
.. prompt:: bash
radosgw-admin user stats --uid=<uid> radosgw-admin user stats --uid=<uid>
.. note:: You should execute ``radosgw-admin user stats`` with the .. note:: Run ``radosgw-admin user stats`` with the ``--sync-stats`` option to
``--sync-stats`` option to receive the latest data. receive the latest data.
Default Quotas Default Quotas
-------------- --------------
You can set default quotas in the config. These defaults are used when You can set default quotas in the Ceph Object Gateway config. **These defaults
creating a new user and have no effect on existing users. If the will be used only when creating new users and will have no effect on existing
relevant default quota is set in config, then that quota is set on the users.** If a default quota is set in the Ceph Object Gateway Config, then that
new user, and that quota is enabled. See ``rgw bucket default quota max objects``, quota is set for all subsequently-created users, and that quota is enabled. See
``rgw bucket default quota max size``, ``rgw user default quota max objects``, and ``rgw_bucket_default_quota_max_objects``,
``rgw user default quota max size`` in `Ceph Object Gateway Config Reference`_ ``rgw_bucket_default_quota_max_size``, ``rgw_user_default_quota_max_objects``,
and ``rgw_user_default_quota_max_size`` in `Ceph Object Gateway Config
Reference`_
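For example, a default quota could be set cluster-wide through the configuration database before new users are created. This is only a sketch; the values are illustrative and ``client.rgw`` is assumed to be the section that your gateways read:

.. prompt:: bash

   ceph config set client.rgw rgw_user_default_quota_max_objects 1024
   ceph config set client.rgw rgw_user_default_quota_max_size 1073741824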
Quota Cache Quota Cache
----------- -----------
Quota statistics are cached on each RGW instance. If there are multiple Quota statistics are cached by each RGW instance. If multiple RGW instances are
instances, then the cache can keep quotas from being perfectly enforced, as deployed, then this cache may prevent quotas from being perfectly enforced,
each instance will have a different view of quotas. The options that control because each instance may have a different set of quota settings.
this are ``rgw bucket quota ttl``, ``rgw user quota bucket sync interval`` and
``rgw user quota sync interval``. The higher these values are, the more Here are the options that control this behavior:
efficient quota operations are, but the more out-of-sync multiple instances
will be. The lower these values are, the closer to perfect enforcement :confval:`rgw_bucket_quota_ttl`
multiple instances will achieve. If all three are 0, then quota caching is :confval:`rgw_user_quota_bucket_sync_interval`
effectively disabled, and multiple instances will have perfect quota :confval:`rgw_user_quota_sync_interval`
enforcement. See `Ceph Object Gateway Config Reference`_
Increasing these values will make quota operations more efficient at the cost
of increasing the likelihood that the multiple RGW instances may not
consistently have the latest quota settings. Decreasing these values brings
the multiple RGW instances closer to perfect quota synchronization.
If all three values are set to ``0`` , then quota caching is effectively
disabled, and multiple instances will have perfect quota enforcement. See
`Ceph Object Gateway Config Reference`_.
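As a sketch of the strict-enforcement end of that trade-off, the three values could be set to ``0`` through the configuration database (again assuming that your gateways read the ``client.rgw`` section):

.. prompt:: bash

   ceph config set client.rgw rgw_bucket_quota_ttl 0
   ceph config set client.rgw rgw_user_quota_bucket_sync_interval 0
   ceph config set client.rgw rgw_user_quota_sync_interval 0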
Reading / Writing Global Quotas Reading / Writing Global Quotas
------------------------------- -------------------------------
You can read and write global quota settings in the period configuration. To You can read and write global quota settings in the period configuration. To
view the global quota settings:: view the global quota settings, run the following command:
.. prompt:: bash
radosgw-admin global quota get radosgw-admin global quota get
The global quota settings can be manipulated with the ``global quota`` Global quota settings can be manipulated with the ``global quota``
counterparts of the ``quota set``, ``quota enable``, and ``quota disable`` counterparts of the ``quota set``, ``quota enable``, and ``quota disable``
commands. :: commands, as in the following examples:
.. prompt:: bash
radosgw-admin global quota set --quota-scope bucket --max-objects 1024 radosgw-admin global quota set --quota-scope bucket --max-objects 1024
radosgw-admin global quota enable --quota-scope bucket radosgw-admin global quota enable --quota-scope bucket
.. note:: In a multisite configuration, where there is a realm and period .. note:: In a multisite configuration where there is a realm and period
present, changes to the global quotas must be committed using ``period present, changes to the global quotas must be committed using ``period
update --commit``. If there is no period present, the rados gateway(s) must update --commit``. If no period is present, the RGW instances must
be restarted for the changes to take effect. be restarted for the changes to take effect.
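For example, in a multisite configuration the commit step after changing global quotas might look like this:

.. prompt:: bash

   radosgw-admin period update --commit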
Rate Limit Management Rate Limit Management
===================== =====================
The Ceph Object Gateway makes it possible to set rate limits on users and The Ceph Object Gateway makes it possible to set rate limits on users and buckets. The "rate
buckets. "Rate limit" includes the maximum number of read operations (read limit" includes the maximum number of read operations (read ops) and write
ops) and write operations (write ops) per minute and the number of bytes per operations (write ops) per minute as well as the number of bytes per minute
minute that can be written or read per user or per bucket. that can be written or read per user or per bucket.
Read Requests and Write Requests
--------------------------------
Operations that use the ``GET`` method or the ``HEAD`` method in their REST Operations that use the ``GET`` method or the ``HEAD`` method in their REST
requests are "read requests". All other requests are "write requests". requests are "read requests". All other requests are "write requests".
How Metrics Work
----------------
Each object gateway tracks per-user metrics separately from bucket metrics. Each object gateway tracks per-user metrics separately from bucket metrics.
These metrics are not shared with other gateways. The configured limits should These metrics are not shared with other gateways. The configured limits should
be divided by the number of active object gateways. For example, if "user A" is be divided by the number of active object gateways. For example, if "user A" is
@ -518,66 +630,90 @@ time has elapsed, "user A" will be able to send ``GET`` requests again.
- **User:** The ``--uid`` option allows you to specify a rate limit for a - **User:** The ``--uid`` option allows you to specify a rate limit for a
user. user.
- **Maximum Read Ops:** The ``--max-read-ops`` setting allows you to specify - **Maximum Read Ops:** The ``--max-read-ops`` setting allows you to limit read
the maximum number of read ops per minute per RGW. A 0 value disables this setting (which means unlimited access). ops per minute per RGW instance. A ``0`` value disables throttling.
- **Maximum Read Bytes:** The ``--max-read-bytes`` setting allows you to specify - **Maximum Read Bytes:** The ``--max-read-bytes`` setting allows you to limit
the maximum number of read bytes per minute per RGW. A 0 value disables this setting (which means unlimited access). read bytes per minute per RGW instance. A ``0`` value disables throttling.
- **Maximum Write Ops:** The ``--max-write-ops`` setting allows you to specify - **Maximum Write Ops:** The ``--max-write-ops`` setting allows you to specify
the maximum number of write ops per minute per RGW. A 0 value disables this setting (which means unlimited access). the maximum number of write ops per minute per RGW instance. A ``0`` value
disables throttling.
- **Maximum Write Bytes:** The ``--max-write-bytes`` setting allows you to specify - **Maximum Write Bytes:** The ``--max-write-bytes`` setting allows you to
the maximum number of write bytes per minute per RGW. A 0 value disables this setting (which means unlimited access). specify the maximum number of write bytes per minute per RGW instance. A
``0`` value disables throttling.
- **Rate Limit Scope:** The ``--ratelimit-scope`` option sets the scope for the rate limit. - **Rate Limit Scope:** The ``--ratelimit-scope`` option sets the scope for the
The options are ``bucket`` , ``user`` and ``anonymous``. Bucket rate limit apply to buckets. rate limit. The options are ``bucket``, ``user`` and ``anonymous``. Bucket
The user rate limit applies to a user. Anonymous applies to an unauthenticated user. rate limits apply to buckets. The user rate limit applies to a user. The
Anonymous scope is only available for global rate limit. ``anonymous`` option applies to an unauthenticated user. Anonymous scope is
available only for global rate limit.
Set User Rate Limit
-------------------

Before you can enable a rate limit, you must first set the rate limit
parameters. The following is the general form of commands that set rate limit
parameters:

.. prompt:: bash

   radosgw-admin ratelimit set --ratelimit-scope=user --uid=<uid>
   <[--max-read-ops=<num ops>] [--max-read-bytes=<num bytes>]
   [--max-write-ops=<num ops>] [--max-write-bytes=<num bytes>]>

An example of using ``radosgw-admin ratelimit set`` to set a rate limit might
look like this:

.. prompt:: bash

   radosgw-admin ratelimit set --ratelimit-scope=user --uid=johndoe --max-read-ops=1024 --max-write-bytes=10240

A value of ``0`` assigned to ``--max-read-ops``, ``--max-read-bytes``,
``--max-write-ops``, or ``--max-write-bytes`` disables the specified rate
limit.

Get User Rate Limit
-------------------

The ``radosgw-admin ratelimit get`` command returns the currently configured
rate limit parameters. The following is the general form of the command:

.. prompt:: bash

   radosgw-admin ratelimit get --ratelimit-scope=user --uid=<uid>

An example of using ``radosgw-admin ratelimit get`` to return the rate limit
parameters might look like this:

.. prompt:: bash

   radosgw-admin ratelimit get --ratelimit-scope=user --uid=johndoe

A value of ``0`` assigned to ``--max-read-ops``, ``--max-read-bytes``,
``--max-write-ops``, or ``--max-write-bytes`` disables the specified rate
limit.

Enable and Disable User Rate Limit
----------------------------------

After you have set a user rate limit, you must enable it in order for it to
take effect. Run a command of the following form to enable a user rate limit:

.. prompt:: bash

   radosgw-admin ratelimit enable --ratelimit-scope=user --uid=<uid>

To disable an enabled user rate limit, run a command of the following form:

.. prompt:: bash

   radosgw-admin ratelimit disable --ratelimit-scope=user --uid=johndoe
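As a concrete illustration of the per-gateway accounting described under "How
Metrics Work", suppose two object gateways are active and the intended overall
budget for user ``johndoe`` is 100 read ops per minute; each gateway would then
be configured with half of that budget (the numbers here are illustrative
only):

.. prompt:: bash

   # 100 read ops/min overall, divided across 2 active gateways = 50 per RGW
   radosgw-admin ratelimit set --ratelimit-scope=user --uid=johndoe --max-read-ops=50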
@ -586,114 +722,154 @@ Set Bucket Rate Limit
---------------------
Before you enable a rate limit, you must first set the rate limit parameters.
The following is the general form of commands that set rate limit parameters:

.. prompt:: bash

   radosgw-admin ratelimit set --ratelimit-scope=bucket --bucket=<bucket>
   <[--max-read-ops=<num ops>] [--max-read-bytes=<num bytes>]
   [--max-write-ops=<num ops>] [--max-write-bytes=<num bytes>]>

An example of using ``radosgw-admin ratelimit set`` to set a rate limit for a
bucket might look like this:

.. prompt:: bash

   radosgw-admin ratelimit set --ratelimit-scope=bucket --bucket=mybucket --max-read-ops=1024 --max-write-bytes=10240

A value of ``0`` assigned to ``--max-read-ops``, ``--max-read-bytes``,
``--max-write-ops``, or ``--max-write-bytes`` disables the specified bucket
rate limit.
Get Bucket Rate Limit
---------------------

The ``radosgw-admin ratelimit get`` command returns the currently configured
rate limit parameters. The following is the general form of the command:

.. prompt:: bash

   radosgw-admin ratelimit get --ratelimit-scope=bucket --bucket=<bucket>

An example of using ``radosgw-admin ratelimit get`` to return the rate limit
parameters for a bucket might look like this:

.. prompt:: bash

   radosgw-admin ratelimit get --ratelimit-scope=bucket --bucket=mybucket

A value of ``0`` assigned to ``--max-read-ops``, ``--max-read-bytes``,
``--max-write-ops``, or ``--max-write-bytes`` disables the specified rate
limit.
Enable and Disable Bucket Rate Limit
------------------------------------

After you set a bucket rate limit, you can enable it. The following is the
general form of the ``radosgw-admin ratelimit enable`` command that enables
bucket rate limits:

.. prompt:: bash

   radosgw-admin ratelimit enable --ratelimit-scope=bucket --bucket=<bucket>

An enabled bucket rate limit can be disabled by running a command of the
following form:

.. prompt:: bash

   radosgw-admin ratelimit disable --ratelimit-scope=bucket --bucket=mybucket
Reading and Writing Global Rate Limit Configuration
---------------------------------------------------
You can read and write global rate limit settings in the period's
configuration. To view the global rate limit settings, run the following
command:

.. prompt:: bash

   radosgw-admin global ratelimit get
The global rate limit settings can be manipulated with the ``global ratelimit``
counterparts of the ``ratelimit set``, ``ratelimit enable``, and ``ratelimit
disable`` commands. Per-user and per-bucket ratelimit configurations override
the global configuration:

.. prompt:: bash

   radosgw-admin global ratelimit set --ratelimit-scope bucket --max-read-ops=1024
   radosgw-admin global ratelimit enable --ratelimit-scope bucket

The global rate limit can be used to configure the scope of the rate limit for
all authenticated users:

.. prompt:: bash

   radosgw-admin global ratelimit set --ratelimit-scope user --max-read-ops=1024
   radosgw-admin global ratelimit enable --ratelimit-scope user

The global rate limit can be used to configure the scope of the rate limit for
all unauthenticated users:

.. prompt:: bash

   radosgw-admin global ratelimit set --ratelimit-scope=anonymous --max-read-ops=1024
   radosgw-admin global ratelimit enable --ratelimit-scope=anonymous

.. note:: In a multisite configuration where a realm and a period are present,
   any changes to the global rate limit must be committed using ``period
   update --commit``. If no period is present, the rados gateway(s) must be
   restarted for the changes to take effect.
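For example, in a multisite deployment that has a realm and a period, a change
to the global bucket-scope limit could be applied and committed like this (the
value is illustrative):

.. prompt:: bash

   radosgw-admin global ratelimit set --ratelimit-scope bucket --max-read-ops=1024
   radosgw-admin global ratelimit enable --ratelimit-scope bucket
   radosgw-admin period update --commit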
Usage
=====

The Ceph Object Gateway logs the usage of each user. You can track the usage of
each user within a specified date range.
- Add ``rgw_enable_usage_log = true`` in the ``[client.rgw]`` section of
  ``ceph.conf`` and restart the ``radosgw`` service.

.. note:: Until Ceph has a linkable macro that handles all the many ways that
   options can be set, we advise that you set ``rgw_enable_usage_log = true``
   in central config or in ``ceph.conf`` and restart all RGWs.
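For example, one way to set the option in the central config database for all
RGW daemons is shown below (a sketch; the ``client.rgw`` target and the restart
procedure depend on how your gateways are deployed):

.. prompt:: bash #

   ceph config set client.rgw rgw_enable_usage_log true

After setting the option, restart the gateways so that the change takes effect.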
Options include:

- **Start Date:** The ``--start-date`` option allows you to filter usage
  stats from a specified start date and an optional start time
  (**format:** ``yyyy-mm-dd [HH:MM:SS]``).

- **End Date:** The ``--end-date`` option allows you to filter usage up
  to a particular end date and an optional end time
  (**format:** ``yyyy-mm-dd [HH:MM:SS]``).

- **Log Entries:** The ``--show-log-entries`` option allows you to specify
  whether to include log entries with the usage stats
  (options: ``true`` | ``false``).

.. note:: You can specify time to a precision of minutes and seconds, but the
   specified time is stored only with a one-hour resolution.
Show Usage
----------

To show usage statistics, use the ``radosgw-admin usage show`` command. To show
usage for a particular user, you must specify a user ID. You can also specify a
start date, end date, and whether to show log entries. The following is an
example of such a command:

.. prompt:: bash $

   radosgw-admin usage show --uid=johndoe --start-date=2012-03-01 --end-date=2012-04-01

You can show a summary of usage information for all users by omitting the user
ID, as in the following example command:

.. prompt:: bash $

   radosgw-admin usage show --show-log-entries=false
@ -701,9 +877,12 @@ You may also show a summary of usage information for all users by omitting a use
Trim Usage
----------

Usage logs can consume significant storage space, especially over time and with
heavy use. You can trim the usage logs for all users and for specific users.
You can also specify date ranges for trim operations, as in the following
example commands:

.. prompt:: bash $

   radosgw-admin usage trim --start-date=2010-01-01 --end-date=2010-12-31
   radosgw-admin usage trim --uid=johndoe
@ -275,6 +275,9 @@ Get User Info
Get user information.

Either a ``uid`` or ``access-key`` must be supplied as a request parameter. We
recommend supplying ``uid``. If both are provided but correspond to different
users, the info for the user specified with ``uid`` is returned.
:caps: users=read
@ -297,6 +300,13 @@ Request Parameters
:Example: ``foo_user``
:Required: Yes
``access-key``
:Description: The S3 access key of the user for which the information is requested.
:Type: String
:Example: ``ABCD0EF12GHIJ2K34LMN``
:Required: No
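Outside the REST API, the same lookup can be performed with the admin CLI,
which also accepts either a user ID or an access key (a sketch; the key value
is illustrative):

.. prompt:: bash

   radosgw-admin user info --access-key=ABCD0EF12GHIJ2K34LMN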
Response Entities
~~~~~~~~~~~~~~~~~
@ -4,12 +4,18 @@ Compression
.. versionadded:: Kraken

The Ceph Object Gateway supports server-side compression of uploaded objects
using any of the existing compression plugins.

.. note:: The Reef release added a :ref:`feature_compress_encrypted` zonegroup
   feature to enable compression with `Server-Side Encryption`_.
Supported compression plugins include the following:
* lz4
* snappy
* zlib
* zstd
Configuration
=============
@ -18,14 +24,15 @@ Compression can be enabled on a storage class in the Zone's placement target
by providing the ``--compression=<type>`` option to the command
``radosgw-admin zone placement modify``.

The compression ``type`` refers to the name of the compression plugin that will
be used when writing new object data. Each compressed object remembers which
plugin was used, so any change to this setting will neither affect Ceph's
ability to decompress existing objects nor require existing objects to be
recompressed.

Compression settings apply to all new objects uploaded to buckets using this
placement target. Compression can be disabled by setting the ``type`` to an
empty string or ``none``.

For example::
@ -62,11 +69,15 @@ For example::
Statistics
==========

Run the ``radosgw-admin bucket stats`` command to see compression statistics
for a given bucket:

.. prompt:: bash

   radosgw-admin bucket stats --bucket=<name>

::

   {
   ...
   "usage": {
@ -83,6 +94,9 @@ are included in its ``bucket stats``::
   ...
   }
Other commands and APIs will report object and bucket sizes based on their
uncompressed data.
The ``size_utilized`` and ``size_kb_utilized`` fields represent the total
size of compressed data, in bytes and kilobytes respectively.
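For a quick estimate of the achieved compression ratio, the utilized size can
be compared against the original size. The following sketch assumes that
``jq`` is installed and that the bucket's data falls under the usual
``rgw.main`` usage category:

.. prompt:: bash

   radosgw-admin bucket stats --bucket=<name> | jq '.usage["rgw.main"] | .size_utilized / .size'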
@ -15,13 +15,13 @@ Storage Clusters. :term:`Ceph Object Storage` supports two interfaces:
that is compatible with a large subset of the OpenStack Swift API.

Ceph Object Storage uses the Ceph Object Gateway daemon (``radosgw``), an HTTP
server designed to interact with a Ceph Storage Cluster. The Ceph Object
Gateway provides interfaces that are compatible with both Amazon S3 and
OpenStack Swift, and it has its own user management. Ceph Object Gateway can
use a single Ceph Storage Cluster to store data from Ceph File System clients
and from Ceph Block Device clients. The S3 API and the Swift API share a common
namespace, which means that it is possible to write data to a Ceph Storage
Cluster with one API and then retrieve that data with the other API.

.. ditaa::
@ -24,49 +24,48 @@ Varieties of Multi-site Configuration
.. versionadded:: Jewel

Since the Kraken release, Ceph has supported several multi-site configurations
for the Ceph Object Gateway:

- **Multi-zone:** The "multi-zone" configuration has a complex topology. A
  multi-zone configuration consists of one zonegroup and multiple zones. Each
  zone consists of one or more `ceph-radosgw` instances. **Each zone is backed
  by its own Ceph Storage Cluster.**

  The presence of multiple zones in a given zonegroup provides disaster
  recovery for that zonegroup in the event that one of the zones experiences a
  significant failure. Each zone is active and can receive write operations. A
  multi-zone configuration that contains multiple active zones enhances
  disaster recovery and can be used as a foundation for content-delivery
  networks.
- **Multi-zonegroups:** Ceph Object Gateway supports multiple zonegroups (which
  were formerly called "regions"). Each zonegroup contains one or more zones.
  If two zones are in the same zonegroup and that zonegroup is in the same
  realm as a second zonegroup, then the objects stored in the two zones share a
  global object namespace. This global object namespace ensures unique object
  IDs across zonegroups and zones.

  Each bucket is owned by the zonegroup where it was created (except where
  overridden by the :ref:`LocationConstraint<s3_bucket_placement>` on
  bucket creation), and its object data will replicate only to other zones in
  that zonegroup. Any request for data in that bucket that is sent to other
  zonegroups will redirect to the zonegroup where the bucket resides.

  It can be useful to create multiple zonegroups when you want to share a
  namespace of users and buckets across many zones and isolate the object data
  to a subset of those zones. For example, you might have several connected
  sites that share storage but require only a single backup for purposes of
  disaster recovery. In such a case, you could create several zonegroups with
  only two zones each to avoid replicating all objects to all zones.
  In other cases, you might isolate data in separate realms, with each realm
  having a single zonegroup. Zonegroups provide flexibility by making it
  possible to control the isolation of data and metadata separately.

- **Multiple Realms:** Since the Kraken release, the Ceph Object Gateway has
  supported "realms", which are containers for zonegroups. Realms make it
  possible to set policies that apply to multiple zonegroups. Realms have a
  globally unique namespace and can contain either a single zonegroup or
  multiple zonegroups. If you choose to make use of multiple realms, you can
  define multiple namespaces and multiple configurations (this means that each
@ -464,8 +463,8 @@ For example:
.. important:: The following steps assume a multi-site configuration that uses
   newly installed systems that have not yet begun storing data. **DO NOT
   DELETE the** ``default`` **zone or its pools** if you are already using it
   to store data, or the data will be irretrievably lost.

Delete the default zone if needed:
@ -528,6 +527,17 @@ running the following commands on the object gateway host:
   systemctl start ceph-radosgw@rgw.`hostname -s`
   systemctl enable ceph-radosgw@rgw.`hostname -s`
If the cluster was deployed with ``cephadm``, you cannot use ``systemctl`` to
start the gateway, because no such host services exist: the daemons of a
``cephadm``-deployed cluster are containerized. In that case, run a command of
the following form to start the gateway:

.. prompt:: bash #

   ceph orch apply rgw <name> --realm=<realm> --zone=<zone> --placement=<placement spec> --port=<port>
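A filled-in invocation might look like the following; the service name, realm,
zone, placement specification, and port are illustrative and must be adapted
to your own deployment:

.. prompt:: bash #

   ceph orch apply rgw myrgw --realm=gold --zone=us-east-1 --placement="2 host1 host2" --port=8000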
Checking Synchronization Status
-------------------------------
@ -154,6 +154,10 @@ updating, use the name of an existing topic and different endpoint values).
[&Attributes.entry.9.key=persistent&Attributes.entry.9.value=true|false]
[&Attributes.entry.10.key=cloudevents&Attributes.entry.10.value=true|false]
[&Attributes.entry.11.key=mechanism&Attributes.entry.11.value=<mechanism>]
[&Attributes.entry.12.key=time_to_live&Attributes.entry.12.value=<seconds to live>]
[&Attributes.entry.13.key=max_retries&Attributes.entry.13.value=<retries number>]
[&Attributes.entry.14.key=retry_sleep_duration&Attributes.entry.14.value=<sleep seconds>]
[&Attributes.entry.15.key=Policy&Attributes.entry.15.value=<policy-JSON-string>]
Request parameters:
@ -11,16 +11,13 @@ multiple zones.
Tuning
======

When ``radosgw`` first tries to operate on a zone pool that does not exist, it
will create that pool with the default values from ``osd pool default pg num``
and ``osd pool default pgp num``. These defaults are sufficient for some pools,
but others (especially those listed in ``placement_pools`` for the bucket index
and data) will require additional tuning. See `Pools
<http://docs.ceph.com/en/latest/rados/operations/pools/#pools>`__ for details
on pool creation.
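If you prefer to create and size these pools yourself before ``radosgw`` first
touches them, something like the following can be used. This is a sketch: the
pool names assume the default zone's naming scheme, and the PG counts are
illustrative only.

.. prompt:: bash

   ceph osd pool create default.rgw.buckets.index 32 32
   ceph osd pool create default.rgw.buckets.data 128 128
   ceph osd pool application enable default.rgw.buckets.index rgw
   ceph osd pool application enable default.rgw.buckets.data rgw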
.. _radosgw-pool-namespaces:
@ -90,7 +90,8 @@ $ sudo ln -sf /usr/local/openresty/bin/openresty /usr/bin/nginx
Put your Nginx configuration files in place and edit them according to your
environment:

All Nginx conf files are under:
https://github.com/ceph/ceph/tree/main/examples/rgw/rgw-cache
`nginx.conf` should go to `/etc/nginx/nginx.conf`
@ -2,14 +2,20 @@
Role
======

A role is similar to a user. It has permission policies attached to it that
determine what it can do and what it cannot do. A role can be assumed by any
identity that needs it. When a user assumes a role, a set of dynamically
created temporary credentials is returned to the user. A role can be used to
delegate access to users, to applications, and to services that do not have
permissions to access certain S3 resources.

The following ``radosgw-admin`` commands can be used to create, delete, or
update a role and the permissions associated with it.

Create a Role
-------------

To create a role, run a command of the following form::

   radosgw-admin role create --role-name={role-name} [--path="{path to the role}"] [--assume-role-policy-doc={trust-policy-document}]
@ -23,12 +29,13 @@ Request Parameters
``path``

:Description: Path to the role. The default value is a slash (``/``).

:Type: String

``assume-role-policy-doc``

:Description: The trust relationship policy document that grants an entity
              permission to assume the role.

:Type: String

For example::
@ -51,7 +58,9 @@ For example::
Delete a Role
-------------

To delete a role, run a command of the following form:

.. prompt:: bash

   radosgw-admin role delete --role-name={role-name}
@ -63,16 +72,21 @@ Request Parameters
:Description: Name of the role.

:Type: String

For example:

.. prompt:: bash

   radosgw-admin role delete --role-name=S3Access1

Note: A role can be deleted only when it has no permission policy attached to
it.
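If deletion is rejected because policies are still attached, they can be
listed and removed first. This sketch assumes the ``role-policy`` subcommands
available in recent releases and the illustrative policy name ``Policy1``:

.. prompt:: bash

   radosgw-admin role-policy list --role-name=S3Access1
   radosgw-admin role-policy delete --role-name=S3Access1 --policy-name=Policy1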
Get a Role
----------

To get information about a role, run a command of the following form:

.. prompt:: bash

   radosgw-admin role get --role-name={role-name}
@ -84,7 +98,9 @@ Request Parameters
:Description: Name of the role.

:Type: String

For example:

.. prompt:: bash

   radosgw-admin role get --role-name=S3Access1
@ -104,7 +120,9 @@ For example::
List Roles
----------

To list roles with a specified path prefix, run a command of the following
form:

.. prompt:: bash

   radosgw-admin role list [--path-prefix={path prefix}]
@ -113,10 +131,13 @@ Request Parameters
``path-prefix``

:Description: Path prefix for filtering roles. If this is not specified, all
              roles are listed.

:Type: String

For example:

.. prompt:: bash

   radosgw-admin role list --path-prefix="/application"
@ -134,7 +155,6 @@ For example::
    }
]

Update Assume Role Policy Document of a role
--------------------------------------------
@ -334,6 +354,7 @@ Create a Role
-------------

Example::

  POST "<hostname>?Action=CreateRole&RoleName=S3Access&Path=/application_abc/component_xyz/&AssumeRolePolicyDocument=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Principal\":\{\"AWS\":\[\"arn:aws:iam:::user/TESTER\"\]\},\"Action\":\[\"sts:AssumeRole\"\]\}\]\}"

.. code-block:: XML
@ -353,14 +374,18 @@ Delete a Role
-------------

Example::

  POST "<hostname>?Action=DeleteRole&RoleName=S3Access"

Note: A role can be deleted only when it has no permission policies attached
to it. If you intend to delete a role, first delete any policies attached to
it.

Get a Role
----------

Example::

  POST "<hostname>?Action=GetRole&RoleName=S3Access"

.. code-block:: XML
@ -380,6 +405,7 @@ List Roles
----------

Example::

  POST "<hostname>?Action=ListRoles&RoleName=S3Access&PathPrefix=/application"

.. code-block:: XML
@ -399,18 +425,21 @@ Update Assume Role Policy Document
----------------------------------

Example::

  POST "<hostname>?Action=UpdateAssumeRolePolicy&RoleName=S3Access&PolicyDocument=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Principal\":\{\"AWS\":\[\"arn:aws:iam:::user/TESTER2\"\]\},\"Action\":\[\"sts:AssumeRole\"\]\}\]\}"

Add/Update a Policy attached to a Role
---------------------------------------

Example::

  POST "<hostname>?Action=PutRolePolicy&RoleName=S3Access&PolicyName=Policy1&PolicyDocument=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Action\":\[\"s3:CreateBucket\"\],\"Resource\":\"arn:aws:s3:::example_bucket\"\}\]\}"

List Permission Policy Names attached to a Role
-----------------------------------------------

Example::

  POST "<hostname>?Action=ListRolePolicies&RoleName=S3Access"

.. code-block:: XML
@ -424,6 +453,7 @@ Get Permission Policy attached to a Role
----------------------------------------

Example::

  POST "<hostname>?Action=GetRolePolicy&RoleName=S3Access&PolicyName=Policy1"

.. code-block:: XML
@ -439,6 +469,7 @@ Delete Policy attached to a Role
--------------------------------

Example::

  POST "<hostname>?Action=DeleteRolePolicy&RoleName=S3Access&PolicyName=Policy1"

Tag a role
@ -447,6 +478,7 @@ A role can have multivalued tags attached to it. These tags can be passed in as
AWS does not support multi-valued role tags.

Example::

  POST "<hostname>?Action=TagRole&RoleName=S3Access&Tags.member.1.Key=Department&Tags.member.1.Value=Engineering"

.. code-block:: XML
@ -463,6 +495,7 @@ List role tags
Lists the tags attached to a role.

Example::

  POST "<hostname>?Action=ListRoleTags&RoleName=S3Access"

.. code-block:: XML
@ -486,6 +519,7 @@ Delete role tags
Delete a tag or tags attached to a role.

Example::

  POST "<hostname>?Action=UntagRoles&RoleName=S3Access&TagKeys.member.1=Department"

.. code-block:: XML
@ -500,6 +534,7 @@ Update Role
-----------

Example::

  POST "<hostname>?Action=UpdateRole&RoleName=S3Access&MaxSessionDuration=43200"

.. code-block:: XML
@ -565,6 +600,3 @@ The following is sample code for adding tags to role, listing tags and untagging
        'Department',
    ]
)
@ -104,7 +104,7 @@ An example of a role permission policy that uses aws:PrincipalTag is as follows:
{
"Effect":"Allow",
"Action":["s3:*"],
"Resource":["arn:aws:s3::t1tenant:my-test-bucket","arn:aws:s3::t1tenant:my-test-bucket/*"],
"Condition":{"StringEquals":{"aws:PrincipalTag/Department":"Engineering"}}
}]
}
@ -32,9 +32,9 @@ the ``librbd`` library.
Ceph's block devices deliver high performance with vast scalability to
`kernel modules`_, or to :abbr:`KVMs (kernel virtual machines)` such as `QEMU`_, and
cloud-based computing systems like `OpenStack`_, `OpenNebula`_ and `CloudStack`_
that rely on libvirt and QEMU to integrate with Ceph block devices. You can use
the same cluster to operate the :ref:`Ceph RADOS Gateway <object-gateway>`, the
:ref:`Ceph File System <ceph-file-system>`, and Ceph block devices simultaneously.

.. important:: To use Ceph Block Devices, you must have access to a running
@ -69,4 +69,5 @@ to operate the :ref:`Ceph RADOS Gateway <object-gateway>`, the
.. _kernel modules: ./rbd-ko/
.. _QEMU: ./qemu-rbd/
.. _OpenStack: ./rbd-openstack
.. _OpenNebula: https://docs.opennebula.io/stable/open_cluster_deployment/storage_setup/ceph_ds.html
.. _CloudStack: ./rbd-cloudstack
@ -41,10 +41,11 @@ illustrates how ``libvirt`` and QEMU use Ceph block devices via ``librbd``.
The most common ``libvirt`` use case involves providing Ceph block devices to
cloud solutions like OpenStack, OpenNebula or CloudStack. The cloud solution uses
``libvirt`` to interact with QEMU/KVM, and QEMU/KVM interacts with Ceph block
devices via ``librbd``. See `Block Devices and OpenStack`_,
`Block Devices and OpenNebula`_ and `Block Devices and CloudStack`_ for details.
See `Installation`_ for installation details.

You can also use Ceph block devices with ``libvirt``, ``virsh`` and the
``libvirt`` API. See `libvirt Virtualization API`_ for details.
@ -309,6 +310,7 @@ within your VM.
.. _Installation: ../../install
.. _libvirt Virtualization API: http://www.libvirt.org
.. _Block Devices and OpenStack: ../rbd-openstack
.. _Block Devices and OpenNebula: https://docs.opennebula.io/stable/open_cluster_deployment/storage_setup/ceph_ds.html#datastore-internals
.. _Block Devices and CloudStack: ../rbd-cloudstack
.. _Create a pool: ../../rados/operations/pools#create-a-pool
.. _Create a Ceph User: ../../rados/operations/user-management#add-a-user
@ -0,0 +1,70 @@
---------------------------------
NVMe/TCP Initiator for VMware ESX
---------------------------------
Prerequisites
=============
- A VMware ESXi host running VMware vSphere Hypervisor (ESXi) version 7.0U3 or later.
- Deployed Ceph NVMe-oF gateway.
- Ceph cluster with NVMe-oF configuration.
- Subsystem defined in the gateway.
Configuration
=============
The following instructions will use the default vSphere web client and esxcli.
1. Enable NVMe/TCP on a NIC:
.. prompt:: bash #
esxcli nvme fabric enable --protocol TCP --device vmnicN
Replace ``N`` with the number of the NIC.
2. Tag a VMKernel NIC to permit NVMe/TCP traffic:
.. prompt:: bash #
esxcli network ip interface tag add --interface-name vmkN --tagname NVMeTCP
Replace ``N`` with the ID of the VMkernel.
3. Configure the VMware ESXi host for NVMe/TCP:
#. List the NVMe-oF adapter:
.. prompt:: bash #
esxcli nvme adapter list
#. Discover NVMe-oF subsystems:
.. prompt:: bash #
esxcli nvme fabric discover -a NVME_TCP_ADAPTER -i GATEWAY_IP -p 4420
#. Connect to the NVMe-oF gateway subsystem:
.. prompt:: bash #
esxcli nvme connect -a NVME_TCP_ADAPTER -i GATEWAY_IP -p 4420 -s SUBSYSTEM_NQN
#. List the NVMe/TCP controllers:
.. prompt:: bash #
esxcli nvme controller list
#. List the NVMe-oF namespaces in the subsystem:
.. prompt:: bash #
esxcli nvme namespace list
4. Verify that the initiator has been set up correctly:
#. From the vSphere client go to the ESXi host.
#. On the Storage page go to the Devices tab.
#. Verify that the NVMe/TCP disks are listed in the table.
@ -0,0 +1,83 @@
==============================
NVMe/TCP Initiator for Linux
==============================
Prerequisites
=============
- Kernel 5.0 or later
- RHEL 9.2 or later
- Ubuntu 24.04 or later
- SLES 15 SP3 or later
Installation
============
1. Install the nvme-cli:
.. prompt:: bash #
yum install nvme-cli
2. Load the NVMe-oF module:
.. prompt:: bash #
modprobe nvme-fabrics
3. Verify the NVMe/TCP target is reachable:
.. prompt:: bash #
nvme discover -t tcp -a GATEWAY_IP -s 4420
4. Connect to the NVMe/TCP target:
.. prompt:: bash #
nvme connect -t tcp -a GATEWAY_IP -n SUBSYSTEM_NQN
Next steps
==========
Verify that the initiator is set up correctly:
1. List the NVMe block devices:
.. prompt:: bash #
nvme list
2. Create a filesystem on the desired device:
.. prompt:: bash #
mkfs.ext4 NVME_NODE_PATH
3. Mount the filesystem:
.. prompt:: bash #
mkdir /mnt/nvmeof
.. prompt:: bash #
mount NVME_NODE_PATH /mnt/nvmeof
4. List the NVMe-oF files:
.. prompt:: bash #
ls /mnt/nvmeof
5. Create a text file in the ``/mnt/nvmeof`` directory:
.. prompt:: bash #
echo "Hello NVME-oF" > /mnt/nvmeof/hello.text
6. Verify that the file can be accessed:
.. prompt:: bash #
cat /mnt/nvmeof/hello.text
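7. Optionally, when the test is finished, unmount the filesystem and tear down
   the NVMe-oF session (a minimal cleanup sketch):

   .. prompt:: bash #

      umount /mnt/nvmeof

   .. prompt:: bash #

      nvme disconnect -n SUBSYSTEM_NQN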