api: ceph: improve reporting of ceph OSD memory usage

Currently, we use the MemoryCurrent property of the OSD's systemd
service to determine the memory used by a Ceph OSD. This value includes,
among other things, the memory used by buffers [1]. Since BlueFS uses
buffered I/O, the values shown in the UI can be extremely high.

Instead, we now read the PSS (proportional set size) value from
/proc/<pid>/smaps_rollup, which should more accurately reflect the
amount of memory currently used by the Ceph OSD.
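
For illustration only, a minimal Python sketch of that lookup (the patch
itself does this in Perl). The helper name and the sample text are
hypothetical; only the "Pss: <n> kB" line format is taken from the kernel.

```python
import re

# Hypothetical helper, not part of the patch: extract the Pss total from the
# text of /proc/<pid>/smaps_rollup and return it in bytes.
def pss_bytes(smaps_rollup_text):
    for line in smaps_rollup_text.splitlines():
        match = re.match(r"^Pss:\s+([0-9]+) kB$", line)
        if match:
            # The kernel reports the value in kB (units of 1024 bytes).
            return int(match.group(1)) * 1024
    return 0  # no Pss line found


# Made-up sample in the smaps_rollup layout; real files contain more fields.
SAMPLE = """\
55d0c8a00000-7ffd3a5fe000 ---p 00000000 00:00 0                          [rollup]
Rss:             3555328 kB
Pss:             3145728 kB
Pss_Anon:        3100000 kB
"""

print(pss_bytes(SAMPLE))
```

Anchoring the pattern on "Pss:" keeps related fields such as Pss_Anon
from matching by accident.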

Aaron and I decided on PSS over RSS, since it should give a better
idea of the memory used, particularly with a large number of OSDs on
one host: the OSDs share some of their pages, and PSS charges each
process only its proportional share of them.
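
A rough worked example of the difference, as a Python sketch. The
numbers are invented, and the model is simplified: it assumes every
shared page is shared by the same number of processes, whereas the
kernel accounts for this per page.

```python
MIB = 1024 * 1024

def estimate_rss(private, shared):
    # RSS counts every resident page in full, shared or not.
    return private + shared

def estimate_pss(private, shared, sharers):
    # PSS charges each process an equal fraction of the pages it shares.
    return private + shared // sharers


# Hypothetical host: 4 OSDs, each with 3072 MiB of private memory,
# all mapping the same 400 MiB of shared pages.
osds = 4
private = 3072 * MIB
shared = 400 * MIB

rss_total = osds * estimate_rss(private, shared)
pss_total = osds * estimate_pss(private, shared, osds)

print(rss_total // MIB)  # shared pages counted once per OSD
print(pss_total // MIB)  # shared pages counted once in total
```

In this model the per-OSD RSS values add up to more than the memory
actually in use, while the PSS values add up to the private memory plus
one copy of the shared pages.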

[1] https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt

Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
Tested-by: Aaron Lauterer <a.lauterer@proxmox.com>
Stefan Hanreich 2023-09-04 11:18:07 +02:00 committed by Thomas Lamprecht
parent bacb4173fb
commit 808eb12f8c
2 changed files with 17 additions and 6 deletions

@@ -687,13 +687,10 @@ __PACKAGE__->register_method ({
     my $raw = '';
     my $pid;
-    my $memory;
     my $parser = sub {
         my $line = shift;
         if ($line =~ m/^MainPID=([0-9]*)$/) {
             $pid = $1;
-        } elsif ($line =~ m/^MemoryCurrent=([0-9]*|\[not set\])$/) {
-            $memory = $1 eq "[not set]" ? 0 : $1;
         }
     };
@@ -702,12 +699,26 @@ __PACKAGE__->register_method ({
         'show',
         "ceph-osd\@${osdid}.service",
         '--property',
-        'MainPID,MemoryCurrent',
+        'MainPID',
     ];
     run_command($cmd, errmsg => 'fetching OSD PID and memory usage failed', outfunc => $parser);
     $pid = defined($pid) ? int($pid) : undef;
-    $memory = defined($memory) ? int($memory) : undef;
+    my $memory = 0;
+    if ($pid && $pid > 0) {
+        open (my $SMAPS, '<', "/proc/$pid/smaps_rollup")
+            or die "failed to read PSS memory-stat from process - $!\n";
+        while (my $line = <$SMAPS>) {
+            if ($line =~ m/^Pss:\s+([0-9]+) kB$/) {
+                $memory = $1 * 1024;
+                last;
+            }
+        }
+        close $SMAPS;
+    }
     my $data = {
         osd => {

@@ -148,7 +148,7 @@ Ext.define('PVE.CephOsdDetails', {
     {
         xtype: 'text',
         name: 'mem_usage',
-        text: gettext('Memory usage'),
+        text: gettext('Memory usage (PSS)'),
         renderer: Proxmox.Utils.render_size,
     },
     {