Working could be confused with "being ok", which isn't what we want to
convey here, as the lack of this status doesn't mean something "isn't
working".
So use busy, not 100% perfect but a bit closer to what we want to
convey while not taking up a whole paragraph or the like.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Like ceph mgr dashboard, we need a warning state.
- set degraded as warning instead of working
- set undersized as warning instead of error
- rename error to critical
- add "busy" (info-blue) color for working state
- use warning (orange) color for warning state
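
A rough sketch of the resulting state-to-severity mapping, written in
Python for illustration only and not the actual UI code. Only the
`degraded` and `undersized` -> warning assignments come from the list
above; the function name, the "clean" category and the example states
for critical and busy are assumptions.

```python
# Hypothetical sketch: only 'degraded' and 'undersized' -> warning is taken
# from the commit message, the other state lists are illustrative assumptions.
WARNING_STATES = {"degraded", "undersized"}
CRITICAL_STATES = {"down", "incomplete"}      # assumed examples
BUSY_STATES = {"backfilling", "recovering"}   # assumed examples

# Ordered from least to most severe.
SEVERITY_ORDER = ["clean", "busy", "warning", "critical"]


def classify_pg_state(state: str) -> str:
    """Return the most severe category for a '+'-joined PG state string."""
    worst = "clean"
    for part in state.split("+"):
        if part in CRITICAL_STATES:
            category = "critical"
        elif part in WARNING_STATES:
            category = "warning"
        elif part in BUSY_STATES:
            category = "busy"
        else:
            category = "clean"
        if SEVERITY_ORDER.index(category) > SEVERITY_ORDER.index(worst):
            worst = category
    return worst
```

With this ordering, a PG that is both recovering and degraded is shown
as warning rather than busy.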
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Tested-By: Aaron Lauterer <a.lauterer@proxmox.com>
Reviewed-By: Aaron Lauterer <a.lauterer@proxmox.com>
[ TL: fold in CSS class addition ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
talked with Aaron off-list and he found it OK to drop this button now
that "Copy Details" became a "Copy All".
This reduces cognitive load on the user as there are half as many
buttons.
Rename "Copy All" to "Copy to Clipboard" now that there's only one and
drop the disable logic.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this causes jumps and is IMO rather irritating, keep hands off from
scrolling, that's best done by user/browser.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
to make it more clear that this is not the details, but a UI text
placeholder.
Add a `pmx-faded` class that reduces opacity, as there was a recent
discussion about adding such a utility class to widget-toolkit anyway.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
by
* replacing the info button with expandable rows that contain the
details of the warning
* adding two action buttons to copy the summary and details
* making the text selectable
The row expander works like the one in the mail gateway tracking center
-> a double-click only opens it.
The height of the warning grid is limited to not grow too large.
A Diffstore is used to avoid expanded rows being collapsed on an update.
The rowexpander cannot hide the toggle out of the box. Therefore, if
there is no detailed message for a warning, we show a placeholder text.
We could consider extending it in the future to only show the toggle if
a defined condition is met.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
still default to Ceph 17.2 Quincy for now, at least if there isn't a
Ceph Reef set-up in the cluster already.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Currently we are using the MemoryCurrent property of the OSD service
to determine the used memory of a Ceph OSD. This includes, among other
things, the memory used by buffers [1]. Since BlueFS uses buffered
I/O, this can lead to extremely high values shown in the UI.
Instead we are now reading the PSS value from the proc filesystem,
which should more accurately reflect the amount of memory currently
used by the Ceph OSD.
Aaron and I decided on PSS over RSS, since this should give a better
idea of used memory - particularly when using a large number of OSDs
on one host, since the OSDs share some of the pages.
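
As an illustration of reading PSS from the proc filesystem, a minimal
Python sketch follows. It assumes the value is taken from
`/proc/<pid>/smaps_rollup`; the exact file the patch reads and the
function names are assumptions, not the actual implementation.

```python
from pathlib import Path


def parse_pss_bytes(smaps_rollup: str) -> int:
    """Extract the 'Pss:' value from smaps_rollup-style text, in bytes."""
    for line in smaps_rollup.splitlines():
        # 'Pss_Anon:' and similar lines are skipped, as they start with 'Pss_'.
        if line.startswith("Pss:"):
            fields = line.split()
            # Expected form: 'Pss:    123456 kB' (the kernel's kB is 1024 bytes)
            return int(fields[1]) * 1024
    raise ValueError("no Pss line found")


def read_pss_bytes(pid: int) -> int:
    """Read the PSS of a process; needs read access to its proc entry."""
    # Assumed location, see the note above.
    return parse_pss_bytes(Path(f"/proc/{pid}/smaps_rollup").read_text())
```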
[1] https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
Tested-by: Aaron Lauterer <a.lauterer@proxmox.com>
Since some languages translate byte units like 'GiB' or write them in their
own script, this patch wraps units in the `gettext` function.
While most occurrences of byte strings can be translated within the
`format_size` function in `proxmox-widget-toolkit/src/Utils.js`, this patch
catches those instances that are not translated.
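
The idea of wrapping a unit in `gettext` can be sketched as follows.
This is a Python analogue for illustration, not the JavaScript code
touched by the patch, and `format_size_gib` is a hypothetical helper.

```python
import gettext

# Without an installed translation catalog this returns its input unchanged.
_ = gettext.gettext


def format_size_gib(size_bytes: int) -> str:
    """Format a byte count in GiB with a translatable unit string."""
    value = size_bytes / 1024**3
    # Only the unit goes through gettext, the number is formatted separately.
    return f"{value:.2f} {_('GiB')}"
```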
Signed-off-by: Noel Ullreich <n.ullreich@proxmox.com>
The pool number is shown in a few places, having it easily accessible
can help to understand which pool a warning/error refers to.
For example, the PG ID consists of '{pool nr}.{pg nr}' and is shown in
every warning concerning that PG.
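
For illustration, mapping a PG ID back to its pool could look like the
following Python sketch. The function names and the `pools` lookup
table are hypothetical.

```python
from typing import Dict


def pool_from_pgid(pgid: str) -> int:
    """Return the pool number from a PG ID of the form '{pool nr}.{pg nr}'."""
    pool, _, pg = pgid.partition(".")
    if not pool.isdigit() or not pg:
        raise ValueError(f"unexpected PG ID format: {pgid!r}")
    return int(pool)


def pool_name_for_pg(pgid: str, pools: Dict[int, str]) -> str:
    """Look up the pool name for a PG ID in a {pool nr: name} mapping."""
    return pools.get(pool_from_pgid(pgid), "unknown")
```

For example, a warning mentioning PG `2.1f` refers to the pool with
number 2.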
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
looks a bit odd as the background it produces goes over the text, but
is the least invasive method to apply something like this, and
highlighting the whole thing is too flashy here.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
No hint is required if all nodes have subscriptions and the enterprise
repo is selected, but otherwise give some hints for better UX and to
(hopefully) reduce the chance for mishaps.
We might want to highlight the label to improve visibility though.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
provide a second combo box that allows one to select which specific
repository out of enterprise, no-subscription or test one would like
to use.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
If we just pass the me.reload as function reference it won't be
executed with `this` being the view controller, so call it directly
on that instead.
Reported-by: Stefan Hanreich <s.hanreich@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
ceph/pools (plural) is deprecated, use the new one.
Since the details / status of a pool has been moved from previously
ceph/pools/{name} to now ceph/pool/{name}/status, we need to pass the
'loadUrl' to the edit window.
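
A small Python sketch of how such a status URL could be assembled. The
`/nodes/{node}` prefix and the helper name are assumptions; only the
`ceph/pool/{name}/status` part is taken from the text above.

```python
from urllib.parse import quote


def pool_status_url(node: str, pool: str) -> str:
    """Build the (assumed) API path for the status of a single pool."""
    # Quote both parts so unusual characters cannot break the path.
    return f"/nodes/{quote(node, safe='')}/ceph/pool/{quote(pool, safe='')}/status"
```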
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
One MDS can only serve a single CephFS at a time and for redundancy
one wants to have standbys on other nodes.
But with multiple CephFS instances a single MDS per node might not be
enough, e.g., with three FS on a three-node cluster a failure of one
node would mean that one CephFS won't work anymore.
While the API and CLI already allowed setting up multiple MDS per node,
the UI didn't. Address this by adding an `Extra ID` field
that will be suffixed to the base ID, which always contains the node
as that makes sorting and also associating services to their node
easier.
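
The ID composition can be sketched like this in Python. The `-`
separator and the function name are assumptions made for illustration.

```python
from typing import Optional


def mds_id(node: str, extra_id: Optional[str] = None) -> str:
    """Compose an MDS ID from the node name and an optional extra ID."""
    if not extra_id:
        # Without an extra ID the base ID is just the node name.
        return node
    # Assumed separator; the base ID always stays at the front for sorting.
    return f"{node}-{extra_id}"
```

Keeping the node name as prefix means that IDs such as `pve1` and
`pve1-b` sort next to each other.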
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This new window provides more details about an OSD such as:
* PID
* Memory usage
* various metadata that could be of interest
* list of physical disks used for the main disk, db and wal with
additional infos about the volumes for each
A new 'Details' button is added to the OSD overview and a double click
on an OSD will also open this new window.
The component defines the items in the initComponent instead of
following a fully declarative approach. This is because we need to pass
the same store to multiple Proxmox.ObjectGrids.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
Since the rule selector is not allowed to be empty, but the loading
of the rules is not instant, the validity change will trigger before
the load was finished. Since it is in the advanced section, it will
be opened every time instead of only when there is an invalid value.
This patch fixes that by temporarily setting 'allowBlank' to true
until the store is loaded, and then it revalidates the field.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Aaron Lauterer <a.lauterer@proxmox.com>
add support for setting the background and text color via css. also
allows for dynamically switching the color when a theme change is
detected.
Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
inline the transformation for the health store and also avoid setting
raw data from the outside
and drop some bogus comments along the way: first, one should mostly
use "why?" not "what happens?" comments, and second, commenting
straightforward things always makes one pause and recheck everything
far too often, as a comment indicates there is something non-obvious
happening.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Some users have a more complicated CRUSH hierarchy, for example with a
stretched cluster. The additional hierarchy steps (datacenter, rack,
room, ...) are shown in the OSD panel. Showing a generic icon for any
CRUSH types that do not have a specific icon configured will make it easier
to navigate the tree as it will not look somewhat broken and empty.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
By switching from 'ceph osd tree' to the 'ceph osd df tree' mon API
equivalent, we get the same data structure with more information per
OSD. One of them is the number of PGs stored on that OSD.
The number of PGs per OSD is an important number, for example when
trying to figure out why the performance is not as good as expected.
Therefore, adding it to the OSD overview, visible by default, should
reduce the number of times one needs to access the CLI.
Comparing runtime cost on a 3 node ceph cluster with 4 OSDs each doing 50k
iterations gives:
                Rate osd-df-tree    osd-tree
osd-df-tree   9141/s          --        -25%
osd-tree     12136/s         33%          --
So, while definitely a bit slower, it's still in the µs range,
and as such below HTTP in TLS in TCP connection setup for most users,
so worth the extra useful information.
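
To make the numbers more tangible, the rates can be converted to time
per call with a few lines of Python. The helper names are made up for
this example; the rates are the ones from the table above.

```python
# Calls per second as measured in the benchmark above.
RATES = {"osd-df-tree": 9141, "osd-tree": 12136}


def microseconds_per_call(rate_per_second: float) -> float:
    """Convert a rate in calls per second to microseconds per call."""
    return 1_000_000 / rate_per_second


def relative_percent(rate: float, baseline: float) -> float:
    """Relative difference of 'rate' compared to 'baseline' in percent."""
    return (rate / baseline - 1) * 100


for name, rate in RATES.items():
    print(f"{name}: {microseconds_per_call(rate):.1f} µs per call")
```

This gives roughly 109 µs per call for osd-df-tree versus roughly 82 µs
for osd-tree, so about 27 µs of extra cost per call.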
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
[ TL: slight rewording of subject and add benchmark data ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Check if stopping of a service (OSD, MON, MDS) will be problematic for
Ceph. The warning still allows the user to proceed.
Ceph also has a check if the destruction of a MON is okay, so let's use
it.
Instead of the common OK button, label it with `Stop OSD` and so forth
to hopefully reduce the "click OK by habit" incidents.
This will not catch it every time as Ceph can need a few moments after a
change to establish its current status. For example, stopping one of 3
MONs and then right away destroying one of the last two running MONs
will most likely not trigger the warning. Doing so after a few seconds
should show the warning though.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
If an OSD is removed under the wrong conditions, it could lead to
blocked IO or, in the worst case, data loss.
Check against global flags that limit the capabilities of Ceph to heal
itself (norebalance, norecover, noout) and if there are degraded
objects.
Unfortunately, the 'safe-to-destroy' Ceph API endpoint will not help
here as it only works as long as the OSD is still running. By the time
the destroy button is enabled, the OSD will already be stopped.
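
The check described above could be sketched as follows. This is a
Python illustration only; the function name, the message wording and
the way flags and the degraded object count are passed in are
assumptions.

```python
from typing import Iterable, List

# Global flags that limit the ability of Ceph to heal itself.
BLOCKING_FLAGS = ("norebalance", "norecover", "noout")


def destroy_warnings(active_flags: Iterable[str], degraded_objects: int) -> List[str]:
    """Collect reasons why destroying an OSD right now could be risky."""
    flags = set(active_flags)
    warnings = [f"global flag '{flag}' is set" for flag in BLOCKING_FLAGS if flag in flags]
    if degraded_objects > 0:
        warnings.append(f"{degraded_objects} degraded objects")
    return warnings
```

An empty list means none of the checked conditions apply; any entry
would be shown to the user before the OSD is destroyed.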
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Note that we still check the cluster for an already used installation
and will select that, if any, so this is really just for setting up a
completely new cluster.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Removes the possibility to select the node on which to create the first
monitor in the configuration / initialization step and always sets it to
the current node.
This prevents a user from selecting another node on which the Ceph
packages have not yet been installed. If a user did that, they would get
an error, but the Ceph config file would have been written. If the user
then does not select a valid node to create the first mon, but aborts
the wizard, they are greeted with a rados_connect error because the
config file exists, but it does not contain any mon infos that are
needed to connect to the Ceph cluster.
Creating a mon manually will remedy such a situation, but especially for
new users, this behavior is not ideal and confusing.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
By not auto filling the Ceph public network we can avoid accidental
clicks on 'Next' which will cause the first Mon to be created with a
potentially wrong network. While that is fixable, it is tedious and
can be easily avoided by making the user always select the network to
use.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Tested-by: Stefan Hrdlicka <s.hrdlicka@proxmox.com>
[ TL: adapted commit subject to be more specific and match our common
style ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
They cannot be changed after pool creation for erasure coded pools
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
The in & out commands for OSDs are not node specific and can be run on
any node in the Ceph cluster. By sending them to the node currently used
to access the UI they can still be sent even if the node on which the
OSDs are located is down.
This helps in a disaster scenario where a node is down. By default Ceph
will mark a downed OSD as out after 10 minutes. This could be too long
in some situations. Running the CLI command to mark the OSD as out
earlier on one of the remaining nodes does work, but if the admin is not
used to doing it this way, this adds stress in a potentially already
stressful situation.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Showing already configured custom device classes makes it easier to
create new OSDs with custom device classes.
The Crush map contains a list of all OSDs in the cluster, including
their device class.
This means we can create a list of used device classes from it, avoiding
adding another API endpoint.
Fetching the crushmap should also be quite a bit less data that needs to
be transferred, compared to the other possible nodes/<node>/ceph/osd
endpoint, especially in larger clusters.
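
A minimal Python sketch of collecting the used device classes. It
assumes the CRUSH map is available as decompiled text with lines of the
form `device 0 osd.0 class hdd`; the function name and the regular
expression are illustrative, not the actual UI code.

```python
import re
from typing import List

# Matches 'device <nr> osd.<id>' with an optional 'class <name>' suffix.
DEVICE_LINE = re.compile(r"^device\s+\d+\s+osd\.\d+(?:\s+class\s+(\S+))?\s*$")


def device_classes(crush_text: str) -> List[str]:
    """Return the sorted, de-duplicated device classes used by any OSD."""
    classes = set()
    for line in crush_text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        # OSDs without a class are matched too, but contribute nothing.
        if match and match.group(1):
            classes.add(match.group(1))
    return sorted(classes)
```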
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Ext.util.Sorter does not have an 'order' property, so 'order: DESC'
didn't have an effect. The default is 'ASC' and it is arguably the
preferred direction for all affected sorters anyways.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>