remove vmid from data part, it is already contained in object part.
this is accomplished by adding the parameter $excluded to
build_influxdb_payload().
Signed-off-by: Lorenz Stechauner <l.stechauner@proxmox.com>
we set the api prefix by default to '/', so we always triggered
the replacement and added '///', which is wrong and does not
work for the 'health' api path
(influxdb returns 404 for 'https://ip:port///health')
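
For illustration only, a minimal Python sketch of joining the pieces so that a
prefix of '/' (or an empty one) adds nothing. The function name build_url and
its parameters are hypothetical and are not taken from the plugin code:

```python
def build_url(base: str, api_prefix: str, path: str) -> str:
    """Join base URL, optional API prefix and path with single slashes."""
    # Strip slashes from both segments and drop empty ones, so a prefix
    # of '/' or '' contributes nothing to the final URL.
    parts = [p for p in (api_prefix.strip("/"), path.strip("/")) if p]
    return base.rstrip("/") + "/" + "/".join(parts)


if __name__ == "__main__":
    print(build_url("https://ip:port", "/", "health"))
    print(build_url("https://ip:port", "/influx/", "health"))
```

With the default prefix this yields 'https://ip:port/health' rather than
'https://ip:port///health'.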
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the forward-compatible api of 1.8 only contains this path
(not api/v2/health) and it is also contained in the v2 api
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
I normally use a reverse proxy in front of my influxdb instances,
proxying all from the /influx/ path to the only locally listening
influxdb. So here I'd need to set "influx" as api-path-prefix.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Not a hard error, some network box (proxy) down the line could add it
for us, or it could be just not required, so ...
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
needs an organization/bucket (previously db) and an optional token
the http client does not fit exactly in the connect/send/disconnect
scheme, so it simply creates a request in 'connect',
does the actual http connection in 'send' and nothing in 'disconnect'
max-body-size is set to 25.000.000 bytes by default (the influxdb default)
and the timeout to 1 second (same as default graphite tcp timeout)
the token (if given) gets saved in /etc/pve/priv/metricserver/$ID.pw
it is optional, because the 1.8.x compatibility api does not need
authentication (in contrast to influxdb 2.x)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
like we do for the storage section configs
we will need this to store the token for influxdbs http api
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by providing the id or cfg to have better context in those methods
we will need that for influxdb http api
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
so that if one disables the plugin (e.g. because it is offline),
it will work even when the server is not reachable
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
since some users don't even have a full 1500 (and some systems might
have links with bigger MTU and not require as much fragmentation).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
We only need to check if the next data addition brings us over the
batch send size, not if we already have at least half of that data in
there, as otherwise we may again get over the batch send size.
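
A rough Python sketch of that check, for illustration. The names batch_lines
and max_batch_size are hypothetical; the real code is Perl and may differ in
detail:

```python
def batch_lines(lines, max_batch_size):
    """Yield byte batches of newline-terminated lines.

    A batch is flushed only when adding the next line would push it over
    max_batch_size. A single line longer than the limit is still sent,
    alone in its own batch.
    """
    batch = b""
    for line in lines:
        data = line.encode() + b"\n"
        # Look ahead: would this addition bring us over the limit?
        if batch and len(batch) + len(data) > max_batch_size:
            yield batch
            batch = b""
        batch += data
    if batch:
        yield batch


if __name__ == "__main__":
    for chunk in batch_lines(["aaaa", "bbbb", "cc"], 10):
        print(chunk)
```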
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The data passed to this closure was never free'd, depending on the
count of VM/CTs one could get >1 MB of RSS (!) memory leaked per
statd status cycle update run...
We could also use Scalar::Util's weaken to weaken a copy of this
variable, but as a simple undef works let's do that with a comment.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
For now it only handles the plugin registration and the two recently
integrated helpers.
But this is a preparation to move the external metrics server
update mechanism from a stateless always-newly-connect-send-disconnect
to a stateful, transaction-based mechanism; see later patches.
Keep the PVE::Status::Plugin use in pvestatd, as we read the cfs
hosted status.cfg there, and the parser is defined by the common
status plugin base module.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
in preparation of doing real transactions, with one batch connect +
send + disconnect, and not hundreds of those per update cycle..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Instead of doing multiple sends, one for each status metric line,
assemble it all in a string and send it out in a single go.
Per VM/CT/Node we had >10 lines to send, so this is quite the
reduction. But also note that, thanks to Nagle's algorithm,
this may not have had a big effect for TCP, as it buffered those small
writes anyhow.
For UDP it can reduce the packet count on the line dramatically,
though.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
after rethinking this it felt weird, sockets can already do this
themselves, so I checked out the IO::Socket::Timeout module, and yeah,
it's just an OOP wrapper for this, hiding the "scary" struct pack.
So instead of adding that as a dependency let's do it ourselves.
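
To show what that struct pack amounts to, here is a small Python sketch. It
assumes Linux, where struct timeval is two native longs, and assumes the
option in question is SO_SNDTIMEO (the message above does not name it). The
helper names are made up for this example:

```python
import socket
import struct

# Assumption: Linux layout of struct timeval (tv_sec, tv_usec as native longs).
TIMEVAL_FMT = "ll"


def set_send_timeout(sock: socket.socket, seconds: int, microseconds: int = 0) -> None:
    """Set the kernel-level send timeout directly on the socket."""
    timeval = struct.pack(TIMEVAL_FMT, seconds, microseconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO, timeval)


def get_send_timeout(sock: socket.socket) -> tuple:
    """Read the send timeout back as a (seconds, microseconds) tuple."""
    raw = sock.getsockopt(
        socket.SOL_SOCKET, socket.SO_SNDTIMEO, struct.calcsize(TIMEVAL_FMT)
    )
    return struct.unpack(TIMEVAL_FMT, raw)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    set_send_timeout(s, 1)
    print(get_send_timeout(s))
    s.close()
```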
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This is for TCP only, and TCP needs roughly 1.5 times the Round
Trip Time for connection setup. So, with a 1 second timeout we're still
good for connections with a round-trip latency of about 660 ms.
The assumption is that most of the time the status server is
relatively near (same datacenter, or region), and connections to it
are datacenter grade, and not like a spotty GPRS modem.
So, reduce this timeout to ensure that we do not block too long.
If anybody needs higher timeouts they can just change the default
anyway.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This change allows sending statistics to graphite over TCP.
So far only UDP is possible, which is not available in some environments, e.g. behind a load balancer.
Configuration example:
~ $ cat /etc/pve/status.cfg
graphite:
    server 10.20.30.40
    port 2003
    path proxmox
    proto tcp
    timeout 3
Signed-off-by: Martin Verges <martin.verges@croit.io>
we allow an id like storage.cfg but leave it optional (so we do not
break existing configs):
influxdb: name
so that one can export the data to multiple servers of the same type
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
since numbers can also be in '1.e-10' format, we have to change
how we check for a number
Scalar::Util is already core and we use it in PVE::Tools, so
no new dependency.
in case of "NaN" or "Infinity" we omit the key/value pair
else we quote like before
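
a small python sketch of that rule, only as an illustration (the real code
is perl and uses Scalar::Util; the name format_value is made up here, and
inputs are assumed to be strings, ints or floats):

```python
import math


def format_value(value):
    """Return the field value as text, or None if the pair should be omitted."""
    try:
        # float() also accepts exponent forms such as '1.e-10'.
        number = float(value)
    except (TypeError, ValueError):
        # Not a number: quote it as a string, escaping backslash and quote.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    if not math.isfinite(number):
        # NaN and +/-Infinity: omit the whole key/value pair.
        return None
    return str(value).strip()


if __name__ == "__main__":
    for v in ("1.e-10", "NaN", "Infinity", "running", 42):
        print(repr(v), "->", format_value(v))
```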
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the graphite daemon which accepts the data (carbon) only
accepts numeric values, and logs all invalid lines
since there were about 5 such values per vm/ct, this generated a lot of
noise in the carbon log
so we check with a regex if a value is numeric, and
additionally we have a blacklist of keys which seem to be numeric but
are either boolean (e.g. template) or a state (e.g. pid)
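
roughly, as a python sketch (illustration only: the regex and the blacklist
contents here are assumptions, the message above only names template and pid
as examples, and graphite_lines is a made-up name):

```python
import re

# Assumed pattern: optional sign, digits, optional decimal part.
NUMERIC_RE = re.compile(r"^-?\d+(\.\d+)?$")
# Hypothetical blacklist, containing only the two examples named above.
SKIP_KEYS = {"template", "pid"}


def graphite_lines(path, data, timestamp):
    """Build 'path.key value timestamp' lines for numeric metrics only."""
    lines = []
    for key, value in sorted(data.items()):
        if key in SKIP_KEYS:
            continue  # looks numeric, but is a boolean or a state
        if not NUMERIC_RE.match(str(value)):
            continue  # carbon would reject and log this line
        lines.append(f"{path}.{key} {value} {timestamp}")
    return lines


if __name__ == "__main__":
    sample = {"cpu": 0.5, "name": "vm1", "pid": 1234, "template": 0, "mem": 1024}
    print("\n".join(graphite_lines("proxmox.qemu.100", sample, 1600000000)))
```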
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This allows filtering by node in InfluxDB queries, so the statistics
of all virtual guests on a specific node can be queried.
While for InfluxDB this is only a tag which does not change where the
data is stored, Graphite - our other status plugin - has no such
mechanism available. If we added it to the object hierarchy,
e.g.: "qemu.$vmid.$nodename", a migration of a VM would result in two
different datasets.
So avoid breaking setups and omit it for Graphite for now.
Suggested-by: Daniel1108 <danielgallegosanchez@gmail.com>
CC: Daniel1108 <danielgallegosanchez@gmail.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
As some Makefiles in sub directories do not implement the distclean
target, namely:
PVE/Service/Makefile
PVE/CLI/Makefile
This target is broken.
As all other implementations just redirect to the 'clean' target, I
do not implement the missing ones but rather remove all such
targets. Keep it just in the top level directory, for consistency's
sake with other pve repos, and redirect it there directly to the
clean target.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this patch fixes an issue where we assigned the influxdb
key/value pairs to the wrong measurement
also, we only allowed integer fields,
excluding all cpu, load and wait measurements
this patch fixes both issues with a rewrite of the
recursive build_influxdb_payload sub
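
a hypothetical python sketch of such a recursive builder, to illustrate the
two points (fields stay with their own measurement, floats are allowed).
it is not the actual perl sub; the name build_payload, the tag handling and
the naming of nested measurements are assumptions:

```python
def build_payload(measurement, data, tags, timestamp):
    """Return influxdb line-protocol lines, one per (nested) measurement."""
    fields, lines = [], []
    for key, value in sorted(data.items()):
        if isinstance(value, dict):
            # Nested hash: emit it as its own measurement instead of
            # mixing its fields into the parent's line.
            lines += build_payload(key, value, tags, timestamp)
        elif isinstance(value, bool):
            continue  # booleans are simply skipped in this sketch
        elif isinstance(value, (int, float)):
            # Integers and floats are both written unquoted.
            fields.append(f"{key}={value}")
        else:
            fields.append(f'{key}="{value}"')
    if fields:
        lines.insert(0, f"{measurement},{tags} {','.join(fields)} {timestamp}")
    return lines


if __name__ == "__main__":
    data = {"cpu": 0.25, "uptime": 100, "memory": {"used": 5, "total": 10}}
    print("\n".join(build_payload("system", data, "host=pve1", 1)))
```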
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
If the socket couldn't be created (e.g. FQDN not resolvable) we
continued without any hint, and then die'd when actually writing the
data. The user then does not really know why, so report errors
if the socket creation failed.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We only allowed servers with the dns-name format. As such status
servers may often be in internal networks and have no hostname
(testing, small network so no dns, ...), do not limit the
configuration possibilities for no reason.
Also move the base property part to the base Status class, all
current plugins use server and port so no need for double
declaration of format/descriptions.
If a future plugin doesn't need them it can omit them by not
returning the respective properties in the options method
inherited from SectionConfig.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>