The hash slice did not work as intented here, it only return the keys
from the last elemend defined in the slice, thus not all workers got
a TERM.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reverts a hunk of 0da5a3e43b which removed checking &
untainting of pids from the PVE_DAEMON_WORKER_PIDS env var.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Else this options is not really useful. First, sending a SIGTERM lets
the children exit, not quite what "leave_children_open_on_reload"
promises.
The problem this causes is that we may get a time window where no
worker is active and thus, for example, our API daemon would not
accept connections during a restart (or better said, reload).
So, don't request termination of any child worker, if this option is
set, but rather just restart (re-exec) ourself, startup a new set of
workers and only then request the termination of the old ones,
allowing a fully seamless reload.
This is only done on `$daemon-exe restart` and thus on
`systemctl reload $daemon`, systemctl restart or any other stop start
cycles always exit all other workers first.
This expects that the worker can do a graceful termination on
SIGTERM, which is already the case for anything using our AnyEvent
based class (which is base of our HTTPServer module).
With graceful termination is meant the following: the worker accepts
no new work and exits immediately after the current queued work is
done.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Otherwise perl tries to bind+listen on a UDP socket if the
TCP socket fails - which is a waste since we're looking for
TCP ports.
Additionall since UDP doesn't support listen(), perl will
return EOPNOTSUPP instead of, say, EADDRINUSE. (We don't
care about the error in this code though.)
While it should be impossible to bind to a wildcard address
when the port is in use by any other address there's one
case where this is allowed, and that's when the port is in
use by an ipv6 address while trying to bind to an ipv4
wildcard.
This currently happens when qemu finds ::1 for the
'localhost' we pass to qemu's spice address while we're
resolving the local nodename via IPv4.
Previously an external exception (eg. caused by a SIGARLM in a code
which is already inside a run_with_timeout() call) could happen in
various places where we did not properly this situation.
For instance after calling $lock_func() but before reaching the cleanup
code. In this case a lock was leaked.
Additionally the code was broken in that it used perl's automatic hash
creation side effect ($a->{x}->{y} implicitly initializing $a->{x} with
an empty hash when it did not exist). The effect was that if our own
time out was triggered after the initial check for an existing file
handle inside $lock_func() happened (extremely rare since perl would have
to be running insanely slow), the cleanup did:
if (my $fh = $lock_handles->{$$}->{$filename}->{fh}) {
This recreated $lock_handles->{$$}->{$filename} as an empty hash.
A subsequent call to lock_file_full() will think a file descriptor
already exists because the check simply used:
if (!$lock_handles->{$$}->{$filename}) {
While this could have been a one-line fix for this one particular case,
we'd still not be taking external timeouts into account causing the
first issue described above.
Add addr_to_ip and get_ip_from_hostname helpers to PVE::Network
The first helper, addr_to_ip, is based on Wolfgangs version of this
[0]
I just moved it from PVE::Tools to PVE::Network, as it seems a more
fitting place.
It uses getnameinfo to extract information from the paddr parameter,
which is sockaddr struct
It gets used in the second helper and in a bug fix series from
Wolfgang [1]
The second helper, get_ip_from_hostname, resolves an hostname to an
IP and checks if it isn't one from the for loopback reserved 127/8
subnet. It will be used in get_remote_nodeip from PVE::CLuster and
for a bugfix in pvecm.
[0]: http://pve.proxmox.com/pipermail/pve-devel/2017-April/026099.html
[1]: http://pve.proxmox.com/pipermail/pve-devel/2017-April/026098.html
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit 87aa00de73)
since the "lang" param has not worked, introduce a "keeplocale"
parameter instead.
the default behaviour is the same (set LC_ALL to 'C'), but we can use
the parameter to keep the locale from the host (eg. for the vncshell)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
(cherry picked from commit 0f3f314ed7)
This is important when we convert the result to JSON.
Else the GUI receives booleans as "0", which evaluate to true
in JS!
NOTE: "0" evaluates to false with perl.
when reserving ports, we use lock_file to lock the
reservation file, but then use file_set_content which
writes a new file and renames it, making the lock invalid
and different processes waiting for the lock get inconsistent
data
instead we use a designated lock file for the lock, so that we don't
lose the lock when writing the reservation file
this should fix the problem that sometimes multiple vms get the
same vnc/spice port
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We shouldn't mix different tool sets on the one hand, and on
the other hand net-tools is an optional package in stretch
and there's no real need for us to depend on it.
When a container stops or hotplug changes are applied we
do a veth_delete() which does not cleanup the firewall
bridges or OVS ports. This is problematic at the next
startup. When creating a network device we usually want to
copy the MTU of the bridge we intend to put it on, however,
with OVS still having the old port lying around the
recreated device gets associated with the bridge before we
read its MTU, potentially reducing it to that of the newly
created device.
This cleanup also gets rid of stale fwbr/fwln devices from
stopped containers.