qemu-server

mirror of https://git.proxmox.com/git/qemu-server synced 2025-08-05 18:36:21 +00:00

Thomas Lamprecht 1c9d54bfd0 migrate: use ssh forwarded UNIX socket tunnel

We cannot guarantee when the SSH forward tunnel really becomes ready. The check with the mtunnel API call did not help with this problem, as it only checked that the SSH connection itself works and that the destination node has quorum; the forwarded tunnel itself was not checked.

The forward tunnel is a different channel in the SSH connection, independent of the SSH `qm mtunnel` channel, so the fact that the latter works does not guarantee that our migration tunnel is up and ready.

When the node(s) were under load, or when we did parallel migrations (migrateall), the migrate command was often started before a tunnel was open and ready to receive data. This led to an immediate abort of the migration and is the main reason why parallel migrations often leave two thirds or more of the VMs on the source node. The issue was tracked down to SSH after debugging the QEMU process; enabling debug logging showed that the tunnel often became available and ready too late, or not at all.

Fixing the TCP forward tunnel is quirky and not straightforward. The only option SSH offers is to use -N (no command), -f (background) and -o "ExitOnForwardFailure=yes"; it would then wait in the foreground until the tunnel is ready and only then background itself. This is not the nicest way for our special use case and our code base. Waiting for the local port to become open and ready (through /proc/net/tcp[6]), as a proof of concept, is not enough: even if the port is in the listening state and should theoretically accept connections, this still failed often because the tunnel was not yet fully ready.

Furthermore, another problem would remain open if we tried to patch the SSH forward method we currently use, one which the approach of this patch solves for free: the method to get an available port (next_migration_port) has a serious race condition which could lead to multiple use of the same port in a parallel migration (I observed this in my many tests; it happens seldom, but when it does it is really bad).

So let's now use UNIX sockets, which ssh supports since version 6.7. The endpoints are UNIX sockets bound to the VMID, thus no port, so no race and also no limit on available ports (we reserved 50 for migration).

The endpoints get created in /run/qemu-server/VMID.migrate, and as KVM/QEMU in current versions is able to use UNIX sockets just as well as TCP, we do not have to change much in the interaction with QEMU. QEMU is started with the migrate_incoming URL at the local destination endpoint and creates the socket file; we then create a listening socket on the source side and connect over SSH to the destination. Now the migration can be started by issuing the migrate QMP command with an updated URI.

This breaks live migration from new to old, but not from old to new, so there is an upgrade path. If a live migration from new to old must be made (for whatever reason), use the unsecure_migration setting (man datacenter.conf) to allow this, although that should only be done in a trusted network.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>		2016-06-03 11:51:46 +02:00
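
The naming scheme described in the commit message above can be illustrated with a minimal sketch. This is not the code from the patch (qemu-server itself is written in Perl); it only shows how a per-VMID socket path removes the need for port allocation. The function names, the `root@` login, the use of the same socket path on both nodes, and the exact ssh argument order are assumptions made for illustration. The `-L local_socket:remote_socket` form and the `ExitOnForwardFailure` option are standard OpenSSH, and `unix:PATH` is QEMU's URI form for UNIX socket migration.

```python
# Hypothetical helpers illustrating the per-VMID UNIX socket scheme.
# None of these names come from the qemu-server code base.

RUN_DIR = "/run/qemu-server"  # directory named in the commit message


def migrate_socket_path(vmid, run_dir=RUN_DIR):
    """Return the migration endpoint for a VM: <run_dir>/<VMID>.migrate.

    The path is derived from the VMID alone, so two parallel migrations
    can never be handed the same endpoint (unlike a shared port pool).
    """
    return f"{run_dir}/{int(vmid)}.migrate"


def migrate_uri(vmid, run_dir=RUN_DIR):
    """Return a QEMU-style migration URI for the UNIX socket endpoint."""
    return "unix:" + migrate_socket_path(vmid, run_dir)


def ssh_forward_command(vmid, target, run_dir=RUN_DIR):
    """Build an ssh command line that forwards the local socket to the target.

    Assumptions: root login, identical socket path on both nodes, and
    `qm mtunnel` as the remote command (the channel the commit mentions).
    """
    sock = migrate_socket_path(vmid, run_dir)
    return [
        "ssh",
        "-o", "ExitOnForwardFailure=yes",  # fail instead of running without a tunnel
        "-L", f"{sock}:{sock}",            # local UNIX socket -> remote UNIX socket
        f"root@{target}",
        "qm", "mtunnel",
    ]


if __name__ == "__main__":
    # "node2" is a placeholder host name.
    print(migrate_uri(100))
    print(" ".join(ssh_forward_command(100, "node2")))
```

Because the endpoint is a pure function of the VMID, there is no shared state to race on, which is the property the commit message relies on when it drops next_migration_port.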
PVE	migrate: use ssh forwarded UNIX socket tunnel	2016-06-03 11:51:46 +02:00
test	Rework snapshot code, has_feature	2016-03-08 11:42:37 +01:00
.gitignore	add qm.bash-completion to .gitignore	2015-09-14 10:37:12 +02:00
changelog.Debian	bump version to 4.0-78	2016-06-03 11:43:48 +02:00
control.in	Switch from netcat-traditional to netcat6	2015-05-12 06:39:08 +02:00
copyright	change license to AGPL3	2011-08-24 10:07:52 +02:00
Makefile	bump version to 4.0-78	2016-06-03 11:43:48 +02:00
modules-load.conf	remove unnecessary init.d, postint, postrm and qmupdate scripts	2015-02-27 16:09:41 +01:00
pcitest.pl	use warnings instead of global -w flag	2013-10-01 13:14:49 +02:00
pve-bridge	fix #909 : pass rate to tap_plug()	2016-03-08 15:52:31 +01:00
pve-bridge-hotplug	pve-bridge-hotplug code deduplication	2015-11-14 10:34:22 +01:00
pve-bridgedown	add pve-bridgedown script	2014-05-08 08:37:04 +02:00
pve-q35.cfg	enable q35 machine support	2014-06-18 06:03:53 +02:00
pve-usb.cfg	imported from svn 'qemu-server/pve2'	2011-08-23 07:47:04 +02:00
qm	convert qmrestore into a PVE::CLI class	2015-10-05 13:10:24 +02:00
qmextract	qmextract: use PVE::Storage;	2016-03-30 10:38:31 +02:00
qmrestore	convert qmrestore into a PVE::CLI class	2015-10-05 13:10:24 +02:00
sparsecp.c	use correct format to print time_t (%zd)	2012-02-13 11:22:03 +01:00
triggers	use noawait triggers for pve-api-updates	2015-06-01 12:35:16 +02:00
utils.c	fix bug in vmtar	2012-10-25 10:01:24 +02:00
vmtar.c	fix bug in vmtar	2012-10-25 10:01:24 +02:00