If we have multiple jobs for the same vmid with the same schedule, then
last_sync, next_sync and vmid will always be the same, so the order
depends on the iteration order of the $jobs hash (which is random;
thanks, Perl). To get a fixed order, also take the jobid into
consideration.
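
A minimal sketch of such a tie-break (the field and hash names here are
assumptions for illustration, not necessarily the actual ones):

    my @sorted_ids = sort {
        my ($ja, $jb) = ($jobs->{$a}, $jobs->{$b});
        $ja->{next_sync} <=> $jb->{next_sync}
            || $ja->{last_sync} <=> $jb->{last_sync}
            || $ja->{vmid} <=> $jb->{vmid}
            || $a cmp $b    # jobid as the final tie-breaker for a stable order
    } keys %$jobs;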
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Reviewed-by: Fabian Ebner <f.ebner@proxmox.com>
When running replication, we don't want to keep replication states for
non-local VMs. Normally this would not be a problem, since on migration
we transfer the states anyway, but when the ha-manager steals a VM, it
cannot do that. In that case, having an old state lying around is
harmful, since the code does not expect the state to be out of sync
with the actual snapshots on disk.
One such problem is the following:
Replicate VM 100 from node A to nodes B and C, and activate HA. When
node A dies, the VM will be relocated to e.g. node B and start
replicating from there. If node B now has an old state lying around for
its sync to node C, it might delete the common base snapshots of B and
C and then be unable to sync again.
Deleting the state for all non-local guests fixes that issue, since
replication then always starts fresh, and a potentially existing old
state cannot be valid anyway, since we just relocated the VM here (from
a dead node).
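
A rough sketch of that cleanup (helper and variable names are made up
for illustration):

    # hypothetical helper returning a hash of vmids of guests local to this node
    my $local = get_local_guest_ids();
    for my $vmid (keys %$state) {
        # keep state only for guests that still run here; a guest stolen by
        # the HA manager must start with a fresh replication state
        delete $state->{$vmid} if !$local->{$vmid};
    }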
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Fabian Ebner <f.ebner@proxmox.com>
So the repeat frequency for a stuck job is now:
t0 -> fails
t1 = t0 + 5m -> repeat
t2 = t1 + 10m = t0 + 15m -> repeat
t3 = t2 + 15m = t0 + 30m -> repeat
t4 = t3 + 30m = t0 + 60m -> repeat
then
t(x) = t(x-1) + 30m -> repeat
So we converge more naturally and stably to the 30m intervals than
before, when t3 would have been t0 + 45m.
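
A small sketch of that schedule (simplified, not the actual
implementation):

    # delay (in minutes) before the next retry, based on the number of
    # consecutive failures so far (assumes $fail_count >= 1)
    sub retry_delay_minutes {
        my ($fail_count) = @_;
        my @delays = (5, 10, 15, 30);
        return 30 if $fail_count > scalar(@delays);
        return $delays[$fail_count - 1];
    }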
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
If pvesr was terminated after finishing the new sync and removing the
old replication snapshots, but before it could write the new state, the
next replication would fail. It would wrongly interpret the actual last
replication snapshot as stale, remove it, and (if no other snapshots
were present) attempt a full sync, which would fail.
Reported in the community forum [0], this was brought to light by the
new pvescheduler before it learned to reload gracefully.
It's not possible to simply preserve a last remaining snapshot in
prepare(), because prepare() is also used for valid removals. Instead,
update last_sync early enough, i.e. before the old replication
snapshots are removed. Stale snapshots will still be removed on the
next run if there are any.
[0]: https://forum.proxmox.com/threads/100154
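
A simplified sketch of the changed ordering (the helper names are
hypothetical, only the ordering matters):

    # write out the new state, with the updated last_sync, *before* removing
    # the old replication snapshots; if we get killed in between, the worst
    # case is a leftover stale snapshot that the next run cleans up
    $state->{last_sync} = $start_time;
    write_replication_state($jobcfg, $state);            # hypothetical helper
    remove_old_replication_snapshots($jobcfg, $state);   # hypothetical helper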
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>