drive mirror: prevent wrongly logging success when completion fails differently

Currently, when completing a drive mirror job, only errors matching
"cannot be completed" are handled. Other errors are ignored and a
misleading message claiming that the job completed successfully is
printed to the log. An instance of this popped up in the community
forum [0].

The QMP command used for completing the job is either
'block-job-complete' or 'block-job-cancel'. The former causes the VM
to switch to the target drive, the latter doesn't, e.g. migration uses
the latter to not switch the source instance over to the target drive.
The 'block-job-cancel' command doesn't even have the same "cannot be
completed" message, but returns immediately.

The timeout for both 'block-job-cancel' and 'block-job-complete' is
set to 10 minutes in the QMPClient module, which should be enough.

[0]: https://forum.proxmox.com/threads/151518/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Fiona Ebner 2024-07-23 14:07:59 +02:00 committed by Thomas Lamprecht
parent f63cc6dbeb
commit 7b4fac1275

@@ -8112,10 +8112,13 @@ sub qemu_drive_mirror_monitor {
 die "invalid completion value: $completion\n";
 }
 eval { mon_cmd($vmid, $op, device => $job_id) };
-if ($@ =~ m/cannot be completed/) {
+my $err = $@;
+if ($err && $err =~ m/cannot be completed/) {
 print "$job_id: block job cannot be completed, trying again.\n";
 $err_complete++;
-}else {
+} elsif ($err) {
+die "$job_id: block job cannot be completed - $err\n";
+} else {
 print "$job_id: Completed successfully.\n";
 $jobs->{$job_id}->{complete} = 1;
 }
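
The three-way error handling in this change can be tried in isolation
with the minimal sketch below. It is not the code from the repository:
fake_mon_cmd() and try_complete() are hypothetical helpers that stand in
for mon_cmd() and the surrounding loop, and the error strings passed in
are made-up examples. The point is only to show that an error which does
not match "cannot be completed" is now fatal instead of falling through
to the success branch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for mon_cmd(): dies with the given message, or
# returns normally when no failure message is passed.
sub fake_mon_cmd {
    my ($failure) = @_;
    die "$failure\n" if defined($failure);
    return;
}

# Classify one completion attempt the way the patched branch does:
# returns 'retry' or 'complete', and dies for any other error.
sub try_complete {
    my ($job_id, $failure) = @_;

    eval { fake_mon_cmd($failure) };
    my $err = $@; # copy immediately, later evals may reset $@

    if ($err && $err =~ m/cannot be completed/) {
        return 'retry';
    } elsif ($err) {
        die "$job_id: block job cannot be completed - $err\n";
    }
    return 'complete';
}

# No error: the job counts as completed.
print try_complete('drive-scsi0', undef), "\n";

# Error matching the pattern: the caller should try again.
print try_complete('drive-scsi0', "job cannot be completed"), "\n";

# Any other error (made-up message): propagated instead of ignored.
eval { try_complete('drive-scsi0', 'got timeout') };
print "fatal: $@";
```

With the pre-patch condition, the third call would have taken the else
branch and reported success, which is the behavior described in the
commit message.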