qemu/block
Vladimir Sementsov-Ogievskiy 46f56631b5 block/nbd: fix reconnect-delay
reconnect-delay has a design flaw: we handle it in the same loop where
we make connection attempts. So reconnect-delay may be exceeded by the
unpredictable duration of a connection attempt.

Let's use a separate timer instead.
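The difference between the two designs can be illustrated with a toy model. This is a minimal sketch, not the code in block/nbd.c: the function names and the single fixed `attempt_duration` are assumptions made for illustration. It assumes every connection attempt fails and that the clock is only consulted between attempts in the in-loop design.

```python
def fail_time_checked_in_loop(reconnect_delay: float, attempt_duration: float) -> float:
    """Toy model of the in-loop design: the deadline is looked at only
    between connection attempts, so a slow attempt delays the check."""
    if attempt_duration <= 0:
        raise ValueError("attempt_duration must be positive")
    now = 0.0
    while now < reconnect_delay:
        # Blocked inside the connection attempt; nothing checks the clock here.
        now += attempt_duration
    return now


def fail_time_with_timer(reconnect_delay: float, attempt_duration: float) -> float:
    """Toy model of the separate-timer design: the timer is armed once and
    fires at reconnect_delay, however long the current attempt takes."""
    if attempt_duration <= 0:
        raise ValueError("attempt_duration must be positive")
    return float(reconnect_delay)


if __name__ == "__main__":
    # Hypothetical numbers: reconnect-delay=15 and a connect() that takes 60 s.
    print(fail_time_checked_in_loop(15, 60))
    print(fail_time_with_timer(15, 60))
```

In this model the in-loop check overshoots the delay by up to one attempt duration, while the timer-based variant is bounded by reconnect-delay itself.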

How to reproduce the bug:

1. Create an image on node1:
   qemu-img create -f qcow2 xx 100M

2. Start NBD server on node1:
   qemu-nbd xx

3. On node2 start qemu-io:

   ./build/qemu-io --image-opts \
   driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=15

4. Type 'read 0 512' in the qemu-io interface to check that the
   connection works.

Be careful: steps 5-7 must be completed quickly, in less than 15
seconds.

5. Kill the NBD server on node1.

6. Run 'read 0 512' in the qemu-io interface again, to make sure that
   the NBD client enters the reconnect loop.

7. On node1 run the following command:

   sudo iptables -A INPUT -p tcp --dport 10809 -j DROP

This makes the connect() call of qemu-io on node2 take a long time.

You'll see that the read command in qemu-io hangs for a long time,
longer than the 15 seconds specified by the reconnect-delay parameter.
That is the bug.

8. Don't forget to drop the iptables rule on node1:

   sudo iptables -D INPUT -p tcp --dport 10809 -j DROP

Important note: Step [5] is necessary to reproduce _this_ bug. If we
skip step [5], the read command (step 6) will still hang for a long
time and this commit doesn't help, because the delay then comes not
from a long connect() to an unreachable host but from a long sendmsg()
to an unreachable host. That should be fixed by enabling and adjusting
keep-alive on the socket, which is a matter for a further patch set.
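For readers unfamiliar with socket keep-alive, here is a minimal sketch of what "enabling and adjusting" it can look like on a plain TCP socket. It is not taken from QEMU or from any follow-up patch: the helper name `enable_keepalive` and the chosen values are assumptions, and the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` options are platform-dependent, so they are applied only where Python exposes them.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 5,
                     interval_s: int = 2, probes: int = 3) -> None:
    """Hypothetical helper: turn on TCP keep-alive and, where the platform
    supports it, shorten the probe timings so a dead peer is noticed sooner."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These options are not available everywhere, hence the hasattr guards.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between successive probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is considered dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```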

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200903190301.367620-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:04:32 -05:00
..
export block/export: Move writable to BlockExportOptions 2020-10-02 15:46:40 +02:00
monitor block/export: Add block-export-del 2020-10-02 15:46:40 +02:00
accounting.c block: add empty account cookie type 2019-10-10 10:56:18 +02:00
aio_task.c block: introduce aio task pool 2019-10-10 10:56:17 +02:00
amend.c block/amend: Check whether the node exists 2020-07-27 12:37:25 +02:00
backup-top.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
backup-top.h block: introduce backup-top filter driver 2019-10-10 10:56:18 +02:00
backup.c backup: Deal with filters 2020-09-07 12:31:31 +02:00
blkdebug.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
blklogwrites.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
blkreplay.c block: Use bdrv_default_perms() 2020-05-18 19:05:25 +02:00
blkverify.c error: Eliminate error_propagate() with Coccinelle, part 2 2020-07-10 15:18:08 +02:00
block-backend.c qemu/atomic.h: rename atomic_ to qatomic_ 2020-09-23 16:07:44 +01:00
block-copy.c block-copy: Use CAF to find sync=top base 2020-09-07 12:31:31 +02:00
block-gen.h scripts: add block-coroutine-wrapper.py 2020-10-05 10:59:06 +01:00
bochs.c block: Use bdrv_default_perms() 2020-05-18 19:05:25 +02:00
cloop.c block: Use bdrv_default_perms() 2020-05-18 19:05:25 +02:00
commit.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
copy-on-read.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
coroutines.h block/io: refactor save/load vmstate 2020-10-05 10:59:42 +01:00
create.c block/create: Do not abort if a block driver is not available 2019-09-13 12:18:37 +02:00
crypto.c block/crypto: disallow write sharing by default 2020-07-21 10:49:02 +02:00
crypto.h block/crypto: implement the encryption key management 2020-07-06 08:49:28 +02:00
curl.c error: Eliminate error_propagate() with Coccinelle, part 1 2020-07-10 15:18:08 +02:00
dirty-bitmap.c block/dirty-bitmap: add bdrv_has_named_bitmaps helper 2020-05-28 13:15:22 -05:00
dmg-bz2.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
dmg-lzfse.c block: adding lzfse decompressing support as a module. 2018-12-14 11:52:40 +01:00
dmg.c block: Use bdrv_default_perms() 2020-05-18 19:05:25 +02:00
dmg.h Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
file-posix.c block/file: switch to use qemu_open/qemu_create for improved errors 2020-09-16 10:33:48 +01:00
file-win32.c block/file: switch to use qemu_open/qemu_create for improved errors 2020-09-16 10:33:48 +01:00
filter-compress.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
gluster.c error: Reduce unnecessary error propagation 2020-07-10 15:18:08 +02:00
io_uring.c io_uring: use io_uring_cq_ready() to check for ready cqes 2020-06-05 09:54:48 +01:00
io.c block/io: refactor save/load vmstate 2020-10-05 10:59:42 +01:00
iscsi-opts.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
iscsi.c qapi: Restrict query-uuid command to machine code 2020-09-29 15:41:35 +02:00
linux-aio.c misc: Replace zero-length arrays with flexible array member (automatic) 2020-03-16 22:07:42 +01:00
meson.build scripts: add block-coroutine-wrapper.py 2020-10-05 10:59:06 +01:00
mirror.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
nbd.c block/nbd: fix reconnect-delay 2020-10-09 15:04:32 -05:00
nfs.c qemu/atomic.h: rename atomic_ to qatomic_ 2020-09-23 16:07:44 +01:00
null.c block/null: Implement bdrv_get_allocated_file_size 2020-09-07 12:31:31 +02:00
nvme.c block/nvme: Replace magic value by SCALE_MS definition 2020-10-05 09:35:52 +01:00
parallels.c error: Avoid error_propagate() after migrate_add_blocker() 2020-07-10 15:18:08 +02:00
parallels.h Clean up includes 2018-02-09 05:05:11 +01:00
qapi-sysemu.c block: Move system emulator QMP commands to block/qapi-sysemu.c 2020-03-06 17:15:38 +01:00
qapi.c migration: introduce icount field for snapshots 2020-10-06 08:34:49 +02:00
qcow2-bitmap.c qcow2: Use macros for the L1, refcount and bitmap table entry sizes 2020-09-15 11:05:12 +02:00
qcow2-cache.c core: replace getpagesize() with qemu_real_host_page_size 2019-10-26 15:38:06 +02:00
qcow2-cluster.c qcow2: Use L1E_SIZE in qcow2_write_l1_entry() 2020-10-02 15:46:40 +02:00
qcow2-refcount.c qcow2: Make qcow2_free_any_clusters() free only one cluster 2020-09-15 11:05:13 +02:00
qcow2-snapshot.c migration: introduce icount field for snapshots 2020-10-06 08:34:49 +02:00
qcow2-threads.c qcow2: add zstd cluster compression 2020-05-13 14:20:31 +02:00
qcow2.c qcow2: Convert qcow2_alloc_cluster_offset() into qcow2_alloc_host_offset() 2020-09-15 11:31:10 +02:00
qcow2.h qcow2: introduce icount field for snapshots 2020-10-06 08:34:49 +02:00
qcow.c block/qcow: remove runtime opts 2020-09-15 11:05:13 +02:00
qed-check.c block/qed: add missed coroutine_fn markers 2019-04-30 15:29:00 +02:00
qed-cluster.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-l2-cache.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-table.c block/qed: add missed coroutine_fn markers 2019-04-30 15:29:00 +02:00
qed.c qapi: Smooth another visitor error checking pattern 2020-07-10 15:18:08 +02:00
qed.h qed: Simplify backing reads 2020-07-06 10:34:14 +02:00
quorum.c block/quorum.c: stable children names 2020-09-15 11:05:12 +02:00
raw-format.c error: Eliminate error_propagate() with Coccinelle, part 2 2020-07-10 15:18:08 +02:00
rbd.c block/rbd: add 'namespace' to qemu_rbd_strong_runtime_opts[] 2020-09-15 11:31:10 +02:00
replication.c error: Reduce unnecessary error propagation 2020-07-10 15:18:08 +02:00
sheepdog.c block/sheepdog: Replace magic val by NANOSECONDS_PER_SECOND definition 2020-10-02 15:46:40 +02:00
snapshot.c block/snapshot: Fix fallback 2020-09-07 12:31:31 +02:00
ssh.c qapi: Smooth another visitor error checking pattern 2020-07-10 15:18:08 +02:00
stream.c stream: Deal with filters 2020-09-07 12:31:31 +02:00
throttle-groups.c qemu/atomic.h: rename atomic_ to qatomic_ 2020-09-23 16:07:44 +01:00
throttle.c qemu/atomic.h: rename atomic_ to qatomic_ 2020-09-23 16:07:44 +01:00
trace-events trace-events: Fix attribution of trace points to source 2020-09-09 17:17:58 +01:00
trace.h trace: switch position of headers to what Meson requires 2020-08-21 06:18:24 -04:00
vdi.c error: Avoid error_propagate() after migrate_add_blocker() 2020-07-10 15:18:08 +02:00
vhdx-endian.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
vhdx-log.c block: Add flags to bdrv(_co)_truncate() 2020-04-30 17:51:07 +02:00
vhdx.c block/vhdx: Support vhdx image only with 512 bytes logical sector size 2020-09-15 11:05:13 +02:00
vhdx.h block/vhdx: Use IEC binary prefixes for size constants 2019-04-30 15:29:00 +02:00
vmdk.c vmdk: Drop vmdk_co_flush() 2020-09-07 12:31:31 +02:00
vpc.c error: Avoid error_propagate() after migrate_add_blocker() 2020-07-10 15:18:08 +02:00
vvfat.c util: rename qemu_open() to qemu_open_old() 2020-09-16 10:33:48 +01:00
win32-aio.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
write-threshold.c qapi: Drop qapi_event_send_FOO()'s Error ** argument 2018-08-28 18:21:38 +02:00