Commit Graph

163 Commits

Author SHA1 Message Date
Steve Sistare
ea263e8fd9 tap: fix net_init_tap() return code
net_init_tap intends to return 0 for success and -1 on error.  However,
when net_init_tap() succeeds for a multi-queue device, it returns 1,
because of this code where ret becomes 1 when g_unix_set_fd_nonblocking
succeeds:

        ret = g_unix_set_fd_nonblocking(fd, true, NULL);
        if (!ret) {
            ... error ...
    free_fail:
        ...
        return ret;

Luckily, the only current call site checks for negative, rather than non-zero:

  net_client_init1()
      if (net_client_init_fun[](...) < 0)

Also, in the unlikely case that g_unix_set_fd_nonblocking fails and returns
false, ret=0 is returned, and net_client_init1 will use a broken interface.

Fix it to be future proof.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2025-07-21 10:21:33 +08:00
Vladimir Sementsov-Ogievskiy
3f9f6299a1 net/tap: drop too small packets
Theoretically tap_read_packet() may return size less than
s->host_vnet_hdr_len, and next, we'll work with negative size
(in case of !s->using_vnet_hdr). Let's avoid it.

Don't proceed with size == s->host_vnet_hdr_len as well in case
of !s->using_vnet_hdr, it doesn't make sense.

Tested-by: Lei Yang <leiyang@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2025-07-21 10:21:28 +08:00
Laurent Vivier
ba5acc5d6e net: Add is_vhost_user flag to vhost_net struct
Introduce a boolean is_vhost_user field to the vhost_net
structure. This flag is initialized during vhost_net_init based
on whether the backend is vhost-user.

This refactoring simplifies checks for vhost-user specific behavior,
replacing direct comparisons of 'net->nc->info->type' with the new
flag. It improves readability and encapsulates the backend type
information directly within the vhost_net instance.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2025-07-14 13:27:09 +08:00
Laurent Vivier
33b78a30a3 net: Allow network backends to advertise max TX queue size
This commit refactors how the maximum transmit queue size for
virtio-net devices is determined, making the mechanism more generic
and extensible.

Previously, virtio_net_max_tx_queue_size() contained hardcoded
checks for specific network backend types (vhost-user and
vhost-vdpa) to determine their supported maximum queue size. This
created direct dependencies and would require modifications for
every new backend that supports variable queue sizes.

To improve flexibility, a new max_tx_queue_size field is added
to the vhost_net structure. This allows each network backend
to advertise its supported maximum transmit queue size directly.

The virtio_net_max_tx_queue_size() function now retrieves the max
TX queue size from the vhost_net struct, if available and set.
Otherwise, it defaults to VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2025-07-14 13:27:09 +08:00
Laurent Vivier
1652f1b335 net: Add save_acked_features callback to vhost_net
This commit introduces a save_acked_features function pointer to
vhost_net and converts the vhost_net function into a generic dispatcher.

The vhost-user backend provides the callback, making its function static.
With this change, no other module has a direct dependency on the
vhost-user implementation.

This cleanup allows for the complete removal of the net/vhost-user.h
header file.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2025-07-14 13:27:09 +08:00
Laurent Vivier
bd38794a11 net: Add get_acked_features callback to VhostNetOptions
This patch continues the effort to decouple the generic vhost layer
from specific network backend implementations.

Previously, the vhost_net initialization code contained a hardcoded
check for the vhost-user client type to retrieve its acked features
by calling vhost_user_get_acked_features(). This exposed an
internal vhost-user function in a public header and coupled the two
modules.

The vhost-user backend is updated to provide a callback, and its
getter function is now static. The call site in vhost_net.c is
simplified to use the new generic helper, removing the type check and
the direct dependency.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2025-07-14 13:27:09 +08:00
Laurent Vivier
effdacbf28 net: Consolidate vhost feature bits into vhost_net structure
Previously, the vhost_net_get_feature_bits() function in
hw/net/vhost_net.c used a large switch statement to determine
the appropriate feature bits based on the NetClientDriver type.

This created unnecessary coupling between the generic vhost layer
and specific network backends (like TAP, vhost-user, and
vhost-vdpa).

This patch moves the definition of vhost feature bits directly into the
vhost_net structure for each relevant network client.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2025-07-14 13:27:09 +08:00
Laurent Vivier
8f6e5c620a net: Add get_vhost_net callback to NetClientInfo
The get_vhost_net() function previously contained a large switch
statement to find the VHostNetState pointer based on the net
client's type. This created a tight coupling, requiring the generic
vhost layer to be aware of every specific backend that supported
vhost, such as tap, vhost-user, and vhost-vdpa.

This approach is not scalable and requires modifying a central function
for any new backend. It also forced each backend to expose its internal
getter function in a public header file.

This patch refactors the logic by introducing a new get_vhost_net
function pointer to the NetClientInfo struct. The central
get_vhost_net() function is now a simple, generic dispatcher that
invokes the callback provided by the net client.

Each backend now implements its own private getter and registers it in
its NetClientInfo.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2025-07-14 13:27:09 +08:00
Philippe Mathieu-Daudé
32cad1ffb8 include: Rename sysemu/ -> system/
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.

Files renamed manually then mechanical change using sed tool.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Clément Léger
9996a35c64 net/tap: Use qemu_close_all_open_fd()
Instead of using a slow implementation to close all open fd after
forking, use qemu_close_all_open_fd().

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240802145423.3232974-6-cleger@rivosinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-08-05 08:33:36 +10:00
Clément Léger
a9b5d6e536 net/tap: Factorize fd closing after forking
The same code is used twice to actually close all open file descriptors
after forking. Factorize it in a single place.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240802145423.3232974-4-cleger@rivosinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-08-05 08:06:08 +10:00
Akihiko Odaki
77db537995 tap: Shrink zeroed virtio-net header
tap prepends a zeroed virtio-net header when writing a packet to a
tap with virtio-net header enabled but not in use. This only happens
when s->host_vnet_hdr_len == sizeof(struct virtio_net_hdr).

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2024-06-04 15:14:26 +08:00
Akihiko Odaki
336a058b26 tap: Call tap_receive_iov() from tap_receive()
This will save duplicate logic found in both of tap_receive_iov() and
tap_receive().

Suggested-by: "Zhang, Chen" <chen.zhang@intel.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2024-06-04 15:14:26 +08:00
Akihiko Odaki
b9ad513e18 net: Remove receive_raw()
While netmap implements virtio-net header, it does not implement
receive_raw(). Instead of implementing receive_raw for netmap, add
virtio-net headers in the common code and use receive_iov()/receive()
instead. This also fixes the buffer size for the virtio-net header.

Fixes: fbbdbddec0 ("tap: allow extended virtio header with hash info")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2024-06-04 15:14:26 +08:00
Akihiko Odaki
a67753710d net: Move virtio-net header length assertion
The virtio-net header length assertion should happen for any clients.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2024-06-04 15:14:26 +08:00
Akihiko Odaki
4b52d63249 tap: Remove qemu_using_vnet_hdr()
Since qemu_set_vnet_hdr_len() is always called when
qemu_using_vnet_hdr() is called, we can merge them and save some code.

For consistency, express that the virtio-net header is not in use by
returning 0 with qemu_get_vnet_hdr_len() instead of having a dedicated
function, qemu_get_using_vnet_hdr().

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2024-06-04 15:14:25 +08:00
Akihiko Odaki
52a7ff5269 tap: Remove tap_probe_vnet_hdr_len()
It was necessary since an Linux older than 2.6.35 may implement the
virtio-net header but may not allow to change its length. Remove it
since such an old Linux is no longer supported.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2024-06-04 15:14:25 +08:00
Akihiko Odaki
d9b33018a0 Revert "tap: setting error appropriately when calling net_init_tap_one()"
This reverts commit 46d4d36d0b.

The reverted commit changed to emit warnings instead of errors when
vhost is requested but vhost initialization fails if vhostforce option
is not set.

However, vhostforce is not meant to ignore vhost errors. It was once
introduced as an option to commit 5430a28fe4 ("vhost: force vhost off
for non-MSI guests") to force enabling vhost for non-MSI guests, which
will have worse performance with vhost. The option was deprecated with
commit 1e7398a140 ("vhost: enable vhost without without MSI-X") and
changed to behave identical with the vhost option for compatibility.

Worse, commit bf769f742c ("virtio: del net client if net_init_tap_one
failed") changed to delete the client when vhost fails even when the
failure only results in a warning. The leads to an assertion failure
for the -netdev command line option.

The reverted commit was intended to avoid that the vhost initialization
failure won't result in a corrupted netdev. This problem should have
been fixed by deleting netdev when the initialization fails instead of
ignoring the failure with an arbitrary option. Fortunately, commit
bf769f742c ("virtio: del net client if net_init_tap_one failed"),
mentioned earlier, implements this behavior.

Restore the correct semantics and fix the assertion failure for the
-netdev command line option by reverting the problematic commit.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2024-03-29 14:59:07 +08:00
Stefan Hajnoczi
0b2675c473 Rename "QEMU global mutex" to "BQL" in comments and docs
The term "QEMU global mutex" is identical to the more widely used Big
QEMU Lock ("BQL"). Update the code comments and documentation to use
"BQL" instead of "QEMU global mutex".

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20240102153529.486531-6-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-01-08 10:45:43 -05:00
Peter Maydell
6d7a53e9f1 net/tap: Avoid variable-length array
Use a heap allocation instead of a variable length array in
tap_receive_iov().

The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions.  This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g.  CVE-2021-3527).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Yuri Benditovich
f03e0cf63b tap: Add check for USO features
Tap indicates support for USO features according to
capabilities of current kernel module.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychecnko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Andrew Melnychenko
2ab0ec3121 tap: Add USO support to tap device.
Passing additional parameters (USOv4 and USOv6 offloads) when
setting TAP offloads

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Akihiko Odaki
481c52320a net: Strip virtio-net header when dumping
filter-dump specifiees Ethernet as PCAP LinkType, which does not expect
virtio-net header. Having virtio-net header in such PCAP file breaks
PCAP unconsumable. Unfortunately currently there is no LinkType for
virtio-net so for now strip virtio-net header to convert the output to
Ethernet.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Nikita Ivanov
37b0b24e93 error handling: Use RETRY_ON_EINTR() macro where applicable
There is a defined RETRY_ON_EINTR() macro in qemu/osdep.h
which handles the same while loop.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/415
Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-3-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[thuth: Dropped the hunk that changed socket_accept() in libqtest.c]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-01-09 13:50:47 +01:00
Nikita Ivanov
8b6aa69365 Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
Rename macro name to more transparent one and refactor
it to expression.

Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-2-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-01-09 13:50:47 +01:00
Markus Armbruster
7480874a69 qapi net: Elide redundant has_FOO in generated C
The has_FOO for pointer-valued FOO are redundant, except for arrays.
They are also a nuisance to work with.  Recent commit "qapi: Start to
elide redundant has_FOO in generated C" provided the means to elide
them step by step.  This is the step for qapi/net.json.

Said commit explains the transformation in more detail.  The invariant
violations mentioned there do not occur here.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221104160712.3005652-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[Fixes for MacOS squashed in]
2022-12-14 20:04:47 +01:00
Laurent Vivier
53b85d9574 net: introduce qemu_set_info_str() function
Embed the setting of info_str in a function.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
lu zhipeng
bf769f742c virtio: del net client if net_init_tap_one failed
If the net tap initializes successful, but failed during
network card hot-plugging, the net-tap will remains,
so cleanup.

Signed-off-by: lu zhipeng <luzhipeng@cestc.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-27 15:14:37 +08:00
Marc-André Lureau
a8208626ba net: replace qemu_set_nonblock()
Those calls are POSIX-specific. Use the dedicated GLib
API. (qemu_set_nonblock() is for socket-like)

(this is a preliminary patch before renaming qemu_set_nonblock())

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-05-03 15:52:47 +04:00
Marc-André Lureau
0f9668e0c1 Remove qemu-common.h include from most units
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 14:31:55 +02:00
Peter Foley
41bcea7b2c net/tap: Set return code on failure
Match the other error handling in this function.

Fixes: e7b347d0bf ("net: detect errors from probing vnet hdr flag for TAP devices")

Reviewed-by: Patrick Venture <venture@google.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Foley <pefoley@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-02-14 11:50:44 +08:00
Jason Wang
6a756d1495 vhost_net: do not assume nvqs is always 2
This patch switches to initialize dev.nvqs from the VhostNetOptions
instead of assuming it was 2. This is useful for implementing control
virtqueue support which will be a single vhost_net structure with a
single cvq.

Note that nvqs is still set to 2 for all users and this patch does not
change functionality.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-6-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 17:34:05 -04:00
Andrew Melnychenko
8f364e344c net: Added SetSteeringEBPF method for NetClientState.
For now, that method supported only by Linux TAP.
Linux TAP uses TUNSETSTEERINGEBPF ioctl.

Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-06-04 15:25:46 +08:00
Jason Wang
bc38e31b4e net: check the existence of peer before trying to pad
There could be case that peer is NULL. This can happen when during
network device hot-add where net device needs to be added first. So
the patch check the existence of peer before trying to do the pad.

Fixes: 969e50b61a ("net: Pad short frames to minimum size before sending from SLiRP/TAP")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 20210423031803.1479-1-jasowang@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-23 11:11:28 +01:00
Jason Wang
f9bb0c1f98 Revert "qapi: net: Add query-netdev command"
Several issues has been reported for query-netdev series. Consider
it's late in the rc, this reverts commit
d32ad10a14.

Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-04-08 17:33:59 +08:00
Jason Wang
56e6f594bf Revert "net: Move NetClientState.info_str to dynamic allocations"
Several issues has been reported for query-netdev info
series. Consider it's late in the rc, this reverts commit
commit 59b5437eb7.

Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-04-08 17:33:59 +08:00
Jason Wang
d89b4f839f Revert "net: Do not fill legacy info_str for backends"
Several issues has been reported for query-netdev info
series. Consider it's late in the rc, this reverts commit
f2e8319d45.

Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-04-08 17:33:59 +08:00
Bin Meng
969e50b61a net: Pad short frames to minimum size before sending from SLiRP/TAP
The minimum Ethernet frame length is 60 bytes. For short frames with
smaller length like ARP packets (only 42 bytes), on a real world NIC
it can choose either padding its length to the minimum required 60
bytes, or sending it out directly to the wire. Such behavior can be
hardcoded or controled by a register bit. Similarly on the receive
path, NICs can choose either dropping such short frames directly or
handing them over to software to handle.

On the other hand, for the network backends like SLiRP/TAP, they
don't expose a way to control the short frame behavior. As of today
they just send/receive data from/to the other end connected to them,
which means any sized packet is acceptable. So they can send and
receive short frames without any problem. It is observed that ARP
packets sent from SLiRP/TAP are 42 bytes, and SLiRP/TAP just send
these ARP packets to the other end which might be a NIC model that
does not allow short frames to pass through.

To provide better compatibility, for packets sent from QEMU network
backends like SLiRP/TAP, we change to pad short frames before sending
it out to the other end, if the other end does not forbid it via the
nc->do_not_pad flag. This ensures a backend as an Ethernet sender
does not violate the spec. But with this change, the behavior of
dropping short frames from SLiRP/TAP interfaces in the NIC model
cannot be emulated because it always receives a packet that is spec
complaint. The capability of sending short frames from NIC models is
still supported and short frames can still pass through SLiRP/TAP.

This commit should be able to fix the issue as reported with some
NIC models before, that ARP requests get dropped, preventing the
guest from becoming visible on the network. It was workarounded in
these NIC models on the receive path, that when a short frame is
received, it is padded up to 60 bytes.

The following 2 commits seem to be the one to workaround this issue
in e1000 and vmxenet3 before, and should probably be reverted.

  commit 78aeb23ede ("e1000: Pad short frames to minimum size (60 bytes)")
  commit 40a87c6c9b ("vmxnet3: Pad short frames to minimum size (60 bytes)")

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Alexey Kirillov
f2e8319d45 net: Do not fill legacy info_str for backends
As we use QAPI NetClientState->stored_config to store and get information
about backend network devices, we can drop fill of legacy field info_str
for them.

We still use info_str field for NIC and hubports, so we can not completely
remove it.

Signed-off-by: Alexey Kirillov <lekiravi@yandex-team.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-15 16:41:22 +08:00
Alexey Kirillov
59b5437eb7 net: Move NetClientState.info_str to dynamic allocations
The info_str field of the NetClientState structure is static and has a size
of 256 bytes. This amount is often unclaimed, and the field itself is used
exclusively for HMP "info network".

The patch translates info_str to dynamic memory allocation.

This action is also allows us to painlessly discard usage of this field
for backend devices.

Signed-off-by: Alexey Kirillov <lekiravi@yandex-team.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-15 16:41:22 +08:00
Alexey Kirillov
d32ad10a14 qapi: net: Add query-netdev command
The query-netdev command is used to get the configuration of the current
network device backends (netdevs).
This is the QMP analog of the HMP command "info network" but only for
netdevs (i.e. excluding NIC and hubports).

The query-netdev command returns an array of objects of the NetdevInfo
type, which are an extension of Netdev type. It means that response can
be used for netdev-add after small modification. This can be useful for
recreate the same netdev configuration.

Information about the network device is filled in when it is created or
modified and is available through the NetClientState->stored_config.

Signed-off-by: Alexey Kirillov <lekiravi@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-15 16:41:22 +08:00
Keqian Zhu
9925990d01 net: Use correct default-path macro for downscript
Fixes: 63c4db4c2e (net: relocate paths to helpers and scripts)
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-11-24 10:40:17 +08:00
yuanjungong
f012bec890 tap: fix a memory leak
Close fd before returning.

Buglink: https://bugs.launchpad.net/qemu/+bug/1904486

Signed-off-by: yuanjungong <ruc_gongyuanjun@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-11-24 10:40:17 +08:00
Kevin Wolf
947e47448d monitor: Use getter/setter functions for cur_mon
cur_mon really needs to be coroutine-local as soon as we move monitor
command handlers to coroutines and let them yield. As a first step, just
remove all direct accesses to cur_mon so that we can implement this in
the getter function later.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-4-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:19 +02:00
Paolo Bonzini
63c4db4c2e net: relocate paths to helpers and scripts
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-30 19:11:36 +02:00
Daniel P. Berrange
e7b347d0bf net: detect errors from probing vnet hdr flag for TAP devices
When QEMU sets up a tap based network device backend, it mostly ignores errors
reported from various ioctl() calls it makes, assuming the TAP file descriptor
is valid. This assumption can easily be violated when the user is passing in a
pre-opened file descriptor. At best, the ioctls may fail with a -EBADF, but if
the user passes in a bogus FD number that happens to clash with a FD number that
QEMU has opened internally for another reason, a wide variety of errnos may
result, as the TUNGETIFF ioctl number may map to a completely different command
on a different type of file.

By ignoring all these errors, QEMU sets up a zombie network backend that will
never pass any data. Even worse, when QEMU shuts down, or that network backend
is hot-removed, it will close this bogus file descriptor, which could belong to
another QEMU device backend.

There's no obvious guaranteed reliable way to detect that a FD genuinely is a
TAP device, as opposed to a UNIX socket, or pipe, or something else. Checking
the errno from probing vnet hdr flag though, does catch the big common cases.
ie calling TUNGETIFF will return EBADF for an invalid FD, and ENOTTY when FD is
a UNIX socket, or pipe which catches accidental collisions with FDs used for
stdio, or monitor socket.

Previously the example below where bogus fd 9 collides with the FD used for the
chardev saw:

$ ./x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hostnet0,fd=9 \
  -chardev socket,id=charchannel0,path=/tmp/qga,server,nowait \
  -monitor stdio -vnc :0
qemu-system-x86_64: -netdev tap,id=hostnet0,fd=9: TUNGETIFF ioctl() failed: Inappropriate ioctl for device
TUNSETOFFLOAD ioctl() failed: Bad address
QEMU 2.9.1 monitor - type 'help' for more information
(qemu) Warning: netdev hostnet0 has no peer

which gives a running QEMU with a zombie network backend.

With this change applied we get an error message and QEMU immediately exits
before carrying on and making a bigger disaster:

$ ./x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hostnet0,fd=9 \
  -chardev socket,id=charchannel0,path=/tmp/qga,server,nowait \
  -monitor stdio -vnc :0
qemu-system-x86_64: -netdev tap,id=hostnet0,vhost=on,fd=9: Unable to query TUNGETIFF on FD 9: Inappropriate ioctl for device

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20171027085548.3472-1-berrange@redhat.com
[lv: to simplify, don't check on EINVAL with TUNGETIFF as it exists since v2.6.27]
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-07-15 21:00:13 +08:00
Laurent Vivier
894022e616 net: check if the file descriptor is valid before using it
qemu_set_nonblock() checks that the file descriptor can be used and, if
not, crashes QEMU. An assert() is used for that. The use of assert() is
used to detect programming error and the coredump will allow to debug
the problem.

But in the case of the tap device, this assert() can be triggered by
a misconfiguration by the user. At startup, it's not a real problem, but it
can also happen during the hot-plug of a new device, and here it's a
problem because we can crash a perfectly healthy system.

For instance:
 # ip link add link virbr0 name macvtap0 type macvtap mode bridge
 # ip link set macvtap0 up
 # TAP=/dev/tap$(ip -o link show macvtap0 | cut -d: -f1)
 # qemu-system-x86_64 -machine q35 -device pcie-root-port,id=pcie-root-port-0 -monitor stdio 9<> $TAP
 (qemu) netdev_add type=tap,id=hostnet0,vhost=on,fd=9
 (qemu) device_add driver=virtio-net-pci,netdev=hostnet0,id=net0,bus=pcie-root-port-0
 (qemu) device_del net0
 (qemu) netdev_del hostnet0
 (qemu) netdev_add type=tap,id=hostnet1,vhost=on,fd=9
 qemu-system-x86_64: .../util/oslib-posix.c:247: qemu_set_nonblock: Assertion `f != -1' failed.
 Aborted (core dumped)

To avoid that, add a function, qemu_try_set_nonblock(), that allows to report the
problem without crashing.

In the same way, we also update the function for vhostfd in net_init_tap_one() and
for fd in net_init_socket() (both descriptors are provided by the user and can
be wrong).

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-07-15 21:00:13 +08:00
Markus Armbruster
af175e85f9 error: Eliminate error_propagate() with Coccinelle, part 2
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away.  The previous commit did that with a Coccinelle script I
consider fairly trustworthy.  This commit uses the same script with
the matching of return taken out, i.e. we convert

    if (!foo(..., &err)) {
        ...
        error_propagate(errp, err);
        ...
    }

to

    if (!foo(..., errp)) {
        ...
        ...
    }

This is unsound: @err could still be read between afterwards.  I don't
know how to express "no read of @err without an intervening write" in
Coccinelle.  Instead, I manually double-checked for uses of @err.

Suboptimal line breaks tweaked manually.  qdev_realize() simplified
further to placate scripts/checkpatch.pl.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-36-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Markus Armbruster
668f62ec62 error: Eliminate error_propagate() with Coccinelle, part 1
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away.  Convert

    if (!foo(..., &err)) {
        ...
        error_propagate(errp, err);
        ...
        return ...
    }

to

    if (!foo(..., errp)) {
        ...
        ...
        return ...
    }

where nothing else needs @err.  Coccinelle script:

    @rule1 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
         if (
    (
    -        fun(args, &err, args2)
    +        fun(args, errp, args2)
    |
    -        !fun(args, &err, args2)
    +        !fun(args, errp, args2)
    |
    -        fun(args, &err, args2) op c1
    +        fun(args, errp, args2) op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    )
         }

    @rule2 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    expression var;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
    -    var = fun(args, &err, args2);
    +    var = fun(args, errp, args2);
         ... when != err
         if (
    (
             var
    |
             !var
    |
             var op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    |
             return var;
    )
         }

    @depends on rule1 || rule2@
    identifier err;
    @@
    -    Error *err = NULL;
         ... when != err

Not exactly elegant, I'm afraid.

The "when != lbl:" is necessary to avoid transforming

         if (fun(args, &err)) {
             goto out
         }
         ...
     out:
         error_propagate(errp, err);

even though other paths to label out still need the error_propagate().
For an actual example, see sclp_realize().

Without the "when strict", Coccinelle transforms vfio_msix_setup(),
incorrectly.  I don't know what exactly "when strict" does, only that
it helps here.

The match of return is narrower than what I want, but I can't figure
out how to express "return where the operand doesn't use @err".  For
an example where it's too narrow, see vfio_intx_enable().

Silently fails to convert hw/arm/armsse.c, because Coccinelle gets
confused by ARMSSE being used both as typedef and function-like macro
there.  Converted manually.

Line breaks tidied up manually.  One nested declaration of @local_err
deleted manually.  Preexisting unwanted blank line dropped in
hw/riscv/sifive_e.c.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-35-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Yuri Benditovich
fbbdbddec0 tap: allow extended virtio header with hash info
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-06-18 21:05:50 +08:00