rhbz#1004443
The methods that trigger waiting on the client pipe require either that
the wait succeed in order to continue, or otherwise that all the live
pipe items be released (e.g., when we must destroy a surface, all of its
related pipe items must be released first). Shutting down the socket
will eventually trigger red_channel_client_disconnect (*), which will
empty the pipe. However, if the blocking method fails, we need to empty
the pipe synchronously.
It is not safe(**) to call red_channel_client_disconnect from ChannelCbs,
but all the blocking calls in red_worker are done from callbacks that
are triggered from the device.
To summarize: calling red_channel_client_disconnect instead of
red_channel_client_shutdown will immediately release all the pipe items
held by the channel client (by calling red_channel_client_pipe_clear).
If red_clear_surface_drawables_from_pipe times out,
red_channel_client_disconnect will make sure that the surface we wish to
release is not referenced by any pipe item.
(*) After a shutdown of a socket, we expect that later, when
red_peer_handle_incoming is called, it will encounter a socket
error and will call the channel's on_error callback which calls
red_channel_client_disconnect.
(**) I believe it was not safe before commit 2d2121a170 (before ref
counting was added to ChannelClient). However, I think it might still be
unsafe, because red_channel_client_disconnect sets rcc->stream to NULL,
and rcc->stream may later be dereferenced unsafely inside a
red_channel_client method. So instead of checking if (stream != NULL)
after calling callbacks, we try to avoid calling
red_channel_client_disconnect from callbacks.
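For illustration, the pattern described above amounts to something like
the following sketch. The wrapper name and the boolean return convention
of red_clear_surface_drawables_from_pipe are assumptions for this
example; only the two functions named in the text are taken from it.

    /* Minimal sketch, assuming red_clear_surface_drawables_from_pipe
     * reports whether the wait succeeded; names and signatures are
     * illustrative, not the actual red_worker code. */
    static void display_channel_destroy_surface_wait(DisplayChannelClient *dcc,
                                                     uint32_t surface_id)
    {
        if (!red_clear_surface_drawables_from_pipe(dcc, surface_id)) {
            /* The wait timed out. We are running from a device-triggered
             * callback, so we cannot rely on the socket-error path to reach
             * red_channel_client_disconnect later; clear the pipe now so the
             * surface is no longer referenced by any pipe item. */
            red_channel_client_disconnect(&dcc->common.base);
        }
    }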
(1) Merge 'force' and 'wait_for_outgoing_item' into one parameter;
'wait_for_outgoing_item' is a derivative of 'force'.
(2) Move the call to red_wait_outgoing_item into
red_clear_surface_drawables_from_pipe.
Three blocking functions are affected: one was split so that the
display-channel-specific referencing of the DrawablePipeItem being sent
stays inside red_worker, but the rest (most) of the timeout logic was
moved to red_channel, including the associated constants.
Moved functions:
red_channel_client_wait_pipe_item_sent
red_wait_outgoing_item
red_wait_all_sent
Introduces red_time.h & red_time.c for a small helper function dealing
with time.h
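The shared timeout logic reduces to a polling loop against a
monotonic-clock deadline, built on a small time helper of the kind
red_time.h introduces. A self-contained sketch of that shape follows;
the constant, callback, and function names are illustrative, not the
actual red_channel/red_time ones.

    #include <stdbool.h>
    #include <stdint.h>
    #include <time.h>
    #include <unistd.h>

    #define BLOCKED_SLEEP_USEC 10000               /* 10ms between polls */

    static uint64_t monotonic_ns(void)
    {
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
    }

    /* Pump the channel until 'done(opaque)' reports success or the deadline
     * passes. Returns false on timeout, mirroring the blocking helpers that
     * were moved into red_channel. */
    static bool wait_with_timeout(bool (*done)(void *), void (*pump)(void *),
                                  void *opaque, uint64_t timeout_ns)
    {
        uint64_t end = monotonic_ns() + timeout_ns;

        while (!done(opaque)) {
            if (monotonic_ns() >= end) {
                return false;                      /* caller cleans up synchronously */
            }
            usleep(BLOCKED_SLEEP_USEC);
            pump(opaque);                          /* e.g. receive/send/push on the channel */
        }
        return true;
    }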
Setting the DRAW_ALL define doesn't produce correct rendering. Using
update_area instead of red_draw_qxl_drawable works, but it shouldn't be
required. This is not work I intend to do right now, so I am noting it
here for anyone looking at this in the future.
150 seconds is far too long a period to hold the guest driver while
waiting for a response from the client. This timeout was 15 seconds, but
when off-screen surfaces were introduced it was arbitrarily multiplied
by 10.
Other existing related bugs emphasize why it is important to decrease
the timeout:
(1) 994211 - the qxl driver waits for an async-io response for 60 seconds
and after that switches to sync-io mode. Not only might the driver use
invalid data (since it didn't wait for the query to complete); falling
back to sync-io mode also introduces other errors.
(2) 994175 - spice server sometimes doesn't recognize that the client
has disconnected.
(3) There might be a cache inconsistency between the client and the server,
in which case the display channel waits indefinitely for a cache item
(e.g., bug 977998).
This patch changes the timeout to 30 seconds. I tested it over wifi,
emulating a 2.5 Mbps network, while playing video on the guest and
changing resolutions in a loop. The timeout didn't expire during my
tests.
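Illustratively, the change amounts to lowering a wait-deadline constant
along these lines (the macro name is an assumption):

    /* Deadline for the display channel's blocking waits on the client,
     * in nanoseconds: 30 seconds instead of the previous 150 (15 * 10). */
    #define DISPLAY_CLIENT_TIMEOUT (30LL * 1000 * 1000 * 1000)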
This bug is related to rhbz#964136 (although from the rhbz#964136 info it
is still not clear why the client wasn't responsive).
Specifically, the loop in red_pipes_add_drawable can cause spice to abort.
In red_worker.c (WORKER_FOREACH_DCC):
red_pipes_add_drawable
red_pipe_add_drawable
red_handle_drawable_surfaces_client_synced
red_push_surface_image
red_channel_client_push
red_channel_client_send
red_peer_handle_outgoing
reds_stream_writev (if fails -- EPIPE)
handler->cb->on_error = red_channel_client_disconnect()
red_channel_remove_client()
ring_remove() -- of rcc from channel.clients ring.
RCC_FOREACH may be dangerous.
The following patches replace FOREACH loops with a SAFE version.
Using unsafe loops may cause spice-server to abort (an assert fails).
Specifically, a read/write failure inside those loops may cause the
client to disconnect, removing the node currently being iterated, which
causes spice to abort in ring_next():
-- assertion `pos->next != NULL && pos->prev != NULL' failed
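The SAFE variant simply saves the next link before the loop body runs,
so the body may unlink the current node (as a disconnect path does)
without breaking the traversal. A generic, self-contained illustration
of the idea (these are not the spice ring.h definitions):

    typedef struct RingItem {
        struct RingItem *prev;
        struct RingItem *next;
    } RingItem;

    /* Unsafe: 'var' is dereferenced to advance AFTER the body, so the body
     * must not remove 'var' from the ring. */
    #define FOREACH(var, head) \
        for ((var) = (head)->next; (var) != (head); (var) = (var)->next)

    /* Safe: the successor is saved in 'nxt' before the body runs, so the
     * body may unlink 'var' without breaking the traversal. 'head' is a
     * sentinel node of a circular doubly-linked ring. */
    #define FOREACH_SAFE(var, nxt, head)                                   \
        for ((var) = (head)->next, (nxt) = (var)->next;                    \
             (var) != (head);                                              \
             (var) = (nxt), (nxt) = (var)->next)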
The image descriptor flags shouldn't be copied as-is from the flags that
were set by the driver. Specifically, the CACHE_ME flag shouldn't be
copied, since it is possible that (a) the image won't be cached, or (b)
the image is already cached, but in its lossy version, and we may want
to set the CACHE_REPLACE_ME bit instead, in order to cache it in its
lossless version.
In case (b), the client first looks for the CACHE_ME flag, and only if it
is not set does it look for CACHE_REPLACE_ME (see canvas_base.c). Since
both flags were set, the client ignored REPLACE_ME and didn't turn off
the lossy flag of the cache item. Then, when a request for this lossless
item reached the client (FROM_CACHE_LOSSLESS), the client display
channel waited endlessly for the lossless version of the image.
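A sketch of the corrected flag handling; the function and variable names
around the flags are hypothetical, while SPICE_IMAGE_FLAGS_CACHE_ME and
SPICE_IMAGE_FLAGS_CACHE_REPLACE_ME are the protocol cache bits discussed
above:

    /* Decide the cache bits here instead of copying the driver's flags
     * verbatim into the descriptor the client sees. */
    static void fill_image_descriptor_flags(SpiceImageDescriptor *descriptor,
                                            uint32_t drawable_flags,
                                            int cache_this_image,
                                            int cached_lossy_version_exists)
    {
        descriptor->flags = drawable_flags &
            ~(SPICE_IMAGE_FLAGS_CACHE_ME | SPICE_IMAGE_FLAGS_CACHE_REPLACE_ME);

        if (!cache_this_image) {
            return;
        }
        if (cached_lossy_version_exists) {
            /* Only REPLACE_ME: setting CACHE_ME as well makes the client
             * ignore the replacement and keep its lossy copy. */
            descriptor->flags |= SPICE_IMAGE_FLAGS_CACHE_REPLACE_ME;
        } else {
            descriptor->flags |= SPICE_IMAGE_FLAGS_CACHE_ME;
        }
    }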
When setting an initial video stream bit rate, if the bit rate wasn't
calculated by main_channel_client, and we don't have an estimation from
previous streams, use some default values.
The patch also removes updating dcc->streams_max_bit_rate when the bit
rate held by the main_channel is larger than it. This is not necessary,
since we compare those two values each time we set the initial bit rate
for a stream.
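A sketch of the fallback chain described above; the constant and the
function name are hypothetical, and the two inputs stand for the
main-channel measurement and the estimate from previous streams:

    #include <stdint.h>

    #define DEFAULT_INITIAL_BIT_RATE (10 * 1024 * 1024)   /* illustrative default */

    static uint64_t initial_stream_bit_rate(uint64_t main_channel_bit_rate,
                                            uint64_t streams_max_bit_rate)
    {
        if (main_channel_bit_rate == 0 && streams_max_bit_rate == 0) {
            return DEFAULT_INITIAL_BIT_RATE;   /* no measurement, no history */
        }
        /* Compare the two known values each time instead of persisting the
         * main-channel rate into streams_max_bit_rate. */
        return main_channel_bit_rate > streams_max_bit_rate ?
               main_channel_bit_rate : streams_max_bit_rate;
    }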
rhbz#956345
After a spice session has been migrated, we don't retest the network (for
user experience considerations). Instead, we obtain the is_low_bandwidth
flag from the src-server via the migration data.
Before this patch, if we migrated from server s1 to s2 and then to s3,
and the connection to s1 was a low-bandwidth one, we erroneously passed
is_low_bandwidth=FALSE from s2 to s3.
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Replace the mixed calls to display_channel_client_is_low_bandwidth and
main_channel_client_is_low_bandwidth with one flag in CommonChannelClient
that is set upon channel creation.
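A sketch of the consolidation; the structure layout and variable names
are simplified and hypothetical, while main_channel_client_is_low_bandwidth
is the existing helper named above:

    /* At channel-client creation time, record the bandwidth class once... */
    common_cc->is_low_bandwidth = main_channel_client_is_low_bandwidth(mcc);

    /* ...so later code reads the cached flag instead of mixing calls to
     * display_channel_client_is_low_bandwidth and
     * main_channel_client_is_low_bandwidth. */
    if (common_cc->is_low_bandwidth) {
        /* e.g. prefer cheaper image compression, enable video streaming */
    }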
red_create_stream is called even without any client, but no encoding
takes place, since the mjpeg encoder is now associated with the
StreamAgent, which is only created when we have a client.
With a SPICE_DISPLAY_CAP_MONITORS_CONFIG capable client, the client needs to
know what part of the primary to use for each monitor. If the guest driver
does not support this, the server sends messages to the client for a
single monitor spanning the entire primary.
As soon as the guest calls spice_qxl_monitors_config_async once, we set
the red_worker driver_has_monitors_config flag and stop doing this.
This is a problem when the driver gets unloaded, for example after a reboot
or when switching to a text vc with usermode mode-setting under Linux.
To reproduce this, start a multi-monitor capable Linux guest which uses
usermode mode-setting and then, once X has started, switch to a text vc.
Note how the client window not only fails to resize; if you try to
resize it manually you always keep black borders, since the aspect ratio
is wrong.
This patch is the spice-server side of fixing this; it adds a new
spice_qxl_driver_unload method which clears the driver_has_monitors_config
flag.
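Sketch only of the new entry point; the real implementation dispatches
to the worker thread, and the helper names below are hypothetical:

    SPICE_GNUC_VISIBLE void spice_qxl_driver_unload(QXLInstance *instance)
    {
        RedWorker *worker = worker_from_qxl_instance(instance); /* hypothetical lookup */

        /* The guest driver is gone: forget that it ever sent a monitors
         * config, so the server reverts to single-monitor messages spanning
         * the whole primary until spice_qxl_monitors_config_async is called
         * again. */
        worker->driver_has_monitors_config = 0;
        push_monitors_config(worker);                            /* hypothetical helper */
    }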
The other patch needed to fix this is in qemu, and will call this new
method from qxl_enter_vga_mode.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
mjpeg_encoder modifies the initial bit rate we supply it, according to
the client feedback. If it reaches a bit rate higher than the initial
one, we use the higher bit rate as the new bit rate estimation.
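Illustratively (the encoder accessor and field names are assumptions),
the estimation update described here is:

    /* If the encoder's adaptive rate climbed above our estimate, remember
     * the higher value as the starting estimate for future streams. */
    uint64_t reached = mjpeg_encoder_get_bit_rate(agent->mjpeg_encoder);

    if (reached > dcc->streams_max_bit_rate) {
        dcc->streams_max_bit_rate = reached;
    }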
A frame can be dropped if a new frame was added during the same
call to red_process_command (we didn't attempt to send the older
frame). Such drops are ignored.