With a SPICE_DISPLAY_CAP_MONITORS_CONFIG capable client, the client needs to
know what part of the primary to use for each monitor. If the guest driver
does not support this, the server sends messages to the client for a
single monitor spanning the entire primary.
As soon as the guest calls spice_qxl_monitors_config_async once, we set
the red_worker driver_has_monitors_config flag and stop doing this.
This is a problem when the driver gets unloaded, for example after a reboot
or when switching to a text vc with usermode mode-setting under Linux.
To reproduce this start a multi-mon capable Linux guest which uses
usermode mode-setting and then once X has started switch to a text vc. Note
how the client window does not only not resize, if you try to resize it
manually you always keep blackborders since the aspect is wrong.
This patch is the spice-server side of fixing this, it adds a new
spice_qxl_driver_unload method which clears the driver_has_monitors_config
flag.
The other patch needed to fix this is in qemu, and will calls this new method
from qxl_enter_vga_mode.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
mjpeg_encoder modify the initial bit we supply it, according to the
client feedback. If it reaches a bit rate which is higher than the
initial one, we use the higher bit rate as the new bit rate estimation.
A frame can be dropped if a new frame was added during the same
call to red_process_command (we didn't attempt to send the older
frame). Such drops are ignored.
This patch only employs setting the stream parameters based on
the initial given bit-rate, the latency, and the encoding size.
Later patches will also employ mjpeg_encoder response to client reports,
and its control over frame drops.
The patch also removes old stream bit rate calculations that weren't
used.
mjpeg_encoder can receive periodic reports about the playback status on
the client side. Then, mjpeg_encoder analyses the report and can
increase or decrease the stream bit rate, depending on the report.
When the bit rate is changed, the quality and frame rate of the stream
are re-evaluated.
Previously, the mjpeg quality was always 70. The frame rate was
tuned according to the frames' congestion in the pipe.
This patch sets the quality and frame rate according to
a given bit rate and the size of the first encoded frames.
The following patches will introduce an adaptive video streaming, in which
the bit rate, the quality, and the frame rate, change in response to
different parameters.
Patches that make red_worker adopt this feature will also follow.
The mjpeg_encoder should be client specific, and not shared between
different clients**, for the following reasons:
(1) Since we use abbreviated jpeg datastream for mjpeg, employing the same
mjpeg_encoder for different clients might cause errors when the
clients decode the jpeg data.
(2) The next patch introduces bit rate control to the mjpeg_encoder.
This feature depends on the bandwidth available, which is client
specific.
** at least till we change multi-clients not to re-encode the same
streams.
When qemu migration completes, we need to stop the streams, and to send
the corresponding upgrade_items to the client.
Otherwise, (1) the client might display lossy regions that we don't track
(streams are not part of the migration data).
(2) streams_timeout may occur after MSG_MIGRATE has been sent, leading
to messages being sent to the client after MSG_MIGRATE and before
MSG_MIGRATE_DATA (e.g., STREAM_CLIP, STREAM_DESTROY, DRAW_COPY).
No message besides MSG_MIGRATE_DATA should be sent after
MSG_MIGRATE.
When a msg other than MIGRATE_DATA reached spice-gtk after MSG_MIGRATE,
spice-gtk sent it to dest server as the migration data, and the dest
server crashed with a "bad message size" assert.
1) This does not buy us much, as red_marshall_monitors_config() also
removes 0x0 sized monitors and does a much better job at it
(also removing intermediate ones, not only tailing ones)
2) The code is wrong, as it allocs space for real_count heads, where
real_count always <= monitors_config->count and then stores
monitors_config->count in worker->monitors_config->count, causing
red_marshall_monitors_config to potentially walk
worker->monitors_config->heads past its boundaries.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
During my dynamic monitor support testing today, I hit the following assert
in red_worker.c:
"red_push_monitors_config: condition `monitors_config != NULL' failed"
This is caused by the following scenario:
1) Guest causes handle_dev_monitors_config_async() to be called
2) handle_dev_monitors_config_async() calls worker_update_monitors_config()
3) handle_dev_monitors_config_async() pushes worker->monitors_config, this
takes a ref on the current monitors_config
4) Guest causes handle_dev_monitors_config_async() to be called *again*
5) handle_dev_monitors_config_async() calls worker_update_monitors_config()
6) worker_update_monitors_config() does a decref on worker->monitors_config,
releasing the workers reference, this monitor_config from step 2 is
not yet free-ed though as the pipe-item still holds a ref
7) worker_update_monitors_config() creates a new monitors_config with an
initial ref-count of 1 and stores that in worker->monitors_config
8) The pipe-item of the *first* monitors_config is send, upon completion
a decref is done on the monitors_config, and monitors_config_decref not
only frees the monitor_config, but *also* sets worker->monitors_config
to NULL, even though worker->monitors_config no longer refers to the
monitor_config being freed, it refers to the 2nd monitor_config!
9) The client which was connected when this all happened disconnects
10) A new client connects, leading to the assert:
at red_worker.c:9519
num_common_caps=1, common_caps=0x5555569b6f60, migrate=0,
stream=<optimized out>, client=<optimized out>, worker=<optimized out>)
at red_worker.c:10423
at red_worker.c:11301
Note that red_worker.c:9519 is:
red_push_monitors_config(dcc);
gdb does not point to the actual line of the assert because the function gets
inlined.
The fix is easy and obvious, don't set worker->monitors_config to NULL in
monitors_config_decref. I'm a bit baffled as to why that code is there in
the first place, the whole point of ref-counting is to not have one single
unique place to store the reference...
This fix should not have any adverse side-effects as the 4 callers of
monitors_config_decref fall into 2 categories:
1) Code which immediately after the decref replaces worker->monitors_config
with a new monitors_config:
worker_update_monitors_config()
set_monitors_config_to_primary()
2) pipe-item freeing code, which should not touch the worker state at all
to being with
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The stream vis_region should be cleared after the stream region was sent
to the client losslessly. Otherwise, we might send redundant stream upgrades
if we process more drawables that are dependent on the stream region.
resolves: rhbz#891326
Starting from commit 81fe00b08a, red_detach_streams_behind can
trigger modifications in the current tree (by update_area calls). Thus,
after calling red_detach_streams_behind it is not safe to access tree
entries that were calculated before the call.
This patch inserts the drawable to the tree before the call to
red_detach_streams_behind. This change also requires making sure
that rendering operations that can be triggered by
red_detach_streams_behind will not include this drawable (which is now part of the tree).
Reported-by: Michal Luscon <mluscon@redhat.com>
Found by a Coverity scan:
in handle_dev_start -
Checking "worker->display_channel" implies that "worker->display_channel"
might be NULL.
Passing "worker" to function "guest_set_client_capabilities"
in guest_set_client_capabilities -
Directly dereferencing parameter "worker->display_channel"
red_proccess_commands calls were added after calling
guest_set_client_capabilities in order to cleanup the command ring from
old commands that the client might not be able to handle.
However, calling red_process_commands at this stage does send messages
to the client.
In addition, since setting the client capabilities at the guest is not
synchronized, emptying the command ring is not enough in order to make
sure the following commands will be supported by the client.
The call to red_proccess_commands before initializing the display
streams (the call to red_display_start_streams), caused inconsistencies
related to video streaming upon reconnecting (rhbz#883564).
I'm reverting this patch till another solution for the capabilities
mismatch is introduced.
Resolves: rhbz#883564
Internal images are just read from the surface, compressed, and sent to the client.
Then, they are destroyed. I can't find any reason for aligning their memory.
rhbz#876685
The current lz implementation does not support such bitmaps.
The following patch will actually prevent allocating stride > bpp*width
for internal images.
Previously, there was no check for the size of the message received from
the client, and all messages were read into a buffer of size 1024.
However, migration data can be bigger than 1024. In such cases, memory
corruption occurred.
red_wait_outgoing_item only waits till the currently outgoing msg is
completely sent.
red_wait_outgoing_items does the same for multi-clients. handle_dev_stop erroneously called
red_wait_outgoing_items, instead of waiting till all the items in the
pipes are sent.
This waiting is necessary because after drawables are sent to the client, we release them from the
device. The device might have been stopped due to moving to the non-live
phase of migration. Accessing the device memory during this phase can lead
to inconsistencies.
Also, MSG_MIGRATE should be the last message sent to the client, before
MSG_MIGRATE_DATA. Due to this bug, msgs were marshalled and sent after
handle_dev_stop and after handle_dev_display_migrate which sometimes led
to the release of surfaces, and inserting MSG_DISPLAY_DESTROY_SURFACE
after MSG_MIGRATE.
This patch also removes the calls to red_wait_outgoing_items, from
dev_flush_surfaces. They were unnecessary.
fix: rhbz#866929
At migration destination side, we need to restore the client's surfaces
state, before sending surfaces related messages.
Before this patch, we stopped the processing of only the cmd ring, till migration data
arrived.
However, some QXL_IOs require reading and rendering the cmd ring (e.g.,
update_area). Moreover, when the device is reset, after destroying all
surfaces, we assert (in qemu) if the cmd ring is not empty (see
rhbz#866929).
This fix makes the red_worker thread wait till the migration data arrives
(or till a timeout), and not process any input from the device after the
vm is started.
We try to inject an interrupt to the vm in this case, which we cannot do
if it is stopped. Instead log this and update when vm restarts.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=870972
(that bz is on qemu, it will be cloned or just changed, not
sure yet)
When a new client connects, there may be commands in the ring that it
can't understand, so we need to process these before forwarding new
commands to the client. By doing this after changing the capability
bits we ensure that the new client will never see a command that it
doesn't understand (under the assumption that the guest will read and
obey the capability bits).
Acked-by: Alon Levy <alonl@redhat.com>
A new interface
set_client_capabilities (QXLInstance *qin,
uint8_t client_present,
uint8_t caps[58]);
is added to QXLInstance, and spice server is changed to call it
whenever a client connects or disconnects. The QXL device in response
is expected to update the client capability bits in the ROM of the
device and raise the QXL_INTERRUPT_CLIENT interrupt.
There is a potential race condition in the case where a client
disconnects and a new client with fewer capabilities connects. There
may be commands in the ring that the new client can't handle. This
case is handled by first changing the capability bits, then processing
all commands in the ring, and then start forwarding commands to the
new client. As long as the guest obeys the capability bits, the new
client will never see anything it doesn't understand.
Just checks stride vs width times bpp.
This fixes a potential abort on guest generated bad images in
glz_encoder.
Other files touched to move some consts to red_common, they are
static so no problem to be defined in both red_worker.c and
red_parse_qxl.c .
replace add_ref with add for stack allocated SpiceMigrateDataDisplay.
This fixes wrong MIGRATE_DATA message in display channel (symptom is
glz_encoder_max being way too big, and malloc failure at target) seen on
F18 with gcc-4.7.1-5.fc18.x86_64 and glibc-2.16-8.fc18.x86_64 (didn't
appear on RHEL 6).