Soeren rightfully complained that I had removed all the comments from
André's patch, most importantly those that explain why the transformation is
valid. So add a few details to show that B varies linearly across the
scanline and how we can therefore reduce the per-pixel cost of evaluating
B.
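
The incremental evaluation can be sketched roughly as follows. This is
only an illustration, not the code from the patch: the struct, the
function names, and the assumed form of B (an affine function of the
sample position, written as -2 * (pdx * cdx + pdy * cdy + r1 * dr)) are
assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical parameter bundle; the field names are illustrative only. */
typedef struct
{
    double c1x, c1y;   /* centre of the first circle */
    double cdx, cdy;   /* difference between the two centres */
    double r1, dr;     /* first radius and radius difference */
} radial_params_t;

/* Direct evaluation of B at sample position (x, y). */
static double
eval_b (const radial_params_t *p, double x, double y)
{
    double pdx = x - p->c1x;
    double pdy = y - p->c1y;

    return -2.0 * (pdx * p->cdx + pdy * p->cdy + p->r1 * p->dr);
}

/* Fill out[0..n-1] with B along a scanline that starts at (x, y) and
 * advances by (step_x, step_y) per pixel.  Because B is affine in the
 * position, its per-pixel change db is constant, so the inner loop
 * needs one addition instead of a full evaluation.
 */
static void
walk_b (const radial_params_t *p,
        double x, double y,
        double step_x, double step_y,
        double *out, size_t n)
{
    double b = eval_b (p, x, y);
    double db = -2.0 * (step_x * p->cdx + step_y * p->cdy);
    size_t i;

    for (i = 0; i < n; i++)
    {
        out[i] = b;
        b += db;
    }
}
```

With floating point the running sum can drift slightly from the direct
evaluation over long scanlines; the sketch ignores that.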
Fixes: Bug 22908 -- Invalid output of radial gradient
http://bugs.freedesktop.org/show_bug.cgi?id=22908
We also include a modified patch by André Tupinambá <andrelrt@gmail.com>,
to pull constant expressions out of the inner radial gradient walker.
Microsoft C++ does not define __m64 and the related MMX functions on
x64. However, it succeeds in generating object files for the SSE2 code
inside pixman.
The real problem happens during linking, when it cannot find the MMX
functions (which are not defined as intrinsics for the AMD64 platform).
I have implemented those missing functions using ordinary C code.
MMX __m64 is used relatively sparingly within the SSE2 implementation,
so the performance impact is probably negligible.
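
As a rough idea of what such a replacement can look like, here is a
sketch that emulates one packed 16-bit operation on a plain 64-bit
integer. The type name emu_m64 and the function name emu_mulhi_pu16
are hypothetical and are not taken from the patch; the function models
the behaviour of the _mm_mulhi_pu16 intrinsic (high 16 bits of each
unsigned 16x16 product).

```c
#include <stdint.h>

/* Hypothetical stand-in for __m64: four 16-bit lanes in one integer. */
typedef uint64_t emu_m64;

/* For each of the four unsigned 16-bit lanes, multiply the lanes of a
 * and b and keep the upper 16 bits of the 32-bit product.
 */
static emu_m64
emu_mulhi_pu16 (emu_m64 a, emu_m64 b)
{
    emu_m64 result = 0;
    int lane;

    for (lane = 0; lane < 4; lane++)
    {
        uint32_t x = (uint32_t) ((a >> (16 * lane)) & 0xffff);
        uint32_t y = (uint32_t) ((b >> (16 * lane)) & 0xffff);
        uint32_t hi = (x * y) >> 16;    /* product always fits in 32 bits */

        result |= (emu_m64) hi << (16 * lane);
    }

    return result;
}
```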
Bug 22390.
During the fast-path query, the read_func and write_func from the bits
structure are queried for the solid image.
==32723== Conditional jump or move depends on uninitialised value(s)
==32723== at 0x412AF20: _pixman_run_fast_path (pixman-utils.c:681)
==32723== by 0x4136319: sse2_composite (pixman-sse2.c:5554)
==32723== by 0x4100CD2: _pixman_implementation_composite
(pixman-implementation.c:227)
==32723== by 0x412396E: pixman_image_composite (pixman.c:140)
==32723== by 0x4123D64: pixman_image_fill_rectangles (pixman.c:322)
==32723== by 0x40482B7: _cairo_image_surface_fill_rectangles
(cairo-image-surface.c:1180)
==32723== by 0x4063BE7: _cairo_surface_fill_rectangles
(cairo-surface.c:1883)
==32723== by 0x4063E38: _cairo_surface_fill_region
(cairo-surface.c:1840)
==32723== by 0x4067FDC: _clip_and_composite_trapezoids
(cairo-surface-fallback.c:625)
==32723== by 0x40689C5: _cairo_surface_fallback_paint
(cairo-surface-fallback.c:835)
==32723== by 0x4065731: _cairo_surface_paint (cairo-surface.c:1923)
==32723== by 0x4044098: _cairo_gstate_paint (cairo-gstate.c:900)
==32723== Uninitialised value was created by a heap allocation
==32723== at 0x402732D: malloc (vg_replace_malloc.c:180)
==32723== by 0x410099F: _pixman_image_allocate (pixman-image.c:100)
==32723== by 0x41265B8: pixman_image_create_solid_fill
(pixman-solid-fill.c:75)
==32723== by 0x4123CE1: pixman_image_fill_rectangles (pixman.c:314)
==32723== by 0x40482B7: _cairo_image_surface_fill_rectangles
(cairo-image-surface.c:1180)
==32723== by 0x4063BE7: _cairo_surface_fill_rectangles
(cairo-surface.c:1883)
==32723== by 0x4063E38: _cairo_surface_fill_region
(cairo-surface.c:1840)
==32723== by 0x4067FDC: _clip_and_composite_trapezoids
(cairo-surface-fallback.c:625)
==32723== by 0x40689C5: _cairo_surface_fallback_paint
(cairo-surface-fallback.c:835)
==32723== by 0x4065731: _cairo_surface_paint (cairo-surface.c:1923)
==32723== by 0x4044098: _cairo_gstate_paint (cairo-gstate.c:900)
==32723== by 0x403C10B: cairo_paint (cairo.c:2052)
This works because the X server always attempts to set a clip region
within the bounds of the drawable, and it only fails at it when it is
computing the wrong translation and therefore needs the workaround.
Bug 22844 demonstrates that it is not sufficient to play tricks with
the clip regions to work around the bogus images from the X
server. The problem there is that if the operation hits the general
path and the destination has a different format than a8r8g8b8, the
destination pixels will be fetched into a temporary array. But because
those pixels would be outside the clip region, they would be fetched
as black. The previous workaround was relying on fast paths fetching
those pixels without checking the clip region.
In the new scheme we work around the problem at the
pixman_image_composite() level. If an image is determined to need a
workaround, we translate the bits pointer, the coordinates, and
the clip region, thus effectively undoing the X server's broken
computation.
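
The idea can be sketched as below. The types and function names are
hypothetical simplifications (a single clip box instead of a region,
32-bit pixels only), and how the offset (dx, dy) is determined is
outside the scope of the sketch. The point is that all three pieces
move together, so a given coordinate still refers to the same pixel in
memory afterwards.

```c
#include <stdint.h>

typedef struct { int x1, y1, x2, y2; } box_t;

/* Hypothetical, heavily simplified image description. */
typedef struct
{
    uint32_t *bits;     /* pointer to the pixel at (0, 0) */
    int       stride;   /* row length in uint32_t units */
    box_t     clip;     /* single clip box, for simplicity */
} image_t;

/* Address of the pixel at (x, y). */
static uint32_t *
pixel_at (const image_t *img, int x, int y)
{
    return img->bits + y * img->stride + x;
}

/* Shift the image origin by (dx, dy): advance the bits pointer and
 * move the clip box and the caller's coordinates back by the same
 * amount.
 */
static void
apply_workaround (image_t *img, int dx, int dy, int *x, int *y)
{
    img->bits += dy * img->stride + dx;

    img->clip.x1 -= dx;
    img->clip.x2 -= dx;
    img->clip.y1 -= dy;
    img->clip.y2 -= dy;

    *x -= dx;
    *y -= dy;
}
```

A real implementation would also have to save the original bits
pointer and restore it after compositing.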
The pld instruction used in the NEON assembler code is only available
on ARMv5E and on ARMv6 or later.
Set -mcpu=cortex-a8 when compiling the source file (similar to what is
already done for the SIMD build).
128-bit registers "qX" are handled incorrectly in inline assembly
clobber lists by the CodeSourcery cs2007q3 gcc toolchain. Only the
first 64-bit half is saved and restored by gcc. Changing the clobber
list to use only 64-bit register aliases solves this problem.
For example, 128-bit register q0 is mapped to two 64-bit
registers d0 and d1, q1 is mapped to d2 and d3, etc.
This patch effectively reverts the changes made by commit
8eeeca9932, which was causing severe stability issues, and restores
the old variant of the 'neon_composite_over_n_8_0565' function, which
used to work correctly.
- Introduce a GOOD_RECT() macro that checks that a pixman_box_t is not
empty or degenerate and use it.
- Use GOOD_RECT() instead of magic if statements for functions that take
x, y, width, height arguments
- Use GOOD_RECT() in _reset(). The checks in the previous code seemed to
allow an empty box, but then created a broken region from it.
- Add GOOD(region) check at the end of _translate()
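
A check of this kind could look roughly like the sketch below. The
box_t type and the box_from_xywh() helper are hypothetical stand-ins
used for illustration, and the macro body is an assumption about what
"not empty or degenerate" means (strictly positive width and height).

```c
/* Hypothetical stand-in for the pixman box type. */
typedef struct { int x1, y1, x2, y2; } box_t;

/* A box is good when it has strictly positive width and height. */
#define GOOD_RECT(rect) ((rect)->x1 < (rect)->x2 && (rect)->y1 < (rect)->y2)

/* Build a box from x, y, width, height arguments and report whether
 * the result is usable.  Returns 1 for a good box, 0 otherwise.
 * Integer overflow of x + width or y + height is not handled here.
 */
static int
box_from_xywh (box_t *box, int x, int y,
               unsigned int width, unsigned int height)
{
    box->x1 = x;
    box->y1 = y;
    box->x2 = x + (int) width;
    box->y2 = y + (int) height;

    return GOOD_RECT (box);
}
```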
When compiled without optimization, GCC will place various temporaries
on the stack. Since Firefox sometimes causes the stack to be aligned
to four bytes, this causes movdqa to generate faults.