2659 Commits

Author SHA1 Message Date
Andreas Boll
31381b7057 Upload to unstable. 2017-12-17 13:34:07 +01:00
Andreas Boll
f0178c049c Bump standards version to 4.1.2. 2017-12-17 13:19:45 +01:00
Andreas Boll
9684e88c21 Stop passing --disable-silent-rules to configure, debhelper does it now. 2017-12-17 13:19:23 +01:00
Andreas Boll
397047255e Switch to dbgsym package. 2017-12-17 13:18:03 +01:00
Andreas Boll
34c1784503 Declare Multi-Arch: same for libpixman-1-dev (Closes: #884166). 2017-12-17 13:17:44 +01:00
Julien Cristau
87934b6b4f Upload to unstable 2016-09-24 13:25:26 +02:00
Julien Cristau
4daa9a4c6b Use https URL in debian/watch. 2016-09-24 13:23:41 +02:00
Julien Cristau
0f4e087031 Bump changelogs 2016-05-13 12:50:41 +02:00
Julien Cristau
5672fa0f82 pixman 0.34.0 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWrh2cAAoJEGUdTbirWueAfxAH/1sf8P0SHY1y9KBKCw0enM4Y
 60sZYAgTgLa5prITcPeTb11bw877WAF73bAVjzL+6pNkT+Xs1ytvckwmbDoKDRZi
 zlptf0vPCnPX95Fh2X2PSO/1G0EErNWbqP5dUtLJ8L4sEaAj5TtDC9r9BouXpFaR
 qdipAmC1dVQNsbheBUinnfIjQ7H7i0NXXoUADFoP+X9V3WW95Hjkbwyoa4IUeYsY
 lPLVKfMRTZfQLksAAViDDpAhQxIrwMYQYApuMlbYXvX3tsW6zZCTeDfjqwRfxkdX
 Nnsz3lKBGvbS2ZJQBx2Xp9YC7+eu12IlxFA8cn3Exa96VngPJK5bR8Qn1ZJlUH8=
 =hex7
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.34.0' into debian-unstable

pixman 0.34.0 release
2016-05-13 12:49:33 +02:00
Oded Gabbay
1727aa4ab6 Pre-release version bump to 0.34.0
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-01-31 16:39:23 +02:00
Andreas Boll
af451ab328 Upload to unstable. 2016-01-14 13:46:57 +01:00
Andreas Boll
cae8b2a893 Add myself to Uploaders. 2016-01-14 13:21:08 +01:00
Andreas Boll
e22e142165 Bump changelogs. 2016-01-14 13:19:45 +01:00
Andreas Boll
5e030aac41 pixman 0.33.6 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWeVNeAAoJEGUdTbirWueAZVUIAIMrz8RGz2t/6Y16CPx8Kfat
 NJFe9k0gVxTCBGYcAOtZJxeqcl/RryGuEGrdcN1UiAeCsjDxTCEwefHO1ablC6A6
 Zc57mkxbknM1eOHiU/D59+JFC5cvLM3WlsQSAi2CyUIdlSq/b7vK/ADWas7kn8y9
 AdDd/MEfGXwVKumQqSN+h5GZxLwhOYw6Y9Ew6srR5EX3jzGQ8GQY3cfd3tzXpYYN
 aZ3EME3EUkhrT3DdUg/byoQu1YIppGm5Vb405gqe/1B+QZLMHUsKP3dwMk++jcdn
 4vcZAhs3s5VrVlPkfng6HLdRHmHI//AfwRBktcrEoirGfGGtPF3NKfk9B4KgPRk=
 =FhAa
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.33.6' into debian-unstable

pixman 0.33.6 release
2016-01-14 13:17:22 +01:00
Oded Gabbay
0e72e78086 Post-release version bump to 0.33.7
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-22 15:55:32 +02:00
Oded Gabbay
65f35270e4 Pre-release version bump to 0.33.6
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-22 15:30:10 +02:00
Oded Gabbay
a566f627db configure.ac: fix test for SSE2 & SSSE3 assembler support
This patch modifies the SSE2 & SSSE3 tests in configure.ac to use a
global variable to initialize vector variables. In addition, we now
return the value of the computation instead of 0.

This is done so gcc 4.9 (and lower) won't optimize away the SSE assembly
instructions (when using -O1 and higher). If it did, the configure test
might incorrectly pass even though the assembler doesn't support the
SSE instructions (the test would pass because the compiler does support
the intrinsics).

v2: instead of using volatile, use a global variable as input

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-22 11:19:01 +02:00
Andrea Canciani
d24b415f3e mmx: Improve detection of support for "K" constraint
Older versions of clang emitted an error on the "K" constraint, but at
least since version 3.7 it is supported. Just like gcc, this
constraint is only allowed for constants, but apparently clang
requires them to be known before inlining.

Using the macro definition _mm_shuffle_pi16(A, N) ensures that the "K"
constraint is always applied to a literal constant, independently from
the compiler optimizations and allows building pixman-mmx on modern
clang.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrea Canciani <ranma42@gmail.com>
2015-11-18 14:19:58 -08:00
Matt Turner
312e381523 Revert "mmx: Use MMX2 intrinsics from xmmintrin.h directly."
This reverts commit 7de61d8d14.

Newer versions of gcc allow inclusion of xmmintrin.h without -msse, but
still won't allow usage of the intrinsics.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=564024
2015-11-18 14:19:12 -08:00
Andreas Boll
017a59ec26 Upload to unstable 2015-11-04 13:26:38 +01:00
Andreas Boll
c193730083 Bump changelogs. 2015-11-04 10:30:58 +01:00
Andreas Boll
51c330400f pixman 0.33.4 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWKk4tAAoJEGUdTbirWueAIDkH/0YQj9943iFVJFEWhQdhLJe6
 PeHsiZgNjhPTNK2gpuudtOK2yda1akQTCfjGeNzN0nKQ0qPOaDiF71jt/C4Duppx
 rX9M6lkyMEPlCrM27+pZUCJitL+e7j8qYjapAdfvx8lCqvl8Mkq2t5JCsr1PWkte
 5w83kNhWf35eWN0zgRem9tTgVQ0LMYdO5IYPasAnqKHUUaIHO/r2dTNdc8bBFvD7
 k7X3Qz/kqAodraTWpieT59mwttUI0x/CiaNjlXfMDC4KKtbzkZJQlc0Oys74EG17
 Oag2Bvi4vnkTj+lvoixhu8dBGR/LPyEzZHbZyNWfjsDYL2RM2FuovUDxaYYM5nQ=
 =11P2
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.33.4' into debian-unstable

pixman 0.33.4 release
2015-11-04 10:28:32 +01:00
Oded Gabbay
3a50806cbe Post-release version bump to 0.33.5
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-10-23 18:33:55 +03:00
Oded Gabbay
fa71d08a81 Pre-release version bump to 0.33.4
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-10-23 17:58:49 +03:00
Andrea Canciani
9728241bd0 test: Fix fence-image-self-test on Mac
On MacOS X, according to the manpage of mprotect(), "When a program
violates the protections of a page, it gets a SIGBUS or SIGSEGV
signal.", but fence-image-self-test was only accepting a SIGSEGV as
notification of invalid access.

Fixes fence-image-self-test

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-10-16 15:05:02 +03:00
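The check described in the commit above can be sketched as follows. This is a minimal illustration, not the code from fence-image-self-test: the function names killed_by_invalid_access() and status_of_child_raising() are hypothetical, and the sketch assumes a POSIX system where the test runs the faulting access in a forked child and inspects its wait status.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a waitpid() status says the child was killed by a signal
 * that an invalid memory access may raise (SIGSEGV or SIGBUS), else 0. */
int
killed_by_invalid_access (int status)
{
    if (!WIFSIGNALED (status))
        return 0;

    return WTERMSIG (status) == SIGSEGV || WTERMSIG (status) == SIGBUS;
}

/* Hypothetical helper for the demo: fork a child that raises `sig` with
 * the default disposition, and return the child's wait status. */
int
status_of_child_raising (int sig)
{
    int status = 0;
    pid_t pid = fork ();

    assert (pid >= 0);          /* fork() must succeed for the demo */
    if (pid == 0)
    {
        signal (sig, SIG_DFL);
        raise (sig);
        _exit (0);              /* only reached if the signal did not kill us */
    }

    waitpid (pid, &status, 0);
    return status;
}
```

Accepting both signals keeps the check independent of which one a given kernel chooses to deliver for a protection violation.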
Matt Turner
7de61d8d14 mmx: Use MMX2 intrinsics from xmmintrin.h directly.
We had lots of hacks to handle the inability to include xmmintrin.h
without compiling with -msse (lest SSE instructions be used in
pixman-mmx.c). Some recent version of gcc relaxed this restriction.

Change configure.ac to test that xmmintrin.h can be included and that we
can use some intrinsics from it, and remove the work-around code from
pixman-mmx.c.

Evidently allows gcc 4.9.3 to optimize better as well:

   text	   data	    bss	    dec	    hex	filename
 657078	  30848	    680	 688606	  a81de	libpixman-1.so.0.33.3 before
 656710	  30848	    680	 688238	  a806e	libpixman-1.so.0.33.3 after

Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Tested-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2015-10-13 09:40:42 -07:00
Siarhei Siamashka
90e62c0867 vmx: implement fast path vmx_composite_over_n_8888
Running "lowlevel-blt-bench over_n_8888" on Playstation3 3.2GHz,
Gentoo ppc (32-bit userland) gave the following results:

before:  over_n_8888 =  L1: 147.47  L2: 205.86  M:121.07
after:   over_n_8888 =  L1: 287.27  L2: 261.09  M:133.48

Cairo non-trimmed benchmarks on POWER8, 3.4GHz 8 Cores:

ocitysmap          659.69  -> 611.71   :  1.08x speedup
xfce4-terminal-a1  2725.22 -> 2547.47  :  1.07x speedup

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-29 14:21:46 +03:00
Ben Avison
2876d8d3dd affine-bench: remove 8e margin from COVER area
Patch "Remove the 8e extra safety margin in COVER_CLIP analysis" reduced
the required image area for setting the COVER flags in
pixman.c:analyze_extent(). Do the same reduction in affine-bench.

Leaving the old calculations in place would be very confusing for anyone
reading the code.

Also add a comment that explains how affine-bench wants to hit the COVER
paths. This explains why the intricate extent calculations are copied
from pixman.c.

[Pekka: split patch, change comments, write commit message]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-25 14:26:04 +03:00
Ben Avison
0e2e975128 Remove the 8e extra safety margin in COVER_CLIP analysis
As discussed in
http://lists.freedesktop.org/archives/pixman/2015-August/003905.html

the 8 * pixman_fixed_e (8e) adjustment which was applied to the transformed
coordinates is a legacy of rounding errors which used to occur in old
versions of Pixman, but which no longer apply. For any affine transform,
you are now guaranteed to get the same result by transforming the upper
coordinate as though you transform the lower coordinate and add (size-1)
steps of the increment in source coordinate space. No projective
transform routines use the COVER_CLIP flags, so they cannot be affected.

Proof by Siarhei Siamashka:

Let's take a look at the following affine transformation matrix (with 16.16
fixed point values) and two vectors:

         | a   b     c    |
M      = | d   e     f    |
         | 0   0  0x10000 |

         |  x_dst  |
P     =  |  y_dst  |
         | 0x10000 |

         | 0x10000 |
ONE_X  = |    0    |
         |    0    |

The current matrix multiplication code does the following calculations:

             | (a * x_dst + b * y_dst + 0x8000) / 0x10000 + c |
    M * P =  | (d * x_dst + e * y_dst + 0x8000) / 0x10000 + f |
             |                   0x10000                      |

These calculations are not perfectly exact and we may get rounding
because the integer coordinates are adjusted by 0.5 (or 0x8000 in the
16.16 fixed point format) before doing matrix multiplication. For
example, if the 'a' coefficient is an odd number and 'b' is zero,
then we are losing some of the least significant bits when dividing by
0x10000.

So we need to strictly prove that the following expression is always
true even though we have to deal with rounding:

                                          | a |
    M * (P + ONE_X) - M * P = M * ONE_X = | d |
                                          | 0 |

or

   ((a * (x_dst + 0x10000) + b * y_dst + 0x8000) / 0x10000 + c)
  -
   ((a * x_dst             + b * y_dst + 0x8000) / 0x10000 + c)
  =
    a

It's easy to see that this is equivalent to

    a + ((a * x_dst + b * y_dst + 0x8000) / 0x10000 + c)
      - ((a * x_dst + b * y_dst + 0x8000) / 0x10000 + c)
  =
    a

Which means that stepping exactly by one pixel horizontally in the
destination image space (advancing 'x_dst' by 0x10000) is the same as
changing the transformed 'x_src' coordinate in the source image space
exactly by 'a'. The same applies to the vertical direction too.
Repeating these steps, we can reach any pixel in the source image
space and get exactly the same fixed point coordinates as doing
a matrix multiplication for each pixel.

By the way, the older matrix multiplication implementation, which was
relying on less accurate calculations with three intermediate roundings
"((a + 0x8000) >> 16) + ((b + 0x8000) >> 16) + ((c + 0x8000) >> 16)",
also has the same properties. However, reverting
    http://cgit.freedesktop.org/pixman/commit/?id=ed39992564beefe6b12f81e842caba11aff98a9c
and applying this "Remove the 8e extra safety margin in COVER_CLIP
analysis" patch makes the cover test fail. The real reason why it fails
is that the old pixman code was using "pixman_transform_point_3d()"
function
    http://cgit.freedesktop.org/pixman/tree/pixman/pixman-matrix.c?id=pixman-0.28.2#n49
for getting the transformed coordinate of the top left corner pixel
in the image scaling code, but at the same time using a different
"pixman_transform_point()" function
    http://cgit.freedesktop.org/pixman/tree/pixman/pixman-matrix.c?id=pixman-0.28.2#n82
in the extents calculation code for setting the cover flag. And these
functions did the intermediate rounding differently. That's why the 8e
safety margin was needed.

** proof ends
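
The stepping identity proved above can also be checked numerically. The following is a minimal sketch, not code from Pixman: the names fixed_16_16, floor_div_65536(), transform_x() and stepping_matches() are hypothetical. It assumes the division by 0x10000 rounds toward negative infinity, as an arithmetic right shift by 16 does. C's `/` operator truncates toward zero, which would break the identity whenever the numerator changes sign between two steps, so the sketch uses an explicit floor division.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16;    /* hypothetical stand-in for a 16.16 value */

/* Divide by 0x10000, rounding toward negative infinity. */
int64_t
floor_div_65536 (int64_t n)
{
    int64_t q = n / 0x10000;

    if (n % 0x10000 < 0)
        q--;
    return q;
}

/* First row of M * P as written in the commit message:
 * (a * x_dst + b * y_dst + 0x8000) / 0x10000 + c */
int64_t
transform_x (fixed_16_16 a, fixed_16_16 b, fixed_16_16 c,
             fixed_16_16 x_dst, fixed_16_16 y_dst)
{
    int64_t n = (int64_t) a * x_dst + (int64_t) b * y_dst + 0x8000;

    return floor_div_65536 (n) + c;
}

/* Returns 1 if, for `steps` one-pixel steps to the right of x_dst,
 * adding `a` repeatedly gives the same x_src as a full per-pixel
 * transform; returns 0 on the first mismatch. */
int
stepping_matches (fixed_16_16 a, fixed_16_16 b, fixed_16_16 c,
                  fixed_16_16 x_dst, fixed_16_16 y_dst, int steps)
{
    int64_t x_src = transform_x (a, b, c, x_dst, y_dst);
    int i;

    assert (steps >= 0);
    for (i = 1; i <= steps; i++)
    {
        x_src += a;
        if (transform_x (a, b, c, x_dst + i * 0x10000, y_dst) != x_src)
            return 0;
    }
    return 1;
}
```

The sketch only covers the x row; the same reasoning applies to the y row with 'd' in place of 'a'.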

However, for COVER_CLIP_NEAREST, the actual margins added were not 8e.
Because the half-way cases round down, that is, coordinate 0 hits pixel
index -1 while coordinate e hits pixel index 0, the extra safety margins
were actually 7e to the left and up, and 9e to the right and down. This
patch removes the 7e and 9e margins and restores the -e adjustment
required for NEAREST sampling in Pixman. For reference, see
pixman/rounding.txt.

For COVER_CLIP_BILINEAR, the margins were exactly 8e as there are no
additional offsets to be restored, so simply removing the 8e additions
is enough.

Proof:

All implementations must give the same numerical results as
bits_image_fetch_pixel_nearest() / bits_image_fetch_pixel_bilinear().

The former does
    int x0 = pixman_fixed_to_int (x - pixman_fixed_e);
which maps directly to the new test for the nearest flag, when you consider
that x0 must fall in the interval [0,width).

The latter does
    x1 = x - pixman_fixed_1 / 2;
    x1 = pixman_fixed_to_int (x1);
    x2 = x1 + 1;
When you write a COVER path, you take advantage of the assumption that
both x1 and x2 fall in the interval [0, width).

As samplers are allowed to fetch the pixel at x2 unconditionally, we
require
    x1 >= 0
    x2 < width
so
    x - pixman_fixed_1 / 2 >= 0
    x - pixman_fixed_1 / 2 + pixman_fixed_1 < width * pixman_fixed_1
so
    pixman_fixed_to_int (x - pixman_fixed_1 / 2) >= 0
    pixman_fixed_to_int (x + pixman_fixed_1 / 2) < width
which matches the source code lines for the bilinear case, once you delete
the lines that add the 8e margin.

Signed-off-by: Ben Avison <bavison@riscosopen.org>
[Pekka: adjusted commit message, left affine-bench changes for another patch]
[Pekka: add commit message parts from Siarhei]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-25 14:24:17 +03:00
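The two cover conditions derived in the commit message above can be written out for a single axis as a small sketch. This is an illustration under stated assumptions, not the code in pixman.c: the names cover_fixed_t, COVER_FIXED_E, COVER_FIXED_1, cover_fixed_to_int(), nearest_covers() and bilinear_covers() are hypothetical stand-ins, and the sketch ignores overflow for coordinates near the limits of the 16.16 range.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t cover_fixed_t;              /* stand-in for pixman_fixed_t */

#define COVER_FIXED_E ((cover_fixed_t) 1)         /* smallest 16.16 step */
#define COVER_FIXED_1 ((cover_fixed_t) 0x10000)   /* 1.0 in 16.16 */

/* Integer part of a 16.16 value, rounding toward negative infinity. */
int
cover_fixed_to_int (cover_fixed_t f)
{
    int64_t v = f;
    int64_t q = v / 0x10000;

    if (v % 0x10000 < 0)
        q--;
    return (int) q;
}

/* NEAREST: the sampled pixel index x0 must fall in [0, width). */
int
nearest_covers (cover_fixed_t x, int width)
{
    int x0 = cover_fixed_to_int (x - COVER_FIXED_E);

    assert (width > 0);
    return x0 >= 0 && x0 < width;
}

/* BILINEAR: both neighbours x1 and x2 = x1 + 1 must fall in [0, width). */
int
bilinear_covers (cover_fixed_t x, int width)
{
    assert (width > 0);
    return cover_fixed_to_int (x - COVER_FIXED_1 / 2) >= 0 &&
           cover_fixed_to_int (x + COVER_FIXED_1 / 2) < width;
}
```

With these definitions, coordinate 0 is rejected for NEAREST because it maps to pixel index -1, while coordinate e is accepted, matching the half-way rounding described in the commit message.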
Ben Avison
23525b4ea5 pixman-general: Tighten up calculation of temporary buffer sizes
Each of the aligns can only add a maximum of 15 bytes to the space
requirement. This permits some edge cases to use the stack buffer where
previously it would have deduced that a heap buffer was required.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-09-25 14:19:15 +03:00
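The 15-byte bound mentioned in the commit above follows from rounding an address up to a 16-byte boundary. The sketch below illustrates the arithmetic only; it assumes 16-byte alignment, and the names align_up_16(), alignment_padding() and fits_in_stack_buffer() are hypothetical rather than taken from pixman-general.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round an address up to the next multiple of 16. */
uintptr_t
align_up_16 (uintptr_t addr)
{
    return (addr + 15) & ~(uintptr_t) 15;
}

/* Bytes lost to alignment: always in the range 0..15. */
size_t
alignment_padding (uintptr_t addr)
{
    return (size_t) (align_up_16 (addr) - addr);
}

/* Returns 1 if n_buffers rows of row_bytes each, every one allowed up
 * to 15 bytes of alignment padding, fit in stack_size bytes.
 * Written with a division so the comparison cannot overflow. */
int
fits_in_stack_buffer (size_t stack_size, size_t row_bytes, size_t n_buffers)
{
    assert (n_buffers > 0);
    if (stack_size / n_buffers < 15)
        return 0;
    return row_bytes <= stack_size / n_buffers - 15;
}
```

Budgeting exactly 15 bytes per aligned buffer, instead of a looser estimate, is what lets borderline sizes stay on the stack.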
Siarhei Siamashka
8b49d4b6b4 pixman-general: Fix stack related pointer arithmetic overflow
As https://bugs.freedesktop.org/show_bug.cgi?id=92027#c6 explains,
the stack is allocated at the very top of the process address space
in some configurations (32-bit x86 systems with ASLR disabled).
And the careless computations done with the 'dest_buffer' pointer
may overflow, failing the buffer upper limit check.

The problem can be reproduced using the 'stress-test' program,
which segfaults when executed via setarch:

    export CFLAGS="-O2 -m32" && ./autogen.sh
    ./configure --disable-libpng --disable-gtk && make
    setarch i686 -R test/stress-test

This patch introduces the required corrections. The extra check
for negative 'width' may be redundant (the invalid 'width' value
is not supposed to reach here), but it's better to play safe
when dealing with the buffers allocated on stack.

Reported-by: Ludovic Courtès <ludo@gnu.org>
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: soren.sandmann@gmail.com
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-22 13:19:06 +03:00
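The kind of overflow described in the commit above can be illustrated with plain integer addresses. This is a hedged sketch of the general pattern, not the actual fix: the names fits_wrapping() and fits_checked() are hypothetical, and addresses are modelled as uintptr_t values so that the wraparound is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Careless check: computes start + need, which wraps around when the
 * buffer sits near the top of the address space. */
int
fits_wrapping (uintptr_t start, uintptr_t end, uintptr_t need)
{
    return start + need <= end;
}

/* Safer check: compares the request against the space remaining,
 * so no sum beyond `end` is ever formed. */
int
fits_checked (uintptr_t start, uintptr_t end, uintptr_t need)
{
    assert (start <= end);
    return need <= end - start;
}
```

For a 90-byte buffer ending 10 bytes below the top of the address space, a 200-byte request passes the wrapping check but is correctly rejected by the checked one.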
Thomas Petazzoni
4297e9058d test: add a check for FE_DIVBYZERO
Some architectures, such as Microblaze and Nios2, currently do not
implement FE_DIVBYZERO, even though they have <fenv.h> and
feenableexcept(). This commit adds a configure.ac check to verify
whether FE_DIVBYZERO is defined or not, and if not, disables the
problematic code in test/utils.c.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-20 15:50:04 +03:00
Oded Gabbay
8189fad961 vmx: Remove unused expensive functions
Now that we replaced the expensive functions with better performing
alternatives, we should remove them so they will not be used again.

Running Cairo benchmark on trimmed traces gave the following results:

POWER8, 8 cores, 3.4GHz, RHEL 7.2 ppc64le.

Speedups
========
t-firefox-scrolling     1232.30 -> 1096.55 :  1.12x
t-gnome-terminal-vim    613.86  -> 553.10  :  1.11x
t-evolution             405.54  -> 371.02  :  1.09x
t-firefox-talos-gfx     919.31  -> 862.27  :  1.07x
t-gvim                  653.02  -> 616.85  :  1.06x
t-firefox-canvas-alpha  941.29  -> 890.42  :  1.06x

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:07:13 +03:00
Oded Gabbay
6b1b8b2b90 vmx: implement fast path vmx_composite_over_n_8_8888
POWER8, 8 cores, 3.4GHz, RHEL 7.2 ppc64le.

reference memcpy speed = 25008.9MB/s (6252.2MP/s for 32bpp fills)

                Before         After           Change
              ---------------------------------------------
L1              91.32          182.84         +100.22%
L2              94.94          182.83         +92.57%
M               95.55          181.51         +89.96%
HT              88.96          162.09         +82.21%
VT              87.4           168.35         +92.62%
R               83.37          146.23         +75.40%
RT              66.4           91.5           +37.80%
Kops/s          683            859            +25.77%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:07:08 +03:00
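The "Change" column in the benchmark table above appears to be the relative difference between the Before and After figures. A minimal sketch of that arithmetic, with the hypothetical helper name percent_change():

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
double
percent_change (double before, double after)
{
    assert (before > 0.0);
    return (after - before) / before * 100.0;
}
```

For the L1 row, percent_change (91.32, 182.84) comes to roughly 100.22, in line with the table.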
Oded Gabbay
8d8caa55a3 vmx: optimize vmx_composite_over_n_8888_8888_ca
This patch optimizes vmx_composite_over_n_8888_8888_ca by removing use
of expand_alpha_1x128, unpack/pack and in_over_2x128 in favor of
splat_alpha, in_over and MUL/ADD macros from pixman_combine32.h.

Running "lowlevel-blt-bench -n over_8888_8888" on POWER8, 8 cores,
3.4GHz, RHEL 7.2 ppc64le gave the following results:

reference memcpy speed = 23475.4MB/s (5868.8MP/s for 32bpp fills)

                Before          After           Change
              --------------------------------------------
L1              244.97          474.05         +93.51%
L2              243.74          473.05         +94.08%
M               243.29          467.16         +92.02%
HT              144.03          252.79         +75.51%
VT              174.24          279.03         +60.14%
R               109.86          149.98         +36.52%
RT              47.96           53.18          +10.88%
Kops/s          524             576            +9.92%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:07:03 +03:00
Oded Gabbay
857880f0e4 vmx: optimize scaled_nearest_scanline_vmx_8888_8888_OVER
This patch optimizes scaled_nearest_scanline_vmx_8888_8888_OVER and all
the functions it calls (combine1, combine4 and
core_combine_over_u_pixel_vmx).

The optimization is done by removing use of expand_alpha_1x128 and
expand_alpha_2x128 in favor of splat_alpha and MUL/ADD macros from
pixman_combine32.h.

Running "lowlevel-blt-bench -n over_8888_8888" on POWER8, 8 cores,
3.4GHz, RHEL 7.2 ppc64le gave the following results:

reference memcpy speed = 24847.3MB/s (6211.8MP/s for 32bpp fills)

                Before          After           Change
              --------------------------------------------
L1              182.05          210.22         +15.47%
L2              180.6           208.92         +15.68%
M               180.52          208.22         +15.34%
HT              130.17          178.97         +37.49%
VT              145.82          184.22         +26.33%
R               104.51          129.38         +23.80%
RT              48.3            61.54          +27.41%
Kops/s          430             504            +17.21%

v2: Check *pm is not NULL before dereferencing it in combine1()

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:06:50 +03:00
Pekka Paalanen
73e586efb3 armv6: enable over_n_8888
Enable the fast path added in the previous patch by moving the lookup
table entries to their proper locations.

Lowlevel-blt-bench benchmark statistics with 30 iterations, showing the
effect of adding this one patch on top of
"armv6: Add over_n_8888 fast path (disabled)", which was applied on
fd59569294.

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1    12.5   0.04     45.2   0.10    100.00%    +263.1%
L2    11.1   0.02     43.2   0.03    100.00%    +289.3%
M      9.4   0.00     42.4   0.02    100.00%    +351.7%
HT     8.5   0.02     25.4   0.10    100.00%    +198.8%
VT     8.4   0.02     22.3   0.07    100.00%    +167.0%
R      8.2   0.02     23.1   0.09    100.00%    +183.6%
RT     5.4   0.05     11.4   0.21    100.00%    +110.3%

At most 3 outliers rejected per test per set.

Iterating here means that lowlevel-blt-bench was executed 30 times, and
the statistics above were computed from the output.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-09-17 14:40:39 +03:00
Ben Avison
9eb6889b15 armv6: Add over_n_8888 fast path (disabled)
This new fast path is initially disabled by putting the entries in the
lookup table after the sentinel. The compiler cannot tell the new code
is not used, so it cannot eliminate the code. Also the lookup table size
will include the new fast path. When the follow-up patch then enables
the new fast path, the binary layout (alignments, size, etc.) will stay
the same compared to the disabled case.

Keeping the binary layout identical is important for benchmarking on
Raspberry Pi 1. The addresses at which functions are loaded will have a
significant impact on benchmark results, causing unexpected performance
changes. Keeping all function addresses the same across the patch
enabling a new fast path improves the reliability of benchmarks.

Benchmark results are included in the patch enabling this fast path.

[Pekka: disabled the fast path, commit message]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-09-17 14:40:39 +03:00
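The trick of parking entries after the sentinel can be sketched with a toy lookup table. Everything here is hypothetical (the toy_* names, the operator enum and the two dummy functions) and only illustrates the idea: the lookup stops at the sentinel, so later entries stay in the binary but are never selected.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*toy_func_t) (int);

enum toy_op { TOY_OP_NONE, TOY_OP_SRC, TOY_OP_OVER };

typedef struct
{
    enum toy_op op;
    toy_func_t  func;
} toy_fast_path_t;

int toy_src_path (int x)  { return x; }
int toy_over_path (int x) { return x + 1; }

const toy_fast_path_t toy_fast_paths[] =
{
    { TOY_OP_SRC,  toy_src_path },
    { TOY_OP_NONE, NULL },            /* sentinel: lookup stops here */
    { TOY_OP_OVER, toy_over_path },   /* compiled in, but never found */
};

/* Walk the table until the sentinel; return the match or NULL. */
toy_func_t
toy_lookup (const toy_fast_path_t *table, enum toy_op op)
{
    assert (op != TOY_OP_NONE);
    for (; table->op != TOY_OP_NONE; table++)
    {
        if (table->op == op)
            return table->func;
    }
    return NULL;
}
```

Enabling the parked fast path then only means moving its entry above the sentinel, which leaves the table size and the set of referenced functions unchanged.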
Ben Avison
4c71f595e3 test: Add cover-test v5
This test aims to verify both numerical correctness and the honouring of
array bounds for scaled plots (both nearest-neighbour and bilinear) at or
close to the boundary conditions for applicability of "cover" type fast paths
and iter fetch routines.

It has a secondary purpose: by setting the env var EXACT (to any value) it
will only test plots that are exactly on the boundary condition. This makes
it possible to ensure that "cover" routines are being used to the maximum,
although this requires the use of a debugger or code instrumentation to
verify.

Changes in v4:

  Check the fence page size and skip the test if it is too large. Since
  we need to deal with pixman_fixed_t coordinates that go beyond the
  real image width, make the page size limit 16 kB. A 32 kB or larger
  page size would cause an a8 image width to be 32k or more, which is no
  longer representable in pixman_fixed_t.

  Use a shorthand variable 'filter' in test_cover().

  Whitespace adjustments.

Changes in v5:

  Skip if fenced memory is not supported. Do you know of any such
  platform?

Signed-off-by: Ben Avison <bavison@riscosopen.org>
[Pekka: changes in v4 and v5]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-16 15:34:43 +03:00
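The page size limit discussed in the v4 notes above comes from the range of a signed 16.16 fixed-point value, whose integer part cannot exceed 32767. A small sketch of that representability check, using the hypothetical name width_fits_fixed_16_16():

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if `width` pixels can be expressed as a signed 16.16
 * fixed-point coordinate stored in 32 bits, 0 otherwise. */
int
width_fits_fixed_16_16 (int width)
{
    int64_t as_fixed;

    assert (width >= 0);
    as_fixed = (int64_t) width * 0x10000;
    return as_fixed <= INT32_MAX;
}
```

An a8 image as wide as a 32 kB page (32768 pixels) fails this check, while a 16 kB page leaves headroom for coordinates beyond the real image width.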
Julien Cristau
f9a49b3783 Run tests with VERBOSE=1. 2015-09-12 20:31:08 +02:00
Julien Cristau
4b4898e073 Upload to unstable 2015-09-12 13:08:19 +02:00
Pekka Paalanen
812c9c9758 implementation: add PIXMAN_DISABLE=wholeops
Add a new option to PIXMAN_DISABLE: "wholeops". This option disables all
whole-operation fast paths regardless of implementation level, except
the general path (general_composite_rect).

The purpose is to add a debug option that allows us to test optimized
iterator paths specifically. With this, it is possible to:
- see if fast paths mask bugs in iterators
- compare fast paths with iterator paths for performance

The effect was tested on x86_64 by running:
$ PIXMAN_DISABLE='' ./test/lowlevel-blt-bench over_8888_8888
$ PIXMAN_DISABLE='wholeops' ./test/lowlevel-blt-bench over_8888_8888

In the first case time is spent in sse2_composite_over_8888_8888(), and
in the latter in sse2_combine_over_u().

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-09 11:42:55 +03:00
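How an option such as "wholeops" might be recognised in PIXMAN_DISABLE can be sketched as a search through a list of names. This is an assumption-laden illustration, not Pixman's parser: it assumes the names are separated by spaces, and name_in_list() and wholeops_disabled() are hypothetical helpers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in
 * `list`, 0 otherwise. A NULL list means nothing is disabled. */
int
name_in_list (const char *list, const char *name)
{
    size_t len;

    assert (name != NULL);
    len = strlen (name);
    if (list == NULL || len == 0)
        return 0;

    while (*list != '\0')
    {
        const char *end = strchr (list, ' ');
        size_t tok_len = end ? (size_t) (end - list) : strlen (list);

        if (tok_len == len && memcmp (list, name, len) == 0)
            return 1;

        list += tok_len;
        while (*list == ' ')        /* skip separators */
            list++;
    }
    return 0;
}

/* Reads the environment variable named in the commit message. */
int
wholeops_disabled (void)
{
    return name_in_list (getenv ("PIXMAN_DISABLE"), "wholeops");
}
```

Matching whole tokens rather than substrings avoids accidentally treating a longer name that merely contains "wholeops" as the option itself.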
Pekka Paalanen
e9ef2cc4de utils.[ch]: add fence_get_page_size()
Add a function to get the page size used for memory fence purposes, and
use it everywhere where getpagesize() was used.

This offers a single point in code to override the page size, in case
one wants to experiment with how the tests work with a larger page size than
what the developer's machine has.

This also offers a clean API, without adding #ifdefs, to tests for
checking the page size.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-09 11:30:51 +03:00
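A single point of override like the one described in the commit above could look roughly like this. It is a sketch under stated assumptions, not the function in test/utils.c: the name page_size_for_fences() and its override parameter are hypothetical, and the system page size is queried with POSIX sysconf().

```c
#include <assert.h>
#include <unistd.h>

/* Returns the page size to use for memory fences. A positive
 * `override_bytes` replaces the system value, which makes it easy to
 * experiment with larger pages; pass 0 to use the real page size. */
long
page_size_for_fences (long override_bytes)
{
    assert (override_bytes >= 0);
    if (override_bytes > 0)
        return override_bytes;

    return sysconf (_SC_PAGESIZE);
}
```

Callers that need the page size then go through this one function instead of querying the system themselves.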
Pekka Paalanen
82f8c997df utils.c: fix fallback code for fence_image_create_bits()
Used a wrong variable name, causing:
/home/pq/git/pixman/demos/../test/utils.c: In function ‘fence_image_create_bits’:
/home/pq/git/pixman/demos/../test/utils.c:562:46: error: ‘width’ undeclared (first use in this function)

Use the correct variable.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-09 11:29:44 +03:00
Andreas Boll
42fab57651 Bump standards version to 3.9.6. 2015-09-04 13:40:42 +02:00
Andreas Boll
56432ef5e5 Drop XC- prefix from Package-Type field. 2015-09-04 13:39:55 +02:00
Andreas Boll
c0f98e1cf4 Add upstream url. 2015-09-04 12:30:27 +02:00
Andreas Boll
03e2d2138b Update Vcs-* fields. 2015-09-04 12:30:27 +02:00
intrigeri
e6fce5e4e4 Update changelog.
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2015-09-04 12:30:26 +02:00
intrigeri
7bc925aa50 Enable all hardening build flags. Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
Quoting Simon again: "It currently has the same effect as hardening=+bindnow,
but will automatically enable future hardening options and in case the package
will ever build binaries those are immediately protected with PIE as well."

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2015-09-04 12:29:51 +02:00