2096 Commits

Author SHA1 Message Date
Jeremy Huddleston
a069da6c66 Expand TLS support beyond __thread to __declspec(thread)
This code was pretty much copied from a similar commit that I made to
xorg-server in April.

cf: xorg/xserver: bb4d145bd25e2aee988b100ecf1105ea3b6a40b8

Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13 18:02:26 -04:00
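Illustration for a069da6c66: a minimal sketch of selecting a thread-local storage keyword per compiler. TLS_KEYWORD, per_thread_counter and bump_counter are hypothetical names invented for this example, and the real change may detect the keyword at configure time rather than with preprocessor checks.

```c
/* Pick whichever thread-local storage keyword the compiler offers.
 * TLS_KEYWORD is a hypothetical name used only in this sketch. */
#if defined(_MSC_VER)
#  define TLS_KEYWORD __declspec(thread)
#elif defined(__GNUC__) || defined(__clang__)
#  define TLS_KEYWORD __thread
#else
#  error "no thread-local storage keyword known for this compiler"
#endif

/* Each thread gets its own copy of this counter. */
static TLS_KEYWORD int per_thread_counter = 0;

/* Increment and return the calling thread's counter. */
int bump_counter(void)
{
    return ++per_thread_counter;
}
```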
Jeremy Huddleston
61d999b910 Disable MMX when incompatible clang is being used.
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13 18:02:26 -04:00
Jeremy Huddleston
ad4b6922f2 Silence a warning about unused pixman_have_mmx
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13 18:02:25 -04:00
Jeremy Huddleston
bb5ff26878 Revert "Disable MMX when Clang is being used."
This reverts commit 5eb4c12a79.
2012-03-13 18:02:25 -04:00
Cyril Brulebois
c6b4daedbc Upload to experimental. 2012-03-09 13:17:30 +01:00
Cyril Brulebois
b3db603f91 Add new symbols and bump shlibs accordingly. 2012-03-09 13:15:11 +01:00
Cyril Brulebois
e6c37e621b Bump changelogs. 2012-03-09 13:03:52 +01:00
Cyril Brulebois
e4e7b8fcb8 Merge branch 'debian-unstable' into debian-experimental 2012-03-09 13:03:07 +01:00
Cyril Brulebois
44abaa5132 Merge branch 'upstream-unstable' into debian-experimental 2012-03-09 13:03:04 +01:00
Søren Sandmann Pedersen
a6ad5120f7 Post-release version bump to 0.25.3 2012-03-08 10:11:20 -05:00
Søren Sandmann Pedersen
f73f798531 Pre-release version bump to 0.25.2 2012-03-08 09:33:16 -05:00
Søren Sandmann Pedersen
62df04eb25 mmx: Squash a warning by making the argument to ldl_u() const 2012-03-08 09:29:46 -05:00
Alan Coopersmith
85943733cb Just use xmmintrin.h when building with Solaris Studio compilers
Since the Solaris Studio compilers don't have a mode where MMX
instructions are available and SSE instructions are not, we can
just use the <xmmintrin.h> header directly.

Fixes build failure due to Studio not supporting the __gnu_inline__
or __artificial__ attributes.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2012-03-05 18:57:26 -08:00
Nemanja Lukic
304f57644a MIPS: DSPr2: Added mips_dspr2_blt and mips_dspr2_fill routines.
Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):

lowlevel-blt-bench:
              src_n_0565 =  L1: 238.14  L2: 233.15  M: 57.88 ( 77.23%)  HT: 53.22  VT: 49.99  R: 47.73  RT: 24.79 (  91Kops/s)
              src_n_8888 =  L1: 190.19  L2: 187.57  M: 28.94 ( 77.23%)  HT: 27.91  VT: 27.33  R: 26.64  RT: 14.68 (  77Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.1
[  0]    image         gnome-system-monitor  268.460  269.712   0.22%    6/6

Optimized:

lowlevel-blt-bench:
              src_n_0565 =  L1:1081.39  L2: 258.22  M:189.59 (252.91%)  HT: 60.23  VT: 55.01  R: 53.44  RT: 23.68 (  89Kops/s)
              src_n_8888 =  L1: 653.46  L2: 113.55  M:135.26 (360.86%)  HT: 38.99  VT: 37.38  R: 34.95  RT: 18.67 (  84Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.1
[  0]    image         gnome-system-monitor  246.565  246.706   0.04%    6/6
2012-03-04 01:09:56 -05:00
Søren Sandmann Pedersen
999e72b80b pixman-access.c: Remove some unused macros
The macros related to palette entries:

RGB15_TO_ENTRY,
RGB24_TO_ENTRY,
RGB24_TO_ENTRY_Y

are not used anywhere.
2012-03-01 23:49:51 -05:00
Søren Sandmann Pedersen
c0cb48aae0 pixman-accessors.h: Delete unused macros
The MEMCPY_WRAPPED and ACCESS macros are not used anymore.
2012-03-01 23:49:51 -05:00
Søren Sandmann Pedersen
5adf569317 Move fetching for solid bits images to pixman-noop.c
This should be a bit faster because it can reuse the scanline on each iteration.
2012-03-01 23:49:50 -05:00
Matt Turner
3c3c70fa0b lowlevel-blt-bench: add in_8_8 and in_n_8_8
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-03-01 17:42:37 -05:00
Søren Sandmann Pedersen
fcea053561 Disable implementations mentioned in the PIXMAN_DISABLE environment variable.
With this, it becomes possible to do

     PIXMAN_DISABLE="sse2 mmx" some_app

which will run some_app without SSE2 and MMX enabled. This is useful
for benchmarking, testing and narrowing down bugs.

The current list of implementations that can be disabled:

    fast
    mmx
    sse2
    arm-simd
    arm-iwmmxt
    arm-neon
    mips-dspr2
    vmx

The general and noop implementations can't be disabled because pixman
depends on those being available for correct operation.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-02-28 15:46:13 -05:00
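Illustration for fcea053561: a rough sketch of how a space-separated list such as PIXMAN_DISABLE="sse2 mmx" could be checked for one implementation name. The helpers name_in_list and implementation_disabled are hypothetical and are not taken from pixman; only the variable name and the list format come from the commit message.

```c
#include <stdlib.h>
#include <string.h>

/* Return 1 if `name` appears as a whole space-separated word in `list`. */
int name_in_list(const char *list, const char *name)
{
    size_t len = strlen(name);

    if (list == NULL || len == 0)
        return 0;

    while (*list != '\0')
    {
        const char *end;

        while (*list == ' ')
            list++;                       /* skip separators */

        end = list;
        while (*end != '\0' && *end != ' ')
            end++;                        /* find the end of this word */

        /* Match only whole words, so "sse" does not match "sse2". */
        if ((size_t)(end - list) == len && memcmp(list, name, len) == 0)
            return 1;

        list = end;
    }
    return 0;
}

/* Hypothetical helper: was `name` listed in the environment variable? */
int implementation_disabled(const char *name)
{
    return name_in_list(getenv("PIXMAN_DISABLE"), name);
}
```

Matching whole words matters here because several implementation names share prefixes (arm-simd, arm-iwmmxt, arm-neon).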
Nemanja Lukic
e7574d336b MIPS: DSPr2: Added fast-paths for SRC operation.
The following fast-path functions are implemented (routines 4, 5 and 6 use
the same fast-memcpy routine):
    1. src_x888_8888
    2. src_8888_0565
    3. src_0565_8888
    4. src_0565_0565
    5. src_8888_8888
    6. src_0888_0888

Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):

lowlevel-blt-bench:
        src_x888_8888 =  L1: 199.35  L2:  96.54  M: 18.87 (100.68%)  HT: 17.12  VT: 16.24  R: 15.43  RT:  9.33 (  61Kops/s)
        src_8888_0565 =  L1:  71.22  L2:  51.95  M: 24.19 ( 96.17%)  HT: 20.71  VT: 19.92  R: 18.15  RT:  9.92 (  63Kops/s)
        src_0565_8888 =  L1:  38.82  L2:  36.22  M: 18.60 ( 73.95%)  HT: 14.47  VT: 13.19  R: 12.97  RT:  6.61 (  49Kops/s)
        src_0565_0565 =  L1: 286.05  L2: 155.02  M: 37.68 (100.54%)  HT: 31.08  VT: 28.07  R: 26.26  RT: 11.93 (  68Kops/s)
        src_8888_8888 =  L1: 454.32  L2: 139.15  M: 19.30 (102.98%)  HT: 17.73  VT: 16.08  R: 16.62  RT: 10.45 (  64Kops/s)
        src_0888_0888 =  L1: 190.47  L2: 106.14  M: 25.26 (101.08%)  HT: 21.88  VT: 20.32  R: 18.83  RT: 10.10 (  63Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.1
[  0]    image            firefox-asteroids  421.215  421.325   0.01%    4/6
[  1]    image         firefox-planet-gnome  647.708  648.486   0.13%    6/6
[  2]    image         gnome-system-monitor  276.073  277.506   0.38%    6/6
[  3]    image           gnome-terminal-vim  263.866  265.229   0.39%    6/6
[  4]    image                      poppler  123.576  124.003   0.15%    6/6

Optimized (with these optimizations):

lowlevel-blt-bench:
        src_x888_8888 =  L1: 369.50  L2:  99.37  M: 27.19 (145.07%)  HT: 20.24  VT: 19.48  R: 19.00  RT: 10.22 (  63Kops/s)
        src_8888_0565 =  L1: 105.65  L2:  67.87  M: 25.41 (101.00%)  HT: 20.78  VT: 19.84  R: 18.52  RT:  9.81 (  63Kops/s)
        src_0565_8888 =  L1:  77.10  L2:  63.04  M: 23.37 ( 92.90%)  HT: 20.29  VT: 19.37  R: 18.14  RT: 10.02 (  63Kops/s)
        src_0565_0565 =  L1: 519.02  L2: 241.32  M: 62.35 (166.34%)  HT: 33.74  VT: 27.63  R: 26.12  RT: 11.70 (  67Kops/s)
        src_8888_8888 =  L1: 390.48  L2: 113.99  M: 30.32 (161.77%)  HT: 19.55  VT: 17.05  R: 17.13  RT: 10.19 (  63Kops/s)
        src_0888_0888 =  L1: 349.74  L2: 156.68  M: 40.68 (162.78%)  HT: 25.58  VT: 20.57  R: 20.20  RT:  9.96 (  63Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.1
[  0]    image            firefox-asteroids  400.050  400.308   0.04%    6/6
[  1]    image         firefox-planet-gnome  628.978  629.364   0.07%    6/6
[  2]    image         gnome-system-monitor  270.247  270.313   0.03%    6/6
[  3]    image           gnome-terminal-vim  256.413  257.641   0.21%    6/6
[  4]    image                      poppler  119.540  120.023   0.21%    6/6
2012-02-25 15:06:43 -05:00
Nemanja Lukic
1364c91bd1 MIPS: DSPr2: Basic infrastructure for MIPS architecture
MIPS DSP instruction set extensions
2012-02-25 15:06:43 -05:00
Matt Turner
e43d65d49d lowlevel-blt: add over_x888_n_8888
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24 20:02:55 -05:00
Matt Turner
9f60704995 lowlevel-blt: add over_8888_8888
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24 19:58:09 -05:00
Søren Sandmann Pedersen
5eb4c12a79 Disable MMX when Clang is being used.
There are several issues with the Clang compiler and pixman-mmx.c:

- When not optimizing, it doesn't seem to recognize that an argument
  to an __always_inline__ function is compile-time constant. This
  results in this error being produced:

      fatal error: error in backend: Invalid operand for inline asm
              constraint 'K'!

- This inline assembly:

      asm ("pmulhuw %1, %0\n\t"
          : "+y" (__A)
          : "y" (__B)
      );

  results in

      fatal error: error in backend: Unsupported asm: input constraint
              with a matching output constraint of incompatible type!

So disable MMX when the compiler is Clang.
2012-02-24 16:30:41 -05:00
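Illustration for 5eb4c12a79: one way a build could steer around a compiler-specific code path is a preprocessor check on the predefined __clang__ macro. This is only a sketch; USE_MMX_SKETCH and mmx_status are hypothetical names, and the actual commit may implement the check elsewhere (for example in the configure script).

```c
/* USE_MMX_SKETCH is a hypothetical macro used only in this sketch. */
#if defined(__clang__)
#  define USE_MMX_SKETCH 0   /* Clang: skip the inline-asm MMX path */
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#  define USE_MMX_SKETCH 1   /* GCC on x86: MMX path allowed */
#else
#  define USE_MMX_SKETCH 0   /* anything else: stay on generic code */
#endif

/* Report which choice the preprocessor made for this build. */
const char *mmx_status(void)
{
    return USE_MMX_SKETCH ? "mmx enabled" : "mmx disabled";
}
```

Note that Clang also defines __GNUC__, which is why the __clang__ test has to come first.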
Matt Turner
350e231b3f mmx: make load8888 take a pointer to data instead of the data itself
Allows us to tune how we load data into the vector registers.

Signed-off-by: Matt Turner <mattst88@gmail.com>

And squashed in:

mmx: define and use load8888u function

For unaligned loads.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24 08:46:48 -05:00
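Illustration for 350e231b3f: passing a pointer instead of a value lets the load routine decide how to read memory. The sketch below shows a portable unaligned 32-bit load through memcpy; load32_unaligned_sketch is a hypothetical name, and pixman's load8888u may be implemented differently.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit pixel from an address that may not be 4-byte aligned.
 * memcpy avoids the undefined behavior of dereferencing a misaligned
 * uint32_t pointer; compilers typically lower it to a single load. */
uint32_t load32_unaligned_sketch(const void *src)
{
    uint32_t value;

    memcpy(&value, src, sizeof value);
    return value;
}
```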
Matt Turner
ab68316eda mmx: make store8888 take uint32_t *dest as argument
Allows us to tune how we store data from the vector registers.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24 08:46:28 -05:00
Matt Turner
57a245a6e0 Update .gitignore with more demos and tests
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-22 16:32:46 -05:00
Søren Sandmann Pedersen
51ae3f2d7f mmx: Delete unused function in_over_full_src_alpha()
Also a few minor formatting fixes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-02-22 14:14:30 -05:00
Søren Sandmann Pedersen
bbd1e6941b mmx: Enable over_x888_8_8888() for x86 as well
It used to be slower than the generic code (with the gcc that was
current in 2007), but that doesn't seem to be the case anymore:

over_x888_8_8888 =  L1:  22.97  L2:  22.88  M: 22.27 (  5.29%)  HT: 18.30  VT: 15.81  R: 15.54  RT: 10.35 ( 131Kops/s)
over_x888_8_8888 =  L1:  53.56  L2:  53.20  M: 50.50 ( 11.99%)  HT: 38.60  VT: 31.19  R: 29.00  RT: 17.37 ( 208Kops/s)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-02-22 14:14:08 -05:00
Matt Turner
4fc586c3df mmx: fix typo in pix_add_mul on MSVC
Typo introduced in commit a075a870.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-21 16:28:37 -05:00
Matt Turner
84221f4c16 mmx: Use _mm_shuffle_pi16
The pshufw x86 instruction is part of Extended 3DNow! and SSE1. The
equivalent ARM wshufh instruction was available from the first iwMMXt
instruction set.

This instruction is already used in the SSE2 code.

Reduces code size by ~9%.

amd64
  text    data     bss     dec     hex filename
 29925    2240       0   32165    7da5 .libs/libpixman_mmx_la-pixman-mmx.o
 27237    2240       0   29477    7325 .libs/libpixman_mmx_la-pixman-mmx.o

x86
  text    data     bss     dec     hex filename
 27677    1792       0   29469    731d .libs/libpixman_mmx_la-pixman-mmx.o
 24959    1792       0   26751    687f .libs/libpixman_mmx_la-pixman-mmx.o

arm
  text    data     bss     dec     hex filename
 30176    1792       0   31968    7ce0 .libs/libpixman_iwmmxt_la-pixman-mmx.o
 27384    1792       0   29176    71f8 .libs/libpixman_iwmmxt_la-pixman-mmx.o

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-21 12:47:49 -05:00
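Illustration for 84221f4c16: a portable scalar model of what a four-lane 16-bit shuffle such as pshufw computes. This is not the intrinsic itself; shuffle_4x16 and expand_alpha_sketch are hypothetical names, and the alpha broadcast is just one plausible use of the instruction.

```c
#include <stdint.h>

/* Model of a 4x16-bit shuffle: output lane i takes the input lane
 * selected by bits [2i+1:2i] of `order`. */
uint64_t shuffle_4x16(uint64_t v, unsigned order)
{
    uint64_t result = 0;
    int i;

    for (i = 0; i < 4; i++)
    {
        unsigned sel = (order >> (2 * i)) & 3u;
        uint64_t lane = (v >> (16 * sel)) & 0xffffu;

        result |= lane << (16 * i);
    }
    return result;
}

/* Broadcast the top lane (alpha of an unpacked a,r,g,b pixel) to all
 * four lanes by using selector 3 for every output lane. */
uint64_t expand_alpha_sketch(uint64_t pixel)
{
    return shuffle_4x16(pixel, 0xffu);
}
```

A single shuffle replaces a sequence of shifts and ORs, which is consistent with the code-size reduction reported in the commit message.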
Matt Turner
1420834496 mmx: Use _mm_mulhi_pu16
The pmulhuw x86 instruction is part of Extended 3DNow! and SSE1. The
equivalent ARM wmuluh instruction was available from the first iwMMXt
instruction set.

This instruction is already used in the SSE2 code.

Reduces code size by ~5%.

amd64
  text    data     bss     dec     hex filename
 31325    2240       0   33565    831d .libs/libpixman_mmx_la-pixman-mmx.o
 29925    2240       0   32165    7da5 .libs/libpixman_mmx_la-pixman-mmx.o

x86
  text    data     bss     dec     hex filename
 29165    1792       0   30957    78ed .libs/libpixman_mmx_la-pixman-mmx.o
 27677    1792       0   29469    731d .libs/libpixman_mmx_la-pixman-mmx.o

arm
  text    data     bss     dec     hex filename
 31632    1792       0   33424    8290 .libs/libpixman_iwmmxt_la-pixman-mmx.o
 30176    1792       0   31968    7ce0 .libs/libpixman_iwmmxt_la-pixman-mmx.o

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-21 12:46:02 -05:00
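Illustration for 1420834496: a portable scalar model of an unsigned four-lane "multiply high" such as pmulhuw. This is not the intrinsic itself, and mulhi_4x16 is a hypothetical name used only here.

```c
#include <stdint.h>

/* Model of an unsigned 4x16-bit multiply-high: each output lane holds
 * the upper 16 bits of the 32-bit product of the matching input lanes. */
uint64_t mulhi_4x16(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    int i;

    for (i = 0; i < 4; i++)
    {
        uint32_t x = (uint32_t)((a >> (16 * i)) & 0xffffu);
        uint32_t y = (uint32_t)((b >> (16 * i)) & 0xffffu);
        uint64_t hi = (x * y) >> 16;   /* product fits in 32 bits */

        result |= hi << (16 * i);
    }
    return result;
}
```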
Matt Turner
69ed71fad1 mmx: enable over_x888_8_8888 on ARM/iwMMXt
before: over_x888_8_8888 =  L1:   7.63  L2:   7.72  M:  6.44 ( 19.17%)  HT: 6.24  VT:  6.11  R:  5.87  RT:  4.61 (  51Kops/s)
after : over_x888_8_8888 =  L1:  11.88  L2:  11.11  M:  8.70 ( 26.01%)  HT: 8.15  VT:  8.07  R:  7.76  RT:  5.62 (  61Kops/s)

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-20 19:07:44 -05:00
Matt Turner
a14f0f66bb autoconf: use #error instead of error
We'd rather see the actual #error message than a syntax error in
config.log.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-20 18:36:24 -05:00
Matt Turner
fced5c82c2 Convert while (w) to if (w) when possible
Missed in commit 57fd8c37.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-18 17:41:10 -05:00
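Illustration for fced5c82c2: when a main loop consumes pixels two at a time, at most one pixel can remain afterwards, so a trailing while (w) can be written as if (w). The function below is a hypothetical sketch of that pattern, not code from pixman.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill `w` 32-bit pixels, two per iteration, then handle the leftover. */
void fill_pairs_sketch(uint32_t *dst, size_t w, uint32_t color)
{
    while (w >= 2)
    {
        dst[0] = color;
        dst[1] = color;
        dst += 2;
        w -= 2;
    }

    /* Here w is 0 or 1, so a loop would run at most once. */
    if (w)
        *dst = color;
}
```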
Matt Turner
e27bdcd968 Make sure to run AC_SUBST IWMMXT_CFLAGS
Allows you to compile without -flax-vector-conversions in your CFLAGS,
though -march=iwmmxt2 is still necessary since specifying some other
-march= value will override it, and disable iwmmxt.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-17 18:10:37 -05:00
Jeremy Huddleston
82a3980701 configure.ac: Add an --enable-libpng option
Now there is a way to not link against libpng even if it's available.

Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-02-16 15:22:32 -05:00
Matt Turner
46fc4eb234 Use AC_LANG_SOURCE for iwMMXt configure program
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-11 23:47:10 -05:00
Julien Cristau
b60708fb0e Upload to unstable 2012-02-09 21:16:57 +01:00
Julien Cristau
20446ebc6b Bump changelogs 2012-02-09 20:52:20 +01:00
Julien Cristau
00e59db614 pixman 0.24.4 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (GNU/Linux)
 
 iEYEABECAAYFAk8zEZkACgkQmxfmIW/3wagcTwCgjGvmVz4suHSfs+OzQWEmBDqv
 dCYAnjcm0p9EaocqWhbUV2UfGC0NMX8A
 =wOcR
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.24.4' into debian-unstable

pixman 0.24.4 release
2012-02-09 20:48:25 +01:00
Søren Sandmann Pedersen
8bff730a98 Pre-release version bump to 0.24.4 2012-02-08 19:03:22 -05:00
Søren Sandmann Pedersen
c5c866a394 Revert "Reject trapezoids where top (botttom) is above (below) the edges"
Cairo 1.10 will sometimes generate trapezoids like this, so we can't
consider them invalid. Fixes bug 45009, reported by Michael Biebl.

This reverts commit 2437ae80e5.
2012-02-08 19:01:05 -05:00
Bobby Salazar
1ceb66750c iOS Runtime Detection Support For ARM NEON
This patch adds runtime detection support for the ARM NEON fast paths
for code compiled with the iOS SDK.
2012-02-08 19:01:03 -05:00
Søren Sandmann Pedersen
e5555d7a74 Revert "Reject trapezoids where top (botttom) is above (below) the edges"
Cairo 1.10 will sometimes generate trapezoids like this, so we can't
consider them invalid. Fixes bug 45009, reported by Michael Biebl.

This reverts commit 2437ae80e5.
2012-01-31 09:10:07 -05:00
Bobby Salazar
3557787697 iOS Runtime Detection Support For ARM NEON
This patch adds runtime detection support for the ARM NEON fast paths
for code compiled with the iOS SDK.
2012-01-31 09:10:07 -05:00
Cyril Brulebois
11ddc57db9 Upload to unstable. 2012-01-19 12:23:22 +01:00
Cyril Brulebois
cbde497236 Bump changelogs. 2012-01-19 12:21:28 +01:00
Cyril Brulebois
ed216c187b Merge branch 'upstream-unstable' into debian-unstable 2012-01-19 12:20:52 +01:00
Søren Sandmann Pedersen
7ccb0c45e5 Post-release version bump to 0.24.3 2012-01-18 16:06:05 -05:00