Commit Graph

1929 Commits

Author SHA1 Message Date
Julien Cristau
105c2e8664 Bump Standards-Version to 3.9.2. 2011-06-12 16:59:43 +02:00
Julien Cristau
3bb65959ee Add changelog entry for multiarch 2011-06-12 16:58:06 +02:00
Julien Cristau
f7a60c64ac Don't ship debug symbols for the udeb 2011-06-12 16:57:28 +02:00
Julien Cristau
94b5f3b6a4 Merge branch 'multiarch' of git.debian.org:/git/pkg-xorg/lib/pixman into debian-unstable
Conflicts:
	debian/control
	debian/rules
2011-06-12 16:55:38 +02:00
Søren Sandmann
6aceb767aa demos: Comment out some unused variables 2011-05-31 18:07:34 -04:00
Søren Sandmann
4abe76432a sse2: Delete some unused variables 2011-05-31 18:07:26 -04:00
Søren Sandmann
5c60e1855b mmx: Delete some unused variables 2011-05-31 18:06:43 -04:00
Andrea Canciani
827e613338 Include noop in win32 builds 2011-05-29 10:02:21 +02:00
Nis Martensen
65b63728cc Fix a few typos in pixman-combine.c.template
Some equations have too much multiplication with alpha.
2011-05-24 10:01:37 -04:00
Søren Sandmann Pedersen
dd449a2a8e Move NOP src iterator into noop implementation.
The iterator for sources where neither RGB nor ALPHA is needed really
belongs in the noop implementation.
2011-05-19 13:46:56 +00:00
Søren Sandmann Pedersen
ba480882aa Move NULL iterator into pixman-noop.c
Iterating a NULL image returns NULL for all scanlines. We may as well
do this in the noop iterator.
2011-05-19 13:46:56 +00:00
Søren Sandmann Pedersen
a4e984de19 Add a noop src iterator
When the image is a8r8g8b8 and not transformed, and the fetched
rectangle is within the image bounds, scanlines can be fetched by
simply returning a pointer instead of copying the bits.
2011-05-19 13:46:56 +00:00
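The zero-copy idea in the commit above can be sketched as follows. This is a minimal illustration only: `toy_image_t`, `toy_iter_t` and `toy_noop_get_scanline` are hypothetical names, not pixman's real structures or functions. It assumes an untransformed a8r8g8b8 image and a fetch rectangle that lies inside the image bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified image type; not pixman's real structures. */
typedef struct {
    uint32_t *bits;   /* a8r8g8b8 pixels, one uint32_t per pixel */
    int width;
    int height;
    int stride;       /* distance between rows, in uint32_t units */
} toy_image_t;

/* Hypothetical iterator over a rectangle that starts at (x, y). */
typedef struct {
    const toy_image_t *image;
    int x;
    int y;
    int width;
} toy_iter_t;

/* Returns a pointer straight into the image for the current scanline and
 * advances to the next row. No pixel data is copied. */
static const uint32_t *toy_noop_get_scanline(toy_iter_t *iter)
{
    const toy_image_t *img = iter->image;
    const uint32_t *line;

    /* The shortcut is only valid while the rectangle is inside the image. */
    assert(iter->x >= 0 && iter->x + iter->width <= img->width);
    assert(iter->y >= 0 && iter->y < img->height);

    line = img->bits + (size_t)iter->y * (size_t)img->stride + iter->x;
    iter->y++;
    return line;
}
```

Each call hands back the address of the next row inside the image's own buffer, so a caller that only reads the scanline never pays for a copy.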
Søren Sandmann Pedersen
d4fff4a959 Move noop dest fetching to noop implementation
It will at some point become useful to have CPU specific destination
iterators. However, a problem with that is that such iterators should
not be used if we can composite directly in the destination image.

By moving the noop destination iterator to the noop implementation, we
can ensure that it will be chosen before any CPU specific iterator.
2011-05-19 13:46:50 +00:00
Søren Sandmann Pedersen
13ce88f800 Add a noop composite function for the DST operator
The DST operator doesn't actually do anything, so add a noop "fast
path" for it, instead of checking in pixman_image_composite32().

The performance tradeoff here is that we get rid of a test for DST in
the common case where the operator is not DST, in return for an extra
walk over the clip rectangles in the uncommon case where the operator
actually is DST.
2011-05-19 13:45:59 +00:00
Søren Sandmann Pedersen
8c76235f41 Add a "noop" implementation.
This new implementation is ahead of all other implementations in the
fallback chain and is supposed to contain operations that are "noops",
i.e., they don't require any work. For example, it might contain a
"fast path" for the DST operator that doesn't actually do anything or
an iterator for a8r8g8b8 that just returns a pointer into the image.
2011-05-19 13:45:59 +00:00
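A rough sketch of what "ahead of all other implementations in the fallback chain" means, assuming a chain of only two entries. All names here (`toy_impl_t`, `toy_choose`, the `handles` callback) are hypothetical and much simpler than pixman's real implementation structures: each implementation is asked in turn, and the first one that claims the operation wins.

```c
#include <stddef.h>

typedef enum { TOY_OP_SRC, TOY_OP_OVER, TOY_OP_DST } toy_op_t;

/* Hypothetical implementation record; not pixman's real type. */
typedef struct toy_impl toy_impl_t;
struct toy_impl {
    const char       *name;
    const toy_impl_t *delegate;              /* next implementation to try */
    int             (*handles)(toy_op_t op); /* non-zero if op is supported */
};

/* The noop implementation only claims operations that need no work. */
static int toy_noop_handles(toy_op_t op)    { return op == TOY_OP_DST; }

/* The general implementation accepts everything. */
static int toy_general_handles(toy_op_t op) { (void)op; return 1; }

static const toy_impl_t toy_general = { "general", NULL, toy_general_handles };
static const toy_impl_t toy_noop    = { "noop", &toy_general, toy_noop_handles };

/* Walks the chain from its head and returns the first implementation that
 * handles the operation, or NULL if none does. */
static const toy_impl_t *toy_choose(const toy_impl_t *head, toy_op_t op)
{
    const toy_impl_t *impl;

    for (impl = head; impl != NULL; impl = impl->delegate)
        if (impl->handles(op))
            return impl;
    return NULL;
}
```

With `toy_noop` at the head, a DST composite is claimed before any other implementation is consulted, while every other operator falls through to the delegate.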
Andrea Canciani
0f6a4d4588 test: Fix compilation on win32
MSVC complains about uint32_t being used as an expression:

composite.c(902) : error C2275: 'uint32_t' : illegal use of this type
as an expression
2011-05-17 00:29:55 +02:00
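The log does not show the actual change to composite.c. A common trigger for C2275 when MSVC compiles C code is a variable declared after a statement, which C89 does not allow. Assuming that was the cause here, the usual fix is to move declarations to the top of the block; `toy_sum` below is a hypothetical function written in that style.

```c
#include <stddef.h>
#include <stdint.h>

/* C89-friendly layout: every declaration precedes the first statement of
 * its block, so a strict C89 compiler never sees a type name where it
 * expects an expression. */
static uint32_t toy_sum(const uint32_t *values, size_t n)
{
    uint32_t total;   /* declarations first ... */
    size_t i;

    total = 0;        /* ... statements only after them */
    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}
```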
Dave Yeo
838c2b593e Check for working mmap()
OS/2 doesn't have a working mmap().
2011-05-09 12:38:44 +02:00
Søren Sandmann Pedersen
c53625a36e Post-release version bump to 0.23.1 2011-05-02 05:11:49 -04:00
Søren Sandmann Pedersen
918a544406 Pre-release version bump to 0.22.0 2011-05-02 05:06:33 -04:00
Cyril Brulebois
2296b15c9d Upload to unstable. 2011-04-29 17:53:20 +02:00
Cyril Brulebois
c48a9b8035 Mention endianness-related FTBFS fix (Closes: #622211). 2011-04-29 17:53:09 +02:00
Cyril Brulebois
fa956ebd6b Bump changelogs. 2011-04-29 17:52:36 +02:00
Cyril Brulebois
d06147d984 Merge branch 'upstream-unstable' into debian-unstable 2011-04-29 17:51:32 +02:00
Søren Sandmann Pedersen
71b2e2745b Post-release version bump to 0.21.9 2011-04-19 00:22:29 -04:00
Søren Sandmann Pedersen
89868e93bd Pre-release version bump to 0.21.8 2011-04-19 00:00:37 -04:00
Taekyun Kim
33f1652b95 ARM: Enable bilinear fast paths using scanline functions in pixman-arm-neon-asm-bilinear.S
Enable the fast paths that are supported by the scanline functions in
pixman-arm-neon-asm-bilinear.S
2011-04-18 16:49:46 -04:00
Taekyun Kim
e8185f1cb4 ARM: NEON scanline functions for bilinear scaling
General bilinear scanline functions based on fetch->combine->store.
They need further optimization and will eventually be replaced with optimal
functions one by one.
General functions should be located in pixman-arm-neon-asm-bilinear.S and
optimal functions in pixman-arm-neon-asm.S.

The following general bilinear scanline functions are implemented:
    over_8888_8888
    add_8888_8888
    src_8888_8_8888
    src_8888_8_0565
    src_0565_8_x888
    src_0565_8_0565
    over_8888_8_8888
    add_8888_8_8888
2011-04-18 16:49:43 -04:00
Taekyun Kim
00939d3562 ARM: Common macro for scaled bilinear scanline function with A8 mask
Define the PIXMAN_ARM_BIND_SCALED_BILINEAR_SRC_A8_DST macro for declaring
scaled bilinear scanline functions in the common header.
2011-04-18 16:49:40 -04:00
Søren Sandmann Pedersen
b455496890 Offset rendering in pixman_composite_trapezoids() by (x_dst, y_dst)
Previously, this function would do coordinate calculations in such a
way that (x_dst, y_dst) would only affect the alignment of the source
image, but not of the traps, which would always be considered to be in
absolute destination coordinates. This is unlike the
pixman_image_composite() function which also registers the mask to the
destination.

This patch makes it so that traps are also offset by (x_dst, y_dst).

Also add a comment explaining how this function is supposed to
operate, and update tri-test.c and composite-trap-test.c to deal with
the new semantics.
2011-04-18 16:27:29 -04:00
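The new semantics can be pictured as adding the destination offset to every trapezoid coordinate before rasterizing. The sketch below is an assumption-laden illustration, not the patch itself: `toy_fixed_t` models a 16.16 fixed-point value in the spirit of `pixman_fixed_t`, and `toy_point_t`, `toy_int_to_fixed` and `toy_offset_point` are hypothetical names.

```c
#include <stdint.h>

/* 16.16 fixed point, modeled on (but not identical to) pixman_fixed_t. */
typedef int32_t toy_fixed_t;

typedef struct {
    toy_fixed_t x;
    toy_fixed_t y;
} toy_point_t;

/* Converts an integer pixel offset to 16.16 fixed point. Multiplication is
 * used instead of a left shift so that negative offsets stay well defined. */
static toy_fixed_t toy_int_to_fixed(int i)
{
    return (toy_fixed_t)(i * 65536);
}

/* Moves one trapezoid vertex by the integer destination offset. */
static toy_point_t toy_offset_point(toy_point_t p, int x_dst, int y_dst)
{
    p.x += toy_int_to_fixed(x_dst);
    p.y += toy_int_to_fixed(y_dst);
    return p;
}
```

Applying the same offset to every vertex of every trapezoid keeps the traps registered to the destination in the same way that the mask is in `pixman_image_composite()`.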
Søren Sandmann Pedersen
e75e6a4ef5 ARM: Add 'neon_composite_over_n_8888_0565_ca' fast path
This improves the performance of the firefox-talos-gfx benchmark with
the image16 backend. Benchmark on an 800 MHz ARM Cortex A8:

Before:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]  image16            firefox-talos-gfx  121.773  122.218   0.15%    6/6

After:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]  image16            firefox-talos-gfx   85.247   85.563   0.22%    6/6

V2: Slightly better instruction scheduling based on comments from Taekyun Kim.
V3: Eliminate all stalls from the inner loop. Also based on comments from Taekyun Kim.
2011-04-18 16:25:36 -04:00
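To put the two median timings above in relative terms, the arithmetic can be written out as below (helper names are illustrative). For 122.218 s before and 85.563 s after, the speedup is roughly 1.43x, which is about 30% less wall-clock time.

```c
/* Ratio of the old run time to the new one; values above 1 mean faster. */
static double toy_speedup(double before_s, double after_s)
{
    return before_s / after_s;
}

/* Share of the original run time that was saved, in percent. */
static double toy_time_saved_percent(double before_s, double after_s)
{
    return 100.0 * (before_s - after_s) / before_s;
}
```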
Gilles Espinasse
1670b95214 Fix OpenMP not supported case
PIXMAN_LINK_WITH_ENV did not fail unless -Wall -Werror was used.
So even when the compiler did not support OpenMP, USE_OPENMP was defined.
Fix that by running the second OpenMP test only when the first AC_OPENMP check finds support.

configure was tested in these cases:
gcc without libgomp support: no openmp option, --enable-openmp and --disable-openmp
gcc with libgomp support: no openmp option, --enable-openmp and --disable-openmp

Not tested with autoconf versions that do not know about OpenMP (<2.62).

Warn when --enable-openmp is requested but no support is found.

Signed-off-by: Gilles Espinasse <g.esp@free.fr>
2011-04-18 16:13:58 -04:00
Gilles Espinasse
b9e8f7fb74 Fix missing AC_MSG_RESULT value from Werror test
Use the correct variable name

Signed-off-by: Gilles Espinasse <g.esp@free.fr>
2011-04-18 16:13:58 -04:00
Siarhei Siamashka
caae4e82ff ARM: pipelined NEON implementation of bilinear scaled 'src_8888_0565'
Benchmark on ARM Cortex-A8 r1p3 @600MHz, 32-bit LPDDR @166MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=10020565, speed=33.59 MPix/s
  after:  op=1, src=20028888, dst=10020565, speed=46.25 MPix/s

Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=10020565, speed=63.86 MPix/s
  after:  op=1, src=20028888, dst=10020565, speed=84.22 MPix/s
2011-04-11 10:48:35 +03:00
Siarhei Siamashka
d080d59b80 ARM: pipelined NEON implementation of bilinear scaled 'src_8888_8888'
Performance of the inner loop when working with the data in L1 cache:
    ARM Cortex-A8: 41 cycles per 4 pixels (no stalls and partial dual issue)
    ARM Cortex-A9: 48 cycles per 4 pixels (no stalls)

It might still be possible to improve performance even more on ARM Cortex-A8
with a better use of dual issue.

Benchmark on ARM Cortex-A8 r1p3 @600MHz, 32-bit LPDDR @166MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=40.38 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=48.47 MPix/s

Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=79.68 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=93.11 MPix/s
2011-04-11 10:48:30 +03:00
Siarhei Siamashka
b496a8b279 ARM: support different levels of loop unrolling in bilinear scaler
Now an extra 'flag' parameter is supported in the bilinear scanline scaling
function generation macro. It can be used to enable unrolling to 4 or 8 pixels per
loop iteration and to provide save/restore code for the d8-d15
registers.
2011-04-11 10:48:24 +03:00
Siarhei Siamashka
34ca9cf03f ARM: use fewer ARM instructions in NEON bilinear scaling code
This reduces code size and also puts less pressure on the
instruction decoder.
2011-04-11 10:48:14 +03:00
Siarhei Siamashka
0f7be9f72e ARM: support for software pipelining in bilinear macros
Now it's possible to override the main loop of bilinear scaling code
with optimized pipelined implementation.
2011-04-11 10:48:10 +03:00
Siarhei Siamashka
9638af9583 ARM: use aligned memory writes in NEON bilinear scaling code 2011-04-11 10:48:05 +03:00
Siarhei Siamashka
8bba3a0e1e ARM: tweaked horizontal weights update in NEON bilinear scaling code
Moving the horizontal interpolation weight update instructions from the
beginning of the loop to its end hides some pipeline stalls and
improves performance.
2011-04-11 10:48:01 +03:00
Cyril Brulebois
eade7b4dbd Upload to unstable. 2011-04-10 23:08:45 +02:00
Søren Sandmann Pedersen
a215322267 ARM: Tiny improvement in over_n_8888_8888_ca_process_pixblock_head
Instead of two

	mvn d24, d24
	mvn d25, d25

use just one

	mvn q12, q12

Also move another vmvn instruction into the created pipeline bubble,
as pointed out by Siarhei.
2011-04-06 23:03:19 -04:00
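The replacement works because, in 32-bit ARM NEON, each 128-bit register qN aliases the pair of 64-bit registers d(2N) and d(2N+1), so q12 covers d24 and d25. The plain C model below only illustrates that equivalence; `toy_q_t`, `toy_mvn_d` and `toy_mvn_q` are hypothetical names, not NEON intrinsics.

```c
#include <stdint.h>

/* Toy model of one 128-bit q register as its two 64-bit halves.
 * For q12, d[0] stands for d24 and d[1] stands for d25. */
typedef struct {
    uint64_t d[2];
} toy_q_t;

/* Models a bitwise NOT on a single d register. */
static void toy_mvn_d(uint64_t *d)
{
    *d = ~*d;
}

/* Models a bitwise NOT on the whole q register: both halves at once. */
static void toy_mvn_q(toy_q_t *q)
{
    q->d[0] = ~q->d[0];
    q->d[1] = ~q->d[1];
}
```

Inverting both halves separately and inverting the full register give the same bits, so the second form saves one instruction.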
Søren Sandmann Pedersen
44f99735d9 Makefile.am: Put development releases in "snapshots" directory
Up until now, all pixman releases, both snapshots and stable releases, were
uploaded to the "releases" directory on www.cairographics.org, but
it's better to put development snapshots in the "snapshots" directory.

This patch changes Makefile.am to do that.
2011-04-06 23:03:10 -04:00
Steve Langasek
c6ce22e73a build for multiarch 2011-03-26 00:30:06 -07:00
Søren Sandmann Pedersen
ad3cbfb073 test: Fix infinite loop in composite
When run in PIXMAN_RANDOMIZE_TESTS mode, this test would go into an
infinite loop because the loop started at 'seed' but the stop
condition was still N_TESTS.
2011-03-22 13:43:29 -04:00
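The original loop is not shown in this log, so the following is only one way to avoid that class of bug: count iterations with a separate counter and derive each test's seed from the starting seed, so the stop condition no longer depends on where the seed started. `toy_run_tests` and its callback are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Runs exactly n_tests tests, numbered seed, seed + 1, ... and returns how
 * many were run. The loop counter is independent of the starting seed. */
static uint32_t toy_run_tests(uint32_t seed, uint32_t n_tests,
                              void (*run_one)(uint32_t test_seed))
{
    uint32_t i;

    for (i = 0; i < n_tests; i++) {
        if (run_one != NULL)
            run_one(seed + i);   /* unsigned wrap-around is well defined */
    }
    return i;
}
```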
Alexandros Frantzis
b514e63cfc Add support for the r8g8b8a8 and r8g8b8x8 formats to the tests. 2011-03-22 13:43:29 -04:00
Alexandros Frantzis
f05a90e5f8 Add simple support for the r8g8b8a8 and r8g8b8x8 formats.
These formats are particularly useful on big-endian architectures, where RGBA in
memory/file order corresponds to r8g8b8a8 as a uint32_t. This is important
because RGBA is in some cases the only available choice (for example as a pixel
format in OpenGL ES 2.0).
2011-03-22 13:43:29 -04:00
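The byte-order argument can be checked with a small sketch. The helper names are hypothetical; the packing puts red in the most significant byte and alpha in the least significant one, as the format name r8g8b8a8 suggests, and the store function writes the value the way a big-endian machine would lay it out in memory.

```c
#include <stdint.h>

/* Packs four 8-bit channels into one r8g8b8a8 pixel value. */
static uint32_t toy_pack_r8g8b8a8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* Writes the pixel in big-endian byte order: most significant byte first. */
static void toy_store_big_endian(uint32_t pixel, uint8_t out[4])
{
    out[0] = (uint8_t)(pixel >> 24);
    out[1] = (uint8_t)(pixel >> 16);
    out[2] = (uint8_t)(pixel >> 8);
    out[3] = (uint8_t)pixel;
}
```

The stored bytes come out in R, G, B, A order, which is the memory layout the commit message describes for big-endian systems.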
Søren Sandmann Pedersen
7eb0abb5e8 test: Randomize some tests if PIXMAN_RANDOMIZE_TESTS is set
This patch makes it so that composite and stress-test will start from a
random seed if the PIXMAN_RANDOMIZE_TESTS environment variable is
set. Running the test suite in this mode is useful to get more test
coverage.

Also, in stress-test.c make it so that setting the initial seed causes
threads to be turned off. This makes it much easier to see when
something fails.
2011-03-19 08:51:35 -04:00
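A minimal sketch of the seed selection described above, under the assumption that only the presence of the variable matters. `toy_pick_seed` is a hypothetical helper, and the default and random seeds are passed in by the caller rather than taken from the test suite.

```c
#include <stddef.h>
#include <stdint.h>

/* env_value is the result of looking up PIXMAN_RANDOMIZE_TESTS, or NULL
 * when the variable is not set. A set variable selects the random seed. */
static uint32_t toy_pick_seed(const char *env_value,
                              uint32_t default_seed,
                              uint32_t random_seed)
{
    return env_value != NULL ? random_seed : default_seed;
}
```

A caller could pass `getenv("PIXMAN_RANDOMIZE_TESTS")` as the first argument and a time-derived value as the random seed; printing the chosen seed makes a failing run reproducible.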
Søren Sandmann Pedersen
6b27768d81 Simplify the prototype for iterator initializers.
All of the information previously passed to the iterator initializers
is now available in the iterator itself, so there is no need to pass
it as arguments anymore.
2011-03-18 16:23:10 -04:00
Søren Sandmann Pedersen
74d0f44b6d Fill out parts of iters in _pixman_implementation_{src,dest}_iter_init()
This makes _pixman_implementation_{src,dest}_iter_init() responsible
for filling parts of the information in the iterators. Specifically,
the information passed as arguments is stored in the iterator.

Also add a height field to pixman_iter_t.
2011-03-18 16:23:10 -04:00
Søren Sandmann Pedersen
be4eaa0e4f In delegate_{src,dest}_iter_init() call delegate directly.
There is no reason to go through
_pixman_implementation_{src,dest}_iter_init(), especially since
_pixman_implementation_src_iter_init() is doing various other checks
that only need to be done once.

Also call delegate->src_iter_init() directly in pixman-sse2.c
2011-03-18 16:23:10 -04:00