Commit Graph

1929 Commits

Author SHA1 Message Date
Søren Sandmann Pedersen
79e69aac8c Add support for triangles to pixman.
The Render X extension can draw triangles as well as trapezoids, but
the implementation has always converted them to trapezoids. This patch
moves the X server's triangle conversion code into pixman, where we
can reuse the pixman_composite_trapezoids() code.
2011-02-15 09:25:18 -05:00
Søren Sandmann Pedersen
4e6dd4928d Add a test program for pixman_composite_trapezoids().
A CRC32 based test program to check that pixman_composite_trapezoids()
actually works.
2011-02-15 09:25:18 -05:00
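
The CRC32-based check described in 4e6dd4928d can be sketched as follows: render into a buffer, checksum its bytes, and compare the result against a known-good value. This is only an illustration. `sketch_crc32` is a hypothetical helper, not a pixman function, and it uses the standard reflected CRC-32 polynomial; the checksum used by pixman's own test utilities may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bitwise CRC-32 with the reflected polynomial
 * 0xEDB88320 (the variant used by zlib and PNG). Pass 0 as the initial
 * crc; feed the previous result back in to checksum data in chunks. */
uint32_t
sketch_crc32 (uint32_t crc, const void *data, size_t len)
{
    const uint8_t *p = data;
    size_t i;
    int bit;

    assert (data != NULL || len == 0);

    crc = ~crc;
    for (i = 0; i < len; i++)
    {
        crc ^= p[i];
        for (bit = 0; bit < 8; bit++)
        {
            /* If the low bit is set, shift and fold in the polynomial. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}
```

A test built this way would call the compositing function on deterministic input, pass the destination pixels to the checksum, and fail if the value differs from the one recorded for a known-good build.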
Søren Sandmann Pedersen
803272e38c Add pixman_composite_trapezoids().
This function is an implementation of the X server request
Trapezoids. That request is what the X backend of cairo is using all
the time; by moving it into pixman we can hopefully make it faster.
2011-02-15 09:25:18 -05:00
Søren Sandmann Pedersen
1feaf6bea7 test/Makefile.am: Move all the TEST_LDADD into a new global LDADD.
This gets rid of a bunch of replicated *_LDADD clauses
2011-02-15 09:25:17 -05:00
Søren Sandmann Pedersen
1237fd9bc8 Add @TESTPROGS_EXTRA_LDFLAGS@ to AM_LDFLAGS
Instead of explicitly adding it to each test program.
2011-02-15 09:25:17 -05:00
Søren Sandmann Pedersen
7dfe845786 Move all the GTK+ based test programs to a new subdir, "demos"
This separates the test suite from the random gtk+ using test
programs. "demos" is somewhat misleading because the programs there
are not particularly exciting (with the possible exception of
composite-test which shows off all the compositing operators).
2011-02-15 09:25:17 -05:00
Siarhei Siamashka
8e4100260b SSE2 optimization for nearest scaled over_8888_n_8888
This operation shows up a little bit in some of the html5 based
games from http://www.kesiev.com/akihabara/

=== Cairo trace of the game intro animation for 'Legend of Sadness' ===

before:
[  0]    image    firefox-legend-of-sadness   46.286   46.298   0.01%    5/6

after:
[  0]    image    firefox-legend-of-sadness   45.088   45.102   0.04%    6/6

=== Microbenchmark (scaling ~2000x~2000 -> ~2000x~2000) ===

before:
    translucent: op=3, src=8888, mask=s dst=8888, speed=131.30 MPix/s
    transparent: op=3, src=8888, mask=s dst=8888, speed=132.38 MPix/s
    opaque:      op=3, src=8888, mask=s dst=8888, speed=167.90 MPix/s
after:
    translucent: op=3, src=8888, mask=s dst=8888, speed=301.93 MPix/s
    transparent: op=3, src=8888, mask=s dst=8888, speed=770.70 MPix/s
    opaque:      op=3, src=8888, mask=s dst=8888, speed=301.80 MPix/s
2011-02-15 14:32:41 +02:00
Siarhei Siamashka
39b86b032d ARM: NEON optimization for nearest scaled over_0565_8_0565
This may be used in some cases for HTML5 video when hardware
acceleration is not available.
2011-02-15 14:32:34 +02:00
Siarhei Siamashka
9a90c1c90f ARM: NEON optimization for nearest scaled over_8888_8_0565
This may be used in some cases for HTML5 video when hardware
acceleration is not available.
2011-02-15 14:32:28 +02:00
Siarhei Siamashka
cd1062ded4 ARM: new macro template for using scaled fast paths with a8 mask
2011-02-15 14:32:23 +02:00
Siarhei Siamashka
b099957887 Better support for NONE repeat in nearest scaling main loop template
The scaling function now gets an extra boolean argument, which is set
to TRUE when we are fetching padding pixels for NONE repeat. This makes
it possible to decide whether to interpret alpha as 0xFF or 0x00 for
such pixels when working with formats that have no alpha channel (for
example x8r8g8b8 and r5g6b5).
2011-02-15 14:32:16 +02:00
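
The extra boolean argument from b099957887 can be pictured with a small sketch. It assumes an x8r8g8b8 source and 16.16 fixed-point coordinates; the function name and the `is_padding` parameter name are hypothetical, and the real template in pixman is macro-generated and more general.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical nearest-neighbour scanline fetch for an x8r8g8b8 source.
 * vx and unit_x are 16.16 fixed-point source coordinates. When
 * is_padding is true the scanline lies outside the source image (NONE
 * repeat), so src is not read at all. */
void
sketch_scaled_nearest_x888 (uint32_t       *dst,
                            const uint32_t *src,
                            int             width,
                            int32_t         vx,
                            int32_t         unit_x,
                            bool            is_padding)
{
    int i;

    assert (width >= 0);
    assert (is_padding || vx >= 0);

    for (i = 0; i < width; i++)
    {
        if (is_padding)
        {
            dst[i] = 0x00000000u;                  /* alpha read as 0x00 */
        }
        else
        {
            dst[i] = src[vx >> 16] | 0xFF000000u;  /* alpha read as 0xFF */
            vx += unit_x;
        }
    }
}
```

Without the flag, a format with no alpha channel gives the loop no way to tell an opaque pixel inside the image from a transparent padding pixel outside it.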
Siarhei Siamashka
14f82083a1 Support for a8 and solid mask in nearest scaling main loop template
In addition to the most common case of not having any mask at all, two
variants of scaling with mask show up in cairo traces:
1. non-scaled a8 mask with SAMPLES_COVER_CLIP flag
2. solid mask

This patch extends the nearest scaling main loop template to also
support these cases.
2011-02-15 14:32:06 +02:00
Siarhei Siamashka
e83cee5aac test: Extend scaling-test to support a8/solid mask and ADD operation
The image width has also been increased because SIMD optimizations typically
do more unrolling in the inner loops, and this needs to be tested.
2011-02-15 14:32:01 +02:00
Siarhei Siamashka
97447f440f Use const modifiers for source buffers in nearest scaling fast paths
2011-02-15 14:29:54 +02:00
Siarhei Siamashka
8d359b00c5 C fast paths for a simple 90/270 degrees rotation
Depending on the CPU architecture, performance is 1.5 to 4 times slower
than a simple non-rotated copy (which would be the ideal case, perfectly
utilizing memory bandwidth), but still more than 7 times faster than
the general path.

This implementation sets a performance baseline for rotation. The use
of SIMD instructions may further improve memory bandwidth utilization.
2011-02-10 16:18:01 +02:00
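
As an illustration of what a 90 degree rotation path in 8d359b00c5 has to compute, here is a naive sketch of the pixel mapping for 32 bpp data. The function name is hypothetical and the loop order is the simplest possible one; the committed code may order its memory accesses differently to make better use of the cache.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate a w x h block of 32 bpp pixels by 90
 * degrees clockwise. The result is h pixels wide and w pixels tall.
 * Strides are in pixels, not bytes. */
void
sketch_rotate_90_8888 (uint32_t       *dst,
                       int             dst_stride,
                       const uint32_t *src,
                       int             src_stride,
                       int             w,
                       int             h)
{
    int x, y;

    assert (w >= 0 && h >= 0);
    assert (dst_stride >= h && src_stride >= w);

    for (y = 0; y < h; y++)
    {
        for (x = 0; x < w; x++)
        {
            /* Source (x, y) lands in destination column h - 1 - y, row x. */
            dst[x * dst_stride + (h - 1 - y)] = src[y * src_stride + x];
        }
    }
}
```

The source is read row by row while the destination is written column by column, so one side always strides through memory. That access pattern is why a rotated copy cannot match a plain copy in bandwidth.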
Siarhei Siamashka
e0c7948c97 New flags for 90/180/270 rotation
These flags are set when the transform is a simple nonscaled 90/180/270
degrees rotation.
2011-02-10 16:17:24 +02:00
Siarhei Siamashka
3b68c295fd test: affine-test updated to stress 90/180/270 degrees rotation more
2011-02-10 16:17:18 +02:00
Søren Sandmann Pedersen
56f173f0af Add pixman-conical-gradient.c to Makefile.win32.
Pointed out by Kirill Tishin.
2011-02-10 05:21:42 -05:00
Cyril Brulebois
fc1b85f258 Upload to unstable.
2011-02-06 05:31:27 +01:00
Cyril Brulebois
84bb9a7605 Mention upstream git URL in a comment.
2011-02-06 05:30:48 +01:00
Søren Sandmann Pedersen
7fd4897730 Add SSE2 fetcher for 0565
Before:

add_0565_0565 = L1:  61.08  L2:  61.03  M: 60.57 ( 10.95%)  HT: 46.85  VT: 45.25  R: 39.99  RT: 20.41 ( 233Kops/s)

After:

add_0565_0565 = L1:  77.84  L2:  76.25  M: 75.38 ( 13.71%)  HT: 55.99  VT: 54.56  R: 45.41  RT: 21.95 ( 255Kops/s)
2011-02-03 03:25:05 -05:00
Søren Sandmann Pedersen
8414aa76c2 Improve performance of sse2_combine_over_u()
Split this function into two, one that has a mask, and one that
doesn't. This is a fairly substantial speed-up in many cases.

New output of lowlevel-blt-bench over_x888_8_0565:

over_x888_8_0565 =  L1:  63.76  L2:  62.75  M: 59.37 ( 21.55%)  HT: 45.89  VT: 43.55  R: 34.51  RT: 16.80 ( 201Kops/s)
2011-02-03 03:25:05 -05:00
Søren Sandmann Pedersen
08e855f15c Add SSE2 fetcher for a8
New output of lowlevel-blt-bench over_x888_8_0565:

over_x888_8_0565 =  L1:  57.85  L2:  56.80  M: 54.14 ( 19.50%)  HT: 42.64  VT: 40.56  R: 32.67  RT: 16.22 ( 195Kops/s)

Based in part on code by Steve Snyder from

    https://bugs.freedesktop.org/show_bug.cgi?id=21173
2011-02-03 03:25:05 -05:00
Søren Sandmann Pedersen
2b6b0cf359 Add SSE2 fetcher for x8r8g8b8
New output of lowlevel-blt-bench over_x888_8_0565:

over_x888_8_0565 =  L1:  55.68  L2:  55.11  M: 52.83 ( 19.04%)  HT: 39.62  VT: 37.70  R: 30.88  RT: 14.62 ( 174Kops/s)

The fetcher is looked up in a table, so that other fetchers can easily
be added.

See also https://bugs.freedesktop.org/show_bug.cgi?id=20709
2011-02-03 03:24:47 -05:00
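
The table lookup mentioned in 2b6b0cf359 might look roughly like the sketch below. It is plain C rather than SSE2, and every identifier in it (the format enum, the fetcher type, the table) is hypothetical; only the idea of mapping a format to a fetch function through a table comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum
{
    SKETCH_FORMAT_X8R8G8B8,
    SKETCH_FORMAT_A8,
    SKETCH_FORMAT_R5G6B5
} sketch_format_t;

/* A fetcher expands `width` source pixels into a8r8g8b8. */
typedef void (*sketch_fetch_t) (uint32_t *dst, const void *src, int width);

static void
sketch_fetch_x8r8g8b8 (uint32_t *dst, const void *src, int width)
{
    const uint32_t *s = src;
    int i;

    assert (width >= 0);
    for (i = 0; i < width; i++)
        dst[i] = s[i] | 0xFF000000u;       /* no alpha channel: opaque */
}

static void
sketch_fetch_a8 (uint32_t *dst, const void *src, int width)
{
    const uint8_t *s = src;
    int i;

    assert (width >= 0);
    for (i = 0; i < width; i++)
        dst[i] = (uint32_t) s[i] << 24;    /* alpha only, RGB is zero */
}

/* Supporting another format means adding one more row here. */
static const struct
{
    sketch_format_t format;
    sketch_fetch_t  fetch;
} sketch_fetchers[] =
{
    { SKETCH_FORMAT_X8R8G8B8, sketch_fetch_x8r8g8b8 },
    { SKETCH_FORMAT_A8,       sketch_fetch_a8 },
};

/* Returns NULL when no specialised fetcher exists, so that the caller
 * can fall back to a generic path. */
sketch_fetch_t
sketch_lookup_fetcher (sketch_format_t format)
{
    size_t i;

    for (i = 0; i < sizeof sketch_fetchers / sizeof sketch_fetchers[0]; i++)
    {
        if (sketch_fetchers[i].format == format)
            return sketch_fetchers[i].fetch;
    }
    return NULL;
}
```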
Søren Sandmann Pedersen
13aed37758 Add a test for over_x888_8_0565 in lowlevel_blt_bench().
The next few commits will speed this up quite a bit.

Current output:

---
reference memcpy speed = 2217.5MB/s (554.4MP/s for 32bpp fills)
---
over_x888_8_0565 =  L1:  54.67  L2:  54.01  M: 52.33 ( 18.88%)  HT: 37.19  VT: 35.54  R: 29.40  RT: 13.63 ( 162Kops/s)
2011-01-28 14:35:17 -05:00
Søren Sandmann Pedersen
2de397c272 Move fallback decisions from implementations into pixman-cpu.c.
Instead of having each individual implementation decide which fallback
to use, move it into pixman-cpu.c, where a more global decision can be
made.

This is accomplished by adding a "fallback" argument to all the
pixman_implementation_create_*() implementations, and then in
_pixman_choose_implementation() pass in the desired fallback.
2011-01-26 17:07:35 -05:00
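
The restructuring in 2de397c272 can be sketched as a chain of implementations in which each constructor receives its fallback from the caller. All names below are hypothetical stand-ins for the pixman_implementation_create_*() family and _pixman_choose_implementation(); the real functions take and do considerably more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct sketch_impl sketch_impl_t;

struct sketch_impl
{
    const char    *name;
    sketch_impl_t *fallback;   /* next implementation to try, or NULL */
};

/* Stand-in for a create function: the fallback is passed in rather
 * than chosen by the implementation itself. */
sketch_impl_t *
sketch_impl_create (sketch_impl_t *fallback, const char *name)
{
    sketch_impl_t *imp = malloc (sizeof *imp);

    assert (imp != NULL);
    imp->name = name;
    imp->fallback = fallback;
    return imp;
}

/* The one place that decides the order of the chain. */
sketch_impl_t *
sketch_choose_implementation (bool have_sse2)
{
    sketch_impl_t *imp = sketch_impl_create (NULL, "general");

    imp = sketch_impl_create (imp, "fast-path");
    if (have_sse2)
        imp = sketch_impl_create (imp, "sse2");
    return imp;
}

int
sketch_chain_length (const sketch_impl_t *imp)
{
    int n = 0;

    for (; imp != NULL; imp = imp->fallback)
        n++;
    return n;
}

void
sketch_impl_destroy (sketch_impl_t *imp)
{
    while (imp != NULL)
    {
        sketch_impl_t *next = imp->fallback;

        free (imp);
        imp = next;
    }
}
```

Because the chain is assembled in a single function, reordering it or skipping a layer for a particular CPU is a local change there and needs no edits in the individual implementations.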
Søren Sandmann Pedersen
ed781df1cc Print a warning when a development snapshot is being configured.
It seems to be relatively common for people to use development
snapshots of pixman thinking they are ordinary releases. With this
patch, configure prints a banner explaining the version number scheme,
plus information about where to report bugs, whenever the current
minor version is odd.
2011-01-26 17:07:35 -05:00
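
The odd-minor rule from ed781df1cc is simple enough to state as code. The actual check lives in the configure script; the C function below is a hypothetical equivalent, assuming a "major.minor.micro" version string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: returns true when a "major.minor.micro" version
 * string has an odd minor number, which marks a development snapshot.
 * Strings that do not parse are treated as releases. */
bool
sketch_is_development_snapshot (const char *version)
{
    int major, minor, micro;

    assert (version != NULL);

    if (sscanf (version, "%d.%d.%d", &major, &minor, &micro) != 3)
        return false;

    return minor % 2 != 0;
}
```

Under this rule 0.21.5 is a development snapshot, while a version with an even minor number such as 0.22.0 would be an ordinary release.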
Rolland Dudemaine
fead9eb82a Fix "variable was set but never used" warnings
Removes useless variable declarations. This can only result in more
efficient code, as these variables were sometimes assigned, but
their values were never used.
2011-01-26 15:05:24 +02:00
Rolland Dudemaine
32e556df33 test: Use the right enum types instead of int to fix warnings
The Green Hills Software MULTI compiler was producing a number
of warnings due to uses of int where the corresponding pixman_*_t
type was expected.
2011-01-26 15:05:18 +02:00
Rolland Dudemaine
b61ec0a686 Correct the initialization of 'max_vx'
http://lists.freedesktop.org/archives/pixman/2011-January/000937.html
2011-01-25 14:55:24 +02:00
Rolland Dudemaine
e8a1b1c4e5 test: Fix for mismatched 'fence_malloc' prototype/implementation
Solves a compilation problem when 'mprotect' is not available, for
example when using the Green Hills Software MULTI compiler or mingw:
http://lists.freedesktop.org/archives/pixman/2011-January/000939.html
2011-01-25 14:34:56 +02:00
Siarhei Siamashka
a8e4677ecc The code in 'bitmap_addrect' already assumes non-null 'reg->data'
So the check of 'reg->data' pointer can be safely removed.
2011-01-20 02:14:07 +02:00
Cyril Brulebois
8aeb637bb5 Upload to experimental.
2011-01-19 20:31:42 +01:00
Cyril Brulebois
461dacfb5e Update debian/copyright from upstream's COPYING.
2011-01-19 20:25:41 +01:00
Cyril Brulebois
e581626827 Bump changelogs.
2011-01-19 20:24:49 +01:00
Cyril Brulebois
f5216c99bc Merge branch 'upstream-experimental' into debian-experimental
2011-01-19 20:23:47 +01:00
Søren Sandmann Pedersen
a6a04c07c3 Post-release version bump to 0.21.5
2011-01-19 07:47:52 -05:00
Søren Sandmann Pedersen
4e56cec564 Pre-release version bump to 0.21.4
2011-01-19 07:38:24 -05:00
Søren Sandmann Pedersen
1d7195dd6c Fix dangling-pointer bug in bits_image_fetch_bilinear_no_repeat_8888().
The mask_bits variable is only declared in a limited scope, so the
pointer to it becomes invalid instantly. Somehow this didn't actually
trigger any bugs, but Brent Fulgham reported that Bounds Checker was
complaining about it.

Fix the bug by moving mask_bits to the function scope.
2011-01-19 07:22:42 -05:00
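
The scoping problem fixed in 1d7195dd6c follows a common pattern, sketched here in isolation. The function is hypothetical and much simpler than bits_image_fetch_bilinear_no_repeat_8888(); only the variable name mask_bits and the idea of moving it to function scope are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts the pixels whose mask value is non-zero.
 * A NULL mask means "no mask" and is treated as all-ones. */
int
sketch_count_unmasked (const uint32_t *mask, int width)
{
    uint32_t mask_bits = 1;   /* function scope: outlives the loop below */
    int mask_inc = 1;
    int count = 0;
    int i;

    assert (width >= 0);

    if (!mask)
    {
        /* The buggy pattern declares mask_bits inside this block, so
         * `mask` dangles as soon as the block ends. */
        mask = &mask_bits;
        mask_inc = 0;
    }

    for (i = 0; i < width; i++)
    {
        if (*mask)
            count++;
        mask += mask_inc;
    }
    return count;
}
```

With the declaration inside the `if` block, compilers usually keep the variable's stack slot alive for the whole function anyway. That would explain how such a bug can go unnoticed until a checking tool flags it.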
Andrea Canciani
2ac4ae1ae2 Add a test for radial gradients
radial-test is a port of the radial-gradient test from the cairo test
suite. It has been modified so that some pixels have 0 in both the a
and b coefficients of the quadratic equation solved by the rasterizer,
to expose a division by zero in the original implementation.
2011-01-19 13:17:03 +01:00
Søren Sandmann Pedersen
7f4eabbeec Fix destination fetching
When fetching from destinations, we need to ignore transformations,
repeat and filtering. Currently we don't ignore them, which means all
kinds of bad things can happen.

This patch fixes the problem by directly calling the scanline fetchers
for destinations instead of going through the full
get_scanline_32/64().
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
9489c2e04a Turn on testing for destination transformation
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
fffeda703e Skip fetching pixels when possible
Add two new iterator flags, ITER_IGNORE_ALPHA and ITER_IGNORE_RGB that
are set when the alpha and rgb values are not needed. If both are set,
then we can skip fetching entirely and just use
_pixman_iter_get_scanline_noop.
2011-01-18 12:42:26 -05:00
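
The decision described in fffeda703e can be sketched as a small selection function. The flag names follow the commit message, but their numeric values, the function pointer type, and the stand-in fetcher are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Flag names follow the commit message; the values are assumptions. */
enum
{
    SKETCH_ITER_IGNORE_ALPHA = 1 << 0,
    SKETCH_ITER_IGNORE_RGB   = 1 << 1
};

typedef uint32_t *(*sketch_get_scanline_t) (uint32_t *buffer, int width);

/* Leaves the buffer untouched: nobody will look at the pixels. */
uint32_t *
sketch_get_scanline_noop (uint32_t *buffer, int width)
{
    (void) width;
    return buffer;
}

/* Stand-in for a real fetcher: fills the buffer with opaque white. */
uint32_t *
sketch_get_scanline_fetch (uint32_t *buffer, int width)
{
    int i;

    assert (width >= 0);
    for (i = 0; i < width; i++)
        buffer[i] = 0xFFFFFFFFu;
    return buffer;
}

/* Fetching is skipped only when neither alpha nor RGB is needed. */
sketch_get_scanline_t
sketch_choose_get_scanline (unsigned flags)
{
    const unsigned both = SKETCH_ITER_IGNORE_ALPHA | SKETCH_ITER_IGNORE_RGB;

    if ((flags & both) == both)
        return sketch_get_scanline_noop;
    return sketch_get_scanline_fetch;
}
```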
Søren Sandmann Pedersen
3e635d6491 Add direct-write optimization back
Introduce a new ITER_LOCALIZED_ALPHA flag that indicates that the
alpha value computed is used only for the alpha channel of the output;
it doesn't affect the RGB channels.

Then in pixman-bits-image.c, if a destination is either a8r8g8b8 or
x8r8g8b8 with localized alpha, the iterator will return a pointer
directly into the image.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
0f1a5c4a27 Get rid of the classify methods
They are not used anymore, and the linear gradient is now doing the
optimization in a different way.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
b66cabb884 Linear: Optimize for horizontal gradients
If the gradient is horizontal, we can reuse the same scanline over and
over. Add support for this optimization to
_pixman_linear_gradient_iter_init().
2011-01-18 12:42:26 -05:00
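
The scanline reuse from b66cabb884 can be pictured with a minimal iterator sketch. It assumes that every row of the gradient is identical, so the first computed scanline can serve all later rows. The struct, its fields, and the grey ramp used as a stand-in gradient are all hypothetical; this is not the code in _pixman_linear_gradient_iter_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint32_t *buffer;
    int       width;
    int       y;            /* row that the next call will return */
    bool      horizontal;   /* colour depends on x only */
    bool      filled;       /* buffer already holds a valid scanline */
    int       fill_count;   /* instrumentation for this sketch only */
} sketch_gradient_iter_t;

/* Stand-in gradient: a grey ramp along x, shifted by y when the
 * gradient is not horizontal. */
static void
sketch_fill_scanline (sketch_gradient_iter_t *iter)
{
    int x;

    for (x = 0; x < iter->width; x++)
    {
        int      offset = iter->horizontal ? 0 : iter->y;
        uint32_t v = (uint32_t) (x + offset) & 0xFFu;

        iter->buffer[x] = 0xFF000000u | (v << 16) | (v << 8) | v;
    }
    iter->filled = true;
    iter->fill_count++;
}

uint32_t *
sketch_gradient_iter_get_scanline (sketch_gradient_iter_t *iter)
{
    assert (iter != NULL && iter->buffer != NULL && iter->width >= 0);

    /* A horizontal gradient is computed once and then reused. */
    if (!iter->horizontal || !iter->filled)
        sketch_fill_scanline (iter);

    iter->y++;
    return iter->buffer;
}
```

For an image that is H rows tall, the horizontal case does the per-pixel gradient work once instead of H times.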
Søren Sandmann Pedersen
cf14189c69 Consolidate the various get_scanline_32() into get_scanline_narrow()
The separate get_scanline_32() functions in solid, linear, radial and
conical images are no longer necessary because all access to these
images now goes through iterators.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
0a6360a7ee Allow NULL property_changed function
Initialize the field to NULL, and then delete the empty functions from
the solid, linear, radial, and conical images.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
34b5633105 Move get_scanline_32/64 to the bits part of the image struct
At this point these functions are basically a cache that the bits
image uses for its fetchers, so they can be moved to the bits image.

With the scanline getters only being initialized in the bits image,
the _pixman_image_get_scanline_generic_64 can be moved to
pixman-bits-image.c. That gets rid of the final user of
_pixman_image_get_scanline_32/64, so these can be deleted.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
d6b13f99b4 Use an iterator in pixman_image_get_solid()
This is a step towards getting rid of the
_pixman_image_get_scanline_32/64() functions.
2011-01-18 12:42:26 -05:00