1850 Commits

Author SHA1 Message Date
Andrea Canciani
72f5e5f608 test: Add Makefile for Win32 2011-02-28 10:38:02 +01:00
Andrea Canciani
11305b4ecd test: Fix tests for compilation on Windows
The Microsoft C compiler cannot handle subobject initialization and
Win32 does not provide snprintf.

Work around these limitations by using normal struct initialization
and using sprintf (a manual check shows that the buffer size is
sufficient).
2011-02-28 10:38:02 +01:00
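The two workarounds described in 11305b4ecd can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: it assumes "subobject initialization" refers to C99 designated initializers, and the types `point_t` and `box_t`, the helper names, and the buffer size are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical types, for illustration only. */
typedef struct { int x, y; } point_t;
typedef struct { point_t origin; int width, height; } box_t;

/* The C99 form (assumed to be what the commit message means):
 *
 *     box_t b = { .origin.x = 1, .origin.y = 2, .width = 3, .height = 4 };
 *
 * The same value written with plain positional initialization: */
static box_t
make_box (void)
{
    box_t b = { { 1, 2 }, 3, 4 };
    return b;
}

/* Manual size check: "test-" is 5 characters, an int of at most 32 bits
 * prints as at most 11 characters ("-2147483648"), plus the terminating
 * NUL, so 17 bytes are enough and 32 leaves headroom. */
#define NAME_BUF_SIZE 32

static void
format_test_name (char buf[NAME_BUF_SIZE], int index)
{
    assert (sizeof (int) <= 4);   /* the size argument above relies on this */
    sprintf (buf, "test-%d", index);
}
```

The point of the second half is that replacing snprintf with sprintf is only safe when the worst-case output length has been worked out by hand, as the commit message says was done.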
Andrea Canciani
20ed723a5a Fix compilation on Win32
Makefile.win32 contained a typo and was missing the dependency on
the built sources.
2011-02-28 10:38:01 +01:00
Søren Sandmann Pedersen
48e951000c Post-release version bump to 0.21.7 2011-02-22 16:13:32 -05:00
Søren Sandmann Pedersen
8b33321660 Pre-release version bump to 0.21.6 2011-02-22 15:43:41 -05:00
Søren Sandmann Pedersen
2cb67d2a0b Minor fix to the RELEASING file 2011-02-22 15:40:34 -05:00
Søren Sandmann Pedersen
3cdf74257b Delete pixman-x64-mmx-emulation.h from pixman/Makefile.am 2011-02-22 15:28:17 -05:00
Siarhei Siamashka
65919ad17f Ensure that tests run as the last step of a build for 'make check'
Previously 'make check' would compile and run tests first, and only
then proceed to compiling demos. This is inconvenient because one has
to scroll back through the console output to see the test verdict.
Swapping the order of the SUBDIRS variable entries in Makefile.am
resolves this.
2011-02-22 19:43:57 +02:00
Søren Sandmann Pedersen
34a7ac0474 sse2: Minor coding style cleanups.
Also make pixman_fill_sse2() static.
2011-02-18 16:03:30 -05:00
Søren Sandmann Pedersen
10f69e5ec8 sse2: Remove pixman-x64-mmx-emulation.h
Also stop including mmintrin.h
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
984be4def2 sse2: Delete obsolete or redundant comments 2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
33d9890226 sse2: Remove all the core_combine_* functions
Now that _mm_empty() is not used anymore, they are no longer different
from the sse2_combine_* functions, so they can be consolidated.
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
87cd6b8056 sse2: Don't compile pixman-sse2.c with -mmmx anymore
It's not necessary now that the file doesn't use MMX instructions.
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
e7fe5e35e9 sse2: Delete unused MMX functions and constants and all _mm_empty()s
These are not needed because the SSE2 implementation doesn't use MMX
anymore.
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
f88ae14c15 sse2: Convert all uses of MMX registers to use SSE2 registers instead.
By avoiding use of MMX registers we won't need to call emms all over
the place, which avoids various miscompilation issues.
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
7fb75bb3e6 Coding style: core_combine_in_u_pixelsse2 -> core_combine_in_u_pixel_sse2 2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
510c0d088a In pixman_image_set_transform() allow NULL for transform
Previously, this would crash unless the existing transform were also
NULL.
2011-02-18 06:21:38 -05:00
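The behaviour described in 510c0d088a (accepting NULL to clear a transform) can be sketched as below. This is a simplified stand-in, not pixman's implementation: `sketch_transform_t`, `sketch_image_t` and `sketch_set_transform` are hypothetical names, and the real function handles more cases.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real image and transform types. */
typedef struct { int matrix[3][3]; } sketch_transform_t;
typedef struct { sketch_transform_t *transform; } sketch_image_t;

/* Returns 1 on success, 0 on allocation failure. */
static int
sketch_set_transform (sketch_image_t           *image,
                      const sketch_transform_t *transform)
{
    assert (image != NULL);

    if (transform == NULL)
    {
        /* NULL means "no transform", whatever the current state is. */
        free (image->transform);      /* free (NULL) is a no-op */
        image->transform = NULL;
        return 1;
    }

    if (image->transform == NULL)
    {
        image->transform = malloc (sizeof *image->transform);
        if (image->transform == NULL)
            return 0;
    }

    *image->transform = *transform;
    return 1;
}
```

Handling the NULL argument before touching the existing transform is what makes the call safe regardless of whether a transform was previously set.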
Søren Sandmann Pedersen
7feb710e60 Avoid marking images dirty when properties are reset
When an image property is set to the same value that it already is,
there is no reason to mark the image dirty and incur a recomputation
of the flags.
2011-02-18 06:21:37 -05:00
Søren Sandmann Pedersen
3598ec26ec Add new public function pixman_add_triangles()
This allows some more code to be deleted from the X server. The
implementation consists of converting to trapezoids, and is shared
with pixman_composite_triangles().
2011-02-18 06:21:37 -05:00
Søren Sandmann Pedersen
964c7e7cd2 Optimize adding opaque trapezoids onto a8 destination.
When the source is opaque and the destination is alpha only, we can
avoid the temporary mask and just add the trapezoids directly.
2011-02-18 06:21:37 -05:00
Søren Sandmann Pedersen
0bc03482f1 Add a test program, tri-test
This program tests whether the new triangle support works.
2011-02-18 06:21:31 -05:00
Søren Sandmann Pedersen
79e69aac8c Add support for triangles to pixman.
The Render X extension can draw triangles as well as trapezoids, but
the implementation has always converted them to trapezoids. This patch
moves the X server's triangle conversion code into pixman, where we
can reuse the pixman_composite_trapezoid() code.
2011-02-15 09:25:18 -05:00
Søren Sandmann Pedersen
4e6dd4928d Add a test program for pixman_composite_trapezoids().
A CRC32 based test program to check that pixman_composite_trapezoids()
actually works.
2011-02-15 09:25:18 -05:00
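A "CRC32 based" test, as in 4e6dd4928d, typically renders into a buffer, checksums the result and compares it with a known value. The sketch below shows only the checksum side, using the common reflected CRC-32 polynomial; the function names are hypothetical and this is not the helper used by pixman's test suite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). Pass the previous
 * result as `crc` to continue a running checksum, or 0 to start one. */
static uint32_t
sketch_crc32 (uint32_t crc, const void *data, size_t len)
{
    const uint8_t *p = data;
    size_t i;
    int bit;

    assert (data != NULL || len == 0);

    crc = ~crc;
    for (i = 0; i < len; i++)
    {
        crc ^= p[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Checksums the raw bytes of a rendered image. Hashing raw memory is
 * byte-order dependent, so a real test has to account for endianness
 * before comparing against a stored reference value. */
static int
sketch_image_matches (const uint32_t *pixels, size_t n_pixels,
                      uint32_t expected_crc)
{
    return sketch_crc32 (0, pixels, n_pixels * sizeof *pixels) == expected_crc;
}
```

The advantage of this style of test is that one stored number per test covers a very large number of randomly generated rendering operations.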
Søren Sandmann Pedersen
803272e38c Add pixman_composite_trapezoids().
This function is an implementation of the X server request
Trapezoids. That request is what the X backend of cairo is using all
the time; by moving it into pixman we can hopefully make it faster.
2011-02-15 09:25:18 -05:00
Søren Sandmann Pedersen
1feaf6bea7 test/Makefile.am: Move all the TEST_LDADD into a new global LDADD.
This gets rid of a bunch of replicated *_LDADD clauses
2011-02-15 09:25:17 -05:00
Søren Sandmann Pedersen
1237fd9bc8 Add @TESTPROGS_EXTRA_LDFLAGS@ to AM_LDFLAGS
Instead of explicitly adding it to each test program.
2011-02-15 09:25:17 -05:00
Søren Sandmann Pedersen
7dfe845786 Move all the GTK+ based test programs to a new subdir, "demos"
This separates the test suite from the miscellaneous GTK+-based test
programs. "demos" is somewhat misleading because the programs there
are not particularly exciting (with the possible exception of
composite-test, which shows off all the compositing operators).
2011-02-15 09:25:17 -05:00
Siarhei Siamashka
8e4100260b SSE2 optimization for nearest scaled over_8888_n_8888
This operation shows up a little bit in some of the html5 based
games from http://www.kesiev.com/akihabara/

=== Cairo trace of the game intro animation for 'Legend of Sadness' ===

before:
[  0]    image    firefox-legend-of-sadness   46.286   46.298   0.01%    5/6

after:
[  0]    image    firefox-legend-of-sadness   45.088   45.102   0.04%    6/6

=== Microbenchmark (scaling ~2000x~2000 -> ~2000x~2000) ===

before:
    translucent: op=3, src=8888, mask=s dst=8888, speed=131.30 MPix/s
    transparent: op=3, src=8888, mask=s dst=8888, speed=132.38 MPix/s
    opaque:      op=3, src=8888, mask=s dst=8888, speed=167.90 MPix/s
after:
    translucent: op=3, src=8888, mask=s dst=8888, speed=301.93 MPix/s
    transparent: op=3, src=8888, mask=s dst=8888, speed=770.70 MPix/s
    opaque:      op=3, src=8888, mask=s dst=8888, speed=301.80 MPix/s
2011-02-15 14:32:41 +02:00
Siarhei Siamashka
39b86b032d ARM: NEON optimization for nearest scaled over_0565_8_0565
This may be used in some cases for HTML5 video when hardware
acceleration is not available.
2011-02-15 14:32:34 +02:00
Siarhei Siamashka
9a90c1c90f ARM: NEON optimization for nearest scaled over_8888_8_0565
This may be used in some cases for HTML5 video when hardware
acceleration is not available.
2011-02-15 14:32:28 +02:00
Siarhei Siamashka
cd1062ded4 ARM: new macro template for using scaled fast paths with a8 mask 2011-02-15 14:32:23 +02:00
Siarhei Siamashka
b099957887 Better support for NONE repeat in nearest scaling main loop template
The scaling function now gets an extra boolean argument, which is set
to TRUE when we are fetching padding pixels for NONE repeat. This
makes it possible to decide whether to interpret alpha as 0xFF or 0x00
for such pixels when working with formats that have no alpha channel
(for example x8r8g8b8 and r5g6b5).
2011-02-15 14:32:16 +02:00
Siarhei Siamashka
14f82083a1 Support for a8 and solid mask in nearest scaling main loop template
In addition to the most common case of not having any mask at all, two
variants of scaling with mask show up in cairo traces:
1. non-scaled a8 mask with SAMPLES_COVER_CLIP flag
2. solid mask

This patch extends the nearest scaling main loop template to also
support these cases.
2011-02-15 14:32:06 +02:00
Siarhei Siamashka
e83cee5aac test: Extend scaling-test to support a8/solid mask and ADD operation
Image width also has been increased because SIMD optimizations typically
do more unrolling in the inner loops, and this needs to be tested.
2011-02-15 14:32:01 +02:00
Siarhei Siamashka
97447f440f Use const modifiers for source buffers in nearest scaling fast paths 2011-02-15 14:29:54 +02:00
Siarhei Siamashka
8d359b00c5 C fast paths for a simple 90/270 degrees rotation
Depending on CPU architecture, performance is 1.5 to 4 times slower
than a simple non-rotated copy (the ideal case, which fully utilizes
memory bandwidth), but still more than 7 times faster than the
general path.

This implementation sets a performance baseline for rotation. The use
of SIMD instructions may further improve memory bandwidth utilization.
2011-02-10 16:18:01 +02:00
Siarhei Siamashka
e0c7948c97 New flags for 90/180/270 rotation
These flags are set when the transform is a simple nonscaled 90/180/270
degrees rotation.
2011-02-10 16:17:24 +02:00
Siarhei Siamashka
3b68c295fd test: affine-test updated to stress 90/180/270 degrees rotation more 2011-02-10 16:17:18 +02:00
Søren Sandmann Pedersen
56f173f0af Add pixman-conical-gradient.c to Makefile.win32.
Pointed out by Kirill Tishin.
2011-02-10 05:21:42 -05:00
Cyril Brulebois
fc1b85f258 Upload to unstable. 2011-02-06 05:31:27 +01:00
Cyril Brulebois
84bb9a7605 Mention upstream git URL in a comment. 2011-02-06 05:30:48 +01:00
Søren Sandmann Pedersen
7fd4897730 Add SSE2 fetcher for 0565
Before:

add_0565_0565 = L1:  61.08  L2:  61.03  M: 60.57 ( 10.95%)  HT: 46.85  VT: 45.25  R: 39.99  RT: 20.41 ( 233Kops/s)

After:

add_0565_0565 = L1:  77.84  L2:  76.25  M: 75.38 ( 13.71%)  HT: 55.99  VT: 54.56  R: 45.41  RT: 21.95 ( 255Kops/s)
2011-02-03 03:25:05 -05:00
Søren Sandmann Pedersen
8414aa76c2 Improve performance of sse2_combine_over_u()
Split this function into two, one that has a mask, and one that
doesn't. This is a fairly substantial speed-up in many cases.

New output of lowlevel-blt-bench over_x888_8_0565:

over_x888_8_0565 =  L1:  63.76  L2:  62.75  M: 59.37 ( 21.55%)  HT: 45.89  VT: 43.55  R: 34.51  RT: 16.80 ( 201Kops/s)
2011-02-03 03:25:05 -05:00
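The split described in 8414aa76c2 can be illustrated in scalar C: one loop for the unmasked case, one for the masked case, and a thin dispatcher. This is not the SSE2 code from the commit; it assumes premultiplied a8r8g8b8 pixels and a mask whose alpha channel is the only part used, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounded (a * b) / 255 for 8-bit values */
static uint8_t
mul_un8 (uint8_t a, uint8_t b)
{
    unsigned t = (unsigned) a * b + 0x80u;
    return (uint8_t) (((t >> 8) + t) >> 8);
}

/* Multiplies all four channels of p by a / 255 */
static uint32_t
scale_pixel (uint32_t p, uint8_t a)
{
    uint32_t r = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8)
        r |= (uint32_t) mul_un8 ((uint8_t) (p >> shift), a) << shift;
    return r;
}

/* Premultiplied OVER: s + d * (255 - alpha(s)) / 255, saturated */
static uint32_t
over (uint32_t s, uint32_t d)
{
    uint8_t ia = (uint8_t) (255 - (s >> 24));
    uint32_t r = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8)
    {
        unsigned c = ((s >> shift) & 0xFF) + mul_un8 ((uint8_t) (d >> shift), ia);
        r |= (uint32_t) (c > 255 ? 255 : c) << shift;
    }
    return r;
}

static void
combine_over_u_no_mask (uint32_t *dst, const uint32_t *src, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = over (src[i], dst[i]);
}

static void
combine_over_u_mask (uint32_t *dst, const uint32_t *src,
                     const uint32_t *mask, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = over (scale_pixel (src[i], (uint8_t) (mask[i] >> 24)), dst[i]);
}

/* The mask test happens once per call instead of once per pixel. */
static void
combine_over_u (uint32_t *dst, const uint32_t *src,
                const uint32_t *mask, int n)
{
    assert (dst != NULL && src != NULL && n >= 0);

    if (mask)
        combine_over_u_mask (dst, src, mask, n);
    else
        combine_over_u_no_mask (dst, src, n);
}
```

Hoisting the branch out of the inner loop keeps each loop body simple, which is plausibly where the speed-up reported in the commit message comes from.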
Søren Sandmann Pedersen
08e855f15c Add SSE2 fetcher for a8
New output of lowlevel-blt-bench over_x888_8_0565:

over_x888_8_0565 =  L1:  57.85  L2:  56.80  M: 54.14 ( 19.50%)  HT: 42.64  VT: 40.56  R: 32.67  RT: 16.22 ( 195Kops/s)

Based in part on code by Steve Snyder from

    https://bugs.freedesktop.org/show_bug.cgi?id=21173
2011-02-03 03:25:05 -05:00
Søren Sandmann Pedersen
2b6b0cf359 Add SSE2 fetcher for x8r8g8b8
New output of lowlevel-blt-bench over_x888_8_0565:

over_x888_8_0565 =  L1:  55.68  L2:  55.11  M: 52.83 ( 19.04%)  HT: 39.62  VT: 37.70  R: 30.88  RT: 14.62 ( 174Kops/s)

The fetcher is looked up in a table, so that other fetchers can easily
be added.

See also https://bugs.freedesktop.org/show_bug.cgi?id=20709
2011-02-03 03:24:47 -05:00
Søren Sandmann Pedersen
13aed37758 Add a test for over_x888_8_0565 in lowlevel_blt_bench().
The next few commits will speed this up quite a bit.

Current output:

---
reference memcpy speed = 2217.5MB/s (554.4MP/s for 32bpp fills)
---
over_x888_8_0565 =  L1:  54.67  L2:  54.01  M: 52.33 ( 18.88%)  HT: 37.19  VT: 35.54  R: 29.40  RT: 13.63 ( 162Kops/s)
2011-01-28 14:35:17 -05:00
Søren Sandmann Pedersen
2de397c272 Move fallback decisions from implementations into pixman-cpu.c.
Instead of having each individual implementation decide which fallback
to use, move it into pixman-cpu.c, where a more global decision can be
made.

This is accomplished by adding a "fallback" argument to all the
pixman_implementation_create_*() implementations, and then in
_pixman_choose_implementation() pass in the desired fallback.
2011-01-26 17:07:35 -05:00
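The restructuring in 2de397c272 amounts to building the fallback chain in one place and handing each implementation its fallback at creation time. A minimal sketch of that shape, with a hypothetical `impl_t`, hypothetical operation bits, and a single CPU feature flag standing in for the real detection logic:

```c
#include <assert.h>
#include <stdlib.h>

enum { OP_SRC = 1u, OP_OVER = 2u, OP_ADD = 4u };   /* hypothetical op bits */

typedef struct impl impl_t;
struct impl
{
    const char *name;
    unsigned    supported_ops;
    impl_t     *fallback;        /* next implementation to try, or NULL */
};

/* Creators receive their fallback instead of choosing it themselves.
 * If allocation fails, the layer is skipped and the fallback is used. */
static impl_t *
impl_create (const char *name, unsigned supported_ops, impl_t *fallback)
{
    impl_t *imp = malloc (sizeof *imp);

    if (imp == NULL)
        return fallback;

    imp->name = name;
    imp->supported_ops = supported_ops;
    imp->fallback = fallback;
    return imp;
}

/* The one place where the global decision about the chain is made. */
static impl_t *
choose_implementation (int have_sse2)
{
    impl_t *imp = impl_create ("general", ~0u, NULL);

    imp = impl_create ("fast-path", OP_SRC | OP_OVER, imp);
    if (have_sse2)
        imp = impl_create ("sse2", OP_OVER, imp);
    return imp;
}

/* Walks the chain until some implementation supports the operation. */
static const impl_t *
impl_lookup (const impl_t *top, unsigned op)
{
    assert (op != 0);

    for (; top != NULL; top = top->fallback)
        if (top->supported_ops & op)
            return top;
    return NULL;
}

static void
impl_destroy_chain (impl_t *top)
{
    while (top != NULL)
    {
        impl_t *next = top->fallback;
        free (top);
        top = next;
    }
}
```

With the chain assembled centrally, changing which implementation backs up which becomes a one-line edit rather than a change to every implementation file.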
Søren Sandmann Pedersen
ed781df1cc Print a warning when a development snapshot is being configured.
It seems to be relatively common for people to use development
snapshots of pixman thinking they are ordinary releases. This patch
makes it such that if the current minor version is odd, configure will
print a banner explaining the version number scheme plus information
about where to report bugs.
2011-01-26 17:07:35 -05:00
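The rule behind the warning in ed781df1cc is simply that an odd minor version marks a development snapshot. The actual check lives in the configure script; the C sketch below only illustrates the rule, with a hypothetical function name and a "major.minor.micro" string format assumed from the version bumps elsewhere in this log.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 for a development snapshot (odd minor version), 0 for a
 * stable release (even minor version), -1 if the string is not of the
 * form "major.minor.micro". */
static int
is_development_snapshot (const char *version)
{
    int major, minor, micro;

    assert (version != NULL);

    if (sscanf (version, "%d.%d.%d", &major, &minor, &micro) != 3)
        return -1;

    return minor % 2 != 0;
}
```

Under this rule, the 0.21.6 and 0.21.7 versions that appear in this log are development snapshots, since their minor version, 21, is odd.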
Rolland Dudemaine
fead9eb82a Fix "variable was set but never used" warnings
Removes useless variable declarations. This can only result in more
efficient code, as these variables were sometimes assigned, but
their values were never used.
2011-01-26 15:05:24 +02:00
Rolland Dudemaine
32e556df33 test: Use the right enum types instead of int to fix warnings
Green Hills Software MULTI compiler was producing a number
of warnings due to incorrect uses of int instead of the correct
corresponding pixman_*_t type.
2011-01-26 15:05:18 +02:00