Commit Graph

1757 Commits

Author SHA1 Message Date
Søren Sandmann Pedersen
b3afacf9c9 Reorder tests so that the fastest ones run first. 2009-12-16 15:27:50 -05:00
Marvin Schmidt
bbc5108bf8 Build tests and run non-GTK+ ones on make check
Setting TESTS will run the tests on `make check`

Bug 25131
2009-12-16 15:24:36 -05:00
Siarhei Siamashka
4476832070 ARM: added 'neon_combine_add_u' function 2009-12-16 20:56:13 +02:00
Siarhei Siamashka
f2c7a04c41 ARM: added 'neon_combine_over_u' function 2009-12-16 20:56:08 +02:00
Siarhei Siamashka
24cd286af6 ARM: macro template for single scanline compositing functions
The existing template already supports 2D image processing,
but pixman also needs some NEON optimized functions for
improving performance when compositing is decoupled
into "fetch -> process -> store" stages and done via
temporary scanline buffer. That's why a new simplified
template which deals only with the generation of single
scanline processing functions is handy.
2009-12-16 20:55:54 +02:00
Siarhei Siamashka
ae8d9df624 Use canonical pixman license notice for recently added ARM NEON assembly files 2009-12-16 20:39:21 +02:00
Søren Sandmann Pedersen
92865d4dec Pre-release version bump 2009-12-15 11:30:49 -05:00
Søren Sandmann Pedersen
ec6de472d0 region: Enable or disable fatal errors and selfchecks based on version number
There are a couple of bugs in bugzilla where bugs in the X server
triggered asserts in the pixman region code. It is probably better to
let the X server survive this. (In fact, I thought I had disabled them
for 0.16.0, but apparently not).

The patch below uses these rules:

    - In _stable_ pixman releases, assertions and selfchecks are turned
      off. Assertions, so that the X server doesn't die. Selfchecks,
      for performance reasons.

    - In _unstable_ pixman releases, both assertions and selfchecks are
      turned on. These releases are what get added to development
      distributions such as rawhide, so we want as much self-checking
      as possible.

    - In _random git checkouts_, assertions are enabled, so that bugs
      are caught, but selfchecks are disabled so that you can use them
      for performance work without having to fiddle with turning
      selfchecks off.
2009-12-15 11:30:34 -05:00
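The three rules above can be sketched as a small decision function. This is only an illustration: the type and function names are hypothetical, and the mapping from version numbers to release kinds (even minor = stable, odd minor with even micro = unstable release, odd minor with odd micro = git checkout) is an assumption, since the commit message does not spell it out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the commit message does not show the real macros. */
typedef enum { BUILD_STABLE, BUILD_UNSTABLE, BUILD_GIT } build_kind_t;

typedef struct
{
    bool fatal_asserts;   /* abort when an assertion fails */
    bool selfchecks;      /* run the expensive region validation */
} check_policy_t;

/* ASSUMPTION: even minor = stable release, odd minor with even micro =
 * unstable release, odd minor with odd micro = random git checkout. */
static build_kind_t
classify_version (int minor, int micro)
{
    assert (minor >= 0 && micro >= 0);

    if (minor % 2 == 0)
        return BUILD_STABLE;

    return (micro % 2 == 0) ? BUILD_UNSTABLE : BUILD_GIT;
}

/* Encodes the three rules listed in the commit message. */
static check_policy_t
policy_for (build_kind_t kind)
{
    check_policy_t p = { false, false };

    switch (kind)
    {
    case BUILD_STABLE:      /* X server must survive; no overhead */
        break;
    case BUILD_UNSTABLE:    /* as much self-checking as possible */
        p.fatal_asserts = true;
        p.selfchecks = true;
        break;
    case BUILD_GIT:         /* catch bugs, keep performance work easy */
        p.fatal_asserts = true;
        break;
    }

    return p;
}
```

Under these assumptions, a 0.16.x build would get neither check, a 0.17.2 release both, and a 0.17.1 checkout assertions only.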
Siarhei Siamashka
ce78288d77 ARM: added 'neon_composite_src_pixbuf_8888' fast path
This is ARM NEON optimized conversion of native RGBA format used by
GTK/GDK into native 32bpp RGBA format used by cairo/pixman.
2009-12-09 15:22:09 +02:00
Siarhei Siamashka
a732d3baeb ARM: added 'neon_composite_src_0888_0565_rev' fast path
This is ARM NEON optimized conversion of native RGB format used by
GTK/GDK into r5g6b5 format.
2009-12-09 15:22:03 +02:00
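For reference, a scalar C sketch of the kind of conversion this fast path accelerates. It is not the NEON code from the commit; the function names are hypothetical, and it assumes the source stores each pixel as three bytes in R, G, B memory order and that channels are truncated rather than rounded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack 8-bit channels into r5g6b5 by keeping the top bits of each
 * channel (truncation is an assumption of this sketch). */
static uint16_t
pack_0565 (uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t) (((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* ASSUMPTION: each source pixel is three bytes in R, G, B memory order. */
static void
convert_rgb24_to_0565 (const uint8_t *src, uint16_t *dst, size_t n_pixels)
{
    assert (src != NULL && dst != NULL);

    for (size_t i = 0; i < n_pixels; i++)
        dst[i] = pack_0565 (src[3 * i], src[3 * i + 1], src[3 * i + 2]);
}
```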
Siarhei Siamashka
a1386a1ceb ARM: added 'neon_src_0888_8888_rev' fast path
This is ARM NEON optimized conversion of native RGB format used by
GTK/GDK into native 32bpp RGB format used by cairo/pixman.
2009-12-09 15:21:57 +02:00
Siarhei Siamashka
78a60047ac ARM: added 'neon_composite_over_n_8888' fast path 2009-12-09 11:29:13 +02:00
Siarhei Siamashka
96fd17488f ARM: added 'neon_composite_over_n_0565' fast path 2009-12-09 11:27:57 +02:00
Siarhei Siamashka
2d332c7a56 ARM: added 'neon_composite_src_0565_8888' fast path 2009-12-09 10:33:01 +02:00
Siarhei Siamashka
062da411d8 ARM: added 'neon_composite_add_8888_8888_8888' fast path 2009-12-09 10:26:47 +02:00
Siarhei Siamashka
3d0eedb5d9 ARM: added 'neon_composite_add_8888_8888' fast path 2009-12-09 10:25:03 +02:00
Siarhei Siamashka
86b54c6701 ARM: added 'neon_composite_over_8888_8_8888' fast path 2009-12-09 10:24:30 +02:00
Siarhei Siamashka
aec1524e77 ARM: added 'neon_composite_over_8888_8888_8888' fast path 2009-12-09 10:19:37 +02:00
Siarhei Siamashka
ba59d53d0b ARM: minor source formatting changes
Now it's a bit harder to exceed the 80-character line limit
when binding assembly functions.
2009-12-09 10:17:23 +02:00
Siarhei Siamashka
a47b5167c4 ARM: added '.arch armv7a' directive to NEON assembly file
This fix prevents a build failure caused by the assembler rejecting the PLD instruction when
compiling for armv4 cpu with the relevant -mcpu/-march options set in CFLAGS.
2009-12-08 08:52:34 +02:00
Benjamin Otte
3fba7dc6fa Make test program not throw warnings about undefined variables 2009-12-04 15:04:24 +01:00
Benjamin Otte
10ab592d57 Fix bug that prevented pixman_fill MMX and SSE paths for 16 and 8bpp 2009-12-04 15:04:24 +01:00
Siarhei Siamashka
7c7b6f5de7 ARM: NEON optimized pixman_blt
NEON unit has fast access to L1/L2 caches and even simple
copy of memory buffers using NEON provides more than 1.5x
performance improvement on ARM Cortex-A8.
2009-11-30 22:21:08 +02:00
Siarhei Siamashka
dce6e1bd68 test: support for testing pixbuf fast path functions in blitters-test 2009-11-27 15:50:26 +02:00
Benjamin Otte
0901ef41fb Remove nonexistent function from header 2009-11-22 10:57:06 +01:00
Søren Sandmann Pedersen
c97b1e803f Post-release version bump 2009-11-20 12:02:50 +01:00
Søren Sandmann Pedersen
5a7597f818 Pre-release version bump 2009-11-20 11:55:40 +01:00
Søren Sandmann Pedersen
95a08dece3 Remove stray semicolon from blitters-test.c
Pointed out by scottmc2@gmail.com in bug 25137.
2009-11-20 11:18:58 +01:00
Siarhei Siamashka
6e2c7d54c6 C fast path function for 'over_n_1_0565'
This function is needed to improve performance of xfce4 terminal when
using bitmap fonts and running with 16bpp desktop. Some other applications
may potentially benefit too.

After applying this patch, top functions from Xorg process in
oprofile log change from

samples  %        image name               symbol name
13296    29.1528  libpixman-1.so.0.17.1    combine_over_u
6452     14.1466  libpixman-1.so.0.17.1    fetch_scanline_r5g6b5
5516     12.0944  libpixman-1.so.0.17.1    fetch_scanline_a1
2273      4.9838  libpixman-1.so.0.17.1    store_scanline_r5g6b5
1741      3.8173  libpixman-1.so.0.17.1    fast_composite_add_1000_1000
1718      3.7669  libc-2.9.so              memcpy

to

samples  %        image name               symbol name
5594     14.7033  libpixman-1.so.0.17.1    fast_composite_over_n_1_0565
4323     11.3626  libc-2.9.so              memcpy
3695      9.7119  libpixman-1.so.0.17.1    fast_composite_add_1000_1000

when scrolling text in terminal (reading man page).
2009-11-20 11:18:58 +01:00
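The profile changes because the general path (fetch_scanline_a1, fetch_scanline_r5g6b5, combine_over_u, store_scanline_r5g6b5) is replaced by a single loop. A minimal sketch of such a loop follows, restricted to an opaque solid source. The names are hypothetical, and the LSB-first bit order within each mask byte is an assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pixel of OVER with an opaque solid source and a 1-bit mask:
 * the mask bit simply selects between the solid color and the
 * existing destination pixel. */
static uint16_t
over_opaque_pixel (uint16_t color, int mask_bit, uint16_t dest)
{
    return mask_bit ? color : dest;
}

/* ASSUMPTION: mask bits are packed LSB-first within each byte. */
static void
over_opaque_n_1_0565 (uint16_t color, const uint8_t *mask,
                      uint16_t *dst, size_t width)
{
    assert (mask != NULL && dst != NULL);

    for (size_t x = 0; x < width; x++)
    {
        int bit = (mask[x / 8] >> (x % 8)) & 1;

        dst[x] = over_opaque_pixel (color, bit, dst[x]);
    }
}
```

A translucent solid color would additionally need a per-pixel blend where the mask bit is set; that case is left out here.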
Søren Sandmann Pedersen
282f5cf8b8 Round horizontal sampling points towards northwest.
This is a similar change to the top/bottom one, but in this case the
rounding is simpler because it's just always rounding down.

Based on a patch by M Joonas Pihlaja.
2009-11-17 01:58:01 -05:00
Søren Sandmann Pedersen
f44431986f Fix rounding of top and bottom coordinates.
The rule for trap rasterization is that coordinates are rounded
towards north-west.

The pixman_sample_ceil() function is used to compute the first
(top-most) sample row included in the trap, so when the input
coordinate is already exactly on a sample row, no rounding should take
place.

On the other hand, pixman_sample_floor() is used to compute the final
(bottom-most) sample row, so if the input is precisely on a sample
row, it needs to be rounded down to the previous row.

This commit fixes the rounding computation. The idea of the
computation is like this:

Floor operation that rounds exact matches down: First subtract
pixman_fixed_e to make sure input already on a sample row gets rounded
down. Then find out how many small steps are between the input and the
first fraction. Then add those small steps to the first fraction.

The ceil operation first adds (small_step - pixman_fixed_e), then runs
an ordinary floor. This ensures that exact matches are not rounded off.

Based on a patch by M Joonas Pihlaja.
2009-11-17 01:58:01 -05:00
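The two rounding operations can be sketched in C as follows. This is an illustration under stated assumptions, not the code from the commit: the names are hypothetical, the grid has 15 sample rows per pixel with the first fraction at half the big step, inputs are non-negative, and saturation near the top of the coordinate range is omitted.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;                     /* 16.16 fixed point */

#define FIXED_1     ((fixed_t) 0x10000)
#define FIXED_E     ((fixed_t) 1)            /* smallest representable step */
#define N_SAMPLES   15                       /* sample rows per pixel (assumed) */
#define STEP_SMALL  (FIXED_1 / N_SAMPLES)                           /* 4369 */
#define FRAC_FIRST  ((FIXED_1 - (N_SAMPLES - 1) * STEP_SMALL) / 2)  /* 2185 */
#define FRAC_LAST   (FRAC_FIRST + (N_SAMPLES - 1) * STEP_SMALL)     /* 63351 */

/* Division that rounds towards negative infinity, for b > 0. */
static int32_t
floor_div (int32_t a, int32_t b)
{
    assert (b > 0);
    return (a >= 0) ? a / b : -((-a + b - 1) / b);
}

/* Largest sample row strictly below y: exact matches round down. */
static fixed_t
sample_floor (fixed_t y)
{
    assert (y >= 0);

    fixed_t i = y & ~(FIXED_1 - 1);          /* integer part */
    fixed_t f = y & (FIXED_1 - 1);           /* fractional part */

    f = floor_div (f - FIXED_E - FRAC_FIRST, STEP_SMALL) * STEP_SMALL
        + FRAC_FIRST;

    if (f < FRAC_FIRST)                      /* fell into the previous pixel */
    {
        f = FRAC_LAST;
        i -= FIXED_1;
    }

    return i + f;
}

/* Smallest sample row at or above y: exact matches are kept. */
static fixed_t
sample_ceil (fixed_t y)
{
    assert (y >= 0 && y < 0x7fff0000);       /* no saturation handling here */

    fixed_t i = y & ~(FIXED_1 - 1);
    fixed_t f = y & (FIXED_1 - 1);

    f = floor_div (f - FRAC_FIRST + (STEP_SMALL - FIXED_E), STEP_SMALL)
        * STEP_SMALL + FRAC_FIRST;

    if (f > FRAC_LAST)                       /* spilled into the next pixel */
    {
        f = FRAC_FIRST;
        i += FIXED_1;
    }

    return i + f;
}
```

With these constants, an input of 2185 (exactly on the first sample row) is left unchanged by `sample_ceil`, while `sample_floor` moves an input of 6554 (exactly on the second row) down to 2185.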
Søren Sandmann Pedersen
3bea18e3ea Fix slightly skewed sampling grid for antialiased traps
The sampling grid is slightly skewed in the antialiased case. Consider
the case where we have n = 8 bits of alpha.

The small step is

     small_step = fixed_1 / 15 = 65536 / 15 = 4369

The first fraction is then

     frac_first = small_step / 2 = 4369 / 2 = 2184

and the last fraction becomes

     frac_last
          = frac_first + (15 - 1) * small_step = 2184 + 14 * 4369 = 63350

which means the size of the last bit of the pixel is

     65536 - 63350 = 2186

which is 2 bigger than the first fraction. This is not the end of the
world, but it would be more correct to have 2185 and 2185, and we can
accomplish that simply by making the first fraction half the *big*
step instead of half the small step.

If we ever move to coordinates with 8 fractional bits, the
corresponding values become 8 and 10 out of 256, where 9 and 9 would
be better.

Similarly in the X direction.
2009-11-17 01:58:01 -05:00
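The arithmetic in this message can be reproduced with a few lines of C. The helper names are hypothetical; `one` stands for the fixed-point value of 1 (65536 for 16 fractional bits, 256 for 8) and `n` for the number of samples per pixel.

```c
#include <assert.h>

typedef struct
{
    int first;   /* gap between the pixel edge and the first sample */
    int tail;    /* gap between the last sample and the next pixel edge */
} grid_t;

static int
small_step (int one, int n)
{
    assert (n > 0);
    return one / n;
}

/* Old scheme: first fraction is half the small step. */
static grid_t
grid_half_small (int one, int n)
{
    int step = small_step (one, n);
    grid_t g;

    g.first = step / 2;
    g.tail = one - (g.first + (n - 1) * step);
    return g;
}

/* New scheme: first fraction is half the big step, i.e. half of what
 * remains of the pixel after (n - 1) small steps. */
static grid_t
grid_half_big (int one, int n)
{
    int step = small_step (one, n);
    int big = one - (n - 1) * step;
    grid_t g;

    g.first = big / 2;
    g.tail = one - (g.first + (n - 1) * step);
    return g;
}
```

For `one = 65536, n = 15` the old scheme gives gaps of 2184 and 2186 and the new one 2185 and 2185; for `one = 256` the pairs are 8 and 10 versus 9 and 9, matching the figures above.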
Søren Sandmann Pedersen
98bb0a509f Delete the flags field from fast_path_info_t 2009-11-17 00:47:49 -05:00
Søren Sandmann Pedersen
b7fb7e6c70 Eliminate NEED_PIXBUF flag.
Instead introduce two new fake formats

	PIXMAN_pixbuf
	PIXMAN_rpixbuf

and compute whether the source and mask have them in
find_fast_path(). This led to some duplicate entries in the fast path
tables that could then be removed.
2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen
542b79c30d Compute src_format outside the fast path loop.
Inside the loop all we have to do is check that the formats match.
2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen
12108ecbe4 Eliminate the NEED_COMPONENT_ALPHA flag.
Instead introduce two new fake formats

	PIXMAN_a8r8g8b8_ca
	PIXMAN_a8b8g8r8_ca

that are used in the fast path tables for this case.
2009-11-17 00:42:21 -05:00
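The idea of replacing a flag with a fake format can be sketched as below. Everything here is hypothetical (the format tags, the table contents and the lookup function are not pixman's); the point is only that once component alpha is folded into the mask format, table matching becomes a plain equality test. Rejecting component-alpha masks of other formats is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical format tags; the real PIXMAN_* values differ. */
typedef enum
{
    FMT_solid,
    FMT_a8,
    FMT_a8r8g8b8,
    FMT_a8r8g8b8_ca,    /* fake format: a8r8g8b8 used as component-alpha mask */
    FMT_unsupported
} fmt_t;

typedef struct
{
    fmt_t src, mask, dest;
} fast_path_t;

/* Illustrative table: entry 0 takes a unified-alpha a8 mask,
 * entry 1 a component-alpha a8r8g8b8 mask. */
static const fast_path_t table[] =
{
    { FMT_solid, FMT_a8,          FMT_a8r8g8b8 },
    { FMT_solid, FMT_a8r8g8b8_ca, FMT_a8r8g8b8 },
};

/* Fold the component-alpha property into the format code. */
static fmt_t
effective_mask_format (fmt_t format, bool component_alpha)
{
    if (!component_alpha)
        return format;

    return (format == FMT_a8r8g8b8) ? FMT_a8r8g8b8_ca : FMT_unsupported;
}

/* Returns the index of the matching table entry, or -1 if none. */
static int
find_fast_path (fmt_t src, fmt_t mask, bool mask_ca, fmt_t dest)
{
    assert (dest != FMT_solid);   /* a solid image is never a destination */

    fmt_t m = effective_mask_format (mask, mask_ca);

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
    {
        if (table[i].src == src && table[i].mask == m && table[i].dest == dest)
            return (int) i;
    }

    return -1;
}
```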
Søren Sandmann Pedersen
4686d1f53b Eliminate the NEED_SOLID_MASK flag
This flag was used to indicate that the mask was solid while still
allowing a specific format to be required. However, there is not
actually any need for this because the fast paths all used
_pixman_image_get_solid() which already allowed arbitrary formats.

The one thing that had to be dealt with was component alpha. In
addition to interpreting the presence of the NEED_COMPONENT_ALPHA
flag, we now also interpret the *absence* of this flag as a
requirement that the mask does *not* have component alpha.

Siarhei Siamashka pointed out that the first version of this commit
had a bug, in which a NEED_SOLID_MASK was accidentally not turned into
a PIXMAN_solid in the ARM NEON implementation.
2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen
2ef8b394d7 Use the destination buffer directly in more cases instead of fetching.
When the destination buffer is either a8r8g8b8 or x8r8g8b8, we can use
it directly instead of fetching into a temporary buffer. When the
format is x8r8g8b8, we require the operator to not make use of
destination alpha, but when it is a8r8g8b8, there are no restrictions.

This is approximately a 5% speedup on the poppler cairo benchmark:

[ # ]  backend                         test   min(s) median(s) stddev. count

Before:
[  0]    image                      poppler    6.661    6.709   0.59%    6/6

After:
[  0]    image                      poppler    6.307    6.320   0.12%    5/6
2009-11-17 00:42:21 -05:00
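The condition described in this message can be written as a small predicate. This is a sketch with hypothetical names; how the real code decides whether an operator uses destination alpha is not shown in the commit message, so that decision is passed in as a parameter here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical format tags; only the two 32bpp formats named in the
 * commit message get special treatment. */
typedef enum { FMT_a8r8g8b8, FMT_x8r8g8b8, FMT_r5g6b5, FMT_COUNT } dest_fmt_t;

/* True when compositing can work in the destination buffer itself
 * instead of fetching into a temporary scanline buffer. */
static bool
can_write_dest_directly (dest_fmt_t format, bool op_uses_dest_alpha)
{
    assert (format < FMT_COUNT);

    switch (format)
    {
    case FMT_a8r8g8b8:
        return true;                    /* no restrictions */
    case FMT_x8r8g8b8:
        return !op_uses_dest_alpha;     /* the x8 bits carry no alpha */
    default:
        return false;                   /* other formats still need a fetch */
    }
}
```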
Søren Sandmann Pedersen
13f4e02b14 test: Move image_endian_swap() from blitters-test.c to utils.[ch] 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
24e203a8a8 test: Move random number generator from blitters/scaling-test to utils.[ch] 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
cc34554652 test: In scaling-test use the crc32 from utils.c 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
b465b8b79d test: Move CRC32 code from blitters-test to new files utils.[ch] 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
56bd913401 test: Rename utils.[ch] to gtk-utils.[ch] 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
7be529f3bd sse2: Add a fast path for OVER 8888 x 8 x 8888
This is a small speedup on the swfdec-youtube benchmark:

Before:
[  0]    image               swfdec-youtube    5.789    5.806   0.20%    6/6

After:
[  0]    image               swfdec-youtube    5.489    5.524   0.27%    6/6

I.e., approximately 5% faster.
2009-11-13 15:57:48 -05:00
Siarhei Siamashka
abefe68ae2 ARM: enabled 'neon_composite_add_8000_8000' fast path 2009-11-11 18:12:58 +02:00
Siarhei Siamashka
635f389ff4 ARM: enabled 'neon_composite_add_8_8_8' fast path 2009-11-11 18:12:58 +02:00
Siarhei Siamashka
7e1bfed676 ARM: enabled 'neon_composite_add_n_8_8' fast path 2009-11-11 18:12:58 +02:00
Siarhei Siamashka
deeb67b13a ARM: enabled 'neon_composite_over_8888_8888' fast path 2009-11-11 18:12:58 +02:00
Siarhei Siamashka
f449364849 ARM: enabled 'neon_composite_over_8888_0565' fast path 2009-11-11 18:12:57 +02:00
Siarhei Siamashka
2dfbf6c4a5 ARM: enabled 'neon_composite_over_8888_n_8888' fast path 2009-11-11 18:12:57 +02:00