This fix prevents a build failure caused by the assembler not accepting the
PLD instruction when compiling for an armv4 CPU with the relevant
-mcpu/-march options set in CFLAGS.
The NEON unit has fast access to the L1/L2 caches; even a simple
copy of memory buffers using NEON provides a more than 1.5x
performance improvement on ARM Cortex-A8.
This function is needed to improve the performance of the xfce4 terminal when
using bitmap fonts and running on a 16bpp desktop. Some other applications
may potentially benefit too.
After applying this patch, top functions from Xorg process in
oprofile log change from
samples  %        image name             symbol name
13296    29.1528  libpixman-1.so.0.17.1  combine_over_u
6452     14.1466  libpixman-1.so.0.17.1  fetch_scanline_r5g6b5
5516     12.0944  libpixman-1.so.0.17.1  fetch_scanline_a1
2273      4.9838  libpixman-1.so.0.17.1  store_scanline_r5g6b5
1741      3.8173  libpixman-1.so.0.17.1  fast_composite_add_1000_1000
1718      3.7669  libc-2.9.so            memcpy
to
samples  %        image name             symbol name
5594     14.7033  libpixman-1.so.0.17.1  fast_composite_over_n_1_0565
4323     11.3626  libc-2.9.so            memcpy
3695      9.7119  libpixman-1.so.0.17.1  fast_composite_add_1000_1000
when scrolling text in terminal (reading man page).
This is a similar change to the top/bottom one, but in this case the
rounding is simpler because it always rounds down.
Based on a patch by M Joonas Pihlaja.
The rule for trap rasterization is that coordinates are rounded
towards the north-west.
The pixman_sample_ceil() function is used to compute the first
(top-most) sample row included in the trap, so when the input
coordinate is already exactly on a sample row, no rounding should take
place.
On the other hand, pixman_sample_floor() is used to compute the final
(bottom-most) sample row, so if the input is precisely on a sample
row, it needs to be rounded down to the previous row.
This commit fixes the rounding computation. The idea of the
computation is like this:
A floor operation that rounds exact matches down: first subtract
pixman_fixed_e to make sure an input already on a sample row gets rounded
down. Then find out how many whole small steps lie between that value and
the first fraction, and add that many small steps to the first fraction.
The ceil operation first adds small_step, then runs the floor; since the
floor moves exact matches down by one step, the added step cancels out and
exact matches are not rounded off.
Based on a patch by M Joonas Pihlaja.
The sampling grid is slightly skewed in the antialiased case. Consider
the case where we have n = 8 bits of alpha.
The small step is
small_step = fixed_1 / 15 = 65536 / 15 = 4369
The first fraction is then
frac_first = (small_step / 2) = (65536 / 15) / 2 = 2184
and the last fraction becomes
frac_last
= frac_first + (15 - 1) * small_step = 2184 + 14 * 4369 = 63350
which means the size of the last bit of the pixel is
65536 - 63350 = 2186
which is 2 bigger than the first fraction. This is not the end of the
world, but it would be more correct to have 2185 and 2185, and we can
accomplish that simply by making the first fraction half the *big*
step instead of half the small step.
If we ever move to coordinates with 8 fractional bits, the
corresponding values become 8 and 10 out of 256, where 9 and 9 would
be better.
Similarly in the X direction.
Instead introduce two new fake formats
PIXMAN_pixbuf
PIXMAN_rpixbuf
and compute whether the source and mask have them in
find_fast_path(). This led to some duplicate entries in the fast path
tables that could then be removed.
This flag was used to indicate that the mask was solid while still
allowing a specific format to be required. However, there is not
actually any need for this because the fast paths all used
_pixman_image_get_solid() which already allowed arbitrary formats.
The one thing that had to be dealt with was component alpha. In
addition to interpreting the presence of the NEED_COMPONENT_ALPHA
flag, we now also interpret the *absence* of this flag as a
requirement that the mask does *not* have component alpha.
Siarhei Siamashka pointed out that the first version of this commit
had a bug, in which a NEED_SOLID_MASK was accidentally not turned into
a PIXMAN_solid in the ARM NEON implementation.
When the destination buffer is either a8r8g8b8 or x8r8g8b8, we can use
it directly instead of fetching into a temporary buffer. When the
format is x8r8g8b8, we require the operator to not make use of
destination alpha, but when it is a8r8g8b8, there are no restrictions.
This is approximately a 5% speedup on the poppler cairo benchmark:
[ # ]  backend  test     min(s)  median(s)  stddev.  count
Before:
[ 0]   image    poppler   6.661      6.709    0.59%    6/6
After:
[ 0]   image    poppler   6.307      6.320    0.12%    5/6
This is a small speedup on the swfdec-youtube benchmark:
Before:
[ 0]  image  swfdec-youtube  5.789  5.806  0.20%  6/6
After:
[ 0]  image  swfdec-youtube  5.489  5.524  0.27%  6/6
I.e., approximately 5% faster.
The GNU assembler and its macro preprocessor are now used to generate
NEON-optimized functions from a common template. This automatically
takes care of nuisances like ensuring optimal alignment, dealing with
leading/trailing pixels, and doing prefetch.
Implementations for a lot of compositing functions are also added,
but not enabled.