pixman

mirror of https://salsa.debian.org/xorg-team/lib/pixman synced 2025-09-07 13:40:37 +00:00

Author	SHA1	Message	Date
Siarhei Siamashka	7c7b6f5de7	ARM: NEON optimized pixman_blt NEON unit has fast access to L1/L2 caches and even simple copy of memory buffers using NEON provides more than 1.5x performance improvement on ARM Cortex-A8.	2009-11-30 22:21:08 +02:00
Siarhei Siamashka	dce6e1bd68	test: support for testing pixbuf fast path functions in blitters-test	2009-11-27 15:50:26 +02:00
Benjamin Otte	0901ef41fb	Remove nonexistent function from header	2009-11-22 10:57:06 +01:00
Søren Sandmann Pedersen	c97b1e803f	Post-release version bump	2009-11-20 12:02:50 +01:00
Søren Sandmann Pedersen	5a7597f818	Pre-release version bump	2009-11-20 11:55:40 +01:00
Søren Sandmann Pedersen	95a08dece3	Remove stray semicolon from blitters-test.c Pointed out by scottmc2@gmail.com in bug 25137.	2009-11-20 11:18:58 +01:00
Siarhei Siamashka	6e2c7d54c6	C fast path function for 'over_n_1_0565' This function is needed to improve performance of xfce4 terminal when using bitmap fonts and running with 16bpp desktop. Some other applications may potentially benefit too. After applying this patch, top functions from Xorg process in oprofile log change from samples % image name symbol name 13296 29.1528 libpixman-1.so.0.17.1 combine_over_u 6452 14.1466 libpixman-1.so.0.17.1 fetch_scanline_r5g6b5 5516 12.0944 libpixman-1.so.0.17.1 fetch_scanline_a1 2273 4.9838 libpixman-1.so.0.17.1 store_scanline_r5g6b5 1741 3.8173 libpixman-1.so.0.17.1 fast_composite_add_1000_1000 1718 3.7669 libc-2.9.so memcpy to samples % image name symbol name 5594 14.7033 libpixman-1.so.0.17.1 fast_composite_over_n_1_0565 4323 11.3626 libc-2.9.so memcpy 3695 9.7119 libpixman-1.so.0.17.1 fast_composite_add_1000_1000 when scrolling text in terminal (reading man page).	2009-11-20 11:18:58 +01:00
Søren Sandmann Pedersen	282f5cf8b8	Round horizontal sampling points towards northwest. This is a similar change as the top/bottom one, but in this case the rounding is simpler because it's just always rounding down. Based on a patch by M Joonas Pihlaja.	2009-11-17 01:58:01 -05:00
Søren Sandmann Pedersen	f44431986f	Fix rounding of top and bottom coordinates. The rules for trap rasterization are that coordinates are rounded towards north-west. The pixman_sample_ceil() function is used to compute the first (top-most) sample row included in the trap, so when the input coordinate is already exactly on a sample row, no rounding should take place. On the other hand, pixman_sample_floor() is used to compute the final (bottom-most) sample row, so if the input is precisely on a sample row, it needs to be rounded down to the previous row. This commit fixes the rounding computation. The idea of the computation is like this: Floor operation that rounds exact matches down: First subtract pixman_fixed_e to make sure input already on a sample row gets rounded down. Then find out how many small steps are between the input and the first fraction. Then add those small steps to the first fraction. The ceil operation first adds (small_step + pixman_e), then runs a floor. This ensures that exact matches are not rounded off. Based on a patch by M Joonas Pihlaja.	2009-11-17 01:58:01 -05:00
Søren Sandmann Pedersen	3bea18e3ea	Fix slightly skewed sampling grid for antialiased traps The sampling grid is slightly skewed in the antialiased case. Consider the case where we have n = 8 bits of alpha. The small step is small_step = fixed_1 / 15 = 65536 / 15 = 4369 The first fraction is then frac_first = (small_step / 2) = 4369 / 2 = 2184 and the last fraction becomes frac_last = frac_first + (15 - 1) * small_step = 2184 + 14 * 4369 = 63350 which means the size of the last bit of the pixel is 65536 - 63350 = 2186 which is 2 bigger than the first fraction. This is not the end of the world, but it would be more correct to have 2185 and 2185, and we can accomplish that simply by making the first fraction half the big step instead of half the small step. If we ever move to coordinates with 8 fractional bits, the corresponding values become 8 and 10 out of 256, where 9 and 9 would be better. Similarly in the X direction.	2009-11-17 01:58:01 -05:00
Søren Sandmann Pedersen	98bb0a509f	Delete the flags field from fast_path_info_t	2009-11-17 00:47:49 -05:00
Søren Sandmann Pedersen	b7fb7e6c70	Eliminate NEED_PIXBUF flag. Instead introduce two new fake formats PIXMAN_pixbuf PIXMAN_rpixbuf and compute whether the source and mask have them in find_fast_path(). This led to some duplicate entries in the fast path tables that could then be removed.	2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen	542b79c30d	Compute src_format outside the fast path loop. Inside the loop all we have to do is check that the formats match.	2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen	12108ecbe4	Eliminate the NEED_COMPONENT_ALPHA flag. Instead introduce two new fake formats PIXMAN_a8r8g8b8_ca PIXMAN_a8b8g8r8_ca that are used in the fast path tables for this case.	2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen	4686d1f53b	Eliminate the NEED_SOLID_MASK flag This flag was used to indicate that the mask was solid while still allowing a specific format to be required. However, there is not actually any need for this because the fast paths all used _pixman_image_get_solid() which already allowed arbitrary formats. The one thing that had to be dealt with was component alpha. In addition to interpreting the presence of the NEED_COMPONENT_ALPHA flag, we now also interpret the absence of this flag as a requirement that the mask does not have component alpha. Siarhei Siamashka pointed out that the first version of this commit had a bug, in which a NEED_SOLID_MASK was accidentally not turned into a PIXMAN_solid in the ARM NEON implementation.	2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen	2ef8b394d7	Use the destination buffer directly in more cases instead of fetching. When the destination buffer is either a8r8g8b8 or x8r8g8b8, we can use it directly instead of fetching into a temporary buffer. When the format is x8r8g8b8, we require the operator to not make use of destination alpha, but when it is a8r8g8b8, there are no restrictions. This is approximately a 5% speedup on the poppler cairo benchmark: [ # ] backend test min(s) median(s) stddev. count Before: [ 0] image poppler 6.661 6.709 0.59% 6/6 After: [ 0] image poppler 6.307 6.320 0.12% 5/6	2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen	13f4e02b14	test: Move image_endian_swap() from blitters-test.c to utils.[ch]	2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen	24e203a8a8	test: Move random number generator from blitters/scaling-test to utils.[ch]	2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen	cc34554652	test: In scaling-test use the crc32 from utils.c	2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen	b465b8b79d	test: Move CRC32 code from blitters-test to new files utils.[ch]	2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen	56bd913401	test: Rename utils.[ch] to gtk-utils.[ch]	2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen	7be529f3bd	sse2: Add a fast path for OVER 8888 x 8 x 8888 This is a small speedup on the swfdec-youtube benchmark: Before: [ 0] image swfdec-youtube 5.789 5.806 0.20% 6/6 After: [ 0] image swfdec-youtube 5.489 5.524 0.27% 6/6 Ie., approximately 5% faster.	2009-11-13 15:57:48 -05:00
Siarhei Siamashka	abefe68ae2	ARM: enabled 'neon_composite_add_8000_8000' fast path	2009-11-11 18:12:58 +02:00
Siarhei Siamashka	635f389ff4	ARM: enabled 'neon_composite_add_8_8_8' fast path	2009-11-11 18:12:58 +02:00
Siarhei Siamashka	7e1bfed676	ARM: enabled 'neon_composite_add_n_8_8' fast path	2009-11-11 18:12:58 +02:00
Siarhei Siamashka	deeb67b13a	ARM: enabled 'neon_composite_over_8888_8888' fast path	2009-11-11 18:12:58 +02:00
Siarhei Siamashka	f449364849	ARM: enabled 'neon_composite_over_8888_0565' fast path	2009-11-11 18:12:57 +02:00
Siarhei Siamashka	2dfbf6c4a5	ARM: enabled 'neon_composite_over_8888_n_8888' fast path	2009-11-11 18:12:57 +02:00
Siarhei Siamashka	43824f98f1	ARM: enabled 'neon_composite_over_n_8_8888' fast path	2009-11-11 18:12:57 +02:00
Siarhei Siamashka	189d0d783c	ARM: enabled 'neon_composite_over_n_8_0565' fast path	2009-11-11 18:12:57 +02:00
Siarhei Siamashka	cccfc87f4f	ARM: enabled 'neon_composite_src_0888_0888' fast path	2009-11-11 18:12:57 +02:00
Siarhei Siamashka	e89b4f8105	ARM: enabled 'neon_composite_src_8888_0565' fast path	2009-11-11 18:12:56 +02:00
Siarhei Siamashka	2d54ed46fb	ARM: enabled 'neon_composite_src_0565_0565' fast path	2009-11-11 18:12:56 +02:00
Siarhei Siamashka	5d695cb86e	ARM: added 'bindings' for NEON assembly optimized functions These functions serve as 'adaptors', converting standard internal pixman fast path function arguments into arguments expected by assembly functions.	2009-11-11 18:12:56 +02:00
Siarhei Siamashka	dcfade3df9	ARM: enabled new implementation for pixman_fill_neon	2009-11-11 18:12:56 +02:00
Siarhei Siamashka	bcb4bc7932	ARM: introduction of the new framework for NEON fast path optimizations GNU assembler and its macro preprocessor is now used to generate NEON optimized functions from a common template. This automatically takes care of nuisances like ensuring optimal alignment, dealing with leading/trailing pixels, doing prefetch, etc. Implementations for a lot of compositing functions are also added, but not enabled.	2009-11-11 18:12:56 +02:00
Siarhei Siamashka	1eff0ab487	ARM: removed old ARM NEON optimizations	2009-11-11 18:12:55 +02:00
Søren Sandmann Pedersen	b8898d77d0	Define PIXMAN_USE_INTERNAL_API in pixman-private.h Instead of mucking around with CFLAGS in configure.ac, preventing users from setting their own CFLAGS, just define the PIXMAN_USE_INTERNAL_API and PIXMAN_DISABLE_DEPRECATED in pixman-private.h	2009-11-07 14:47:22 -05:00
Søren Sandmann Pedersen	67bf739187	Include <inttypes.h> when compiled with HP's C compiler. Fixes bug 23169.	2009-10-27 09:11:28 -04:00
Siarhei Siamashka	384fb88b90	C fast path function for 'over_n_1_8888' This function is needed to improve performance of xfce4 terminal. Some other applications may potentially benefit too.	2009-10-27 12:32:04 +02:00
Siarhei Siamashka	a2985da947	C fast path function for 'add_1000_1000' This function is needed to improve performance of xfce4 terminal. Some other applications may potentially benefit too.	2009-10-27 12:31:59 +02:00
Siarhei Siamashka	5f429e4510	blitters-test updated to also randomly generate mask_x/mask_y	2009-10-27 12:31:55 +02:00
André Tupinambá	0d5562747c	Add fast path scaled, bilinear fetcher. This adds a bilinear fetcher for the case where the image has a scaled transformation, does not repeat, and the format {ax}8r8g8b8. Results for the swfdec-youtube benchmark Before: [ # ] backend test min(s) median(s) stddev. count [ 0] image swfdec-youtube 7.841 7.915 0.72% 6/6 After: [ # ] backend test min(s) median(s) stddev. count [ 0] image swfdec-youtube 6.677 6.780 0.94% 6/6 These results were measured on a faster machine than the ones in the previous commit, so the numbers are not comparable. Signed-off-by: Søren Sandmann Pedersen <sandmann@redhat.com>	2009-10-26 13:04:21 -04:00
André Tupinambá	88323c5abe	Speed up bilinear interpolation. Speed up bilinear interpolation by processing more than one component at a time on 64 bit architectures, and by precomputing the dist{ixiy} products on 32 bit architectures. Previously bilinear interpolation for one pixel would take 24 multiplications. With this improvement it takes 12 on 64 bit, and 20 on 32 bit. This is a small but consistent speedup on the swfdec-youtube benchmark: [ # ] backend test min(s) median(s) stddev. count Before: [ 0] image swfdec-youtube 18.010 18.020 0.09% 4/5 After: [ 0] image swfdec-youtube 17.488 17.584 0.22% 5/6 Signed-off-by: Søren Sandmann Pedersen <sandmann@redhat.com>	2009-10-26 13:04:21 -04:00
Søren Sandmann Pedersen	f0c157f888	Extend scaling-test to also test bilinear filtering.	2009-10-26 13:04:21 -04:00
Jeremy Huddleston	eab882ef38	This is not a GNU project, so declare it foreign. On Wed, 2009-10-21 at 13:36 +1000, Peter Hutterer wrote: > On Tue, Oct 20, 2009 at 08:23:55PM -0700, Jeremy Huddleston wrote: > > I noticed an INSTALL file in xlsclients and libXvMC today, and it > > was quite annoying to work around since 'autoreconf -fvi' replaces > > it and git wants to commit it. Should these files even be in git? > > Can I nuke them for the betterment of humanity and since they get > > created by autoreconf anyways? > > See https://bugs.freedesktop.org/show_bug.cgi?id=24206 As an interim measure, replace AM_INIT_AUTOMAKE([dist-bzip2]) with AM_INIT_AUTOMAKE([foreign dist-bzip2]). This will prevent the generation of the INSTALL file. It is also part of the 24206 solution. Signed-off-by: Jeremy Huddleston <jeremyhu@freedesktop.org>	2009-10-21 12:47:27 -07:00
Søren Sandmann Pedersen	dc46ad274a	Make walk_region_internal() use 32 bit dimensions	2009-10-19 20:32:37 -04:00
Søren Sandmann Pedersen	bb3698d479	Make pixman_compute_composite_region32() use 32 bit dimensions	2009-10-19 20:31:54 -04:00
Søren Sandmann Pedersen	895c281c40	Change prototype of _pixman_walk_composite_region from int16_t to int32_t	2009-10-19 20:30:22 -04:00
Søren Sandmann Pedersen	9cd470665b	Remove unused color_table and color_table_size fields	2009-10-19 20:27:36 -04:00
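
The sampling-grid arithmetic described in commit 3bea18e3ea can be checked with a small standalone sketch. This is not pixman's code: the type `grid_gaps_t` and the functions `small_step`, `big_step`, `gaps_for`, `gaps_old`, `gaps_new` and `check_commit_numbers` are hypothetical names chosen for illustration. The sketch assumes the "big step" is whatever remains of the pixel after the `rows - 1` small steps, which is consistent with the 2185/2185 and 9/9 figures quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two gaps at the edges of one pixel. */
typedef struct {
    int32_t first_gap; /* pixel top to the first sample row */
    int32_t last_gap;  /* last sample row to the pixel bottom */
} grid_gaps_t;

/* one:  fixed-point value of 1.0 (65536 with 16 fractional bits)
 * rows: sample rows per pixel (15 in the commit's 8-bit alpha example) */
static inline int32_t small_step(int32_t one, int32_t rows)
{
    return one / rows;
}

/* Assumption: the big step is what is left after rows - 1 small steps. */
static inline int32_t big_step(int32_t one, int32_t rows)
{
    return one - (rows - 1) * small_step(one, rows);
}

static inline grid_gaps_t gaps_for(int32_t one, int32_t rows, int32_t frac_first)
{
    int32_t frac_last = frac_first + (rows - 1) * small_step(one, rows);
    grid_gaps_t g = { frac_first, one - frac_last };
    return g;
}

/* Before the fix: first fraction is half the small step. */
static inline grid_gaps_t gaps_old(int32_t one, int32_t rows)
{
    return gaps_for(one, rows, small_step(one, rows) / 2);
}

/* After the fix: first fraction is half the big step. */
static inline grid_gaps_t gaps_new(int32_t one, int32_t rows)
{
    return gaps_for(one, rows, big_step(one, rows) / 2);
}

/* Reproduces the numbers quoted in the commit message. */
static inline void check_commit_numbers(void)
{
    assert(gaps_old(65536, 15).first_gap == 2184);
    assert(gaps_old(65536, 15).last_gap == 2186);
    assert(gaps_new(65536, 15).first_gap == 2185);
    assert(gaps_new(65536, 15).last_gap == 2185);
}
```

Calling `check_commit_numbers()` exercises the 16-bit case; passing `one = 256` to `gaps_old` and `gaps_new` gives the 8-fractional-bit figures the message mentions (8 and 10 before, 9 and 9 after).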

1635 Commits