pixman

mirror of https://salsa.debian.org/xorg-team/lib/pixman synced 2025-09-01 02:10:49 +00:00

Author	SHA1	Message	Date
Søren Sandmann Pedersen	697cfe1537	Post-release version bump to 0.23.9	2011-10-29 05:51:54 -04:00
Søren Sandmann Pedersen	a0f1b56581	Pre-release version bump to 0.23.8	2011-10-29 05:33:44 -04:00
Søren Sandmann Pedersen	498138c293	Fix use of uninitialized fields reported by valgrind In pixman-noop.c and pixman-sse2.c, we are accessing image->bits.width/height without first making sure the image is a bits image. The warning is harmless because we never act on this information without checking that the image is a8r8g8b8, but valgrind does warn about it. In pixman-noop.c, just reorder the clauses in the if statement; in pixman-sse2.c require images to have the FAST_PATH_BITS_IMAGE flag set.	2011-10-25 12:00:19 -04:00
Julien Cristau	40a04cb1b6	Upload to experimental	2011-10-22 11:09:17 +02:00
Søren Sandmann Pedersen	6131707e8f	Merge branch 'gradients'	2011-10-20 09:13:12 -04:00
Rico Tzschichholz	bdfdaaff5d	Bump changelogs.	2011-10-19 17:44:08 +02:00
Rico Tzschichholz	bccb9afc56	Merge branch 'upstream-experimental' into debian-experimental	2011-10-19 17:24:45 +02:00
Taekyun Kim	3d4d705d2f	ARM: NEON: Fix assembly typo error in src_n_8_8888 Binutils 2.21 does not complain about a missing comma between the ARM register and the alignment specifier in vld/vst instructions, but the omission causes a build error on binutils 2.20.	2011-10-18 21:50:18 +09:00
Taekyun Kim	19f118f41f	ARM: NEON: Standard fast path src_n_8_8 Performance numbers of before/after on cortex-a8 @ 1GHz - before L1: 28.05 L2: 28.26 M: 26.97 ( 4.48%) HT: 19.79 VT: 19.14 R: 17.61 RT: 9.88 ( 101Kops/s) - after L1:1430.28 L2:1252.10 M:421.93 ( 75.48%) HT:170.16 VT:138.03 R:145.86 RT: 35.51 ( 255Kops/s)	2011-10-18 13:16:50 +09:00
Taekyun Kim	4db9e2bc13	ARM: NEON: Standard fast path src_n_8_8888 Performance numbers of before/after on cortex-a8 @ 1GHz - before L1: 32.39 L2: 31.79 M: 30.84 ( 13.77%) HT: 21.58 VT: 19.75 R: 18.83 RT: 10.46 ( 106Kops/s) - after L1: 516.25 L2: 372.00 M:193.49 ( 85.59%) HT:136.93 VT:109.10 R:104.48 RT: 34.77 ( 253Kops/s)	2011-10-18 13:16:48 +09:00
Taekyun Kim	26659de6cd	ARM: NEON: Instruction scheduling of bilinear over_8888_8_8888 Instructions are reordered to eliminate pipeline stalls and get better memory access. Performance of before/after on cortex-a8 @ 1GHz << 2000 x 2000 with scale factor close to 1.x >> before : 40.53 Mpix/s after : 50.76 Mpix/s	2011-10-18 13:16:42 +09:00
Taekyun Kim	4481920f40	ARM: NEON: Instruction scheduling of bilinear over_8888_8888 Instructions are reordered to eliminate pipeline stalls and get better memory access. Performance of before/after on cortex-a8 @ 1GHz << 2000 x 2000 with scale factor close to 1.x >> before : 50.43 Mpix/s after : 61.09 Mpix/s	2011-10-18 13:14:28 +09:00
Taekyun Kim	1cd916f3a5	ARM: NEON: Replace old bilinear scanline generator with new template Bilinear scanline functions in pixman-arm-neon-asm-bilinear.S can be replaced with new template just by wrapping existing macros.	2011-10-18 13:00:10 +09:00
Taekyun Kim	6682b2b359	ARM: NEON: Bilinear macro template for instruction scheduling This macro template takes 6 code blocks. 1. process_last_pixel 2. process_two_pixels 3. process_four_pixels 4. process_pixblock_head 5. process_pixblock_tail 6. process_pixblock_tail_head process_last_pixel does not need to update the horizontal weight. This is done by the template. The two- and four-pixel code blocks should update the horizontal weight themselves. The head/tail/tail_head blocks make up the unrolled core loop. You can apply instruction scheduling to the tail_head blocks. You can also specify the size of the pixel block. Supported sizes are 4 and 8. If you want to use a mask, give the BILINEAR_FLAG_USE_MASK flag to the template, then you can use the register MASK. When using d8~d15 registers, give BILINEAR_FLAG_USE_ALL_NEON_REGS to make sure registers are properly saved on the stack and later restored.	2011-10-18 13:00:06 +09:00
Taekyun Kim	b5e4355fa4	ARM: NEON: Some cleanup of bilinear scanline functions Use STRIDE and initial horizontal weight update is done before entering interpolation loop. Cache preload for mask and dst.	2011-10-18 13:00:02 +09:00
Søren Sandmann Pedersen	ec7c9c2b68	Simplify gradient_walker_reset() The code that searches for the closest color stop to the given position is duplicated across the various repeat modes. Replace the switch with two if/else constructions, and put the search code between them.	2011-10-15 10:50:20 -04:00
Søren Sandmann Pedersen	2d0da8ab8d	Use sentinels instead of special casing first and last stops When storing the gradient stops internally, allocate two more stops, one before the beginning of the stop list and one after the end. Initialize those stops based on the repeat property of the gradient. This allows gradient_walker_reset() to be simplified because it can now simply pick the two closest stops to the position without special casing the first and last stops.	2011-10-15 10:50:20 -04:00
Søren Sandmann Pedersen	84d6ca7c89	gradient walker: Correct types and fix formatting The type of pos in gradient_walker_reset() and gradient_walker_pixel() is pixman_fixed_48_16_t and not pixman_fixed_32_32. The types of the positions in the walker struct are pixman_fixed_t and not int32_t, and need_reset is a boolean, not an integer. The spread field should be called repeat and have the type pixman_repeat_t. Also fix some formatting issues, make gradient_walker_reset() static, and delete the pointless PIXMAN_GRADIENT_WALKER_NEED_RESET() macro.	2011-10-15 10:50:14 -04:00
Søren Sandmann Pedersen	ace225b53d	Add stable release / development snapshot to draft release notes This will hopefully serve as a reminder to me that I should put this information in the release notes.	2011-10-11 16:12:32 -04:00
Søren Sandmann Pedersen	bb7142d361	Post-release version bump to 0.23.7	2011-10-11 06:10:39 -04:00
Søren Sandmann Pedersen	e20ac40bd3	Pre-release version bump to 0.23.6	2011-10-11 06:00:51 -04:00
Taekyun Kim	a43946a51f	Simple repeat: Extend too short source scanlines into temporary buffer Scanlines that are too short can cause repeat-handling overhead, and because optimized pixman composite functions usually process a bunch of pixels in a single loop iteration, it might be beneficial to pre-extend source scanlines. The temporary buffers will usually reside in cache, so accessing them should be quite efficient.	2011-10-10 12:18:28 +09:00
Taekyun Kim	eaff774a3f	Simple repeat fast path We can implement simple repeat by stitching existing fast path functions. First lookup COVER_CLIP function for given input and then stitch horizontally using the function.	2011-10-10 12:18:25 +09:00
Taekyun Kim	a258e33fcb	Move _pixman_lookup_composite_function() to pixman-utils.c	2011-10-10 12:18:23 +09:00
Søren Sandmann Pedersen	fc62785aab	Add src, mask, and dest flags to the composite args struct. These flags are useful in the various compositing routines, and the flags stored in the image structs are missing some bits of information that can only be computed when pixman_image_composite() is called.	2011-10-10 12:18:21 +09:00
Taekyun Kim	fa6523d13a	Add new fast path flag FAST_PATH_BITS_IMAGE This fast path flag indicates that the image is a bits image.	2011-10-10 12:18:18 +09:00
Taekyun Kim	7272e2fcd2	init/fini functions for pixman_image_t pixman_image_t itself can be on stack or heap. So segregating init/fini from create/unref can be useful when we want to use pixman_image_t on stack or other memory.	2011-10-10 12:18:14 +09:00
Taekyun Kim	4dcf1b0107	sse2: Bilinear scaled over_8888_8_8888	2011-10-10 12:13:20 +09:00
Taekyun Kim	81050f2784	sse2: Bilinear scaled over_8888_8888	2011-10-10 12:13:17 +09:00
Taekyun Kim	d67c0b883d	sse2: Macros for assembling bilinear interpolation code fractions Primitive bilinear interpolation code is reusable to implement other bilinear functions. BILINEAR_DECLARE_VARIABLES - Declare variables needed to interpolate src pixels. BILINEAR_INTERPOLATE_ONE_PIXEL - Interpolate one pixel and advance to next pixel BILINEAR_SKIP_ONE_PIXEL - Skip interpolation and just advance to next pixel This is useful for skipping zero mask	2011-10-10 12:12:47 +09:00
Matt Turner	741eb8462c	Correct the minimum gcc version needed for iwmmxt Spotted by Søren Sandmann. Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-10-06 17:56:09 -04:00
Matt Turner	0a34277180	Make sure iwMMXt is only detected on ARM iwMMXt is incorrectly detected on x86 and amd64. This happens because the test uses standard _mm_* intrinsic functions which it compiles with -march=iwmmxt, but when the user has set CFLAGS=-march=k8 for instance, no error is generated from -march=iwmmxt, even though it's not a valid flag on x86/amd64. Passing CFLAGS=-march=native does not override the -march=iwmmxt flag though, which is why it wasn't noticed before. So, just #error out in the test if the __arm__ preprocessor directive isn't defined. Fixes https://bugs.gentoo.org/show_bug.cgi?id=385179 Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-10-06 17:52:12 -04:00
Søren Sandmann Pedersen	879b7c21e4	Don't include stdint.h in scaling-helpers-test. Fixes bug 41257.	2011-09-28 09:16:23 -04:00
Benjamin Otte	01c2dcbe69	build: replace @VAR@ with $(VAR) in makefiles	2011-09-28 01:48:02 +02:00
Benjamin Otte	100f16eae9	tests: Add PNG_CFLAGS/LIBS to tests PNG flags were accidentally included by gdk-pixbuf. This has been fixed recently, so we need to make sure to include it ourselves.	2011-09-28 01:48:01 +02:00
Matt Turner	d1313febbe	mmx: optimize unaligned 64-bit ARM/iwmmxt loads Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-27 13:13:22 -04:00
Matt Turner	7ab94c5f99	mmx: compile on ARM for iwmmxt optimizations Check in configure for at least gcc-4.6, since gcc-4.7 (and hopefully 4.6) will be the earliest version capable of compiling the _mm_* intrinsics on ARM/iwmmxt. Even for suitable compiler versions I use _mm_srli_si64 which is known to cause unpatched compilers to fail. Select iwmmxt at runtime only after NEON, since we expect the NEON optimizations to be more capable and faster than iwmmxt. Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-27 13:13:15 -04:00
Matt Turner	f66887d9ea	mmx: prepare pixman-mmx.c to be compiled for ARM/iwmmxt Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-27 13:13:07 -04:00
Matt Turner	7c6d5d1999	mmx: fix unaligned accesses Simply return *p in the unaligned access functions, since alignment constraints are very relaxed on x86 and this allows us to generate identical code as before. Tested with the test suite, lowlevel-blit-test, and cairo-perf-trace on ARM and Alpha with no unaligned accesses found. Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-27 13:13:01 -04:00
Matt Turner	5d98abb14c	mmx: wrap x86/MMX inline assembly in ifdef USE_X86_MMX Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-27 13:12:55 -04:00
Matt Turner	02c1f1a022	mmx: rename USE_MMX to USE_X86_MMX This will make upcoming ARM usage of pixman-mmx.c unambiguous. Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-27 13:12:50 -04:00
Matt Turner	57fd8c37aa	mmx: convert while (w) to if (w) when possible gcc isn't able to see that w is no greater than 1, so it generates unnecessary loop instructions with while (w). Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-26 11:30:05 -04:00
Matt Turner	38a7aae1d9	mmx: fix formats in commented code b8r8g8 apparently stopped being supported sometime after this code was commented out. Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-26 11:29:58 -04:00
Matt Turner	b6b77488a0	lowlevel-blt: add over_x888_8_8888 Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-26 11:29:51 -04:00
Siarhei Siamashka	9126f36b96	BILINEAR->NEAREST filter optimization for simple rotation and translation Simple rotation and translation are the additional cases when BILINEAR filter can be safely reduced to NEAREST.	2011-09-21 18:55:25 -04:00
Søren Sandmann Pedersen	ad5c6bbb36	Strength-reduce BILINEAR filter to NEAREST filter for identity transforms An image with a bilinear filter and an identity transform is equivalent to one with a nearest filter, so there is no reason the standard fast paths shouldn't be usable. But because a BILINEAR filter samples a 2x2 pixel block in the source image, FAST_PATH_SAMPLES_COVER_CLIP can't be set in the case where the source area is the entire image, because some compositing operations might then read pixels outside the image. This patch fixes the problem by splitting the FAST_PATH_SAMPLES_COVER_CLIP flag into two separate flags FAST_PATH_SAMPLES_COVER_CLIP_NEAREST and FAST_PATH_SAMPLES_COVER_CLIP_BILINEAR that indicate that the clip covers the samples taking into account NEAREST/BILINEAR filters respectively. All the existing compositing operations that require FAST_PATH_SAMPLES_COVER_CLIP then have their flags modified to pick either COVER_CLIP_NEAREST or COVER_CLIP_BILINEAR depending on which filter they depend on. In compute_image_info() both COVER_CLIP_NEAREST and COVER_CLIP_BILINEAR can be set depending on how much room there is around the clip rectangle. Finally, images with an identity transform and a bilinear filter get FAST_PATH_NEAREST_FILTER set as well as FAST_PATH_BILINEAR_FILTER. Performance measurements with render_bench against Xephyr: Before (ROUND 1): Test Xrender doing non-scaled Over blends: 5.720 sec.; Test Xrender (offscreen) doing non-scaled Over blends: 5.149 sec.; Test Imlib2 doing non-scaled Over blends: 6.237 sec. After (ROUND 1): Test Xrender doing non-scaled Over blends: 4.947 sec.; Test Xrender (offscreen) doing non-scaled Over blends: 4.487 sec.; Test Imlib2 doing non-scaled Over blends: 6.235 sec.	2011-09-21 18:55:25 -04:00
Søren Sandmann Pedersen	eb2e7ed81b	test: Occasionally use a BILINEAR filter in blitters-test To test that reductions of BILINEAR->NEAREST for identity transformations happen correctly, occasionally use a bilinear filter in blitters test.	2011-09-21 18:55:25 -04:00
Siarhei Siamashka	2a9f88430e	test: better coverage for BILINEAR->NEAREST filter optimization The upcoming optimization which is going to be able to replace BILINEAR filter with NEAREST where appropriate needs to analyze the transformation matrix and not to make any mistakes. The changes to affine-test include: 1. Higher chance of using the same scale factor for x and y axes. This can help to stress some special cases (for example the case when both x and y scale factors are integer). The same applies to x/y translation. 2. Introduced a small chance for "corrupting" transformation matrix by flipping random bits. This supposedly can help to identify the cases when some of the fast paths or other code logic is wrongly activated due to insufficient checks.	2011-09-21 18:55:10 -04:00
Søren Sandmann Pedersen	054922e2fc	Eliminate compute_sample_extents() function In analyze_extents(), instead of calling compute_sample_extents() call compute_transformed_extents() and inline the remaining part of compute_sample_extents(). The upcoming bilinear->nearest optimization will do something different with these two pieces of code.	2011-09-21 18:53:03 -04:00
Søren Sandmann Pedersen	577b6c46fd	Split computation of sample area into own function compute_sample_extents() has two parts: one that computes the transformed extents, and one that checks whether the computed extents fit within the 16.16 coordinate space. Split the first part into its own function compute_transformed_extents().	2011-09-21 18:52:18 -04:00
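
Commit 2d0da8ab8d above describes storing one extra gradient stop before the first user stop and one after the last, so that the lookup can pick the two closest stops without special-casing the ends. The following is a minimal sketch of that idea, not pixman's code: stop_t, stops_with_sentinels() and sample_pad() are hypothetical names, positions and colors are plain doubles rather than pixman's fixed-point types, only a pad-style repeat is modeled, and the user stops are assumed to be sorted by position.

```c
#include <assert.h>
#include <float.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration type; pixman's own structs differ. */
typedef struct
{
    double x;      /* stop position */
    double color;  /* a single channel, for brevity */
} stop_t;

/* Copy n user stops (sorted by x) into a new array of n + 2 entries,
 * with one sentinel before the first stop and one after the last.
 * Pad-style repeat: each sentinel takes the color of its neighbor.
 * The caller frees the result. */
static stop_t *
stops_with_sentinels (const stop_t *stops, int n)
{
    stop_t *s;

    assert (n > 0);
    s = malloc ((size_t) (n + 2) * sizeof (stop_t));
    if (!s)
        return NULL;

    memcpy (s + 1, stops, (size_t) n * sizeof (stop_t));
    s[0].x = -DBL_MAX;
    s[0].color = stops[0].color;
    s[n + 1].x = DBL_MAX;
    s[n + 1].color = stops[n - 1].color;
    return s;
}

/* Sample the gradient at pos, which must be less than DBL_MAX. */
static double
sample_pad (const stop_t *s, double pos)
{
    const stop_t *left, *right;
    int i = 1;

    /* The trailing sentinel stops the scan, so no bounds check is needed. */
    while (pos >= s[i].x)
        i++;

    left = &s[i - 1];
    right = &s[i];

    /* Segments that touch a sentinel are flat under pad repeat; returning
     * early also avoids overflow in right->x - left->x for those segments. */
    if (left->color == right->color)
        return left->color;

    return left->color +
        (pos - left->x) / (right->x - left->x) * (right->color - left->color);
}
```

With stops at 0.0 and 1.0, positions below 0.0 or above 1.0 fall into a sentinel segment and take the nearest user color, while positions in between are interpolated by the same code path. Other repeat modes would initialize the two sentinels differently.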

2096 Commits