pixman

mirror of https://salsa.debian.org/xorg-team/lib/pixman synced 2025-09-06 12:25:16 +00:00

Author	SHA1	Message	Date
Søren Sandmann Pedersen	87eea99e44	Pre-release version bump to 0.32.6	2014-07-05 18:55:43 -04:00
Siarhei Siamashka	9f18ea3483	configure.ac: Check if the compiler supports GCC vector extensions The Intel Compiler 14.0.0 claims version GCC 4.7.3 compatibility via __GNUC__/__GNUC_MINOR__ macros, but does not provide the same level of GCC vector extensions support as the original GCC compiler: http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html Which results in the following compilation failure: In file included from ../test/utils.h(7), from ../test/utils.c(3): ../test/utils-prng.h(138): error: expression must have integral type uint32x4 e = x->a - ((x->b << 27) + (x->b >> (32 - 27))); ^ The problem is fixed by doing a special check in configure for this feature.	2014-07-04 20:52:59 -04:00
Søren Sandmann	50d7b5fa8e	create_bits(): Cast the result of height * stride to size_t In create_bits() both height and stride are ints, so the result is also an int, which will overflow if height or stride are big enough and size_t is bigger than int. This patch simply casts height to size_t to prevent these overflows, which prevents the crash in: https://bugzilla.redhat.com/show_bug.cgi?id=972647 It's not even close to fixing the full problem of supporting big images in pixman. See also https://bugs.freedesktop.org/show_bug.cgi?id=69014	2014-07-04 20:50:58 -04:00
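The overflow described in the `create_bits()` commit above can be sketched as follows. This is a minimal illustration, not pixman's actual code: `buffer_size` is a hypothetical helper, and the point it shows is only that one operand has to be widened before the multiplication takes place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of pixman): size in bytes of a bitmap
 * with `height` rows of `stride` bytes each. */
static size_t
buffer_size (int height, int stride)
{
    assert (height >= 0 && stride >= 0);

    /* Casting an operand makes the multiplication happen in size_t.
     * Writing (size_t) (height * stride) instead would cast too late:
     * the product would already have been computed as an int. */
    return (size_t) height * (size_t) stride;
}
```

On a platform where `size_t` is wider than `int`, `buffer_size (65536, 65536)` yields the full product instead of an overflowed `int`.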
Nemanja Lukic	6d2cf40166	MIPS: Fix exported symbols in public API.	2014-07-03 13:35:21 -04:00
Nemanja Lukic	c42824ebb5	MIPS: Fix exported symbols in public API.	2014-07-03 13:34:53 -04:00
Søren Sandmann Pedersen	5a2edb3f2c	test: Rearrange tests in order of increasing runtime Making short tests run first is convenient to catch obvious bugs early.	2014-06-28 19:24:27 -04:00
Søren Sandmann Pedersen	9cd283b2eb	pixman-gradient-walker: Make left_x and right_x 64 bit variables The variables left_x, and right_x in gradient_walker_reset() are computed from pos, which is a 64 bit quantity, so to avoid overflows, these variables must be 64 bit as well. Similarly, the left_x and right_x that are stored in pixman_gradient_walker_t need to be 64 bit as well; otherwise, pixman_gradient_walker_pixel() will call reset too often. This fixes the radial-invalid test, which was generating 'invalid' floating point exceptions when the overflows caused color values to be outside of [0, 255].	2014-05-15 13:29:58 -04:00
Søren Sandmann Pedersen	f5f5dbbbc6	test: Add radial-invalid test program This program demonstrates a bug in gradient walker, where some integer overflows cause colors outside the range [0, 255] to be generated, which in turn causes 'invalid' floating point exceptions when those colors are converted to uint8_t. The bug was first reported by Owen Taylor on the #cairo IRC channel.	2014-05-15 13:29:38 -04:00
Ben Avison	91f32ce961	ARMv6: Add fast path for src_x888_0565 Benchmark results, "before" is upstream/master `5f661ee719`, and "after" contains this patch on top. lowlevel-blt-bench, src_8888_0565, 100 iterations: Before After Mean StdDev Mean StdDev Confidence Change L1 25.9 0.20 115.6 0.70 100.00% +347.1% L2 14.4 0.23 52.7 3.48 100.00% +265.0% M 14.1 0.01 79.8 0.17 100.00% +465.9% HT 10.2 0.03 32.9 0.31 100.00% +221.2% VT 9.8 0.03 29.8 0.25 100.00% +203.4% R 9.4 0.03 27.8 0.18 100.00% +194.7% RT 4.6 0.04 10.9 0.29 100.00% +135.9% At most 19 outliers rejected per test per set. cairo-perf-trace with trimmed traces results were indifferent. A system-wide perf_3.10 profile on Raspbian shows significant differences in the X server CPU usage. The following were measured from a 130x62 char lxterminal running 'dmesg' every 0.5 seconds for roughly 30 seconds. These profiles are libpixman.so symbols only. Before: Samples: 63K of event 'cpu-clock', Event count (approx.): 2941348112, DSO: libpixman-1.so.0.33.1 37.77% Xorg [.] fast_fetch_r5g6b5 14.39% Xorg [.] pixman_composite_over_n_8_8888_asm_armv6 8.51% Xorg [.] fast_write_back_r5g6b5 7.38% Xorg [.] pixman_composite_src_8888_8888_asm_armv6 4.39% Xorg [.] pixman_composite_add_8_8_asm_armv6 3.69% Xorg [.] pixman_composite_src_n_8888_asm_armv6 2.53% Xorg [.] _pixman_image_validate 2.35% Xorg [.] pixman_image_composite32 After: Samples: 31K of event 'cpu-clock', Event count (approx.): 3619782704, DSO: libpixman-1.so.0.33.1 22.36% Xorg [.] pixman_composite_over_n_8_8888_asm_armv6 13.59% Xorg [.] pixman_composite_src_x888_0565_asm_armv6 12.75% Xorg [.] pixman_composite_src_8888_8888_asm_armv6 6.79% Xorg [.] pixman_composite_add_8_8_asm_armv6 5.95% Xorg [.] pixman_composite_src_n_8888_asm_armv6 4.12% Xorg [.] pixman_image_composite32 3.69% Xorg [.] _pixman_image_validate 3.65% Xorg [.] _pixman_bits_image_setup_accessors Before, fast_fetch_r5g6b5 + fast_write_back_r5g6b5 took 46% of the samples in libpixman, and probably incurred some memcpy() load, too. After, pixman_composite_src_x888_0565_asm_armv6 takes 14%. Note, that the sample counts are very different before/after, as less time is spent in Pixman and running time is not exactly the same. Furthermore, in the above test, the CPU idle function was sampled 9% before, and 15% after. v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Re-benchmarked on Raspberry Pi, commit message.	2014-05-01 15:11:42 -04:00
Pekka Paalanen	5f661ee719	ARM: use pixman_asm_function in internal headers The two ARM headers contained open-coded copies of pixman_asm_function, replace these. Since it seems customary that ARM headers do not use CPP include guards, rely on the .S files to #include "pixman-arm-asm.h" first. They all do now. v2: Fix a build failure on rpi by adding one #include.	2014-04-21 20:38:09 -04:00
Ben Avison	ab587b444c	ARMv6: Add fast path for in_reverse_8888_8888 Benchmark results, "before" is the patch * upstream/master `4b76bbfda6` + ARMv6: Support for very variable-hungry composite operations + ARMv6: Add fast path for over_n_8888_8888_ca and "after" contains the additional patches on top: + ARMv6: Add fast path flag to force no preload of destination buffer + ARMv6: Add fast path for in_reverse_8888_8888 (this patch) lowlevel-blt-bench, in_reverse_8888_8888, 100 iterations: Before After Mean StdDev Mean StdDev Confidence Change L1 21.1 0.07 32.3 0.08 100.00% +52.9% L2 11.6 0.29 18.0 0.52 100.00% +54.4% M 10.5 0.01 16.1 0.03 100.00% +54.1% HT 8.2 0.02 12.0 0.04 100.00% +45.9% VT 8.1 0.02 11.7 0.04 100.00% +44.5% R 8.1 0.02 11.3 0.04 100.00% +39.7% RT 4.8 0.04 6.1 0.09 100.00% +27.3% At most 12 outliers rejected per test per set. cairo-perf-trace with trimmed traces, 30 iterations: Before After Mean StdDev Mean StdDev Confidence Change t-firefox-paintball.trace 18.0 0.01 14.1 0.01 100.00% +27.4% t-firefox-chalkboard.trace 36.7 0.03 36.0 0.02 100.00% +1.9% t-firefox-canvas-alpha.trace 20.7 0.22 20.3 0.22 100.00% +1.9% t-swfdec-youtube.trace 7.8 0.03 7.8 0.03 100.00% +0.9% t-firefox-talos-gfx.trace 25.8 0.44 25.6 0.29 93.87% +0.7% (insignificant) t-firefox-talos-svg.trace 20.6 0.04 20.6 0.03 100.00% +0.2% t-firefox-fishbowl.trace 21.2 0.04 21.1 0.02 100.00% +0.2% t-xfce4-terminal-a1.trace 4.8 0.01 4.8 0.01 98.85% +0.2% (insignificant) t-swfdec-giant-steps.trace 14.9 0.03 14.9 0.02 99.99% +0.2% t-poppler-reseau.trace 22.4 0.11 22.4 0.08 86.52% +0.2% (insignificant) t-gnome-system-monitor.trace 17.3 0.03 17.2 0.03 99.74% +0.2% t-firefox-scrolling.trace 24.8 0.12 24.8 0.11 70.15% +0.1% (insignificant) t-firefox-particles.trace 27.5 0.18 27.5 0.21 48.33% +0.1% (insignificant) t-grads-heat-map.trace 4.4 0.04 4.4 0.04 16.61% +0.0% (insignificant) t-firefox-fishtank.trace 13.2 0.01 13.2 0.01 7.64% +0.0% (insignificant) t-firefox-canvas.trace 18.0 0.05 18.0 0.05 1.31% -0.0% (insignificant) t-midori-zoomed.trace 8.0 0.01 8.0 0.01 78.22% -0.0% (insignificant) t-firefox-planet-gnome.trace 10.9 0.02 10.9 0.02 64.81% -0.0% (insignificant) t-gvim.trace 33.2 0.21 33.2 0.18 38.61% -0.1% (insignificant) t-firefox-canvas-swscroll.trace 32.2 0.09 32.2 0.11 73.17% -0.1% (insignificant) t-firefox-asteroids.trace 11.1 0.01 11.1 0.01 100.00% -0.2% t-evolution.trace 13.0 0.05 13.0 0.05 91.99% -0.2% (insignificant) t-gnome-terminal-vim.trace 19.9 0.14 20.0 0.14 97.38% -0.4% (insignificant) t-poppler.trace 9.8 0.06 9.8 0.04 99.91% -0.5% t-chromium-tabs.trace 4.9 0.02 4.9 0.02 100.00% -0.6% At most 6 outliers rejected per test per set. Cairo perf reports the running time, but the change is computed for operations per second instead (inverse of running time). Confidence is based on Welch's t-test. Absolute changes less than 1% can be accounted as measurement errors, even if statistically significant. There was a question of why FLAG_NO_PRELOAD_DST is used. It makes lowlevel-blt-bench results worse except for L1, but improves some Cairo trace benchmarks. "Ben Avison" <bavison@riscosopen.org> wrote: > The thing with the lowlevel-blt-bench benchmarks for the more > sophisticated composite types (as a general rule, anything that involves > branches at the per-pixel level) is that they are only profiling the case > where you have mid-level alpha values in the source/mask/destination. > Real-world images typically have a disproportionate number of fully > opaque and fully transparent pixels, which is why when there's a > discrepancy between which implementation performs best with cairo-perf > trace versus lowlevel-blt-bench, I usually favour the Cairo winner. > > The results of removing FLAG_NO_PRELOAD_DST (in other words, adding > preload of the destination buffer) are easy to explain in the > lowlevel-blt-bench results. In the L1 case, the destination buffer is > already in the L1 cache, so adding the preloads is simply adding extra > instruction cycles that have no effect on memory operations. The "in" > compositing operator depends upon the alpha of both source and > destination, so if you use uniform mid-alpha, then you actually do need > to read your destination pixels, so you benefit from preloading them. But > for fully opaque or fully transparent source pixels, you don't need to > read the corresponding destination pixel - it'll either be left alone or > overwritten. Since the ARM11 doesn't use write-allocate cacheing, both of > these cases avoid both the time taken to load the extra cachelines, as > well as increasing the efficiency of the cache for other data. If you > examine the source images being used by the Cairo test, you'll probably > find they mostly use transparent or opaque pixels. v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Rebased, re-benchmarked on Raspberry Pi, commit message. v5, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Rebased, re-benchmarked on Raspberry Pi due to a fix to "ARMv6: Add fast path for over_n_8888_8888_ca" patch.	2014-04-21 20:34:26 -04:00
Ben Avison	68d2f7b486	ARMv6: Add fast path flag to force no preload of destination buffer	2014-04-21 20:34:26 -04:00
Ben Avison	4ad769cbec	ARMv6: Add fast path for over_n_8888_8888_ca Benchmark results, "before" is * upstream/master `4b76bbfda6` "after" contains the additional patches on top: + ARMv6: Support for very variable-hungry composite operations + ARMv6: Add fast path for over_n_8888_8888_ca (this patch) lowlevel-blt-bench, over_n_8888_8888_ca, 100 iterations: Before After Mean StdDev Mean StdDev Confidence Change L1 2.7 0.00 16.1 0.06 100.00% +500.7% L2 2.4 0.01 14.1 0.15 100.00% +489.9% M 2.3 0.00 14.3 0.01 100.00% +510.2% HT 2.2 0.00 9.7 0.03 100.00% +345.0% VT 2.2 0.00 9.4 0.02 100.00% +333.4% R 2.2 0.01 9.5 0.03 100.00% +331.6% RT 1.9 0.01 5.5 0.07 100.00% +192.7% At most 1 outliers rejected per test per set. cairo-perf-trace with trimmed traces, 30 iterations: Before After Mean StdDev Mean StdDev Confidence Change t-firefox-talos-gfx.trace 33.1 0.42 25.8 0.44 100.00% +28.6% t-firefox-scrolling.trace 31.4 0.11 24.8 0.12 100.00% +26.3% t-gnome-terminal-vim.trace 22.4 0.10 19.9 0.14 100.00% +12.5% t-evolution.trace 13.9 0.07 13.0 0.05 100.00% +6.5% t-firefox-planet-gnome.trace 11.6 0.02 10.9 0.02 100.00% +6.5% t-gvim.trace 34.0 0.21 33.2 0.21 100.00% +2.4% t-chromium-tabs.trace 4.9 0.02 4.9 0.02 100.00% +1.0% t-poppler.trace 9.8 0.05 9.8 0.06 100.00% +0.7% t-firefox-canvas-swscroll.trace 32.3 0.10 32.2 0.09 100.00% +0.4% t-firefox-paintball.trace 18.1 0.01 18.0 0.01 100.00% +0.3% t-poppler-reseau.trace 22.5 0.09 22.4 0.11 99.29% +0.3% t-firefox-canvas.trace 18.1 0.06 18.0 0.05 99.29% +0.2% t-xfce4-terminal-a1.trace 4.8 0.01 4.8 0.01 99.77% +0.2% t-firefox-fishbowl.trace 21.2 0.03 21.2 0.04 100.00% +0.2% t-gnome-system-monitor.trace 17.3 0.03 17.3 0.03 99.54% +0.1% t-firefox-asteroids.trace 11.1 0.01 11.1 0.01 100.00% +0.1% t-midori-zoomed.trace 8.0 0.01 8.0 0.01 99.98% +0.1% t-grads-heat-map.trace 4.4 0.04 4.4 0.04 34.08% +0.1% (insignificant) t-firefox-talos-svg.trace 20.6 0.03 20.6 0.04 54.06% +0.0% (insignificant) t-firefox-fishtank.trace 13.2 0.01 13.2 0.01 52.81% -0.0% (insignificant) t-swfdec-giant-steps.trace 14.9 0.02 14.9 0.03 85.50% -0.1% (insignificant) t-firefox-chalkboard.trace 36.6 0.02 36.7 0.03 100.00% -0.2% t-firefox-canvas-alpha.trace 20.7 0.32 20.7 0.22 55.76% -0.3% (insignificant) t-swfdec-youtube.trace 7.8 0.02 7.8 0.03 100.00% -0.5% t-firefox-particles.trace 27.4 0.16 27.5 0.18 99.94% -0.6% At most 4 outliers rejected per test per set. Cairo perf reports the running time, but the change is computed for operations per second instead (inverse of running time). Confidence is based on Welch's t-test. Absolute changes less than 1% can be accounted as measurement errors, even if statistically significant. v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Use pixman_asm_function instead of startfunc. Rebased. Re-benchmarked on Raspberry Pi. Commit message. v5, Ben Avison <bavison@riscosopen.org> : Fixed the bug exposed in blitters-test 4928372. 15 hours of testing, compared to the 45 minutes to hit the bug originally. Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Squash the fix, re-benchmark on Raspberry Pi.	2014-04-21 20:34:26 -04:00
Ben Avison	73d2f8b61a	ARMv6: Support for very variable-hungry composite operations Previously, the variable ARGS_STACK_OFFSET was available to extract values from function arguments during the init macro. Now this changes dynamically around stack operations in the function as a whole so that arguments can be accessed at any point. It is also joined by LOCALS_STACK_OFFSET, which allows access to space reserved on the stack during the init macro. On top of this, composite macros now have the option of using all of WK0-WK3 registers rather than just the subset it was told to use; this requires the pixel count to be spilled to the stack over the leading pixels at the start of each line. Thus, at best, each composite operation can use 11 registers, plus any pointer registers not required for the composite type, plus as much stack space as it needs, divided up into constants and variables as necessary.	2014-04-21 20:34:26 -04:00
Søren Sandmann	857e40f3d2	create_bits(): Cast the result of height * stride to size_t In create_bits() both height and stride are ints, so the result is also an int, which will overflow if height or stride are big enough and size_t is bigger than int. This patch simply casts height to size_t to prevent these overflows, which prevents the crash in: https://bugzilla.redhat.com/show_bug.cgi?id=972647 It's not even close to fixing the full problem of supporting big images in pixman. See also https://bugs.freedesktop.org/show_bug.cgi?id=69014	2014-04-15 14:21:14 -04:00
Pekka Paalanen	4b76bbfda6	ARM: share pixman_asm_function definition Several files define identically the asm macro pixman_asm_function. Merge all these definitions into a new asm header. The original definition is taken from pixman-arm-simd-asm-scaled.S with the copyright/licence/author blurb verbatim.	2014-04-02 12:48:26 +03:00
Ben Avison	4ee85b0083	ARMv6: Add fast path for over_reverse_n_8888 Benchmark results, "before" is upstream commit `c343846` lowlevel-blt-bench: add in_reverse_8888_8888 test and "after" is with this patch only added on top. lowlevel-blt-bench, over_reverse_n_8888, 100 iterations: Before After Mean StdDev Mean StdDev Confidence Change L1 15.1 0.1 274.5 2.3 100.00% +1718.9% L2 12.8 0.3 181.8 0.7 100.00% +1315.5% M 10.8 0.0 77.9 0.0 100.00% +621.2% HT 9.7 0.0 29.4 0.2 100.00% +204.9% VT 9.5 0.0 26.7 0.1 100.00% +179.3% R 9.3 0.0 25.3 0.1 100.00% +173.6% RT 6.0 0.1 11.0 0.2 100.00% +82.9% At most 16 outliers rejected per case per set. cairo-perf-trace with trimmed traces, 30 iterations: Before After Mean StdDev Mean StdDev Confidence Change t-poppler.trace 12.9 0.1 9.7 0.0 100.00% +32.6% t-firefox-talos-gfx.trace 33.2 0.7 32.9 0.4 95.23% +0.9% (insignificant) t-firefox-particles.trace 27.4 0.1 27.3 0.2 99.65% +0.4% t-firefox-canvas-alpha.trace 20.5 0.3 20.5 0.3 57.51% +0.3% (insignificant) t-poppler-reseau.trace 22.4 0.1 22.4 0.1 95.69% +0.3% (insignificant) t-firefox-fishtank.trace 13.2 0.0 13.2 0.0 99.84% +0.1% t-swfdec-giant-steps.trace 14.9 0.0 14.9 0.0 87.68% +0.1% (insignificant) t-swfdec-youtube.trace 7.8 0.0 7.8 0.0 35.22% +0.1% (insignificant) t-firefox-planet-gnome.trace 11.5 0.0 11.5 0.0 29.37% +0.0% (insignificant) t-firefox-fishbowl.trace 21.2 0.0 21.2 0.0 18.09% +0.0% (insignificant) t-grads-heat-map.trace 4.4 0.0 4.4 0.0 1.84% +0.0% (insignificant) t-firefox-paintball.trace 18.0 0.0 18.0 0.0 33.43% -0.0% (insignificant) t-firefox-talos-svg.trace 20.5 0.0 20.5 0.1 68.56% -0.1% (insignificant) t-midori-zoomed.trace 8.0 0.0 8.0 0.0 99.98% -0.1% t-firefox-canvas-swscroll.trace 32.1 0.1 32.1 0.1 85.27% -0.1% (insignificant) t-gnome-system-monitor.trace 17.2 0.0 17.2 0.0 99.97% -0.2% t-firefox-chalkboard.trace 36.5 0.0 36.6 0.0 100.00% -0.2% t-firefox-asteroids.trace 11.1 0.0 11.1 0.0 100.00% -0.2% t-firefox-canvas.trace 17.9 0.0 18.0 0.0 100.00% -0.3% t-chromium-tabs.trace 4.9 0.0 4.9 0.0 97.95% -0.3% (insignificant) t-xfce4-terminal-a1.trace 4.8 0.0 4.8 0.0 100.00% -0.4% t-firefox-scrolling.trace 31.1 0.1 31.2 0.1 100.00% -0.5% t-evolution.trace 13.7 0.1 13.8 0.1 99.99% -0.6% t-gnome-terminal-vim.trace 22.0 0.2 22.2 0.1 99.99% -0.7% t-gvim.trace 33.2 0.2 33.5 0.2 100.00% -0.8% At most 6 outliers rejected per case per set. Cairo perf reports the running time, but the change is computed for operations per second instead (inverse of running time). Changes in the order of +/- 1% can be accounted for measurement errors, even if they are deemed to be statistically significant. This claim is based on comparing two 30-iteration identical "before" runs using the exact same binaries, and observing changes from -0.4% to +0.5% with >=99% confidence. Confidence is based on Welch's t-test. v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Rebased, re-benchmarked on Raspberry Pi, commit message.	2014-04-02 12:46:24 +03:00
Siarhei Siamashka	56622140e3	test: Fix OpenMP clauses for the tolerance-test Compiling with the Intel Compiler reveals a problem: tolerance-test.c(350): error: index variable "i" of for statement following an OpenMP for pragma must be private # pragma omp parallel for default(none) shared(i) private (result) ^ In addition to this, the 'result' variable also should not be private (otherwise its value does not survive after the end of the loop). It needs to be either shared or use the reduction clause to describe how the results from multiple threads are combined together. Reduction seems to be more appropriate here.	2014-04-02 12:46:09 +03:00
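The OpenMP point made in the tolerance-test commit above can be sketched like this. It is an illustration under assumptions, not the code from `tolerance-test.c`: `run_one_test`, `run_all_tests` and the `bad` parameter are hypothetical names, and the choice of `&&` as the reduction operator is an assumption about how per-thread results might be combined.

```c
/* Hypothetical stand-in for one test iteration: fails only for i == bad. */
static int
run_one_test (int i, int bad)
{
    return i != bad;
}

/* Returns 1 if every iteration passed, 0 otherwise. */
static int
run_all_tests (int n_tests, int bad)
{
    int result = 1;
    int i;

    /* The loop index of an "omp for" is private by definition, so it must
     * not be listed as shared. `result` uses a reduction: each thread gets
     * a private copy initialised to 1, and the copies are combined with &&
     * at the end, so the value survives the loop. Without OpenMP enabled
     * the pragma is ignored and the loop runs serially. */
#   pragma omp parallel for default(none) shared(n_tests, bad) reduction(&&:result)
    for (i = 0; i < n_tests; ++i)
        result = result && run_one_test (i, bad);

    return result;
}
```

A `private (result)` clause would instead give each thread an uninitialised copy whose value is discarded when the loop ends, which is the second problem the commit message describes.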
Siarhei Siamashka	840912b311	configure.ac: Check if the compiler supports GCC vector extensions The Intel Compiler 14.0.0 claims version GCC 4.7.3 compatibility via __GNUC__/__GNUC_MINOR__ macros, but does not provide the same level of GCC vector extensions support as the original GCC compiler: http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html Which results in the following compilation failure: In file included from ../test/utils.h(7), from ../test/utils.c(3): ../test/utils-prng.h(138): error: expression must have integral type uint32x4 e = x->a - ((x->b << 27) + (x->b >> (32 - 27))); ^ The problem is fixed by doing a special check in configure for this feature.	2014-04-02 12:46:04 +03:00
Ben Avison	c343846625	lowlevel-blt-bench: add in_reverse_8888_8888 test in_reverse_8888_8888 is one of the more commonly used operations in the cairo-perf-trace suite that hasn't been in lowlevel-blt-bench until now. v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Split from "Add extra test to lowlevel-blt-bench and fix an existing one", new summary.	2014-03-20 08:33:05 -04:00
Ben Avison	898859f3d3	lowlevel-blt-bench: over_reverse_n_8888 needs solid source v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Split from "Add extra test to lowlevel-blt-bench and fix an existing one", new summary.	2014-03-20 08:33:05 -04:00
Ben Avison	38317cbfde	ARMv6: remove 1 instr per row in generate_composite_function This knocks off one instruction per row. The effect is probably too small to be measurable, but might as well be included. The second occurrence of this sequence doesn't actually benefit at all, but is changed for consistency. The saved instruction comes from combining the "and" inside the .if statement with an earlier "tst". The "and" was normally needed, except for in one special case, where bits 4-31 were all shifted off the top of the register later on in preload_leading_step2, so we didn't care about their values. v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Remove "bits 0-3" from the comments, update patch summary, and augment message with Ben's suggestion.	2014-03-20 08:33:05 -04:00
Ben Avison	763a6d3e67	ARMv6: Fix indentation in the composite macros	2014-03-20 08:33:05 -04:00
Søren Sandmann	82d094654a	Remove all the operators that use division from pixman-combine32.c These are now handled by floating point combiners.	2014-01-04 16:13:27 -05:00
Søren Sandmann	ccb1df0c5e	Copy the comments from pixman-combine32.c to pixman-combine-float.c An upcoming commit will delete many of the operators from pixman-combine32.c and rely on the ones in pixman-combine-float.c. The comments about how the operators were derived are still useful though, so copy them into pixman-combine-float.c before the deletion.	2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen	94244b0c40	utils.c: Set DEVIATION to 0.0128 Consider a HARD_LIGHT operation with the following pixels: - source: 15 (6 bits) - source alpha: 255 (8 bits) - mask alpha: 223 (8 bits) - dest 255 (8 bits) - dest alpha: 0 (8 bits) Since 2 times the source is less than source alpha, the first branch of the hard light blend mode is taken: (1 - sa) * d + (1 - da) * s + 2 * s * d Since da is 0 and d is 1, this degenerates to: (1 - sa) + 3 * s Taking (src IN mask) into account along with the fact that sa is 1, this becomes: (1 - ma) + 3 * s * ma = (1 - 223/255.0) + 3 * (15/63.0) * (223/255.0) = 0.7501400560224089 When computed with the source converted by bit replication to eight bits, and additionally with the (src IN mask) part rounded to eight bits, we get: ma = 223/255.0 s * ma = (60 / 255.0) * (223/255.0) which rounds to 52 / 255 and the result is (1 - ma) + 3 * s * ma = (1 - 223/255.0) + 3 * 52/255.0 = 0.7372549019607844 so now we have an error of 0.012885. Without making changes to the way pixman does integer rounding/arithmetic, this error must then be considered acceptable. Due to conservative computations in the test suite we can however get away with 0.0128 as the acceptable deviation. This fixes the remaining failures in pixel-test.	2014-01-04 16:13:27 -05:00
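The arithmetic in the DEVIATION commit above can be reproduced with a short sketch. The function names are hypothetical and this is not the test suite's code; it only replays the two computations from the message, using the pixel values given there (6-bit source 15, mask alpha 223).

```c
/* Full-precision result: (1 - ma) + 3 * s * ma with s = 15/63. */
static double
hard_light_exact (void)
{
    double ma = 223 / 255.0;
    double s = 15 / 63.0;

    return (1 - ma) + 3 * s * ma;
}

/* Same formula, but with the source widened to 8 bits by bit replication
 * and (src IN mask) rounded to 8 bits before use. */
static double
hard_light_8bit (void)
{
    double ma = 223 / 255.0;
    int s8 = (15 << 2) | (15 >> 4);            /* 6-bit 15 -> 8-bit 60 */
    int s_in_mask = (2 * s8 * 223 + 255) / 510; /* round (60 * 223 / 255) = 52 */

    return (1 - ma) + 3 * s_in_mask / 255.0;
}
```

The difference `hard_light_exact () - hard_light_8bit ()` is the error of about 0.012885 quoted in the commit message.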
Søren Sandmann	15aa37adec	Use floating point combiners for all operators that involve divisions Consider a DISJOINT_ATOP operation with the following pixels: - source: 0xff (8 bits) - source alpha: 0x01 (8 bits) - mask alpha: 0x7b (8 bits) - dest: 0x00 (8 bits) - dest alpha: 0xff (8 bits) When (src IN mask) is computed in 8 bits, the resulting alpha channel is 0 due to rounding: floor ((0x01 * 0x7b) / 255.0 + 0.5) = floor (0.9823) = 0 which means that since Render defines any division by zero as infinity, the Fa and Fb for this operator end up as follows: Fa = max (1 - (1 - 1) / 0, 0) = 0 Fb = min (1, (1 - 0) / 1) = 1 and so since dest is 0x00, the overall result is 0. However, when computed in full precision, the alpha value no longer rounds to 0, and so Fa ends up being Fa = max (1 - (1 - 1) / 0.0001, 0) = 1 and so the result is now s * ma * Fa + d * Fb = (1.0 * (0x7b / 255.0) * 1) + d * 0 = 0x7b / 255.0 = 0.4823 so the error in this case ends up being 0.48235294, which is clearly not something that can be considered acceptable. In order to avoid this problem, we need to do all arithmetic in such a way that a multiplication of two tiny numbers can never end up being zero unless one of the input numbers is itself zero. This patch makes all computations that involve divisions take place in floating point, which is sufficient to fix the test cases. This brings the number of failures in pixel-test down to 14.	2014-01-04 16:13:27 -05:00
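The rounding-to-zero effect behind the DISJOINT_ATOP example above can be shown in isolation. This is a sketch with hypothetical helper names; `mul_un8_rounded` is a plain round-to-nearest product and is not the macro pixman itself uses for 8-bit multiplication.

```c
/* Product of two 8-bit values interpreted as fractions of 255, rounded to
 * the nearest 8-bit value: floor (a * b / 255.0 + 0.5) in integer form. */
static unsigned
mul_un8_rounded (unsigned a, unsigned b)
{
    return (2 * a * b + 255) / 510;
}

/* The same product kept in floating point. */
static double
mul_float (unsigned a, unsigned b)
{
    return (a / 255.0) * (b / 255.0);
}
```

With source alpha 0x01 and mask alpha 0x7b, `mul_un8_rounded` returns 0 while `mul_float` returns a small positive value, so only the floating point product avoids the division by zero in Fa.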
Søren Sandmann	8f38243163	Soft Light: Consistent approach to division by zero The Soft Light operator has several branches. One of them is decided based on whether 2 * s is less than or equal to 2 * sa. In floating point implementations, when those two values are very close to each other, it may not be completely predictable which branch we hit. This is a problem because in one branch, when destination alpha is zero, we get the result r = d * as and in the other we get r = 0 So when d and as are not 0, this causes two different results to be returned from essentially identical input values. In other words, there is a discontinuity in the current implementation. This patch randomly changes the second branch such that it now returns d * sa instead. There is no deep meaning behind this, because essentially this is an attempt to assign meaning to division by zero, and all that is required is that that meaning doesn't depend on minute differences in input values. This makes the number of failed pixels in pixel-test go down to 347.	2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen	89662adf77	pixman-combine32.c: Fix bugs related to integer promotion In the component alpha part of the PDF_SEPARABLE_BLEND_MODE macro, the expression ~RED_8 (m) is used. Because RED_8(m) gets promoted to int before ~ is applied, the whole expression typically becomes some negative value rather than (255 - RED_8(m)) as desired. Fix this by using unsigned temporary variables. This reduces the number of failures in pixel-test to 363.	2014-01-04 16:13:27 -05:00
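The integer promotion pitfall described in the commit above can be demonstrated with two small functions. The names are hypothetical, and the explicit cast shown here is one way to get the intended value; the commit itself says it uses unsigned temporary variables.

```c
#include <stdint.h>

/* The 8-bit operand is promoted to int before ~ is applied, so the result
 * is a negative int rather than an 8-bit complement: ~0x40 == -65. */
static int
complement_promoted (uint8_t m)
{
    return ~m;
}

/* Truncating back to 8 bits gives the intended 255 - m. */
static uint8_t
complement_u8 (uint8_t m)
{
    return (uint8_t) ~m;
}
```

For `m = 0x40`, the first function returns -65 and the second returns 191, which is `255 - 64`.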
Søren Sandmann Pedersen	e7a99b3b0f	pixman/pixman-combine32.c: Bug fixes for separable blend modes This commit fixes four separate bugs: 1. In the computation (1 - sa) * d + (1 - da) * s + sa * da * B(s, d) we were using regular addition for all four channels, but for superluminescent pixels, the addition could overflow causing nonsensical results. 2. The variables and return types used for the results of the blend mode calculations were unsigned, but for various blend modes (and especially with superluminescent pixels), the blend mode calculations could be negative, resulting in underflows. 3. The blend mode computations were returned as 8-bit values, which is not sufficient precision (especially considering that we need signed results). 4. The value before the final division by 255 was not properly clamped to [0, 255]. This patch fixes all those bugs. The blend mode computations are now returned as signed 16 bit values with 1 represented as 255 * 255. With these fixes, the number of failing pixels in pixel-test goes down from 431 to 384.	2014-01-04 16:13:27 -05:00
Søren Sandmann	fe3504d03f	pixel-test.c: Add a number of pixels that have failed at some point This commit adds a large number of pixel regressions to pixel-test. All of these have at some point been failing in blend-mode-test, and most of them do fail currently. To be specific, with this commit, pixel-test reports 431 failed tests.	2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen	bd94c17937	test/tolerance-test: New test program This new test program is similar to test/composite in that it relies on the pixel_checker_t API to do tolerance based verification. But unlike the composite test, which verifies combinations of a fixed set of pixels, this one generates random images and verifies that those composite correctly. Also unlike composite, tolerance-test supports all the separable blend mode operators in addition to the original Render operators. When tests fail, a C struct is printed that can be pasted into pixel-test for regression purposes. There is an option "--forever" which causes the random seed to be set to the current time, and then the test runs until interrupted. This is useful for overnight runs. This test currently fails badly due to various bugs in the blend mode operators. Later commits will fix those.	2014-01-04 16:13:27 -05:00
Søren Sandmann	c2fd65dba3	pixel-test: Command line argument to specify the regression to run A new command line argument allows the user to specify which one of the regressions should be run.	2014-01-04 16:13:27 -05:00
Søren Sandmann	a692e01600	pixel-test: Add support for mask pixels Support is added to pixel-test for verifying operations involving masks. If a regression includes a mask, it is verified with the pixel_checker API in both unified and component alpha modes.	2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen	779ca46e98	test/check-formats.c: Add support for separable blend modes	2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen	a42af27fc0	test/utils.c: Add support for separable blend mode ops to do_composite() The implementations are copied from the floating point pipeline, but use double precision instead of single precision.	2014-01-04 16:13:27 -05:00
Søren Sandmann	b29d74ef0c	configure.ac: Check and use -Wno-unused-local-typedefs GCC option With GCC 4.8.2 the COMPILE_TIME_ASSERT macro produces a spurious warning about an unused local typedef: In file included from pixman.c:29:0: pixman.c: In function 'optimize_operator': pixman-private.h:1019:22: warning: typedef 'compile_time_assertion' locally defined but not used [-Wunused-local-typedefs] The flag -Wno-unused-local-typedefs suppresses that warning.	2013-12-26 09:41:53 -05:00
Julien Cristau	08ff9fa402	Upload to unstable	2013-12-17 22:04:30 +01:00
Julien Cristau	e66148cda6	Bump changelogs	2013-12-08 15:33:18 +01:00
Julien Cristau	9c9f210896	pixman 0.32.4 release -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJSiXHSAAoJEA/daC2XTKcqUtQQALogcIuKShzPrZCnNke9jXJF Ujq4M0fHMBru4Uzqq+MCp02ssWLnoBvW8emwzalzt3xulZU+fUeYs1u56Epi1SnG oHt5ah1ZSicAwNBlDdflKgqnBGdsFJg5yj9F09zwZeBEBYwhJBaTQfIK6i0sww3s MQ66uANWsJQsW8/wFq5pJLmmmSWlelEHXz5pcjLavaYkOIITSzTeZF+xOvhBUwv2 1zTsv9c2k05cR+8UKDpDURrEn5Cp5uQo0iV9FpKsyKL01ukqCbuBRWVxjSbXCmtu GWZ4qDLjScM8sCAQbZF4/MZuGoytC2cKxaWnjKn4h1L4+qZMIvjmcAlsP7CfJ14o AtWkYvU6rlY5m4je8Lh3QMbLkSTNFR8ix97jDhFmZlEQA3EXnPvme2YFecOmlVgF c1mVhVBR2Je/Hav0LiIne7151dFJ+THCAPOLcVqDCzRw2BMjAfp0Kx7qnFiXyvEt zgpoAmybf1kHOCpEugHGKwe4elCTvjq7xv3+JwkzqvV7uIvk1/J0ctIkBsboeMsP nvIJ8nBj9fNuJdP++jNX1xsi3C0LM16Bhd5n8wZcX4sqekSVj+LDht4JBPalMC7A m50kD9XlFSJ8UyoKrKMGx71XLnkGgT1hbQgE9ML8MumXZZMpjwIb9p7g7D2A1hXM /1kzDHmAaqbLcmFBTyO9 =klDd -----END PGP SIGNATURE----- Merge tag 'pixman-0.32.4' into debian-unstable pixman 0.32.4 release Conflicts: configure.ac	2013-12-08 15:28:54 +01:00
Søren Sandmann	945ab7a6f3	Soft Light: The first comparison should be <=, not < According to the definition of soft light, the first comparison is less-than-or-equal, not less-than.	2013-12-03 18:14:24 -05:00
Søren Sandmann	9ba3a34797	general: Support component alpha for all image types Currently, if you attempt to use component alpha on source images or images without RGB channels, Pixman will silently just use unified alpha instead. This patch makes such images supported for component alpha. There is no particularly compelling usecase at the moment, but this patch does get rid of a bit of special-case code both in pixman-general.c and in test/composite.c.	2013-11-23 20:30:33 -05:00
Maarten Lankhorst	166899c913	release to sid	2013-11-18 15:55:02 +01:00
Maarten Lankhorst	7d8317abd4	Cherry-pick upstream bugfixes for fixing a crash when rendering invalid trapezoids. (LP: #1197921)	2013-11-18 15:54:49 +01:00
Ritesh Khadgaray	f740a26fe1	pixman_trapezoid_valid(): Fix underflow when bottom is close to MIN_INT If t->bottom is close to MIN_INT (probably invalid value), subtracting top can lead to underflow which causes crashes. Attached patch will fix the issue. This fixes bug 67484. (cherry picked from commit `5e14da97f1`)	2013-11-18 15:08:42 +01:00
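The underflow described in the `pixman_trapezoid_valid()` commit above can be illustrated as follows. This is a sketch under assumptions, not the patch itself: `fixed_t` stands in for pixman's 32-bit fixed-point type, both helper names are hypothetical, and comparing the two values directly is shown as one overflow-free way to express "bottom is below top".

```c
#include <stdint.h>

typedef int32_t fixed_t; /* hypothetical stand-in for a 16.16 fixed-point value */

/* Computes bottom - top in 64 bits to check whether the difference would
 * still fit in 32 bits. Doing the subtraction directly in 32 bits would
 * overflow when it does not fit. */
static int
height_fits_in_32_bits (fixed_t top, fixed_t bottom)
{
    int64_t diff = (int64_t) bottom - (int64_t) top;

    return diff >= INT32_MIN && diff <= INT32_MAX;
}

/* A direct comparison needs no subtraction and therefore cannot overflow. */
static int
bottom_below_top (fixed_t top, fixed_t bottom)
{
    return bottom > top;
}
```

With `bottom` equal to `INT32_MIN` and `top` equal to 1, the difference does not fit in 32 bits, while the direct comparison simply reports the trapezoid as having no positive height.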
Søren Sandmann Pedersen	f4acde9c71	test/trap-crasher.c: Add trapezoid that demonstrates a crash This trapezoid causes a crash due to an underflow in the pixman_trapezoid_valid(). Test case from Ritesh Khadgaray. (cherry picked from commit `2f876cf867`)	2013-11-18 15:08:41 +01:00
Matt Turner	dae5a758e2	Post-release version bump to 0.32.5	2013-11-17 17:48:54 -08:00
Matt Turner	4b3a66b05e	Pre-release version bump to 0.32.4	2013-11-17 17:46:52 -08:00
Søren Sandmann	97a655d5ca	test/utils.c: Make the stack unaligned only on 32 bit Windows The call_test_function() contains some assembly that deliberately causes the stack to be aligned to 32 bits rather than 128 bits on x86-32. The intention is to catch bugs that surface when pixman is called from code that only uses a 32 bit alignment. However, recent versions of GCC apparently make the assumption (either accidentally or deliberately) that the incoming stack is aligned to 128 bits, where older versions only seemed to make this assumption when compiling with -msse2. This causes the vector code in the PRNG to now segfault when called from call_test_function() on x86-32. This patch fixes that by only making the stack unaligned on 32 bit Windows, where it would definitely be incorrect for GCC to assume that the incoming stack is aligned to 128 bits. V2: Put "defined(...)" around __GNUC__ Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=491110 (cherry picked from commit `f473fd1e75`)	2013-11-17 17:45:56 -08:00
Jakub Bogusz	5a313af74e	Fix the SSSE3 CPUID detection. SSSE3 is detected by bit 9 of ECX, but we were checking bit 9 of EDX which is APIC leading to SSSE3 routines being called on CPUs without SSSE3. Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `8487dfbcd0`)	2013-11-17 17:45:54 -08:00
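The register mix-up fixed by the SSSE3 commit above can be sketched as a pure function over the CPUID leaf 1 register values. The function and macro names are hypothetical and this is not pixman's detection code; obtaining `ecx` and `edx` from the CPUID instruction is left out so the sketch stays portable.

```c
#include <stdint.h>

#define CPUID_1_ECX_SSSE3 (1u << 9) /* leaf 1, ECX bit 9: SSSE3 */
#define CPUID_1_EDX_APIC  (1u << 9) /* leaf 1, EDX bit 9: on-chip APIC */

/* Returns nonzero if the leaf 1 register values report SSSE3 support.
 * Only ECX is consulted; the same bit position in EDX means APIC. */
static int
has_ssse3 (uint32_t ecx, uint32_t edx)
{
    (void) edx; /* deliberately unused: testing EDX here was the bug */

    return (ecx & CPUID_1_ECX_SSSE3) != 0;
}
```

A CPU with an APIC but no SSSE3 has bit 9 set in EDX and clear in ECX, so a check against EDX would wrongly enable the SSSE3 routines on it.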


2788 Commits