pixman

mirror of https://salsa.debian.org/xorg-team/lib/pixman synced 2025-09-01 10:35:16 +00:00

Author	SHA1	Message	Date
Nemanja Lukic	f69335d529	test: add "pixbuf" and "rpixbuf" to lowlevel-blt-bench Add necessary support to lowlevel-blt benchmark for benchmarking pixbuf and rpixbuf fast paths. bench_composite function now checks for pixbuf string in testname, and if that is detected, use same bits for src and mask images.	2013-04-30 15:38:43 -04:00
Nemanja Lukic	3dc9e3827e	test: add "src_0888_8888_rev" and "src_0888_0565_rev" to lowlevel-blt-bench	2013-04-30 15:38:43 -04:00
Ben Avison	5e207f825b	Fix to lowlevel-blt-bench The source, mask and destination buffers are initialised to 0xCC just after they are allocated. Between each benchmark, there are a pair of memcpys, from the destination buffer to the source buffer and back again (there are no explanatory comments, but presumably this is an effort to flush the caches). However, it has an unintended consequence, which is to change the contents of the buffers on entry to subsequent benchmarks. This means it is not a fair test: for example, with over_n_8888 (featured in the following patches) it reports L2 and even M tests as being faster than the L1 test, because after the L1 test, the source buffer is filled with fully opaque pixels, for which over_n_8888 has a shortcut. The fix here is simply to reverse the order of the memcpys, so src and destination are both filled with 0xCC on entry to all tests.	2013-02-13 02:24:34 -05:00
Ben Avison	69a7a9b6b6	Improve L1 and L2 benchmark tests for caches that don't use allocate-on-write In particular this affects single-core ARMs (e.g. ARM11, Cortex-A8), which are usually configured this way. For other CPUs, this should only add a constant time, which will be cancelled out by the EXCLUDE_OVERHEAD runs. The problems were caused by cachelines becoming permanently evicted from the cache, because the code that was intended to pull them back in again on each iteration assumed too long a cache line (for the L1 test) or failed to read memory beyond the first pixel row (for the L2 test). Also, the reloading of the source buffer was unnecessary. These issues were identified by Siarhei in this post: http://lists.freedesktop.org/archives/pixman/2013-January/002543.html	2013-01-29 15:23:05 -05:00
Ben Avison	24e83cae64	Tweaks to lowlevel-blt-bench This adds two extra tests, src_n_8 and src_8_8, which I have been using to benchmark my ARMv6 changes. I'd also like to propose that it requires an exact test name as the executable's argument, as achieved by this strstr to strcmp change. Without this, it is impossible to only benchmark (for example) add_8_8, add_n_8 or src_n_8, due to those also being substrings of many other test names.	2013-01-25 11:13:07 -05:00
Siarhei Siamashka	e4519360c1	test: add "src_0565_8888" to lowlevel-blt-bench	2012-12-18 20:43:51 +02:00
Siarhei Siamashka	fc162bad56	test: support nearest/bilinear scaling in lowlevel-blt-bench Scale factor is selected to be nearly 1x, so that the MPix/s results can be directly compared with the results of non-scaled compositing operations.	2012-06-29 03:24:29 +03:00
Matt Turner	62c4bdc94f	mmx: add over_reverse_n_8888 Loongson: over_reverse_n_8888 = L1: 16.04 L2: 15.35 M: 10.20 ( 27.96%) HT: 10.95 VT: 10.45 R: 9.18 RT: 6.99 ( 76Kops/s) over_reverse_n_8888 = L1: 27.40 L2: 26.67 M: 16.97 ( 45.78%) HT: 16.66 VT: 15.38 R: 14.15 RT: 9.44 ( 97Kops/s) image poppler 34.106 35.500 1.48% 6/6 image poppler 29.598 30.835 1.70% 6/6 ARM/iwMMXt: over_reverse_n_8888 = L1: 15.63 L2: 14.33 M: 10.83 ( 27.55%) HT: 9.78 VT: 9.91 R: 9.49 RT: 6.96 ( 69Kops/s) over_reverse_n_8888 = L1: 22.79 L2: 19.40 M: 13.76 ( 34.19%) HT: 11.66 VT: 11.86 R: 11.17 RT: 7.85 ( 75Kops/s) image poppler 38.040 38.606 1.10% 6/6 image poppler 31.686 32.278 0.80% 5/6	2012-05-26 20:32:27 -04:00
Matt Turner	3c3c70fa0b	lowlevel-blt-bench: add in_8_8 and in_n_8_8 Signed-off-by: Matt Turner <mattst88@gmail.com>	2012-03-01 17:42:37 -05:00
Matt Turner	e43d65d49d	lowlevel-blt: add over_x888_n_8888 Signed-off-by: Matt Turner <mattst88@gmail.com>	2012-02-24 20:02:55 -05:00
Matt Turner	9f60704995	lowlevel-blt: add over_8888_8888 Signed-off-by: Matt Turner <mattst88@gmail.com>	2012-02-24 19:58:09 -05:00
Andrea Canciani	97b9fa090c	Use the ARRAY_LENGTH() macro when possible This patch has been generated by the following Coccinelle semantic patch: // Use the ARRAY_LENGTH() macro when possible // // Replace open-coded array length computations with the // ARRAY_LENGTH() macro @@ type T; T[] E; @@ - (sizeof(E)/sizeof(T)) + ARRAY_LENGTH (E)	2011-11-09 09:17:00 +01:00
Andrea Canciani	06760f5cb0	test: Cleanup includes All the tests are linked to libutil, hence it makes sense to always include utils.h and reuse what it provides (config.h inclusion, access to private pixman APIs, ARRAY_LENGTH, ...).	2011-11-09 09:17:00 +01:00
Matt Turner	b6b77488a0	lowlevel-blt: add over_x888_8_8888 Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-26 11:29:51 -04:00
Andrea Canciani	a1ebff0dcb	win32: Build benchmarks Add the makefile rules needed to compile lowlevel-blt-bench on win32 and fix the compilation errors.	2011-08-29 07:37:46 +02:00
Søren Sandmann Pedersen	bdfb5944ff	Don't include stdint.h in lowlevel-blt-bench.c Some systems don't have the file, and the types are already defined in pixman.h. https://bugs.freedesktop.org//show_bug.cgi?id=37422	2011-08-11 03:32:14 -04:00
Søren Sandmann Pedersen	a89f8cfaf1	Replace arguments to composite functions with a pointer to a struct This allows more information, such as flags or the composite region, to be passed to the composite functions.	2011-06-20 02:03:23 -04:00
Søren Sandmann Pedersen	13aed37758	Add a test for over_x888_8_0565 in lowlevel_blt_bench(). The next few commits will speed this up quite a bit. Current output: --- reference memcpy speed = 2217.5MB/s (554.4MP/s for 32bpp fills) --- over_x888_8_0565 = L1: 54.67 L2: 54.01 M: 52.33 ( 18.88%) HT: 37.19 VT: 35.54 R: 29.40 RT: 13.63 ( 162Kops/s)	2011-01-28 14:35:17 -05:00
Søren Sandmann Pedersen	ba693d2e88	Fix search-and-replace issue in lowlevel-blt-bench.c	2010-09-28 02:52:17 -04:00
Søren Sandmann Pedersen	77d3e5f6ff	Rename all the fast paths with _8000 in their names to _8 This inconsistent naming somehow survived the refactoring from a while back.	2010-09-28 00:07:47 -04:00
Jonathan Morton	7cd4f2fa20	Add a lowlevel blitter benchmark This test is a modified version of Siarhei's compositor throughput benchmark. It's expanded with explicit reporting of memory bandwidth consumption for the M-test, and with an additional 8x8-random test intended to determine peak ops/sec capability. There are also quite a lot more operations tested for.	2010-09-21 08:50:18 -04:00

21 Commits