pixman

mirror of https://salsa.debian.org/xorg-team/lib/pixman synced 2025-09-02 21:33:55 +00:00

Author	SHA1	Message	Date
Pekka Paalanen	e2d211ac49	lowlevel-blt-bench: add option to skip memcpy measurement The memcpy speed measurement takes several seconds. When you are running single tests in a harness that iterates dozens or hundreds of times, the repeated measurements are redundant and take a lot of time. It is also an open question whether the measured speed changes over long test runs due to unidentified platform reasons (Raspberry Pi). Add a command line option to set the reference memcpy speed, skipping the measuring. The speed is mainly used to compute how many iterations to run inside the bench_*() functions, so for repeated testing on the same hardware, it makes sense to lock that number to a constant. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-07-06 12:04:50 +03:00
Pekka Paalanen	31cb0d4267	lowlevel-blt-bench: add CSV output mode Add a command line option for choosing CSV output mode. In CSV mode, only the results in Mpixels/s are printed in an easily machine-parseable format. All user-friendly printing is suppressed. This is intended for cases where you benchmark one particular operation at a time. Running the "all" set of benchmarks will print just fine, but you may have trouble matching rows to operations as you have to look at the tests_tbl[] to see what row is which. Reviewed-by: Ben Avison <bavison@riscosopen.org> v2: don't add a space after comma in CSV. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>	2015-07-06 12:04:32 +03:00
Pekka Paalanen	9a7e0bc6d0	lowlevel-blt-bench: refactor to Mpx_per_sec() Refactor the Mpixels/s computations into a function. Easier to read and better documents what is being computed. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-07-06 12:04:27 +03:00
Pekka Paalanen	6e9c48c579	lowlevel-blt-bench: all bench funcs to return pix_cnt The bench_* functions, that did not already do it, are modified to return the number of pixels processed during the benchmark. This moves the computation to the site that actually determines the number, and simplifies bench_composite() a bit. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-07-06 12:04:22 +03:00
Pekka Paalanen	9e8f2bcaf5	lowlevel-blt-bench: move speed and scaling printing Move the printing of the memory speed and scaling mode into a new function. This will help with implementing a machine-readable output option. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-07-06 12:04:18 +03:00
Pekka Paalanen	a33c2e6853	lowlevel-blt-bench: print single pattern details When given just a single test pattern instead of "all", print the test details. This can be used to verify the pattern parser agrees with the user, just like scaling settings are printed. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-07-06 12:04:12 +03:00
Pekka Paalanen	3ac7ae2017	lowlevel-blt-bench: make test_entry::testname const We assign string literals to it, so it better be const. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-07-06 12:04:07 +03:00
Pekka Paalanen	56d8b365f5	lowlevel-blt-bench: move explanation printing Move explanation printing to a new function. This will help with implementing a machine-readable output option. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-07-06 12:04:03 +03:00
Pekka Paalanen	bddff993ed	lowlevel-blt-bench: move usage to a function Move printing of usage into a new function and use argv[0] as the program name. This will help printing usage from multiple places. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-07-06 12:03:28 +03:00
Pekka Paalanen	58e21d3e45	lowlevel-blt-bench: use a8r8g8b8 for CA solid masks When doing component alpha with a solid mask, use a mask format that has all the color channels instead of just a8. As Ben Avison explains it: "Lowlevel-blt-bench initialises all its images using memset(0xCC) so an a8 solid image would be converted by _pixman_image_get_solid() to 0xCC000000 whereas an a8r8g8b8 would be 0xCCCCCCCC. When you're not in component alpha mode, only the alpha byte matters for the mask image, but in the case of component alpha operations, a fast path might decide that it can save itself a lot of multiplications if it spots that 3 constant mask components are already 0." No (default) test so far has a solid mask with CA. This is just future-proofing lowlevel-blt-bench to do what one would expect. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-04-20 16:18:18 +03:00
Pekka Paalanen	be49f929b6	lowlevel-blt-bench: use the test pattern parser Let lowlevel-blt-bench parse the test name string from the command line, allowing to run almost infinitely more tests. One is no longer limited to the tests listed in the big table. While you can use the old short-hand names like src_8888_8888, you can also use all possible operators now, and specify pixel formats exactly rather than just x888, for instance. This even allows to run crazy patterns like conjoint_over_reverse_a8b8g8r8_n_r8g8b8x8. All individual patterns are now interpreted through the parser. The pattern "all" runs the same old default test set as before but through the parser instead of the hard-coded parameters. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-04-15 12:43:01 +03:00
Pekka Paalanen	5b27912108	lowlevel-blt-bench: add test name parser and self-test This patch is inspired by "lowlevel-blt-bench: Parse test name strings in general case" by Ben Avison. From Ben's commit message: "There are many types of composite operation that are useful to benchmark but which are omitted from the table. Continually having to add extra entries to the table is a nuisance and is prone to human error, so this patch adds the ability to break down unknown strings of the format <operation>_<src>[_<mask>]_<dst>[_ca] where bitmap formats are specified by number of bits of each component (assumed in ARGB order) or 'n' to indicate a solid source or mask." Add the parser to lowlevel-blt-bench.c, but do not hook it up to the command line just yet. Instead, make it run a self-test. As we now dynamically parse strings similar to the test names in the huge table 'tests_tbl', we should make sure we can parse the old well-known test names and produce exactly the same test parameters. The self-test goes through this old table and verifies the parsing results. Unfortunately the old table is not exactly consistent, it contains some special cases that cannot be produced by the parsing rules. Whether these special cases are intentional or just an oversight is not always clear. Anyway, add a small table to reproduce the special cases verbatim. If we wanted, we could remove the big old table in a follow-up commit, but then we would also lose the parser self-test. The point of this whole exercise is to let lowlevel-blt-bench recognize novel test patterns in the future, following exactly the conventions used in the old table. Ben, from what I see, this parser has one major difference to what you wrote. For a solid mask, your parser uses a8r8g8b8 format, while mine uses a8 which comes from the old table. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Ben Avison <bavison@riscosopen.org>	2015-04-15 12:42:51 +03:00
Ben Avison	c343846625	lowlevel-blt-bench: add in_reverse_8888_8888 test in_reverse_8888_8888 is one of the more commonly used operations in the cairo-perf-trace suite that hasn't been in lowlevel-blt-bench until now. v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Split from "Add extra test to lowlevel-blt-bench and fix an existing one", new summary.	2014-03-20 08:33:05 -04:00
Ben Avison	898859f3d3	lowlevel-blt-bench: over_reverse_n_8888 needs solid source v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> : Split from "Add extra test to lowlevel-blt-bench and fix an existing one", new summary.	2014-03-20 08:33:05 -04:00
Nemanja Lukic	f69335d529	test: add "pixbuf" and "rpixbuf" to lowlevel-blt-bench Add necessary support to lowlevel-blt benchmark for benchmarking pixbuf and rpixbuf fast paths. bench_composite function now checks for pixbuf string in testname, and if that is detected, use same bits for src and mask images.	2013-04-30 15:38:43 -04:00
Nemanja Lukic	3dc9e3827e	test: add "src_0888_8888_rev" and "src_0888_0565_rev" to lowlevel-blt-bench	2013-04-30 15:38:43 -04:00
Ben Avison	5e207f825b	Fix to lowlevel-blt-bench The source, mask and destination buffers are initialised to 0xCC just after they are allocated. Between each benchmark, there are a pair of memcpys, from the destination buffer to the source buffer and back again (there are no explanatory comments, but presumably this is an effort to flush the caches). However, it has an unintended consequence, which is to change the contents of the buffers on entry to subsequent benchmarks. This means it is not a fair test: for example, with over_n_8888 (featured in the following patches) it reports L2 and even M tests as being faster than the L1 test, because after the L1 test, the source buffer is filled with fully opaque pixels, for which over_n_8888 has a shortcut. The fix here is simply to reverse the order of the memcpys, so src and destination are both filled with 0xCC on entry to all tests.	2013-02-13 02:24:34 -05:00
Ben Avison	69a7a9b6b6	Improve L1 and L2 benchmark tests for caches that don't use allocate-on-write In particular this affects single-core ARMs (e.g. ARM11, Cortex-A8), which are usually configured this way. For other CPUs, this should only add a constant time, which will be cancelled out by the EXCLUDE_OVERHEAD runs. The problems were caused by cachelines becoming permanently evicted from the cache, because the code that was intended to pull them back in again on each iteration assumed too long a cache line (for the L1 test) or failed to read memory beyond the first pixel row (for the L2 test). Also, the reloading of the source buffer was unnecessary. These issues were identified by Siarhei in this post: http://lists.freedesktop.org/archives/pixman/2013-January/002543.html	2013-01-29 15:23:05 -05:00
Ben Avison	24e83cae64	Tweaks to lowlevel-blt-bench This adds two extra tests, src_n_8 and src_8_8, which I have been using to benchmark my ARMv6 changes. I'd also like to propose that it requires an exact test name as the executable's argument, as achieved by this strstr to strcmp change. Without this, it is impossible to only benchmark (for example) add_8_8, add_n_8 or src_n_8, due to those also being substrings of many other test names.	2013-01-25 11:13:07 -05:00
Siarhei Siamashka	e4519360c1	test: add "src_0565_8888" to lowlevel-blt-bench	2012-12-18 20:43:51 +02:00
Siarhei Siamashka	fc162bad56	test: support nearest/bilinear scaling in lowlevel-blt-bench Scale factor is selected to be nearly 1x, so that the MPix/s results can be directly compared with the results of non-scaled compositing operations.	2012-06-29 03:24:29 +03:00
Matt Turner	62c4bdc94f	mmx: add over_reverse_n_8888 Loongson: over_reverse_n_8888 = L1: 16.04 L2: 15.35 M: 10.20 ( 27.96%) HT: 10.95 VT: 10.45 R: 9.18 RT: 6.99 ( 76Kops/s) over_reverse_n_8888 = L1: 27.40 L2: 26.67 M: 16.97 ( 45.78%) HT: 16.66 VT: 15.38 R: 14.15 RT: 9.44 ( 97Kops/s) image poppler 34.106 35.500 1.48% 6/6 image poppler 29.598 30.835 1.70% 6/6 ARM/iwMMXt: over_reverse_n_8888 = L1: 15.63 L2: 14.33 M: 10.83 ( 27.55%) HT: 9.78 VT: 9.91 R: 9.49 RT: 6.96 ( 69Kops/s) over_reverse_n_8888 = L1: 22.79 L2: 19.40 M: 13.76 ( 34.19%) HT: 11.66 VT: 11.86 R: 11.17 RT: 7.85 ( 75Kops/s) image poppler 38.040 38.606 1.10% 6/6 image poppler 31.686 32.278 0.80% 5/6	2012-05-26 20:32:27 -04:00
Matt Turner	3c3c70fa0b	lowlevel-blt-bench: add in_8_8 and in_n_8_8 Signed-off-by: Matt Turner <mattst88@gmail.com>	2012-03-01 17:42:37 -05:00
Matt Turner	e43d65d49d	lowlevel-blt: add over_x888_n_8888 Signed-off-by: Matt Turner <mattst88@gmail.com>	2012-02-24 20:02:55 -05:00
Matt Turner	9f60704995	lowlevel-blt: add over_8888_8888 Signed-off-by: Matt Turner <mattst88@gmail.com>	2012-02-24 19:58:09 -05:00
Andrea Canciani	97b9fa090c	Use the ARRAY_LENGTH() macro when possible This patch has been generated by the following Coccinelle semantic patch: // Use the ARRAY_LENGTH() macro when possible // // Replace open-coded array length computations with the // ARRAY_LENGTH() macro @@ type T; T[] E; @@ - (sizeof(E)/sizeof(T)) + ARRAY_LENGTH (E)	2011-11-09 09:17:00 +01:00
Andrea Canciani	06760f5cb0	test: Cleanup includes All the tests are linked to libutil, hence it makes sense to always include utils.h and reuse what it provides (config.h inclusion, access to private pixman APIs, ARRAY_LENGTH, ...).	2011-11-09 09:17:00 +01:00
Matt Turner	b6b77488a0	lowlevel-blt: add over_x888_8_8888 Signed-off-by: Matt Turner <mattst88@gmail.com>	2011-09-26 11:29:51 -04:00
Andrea Canciani	a1ebff0dcb	win32: Build benchmarks Add the makefile rules needed to compile lowlevel-blt-bench on win32 and fix the compilation errors.	2011-08-29 07:37:46 +02:00
Søren Sandmann Pedersen	bdfb5944ff	Don't include stdint.h in lowlevel-blt-bench.c Some systems don't have the file, and the types are already defined in pixman.h. https://bugs.freedesktop.org//show_bug.cgi?id=37422	2011-08-11 03:32:14 -04:00
Søren Sandmann Pedersen	a89f8cfaf1	Replace arguments to composite functions with a pointer to a struct This allows more information, such as flags or the composite region, to be passed to the composite functions.	2011-06-20 02:03:23 -04:00
Søren Sandmann Pedersen	13aed37758	Add a test for over_x888_8_0565 in lowlevel_blt_bench(). The next few commits will speed this up quite a bit. Current output: --- reference memcpy speed = 2217.5MB/s (554.4MP/s for 32bpp fills) --- over_x888_8_0565 = L1: 54.67 L2: 54.01 M: 52.33 ( 18.88%) HT: 37.19 VT: 35.54 R: 29.40 RT: 13.63 ( 162Kops/s)	2011-01-28 14:35:17 -05:00
Søren Sandmann Pedersen	ba693d2e88	Fix search-and-replace issue in lowlevel-blt-bench.c	2010-09-28 02:52:17 -04:00
Søren Sandmann Pedersen	77d3e5f6ff	Rename all the fast paths with _8000 in their names to _8 This inconsistent naming somehow survived the refactoring from a while back.	2010-09-28 00:07:47 -04:00
Jonathan Morton	7cd4f2fa20	Add a lowlevel blitter benchmark This test is a modified version of Siarhei's compositor throughput benchmark. It's expanded with explicit reporting of memory bandwidth consumption for the M-test, and with an additional 8x8-random test intended to determine peak ops/sec capability. There are also quite a lot more operations tested for.	2010-09-21 08:50:18 -04:00

35 Commits
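
Commit 24e83cae64 above replaces strstr with strcmp so that an argument such as add_n_8 selects only the test of that exact name. The following is a minimal sketch of why substring matching over-selects. It is not the code from lowlevel-blt-bench.c: the helper names (matches_substring, matches_exact, count_matches) and the short sample_names table are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative sample of test names; NOT the real tests_tbl[]. */
static const char *const sample_names[] = {
    "add_8_8", "add_8_8_8", "add_n_8", "add_n_8_8888", "src_n_8", "src_n_8888"
};

/* Hypothetical helper: substring matching, in the style of a strstr check.
 * The argument only has to appear somewhere inside the test name. */
static int
matches_substring (const char *testname, const char *arg)
{
    return strstr (testname, arg) != NULL;
}

/* Hypothetical helper: exact matching, in the style of a strcmp check.
 * The argument has to be the whole test name. */
static int
matches_exact (const char *testname, const char *arg)
{
    return strcmp (testname, arg) == 0;
}

/* Count how many of the n names are selected by arg under a given rule. */
static int
count_matches (const char *const *names, size_t n, const char *arg,
               int (*match) (const char *, const char *))
{
    size_t i;
    int count = 0;

    for (i = 0; i < n; i++)
    {
        if (match (names[i], arg))
            count++;
    }
    return count;
}
```

With this sample table, count_matches (sample_names, 6, "add_n_8", matches_substring) also picks up add_n_8_8888, while the same call with matches_exact selects only add_n_8.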