The Microsoft C compiler cannot handle subobject initialization and
Win32 does not provide snprintf.
Work around these limitations by using normal struct initialization
and using sprintf (a manual check shows that the buffer size is
sufficient).
Previously 'make check' would compile and run tests first, and only
then proceed to compiling demos. Which is not very convenient
because of the need to scroll back console output to see the
tests verdict. Swapping order of SUBDIRS variable entries in
Makefile.am resolves this.
This allows some more code to be deleted from the X server. The
implementation consists of converting to trapezoids, and is shared
with pixman_composite_triangles().
The Render X extension can draw triangles as well as trapezoids, but
the implementation has always converted them to trapezoids. This patch
moves the X server's triangle conversion code into pixman, where we
can reuse the pixman_composite_trapezoid() code.
This function is an implementation of the X server request
Trapezoids. That request is what the X backend of cairo is using all
the time; by moving it into pixman we can hopefully make it faster.
This separates the test suite from the random gtk+ using test
programs. "demos" is somewhat misleading because the programs there
are not particularly exciting (with the possible exception of
composite-test which shows off all the compositing operators).
Scaling function now gets an extra boolean argument, which is set
to TRUE when we are fetching padding pixels for NONE repeat. This
allows to make a decision whether to interpret alpha as 0xFF or 0x00
for such pixels when working with formats which don't have alpha
channel (for example x8r8g8b8 and r5g6b5).
In addition to the most common case of not having any mask at all, two
variants of scaling with mask show up in cairo traces:
1. non-scaled a8 mask with SAMPLES_COVER_CLIP flag
2. solid mask
This patch extends the nearest scaling main loop template to also
support these cases.
Depending on CPU architecture, performance is in the range of 1.5 to 4 times
slower than simple nonrotated copy (which would be an ideal case, perfectly
utilizing memory bandwidth), but still is more than 7 times faster if
compared to general path.
This implementation sets a performance baseline for rotation. The use
of SIMD instructions may further improve memory bandwidth utilization.
Split this function into two, one that has a mask, and one that
doesn't. This is a fairly substantial speed-up in many cases.
New output of lowlevel-blt-bench over_x888_8_0565:
over_x888_8_0565 = L1: 63.76 L2: 62.75 M: 59.37 ( 21.55%) HT: 45.89 VT: 43.55 R: 34.51 RT: 16.80 ( 201Kops/s)
New output of lowlevel-blt-bench over_x888_8_0565:
over_x888_8_0565 = L1: 57.85 L2: 56.80 M: 54.14 ( 19.50%) HT: 42.64 VT: 40.56 R: 32.67 RT: 16.22 ( 195Kops/s)
Based in part on code by Steve Snyder from
https://bugs.freedesktop.org/show_bug.cgi?id=21173
New output of lowlevel-blt-bench over_x888_8_0565:
over_x888_8_0565 = L1: 55.68 L2: 55.11 M: 52.83 ( 19.04%) HT: 39.62 VT: 37.70 R: 30.88 RT: 14.62 ( 174Kops/s)
The fetcher is looked up in a table, so that other fetchers can easily
be added.
See also https://bugs.freedesktop.org/show_bug.cgi?id=20709
The next few commits will speed this up quite a bit.
Current output:
---
reference memcpy speed = 2217.5MB/s (554.4MP/s for 32bpp fills)
---
over_x888_8_0565 = L1: 54.67 L2: 54.01 M: 52.33 ( 18.88%) HT: 37.19 VT: 35.54 R: 29.40 RT: 13.63 ( 162Kops/s)
Instead of having each individual implementation decide which fallback
to use, move it into pixman-cpu.c, where a more global decision can be
made.
This is accomplished by adding a "fallback" argument to all the
pixman_implementation_create_*() implementations, and then in
_pixman_choose_implementation() pass in the desired fallback.
It seems to be relatively common for people to use development
snapshots of pixman thinking they are ordinary releases. This patch
makes it such that if the current minor version is odd, configure will
print a banner explaining the version number scheme plus information
about where to report bugs.
Removes useless variable declarations. This can only result in more
efficient code, as these variables where sometimes assigned, but
their values were never used.
Green Hills Software MULTI compiler was producing a number
of warnings due to incorrect uses of int instead of the correct
corresponding pixman_*_t type.