The general_composite_rect() function invokes the return_if_fail()
macro twice before any of its variable declarations. C89 does not
allow declarations after statements, so removing the two invocations
lets the file compile with a pre-C99 compiler.
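
To illustrate the constraint, here is a minimal sketch of a function
that a C89 compiler accepts. The macro body and the scale_width()
function are hypothetical stand-ins, not pixman code. This sketch keeps
its checks by placing them after the declarations, whereas the commit
simply drops them.

```c
#include <stddef.h>

/* Hypothetical stand-in for return_if_fail(); it expands to a
 * statement, which is what matters for C89. */
#define return_if_fail(expr) \
    do { if (!(expr)) return; } while (0)

/* C89-clean: every declaration precedes the first statement.
 * Writing return_if_fail() above "int factor" would be rejected by a
 * strict pre-C99 compiler. */
void
scale_width (const int *src, int *dest)
{
    int factor = 2;                 /* declarations come first */

    return_if_fail (src != NULL);   /* statements start here */
    return_if_fail (dest != NULL);

    *dest = *src * factor;
}
```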
Each scanline of the destination is bulk-loaded into a cached buffer on
the stack (using the QuadWordCopy routine) before being processed. This
is the primary benefit on uncached framebuffers, where the number of
accesses to framebuffer memory must be minimised and write-to-read
turnarounds avoided.
This also simplifies edge handling, since QuadWordCopy() can do a
precise writeback efficiently via the write-combiner, allowing the main
routine to "over-read" the scanline edge safely when required. This is
why the glyph's mask data is also copied into a temporary buffer of
known size.
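
The buffering scheme described above can be sketched in portable C as
follows. This is an illustration only: memcpy() stands in for
QuadWordCopy(), and MAX_WIDTH, process_group() and process_scanline()
are hypothetical names. The real routine is NEON assembly.

```c
#include <stdint.h>
#include <string.h>

#define MAX_WIDTH 64    /* hypothetical cap on scanline width, in pixels */
#define GROUP     8     /* pixels handled per iteration */

/* Hypothetical per-group kernel; here it just inverts 8 pixels. */
static void
process_group (uint16_t *px)
{
    int i;

    for (i = 0; i < GROUP; i++)
        px[i] = (uint16_t) ~px[i];
}

/* Returns 0 on success, -1 if the scanline does not fit the buffer. */
int
process_scanline (uint16_t *dest, int width)
{
    uint16_t buf[MAX_WIDTH + GROUP];    /* cached copy on the stack */
    int padded, x;

    if (width <= 0 || width > MAX_WIDTH)
        return -1;

    /* Round up to a whole number of groups. */
    padded = (width + GROUP - 1) / GROUP * GROUP;

    /* Bulk load: one pass of reads from the destination. */
    memcpy (buf, dest, (size_t) width * sizeof *buf);
    memset (buf + width, 0, (size_t) (padded - width) * sizeof *buf);

    /* The last group may run past the scanline edge; that is safe
     * because it only touches the padding in buf. */
    for (x = 0; x < padded; x += GROUP)
        process_group (buf + x);

    /* Precise writeback: exactly width pixels, one pass of writes. */
    memcpy (dest, buf, (size_t) width * sizeof *buf);
    return 0;
}
```

Because all reads from the destination happen before any write to it,
the framebuffer never sees a write followed by a read.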
Each group of 8 pixels is then processed using fewer instructions,
taking advantage of the lower precision requirements of the 6-bit
destination (so a simpler pixel multiply can be used) and using a more
efficient bit-repacking method.
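
As a scalar illustration of why a cheaper multiply can be acceptable,
the sketch below compares the usual exactly rounded x*a/255 with a
shift-based approximation that is off by at most one 8-bit step,
which is less than one step of a 5- or 6-bit channel. The function
names, the particular approximation and the blend of an opaque solid
colour through an 8-bit mask are assumptions made for this example.
The message does not spell out the arithmetic the NEON code uses.

```c
#include <stdint.h>

/* Exactly rounded x * a / 255 for 8-bit operands. */
static uint8_t
mul_un8_exact (uint8_t x, uint8_t a)
{
    uint32_t t = (uint32_t) x * a + 0x80;

    return (uint8_t) ((t + (t >> 8)) >> 8);
}

/* Cheaper approximation: one multiply and one shift. It is exact for
 * a == 0 and a == 255 and within 1 of the exact result otherwise. */
static uint8_t
mul_un8_cheap (uint8_t x, uint8_t a)
{
    return (uint8_t) (((uint32_t) x * (a + 1u)) >> 8);
}

/* Pack 8-bit channels into r5g6b5 by dropping the low bits. */
static uint16_t
pack_0565 (uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t) (((r & 0xF8u) << 8) | ((g & 0xFCu) << 3) | (b >> 3));
}

/* Blend an opaque solid colour onto one r5g6b5 pixel with 8-bit
 * coverage m: result = src * m + dest * (255 - m), per channel. */
uint16_t
blend_solid_0565 (uint16_t dest, uint8_t sr, uint8_t sg, uint8_t sb,
                  uint8_t m)
{
    unsigned r5 = dest >> 11, g6 = (dest >> 5) & 0x3F, b5 = dest & 0x1F;
    uint8_t  im = (uint8_t) (255 - m);

    /* Expand to 8 bits by replicating the top bits. */
    uint8_t dr = (uint8_t) ((r5 << 3) | (r5 >> 2));
    uint8_t dg = (uint8_t) ((g6 << 2) | (g6 >> 4));
    uint8_t db = (uint8_t) ((b5 << 3) | (b5 >> 2));

    /* The two products sum to at most 255, so no saturation is needed. */
    uint8_t r = (uint8_t) (mul_un8_cheap (sr, m) + mul_un8_cheap (dr, im));
    uint8_t g = (uint8_t) (mul_un8_cheap (sg, m) + mul_un8_cheap (dg, im));
    uint8_t b = (uint8_t) (mul_un8_cheap (sb, m) + mul_un8_cheap (db, im));

    return pack_0565 (r, g, b);
}
```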
(As an aside, this patch removes nearly twice as much code as it
introduces. Most of this is due to duplication of Ian's inner loop,
since he has to handle narrow cases separately. RVCT support is of
course preserved.)
We measured the doubling of performance by rendering strings of
96-pixel-high glyphs, which are fillrate limited rather than
latency/overhead limited. Performance also improves, albeit by a
smaller amount, on the more usual smaller text, which demonstrates
that internal overhead is not a problem.
X servers prior to
ebfd6688d1927288155221e7a78fbca9f9293952
relied on pixman not clipping to destination geometry whenever an
explicit clip region was set. Since only X servers set
source_clipping, we can use that setting as the trigger.
The new rule is:
- Output is clipped to the destination clip region.
- If a source image has the clip_sources property set, then there
  is an additional step, after repeating and transforming, but before
  compositing, where pixels that are not in the source clip are
  rejected. Rejected means no compositing takes place (not that the
  pixel is treated as 0). By default source clipping is turned off;
  when it is turned on, only client-set clips are honored.
The old rules were unclear and inconsistently implemented.
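
The new rule can be expressed as a per-pixel predicate. The sketch
below is an illustration under simplifying assumptions: clips are
single rectangles rather than regions, and rect_t, source_t and
should_composite() are hypothetical names, not pixman types. The
fields only mirror the properties named in the rule.

```c
/* Hypothetical clip: a half-open box [x1, x2) x [y1, y2). */
typedef struct { int x1, y1, x2, y2; } rect_t;

typedef struct
{
    int    clip_sources;     /* source clipping turned on? (off by default) */
    int    has_client_clip;  /* was the clip set by the client? */
    rect_t clip;
} source_t;

static int
in_rect (const rect_t *r, int x, int y)
{
    return x >= r->x1 && x < r->x2 && y >= r->y1 && y < r->y2;
}

/* (dx, dy) is the destination pixel; (sx, sy) is the matching position
 * in the already repeated and transformed source. Returns nonzero if
 * compositing takes place for this pixel. */
int
should_composite (const rect_t *dest_clip, const source_t *src,
                  int dx, int dy, int sx, int sy)
{
    /* Output is always clipped to the destination clip. */
    if (!in_rect (dest_clip, dx, dy))
        return 0;

    /* Optional extra step: reject pixels outside a client-set source
     * clip. A rejected pixel is skipped, not composited as 0. */
    if (src->clip_sources && src->has_client_clip &&
        !in_rect (&src->clip, sx, sy))
        return 0;

    return 1;
}
```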