The MSVC compiler is very strict about variable declarations after
statements.
Move all the declarations of each block before any statement in
the same block to fix multiple instances of:
pixman-mmx.c(xxxx) : error C2275: '__m64' : illegal use of this type
as an expression
The sRGB conversion has to be done every time the limits are being
computed. Without this fix, pixel_checker_get_min/max() will produce
the wrong results when called from somewhere other than
pixel_checker_check().
This declaration used to be necessary when
_pixman_image_get_scanline_generic_64() referred to a structure that
itself referred back to _pixman_image_get_scanline_generic_64().
This makes show_image() deal with more formats than just a8r8g8b8, in
particular, a8r8g8b8_sRGB can now be handled.
Images that are passed to show_image with a format of a8r8g8b8_sRGB
are displayed without modification under the assumption that the
monitor is approximately sRGB.
Images with a format of a8r8g8b8 are also displayed without
modification since many other users of show_image() have been
generating essentially sRGB data with this format. Other formats are
also assumed to be gamma compressed; these are converted to a8r8g8b8
before being displayed.
With these changes, srgb-test.c doesn't need to do its own conversion
anymore.
glyph-test would sometimes set a solid image as an alpha map, which is
not allowed. When this happened and the debug spew was enabled,
messages like this one would be generated:
*** BUG ***
In pixman_image_set_alpha_map: The expression
!alpha_map || alpha_map->type == BITS was false
Set a breakpoint on '_pixman_log_error' to debug
Fix this by not passing the ALLOW_SOLID flag to create_image() when
the resulting is to be used as an alpha map.
The rectangles in the clip region set in set_general_properties()
would sometimes overflow, which would lead to messages like these:
*** BUG ***
In pixman_region32_union_rect: Invalid rectangle passed
Set a breakpoint on '_pixman_log_error' to debug
when the micro version number of pixman is even.
Fix this by detecting the overflow and clamping such that the x2/y2
coordinates are less than INT32_MAX.
Composite checks random combinations of operations that now also have
sRGB sources, masks and destinations, and stress-test validates the
read/write primitives.
Simple sRGB color blender test can be used to determine if the sRGB processing
works as expected. It blends alpha ramps of purple and green together such that
at midpoint of image, 50 % blend of both is realized. At that point, sRGB-aware
processing yields a result close to #bbb rather than #888, which is the linear
light blending result.
The demo also contains the sample computation for sRGB premultiplied alpha.
sRGB format is defined as a new format type, PIXMAN_TYPE_ARGB_SRGB. One form of
this type is provided, PIXMAN_a8r8g8b8_sRGB. Use of an sRGB format triggers
wide processing, and the pixel fetch/store functions handle the relevant
conversion between color spaces. Pixman itself is thought to compose in the
linearized sRGB color space.
sRGB conversion is tabularized. For sRGB to linear, we are using only 256
values because the current source format uses 8 bits per component precision.
For linear to sRGB, it turns out that only 4096 brightness levels are required
to generate all of the 256 sRGB color values, and therefore only 12 bits per
component are considered during store. As a special case, a no-op
sRGB->linear->sRGB conversion is constructed to be lossless by adjusting the
sRGB->linear conversion table where necessary.
When not optimizing, write _mm_shuffle_pi16() as a statement
expression with inline assembly. That way we avoid
__builtin_ia32_pshufw(), which is only available when compiling with
-msse, while still allowing the non-optimizing gcc to understand that
the second argument is a compile time constant.
Tested-by: Knut Petersen <knut_petersen@t-online.de>
A new function pixman_cpuid() is added that runs the cpuid instruction
and returns the results. On GCC this function uses inline assembly; on
MSVC, the function calls the __cpuid intrinsic.
There is also a new function called have_cpuid() which detects whether
cpuid is available. On x86-64 and MSVC, it simply returns TRUE; on
x86-32 bit, it checks whether the 22nd bit of eflags can be
modified. On MSVC this does have the consequence that pixman will no
longer work CPUS without cpuid (ie., older than 486 and some 486
models).
These two functions together makes it possible to write a generic
detect_cpu_features() in plain C. This function is then used in a new
have_feature() function that checks whether a specific set of feature
bits is available.
Aside from the cleanups and simplifications, the main benefit from
this patch is that pixman now can do feature detection on x86-64, so
that newer instruction sets such as SSSE3 and SSE4.1 can be used. (And
apparently the assumption that x86-64 CPUs always have MMX and SSE2 is
no longer correct: Knight's Corner is x86-64, but doesn't have them).
V2: Rename the constants in the getisax() code, as pointed out by Alan
Coopersmith. Also reinstate the result variable and initialize
features to 0.
V3: Fixes for the fact that the upper 32 bits of a 64 bit register are
zeroed whenever the corresponding 32 bit register is written to.
V4: Fixes for the fact that in 32 bit mode, when gcc is not optimizing
there were not enough registers available. The new code uses the "a",
"b", "c", and "d" constraints instead, and has two separate versions
for 32 and 64 bit modes.
Declare functions *_inverse() and *_contains_rectangle() in the same
way as the other functions are declared. This doesn't imply any semantic
changes. It's just a unification of coding styles.
Get rid of the initialized and have_vmx static variables in
pixman-ppc.c There is no point to them since CPU detection only
happens once per process.
On Linux, just read /proc/self/auxv instead of generating the filename
with getpid() and don't bother with the stack buffer. Instead just
read the aux entries one by one.
Organize pixman-arm.c such that each operating system/compiler exports
a detect_cpu_features() function that returns a bitmask with the
various features that we are interested in. A new function
have_feature() then calls this function, caches the result, and return
whether the given feature is available.
The result is that all the pixman_have_arm_<feature> functions become
redundant and can be deleted.
There is no reason to have pixman_have_<feature> functions when all
they do is call pixman_have_mips_feature().
Instead rename pixman_have_mips_feature() to have_feature() and call
it directly from _pixman_mips_get_implementations(). Also on
non-Linux, just make have_feature() return FALSE.
Similar to the x86 commit, this moves the ARM specific CPU detection
to its own file which exports a pixman_arm_get_implementations()
function that is supposed to be a noop on non-ARM.
Extract the x86 specific parts of pixman-cpu.c and put them in their
own file called pixman-x86.c which exports one function
pixman_x86_get_implementations() that creates the MMX and SSE2
implementations. This file is supposed to be compiled on all
architectures, but pixman_x86_get_implementations() should be a noop
on non-x86.
When compiling with -O0, gcc doesn't understand that in
signed char x = 0;
...
asm ("...",
: "K" (x));
x is constant. Fix this by using an immediate constant instead of a
variable.
In fast_composite_tiled_repeat() if the source image is less than a
certain constant width, a clone is created which is then
pre-repeated. However, the source image's palette, if it has one, is
not cloned, so for indexed images, the pre-repeating would crash.
Fix this by not doing any pre-repeating for images with a palette set.
stress-test current almost never composites anything because the clip
rectangles and transformations are such that either
_pixman_compute_composite_region32() or analyze_extent() will return
FALSE.
Fix this by:
- making log_rand() return smaller numbers so that the clip rectangles
are more likely to be within the destination image
- adding rand_x() and rand_y() functions that pick positions within an
image and using them for positioning alpha maps and source/mask
positions.
- making it less likely that clip regions are used in general
These changes make the test take longer, so speed it up a little by
making most images smaller and by reducing the maximum convolution
filter from 17x19 to 3x4.
With these changes, stress-test reveals a crash in iteration 0xd39
where fast_composite_tiled_repeat() creates an indexed image without a
palette.