Compare commits

885 Commits

Author SHA1 Message Date
Dylan Aïssi
b85678a8de debian/copyright: Convert to machine-readable format 2025-07-31 22:22:45 +02:00
Timo Aaltonen
7d26aad890 releasing package pixman version 0.44.0-3 2024-11-09 11:03:01 +02:00
Timo Aaltonen
07627e9f31 Replace timeout bump patch by using a multiplier option instead. Thanks, Aurelien Jarno! (Closes: #1086999) 2024-11-09 11:02:51 +02:00
Timo Aaltonen
dc43d37962 releasing package pixman version 0.44.0-2 2024-11-08 09:58:31 +02:00
Timo Aaltonen
c05da7d917 patches: Increase test timeout 120->240s. (Closes: #1086999) 2024-11-08 09:53:41 +02:00
Timo Aaltonen
e55fd151a2 releasing package pixman version 0.44.0-1 2024-11-07 16:48:34 +02:00
Timo Aaltonen
7d5149536f rules: Drop obsolete dbgsym-migration. 2024-11-07 15:54:27 +02:00
Timo Aaltonen
2ad078304f control: Migrate to pkgconf. 2024-11-07 15:53:40 +02:00
Timo Aaltonen
7cca9d2d9a symbols: Updated. 2024-11-07 15:45:25 +02:00
Timo Aaltonen
c8cb00a5ad control, rules: Build with meson. 2024-11-07 15:45:17 +02:00
Timo Aaltonen
b87363cd49 patches: Refresh patch. 2024-11-07 14:31:18 +02:00
Timo Aaltonen
2e58ff85bd version bump 2024-11-07 14:30:41 +02:00
Timo Aaltonen
31b00cc770 Merge branch 'upstream-unstable' into debian-unstable 2024-11-07 14:29:36 +02:00
Matt Turner
ae6646f159 Pre-release version bump to 0.44.0 2024-11-05 11:51:31 -05:00
Lance Arsenault
126d61e796 pixman: Add library destructor
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/111
2024-11-05 04:31:04 +00:00
f wasil
a987256be8 Fixed memory leak in tests 2024-11-05 03:39:54 +00:00
f wasil
0e424031bd RISC-V floating point operations 2024-10-30 03:39:37 +00:00
Changqing Li
643f098a39 pixman-combine-float.c: fix inlining failed error
As noted in [1], always_inline is not recommended when there are indirect
calls, so replace force_inline with inline to fix errors like:
In function ‘combine_inner’,
    inlined from ‘combine_soft_light_ca_float’ at ../pixman/pixman-combine-float.c:655:511:
../pixman/pixman-combine-float.c:655:211: error: inlining failed in call to ‘always_inline’ ‘combine_soft_light_c’: function not considered for inlining

Tested with gcc-9 and gcc-14; both work well.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115679

Signed-off-by: Changqing Li <changqing.li@windriver.com>
2024-10-30 01:34:41 +00:00
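The idea behind 643f098a39 can be sketched as follows. This is a minimal illustration, not pixman's actual code: a combiner reached through a function pointer is an indirect call, so plain `inline` leaves the inlining decision to the compiler instead of turning a refusal into an error. The names combine_fn, combine_add and apply_combiner are hypothetical.

```c
/* Hypothetical combiner signature: source and destination channel in, result out. */
typedef float (*combine_fn) (float src, float dst);

/* Plain inline is only a hint; the compiler may decline without failing the build. */
static inline float
combine_add (float src, float dst)
{
    return src + dst;
}

/* The combiner is called through a pointer, i.e. indirectly. */
static inline float
apply_combiner (combine_fn fn, float src, float dst)
{
    return fn (src, dst);
}
```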
Marek Pikuła
90f9cf1726
ci: Disable coverage for arm-v5 and mipsel targets
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 16:49:50 +02:00
Marek Pikuła
bc2ec45d3b
ci: Add auto_cancel policy
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 16:49:41 +02:00
Marek Pikuła
de59d1a9fb
ci: Don't execute failing jobs
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 16:49:40 +02:00
Marek Pikuła
15336dc7cd
ci: Pin gcovr version to 7.x
Temporary version pin of gcovr due to errors in coverage report
generation when running with newly released version 8.x.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 13:17:47 +02:00
Marek Pikuła
0476eda33a
ci: Remove MESON_TESTTHREADS workaround
https://github.com/mesonbuild/meson/pull/13604 got merged and released
with Meson 1.6.0, which we already use in the Docker images, so the
workaround can be dropped.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 13:17:25 +02:00
Marek Pikuła
11e51bc72f
ci: Disable OpenMP for Win32 target
OpenMP introduces random stack overflow errors for 32-bit Windows
target.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-14 16:12:44 +02:00
Marek Pikuła
277f485a9c
ci: Add missing ":failing" suffix for linux-ppc job
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-27 00:22:55 +02:00
Marek Pikuła
126b083142
ci: Add option to use different version of LLVM
Some targets require a different version of LLVM, so it's now possible to
set it in the target's environment. Mind that the highest available
version depends on the base Debian image.

The change bumps LLVM version for all Linux targets:
- by default from 14 to 16,
- from 16 to 18 for riscv64 (based on Sid; for now, LLVM 19 doesn't have
  libomp packaged),
- mipsel stays at 14 as there seem to be some missing packages for
  higher versions.

Windows targets stay the same, as they use a different source of LLVM
(MinGW-compatible, which is currently version 18).

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-27 00:22:54 +02:00
Marek Pikuła
a3d297fa46
ci: riscv64: Verify if tests run on target without RVV
This ensures that runtime discovery works correctly and that RVV code is
disabled for targets without the RVV extension.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-26 23:33:52 +02:00
Marek Pikuła
9176847f1d
ci: riscv64: Don't force enable RVV globally
RVV compilation will be enabled for RVV implementation alone, similar to
other platforms. This prevents introducing autovectorized code in the
main library, thus making pixman compatible with RISC-V targets without
RVV.
2024-09-26 23:33:52 +02:00
Marek Pikuła
76b133f293
ci: Fix active target rule for Docker stage
The `if` rule condition for selectively running Docker image builds was
ill-formed. It resulted in all images being built even when not all targets
were selected with the ACTIVE_TARGET_PATTERN variable.
2024-09-26 21:54:21 +02:00
Marek Pikuła
b7ac7cd122
ci: Fix Docker image source for MRs
If the MR doesn't modify the Docker context, the pipeline should use the
image from upstream.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-25 20:20:08 +02:00
Marek Pikuła
ffa5645a2d
ci: Add support for Windows on ARM
It uses LLVM MinGW pre-built toolchain, and wine-arm64 base Docker image
from Linaro.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:21:02 +02:00
Marek Pikuła
51dcfb8027
ci: Add support for LLVM for Windows targets
It uses LLVM MinGW project to get the precompiled LLVM toolchain for
cross-compilation.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:21:01 +02:00
Marek Pikuła
c0ee08aab0
ci: Add LLVM support to the CI workflow
Add support for LLVM for all targets. Mind that in the current state,
some targets fail either build or test stage. For the time being, these
jobs are marked with `:failing` job name suffix.

Relevant issues:
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/105
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/106
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/107
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/108
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/109
2024-09-03 18:21:00 +02:00
Marek Pikuła
44927bf1e1
ci: Unify build and test stage as job templates
This commit unifies codecov and pltcov build and test stages as single
parametrizable GitLab job templates. This cleans up the pipeline flow in
preparation for LLVM support in the pipeline.

Each target now has a Meson cross file, even when using a native
compiler, so that the job template can be better generalized. This also
allows moving architecture-specific build configuration to the cross
file instead of using additional Meson flags in the job declaration.
2024-09-03 18:20:59 +02:00
Marek Pikuła
19b1a98e8d
ci: Unify Docker image as multi-stage build
This commit merges codecov and pltcov Dockerfiles into a single,
multi-stage Dockerfile. This results in more streamlined Docker image
builds with some common layers which can be reused by multiple images.

Also, by making a common Dockerfile, all common dependencies have the
same exact description, which decreases disparity between different
images for all the supported architectures. Mind that package version
disparity cannot be prevented 100%, as different base images may be used
for different architectures (e.g., Debian Sid for riscv64 instead of
Bookworm).

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:20:58 +02:00
Marek Pikuła
028213b588
ci: Unify target enable flag
It replaces CODE- and PLT- specific target enable variables. It is a
ground work for unification of codecov and pltcov flows.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:20:57 +02:00
Marek Pikuła
05b5ecd934
ci: Use env files instead of awk script
It makes per-target environment declaration more extensible, as it's
now possible to set custom env variables only for the selected target
for the entire pipeline workflow in a centralized way.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:20:56 +02:00
Julia DeMille
726d77f6fe mmx: Fix compilation with clang-cl 2024-09-03 00:35:47 +00:00
Marek Pikuła
0cb4fbe324
ci: Fix Docker change detection
There was a missing wildcard for Docker directory
change detection, so basically this rule was not
checked correctly.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-21 18:46:07 +02:00
Marek Pikuła
4047a553d9
ci: Add platform coverage targets
Platform coverage checks if the code builds and executes properly for
architectures that are not officially supported by Debian. They don't
contribute to general code coverage report but provide a valuable
insight if all supported platforms are working correctly.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-20 18:05:44 +02:00
Marek Pikuła
cbf9d7e0d3
ci: Add architecture coverage Docker images
Add images providing an environment for architecture coverage tests.
There is a separate build for Linux and Windows, as the Windows image is
really large compared to Linux one. It decreases the execution time of
both targets, as the images needed to be pulled by runners are smaller.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:15:30 +02:00
Marek Pikuła
c35e47bd88
ci: Increase granularity of Docker build selection
Now, it's possible to selectively disable Docker image builds. Before,
it was only possible to disable build/test jobs for a given
architecture.
2024-08-16 20:10:21 +02:00
Marek Pikuła
e7ef051a6d
ci: Build and test on the supported platforms
This commit introduces a build and test CI workflow, which tests the
correctness of execution for nearly all configurations supported by
pixman. The notable exception is ARM iWMMXt, which is omitted as it's
soon to be deprecated as mentioned in #98.

The build and test stage is separated, as a single build can be used to
test multiple configurations for a given platform (e.g., MMX, SSE2,
SSSE3 for x86).

Execution is performed using multi-arch Docker images built in the
`docker` stage. The important thing to note is that the runner needs to
have a relatively recent version of Docker and QEMU, and needs to have
the qemu-user-static+binfmt execution enabled.

Once all tests are complete, coverage reports are merged together in the
`summary` stage. Then the result can be used in a GitLab-native coverage
report summary.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:04:49 +02:00
Marek Pikuła
2d35a8769c
mips: Add option to force MIPS CPU feature discovery
Used to force feature discovery in CI where /proc/cpuinfo is unreliable.
It can happen, e.g., if executed in qemu-user-static mode.

For such a build, MIPS-specific features need to be manually disabled by
using `PIXMAN_DISABLE` env variable.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:03:29 +02:00
Marek Pikuła
15af6fd0bc
mips: Widen CPU family check for DSPr2
DSPr2 can be available for targets other than mips32. Some distros
(e.g., Debian) don't support mips32 but still support mipsel. Extending
the check enables use of such images for testing.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:03:28 +02:00
Marek Pikuła
a7263190c2
ci: Add multiarch Docker image build
The image is used in CI pipeline to build and test on different
architectures.

This commit introduces more extensible GitLab CI scheme borrowed from
qemu project.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:03:19 +02:00
Marek Pikuła
b753a6f49b
mips: Fix a typo in mips_dspr2_flags
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-14 14:13:07 +02:00
Even Rouault
6410ec79bd pixman-combine-float.c: fix typo in MAKE_NON_SEPARABLE_PDF_COMBINERS()
There's a copy&paste typo updating sc.g twice when there's a mask
2024-08-14 02:48:25 +00:00
Marco Trevisan
5b8e928139 pixman-region: Make translate a no-op when using 0 offsets
This spares callers from having to optimize this code path themselves when this scenario occurs.
And it can certainly occur when the function is not called explicitly.
2024-08-14 02:41:08 +00:00
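The early return described in 5b8e928139 can be sketched as below. This is an assumption-laden illustration rather than the pixman implementation: box_t and box_translate are hypothetical stand-ins for a region and its translate function.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a region: a single bounding box. */
typedef struct {
    int32_t x1, y1, x2, y2;
} box_t;

/* Shift a box by (dx, dy). Returns 1 if the box was modified, 0 otherwise. */
static int
box_translate (box_t *box, int32_t dx, int32_t dy)
{
    /* Zero offsets change nothing, so return before touching the data. */
    if (dx == 0 && dy == 0)
        return 0;

    box->x1 += dx;
    box->y1 += dy;
    box->x2 += dx;
    box->y2 += dy;
    return 1;
}
```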
Matt Turner
2e29b7c43d iwmmxt: Drop support
In all likelihood unused for at least many years, and possibly ever.

Support is deprecated and will be removed in gcc-15. See deprecation
notice in https://gcc.gnu.org/gcc-13/changes.html

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/98
2024-08-13 13:51:36 -04:00
Peter Hutterer
e5f8efc4c7 ci: add workflow rules to allow for MR pipelines
See
https://gitlab.freedesktop.org/freedesktop/freedesktop/-/wikis/GitLab-CI#for-project-developers
2024-08-07 09:59:34 +10:00
Bill Roberts
7ed0f8d04d
aarch64: support PAC and BTI
Enable Pointer Authentication Codes (PAC) and Branch Target
Identification (BTI) support for ARM 64 targets.

PAC works by signing the LR with either an A key or B key and verifying
the return address. There are quite a few instructions capable of doing
this, however, the Linux ARM ABI is to use hint compatible instructions
that can be safely NOP'd on older hardware and can be assembled and
linked with older binutils. This limits the instruction set to paciasp,
pacibsp, autiasp and autibsp. Instructions prefixed with pac are for
signing and instructions prefixed with aut are for authenticating. Both
instructions are then followed with an a or b to indicate which signing
key they are using. The keys can be controlled using
-mbranch-protection=pac-ret for the A key and
-mbranch-protection=pac-ret+b-key for the B key.

BTI works by marking all call and jump positions with bti c and bti
j instructions. If execution control transfers to an instruction other
than a BTI instruction, the execution is killed via SIGILL. Note that,
to save one instruction, the aforementioned pac instructions will
also work as a BTI landing pad for bti c usages.

For BTI to work, all object files linked for a unit of execution,
whether an executable or a library must have the GNU Notes section of
the ELF file marked to indicate BTI support. This is so loader/linkers
can apply the proper permission bits (PROT_BTI) on the memory region.

PAC can also be annotated in the GNU ELF notes section, but it's not
required for enablement, as interleaved PAC and non-pac code works as
expected since it's the callee that performs all the checking. The
linker follows the same rules as BTI for discarding the PAC flag from
the GNU Notes section.

Testing was done under the following CFLAGS and CXXFLAGS for all
combinations:
1. -mbranch-protection=none
2. -mbranch-protection=standard
3. -mbranch-protection=pac-ret
4. -mbranch-protection=pac-ret+b-key
5. -mbranch-protection=bti

Signed-off-by: Bill Roberts <bill.roberts@arm.com>
2024-07-22 16:57:13 -05:00
Bill Roberts
3a32506877
arm: add include guards on header
Prevent double inclusion of header file.

Signed-off-by: Bill Roberts <bill.roberts@arm.com>
2024-07-22 16:57:13 -05:00
Mike Hommey
865e6ce00b pixman: Adjust arm assembly for binutils change
A change in the latest version of binutils broke building pixman for arm.

The binutils change:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=226749d5a6ff0d5c607d6428d6c81e1e7e7a994b

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/96
2024-07-12 15:55:33 -04:00
Matt Turner
b252d40714 Post-release version bump to 0.43.5 2024-02-29 11:19:46 -05:00
Matt Turner
54cad71674 Pre-release version bump to 0.43.4 2024-02-29 11:13:20 -05:00
Matt Turner
add7c8db45 pixman-arm: Use unified syntax
Allows us to use the same assembly without a bunch of #ifdef __clang__.
2024-02-29 10:47:07 -05:00
Makoto Kato
63ae6af9a6 pixman-arm: Fix build on clang/arm32
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/74
2024-02-29 10:47:00 -05:00
Matt Turner
033716e99a Revert "Allow to build pixman on clang/arm32"
This reverts merge request !78
2024-02-29 15:41:37 +00:00
Heiko Lewin
74130e84c5 Allow to build pixman on clang/arm32 2024-02-29 14:46:55 +00:00
Matt Turner
63332b4e72 pixman-x86: Move #include "cpuid.h" inside conditionals
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/93
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/94
2024-02-25 17:28:14 -05:00
Matt Turner
8c6d59a9f8 pixman-x86: Use cpuid.h header 2024-02-24 12:36:53 -05:00
Gayathri Berli
ac485a9b66 Revert the changes to fix the problem in big-endian architectures
This reverts commit b4a105d772.

There is an endianness issue in pixman-fast-path.c. In the function
bits_image_fetch_separable_convolution_affine we have this code:

#ifdef WORDS_BIGENDIAN
	buffer[k] = (satot << 0) | (srtot << 8) | (sgtot << 16) | (sbtot << 24);
#else
	buffer[k] = (satot << 24) | (srtot << 16) | (sgtot << 8) | (sbtot << 0);
#endif

This will write out the pixels as BGRA on big endian systems but
obviously that's wrong. Pixel order should be ARGB on big endian systems
so we don't need any #ifdef for big endian here at all. Instead, the
code should be the same on little and big endian, i.e. it should be just
this line instead of the 5 lines above:

	buffer[k] = (satot << 24) | (srtot << 16) | (sgtot << 8) | (sbtot << 0);

Changing the code like this fixes the wrong colors that I get with
pixman on my PowerPC/s390x system.

Here is what cairo.h has to say (which is rooted in pixman):

 * @CAIRO_FORMAT_ARGB32: each pixel is a 32-bit quantity, with
 *   alpha in the upper 8 bits, then red, then green, then blue.
 *   The 32-bit quantities are stored native-endian. Pre-multiplied
 *   alpha is used. (That is, 50% transparent red is 0x80800000,
 *   not 0x80ff0000.) (Since 1.0)

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/78
Signed-off-by: Gayathri Berli <gayathri.berli@ibm.com>
2024-02-24 12:28:30 -05:00
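To illustrate the reasoning in ac485a9b66: shifts act on the value of a 32-bit integer rather than on its bytes in memory, so a single expression produces ARGB on both little- and big-endian hosts. A minimal sketch follows; the helper names pack_argb32 and argb32_alpha are hypothetical and not part of pixman.

```c
#include <stdint.h>

/* Pack four 8-bit channels into one native-endian ARGB32 value.
 * The shifts work on the integer value, so no WORDS_BIGENDIAN
 * special case is needed. */
static uint32_t
pack_argb32 (uint32_t a, uint32_t r, uint32_t g, uint32_t b)
{
    return (a << 24) | (r << 16) | (g << 8) | (b << 0);
}

/* Read the alpha channel back from the upper 8 bits. */
static uint32_t
argb32_alpha (uint32_t pixel)
{
    return pixel >> 24;
}
```

With premultiplied alpha, 50% transparent red packs to 0x80800000, matching the cairo.h comment quoted in the commit message.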
Simon Ser
fdd7161097 Post-release version bump to 0.43.3 2024-01-28 13:32:42 +01:00
Simon Ser
91b8526c1e Pre-release version bump to 0.43.2 2024-01-28 13:26:31 +01:00
Simon Ser
e8bb34e302 Drop contrib/ci.sh
This is unused and outdated (Autotools is no longer supported).

Signed-off-by: Simon Ser <contact@emersion.fr>
2024-01-28 12:23:29 +00:00
Simon Ser
43773c69db Drop ChangeLog
This file is empty and unused.

Signed-off-by: Simon Ser <contact@emersion.fr>
2024-01-28 12:22:00 +00:00
Simon Ser
8c39ce2437 Drop automatic DEBUG define
We don't use the historical odd stable release numbering scheme
anymore.

Developers can still enable this debugging code via CFLAGS=-DDEBUG.

Signed-off-by: Simon Ser <contact@emersion.fr>
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/87
2024-01-27 13:15:28 +01:00
Simon Ser
8e4be8c2db Post-release version bump to 0.43.1
Signed-off-by: Simon Ser <contact@emersion.fr>
2024-01-04 11:48:38 +01:00
Simon Ser
6c2e4a0dd9 Pre-release version bump to 0.43.0
Signed-off-by: Simon Ser <contact@emersion.fr>
2024-01-04 11:01:05 +01:00
Matt Turner
396e1a76ed test: Use fabsl on float128 2024-01-03 21:40:12 -05:00
Matt Turner
7e76c96281 pixman-access: Mark __dummy__ variables with MAYBE_UNUSED 2024-01-03 21:24:46 -05:00
Matt Turner
af101d3c21 pixman-mmx: Don't redefine _MM_SHUFFLE 2024-01-03 21:24:46 -05:00
Matt Turner
20cc4ee0e9 pixman-sse2: Remove unused functions 2024-01-03 21:24:46 -05:00
Simon Ser
7883ab8d63 ci: upgrade to Fedora 39
Fedora 28 is super old.

Signed-off-by: Simon Ser <contact@emersion.fr>
2023-12-15 13:21:09 +01:00
Pavel Labath
86f9162332 Fix alignment problem in pixman-fast-path.c
The variable is accessed through uint32_t pointer, so it needs to be
aligned to avoid undefined behavior (crashes on architectures which
require aligned accesses).

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/84
2023-12-15 13:10:52 +01:00
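The alignment requirement from 86f9162332 can be illustrated with two common approaches. This sketch does not reproduce the actual patch; aligned_buf_t, first_word and load_u32_unaligned are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* A union with a uint32_t member is aligned at least as strictly as
 * uint32_t, so its storage can safely be read as 32-bit words. */
typedef union {
    uint8_t  bytes[8];
    uint32_t words[2];
} aligned_buf_t;

/* Read the first 32-bit word of a buffer that is aligned by construction. */
static uint32_t
first_word (const aligned_buf_t *buf)
{
    const uint32_t *p = buf->words;
    return p[0];
}

/* For storage of unknown alignment, memcpy avoids a misaligned load. */
static uint32_t
load_u32_unaligned (const uint8_t *src)
{
    uint32_t v;
    memcpy (&v, src, sizeof v);
    return v;
}
```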
Benjamin Gilbert
b4b789df5b meson: avoid linking with -pthread if we don't have pthreads
Meson always returns -pthread in dependency('threads') on non-MSVC
compilers.  Fix a link error when building on MinGW without winpthreads.
2023-11-08 18:43:10 +00:00
Sam James
08115a4217
pixman-bits-image: fix -Walloc-size
GCC 14 introduces a new -Walloc-size included in -Wextra which gives (when forced
to be an error):
```
../pixman/pixman-bits-image.c: In function ‘create_bits’:
../pixman/pixman-bits-image.c:1273:16: error: allocation of insufficient size ‘1’ for type ‘uint32_t’ {aka ‘unsigned int’} with size ‘4’ [-Werror=alloc-size]
 1273 |         return calloc (buf_size, 1);
      |                ^~~~~~~~~~~~~~~~~~~~
```

The calloc prototype is:
```
void *calloc(size_t nmemb, size_t size);
```

So, just swap the number of members and size arguments to match the prototype, as
we're initialising 1 element of size `buf_size`. GCC then sees we're not
doing anything wrong.

Signed-off-by: Sam James <sam@gentoo.org>
2023-11-07 22:31:05 +00:00
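The argument order described in 08115a4217 can be sketched as follows. The wrapper name alloc_zeroed is hypothetical; only calloc itself is a real API here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate one zeroed block of buf_size bytes, viewed as uint32_t words.
 * calloc takes (nmemb, size): one member of buf_size bytes, which is
 * the order that keeps -Walloc-size quiet. */
static uint32_t *
alloc_zeroed (size_t buf_size)
{
    return calloc (1, buf_size);
}
```

The number of bytes allocated is the same either way; only the order of the arguments changes.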
Havard Eidnes
47a1c3d330 vmx: Reimplement create_mask_32_128 and use it in vmx_fill
Based on suggestion from @siamashka.

This lets the compiler pick the vector instruction to use which is
usually the best idea.

Use create_mask_32_128() instead of create_mask_1x32_128() in
vmx_fill(), avoiding loading memory beyond the filler argument on the
stack.

Remove the now-unused create_mask_1x32_128(). This gets rid of some
(correct) warnings from the compiler about indexing beyond the variable
in question.
2023-08-30 12:14:40 -04:00
Havard Eidnes
634b8196d2 vmx: Simplify scaled_nearest_scanline_vmx_8888_8888_OVER
Since combine4() does not take vector variables as arguments, there's no
need to use a vector variable and casts back and forth to normal scalars
for the arguments.
2023-08-30 12:14:26 -04:00
Matt Turner
753f5e095e meson: Fix syntax 2023-08-30 11:58:04 -04:00
Simon Ser
7aeeb501ad Fix const warnings in pixman_image_set_clip_region()
Fixes the following warnings:

    pixman-image.c: In function 'pixman_image_set_clip_region':
    pixman-image.c:601:81: warning: passing argument 2 of 'pixman_region32_copy_from_region16' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
      601 |         if ((result = pixman_region32_copy_from_region16 (&common->clip_region, region)))
          |                                                                                 ^~~~~~
    In file included from pixman-image.c:32:
    pixman-private.h:859:56: note: expected 'pixman_region16_t *' {aka 'struct pixman_region16 *'} but argument is of type 'const pixman_region16_t *' {aka 'const struct pixman_region16 *'}
      859 |                                     pixman_region16_t *src);
          |                                     ~~~~~~~~~~~~~~~~~~~^~~
    pixman-utils.c:240:1: error: conflicting types for 'pixman_region16_copy_from_region32'; have 'pixman_bool_t(pixman_region16_t *, pixman_region32_t *)' {aka 'int(struct pixman_region16 *, struct pixman_region32 *)'}
      240 | pixman_region16_copy_from_region32 (pixman_region16_t *dst,
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from pixman-utils.c:31:
    pixman-private.h:862:1: note: previous declaration of 'pixman_region16_copy_from_region32' with type 'pixman_bool_t(pixman_region16_t *, const pixman_region32_t *)' {aka 'int(struct pixman_region16 *, const struct pixman_region32 *)'}
      862 | pixman_region16_copy_from_region32 (pixman_region16_t *dst,
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    pixman-utils.c:270:1: error: conflicting types for 'pixman_region32_copy_from_region16'; have 'pixman_bool_t(pixman_region32_t *, pixman_region16_t *)' {aka 'int(struct pixman_region32 *, struct pixman_region16 *)'}
      270 | pixman_region32_copy_from_region16 (pixman_region32_t *dst,
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from pixman-utils.c:31:
    pixman-private.h:858:1: note: previous declaration of 'pixman_region32_copy_from_region16' with type 'pixman_bool_t(pixman_region32_t *, const pixman_region16_t *)' {aka 'int(struct pixman_region32 *, const struct pixman_region16 *)'}
      858 | pixman_region32_copy_from_region16 (pixman_region32_t *dst,
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Simon Ser <contact@emersion.fr>
2023-08-30 15:49:50 +00:00
Matt Turner
7169c0404f Use more Markdown-friendly syntax 2023-08-30 11:15:00 -04:00
Matt Turner
f1072b07eb Remove generic build system information 2023-08-30 11:14:04 -04:00
Gauthier Östervall
2cf9ae1cea Update build instructions to meson and ninja 2023-08-30 11:12:41 -04:00
Dylan Baker
72c4245b2e delete win32 make files
meson can handle building for win32 (including using visual studio, and
mingw), and does a good deal more than these could. Since we're dropping
autotools, we might as well drop these too.
2023-08-30 10:54:46 -04:00
Dylan Baker
55eb680a1f autotools: remove autotools
At this point meson is pretty well tested and seems to pretty much work,
so we can consider dropping an extra build system.

This doesn't solve the problem that pixman's release scripts are part of
the autotools build system (as make targets). One solution might be to
use xorg's release.sh instead.
2023-08-30 10:51:27 -04:00
Matt Turner
593a970266 test: Revert to including pixman-private.h
This broke the Visual Studio builds in GTK's CI system.
2023-07-19 15:08:22 -04:00
Heiko Lewin
67490a8bc1
pixman-arma64: Adjustments to build with llvm integrated assembler
This enables building the aarch64 assembly with clang.
Changes:
1. Use `.func` or `.endfunc` only if available
2. Prefix macro arg names with `\` 
3. Use `\()` instead of `&`
4. Always use commas to separate macro arguments
5. Prefix asm symbols with an underscore if necessary
2023-07-18 07:20:01 +02:00
Benjamin Gilbert
47d3fbe38f mmx: use xmmintrin.h if building with SSE2
As of mingw-w64 commit 463f00975, winnt.h includes emmintrin.h when
compiling with SSE2, causing redefinition errors for our copied MMX
intrinsics.  If the build is assuming SSE2 anyway, just use the system
header instead.
2023-07-09 01:56:40 +00:00
Simon Ser
55845c3dd3 Constify pixman_image_set_clip_region()
This function copies the region passed in.

Signed-off-by: Simon Ser <contact@emersion.fr>
2023-07-09 01:53:48 +00:00
Simon Ser
672f67db96 Add pixman_region{,32}_empty()
Inverse of pixman_region32_not_empty().

Most of the time, callers want to check whether a region is empty,
not whether a region is not empty. This results in code with
double-negatives such as !pixman_region32_not_empty(), which is
confusing to read.

Signed-off-by: Simon Ser <contact@emersion.fr>
2023-07-09 01:48:29 +00:00
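The readability argument in 672f67db96 can be shown with a toy example. The type toy_region_t and both functions are hypothetical stand-ins, not pixman's region API.

```c
#include <stdbool.h>

/* Hypothetical region: extents only; empty when it has no area. */
typedef struct {
    int x1, y1, x2, y2;
} toy_region_t;

/* The older style of predicate: true when the region has area. */
static bool
toy_region_not_empty (const toy_region_t *r)
{
    return r->x1 < r->x2 && r->y1 < r->y2;
}

/* The inverse predicate, so callers can avoid "!..._not_empty()". */
static bool
toy_region_empty (const toy_region_t *r)
{
    return !toy_region_not_empty (r);
}
```

A caller can then write `if (toy_region_empty (&r))` instead of the double negative `if (!toy_region_not_empty (&r))`.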
Benjamin Gilbert
48d5df1f37 meson: don't dllexport when built as static library
If a static Pixman is linked with a dynamic library, Pixman shouldn't
export its own symbols into the latter's ABI.
2023-07-08 17:36:00 -04:00
Emanuel Schmidt
e4c878d179 Fixed missing dependency in libdemo
After the latest changes and separation of demo- and test-targets,
it was visible that a dependency towards `libtestutils_dep` was
missing in one of the demo-dependencies. This change will fix
this particular problem.
2023-02-17 18:52:14 +01:00
Emanuel Schmidt
ee145e53d1 Changed name of the config-header to "pixman-config.h" 2023-02-14 22:20:12 +01:00
Emanuel Schmidt
eb998d7b65 Separate meson build options for demos and tests 2023-02-08 20:56:05 +01:00
Emilio Pozuelo Monfort
a7a919b881 Release to sid 2022-11-11 13:42:32 +01:00
Emilio Pozuelo Monfort
a4e8d8901f Remove patch for CVE-2022-44638 included in 0.42.2 2022-11-08 13:24:10 +01:00
Emilio Pozuelo Monfort
590b8eb08f New upstream release 2022-11-08 13:11:47 +01:00
Emilio Pozuelo Monfort
dbe5c715e6 Merge branch 'upstream-unstable' into debian-unstable 2022-11-08 13:11:17 +01:00
Emilio Pozuelo Monfort
e71a54d0f0 Import 0.40.0-1.1 NMU
* Avoid integer overflow leading to out-of-bounds write (CVE-2022-44638)
  (Closes: #1023427)
2022-11-08 13:03:18 +01:00
Heiko Lewin
713077d0a3 Fix signed-unsigned semantics in reduce_32 2022-11-03 19:13:41 +00:00
Matt Turner
618e3d4283 Post-release version bump to 0.42.3 2022-11-03 09:53:12 -04:00
Claude Heiland-Allen
40d6c9b256 add r8g8b8 sRGB to test suite
Signed-off-by: Claude Heiland-Allen <claude@mathr.co.uk>
2022-11-03 12:51:47 +00:00
Claude Heiland-Allen
83ba024483 implement r8g8b8 sRGB (without alpha)
Signed-off-by: Claude Heiland-Allen <claude@mathr.co.uk>
2022-11-03 12:51:47 +00:00
Matt Turner
37216a3283 Pre-release version bump to 0.42.2 2022-11-02 13:25:48 -04:00
Matt Turner
a1f88e842e Avoid integer overflow leading to out-of-bounds write
Thanks to Maddie Stone and Google's Project Zero for discovering this
issue, providing a proof-of-concept, and a great analysis.

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/63
2022-11-02 13:25:48 -04:00
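As a generic illustration of the class of bug named in a1f88e842e, and explicitly not the actual pixman fix, a size computation can be checked before it is used to index or allocate memory. The function name checked_mul_size is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute width * height without wrapping around.
 * Returns 0 and stores the product in *out on success,
 * or -1 if the product would not fit in size_t. */
static int
checked_mul_size (size_t width, size_t height, size_t *out)
{
    if (height != 0 && width > SIZE_MAX / height)
        return -1;

    *out = width * height;
    return 0;
}
```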
Matt Turner
c3bbb94b4c Revert "Fix signed-unsigned semantics in reduce_32"
This reverts commit aaf59b0338.

This commit regressed the scaling-test unit test, by apparently allowing
the compiler to emit fused multiply-add instructions in cases they
wouldn't have been allowed before. While using gcc's -ffp-contract=...
flag avoids the issue on amd64, it does not on at least aarch64 and
ppc64.

This is unfortunate, because the commit being reverted resolved
https://gitlab.freedesktop.org/pixman/pixman/-/issues/43 so we will
reintroduce this failure, but after more than a year without a fix for
the unit test, I think it's time to bite the bullet.

Fixes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/49
2022-10-27 15:10:30 -04:00
Matt Turner
ca7bb8894e build: Add a64-neon-test.S to EXTRA_DIST
Fixes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/66
2022-10-27 14:36:54 -04:00
Simon Ser
1a0d50ce70 meson: explicitly set C standard to gnu99
This explicitly indicates that GNU extensions (like asm) are used.
This fixes build errors when Pixman is used as a Meson subproject.

Signed-off-by: Simon Ser <contact@emersion.fr>
2022-10-27 18:21:37 +00:00
Simon Ser
0cf92877a9 meson: override pixman-1 dependency
This eases usage as a Meson subproject.

Signed-off-by: Simon Ser <contact@emersion.fr>
2022-10-27 18:17:26 +00:00
Thomas Klausner
4ee322c4e2 Makefile.am: increase shell portability
Use standard test(1) instead of bash's '[['.

Signed-off-by: Thomas Klausner <wiz@gatalith.at>
2022-10-18 17:48:49 +02:00
Thomas Klausner
b5b3243792 configure.ac: avoid unportable test(1) operator
"==" is only supported by bash, POSIX mandates "="

Signed-off-by: Thomas Klausner <wiz@gatalith.at>
2022-10-18 17:48:24 +02:00
Simon Ser
7df9e162c6 Post-release version bump to 0.42.1
Signed-off-by: Simon Ser <contact@emersion.fr>
2022-10-18 11:01:24 +02:00
Simon Ser
8d6d7f44f4 Pre-release version bump to 0.42.0
Signed-off-by: Simon Ser <contact@emersion.fr>
2022-10-18 09:44:04 +02:00
Benjamin Gilbert
421fc252ab meson: Add feature to disable compiler TLS support
When compiling with MinGW, use of the __thread attribute causes pixman
to gain a dependency on the winpthread DLL.  With Autotools, this could
be avoided by configuring with ac_cv_tls=none, causing pixman to fall
back to TlsSetValue() instead.

Add a Meson 'tls' option that can be 'disabled' to skip support for TLS
compiler attributes, or 'enabled' to require a working TLS attribute.
2022-10-18 01:02:43 +00:00
Alan Coopersmith
7989483929 configure.ac: allow x64 libraries on Solaris to run on non-SSSE3 machines
Override the x64 hardware capability autodetection by Solaris Studio
compilers for x64 libraries the same way we do for x86 libraries.

Also fix configure test for this override to work in out-of-tree builds.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2022-10-13 20:58:57 +00:00
Jocelyn Falempe
b4a105d772 Fix inverted colors on big endian system
bits_image_fetch_separable_convolution_affine() didn't take care
of big endian system

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
2022-06-29 11:00:04 +02:00
Alan Coopersmith
285b9a907c configure: replace bugzilla URL with gitlab issues
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2022-02-19 13:37:54 -08:00
Nirbheek Chauhan
adc07d4618 meson: Fix usage of pkgconfig.generate()
The library that the pkgconfig file is for should be the first
positional argument. The `libraries:` kwarg is for libraries that the
user must also link against, and which meson does not know about (and
hence cannot automatically add to the `Libs:` or `Requires:` section
in the .pc file).

Fixes:
```
subprojects/pixman/meson.build:564: DEPRECATION: Library pixman-1 was
passed to the "libraries" keyword argument of a previous call to
generate() method instead of first positional argument. Adding
pixman-1 to "Requires" field, but this is a deprecated behaviour that
will change in a future version of Meson. Please report the issue if
this warning cannot be avoided in your case.
```
2022-01-22 13:25:57 +05:30
Nirbheek Chauhan
3563dfe436 meson: Fix warning about extract_all_objects usage
We use this because of a meson bug that was fixed in 0.52:

https://mesonbuild.com/Release-notes-for-0-52-0.html#improved-support-for-static-libraries

Bump the requirement and remove the extract_all_objects workaround.
This gets rid of a meson warning:

WARNING: extract_all_objects called without setting recursive
keyword argument. Meson currently defaults to
non-recursive to maintain backward compatibility but
the default will be changed in the future.
2022-01-21 09:07:53 +00:00
Manuel Stoeckl
c6e1af995e demos: port to Gtk3
GTK2 has reached end of life, and GTK3 has been available for
almost a decade.

Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
2022-01-12 23:19:39 -05:00
Mizuki Asakura
eadb82866b added aarch64 bilinear implementations (ver.4.1)
Since aarch64 has different neon syntax from aarch32 and has no
support for (older) arm-simd,
there are no SIMD accelerations for pixman on aarch64.

We need new implementations.

This patch also contains Ben Avison's series of patches for aarch32
and now the benchmark results are fine on aarch64.

Please find the result at the below ticket.

Added: https://bugs.freedesktop.org/show_bug.cgi?id=94758
Signed-off-by: Mizuki Asakura <ed6e117f@gmail.com>
2021-09-17 17:03:02 +00:00
Simon Ser
36001032b7 Constify region APIs
This allows callers to pass around const Pixman region in their
APIs, improving type safety and documentation.

Signed-off-by: Simon Ser <contact@emersion.fr>
2021-09-17 16:22:51 +00:00
Nirbheek Chauhan
bd4e7a9b9e tests: Fix undefined symbol build error on macOS
prng_state and prng_state_data are getting classified as a "Common
symbol" by the compiler due to the convoluted way in which it is
`#include`-ed in various test sources, and that's not read as a valid
symbol by the linker later.

Initializing the symbol clarifies it to the compiler that this
specific declaration is the canonical location for this variable, and
that it's not a "Common symbol".

Fixes https://gitlab.freedesktop.org/pixman/pixman/-/issues/42
2021-09-17 16:08:04 +00:00
Alex Richardson
e0d4403e78 Fix -Wincompatible-function-pointer-types warning
Adding const to the return type does nothing and means that the function
pointer types do not match exactly:

error: incompatible function pointer types passing 'const float (int, int)' to parameter of type 'dither_factor_t' (aka 'float (*)(int, int)')
2021-09-17 16:03:48 +00:00
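The mismatch described above can be reproduced in a few lines. This is only a sketch: `dither_factor_t` is the typedef quoted in the diagnostic, while `example_factor` and `apply_factor` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type as quoted in the compiler diagnostic. */
typedef float (*dither_factor_t) (int, int);

/* Hypothetical factor function.  Declaring it as `static const float`
 * would give it the type 'const float (int, int)', which no longer
 * matches dither_factor_t exactly; the qualifier on a returned value
 * has no other effect, so it is simply left out. */
static float
example_factor (int x, int y)
{
    return (float) ((x + y) & 1) * 0.5f;
}

/* Hypothetical consumer that takes the callback. */
static float
apply_factor (dither_factor_t factor, int x, int y)
{
    assert (factor != NULL);
    return factor (x, y);
}
```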
Manuel Stoeckl
5f5e752f15 Fix masked pixel fetching with wide format
In __bits_image_fetch_affine_no_alpha and __bits_image_fetch_general,
when `wide` is true, the mask is actually an array of argb_t instead
of the array of uint32_t it was cast to, and the access to `mask[i]`
does not correctly detect when the pixel is nontrivial. The code now
uses a check appropriate for argb_t when `wide` is true.

One caveat: this new check only skips entries when the mask pixel data
is binary all zero; this misses cases like `-0.f` which would be caught
by the FLOAT_IS_ZERO macro. As the mask check only appears to be a
performance optimization to avoid loading inconsequential pixels, it
erring on the side of loading more pixels is safe.

Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
2021-08-09 21:43:58 -04:00
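The caveat about `-0.f` can be illustrated with a small sketch. It assumes that `argb_t` is a struct of four floats and that floats are IEEE 754; `argb_bits_all_zero` is a hypothetical helper, not the check used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the wide pixel type (assumed layout: four floats). */
typedef struct { float a, r, g, b; } argb_t;

/* Returns 1 only when every bit of the pixel is zero.  A channel
 * holding -0.0f compares equal to zero as a float but has its sign
 * bit set, so such a pixel is not skipped by this test. */
static int
argb_bits_all_zero (const argb_t *p)
{
    static const argb_t zero = { 0.0f, 0.0f, 0.0f, 0.0f };

    assert (p != NULL);
    return memcmp (p, &zero, sizeof (argb_t)) == 0;
}
```

Treating a negative-zero pixel as non-trivial only costs an extra fetch, which matches the reasoning in the message that erring toward loading more pixels is safe.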
Heiko Lewin
aaf59b0338 Fix signed-unsigned semantics in reduce_32 2021-07-21 14:50:52 +00:00
pkubaj
4251202d9d Fix AltiVec detection on FreeBSD. 2021-05-07 15:58:56 +00:00
Jonathan Kew
e93eaff517 Avoid out-of-bounds read when accessing individual bytes from mask.
The important changes here are a handful of places where we replace

            memcpy(&m, mask++, sizeof(uint32_t));

or similar code with

            uint8_t m = *mask++;

because we're only supposed to be reading a single byte from *mask,
and accessing a 32-bit value may read out of bounds (besides that
it reads values we don't actually want; whether this matters would
depend exactly how the value in m is subsequently used).

I've also changed a bunch of other places to use this same pattern
(a local 8-bit variable) when reading individual bytes from the mask;
the code was inconsistent about this, sometimes casting the byte to
a uint32_t instead. This makes no actual difference, it just seemed
better to use a consistent pattern throughout the file.
2021-05-07 09:37:28 -04:00
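A minimal sketch of the single-byte pattern, assuming an a8 mask stored as one byte per pixel; `sum_a8_mask` is a hypothetical function used only to show the access pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walks a mask of `width` bytes, reading exactly one byte per pixel.
 * A 32-bit load starting at the last byte would touch three bytes
 * past the end of the buffer. */
static uint32_t
sum_a8_mask (const uint8_t *mask, size_t width)
{
    uint32_t total = 0;

    assert (mask != NULL || width == 0);
    while (width--)
    {
        uint8_t m = *mask++;   /* local 8-bit variable, as in the patch */
        total += m;
    }
    return total;
}
```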
Timo Aaltonen
52a3693957 release to sid 2020-12-03 15:38:38 +02:00
Timo Aaltonen
16f9268369 symbols: Updated, bump shlibs 2020-12-03 15:25:18 +02:00
Timo Aaltonen
ad3904afb6 control, rules: Migrate to debhelper-compat, bump to 13. 2020-12-03 15:19:07 +02:00
Timo Aaltonen
8b58485eb3 bump the version 2020-12-03 15:15:54 +02:00
Timo Aaltonen
4772386a28 Merge branch 'upstream-unstable' into debian-unstable 2020-12-03 15:14:50 +02:00
Érico Rolim
d93ec57138 meson: update option descriptions.
- gtk is only used in demos
- libpng is only used in tests
- openmp is only used in tests (in the standard build)
2020-10-22 20:43:26 -03:00
Dylan Baker
9b49f4e087 meson: remove pixman dependency
AFAICT from the git history, what happened is that the gtk demos rely on
gtk being built with pixman support. pkg-config isn't really expressive
enough to have that information, so the solution that was come up with
was to search for pixman as well as gtk+ and hope that pixman being
installed was.

This isn't actually used anywhere in the meson build anyway, and it's
causing problems for projects that want to use pixman as a supproject

(there's a port of cairo underway that's hitting this), because it
confuses meson.
2020-06-18 14:21:09 -07:00
Tim-Philipp Müller
606f5c15b0 meson: add option to skip building of tests and demos
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2020-06-02 02:30:39 +00:00
Tim-Philipp Müller
15e0668616 meson: add cpu-features-path option for Android
Add option to include cpu-features.[ch] from a given path
into the build for platforms that don't provide this out
of the box. This is needed on Android.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2020-06-02 01:15:33 +01:00
Tim-Philipp Müller
0ba6cbe1ac Update README a little
- bugzilla -> gitlab
- convert links to https
- suggest issues and patches be filed via gitlab
2020-05-30 11:34:26 +01:00
Tom Stellard
c2fe1568ff Add -ftrapping-math to default cflags
This should resolve https://gitlab.freedesktop.org/pixman/pixman/-/issues/22
and make the tests pass with clang.

-ftrapping-math is already the default[1] for gcc, so this should not change
behavior when compiling with gcc.  However, clang defaults[2] to -fno-trapping-math,
so -ftrapping-math is needed to avoid floating-point exceptions when running the
combiner and stress tests.

The root cause of this issue is that pixman-combine-float.c guards floating-point
division operations with a FLOAT_IS_ZERO check e.g.

if (FLOAT_IS_ZERO (sa))
	f = 1.0f;
else
	f = CLAMP (da / sa);

With -fno-trapping-math, the compiler assumes that division will never trap, so it may
re-order the division and the guard and execute the division first.  In most cases,
this would not be an issue, because floating-point exceptions are ignored.  However,
these tests call enable_divbyzero_exceptions() which causes the SIGFPE signal to
be sent to the program when a divide by zero exception is raised.

[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
[2] https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior
2020-05-11 22:33:49 +00:00
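The guarded-division pattern can be written out as a self-contained sketch. The `FLOAT_IS_ZERO` definition below is an assumption made for this example, and `clamp01` and `guarded_ratio` are hypothetical names.

```c
#include <assert.h>
#include <float.h>

/* Assumed definition for this sketch: true for values closer to zero
 * than the smallest normal float. */
#define FLOAT_IS_ZERO(f) (-FLT_MIN < (f) && (f) < FLT_MIN)

/* Hypothetical clamp to [0, 1]. */
static float
clamp01 (float f)
{
    return f < 0.0f ? 0.0f : (f > 1.0f ? 1.0f : f);
}

/* The division must only run on the non-zero path.  With
 * -fno-trapping-math a compiler may evaluate da / sa before the guard,
 * which is harmless unless divide-by-zero traps are enabled. */
static float
guarded_ratio (float da, float sa)
{
    if (FLOAT_IS_ZERO (sa))
        return 1.0f;

    assert (!FLOAT_IS_ZERO (sa));
    return clamp01 (da / sa);
}
```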
Michael Forney
3b1fefda7f Prevent empty top-level declaration
The expansion of PIXMAN_DEFINE_THREAD_LOCAL(...) may end in a
function definition, so the following semicolon is considered an
empty top-level declaration, which is not allowed in ISO C.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2020-04-26 13:46:43 -07:00
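The problem can be shown with a hypothetical macro of the same shape; `DEFINE_GETTER` and `get_answer` are invented for this sketch and are not part of pixman.

```c
/* Hypothetical macro whose expansion ends in a function definition. */
#define DEFINE_GETTER(name, value) \
    static int name (void) { return (value); }

/* Written as `DEFINE_GETTER (get_answer, 42);` the trailing semicolon
 * would stand alone at file scope, which strict ISO C does not allow
 * (gcc -pedantic warns about it).  Omitting it avoids the empty
 * declaration. */
DEFINE_GETTER (get_answer, 42)
```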
Matt Turner
10a057e27f Post-release version bump to 0.40.1
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 15:01:30 -07:00
Matt Turner
244383bf9f Pre-release version bump to 0.40.0
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 14:52:22 -07:00
Matt Turner
405f26068c Move from MD5/SHA1 to SHA256/SHA512 digests
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 14:52:22 -07:00
Matt Turner
88b167d18c Build xz tarballs instead of bzip2
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 14:49:46 -07:00
Matt Turner
54a13221ee Distribute the blue-noise files
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 14:46:56 -07:00
Ghabry
eb0c3d26ed Enabled armv6 SIMD for 3DS (devkitARM) and arm neon SIMD for PS Vita (vitasdk) and Switch (devkitA64) 2020-04-14 00:08:57 +00:00
Matt Turner
9976d2c099 loongson: Avoid C90 mixing-code-and-decls warning 2020-04-07 15:18:09 -07:00
Shiyou Yin
5330640025 configure.ac: use '-mloongson-mmi' for Loongson MMI
It's recommended to use '-mloongson-mmi' for MMI.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2020-04-07 15:18:03 -07:00
Adam Jackson
348e99b52f fast-path: Fix some sketchy pointer arithmetic
We want a uint8_t * at the end of this math, because that's what the
function we're about to pass it to takes. But ->bits is a uint32_t *, so
if we just do the math in units of that we can avoid the explicit factor
of four which would risk an integer overflow.

Fixes: pixman/pixman#14
2020-04-02 14:58:52 +00:00
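A minimal sketch of the idea, assuming a stride counted in `uint32_t` units; `row_start_bytes` is a hypothetical helper and not the code touched by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index in units of uint32_t and convert the pointer once at the end.
 * This avoids the explicit `* 4` on an int byte offset, which is where
 * the overflow risk came from. */
static uint8_t *
row_start_bytes (uint32_t *bits, int stride_words, int y)
{
    assert (bits != NULL);
    return (uint8_t *) (bits + (ptrdiff_t) y * stride_words);
}
```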
Matt Turner
ba5d794515 lowlevel-blt-bench: Remove unused variable
Closes: https://gitlab.freedesktop.org/pixman/pixman/issues/7
2020-03-20 12:42:45 -07:00
Federico Mena Quintero
6fe0131394 Initialize temporary buffers in general_composite_rect()
Otherwise, Valgrind shows things like "conditional jump or move
depends on uninitialised values" errors much later in calling code.
For example, see https://gitlab.gnome.org/GNOME/librsvg/issues/572

Fixes https://gitlab.freedesktop.org/pixman/pixman/issues/9
2020-03-18 18:52:16 -06:00
Antonio Ospite
3344f507dd pixman-compiler.h: fix building tests with MinGW
MinGW supports __declspec(dllexport) but the current logic that sets
PIXMAN_EXPORT only uses it when building with MSVC, leaving some symbols
hidden when building with MinGW.

This results in an error when trying to link the tests:

-----------------------------------------------------------------------
FAILED: subprojects/pixman/test/combiner-test.exe
x86_64-w64-mingw32-gcc  -o subprojects/pixman/test/combiner-test.exe 'subprojects/pixman/test/f48fa9c@@combiner-test@exe/combiner-test.c.obj' -Wl,--allow-shlib-undefined -Wl,--start-group subprojects/pixman/test/libtestutils.a subprojects/pixman/pixman/libpixman-1.dll.a -pthread -fopenmp -fopenmp -lm -mconsole -lkernel32 -luser32 -lgdi32 -lwinspool -lshell32 -lole32 -loleaut32 -luuid -lcomdlg32 -ladvapi32 -Wl,--end-group
/usr/bin/x86_64-w64-mingw32-ld: subprojects/pixman/test/f48fa9c@@combiner-test@exe/combiner-test.c.obj: in function `main':
.../build/../subprojects/pixman/test/combiner-test.c:124: undefined reference to `_pixman_internal_only_get_implementation'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
-----------------------------------------------------------------------

By using PIXMAN_API also when building with MinGW, the tests can link
successfully and the build succeeds.

Tested with x86_64-w64-mingw32-gcc (GCC) 8.3-win32 20191201.
2020-03-15 00:19:56 +01:00
Yin Shiyou
127d9525d6 pixman-combine: Fix wrong value of RB_MASK_PLUS_ONE.
No functional change, as explained by Søren in
https://lists.freedesktop.org/archives/pixman/2020-February/004902.html
2020-02-20 09:55:17 -08:00
Mathieu Duponchelle
e8321503c6 meson: add missing function check (getisax)
.. and add gettimeofday to the list of funcs to check instead
of having a separate check for it.
2020-01-30 23:31:35 +01:00
Mathieu Duponchelle
8992d5b4fc meson: finish porting over mmx and ssse2 flags for sun and msvc
Those flags are set by the configure.ac script
2020-01-30 23:29:20 +01:00
Khem Raj
364760cd3d test/utils: Check for FE_INVALID definition before use
Some architectures e.g. nios2 do not support all exceptions.
2019-12-19 23:34:38 +00:00
Chun-wei Fan
7331d2b4e3 thread-test.c: Use Windows Threading API on Windows
...When we don't have a pthreads implementation available, which is
normally the case on Windows.  This attempts to make it easier for people
on Windows to verify whether their builds of Pixman (and Cairo component,
if applicable) are thread-safe.  Also, make the number of threads
a #define, so if we need to change it at some point, it's easier.

This re-enables the thread-test program on Windows in Meson builds.
2019-11-19 05:50:28 +08:00
Chun-wei Fan
1dd3bc0a35 demos: Define _USE_MATH_DEFINES on MSVC-style compilers
This is required for the use of M_PI.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
3bceb3a9d3 test/solid-test.c: Include stdint.h
We need that to make sure we have UINT16_MAX.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
c608e9663e pixman/meson.build: Define PIXMAN_API on MSVC-style compilers
This will make the public APIs exported from the DLL, so that we have an
import library that we can use.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
9d8dd17ada pixman-[compiler|private].h: Export symbols for tests
Define the existing PIXMAN_EXPORT to be PIXMAN_API, which can be overridden
to be __declspec(dllexport) during the build of the pixman DLL on MSVC
builds, which will be in the next patch.

Also, export more private symbols as they are needed for the test
programs.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
21d8ded566 pixman/pixman.h: Mark public APIs with PIXMAN_API
We can override PIXMAN_API with a CFLAG or config.h define to export
the symbols with compiler directives, if needed.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
b7eea54028 pixman/pixman-version.h.in: Add a PIXMAN_API macro
This prepares to mark the public APIs that we have in pixman.h so that
we can use compiler directives such as __declspec(dllexport) to export
those symbols.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
06a3f6e60b meson.build: Improve libpng search on MSVC
The build system for libpng for MSVC does not generate a pkg-config file
for us, and CMake support in Meson does not work very well.  So, look
for libpng manually on MSVC builds if dependency discovery did not work
out via pkg-config or the CMake config files.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
7661b1fae9 build: Don't assume PThreads if threading support is found
Look also for pthread.h if threading support is found by Meson, as the
underlying threading support may not be PThreads, depending on platform.

For now, disable the thread-test test program if pthread.h and if
necessary, the PThreads library, cannot be found, as the current
implementation assumes the use of PThreads.

Also bump the required Meson version to 0.50.0 since we need it for
-cc.get_argument_syntax()
-For a later commit, the has_headers sub-method for cc.find_library()
2019-11-19 05:49:35 +08:00
Chun-wei Fan
e9db26898b meson.build: Disable OpenMP on MSVC builds
The implementation of OpenMP is not compliant for our uses, so disable
it for now by just not checking for it on MSVC builds, as we implicitly
add an /openmp switch to the build, which will cause linking the tests
programs to fail, as the OpenMP implementation is not enough.
2019-11-19 05:49:34 +08:00
Chun-wei Fan
f251c12f8a meson.build: Fix MMX, SSE2 and SSSE3 checks on MSVC
-For MSVC builds, do not use the GCC-specific CFlags when checking for
 these features.

-For the MMX check, assume that we have good enough MMX intrinsics and
 inline assembly support (on ix86), since MSVC provides sufficient
 support for those since before the times of MSVC 2008, and 2008 is the
 oldest version that we can support, as with the pre-C99 GTK+ stack.

Unfortunately due to x64 compiler issues, pre-Visual Studio 2010 will
crash when building SSSE3 code, so we do not enable building SSSE3 code
on pre-2010 Visual Studio.

Also, for all x64 Visual Studio builds, we do not enable USE_X86_MMX
as inline assembly is not allowed for x64 Visual Studio builds, and
instead use the compatibility intrinsics that we already have in the
code.
2019-11-18 16:19:36 +08:00
Adam Jackson
32a55aa8ac pixman-sse2: Fix undefined unaligned loads 2019-11-13 20:00:20 +00:00
Adam Jackson
47bec681d9 pixman-mmx: Fix undefined unaligned loads 2019-11-13 20:00:20 +00:00
Adam Jackson
baed75faa9 pixman-mmx: Fix undefined left-shifts 2019-11-13 20:00:20 +00:00
Adam Jackson
85acb0a933 test: Fix unrepresentable subtraction in stress-test
Does not make the test pass, but does fix this error:

../test/stress-test.c:538:25: runtime error: signed integer overflow: 2147483647 - -2 cannot be represented in type 'int'
2019-11-01 14:36:54 -04:00
Adam Jackson
1f5b20c4aa pixman-matrix: Fix left shift of a negative number
../pixman/pixman-matrix.c:276:35: runtime error: left shift of negative value -32768
2019-11-01 14:36:54 -04:00
Adam Jackson
bcfb3490db pixman-bits-image: Fix left shift of a negative number
../pixman/pixman-bits-image.c:678:33: runtime error: left shift of negative value -32768
2019-11-01 14:36:52 -04:00
Adam Jackson
fef82109eb pixman-bits-image: Fix various undefined left shifts
../pixman/pixman-bits-image.c:221:20: runtime error: left shift of 204 by 24 places cannot be represented in type 'int'
2019-10-15 16:35:25 -04:00
Adam Jackson
7d6b71b315 pixman-fast-path: Fix various undefined left shifts
../pixman/pixman-fast-path.c:3089:23: runtime error: left shift of 154 by 24 places cannot be represented in type 'int'
2019-10-15 16:34:56 -04:00
Adam Jackson
880f48b2b4 pixman-sse2: Fix an undefined left shift
../pixman/pixman-sse2.c:3346:14: runtime error: left shift of 41891 by 16 places cannot be represented in type 'int'
2019-10-15 16:33:46 -04:00
Adam Jackson
4897ad0a3f pixman-gradient-walker: Fix undefined left shift
../pixman/pixman-gradient-walker.c:216:35: runtime error: left shift of 163 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:45 -04:00
Adam Jackson
7eb9c8c004 pixman-image: Fix undefined left shift
../pixman/pixman-image.c:963:46: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:45 -04:00
Adam Jackson
81c87543d1 pixman-combine: Fix various undefined left shifts
../pixman/pixman-combine32.c:657:1: runtime error: left shift of 128 by 24 places cannot be represented in type 'int'
../pixman/pixman-combine32.c:694:1: runtime error: left shift of 232 by 24 places cannot be represented in type 'int'
../pixman/pixman-combine32.c:712:1: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
../pixman/pixman-combine32.c:786:1: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
../pixman/pixman-combine32.c:805:1: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:45 -04:00
Adam Jackson
6d0a930b14 pixman-access: Fix various undefined left shifts
../pixman/pixman-access.c:389:2: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
../pixman/pixman-access.c:1101:2: runtime error: left shift of 2 by 30 places cannot be represented in type 'int'
../pixman/pixman-access.c:1152:2: runtime error: left shift of 2 by 30 places cannot be represented in type 'int'
2019-10-15 16:31:43 -04:00
Adam Jackson
a09bcc062f pixman: Fix undefined left shift in pixel_contract_from_float
../pixman/pixman-utils.c:216:14: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:40 -04:00
Adam Jackson
f6040f56da test: Fix undefined left shift in pixel_checker_init
../test/utils.c:2070:57: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:38 -04:00
Adam Jackson
52c27c82de test: Fix undefined left shift in affine-test
../test/affine-test.c:174:34: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
2019-10-15 16:31:33 -04:00
Jonathan Kew
d60b0af5e3 Avoid undefined behavior (left-shifting negative value) in pixman_int_to_fixed
Reported in https://bugzilla.mozilla.org/show_bug.cgi?id=1580352. Casting the argument to uint32_t should avoid invoking undefined behavior here. We'll still have *implementation-defined* behavior when casting the result back to pixman_fixed_t, but that's better than *undefined*.
2019-09-11 12:07:46 +00:00
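The cast described above can be sketched as a function. The names `fixed_16_16_t` and `int_to_fixed` are hypothetical, and the 16.16 layout in a 32-bit signed integer is an assumption of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16.16 fixed-point type for this sketch. */
typedef int32_t fixed_16_16_t;

/* Shift in unsigned arithmetic, where left shifts are always defined,
 * then convert back.  The final conversion to a signed type is
 * implementation-defined for large values, not undefined. */
static fixed_16_16_t
int_to_fixed (int i)
{
    assert (i >= -32768 && i <= 32767);
    return (fixed_16_16_t) ((uint32_t) i << 16);
}
```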
Dylan Baker
afc6c935f1 meson: don't use link_with for library()
Meson doesn't do the expected thing when library() creates a static
library. Instead of combining the libraries together into a single
archive it effectively discards them, resulting in missing symbols.

To work around this we manually unpack the archives and shove the .o
files into the final library. This doesn't affect the shared library at
all, but makes the static library have the necessary symbols

Fixes #33
2019-09-09 16:06:18 -07:00
Jonathan Kew
c558647fdf Explicitly cast byte to uint32_t before left-shifting.
To avoid potential signed integer overflow (undefined behavior), as implicit integer promotion means the operand becomes a (signed) int.

(Issue originally reported at https://bugzilla.mozilla.org/show_bug.cgi?id=1577669)
2019-08-30 10:42:45 +00:00
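A short sketch of why the cast matters, assuming a 32-bit `int`; `pack_a8r8g8b8` is a hypothetical helper. Without the casts, each `uint8_t` operand is promoted to signed `int`, and shifting a value of 128 or more left by 24 places overflows it.

```c
#include <stdint.h>

/* Widen each byte to uint32_t before shifting so the shift happens in
 * unsigned arithmetic. */
static uint32_t
pack_a8r8g8b8 (uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t) a << 24) |
           ((uint32_t) r << 16) |
           ((uint32_t) g << 8)  |
            (uint32_t) b;
}
```

The same pattern applies to the sanitizer reports of the form "left shift of 255 by 24 places cannot be represented in type 'int'".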
Christoph Reiter
fd5c0da579 meson: fix TLS support under mingw
GCC on Windows complains that "__declspec(thread)" doesn't work, but still
compiles it, so the meson check doesn't work. The warning printed by gcc:
"warning: 'thread' attribute directive ignored [-Wattributes]"

Pass -Werror=attributes to make the check fail instead.

This fixes the test suite (minus gtk tests) on Windows with mingw.
2019-06-10 16:42:59 +00:00
Christoph Reiter
4851d4e20f meson: allow building a static library
So that passing "-Ddefault_library=both" also creates a static lib.

Note that Libs.private in the .pc file will still be wrong because of
https://github.com/mesonbuild/meson/issues/3934 (it contains things like
-lpixman-mmx)
2019-06-10 16:38:39 +00:00
Christoph Reiter
be0d3e6994 meson: define SIZEOF_LONG and use -Wundef
meson builds defaulted to SIZEOF_LONG=0 in various places
2019-06-10 16:34:06 +00:00
Basile Clement
0ee0ad23de Don't use GNU extension for binary numbers
The dithering code (specifically `dither_factor_bayer_8`) uses a GNU
extension for binary notation, eg 0b001.  This is not supported by MSVC
(at least) and breaks the build on this platform [1].

This patch uses hexadecimal notation instead, fixing the build.

[1]: https://lists.freedesktop.org/archives/pixman/2019-June/004883.html

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-10 09:32:12 -07:00
Basile Clement
cb2ec4268f Ordered dithering with blue noise, v2
On some screens (typically low quality laptop screens), using Bayer
ordered dithering has been observed to cause color changes depending on
*where the gradient is rendered on the screen*, causing visible
flickering when moving an image on the screen.

To alleviate the issue, this patch adds support for ordered dithering
using a 64x64 matrix tuned from blue noise.  In addition to being devoid
of the positional dependency on screen, the blue noise matrix also
generates more pleasing and less discernable patterns.  As such, it is
now the method used for PIXMAN_DITHER_GOOD and PIXMAN_DITHER_BEST
dithering methods.

The 64x64 blue noise matrix has been generated using the provided
`pixman/dither/make-blue-noise.c` script, which uses the
void-and-cluster method.

Changes since v1 (thanks Bill):
 - Use uint16_t for the blue noise matrix for lower memory usage
 - Use bitwise computation for array index
2019-05-25 07:30:19 -07:00
Basile Clement
98b5ec74ca demos: Add a dithering demo
This adds a dither.c which provides a demo of the dithering feature.
This is based on the scale.c demo for scaling and provides a selection
of intermediate formats and dithering operators (currently, only
PIXMAN_DITHER_ORDERED_BAYER_8) to use.  Images are first blitted onto a
surface of the intermediate format with the requested dither setup, then
blitted back onto an a8r8g8b8 surface for display.
2019-05-25 07:30:11 -07:00
Basile Clement
37d2e681b3 test: Check the dithering path in tolerance-test
This adds support for testing dithered destinations in tolerance-test.
When dithering is enabled, the pixel checker allows for an additional
quantization error.
2019-05-25 07:30:02 -07:00
Basile Clement
ddcc41b999 Implement basic dithering for the wide pipeline, v3
This patch implements dithering in pixman.  A "dither" property is added
to BITS images, which is used to:

 - Force rendering to the image to go through the floating point
   pipeline.  Note that this is different from FAST_PATH_NARROW_FORMAT
   as it should not enable the floating point pipeline when reading from
   the image.

 - Enable dithering in dest_write_back_wide.  The dithering uses the
   destination format to determine noise amplitude.

This does not change pixman's behavior when dithering is disabled (the
default).

Additional types and functions are added to the public API:

 - The `pixman_dither_t` enum exposes the available dithering methods.
   Currently a single dithering method based on 8x8 Bayer matrices is
   implemented (PIXMAN_DITHER_ORDERED_BAYER_8).  The PIXMAN_DITHER_FAST,
   PIXMAN_DITHER_GOOD and PIXMAN_DITHER_BEST aliases are provided and
   should be used to benefit from future specializations.

 - The `pixman_image_set_dither` function sets the dithering
   method to use when rendering to a bits image.

 - The `pixman_image_set_dither_offset` function sets the
   vertical and horizontal offsets for the dither matrix.  This can be
   used after scrolling to ensure a consistent spatial positioning of
   the dither matrix.

Changes since previous version (v2):
 - linear_gradient_is_horizontal optimization is still compatible with
   the wide pipeline.  The code disabling it was a remnant of a previous
   patch which performed dithering directly inside linear_get_scanline,
   and thus needed to be called independently for each scanline.

Changes since v1:
 - Renamed PIXMAN_DITHER_BAYER_8 to PIXMAN_DITHER_ORDERED_BAYER_8
 - Disable dithering for channels with 32bpp or more (since they can
   represent exactly the wide values already).  This makes the patches
   compatible with the newly added floating point format.

Dithering is compatible with linear_gradient_is_horizontal
2019-05-25 07:29:55 -07:00
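For readers unfamiliar with ordered dithering, the following sketch shows how an 8x8 Bayer threshold can be derived and applied. It is not pixman's implementation: `bayer2`, `bayer8_index` and `dither_to_1bit` are hypothetical names, and the 1-bit quantizer is a deliberately simplified stand-in for dithering to a real destination format.

```c
#include <assert.h>

/* 2x2 Bayer cell for single-bit coordinates: [[0, 2], [3, 1]]. */
static unsigned
bayer2 (unsigned xb, unsigned yb)
{
    return ((xb ^ yb) << 1) | yb;
}

/* Rank (0..63) of position (x, y) in the standard 8x8 Bayer matrix,
 * built recursively from the 2x2 cell.  The lowest coordinate bit
 * carries the highest weight. */
static unsigned
bayer8_index (unsigned x, unsigned y)
{
    unsigned v = 0;
    int bit;

    for (bit = 0; bit < 3; bit++)
    {
        unsigned xb = (x >> bit) & 0x1;
        unsigned yb = (y >> bit) & 0x1;
        v = (v << 2) | bayer2 (xb, yb);
    }
    return v;
}

/* Quantize a value in [0, 1] to a single bit using a position-dependent
 * threshold centred inside each of the 64 ranks. */
static int
dither_to_1bit (float value, unsigned x, unsigned y)
{
    float threshold = ((float) bayer8_index (x, y) + 0.5f) / 64.0f;

    assert (value >= 0.0f && value <= 1.0f);
    return value > threshold;
}
```

Because the threshold depends only on the pixel position modulo 8, shifting the matrix origin changes which pixels are rounded up, which is the situation the dither offset function is meant to address after scrolling.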
Fan Jinke
85bfa8b4f9 add Hygon Dhyana support to enable X86_MMX_EXTENSIONS feature
Signed-off-by: Fan Jinke <fanjinke@hygon.cn>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-05-02 16:07:19 -07:00
Basile Clement
8256c235d9 Fix bilinear filter computation in wide pipeline
The recently introduced wide pipeline for filters has a typo which
causes it to improperly compute bilinear interpolation positions,
causing various glitches when enabled.

This patch uses the proper computation for bilinear interpolation in the
wide pipeline.  It also makes related `if` statements conformant to the
CODING_STYLE:

* If a substatement spans multiple lines, then there must be braces
  around it.

* If one substatement of an if statement has braces, then the other
  must too.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-04-11 10:59:00 +02:00
Matt Turner
e21ebfb13f Post-release version bump to 0.38.5
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-10 10:25:18 -07:00
Matt Turner
e8df10eea9 Pre-release version bump to 0.38.4
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-10 10:17:47 -07:00
Matt Turner
23f036d461 Makefile.am: Ship Meson assembly test files in the tarball
These were forgotten in commit 0ea37df428 (meson: store ARM SIMD and
NEON tests as text files) and since autotools doesn't use them make
distcheck still succeeded.

Fixes #30

Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-10 10:10:47 -07:00
Matt Turner
e7058fe49d Makefile.am: Update download links
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-07 13:43:57 -07:00
Matt Turner
8888e752bf Post-release version bump to 0.38.3
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-07 13:34:44 -07:00
Matt Turner
a7ffb3e617 Pre-release version bump to 0.38.2
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-07 13:13:30 -07:00
Matt Turner
4c4753c407 meson: Correct copy-and-paste mistake
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-07 12:31:40 -07:00
Niveditha Rau
72959837ab void function should not return a value
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-03-27 15:14:05 -07:00
Simon Richter
ef4fb03248 Windows: Support building with SHELL=cmd.exe
When GNU Make is not from msys, the startup cost for sh.exe is massive
compared to cmd.exe.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-03-27 15:12:52 -07:00
Simon Richter
55d8f956c2 Windows: Show compiler invocation
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-03-27 15:12:52 -07:00
Dylan Baker
0ea37df428 meson: store ARM SIMD and NEON tests as text files
This is unfortunately required to make the tests work correctly, as
otherwise meson assumes that the files are C code not assembly. I've
opened https://github.com/mesonbuild/meson/issues/5151, to discuss
fixing the issue in meson upstream.

Fixes #29
2019-03-27 10:54:50 -07:00
Dylan Baker
2065a07e98 meson: simplify and fix mmx library compilation
This simplifies the logic and fixes the loongson-mmi implementation to
build correctly.
2019-03-27 10:54:50 -07:00
Dylan Baker
6e206cf7fc meson: Add proper include paths for the loongson check 2019-03-27 10:53:34 -07:00
Dylan Baker
9ed0576a73 meson: fix copy-n-paste error for arm simd assembly
mentioned in #29
2019-03-27 10:53:34 -07:00
Dylan Baker
d13f6a8b1d meson: fix typo which breaks loongson checks
mach -> march
2019-03-27 10:53:34 -07:00
Dylan Baker
e7ac62c3c7 meson: work around meson issue #5115
This issue causes openmp arguments to be injected into compilers that
can support openmp, even if they don't. This issue will be fixed in
0.51 (code already landed in mesonbuild#5116); for older versions let's
work around the issue.
2019-03-27 10:53:33 -07:00
Maarten Lankhorst
5d2cf8fc21 Bump version to 0.38.0
And update RELEASING for the new meson build system.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-02-11 13:27:25 +01:00
Maarten Lankhorst
6240ad15c6 pixman: Use maximum precision for pixman-bits-image, v2.
pixman-bits-image's wide helpers first obtain the 8-bits image,
then convert it to float. This destroys all the precision that
the wide path was offering.

Fix this by making get_pixel() take a pointer instead of returning
a value. Floating point will fill in a argb_t, while the 8-bits path
will fill a 32-bits ARGB value. This also requires writing a floating
point bilinear interpolator. With this change pixman can use the full
floating point precision internally in all paths.

Changes since v1:
- Make accum and reduce an argument to convolution functions,
  to remove duplication.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Basile Clement <basile-pixman@clement.pm>
2019-02-11 12:48:57 +01:00
Basile Clement
a32fc4faf9 Implement floating point gradient computation, v2.
This patch modifies the gradient walker to be able to generate floating
point values directly in addition to a8r8g8b8 32 bit values.  This is
then used by the various gradient implementations to render in floating
point when asked to do so, instead of rendering to a8r8g8b8 and then
expanding to floating point as they were doing previously.

Changes since v1 (mlankhorst):
- Implement pixman_gradient_walker_pixel_32 without calling
  pixman_gradient_walker_pixel_float, to prevent performance degradation.
  Suggested by Adam Jackson.
- Fix whitespace errors.
- Remove unnecessary function prototypes in pixman-private.h

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[mlankhorst: Add comment about pixman_contract_from_float,
             based on Basile's suggestion]
Acked-by: Basile Clement <basile-pixman@clement.pm>
2019-02-11 12:48:21 +01:00
Dylan Baker
b40d5495ec build: Add meson files to EXTRA_DIST
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-01-15 19:13:24 -08:00
Dylan Baker
16eacf19a3 editorconfig: use tabs for Makefiles
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-01-15 19:13:14 -08:00
Andreas Boll
60eec33554 Upload to unstable. 2018-12-12 22:02:53 +01:00
Andreas Boll
431e754de7 Bump standards version to 4.2.1. 2018-12-12 21:55:48 +01:00
Andreas Boll
8c5411a23b Bump debhelper compat to 11. 2018-12-12 21:55:28 +01:00
Andreas Boll
ea26aeb957 Set source format to 1.0. 2018-12-12 21:54:43 +01:00
Andreas Boll
da6e874f81 Use https URL in debian/copyright. 2018-12-12 21:53:40 +01:00
Andreas Boll
2d7f5e5831 Update Vcs-* URLs to point to salsa.debian.org. 2018-12-12 21:52:18 +01:00
Andreas Boll
c8e824af7b Update to my Debian address. 2018-12-12 21:51:21 +01:00
Andreas Boll
a26c93f936 Bump changelogs 2018-12-12 21:07:41 +01:00
Andreas Boll
51baef77fb Merge branch 'debian-unstable' into debian-unstable-new 2018-12-12 21:01:26 +01:00
Andreas Boll
0c08dcfc0c pixman 0.34.0 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWrh2cAAoJEGUdTbirWueAfxAH/1sf8P0SHY1y9KBKCw0enM4Y
 60sZYAgTgLa5prITcPeTb11bw877WAF73bAVjzL+6pNkT+Xs1ytvckwmbDoKDRZi
 zlptf0vPCnPX95Fh2X2PSO/1G0EErNWbqP5dUtLJ8L4sEaAj5TtDC9r9BouXpFaR
 qdipAmC1dVQNsbheBUinnfIjQ7H7i0NXXoUADFoP+X9V3WW95Hjkbwyoa4IUeYsY
 lPLVKfMRTZfQLksAAViDDpAhQxIrwMYQYApuMlbYXvX3tsW6zZCTeDfjqwRfxkdX
 Nnsz3lKBGvbS2ZJQBx2Xp9YC7+eu12IlxFA8cn3Exa96VngPJK5bR8Qn1ZJlUH8=
 =hex7
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.34.0' into debian-unstable-new

pixman 0.34.0 release
2018-12-12 21:01:18 +01:00
Maarten Lankhorst
146fa64351 Merge remote-tracking branch 'origin/master'
And bump meson version to 37.1 as well. Seems my push to upstream failed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2018-12-07 14:20:44 +01:00
Maarten Lankhorst
0202f0d89d Post release version bump to 37.1
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2018-12-07 13:44:38 +01:00
Dylan Baker
eb0dfaa0c6 gitlab-ci: Add meson build to pipeline test 2018-11-29 16:57:01 +00:00
Dylan Baker
199a3bd275 meson: Add a meson build system
This commit adds a meson build system for pixman. It carries the usual
improvements of meson, better clean build time, much better incremental
build times, while being simpler and easier to understand.

This takes advantage of some features from the most recent versions of
meson: the builtin openmp dependency and the feature option type.

There are a couple of things that I've done a bit differently than the
autotools build system: I've built a libdemos which is the utilities
from the demos folder, and I've linked the demos with libtestutils from
tests. Otherwise I expect that most things will be the same.

I've tested so far cross compiling from x86_64 -> x86, x86_64 ->
Aarch64, and Linux to Windows via mingw, as well as native x86_64 Linux
builds, which all work. I've also built with mingw natively; there are
some test failures there. An MSVC build can be generated, but fails.

v2: - set WORDS_BIGENDIAN in the config for big endian systems.
2018-11-29 16:57:01 +00:00
Dylan Baker
761f36c3c8 Add .editorconfig file
This sets the style for meson (which uses the upstream style, 2 space
indent with no tabs), and sets the tab_width to 8 per the CODING_STYLE
document.
2018-11-29 16:57:01 +00:00
Maarten Lankhorst
0313f35ab9 Bump version to 0.36.0
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2018-11-21 12:40:26 +01:00
Maarten Lankhorst
8a5d44c420 pixman: Update git repository to the one at gitlab.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2018-11-21 12:39:33 +01:00
Maarten Lankhorst
489fa0df11 pixman: Add tests for (a)rgb floating point formats.
Add some basic tests to ensure that the newly added formats work as
intended.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-11-06 14:25:49 +01:00
Maarten Lankhorst
a4b8a26d2b pixman: Add support for argb/xrgb float formats, v5.
Pixman is already using the floating point formats internally, expose
this capability in case someone wants to support higher bit per
component formats.

This is useful for igt which depends on cairo to do the rendering.
It can use it to convert floats internally to planar Y'CbCr formats,
or to F16.

We add a new type PIXMAN_TYPE_RGBA_FLOAT for this format, which is an
all float array of R, G, B, and A. Formats that use mixed float/int
RGBA aren't supported, and will probably need their own type.

Changes since v1:
- Use RGBA 128 bits and RGB 96 bits memory layouts, to better match the opengl format.
Changes since v2:
- Add asserts in accessor and for strides to force alignment.
- Move test changes to their own commit.
Changes since v3:
- Define 32bpc as PIXMAN_FORMAT_PACKED_C32
- Rename pixman accessors from rgb*_float_float to rgb*f_float
Changes since v4:
- Create a new PIXMAN_FORMAT_BYTE for fitting up to 64 bits per component.
  (based on Siarhei Siamashka's suggestion)
- Use new format type PIXMAN_TYPE_RGBA_FLOAT

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v4
[mlankhorst: Fix missing braces in PIXMAN_FORMAT_RESHIFT macro]
2018-11-06 14:24:05 +01:00
Siarhei Siamashka
018bf2f230 test: Fix stride calculation in stress-test
Currently the number of bits per pixel is used instead of the
number of bytes per pixel when calculating image strides. This
does not cause any real problems, but the gaps between scanlines
are excessively large.

This patch actually converts bits to bytes and rounds up the result
to the nearest byte boundary.

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: soren.sandmann@gmail.com
2018-07-06 14:44:22 -04:00
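The bits-versus-bytes mix-up described in the commit above can be sketched as follows. Both helper names are hypothetical and are not taken from the stress-test source; the sketch only shows the rounding the message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the stress-test code itself: bytes needed for one
 * scanline of `width` pixels at `bpp` bits per pixel, rounded up to a whole
 * byte. */
static int
stride_bytes (int width, int bpp)
{
    return (width * bpp + 7) / 8;
}

/* The pattern the commit message describes as the old behaviour: bits per
 * pixel used where bytes per pixel were meant, so every scanline is about
 * eight times longer than necessary. */
static int
stride_bits_as_bytes (int width, int bpp)
{
    return width * bpp;
}
```

For a 10-pixel-wide 32 bpp image, stride_bytes() gives 40 while the old pattern gives 320; for a 9-pixel-wide 1 bpp image the rounded-up result is 2 bytes.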
Vladimir Smirnov
bd2b49185b test: Adjust for clang's removal of __builtin_shuffle
__builtin_shuffle was removed in clang 5.0.

Build log says:
test/utils-prng.c:207:27: error: use of unknown builtin '__builtin_shuffle' [-Wimplicit-function-declaration]
            randdata.vb = __builtin_shuffle (randdata.vb, bswap_shufflemask);
                          ^
test/utils-prng.c:207:25: error: assigning to 'uint8x16' (vector of 16 'uint8_t' values) from incompatible type 'int'
            randdata.vb = __builtin_shuffle (randdata.vb, bswap_shufflemask);
                        ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated

Link to original discussion:
http://lists.llvm.org/pipermail/cfe-dev/2017-August/055140.html

It's possible to build pixman if the attached patch is applied. Basically
the patch adds a check for __builtin_shuffle support and, in case there is
none, falls back to the clang-specific __builtin_shufflevector, which does
the same but has a different API.

Bugzilla: https://bugs.gentoo.org/646360
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104886
Tested-by: Philip Chimento <philip.chimento@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-06-05 12:35:07 -04:00
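The fallback described in the commit above can be sketched roughly like this. The function name bswap32_lanes is hypothetical, and the feature test via __has_builtin is an assumption made for this sketch; the commit message only says that a check for __builtin_shuffle support was added.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#ifndef __has_builtin
#define __has_builtin(x) 0      /* older compilers: assume GCC's builtin */
#endif

typedef uint8_t uint8x16 __attribute__ ((vector_size (16)));

/* Byte-swap each of the four 32-bit lanes of a 16-byte vector. */
static uint8x16
bswap32_lanes (uint8x16 v)
{
#if __has_builtin (__builtin_shufflevector)
    /* Clang-style builtin: the indices are compile-time constants. */
    return __builtin_shufflevector (v, v,
                                    3, 2, 1, 0, 7, 6, 5, 4,
                                    11, 10, 9, 8, 15, 14, 13, 12);
#else
    /* GCC-style builtin: the indices come from a mask vector. */
    const uint8x16 mask = { 3, 2, 1, 0, 7, 6, 5, 4,
                            11, 10, 9, 8, 15, 14, 13, 12 };
    return __builtin_shuffle (v, mask);
#endif
}
```

Loading the bytes 0..15 into a vector and passing it through bswap32_lanes() should yield 3,2,1,0,7,6,5,4,... on either compiler family.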
Adam Jackson
a75c69f122 Merge branch 'ci' into 'master'
ci: Add .gitlab-ci.yml

See merge request pixman/pixman!1
2018-06-05 16:33:50 +00:00
Adam Jackson
9034d0cc32 ci: Add .gitlab-ci.yml
Just builds on Fedora 28 for x86_64 at the moment, but it's a start.
Credit to Daniel Stone for eliminating the nested docker image.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-06-05 12:13:35 -04:00
Dan Horák
ddf42d627c vmx: Fix vector loads on ppc64le
Use vector intrinsic for loading possibly unaligned data instead of a
typecast.

Bugzilla: https://bugzilla.redhat.com/1572540
Signed-off-by: Dan Horák <dan@danny.cz>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2018-05-14 16:31:49 -04:00
Behdad Esfahbod
8b95e0e460 Promote unsigned short to unsigned int explicitly
...to avoid default promotion to signed int, which causes undefined
behaviour in the shift expression.
2018-01-09 10:26:29 +01:00
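The promotion problem named in the commit above can be illustrated with a small sketch. The helper name is hypothetical and not taken from pixman, and a 32-bit int is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from pixman. An unsigned short operand is
 * promoted to (signed) int before a shift, so writing `v << 16` directly can
 * shift a 1 into the sign bit of a 32-bit int, which is undefined behaviour.
 * Casting to unsigned int first keeps the whole expression unsigned. */
static uint32_t
replicate_16 (unsigned short v)
{
    return ((unsigned int) v << 16) | (unsigned int) v;
}
```

With the cast in place, replicate_16 (0xffff) is well defined and evaluates to 0xffffffff.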
Andreas Boll
31381b7057 Upload to unstable. 2017-12-17 13:34:07 +01:00
Andreas Boll
f0178c049c Bump standards version to 4.1.2. 2017-12-17 13:19:45 +01:00
Andreas Boll
9684e88c21 Stop passing --disable-silent-rules to configure, debhelper does it now. 2017-12-17 13:19:23 +01:00
Andreas Boll
397047255e Switch to dbgsym package. 2017-12-17 13:18:03 +01:00
Andreas Boll
34c1784503 Declare Multi-Arch: same for libpixman-1-dev (Closes: #884166). 2017-12-17 13:17:44 +01:00
Julien Cristau
87934b6b4f Upload to unstable 2016-09-24 13:25:26 +02:00
Julien Cristau
4daa9a4c6b Use https URL in debian/watch. 2016-09-24 13:23:41 +02:00
Søren Sandmann Pedersen
85467ec308 Revert "demos/scale: Added pulldown to choose PIXMAN_FILTER_* value"
This reverts commit 375f5ec5c5.

This patch was accidentally pushed.
2016-09-03 15:09:12 -04:00
Bill Spitzak
17c4ce2e39 pixman-filter: Made Gaussian a bit wider
Expanded the size slightly (from ~4.25 to 5) to make the cutoff less
noticeable.  Previously the value at the cutoff was
gaussian_filter(sqrt(2)*3/2) = 0.00626 which is larger than the
difference between 8-bit pixels (1/255 = 0.003921). New cutoff is
gaussian_filter(2.5) = 0.001089 which is smaller.

v11: added some math to commit message
v14: left SIGMA in there
Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-03 14:53:07 -04:00
Bill Spitzak
d286078b28 pixman-filter: Nested polynomial for cubic
v11: Restored range checks

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-09-03 14:53:07 -04:00
Søren Sandmann Pedersen
133142449b pixman-filter: Fix several issues related to normalization
There are a few bugs in the current normalization code:

(1) The normalization is based on the sum of the *floating point*
    values generated by integral(). But in order to get the sum to be
    close to pixman_fixed_1, the sum of the rounded fixed point values
    should be used.

(2) The multiplications in the normalization loops often round the
    same way, so the residual error can be fairly large.

(3) The residual error is added to the sample located at index
    (width - width / 2), which is not the midpoint for odd widths (and
    for width 1 is in fact outside the array).

This patch fixes these issues by (1) using the sum of the fixed point
values as the total to divide by, (2) doing error diffusion in the
normalization loop, and (3) putting any residual error (which is now
guaranteed to be less than pixman_fixed_e) at the first sample, which
is the only one that didn't get any error diffused into it.

Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-03 14:53:06 -04:00
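The three fixes listed in the commit above can be sketched as one small routine. This is a minimal illustration written from the commit message, not the pixman-filter.c code; fixed_16_16_t, FIXED_1, floor_to_int and normalize_weights are hypothetical names, and the weights are assumed to have a positive sum.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t;          /* stand-in for pixman_fixed_t */
#define FIXED_1 ((fixed_16_16_t) 0x10000)

/* floor() for values that fit in 32 bits, written out to keep the sketch
 * free of library dependencies. */
static int32_t
floor_to_int (double v)
{
    int32_t t = (int32_t) v;            /* truncates toward zero */
    return (v < t) ? t - 1 : t;
}

/* Rescale n >= 1 fixed-point weights so that they sum to exactly FIXED_1. */
static void
normalize_weights (fixed_16_16_t *w, int n)
{
    int64_t sum = 0, new_sum = 0;
    double scale, e = 0.0;
    int i;

    /* (1) Total the rounded fixed-point values, not the floats they came from. */
    for (i = 0; i < n; i++)
        sum += w[i];
    scale = 65536.0 / (double) sum;

    /* (2) Error diffusion: carry each sample's rounding error into the next. */
    for (i = 0; i < n; i++)
    {
        double v = w[i] * scale + e;
        fixed_16_16_t t = floor_to_int (v + 0.5);

        e = v - t;
        new_sum += t;
        w[i] = t;
    }

    /* (3) Whatever is left goes to the first sample, the only one that
     * received no diffused error. */
    w[0] += (fixed_16_16_t) (FIXED_1 - new_sum);
}
```

Three equal weights of 21845 (sum 65535) come out summing to exactly 65536, and the same holds when some weights are negative, as they can be for filters with negative lobes.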
Søren Sandmann Pedersen
3b46fce6fe pixman-filter: Speed up BOX/BOX filter
The convolution of two BOX filters is simply the length of the
interval where both are non-zero, so we can simply return width from
the integral() function because the integration region has already
been restricted to be such that both functions are non-zero on it.

This is both faster and more accurate than doing numerical integration.

This patch is based on one by Bill Spitzak

    https://lists.freedesktop.org/archives/pixman/2016-March/004446.html

with these changes:

- Rebased to not assume any changes in the arguments to integral().

- Dropped the multiplication by scale

- Added more details in the commit message.

Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:12 -04:00
Bill Spitzak
8855b3a2a2 pixman-filter: integral splitting is only needed for triangle filter
Only the triangle is discontinuous at 0. The other filters resemble a
cubic closely enough that Simpson's integration works without
splitting.

Changes by Søren: Rebase without the changes to the integral function,
update comment to match the new code.

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:12 -04:00
Bill Spitzak
6ae281fbb7 pixman-filter: Correct Simpsons integration
Simpson's rule fits a parabola through each group of 3 samples (and is
exact for cubics). This puts the sample weights in a pattern of
1,4,2,4,2...4,1, with the result then divided by 3.

The previous code was using weights of 1,2,0,6,0,6...,2,1.

With this fix the integration is accurate enough that the number of
samples could be reduced a lot. Multiples of 12 seem to work best.

v7: Merged with patch to reduce from 128 samples to 16
v9: Changed samples from 16 to 12
v10: Fixed rebase error that made it not compile
v11: minor whitespace change
v14: more whitespace changes

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:12 -04:00
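The 1,4,2,4,2...4,1 weighting named in the commit above is the composite Simpson's rule, which can be sketched as follows. The function names are hypothetical and the integrand is a stand-in; this is not the integral() routine from pixman-filter.c.

```c
#include <assert.h>

/* Composite Simpson's rule over [a, b] using n_samples intervals
 * (n_samples must be even and positive). f is any integrand. */
static double
simpson (double (*f) (double), double a, double b, int n_samples)
{
    double h = (b - a) / n_samples;
    double sum = f (a) + f (b);                       /* end points: weight 1 */
    int i;

    for (i = 1; i < n_samples; i++)
        sum += (i % 2 ? 4.0 : 2.0) * f (a + i * h);   /* 4,2,4,...,2,4 */

    return sum * h / 3.0;                             /* divide by 3 */
}

/* Example integrand: Simpson's rule integrates cubics exactly. */
static double
cube (double x)
{
    return x * x * x;
}
```

Integrating cube over [0, 2] with 12 samples gives 4 up to floating-point rounding, which illustrates why a small sample count can be enough once the weights are right.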
Bill Spitzak
6acaf2bcb1 pixman-filter: reduce amount of malloc/free/memcpy to generate filter
Rearranged so that the entire block of memory for the filter pair
is allocated first, and then filled in. Previous version allocated
and freed two temporary buffers for each filter and did an extra
memcpy.

v8: small refactor to remove the filter_width function

v10: Restored filter_width function but with arguments changed to
     match later patches

v11: Removed unused arg and pointer from filter_width function
     Whitespace fixes.

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:12 -04:00
Bill Spitzak
d0e6c9f4f6 pixman-image: Added enable-gnuplot config to view filters in gnuplot
If enable-gnuplot is configured, then you can pipe the output of a
pixman-using program to gnuplot and get a continuously-updated plot of
the horizontal filter. This works well with demos/scale to test the
filter generation.

The plot is all the different subposition filters shuffled
together. This is misleading in a few cases:

  IMPULSE.BOX - goes up and down as the subfilters have different
                numbers of non-zero samples

  IMPULSE.TRIANGLE - somewhat crooked for the same reason

  1-wide filters - looks triangular, but a 1-wide box would be more
                   accurate

Changes by Søren: Rewrote the pixman-filter.c part to
     - make it generate correct coordinates
     - add a comment on how coordinates are generated
     - in rounding.txt, add a ceil() variant of the first-sample
       formula
     - make the gnuplot output slightly prettier

v7: First time this ability was included

v8: Use config option
    Moved code to the filter generator
    Modified scale demo to not call filter generator a second time.

v10: Only print if successful generation of plots
     Use #ifdef, not #if

v11: small whitespace fixes
v12: output range from -width/2 to width/2 and include y==0, to avoid misleading plots
     for subsample_bits==0 and for box filters which may have no small values.

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:11 -04:00
Bill Spitzak
375f5ec5c5 demos/scale: Added pulldown to choose PIXMAN_FILTER_* value
This is very useful for comparing the results of SEPARABLE_CONVOLUTION
with BILINEAR and NEAREST.

v14: Removed good/best items
v15: Skip filter generation so gnuplot output continues showing previous value

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-09-02 00:40:11 -04:00
Bill Spitzak
afee2adc1e demos/scale: Default to locked axis
Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:11 -04:00
Bill Spitzak
1e1af34d3b demos/scale: fix blank subsamples spin box
It now shows the initial value of 4 when the demo is started

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:11 -04:00
Bill Spitzak
99b574109d demos/scale: Compute filter size using boundary of xformed ellipse
Instead of using the boundary of xformed rectangle, use the boundary
of xformed ellipse. This is much more accurate and less blurry. In
particular the filtering does not change as the image is rotated.

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Soren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:11 -04:00
Søren Sandmann Pedersen
b9ead7ddf7 More general BILINEAR=>NEAREST reduction
Generalize and simplify the code that reduces BILINEAR to NEAREST so
that the reduction happens for all affine transformations where
t00...t12 are integers and (t00 + t01) and (t10 + t11) are both
odd. This is a sufficient condition for the resulting transformed
coordinates to be exactly at the center of a pixel so that BILINEAR
becomes identical to NEAREST.

V2: Address some comments by Bill Spitzak

Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:11 -04:00
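The sufficient condition stated in the commit above can be sketched as a predicate. It is written from the commit message rather than copied from pixman; the type and function names are hypothetical. The reasoning is that a pixel centre (x + 0.5, y + 0.5) maps to an integer plus (t00 + t01)/2, which lands on a pixel centre exactly when that sum is odd.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t;   /* stand-in for pixman_fixed_t */

/* t holds the top two rows of the affine matrix in 16.16 fixed point:
 * t[0] = {t00, t01, t02}, t[1] = {t10, t11, t12}. Returns 1 when the
 * condition from the commit message holds, 0 otherwise. */
static int
bilinear_reduces_to_nearest (fixed_16_16_t t[2][3])
{
    int i, j;

    for (i = 0; i < 2; i++)
    {
        for (j = 0; j < 3; j++)
        {
            if (t[i][j] & 0xffff)       /* fractional bits set: not an integer */
                return 0;
        }

        /* With the fractional bits known to be zero, bit 16 of the sum is
         * the parity of the integer (ti0 + ti1). */
        if ((((int64_t) t[i][0] + t[i][1]) & 0x10000) == 0)
            return 0;
    }

    return 1;
}
```

The identity and a 90-degree rotation satisfy the condition, while a scale by 2 (even sum) or by 1.5 (fractional entry) does not.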
Søren Sandmann Pedersen
7612369013 Add new test of filter reduction from BILINEAR to NEAREST
This new test tests a bunch of bilinear downscalings, where many have
a transformation such that the BILINEAR filter can be reduced to
NEAREST (and many don't).

A CRC32 is computed for all the resulting images and compared to a
known-good value for both 4-bit and 7-bit interpolation.

V2: Remove leftover comment, some minor formatting fixes, use a
timestamp as the PRNG seed.

Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:11 -04:00
Søren Sandmann Pedersen
eb4a832ec2 pixman-fast-path.c: Pick NEAREST affine fast paths before BILINEAR ones
When a BILINEAR filter is reduced to NEAREST, it is possible for both
types of fast paths to run; in this case, the NEAREST ones should be
preferred as that is the simpler filter.

Signed-off-by: Soren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:11 -04:00
Julien Cristau
0f4e087031 Bump changelogs 2016-05-13 12:50:41 +02:00
Julien Cristau
5672fa0f82 pixman 0.34.0 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWrh2cAAoJEGUdTbirWueAfxAH/1sf8P0SHY1y9KBKCw0enM4Y
 60sZYAgTgLa5prITcPeTb11bw877WAF73bAVjzL+6pNkT+Xs1ytvckwmbDoKDRZi
 zlptf0vPCnPX95Fh2X2PSO/1G0EErNWbqP5dUtLJ8L4sEaAj5TtDC9r9BouXpFaR
 qdipAmC1dVQNsbheBUinnfIjQ7H7i0NXXoUADFoP+X9V3WW95Hjkbwyoa4IUeYsY
 lPLVKfMRTZfQLksAAViDDpAhQxIrwMYQYApuMlbYXvX3tsW6zZCTeDfjqwRfxkdX
 Nnsz3lKBGvbS2ZJQBx2Xp9YC7+eu12IlxFA8cn3Exa96VngPJK5bR8Qn1ZJlUH8=
 =hex7
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.34.0' into debian-unstable

pixman 0.34.0 release
2016-05-13 12:49:33 +02:00
Oded Gabbay
1727aa4ab6 Pre-release version bump to 0.34.0
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-01-31 16:39:23 +02:00
Thomas Petazzoni
7c6066b700 pixman-private: include <float.h> only in C code
<float.h> is included unconditionally by pixman-private.h, which in
turn gets included by assembler files. Unfortunately, with certain C
libraries (like the musl C library), <float.h> cannot be included in
assembler files:

  CCLD     libpixman-arm-simd.la
/home/test/buildroot/output/host/usr/arm-buildroot-linux-musleabihf/sysroot/usr/include/float.h: Assembler messages:
/home/test/buildroot/output/host/usr/arm-buildroot-linux-musleabihf/sysroot/usr/include/float.h:8: Error: bad instruction `int __flt_rounds(void)'
/home/test/buildroot/output/host/usr/arm-buildroot-linux-musleabihf/sysroot/usr/include/float.h: Assembler messages:
/home/test/buildroot/output/host/usr/arm-buildroot-linux-musleabihf/sysroot/usr/include/float.h:8: Error: bad instruction `int __flt_rounds(void)'

It turns out however that <float.h> is not needed by assembly files,
so we move its inclusion within the #ifndef __ASSEMBLER__ condition,
which solves the problem.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2016-01-31 16:15:26 +02:00
Andreas Boll
af451ab328 Upload to unstable. 2016-01-14 13:46:57 +01:00
Andreas Boll
cae8b2a893 Add myself to Uploaders. 2016-01-14 13:21:08 +01:00
Andreas Boll
e22e142165 Bump changelogs. 2016-01-14 13:19:45 +01:00
Andreas Boll
5e030aac41 pixman 0.33.6 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWeVNeAAoJEGUdTbirWueAZVUIAIMrz8RGz2t/6Y16CPx8Kfat
 NJFe9k0gVxTCBGYcAOtZJxeqcl/RryGuEGrdcN1UiAeCsjDxTCEwefHO1ablC6A6
 Zc57mkxbknM1eOHiU/D59+JFC5cvLM3WlsQSAi2CyUIdlSq/b7vK/ADWas7kn8y9
 AdDd/MEfGXwVKumQqSN+h5GZxLwhOYw6Y9Ew6srR5EX3jzGQ8GQY3cfd3tzXpYYN
 aZ3EME3EUkhrT3DdUg/byoQu1YIppGm5Vb405gqe/1B+QZLMHUsKP3dwMk++jcdn
 4vcZAhs3s5VrVlPkfng6HLdRHmHI//AfwRBktcrEoirGfGGtPF3NKfk9B4KgPRk=
 =FhAa
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.33.6' into debian-unstable

pixman 0.33.6 release
2016-01-14 13:17:22 +01:00
Andrea Canciani
342cbf1644 build: Distinguish SKIP and FAIL on Win32
The `check` target in test/Makefile.win32 assumed that any non-0 exit
code from the tests was an error, but the testsuite is currently using
77 as a SKIP exit code (based on the convention used in autotools).

Fixes fence-image-self-test and cover-test (now reported as SKIP).

Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-30 14:06:40 +01:00
Simon Richter
af0689716a build: Use del instead of rm on cmd.exe shells
The `rm` command is not usually available when running on Win32 in a
`cmd.exe` shell. Instead the shell provides the `del` builtin, which
has somewhat more limited wildcard expansion and error handling.

This makes all of the Makefile targets work on Win32 both using
`cmd.exe` and using the MSYS environment.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 21:24:17 +01:00
Andrea Canciani
93b876c110 build: Do not use mkdir -p on Windows
When the build is performed using `cmd.exe` as shell, the `mkdir`
command does not support the `-p` flag. The ability to create multiple
nested folders is not used, hence it can be easily replaced by only
creating the directory if it does not exist.

This makes the build work on the `cmd.exe` shell, except for the
`clean` targets.

Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 21:24:06 +01:00
Andrea Canciani
cc35d01980 build: Avoid phony pixman target in test/Makefile.win32
Instead of explicitly depending on "pixman" for the "all" and "check"
targets, rely on the dependency to the .lib file

Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 21:23:57 +01:00
Andrea Canciani
ceb49cbda9 build: Remove use of BUILT_SOURCES from Makefile.win32
Since 3d81d89c29 BUILT_SOURCES is not
used anymore, but it was unintentionally left in Win32 Makefiles.

Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 21:23:46 +01:00
Oded Gabbay
ba1868a854 Post 0.34 branch creation version bump to 0.35.1
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 10:46:40 +02:00
Oded Gabbay
0e72e78086 Post-release version bump to 0.33.7
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-22 15:55:32 +02:00
Oded Gabbay
65f35270e4 Pre-release version bump to 0.33.6
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-22 15:30:10 +02:00
Oded Gabbay
a566f627db configure.ac: fix test for SSE2 & SSSE3 assembler support
This patch modifies the SSE2 & SSSE3 tests in configure.ac to use a
global variable to initialize vector variables. In addition, we now
return the value of the computation instead of 0.

This is done so gcc 4.9 (and lower) won't optimize away the SSE assembly
instructions (when using -O1 and higher), because then the configure test
might incorrectly pass even though the assembler doesn't support the
SSE instructions (the test will pass because the compiler does support
the intrinsics).

v2: instead of using volatile, use a global variable as input

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-22 11:19:01 +02:00
Andrea Canciani
d24b415f3e mmx: Improve detection of support for "K" constraint
Older versions of clang emitted an error on the "K" constraint, but at
least since version 3.7 it is supported. Just like gcc, this
constraint is only allowed for constants, but apparently clang
requires them to be known before inlining.

Using the macro definition _mm_shuffle_pi16(A, N) ensures that the "K"
constraint is always applied to a literal constant, independently from
the compiler optimizations and allows building pixman-mmx on modern
clang.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrea Canciani <ranma42@gmail.com>
2015-11-18 14:19:58 -08:00
Matt Turner
312e381523 Revert "mmx: Use MMX2 intrinsics from xmmintrin.h directly."
This reverts commit 7de61d8d14.

Newer versions of gcc allow inclusion of xmmintrin.h without -msse, but
still won't allow usage of the intrinsics.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=564024
2015-11-18 14:19:12 -08:00
Andreas Boll
017a59ec26 Upload to unstable 2015-11-04 13:26:38 +01:00
Andreas Boll
c193730083 Bump changelogs. 2015-11-04 10:30:58 +01:00
Andreas Boll
51c330400f pixman 0.33.4 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWKk4tAAoJEGUdTbirWueAIDkH/0YQj9943iFVJFEWhQdhLJe6
 PeHsiZgNjhPTNK2gpuudtOK2yda1akQTCfjGeNzN0nKQ0qPOaDiF71jt/C4Duppx
 rX9M6lkyMEPlCrM27+pZUCJitL+e7j8qYjapAdfvx8lCqvl8Mkq2t5JCsr1PWkte
 5w83kNhWf35eWN0zgRem9tTgVQ0LMYdO5IYPasAnqKHUUaIHO/r2dTNdc8bBFvD7
 k7X3Qz/kqAodraTWpieT59mwttUI0x/CiaNjlXfMDC4KKtbzkZJQlc0Oys74EG17
 Oag2Bvi4vnkTj+lvoixhu8dBGR/LPyEzZHbZyNWfjsDYL2RM2FuovUDxaYYM5nQ=
 =11P2
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.33.4' into debian-unstable

pixman 0.33.4 release
2015-11-04 10:28:32 +01:00
Oded Gabbay
3a50806cbe Post-release version bump to 0.33.5
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-10-23 18:33:55 +03:00
Oded Gabbay
fa71d08a81 Pre-release version bump to 0.33.4
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-10-23 17:58:49 +03:00
Andrea Canciani
9728241bd0 test: Fix fence-image-self-test on Mac
On MacOS X, according to the manpage of mprotect(), "When a program
violates the protections of a page, it gets a SIGBUS or SIGSEGV
signal.", but fence-image-self-test was only accepting a SIGSEGV as
notification of invalid access.

Fixes fence-image-self-test

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-10-16 15:05:02 +03:00
Matt Turner
7de61d8d14 mmx: Use MMX2 intrinsics from xmmintrin.h directly.
We had lots of hacks to handle the inability to include xmmintrin.h
without compiling with -msse (lest SSE instructions be used in
pixman-mmx.c). Some recent version of gcc relaxed this restriction.

Change configure.ac to test that xmmintrin.h can be included and that we
can use some intrinsics from it, and remove the work-around code from
pixman-mmx.c.

Evidently allows gcc 4.9.3 to optimize better as well:

   text	   data	    bss	    dec	    hex	filename
 657078	  30848	    680	 688606	  a81de	libpixman-1.so.0.33.3 before
 656710	  30848	    680	 688238	  a806e	libpixman-1.so.0.33.3 after

Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Tested-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2015-10-13 09:40:42 -07:00
Siarhei Siamashka
90e62c0867 vmx: implement fast path vmx_composite_over_n_8888
Running "lowlevel-blt-bench over_n_8888" on Playstation3 3.2GHz,
Gentoo ppc (32-bit userland) gave the following results:

before:  over_n_8888 =  L1: 147.47  L2: 205.86  M:121.07
after:   over_n_8888 =  L1: 287.27  L2: 261.09  M:133.48

Cairo non-trimmed benchmarks on POWER8, 3.4GHz 8 Cores:

ocitysmap          659.69  -> 611.71   :  1.08x speedup
xfce4-terminal-a1  2725.22 -> 2547.47  :  1.07x speedup

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-29 14:21:46 +03:00
Ben Avison
2876d8d3dd affine-bench: remove 8e margin from COVER area
Patch "Remove the 8e extra safety margin in COVER_CLIP analysis" reduced
the required image area for setting the COVER flags in
pixman.c:analyze_extent(). Do the same reduction in affine-bench.

Leaving the old calculations in place would be very confusing for anyone
reading the code.

Also add a comment that explains how affine-bench wants to hit the COVER
paths. This explains why the intricate extent calculations are copied
from pixman.c.

[Pekka: split patch, change comments, write commit message]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-25 14:26:04 +03:00
Ben Avison
0e2e975128 Remove the 8e extra safety margin in COVER_CLIP analysis
As discussed in
http://lists.freedesktop.org/archives/pixman/2015-August/003905.html

the 8 * pixman_fixed_e (8e) adjustment which was applied to the transformed
coordinates is a legacy of rounding errors which used to occur in old
versions of Pixman, but which no longer apply. For any affine transform,
you are now guaranteed to get the same result by transforming the upper
coordinate as by transforming the lower coordinate and adding (size-1)
steps of the increment in source coordinate space. No projective
transform routines use the COVER_CLIP flags, so they cannot be affected.

Proof by Siarhei Siamashka:

Let's take a look at the following affine transformation matrix (with 16.16
fixed point values) and two vectors:

         | a   b     c    |
M      = | d   e     f    |
         | 0   0  0x10000 |

         |  x_dst  |
P     =  |  y_dst  |
         | 0x10000 |

         | 0x10000 |
ONE_X  = |    0    |
         |    0    |

The current matrix multiplication code does the following calculations:

             | (a * x_dst + b * y_dst + 0x8000) / 0x10000 + c |
    M * P =  | (d * x_dst + e * y_dst + 0x8000) / 0x10000 + f |
             |                   0x10000                      |

These calculations are not perfectly exact and we may get rounding
because the integer coordinates are adjusted by 0.5 (or 0x8000 in the
16.16 fixed point format) before doing matrix multiplication. For
example, if the 'a' coefficient is an odd number and 'b' is zero,
then we are losing some of the least significant bits when dividing by
0x10000.

So we need to strictly prove that the following expression is always
true even though we have to deal with rounding:

                                          | a |
    M * (P + ONE_X) - M * P = M * ONE_X = | d |
                                          | 0 |

or

   ((a * (x_dst + 0x10000) + b * y_dst + 0x8000) / 0x10000 + c)
  -
   ((a * x_dst             + b * y_dst + 0x8000) / 0x10000 + c)
  =
    a

It's easy to see that this is equivalent to

    a + ((a * x_dst + b * y_dst + 0x8000) / 0x10000 + c)
      - ((a * x_dst + b * y_dst + 0x8000) / 0x10000 + c)
  =
    a

Which means that stepping exactly by one pixel horizontally in the
destination image space (advancing 'x_dst' by 0x10000) is the same as
changing the transformed 'x_src' coordinate in the source image space
exactly by 'a'. The same applies to the vertical direction too.
Repeating these steps, we can reach any pixel in the source image
space and get exactly the same fixed point coordinates as doing
matrix multiplications per each pixel.
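
The stepping property above can be checked numerically with a small
sketch. This is not Pixman code: the helper names are hypothetical, and
the division by 0x10000 is assumed to round toward negative infinity
(written out explicitly here so the result does not depend on how the
compiler shifts negative values).

```c
#include <assert.h>
#include <stdint.h>

/* Floor division by 0x10000 (assumption: the real code rounds toward
 * negative infinity, e.g. via an arithmetic right shift by 16). */
static int64_t floor_div_65536(int64_t n)
{
    int64_t q = n / 0x10000;
    if (n % 0x10000 < 0)
        q--;
    return q;
}

/* x_src as given by the first row of M * P in the proof:
 * (a * x_dst + b * y_dst + 0x8000) / 0x10000 + c */
static int64_t transform_x(int32_t a, int32_t b, int32_t c,
                           int32_t x_dst, int32_t y_dst)
{
    int64_t sum = (int64_t)a * x_dst + (int64_t)b * y_dst + 0x8000;
    return floor_div_65536(sum) + c;
}

/* Returns 1 if advancing x_dst by one pixel (0x10000) moves the
 * transformed coordinate by exactly a. */
static int step_is_exact(int32_t a, int32_t b, int32_t c,
                         int32_t x_dst, int32_t y_dst)
{
    assert(x_dst <= INT32_MAX - 0x10000);
    return transform_x(a, b, c, x_dst + 0x10000, y_dst)
         - transform_x(a, b, c, x_dst, y_dst) == a;
}
```

The check holds for odd 'a' values too, because a * 0x10000 is always a
whole multiple of the divisor and therefore passes through the floor
unchanged.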

By the way, the older matrix multiplication implementation, which was
relying on less accurate calculations with three intermediate roundings
"((a + 0x8000) >> 16) + ((b + 0x8000) >> 16) + ((c + 0x8000) >> 16)",
also has the same properties. However, reverting
    http://cgit.freedesktop.org/pixman/commit/?id=ed39992564beefe6b12f81e842caba11aff98a9c
and applying this "Remove the 8e extra safety margin in COVER_CLIP
analysis" patch makes the cover test fail. The real reason why it fails
is that the old pixman code was using "pixman_transform_point_3d()"
function
    http://cgit.freedesktop.org/pixman/tree/pixman/pixman-matrix.c?id=pixman-0.28.2#n49
for getting the transformed coordinate of the top left corner pixel
in the image scaling code, but at the same time using a different
"pixman_transform_point()" function
    http://cgit.freedesktop.org/pixman/tree/pixman/pixman-matrix.c?id=pixman-0.28.2#n82
in the extents calculation code for setting the cover flag. And these
functions did the intermediate rounding differently. That's why the 8e
safety margin was needed.

** proof ends

However, for COVER_CLIP_NEAREST, the actual margins added were not 8e.
Because the half-way cases round down, that is, coordinate 0 hits pixel
index -1 while coordinate e hits pixel index 0, the extra safety margins
were actually 7e to the left and up, and 9e to the right and down. This
patch removes the 7e and 9e margins and restores the -e adjustment
required for NEAREST sampling in Pixman. For reference, see
pixman/rounding.txt.

For COVER_CLIP_BILINEAR, the margins were exactly 8e as there are no
additional offsets to be restored, so simply removing the 8e additions
is enough.

Proof:

All implementations must give the same numerical results as
bits_image_fetch_pixel_nearest() / bits_image_fetch_pixel_bilinear().

The former does
    int x0 = pixman_fixed_to_int (x - pixman_fixed_e);
which maps directly to the new test for the nearest flag, when you consider
that x0 must fall in the interval [0,width).

The latter does
    x1 = x - pixman_fixed_1 / 2;
    x1 = pixman_fixed_to_int (x1);
    x2 = x1 + 1;
When you write a COVER path, you take advantage of the assumption that
both x1 and x2 fall in the interval [0, width).

As samplers are allowed to fetch the pixel at x2 unconditionally, we
require
    x1 >= 0
    x2 < width
so
    x - pixman_fixed_1 / 2 >= 0
    x - pixman_fixed_1 / 2 + pixman_fixed_1 < width * pixman_fixed_1
so
    pixman_fixed_to_int (x - pixman_fixed_1 / 2) >= 0
    pixman_fixed_to_int (x + pixman_fixed_1 / 2) < width
which matches the source code lines for the bilinear case, once you delete
the lines that add the 8e margin.
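
The two conditions derived above can be sketched in isolation as
follows. The typedef and macros are stand-ins for pixman_fixed_t,
pixman_fixed_e and pixman_fixed_1, the function names are hypothetical,
and the integer conversion is written as an explicit floor rather than
the shift the library uses.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16;              /* stand-in for pixman_fixed_t */
#define FIXED_E ((fixed_16_16)1)          /* stand-in for pixman_fixed_e */
#define FIXED_1 ((fixed_16_16)0x10000)    /* stand-in for pixman_fixed_1 */

/* Integer part of a 16.16 value, rounding toward negative infinity. */
static int fixed_to_int(fixed_16_16 f)
{
    int q = f / FIXED_1;
    if (f % FIXED_1 < 0)
        q--;
    return q;
}

/* NEAREST reads one pixel, x0 = int(x - e); it must lie in [0, width). */
static int nearest_covered(fixed_16_16 x, int width)
{
    int x0;
    assert(width > 0);
    x0 = fixed_to_int(x - FIXED_E);
    return x0 >= 0 && x0 < width;
}

/* BILINEAR reads x1 = int(x - 1/2) and x2 = x1 + 1 unconditionally,
 * so both must lie in [0, width). */
static int bilinear_covered(fixed_16_16 x, int width)
{
    assert(width > 0);
    return fixed_to_int(x - FIXED_1 / 2) >= 0 &&
           fixed_to_int(x + FIXED_1 / 2) < width;
}
```

With these definitions coordinate 0 maps to pixel index -1 and
coordinate e maps to pixel index 0 for NEAREST, matching the half-way
rounding described earlier in this message.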

Signed-off-by: Ben Avison <bavison@riscosopen.org>
[Pekka: adjusted commit message, left affine-bench changes for another patch]
[Pekka: add commit message parts from Siarhei]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-25 14:24:17 +03:00
Ben Avison
23525b4ea5 pixman-general: Tighten up calculation of temporary buffer sizes
Each of the aligns can only add a maximum of 15 bytes to the space
requirement. This permits some edge cases to use the stack buffer where
previously it would have deduced that a heap buffer was required.
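
As an illustration of the 15-byte bound, here is a minimal sketch of
16-byte alignment and the resulting worst-case size. The function names
and the idea of sizing n separately aligned buffers are assumptions made
for the example, not the code in pixman-general.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of 16. The padding added is
 * always in the range 0..15 bytes. */
static uintptr_t align16(uintptr_t addr)
{
    return (addr + 15) & ~(uintptr_t)15;
}

/* Worst-case number of bytes needed for n_buffers buffers of buf_size
 * bytes each when the start of every buffer is aligned separately. */
static size_t worst_case_size(size_t buf_size, size_t n_buffers)
{
    assert(n_buffers == 0 || buf_size <= SIZE_MAX / n_buffers - 15);
    return n_buffers * (buf_size + 15);
}
```

Budgeting 16 bytes per alignment instead of 15 overestimates the
requirement by one byte per buffer, which is enough to push a borderline
request from the stack buffer to the heap.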

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-09-25 14:19:15 +03:00
Siarhei Siamashka
8b49d4b6b4 pixman-general: Fix stack related pointer arithmetic overflow
As https://bugs.freedesktop.org/show_bug.cgi?id=92027#c6 explains,
the stack is allocated at the very top of the process address space
in some configurations (32-bit x86 systems with ASLR disabled).
And the careless computations done with the 'dest_buffer' pointer
may overflow, failing the buffer upper limit check.

The problem can be reproduced using the 'stress-test' program,
which segfaults when executed via setarch:

    export CFLAGS="-O2 -m32" && ./autogen.sh
    ./configure --disable-libpng --disable-gtk && make
    setarch i686 -R test/stress-test

This patch introduces the required corrections. The extra check
for negative 'width' may be redundant (the invalid 'width' value
is not supposed to reach here), but it's better to play safe
when dealing with the buffers allocated on stack.
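
The failure mode can be sketched with plain integers standing in for
addresses. This is an illustration of the general technique (compare
sizes, not sums of pointers), not the actual patch, and both function
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Overflow-prone check: base + needed wraps around when base is close
 * to the top of the address space, so the sum can look small. */
static int fits_naive(uintptr_t base, uintptr_t end, uintptr_t needed)
{
    return base + needed <= end;
}

/* Safer check: subtract first. end - base cannot wrap as long as
 * base <= end, so the comparison is between two plain sizes. */
static int fits_safe(uintptr_t base, uintptr_t end, uintptr_t needed)
{
    assert(base <= end);
    return needed <= end - base;
}
```

For a buffer that ends 10 bytes below the top of the address space and
has 90 bytes left, a request for 200 bytes passes the naive check after
wrapping but is correctly rejected by the subtracting form.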

Reported-by: Ludovic Courtès <ludo@gnu.org>
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: soren.sandmann@gmail.com
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-22 13:19:06 +03:00
Thomas Petazzoni
4297e9058d test: add a check for FE_DIVBYZERO
Some architectures, such as Microblaze and Nios2, currently do not
implement FE_DIVBYZERO, even though they have <fenv.h> and
feenableexcept(). This commit adds a configure.ac check to verify
whether FE_DIVBYZERO is defined or not, and if not, disables the
problematic code in test/utils.c.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-20 15:50:04 +03:00
Oded Gabbay
8189fad961 vmx: Remove unused expensive functions
Now that we replaced the expensive functions with better performing
alternatives, we should remove them so they will not be used again.

Running Cairo benchmark on trimmed traces gave the following results:

POWER8, 8 cores, 3.4GHz, RHEL 7.2 ppc64le.

Speedups
========
t-firefox-scrolling     1232.30 -> 1096.55 :  1.12x
t-gnome-terminal-vim    613.86  -> 553.10  :  1.11x
t-evolution             405.54  -> 371.02  :  1.09x
t-firefox-talos-gfx     919.31  -> 862.27  :  1.07x
t-gvim                  653.02  -> 616.85  :  1.06x
t-firefox-canvas-alpha  941.29  -> 890.42  :  1.06x

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:07:13 +03:00
Oded Gabbay
6b1b8b2b90 vmx: implement fast path vmx_composite_over_n_8_8888
POWER8, 8 cores, 3.4GHz, RHEL 7.2 ppc64le.

reference memcpy speed = 25008.9MB/s (6252.2MP/s for 32bpp fills)

                Before         After           Change
              ---------------------------------------------
L1              91.32          182.84         +100.22%
L2              94.94          182.83         +92.57%
M               95.55          181.51         +89.96%
HT              88.96          162.09         +82.21%
VT              87.4           168.35         +92.62%
R               83.37          146.23         +75.40%
RT              66.4           91.5           +37.80%
Kops/s          683            859            +25.77%
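
For readers reproducing the table, the Change column appears to be the
relative difference between the two throughput figures. A minimal
sketch of that arithmetic, with a hypothetical helper name:

```c
#include <assert.h>

/* Relative change in percent between two throughput figures. */
static double percent_change(double before, double after)
{
    assert(before > 0.0);
    return (after / before - 1.0) * 100.0;
}
```

Applied to the L1 row (91.32 before, 182.84 after) this gives roughly
+100.22%, in line with the table.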

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:07:08 +03:00
Oded Gabbay
8d8caa55a3 vmx: optimize vmx_composite_over_n_8888_8888_ca
This patch optimizes vmx_composite_over_n_8888_8888_ca by removing use
of expand_alpha_1x128, unpack/pack and in_over_2x128 in favor of
splat_alpha, in_over and MUL/ADD macros from pixman_combine32.h.

Running "lowlevel-blt-bench -n over_8888_8888" on POWER8, 8 cores,
3.4GHz, RHEL 7.2 ppc64le gave the following results:

reference memcpy speed = 23475.4MB/s (5868.8MP/s for 32bpp fills)

                Before          After           Change
              --------------------------------------------
L1              244.97          474.05         +93.51%
L2              243.74          473.05         +94.08%
M               243.29          467.16         +92.02%
HT              144.03          252.79         +75.51%
VT              174.24          279.03         +60.14%
R               109.86          149.98         +36.52%
RT              47.96           53.18          +10.88%
Kops/s          524             576            +9.92%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:07:03 +03:00
Oded Gabbay
857880f0e4 vmx: optimize scaled_nearest_scanline_vmx_8888_8888_OVER
This patch optimizes scaled_nearest_scanline_vmx_8888_8888_OVER and all
the functions it calls (combine1, combine4 and
core_combine_over_u_pixel_vmx).

The optimization is done by removing use of expand_alpha_1x128 and
expand_alpha_2x128 in favor of splat_alpha and MUL/ADD macros from
pixman_combine32.h.

Running "lowlevel-blt-bench -n over_8888_8888" on POWER8, 8 cores,
3.4GHz, RHEL 7.2 ppc64le gave the following results:

reference memcpy speed = 24847.3MB/s (6211.8MP/s for 32bpp fills)

                Before          After           Change
              --------------------------------------------
L1              182.05          210.22         +15.47%
L2              180.6           208.92         +15.68%
M               180.52          208.22         +15.34%
HT              130.17          178.97         +37.49%
VT              145.82          184.22         +26.33%
R               104.51          129.38         +23.80%
RT              48.3            61.54          +27.41%
Kops/s          430             504            +17.21%

v2: Check *pm is not NULL before dereferencing it in combine1()

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:06:50 +03:00
Pekka Paalanen
73e586efb3 armv6: enable over_n_8888
Enable the fast path added in the previous patch by moving the lookup
table entries to their proper locations.

Lowlevel-blt-bench benchmark statistics with 30 iterations, showing the
effect of adding this one patch on top of
"armv6: Add over_n_8888 fast path (disabled)", which was applied on
fd59569294.

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1    12.5   0.04     45.2   0.10    100.00%    +263.1%
L2    11.1   0.02     43.2   0.03    100.00%    +289.3%
M      9.4   0.00     42.4   0.02    100.00%    +351.7%
HT     8.5   0.02     25.4   0.10    100.00%    +198.8%
VT     8.4   0.02     22.3   0.07    100.00%    +167.0%
R      8.2   0.02     23.1   0.09    100.00%    +183.6%
RT     5.4   0.05     11.4   0.21    100.00%    +110.3%

At most 3 outliers rejected per test per set.

Iterating here means that lowlevel-blt-bench was executed 30 times, and
the statistics above were computed from the output.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-09-17 14:40:39 +03:00
Ben Avison
9eb6889b15 armv6: Add over_n_8888 fast path (disabled)
This new fast path is initially disabled by putting the entries in the
lookup table after the sentinel. The compiler cannot tell the new code
is not used, so it cannot eliminate the code. Also the lookup table size
will include the new fast path. When the follow-up patch then enables
the new fast path, the binary layout (alignments, size, etc.) will stay
the same compared to the disabled case.

Keeping the binary layout identical is important for benchmarking on
Raspberry Pi 1. The addresses at which functions are loaded will have a
significant impact on benchmark results, causing unexpected performance
changes. Keeping all function addresses the same across the patch
enabling a new fast path improves the reliability of benchmarks.
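
The trick of parking entries after the sentinel can be sketched as
below. The types, operator names and functions are hypothetical and much
simpler than the real fast path tables; only the table layout idea is
taken from this message.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*composite_func)(void);

enum op { OP_NONE, OP_SRC, OP_OVER };    /* OP_NONE marks the sentinel */

struct fast_path {
    enum op        op;
    composite_func func;
};

static void composite_src(void) {}
static void composite_over_new(void) {}

static const struct fast_path fast_paths[] = {
    { OP_SRC,  composite_src },
    { OP_NONE, NULL },                   /* sentinel: lookups stop here */
    { OP_OVER, composite_over_new },     /* after the sentinel: disabled */
};

/* Walk the table until the sentinel. Entries placed after it are still
 * compiled and linked, but can never be returned. */
static composite_func lookup(enum op op)
{
    const struct fast_path *p;
    assert(op != OP_NONE);
    for (p = fast_paths; p->op != OP_NONE; p++)
        if (p->op == op)
            return p->func;
    return NULL;
}
```

Enabling the new path later only means moving its entry above the
sentinel, so the table keeps the same number of entries and the
functions it references stay in the binary either way.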

Benchmark results are included in the patch enabling this fast path.

[Pekka: disabled the fast path, commit message]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-09-17 14:40:39 +03:00
Ben Avison
4c71f595e3 test: Add cover-test v5
This test aims to verify both numerical correctness and the honouring of
array bounds for scaled plots (both nearest-neighbour and bilinear) at or
close to the boundary conditions for applicability of "cover" type fast paths
and iter fetch routines.

It has a secondary purpose: by setting the env var EXACT (to any value) it
will only test plots that are exactly on the boundary condition. This makes
it possible to ensure that "cover" routines are being used to the maximum,
although this requires the use of a debugger or code instrumentation to
verify.

Changes in v4:

  Check the fence page size and skip the test if it is too large. Since
  we need to deal with pixman_fixed_t coordinates that go beyond the
  real image width, make the page size limit 16 kB. A 32 kB or larger
  page size would cause an a8 image width to be 32k or more, which is no
  longer representable in pixman_fixed_t.

  Use a shorthand variable 'filter' in test_cover().

  Whitespace adjustments.

Changes in v5:

  Skip if fenced memory is not supported. Do you know of any such
  platform?
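
The 16 kB page size limit from the v4 changes follows from the range of
a signed 16.16 fixed point value. A minimal sketch of the check,
assuming an a8 row holds one pixel per byte so that a page-sized row is
as many pixels wide as the page has bytes (the helper name is
hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a pixel count, converted to 16.16 fixed point, still
 * fits in a signed 32-bit value. */
static int fits_in_fixed_16_16(int64_t pixels)
{
    assert(pixels >= 0);
    return pixels * 0x10000 <= INT32_MAX;
}
```

A 16384-pixel row passes, while a 32768-pixel row (32 kB page) does not,
since 32768 * 0x10000 is one past the largest signed 32-bit value.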

Signed-off-by: Ben Avison <bavison@riscosopen.org>
[Pekka: changes in v4 and v5]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-16 15:34:43 +03:00
Julien Cristau
f9a49b3783 Run tests with VERBOSE=1. 2015-09-12 20:31:08 +02:00
Julien Cristau
4b4898e073 Upload to unstable 2015-09-12 13:08:19 +02:00
Pekka Paalanen
812c9c9758 implementation: add PIXMAN_DISABLE=wholeops
Add a new option to PIXMAN_DISABLE: "wholeops". This option disables all
whole-operation fast paths regardless of implementation level, except
the general path (general_composite_rect).

The purpose is to add a debug option that allows us to test optimized
iterator paths specifically. With this, it is possible to:
- see if fast paths mask bugs in iterators
- compare fast paths with iterator paths for performance

The effect was tested on x86_64 by running:
$ PIXMAN_DISABLE='' ./test/lowlevel-blt-bench over_8888_8888
$ PIXMAN_DISABLE='wholeops' ./test/lowlevel-blt-bench over_8888_8888

In the first case time is spent in sse2_composite_over_8888_8888(), and
in the latter in sse2_combine_over_u().
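
A minimal sketch of how an option such as "wholeops" could be looked up
in a list of disabled features. The function name is hypothetical and
the space-separated list format is an assumption for this example; the
actual parsing code in Pixman may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if name appears as a whole space-separated word in list,
 * 0 otherwise. A NULL list (variable not set) disables nothing. */
static int list_contains(const char *list, const char *name)
{
    size_t len;
    const char *p;

    assert(name != NULL && *name != '\0');
    if (list == NULL)
        return 0;

    len = strlen(name);
    p = list;
    while (*p) {
        const char *end;
        while (*p == ' ')            /* skip separators */
            p++;
        end = p;
        while (*end && *end != ' ')  /* find the end of this word */
            end++;
        if ((size_t)(end - p) == len && memcmp(p, name, len) == 0)
            return 1;
        p = end;
    }
    return 0;
}
```

A caller would pass the value returned by getenv("PIXMAN_DISABLE") as
the list and skip the whole-operation fast paths when the lookup
succeeds.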

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-09 11:42:55 +03:00
Pekka Paalanen
e9ef2cc4de utils.[ch]: add fence_get_page_size()
Add a function to get the page size used for memory fence purposes, and
use it everywhere where getpagesize() was used.

This offers a single point in code to override the page size, in case
one wants to experiment how the tests work with a higher page size than
what the developer's machine has.

This also offers a clean API, without adding #ifdefs, to tests for
checking the page size.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-09 11:30:51 +03:00
Pekka Paalanen
82f8c997df utils.c: fix fallback code for fence_image_create_bits()
Used a wrong variable name, causing:
/home/pq/git/pixman/demos/../test/utils.c: In function ‘fence_image_create_bits’:
/home/pq/git/pixman/demos/../test/utils.c:562:46: error: ‘width’ undeclared (first use in this function)

Use the correct variable.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-09 11:29:44 +03:00
Andreas Boll
42fab57651 Bump standards version to 3.9.6. 2015-09-04 13:40:42 +02:00
Andreas Boll
56432ef5e5 Drop XC- prefix from Package-Type field. 2015-09-04 13:39:55 +02:00
Andreas Boll
c0f98e1cf4 Add upstream url. 2015-09-04 12:30:27 +02:00
Andreas Boll
03e2d2138b Update Vcs-* fields. 2015-09-04 12:30:27 +02:00
intrigeri
e6fce5e4e4 Update changelog.
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2015-09-04 12:30:26 +02:00
intrigeri
7bc925aa50 Enable all hardening build flags. Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
Quoting Simon again: "It currently has the same effect as hardening=+bindnow,
but will automatically enable future hardening options and in case the package
will ever build binaries those are immediately protected with PIE as well."

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2015-09-04 12:29:51 +02:00
intrigeri
2fb4da778c Simplify hardening build flags handling. Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
Quoting Simon Ruderich <simon@ruderich.org>:
"There's no need to use dpkg-buildflags manually in debian/rules.
Debhelper with compat=9 automatically enables the hardening flags when
dh_auto_configure is used. So just by calling dh_auto_configure [...]
the hardening flags get automatically passed to the build system.
DEB_BUILD_MAINT_OPTIONS is also respected."

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2015-09-04 12:29:51 +02:00
Andreas Boll
e47fb32ae3 Enable vmx on ppc64el (closes: #786345). 2015-09-04 12:29:49 +02:00
Andreas Boll
18e4bdcadf Bump changelogs. 2015-09-04 12:28:52 +02:00
Andreas Boll
40eb5d8140 Merge branch 'debian-unstable' into debian-unstable-new 2015-09-04 11:24:59 +02:00
Andreas Boll
266eaac369 Merge branch 'upstream-unstable' into debian-unstable-new 2015-09-04 11:24:48 +02:00
Pekka Paalanen
0700685382 test: add fence-image-self-test
Tests that fence_malloc and fence_image_create_bits actually work: that
out-of-bounds and out-of-row (unused stride area) accesses trigger
SIGSEGV.

If fence_malloc is a dummy (FENCE_MALLOC_ACTIVE not defined), this test
is skipped.

Changes in v2:

- check FENCE_MALLOC_ACTIVE value, not whether it is defined
- test that reading bytes near the fence pages does not cause a
  segmentation fault

Changes in v3:

- Do not print progress messages unless VERBOSE environment variable is
  set. Avoid spamming the terminal output of 'make check' on some
  versions of autotools.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-03 14:00:32 +03:00
Pekka Paalanen
13d93aa120 utils.[ch]: add fence_image_create_bits ()
Useful for detecting out-of-bounds accesses in composite operations.

This will be used by follow-up patches adding new tests.

Changes in v2:

- fix style on fence_image_create_bits args
- add page to stride only if stride_fence
- add comment on the fallback definition about freeing storage

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-01 17:06:46 +03:00
Pekka Paalanen
c70ddd5c9e utils.[ch]: add FENCE_MALLOC_ACTIVE
Define a new token to simplify checking whether fence_malloc() actually
can catch out-of-bounds access.

This will be used in the future to skip tests that rely on fence_malloc
checking functionality.

Changes in v2:

- #define FENCE_MALLOC_ACTIVE always, but change its value to help catch
  use of it without including utils.h

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-01 17:05:58 +03:00
Ben Avison
a82e519944 scaling-test: list more details when verbose
Add mask details to the output.

[Pekka: redo whitespace and print src,dst,mask x and y.]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-08-28 14:24:28 +03:00
Pekka Paalanen
fd59569294 lowlevel-blt-bench: make extra arguments an error
If a user gives multiple patterns or extra arguments, only the last one
was used as the pattern while the former were just ignored. This is a
user error silently converted to something possibly unexpected.

In presence of extra arguments, complain and quit.

Cc: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-08-18 10:23:27 +03:00
Oded Gabbay
69611473c5 Post-release version bump to 0.33.3
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-08-01 23:01:43 +03:00
Oded Gabbay
ee790044b0 Pre-release version bump to 0.33.2
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-08-01 22:34:53 +03:00
Oded Gabbay
8d9be3619a vmx: implement fast path iterator vmx_fetch_a8
No changes were observed when running cairo trimmed benchmarks.

Running "lowlevel-blt-bench src_8_8888" on POWER8, 8 cores,
3.4GHz, RHEL 7.1 ppc64le gave the following results:

reference memcpy speed = 25197.2MB/s (6299.3MP/s for 32bpp fills)

                Before          After           Change
              --------------------------------------------
L1              965.34          3936           +307.73%
L2              942.99          3436.29        +264.40%
M               902.24          2757.77        +205.66%
HT              448.46          784.99         +75.04%
VT              430.05          819.78         +90.62%
R               412.9           717.04         +73.66%
RT              168.93          220.63         +30.60%
Kops/s          1025            1303           +27.12%

It was benchmarked against commit id e2d211a from pixman/master

Siarhei Siamashka reported that on playstation3, it shows the following
results:

== before ==

              src_8_8888 =  L1: 194.37  L2: 198.46  M:155.90 (148.35%)
              HT: 59.18  VT: 36.71  R: 38.93  RT: 12.79 ( 106Kops/s)

== after ==

              src_8_8888 =  L1: 373.96  L2: 391.10  M:245.81 (233.88%)
              HT: 80.81  VT: 44.33  R: 48.10  RT: 14.79 ( 122Kops/s)

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
47f74ca946 vmx: implement fast path iterator vmx_fetch_x8r8g8b8
It was benchmarked against commit id 2be523b from pixman/master

POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.

cairo trimmed benchmarks :

Speedups
========
t-firefox-asteroids  533.92  -> 489.94 :  1.09x

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
fcbb97d445 vmx: implement fast path scaled nearest vmx_8888_8888_OVER
It was benchmarked against commit id 2be523b from pixman/master

POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              134.36          181.68          +35.22%
L2              135.07          180.67          +33.76%
M               134.6           180.51          +34.11%
HT              121.77          128.79          +5.76%
VT              120.49          145.07          +20.40%
R               93.83           102.3           +9.03%
RT              50.82           46.93           -7.65%
Kops/s          448             422             -5.80%

cairo trimmed benchmarks :

Speedups
========
t-firefox-asteroids  533.92 -> 497.92 :  1.07x
    t-midori-zoomed  692.98 -> 651.24 :  1.06x

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
ad612c4205 vmx: implement fast path vmx_composite_src_x888_8888
It was benchmarked against commit id 2be523b from pixman/master

POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              1115.4          5006.49         +348.85%
L2              1112.26         4338.01         +290.02%
M               1110.54         2524.15         +127.29%
HT              745.41          1140.03         +52.94%
VT              749.03          1287.13         +71.84%
R               423.91          547.6           +29.18%
RT              205.79          194.98          -5.25%
Kops/s          1414            1361            -3.75%

cairo trimmed benchmarks :

Speedups
========
t-gnome-system-monitor  1402.62  -> 1212.75 :  1.16x
   t-firefox-asteroids   533.92  ->  474.50 :  1.13x

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
fafc1d403b vmx: implement fast path vmx_composite_over_n_8888_8888_ca
It was benchmarked against commit id 2be523b from pixman/master

POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.

reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              61.92            244.91          +295.53%
L2              62.74            243.3           +287.79%
M               63.03            241.94          +283.85%
HT              59.91            144.22          +140.73%
VT              59.4             174.39          +193.59%
R               53.6             111.37          +107.78%
RT              37.99            46.38           +22.08%
Kops/s          436              506             +16.06%

cairo trimmed benchmarks :

Speedups
========
t-xfce4-terminal-a1  1540.37 -> 1226.14 :  1.26x
t-firefox-talos-gfx  1488.59 -> 1209.19 :  1.23x

Slowdowns
=========
        t-evolution  553.88  -> 581.63  :  1.05x
          t-poppler  364.99  -> 383.79  :  1.05x
t-firefox-scrolling  1223.65 -> 1304.34 :  1.07x

The slowdowns can be explained in cases where the images are small and
un-aligned to 16-byte boundary. In that case, the function will first
work on the un-aligned area, even in operations of 1 byte. In case of
small images, the overhead of such operations can be more than the
savings we get from using the vmx instructions that are done on the
aligned part of the image.

In the C fast-path implementation, there is no special treatment for the
un-aligned part, as it works in 4 byte quantities on the entire image.

Because llbb is a synthetic test, I would assume it has far fewer
alignment issues than a "real-world" scenario, such as the cairo
benchmarks, which are basically recorded traces of real application
activity.
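
The cost described above can be made concrete with a sketch that splits
a scanline into the part handled pixel by pixel and the part handled
with 16-byte vectors. The structure and function names are hypothetical,
and the real vmx loops are assumed, not shown, to follow this general
shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct split {
    size_t head;   /* pixels processed one at a time before alignment */
    size_t body;   /* pixels processed four at a time (16 bytes) */
    size_t tail;   /* leftover pixels processed one at a time */
};

/* Split a run of width 32-bit pixels starting at address addr. */
static struct split split_run(uintptr_t addr, size_t width)
{
    struct split s = { 0, 0, 0 };

    assert(addr % 4 == 0);   /* 32-bit pixels are 4-byte aligned */

    /* Unaligned head: advance one pixel at a time up to a 16-byte
     * boundary, or until the run ends. */
    while (width > 0 && (addr & 15) != 0) {
        s.head++;
        addr += 4;
        width--;
    }

    s.body = width & ~(size_t)3;
    s.tail = width & 3;
    return s;
}
```

For a 3-pixel run starting 4 bytes past a 16-byte boundary the whole run
falls into the head, so the vector code never executes and only the
setup overhead remains.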

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
a3e914407e vmx: implement fast path composite_add_8888_8888
Copied impl. from sse2 file and edited to use vmx functions

It was benchmarked against commit id 2be523b from pixman/master

POWER8, 16 cores, 3.4GHz, ppc64le :

reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              248.76          3284.48         +1220.34%
L2              264.09          2826.47         +970.27%
M               261.24          2405.06         +820.63%
HT              217.27          857.3           +294.58%
VT              213.78          980.09          +358.46%
R               176.61          442.95          +150.81%
RT              107.54          150.08          +39.56%
Kops/s          917             1125            +22.68%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
d5b5343c7d vmx: implement fast path composite_add_8_8
Copied impl. from sse2 file and edited to use vmx functions

It was benchmarked against commit id 2be523b from pixman/master

POWER8, 16 cores, 3.4GHz, ppc64le :

reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              687.63          9140.84         +1229.33%
L2              715             7495.78         +948.36%
M               717.39          8460.14         +1079.29%
HT              569.56          1020.12         +79.11%
VT              520.3           1215.56         +133.63%
R               514.81          874.35          +69.84%
RT              341.28          305.42          -10.51%
Kops/s          1621            1579            -2.59%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
339eeaf095 vmx: implement fast path composite_over_8888_8888
Copied impl. from sse2 file and edited to use vmx functions

It was benchmarked against commit id 2be523b from pixman/master

POWER8, 16 cores, 3.4GHz, ppc64le :

reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              129.47          1054.62         +714.57%
L2              138.31          1011.02         +630.98%
M               139.99          1008.65         +620.52%
HT              122.11          468.45          +283.63%
VT              121.06          532.21          +339.62%
R               108.48          240.5           +121.70%
RT              77.87           116.7           +49.87%
Kops/s          758             981             +29.42%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
0cc8a2e971 vmx: implement fast path vmx_fill
Based on sse2 impl.

It was benchmarked against commit id e2d211a from pixman/master

Tested cairo trimmed benchmarks on POWER8, 8 cores, 3.4GHz,
RHEL 7.1 ppc64le :

speedups
========
     t-swfdec-giant-steps  1383.09 ->  718.63  :  1.92x speedup
   t-gnome-system-monitor  1403.53 ->  918.77  :  1.53x speedup
              t-evolution  552.34  ->  415.24  :  1.33x speedup
      t-xfce4-terminal-a1  1573.97 ->  1351.46 :  1.16x speedup
      t-firefox-paintball  847.87  ->  734.50  :  1.15x speedup
      t-firefox-asteroids  565.99  ->  492.77  :  1.15x speedup
t-firefox-canvas-swscroll  1656.87 ->  1447.48 :  1.14x speedup
          t-midori-zoomed  724.73  ->  642.16  :  1.13x speedup
   t-firefox-planet-gnome  975.78  ->  911.92  :  1.07x speedup
          t-chromium-tabs  292.12  ->  274.74  :  1.06x speedup
     t-firefox-chalkboard  690.78  ->  653.93  :  1.06x speedup
      t-firefox-talos-gfx  1375.30 ->  1303.74 :  1.05x speedup
   t-firefox-canvas-alpha  1016.79 ->  967.24  :  1.05x speedup

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
c12ee95089 vmx: add helper functions
This patch adds the following helper functions to enable code reuse,
hide BE/LE differences and improve maintainability.

All of the functions were defined as static force_inline.

Names were copied from pixman-sse2.c so conversion of fast-paths between
sse2 and vmx would be easier from now on. Therefore, I tried to keep the
input/output of the functions to be as close as possible to the sse2
definitions.

The functions are:

- load_128_aligned       : load 128-bit from a 16-byte aligned memory
                           address into a vector

- load_128_unaligned     : load 128-bit from memory into a vector,
                           without guarantee of alignment for the
                           source pointer

- save_128_aligned       : save 128-bit vector into a 16-byte aligned
                           memory address

- create_mask_16_128     : take a 16-bit value and fill with it
                           a new vector

- create_mask_1x32_128   : take a 32-bit pointer and fill a new
                           vector with the 32-bit value from that pointer

- create_mask_32_128     : take a 32-bit value and fill with it
                           a new vector

- unpack_32_1x128        : unpack 32-bit value into a vector

- unpacklo_128_16x8      : unpack the eight low 8-bit values of a vector

- unpackhi_128_16x8      : unpack the eight high 8-bit values of a vector

- unpacklo_128_8x16      : unpack the four low 16-bit values of a vector

- unpackhi_128_8x16      : unpack the four high 16-bit values of a vector

- unpack_128_2x128       : unpack the eight low 8-bit values of a vector
                           into one vector and the eight high 8-bit
                           values into another vector

- unpack_128_2x128_16    : unpack the four low 16-bit values of a vector
                           into one vector and the four high 16-bit
                           values into another vector

- unpack_565_to_8888     : unpack an RGB_565 vector to 8888 vector

- pack_1x128_32          : pack a vector and return the LSB 32-bit of it

- pack_2x128_128         : pack two vectors into one and return it

- negate_2x128           : xor two vectors with mask_00ff (separately)

- is_opaque              : returns whether all the pixels contained in
                           the vector are opaque

- is_zero                : returns whether the vector equals 0

- is_transparent         : returns whether all the pixels
                           contained in the vector are transparent

- expand_pixel_8_1x128   : expand an 8-bit pixel into lower 8 bytes of a
                           vector

- expand_alpha_1x128     : expand alpha from vector and return the new
                           vector

- expand_alpha_2x128     : expand alpha from one vector and another alpha
                           from a second vector

- expand_alpha_rev_2x128 : expand a reversed alpha from one vector and
                           another reversed alpha from a second vector

- pix_multiply_2x128     : do pix_multiply for two vectors (separately)

- over_2x128             : perform over op. on two vectors

- in_over_2x128          : perform in-over op. on two vectors

v2: removed expand_pixel_32_1x128 as it was not used by any function and
its implementation was erroneous

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
034149537b vmx: add LOAD_VECTOR macro
This patch adds a macro for loading a single vector.
It also makes the other LOAD_VECTORx macros use this macro as a base so
code would be re-used.

In addition, I fixed minor coding style issues.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Nemanja Lukic
7441340256 MIPS: update author's e-mail address
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-11 23:08:02 +03:00
Pekka Paalanen
e2d211ac49 lowlevel-blt-bench: add option to skip memcpy measurement
The memcpy speed measurement takes several seconds. When you are running
single tests in a harness that iterates dozens or hundreds of times, the
repeated measurements are redundant and take a lot of time. It is also
an open question whether the measured speed changes over long test runs
due to unidentified platform reasons (Raspberry Pi).

Add a command line option to set the reference memcpy speed, skipping
the measuring.

The speed is mainly used to compute how many iterations to run inside
the bench_*() functions, so for repeated testing on the same hardware,
it makes sense to lock that number to a constant.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:50 +03:00
Pekka Paalanen
31cb0d4267 lowlevel-blt-bench: add CSV output mode
Add a command line option for choosing CSV output mode.

In CSV mode, only the results in Mpixels/s are printed in an easily
machine-parseable format. All user-friendly printing is suppressed.

This is intended for cases where you benchmark one particular operation
at a time. Running the "all" set of benchmarks will print just fine, but
you may have trouble matching rows to operations as you have to look at
the tests_tbl[] to see what row is which.

Reviewed-by: Ben Avison <bavison@riscosopen.org>

v2: don't add a space after comma in CSV.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-06 12:04:32 +03:00
Pekka Paalanen
9a7e0bc6d0 lowlevel-blt-bench: refactor to Mpx_per_sec()
Refactor the Mpixels/s computations into a function. Easier to read and
better documents what is being computed.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:27 +03:00
Pekka Paalanen
6e9c48c579 lowlevel-blt-bench: all bench funcs to return pix_cnt
The bench_* functions, that did not already do it, are modified to
return the number of pixels processed during the benchmark. This moves
the computation to the site that actually determines the number, and
simplifies bench_composite() a bit.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:22 +03:00
Pekka Paalanen
9e8f2bcaf5 lowlevel-blt-bench: move speed and scaling printing
Move the printing of the memory speed and scaling mode into a new
function. This will help with implementing a machine-readable output
option.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:18 +03:00
Pekka Paalanen
a33c2e6853 lowlevel-blt-bench: print single pattern details
When given just a single test pattern instead of "all", print the test
details. This can be used to verify the pattern parser agrees with the
user, just like scaling settings are printed.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:12 +03:00
Pekka Paalanen
3ac7ae2017 lowlevel-blt-bench: make test_entry::testname const
We assign string literals to it, so it better be const.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:07 +03:00
Pekka Paalanen
56d8b365f5 lowlevel-blt-bench: move explanation printing
Move explanation printing to a new function. This will help with
implementing a machine-readable output option.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:03 +03:00
Pekka Paalanen
bddff993ed lowlevel-blt-bench: move usage to a function
Move printing of usage into a new function and use argv[0] as the
program name. This will help printing usage from multiple places.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:03:28 +03:00
Oded Gabbay
2be523b204 vmx: fix pix_multiply for ppc64le
vec_mergeh/l operates differently for BE and LE, because of the order of
the vector elements (l->r in BE and r->l in LE).
To fix that, we simply need to swap between the input parameters, in case
we are working in LE.

v2:

- replace _LITTLE_ENDIAN with WORDS_BIGENDIAN for consistency
- fixed whitespaces and indentation issues

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:04:41 +03:00
Oded Gabbay
8d379ad88e vmx: fix unused var warnings
v2: don't put ';' at the end of macro definition. Instead, move it to
    each line the macro is used.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:04:34 +03:00
Oded Gabbay
ff66a4a3ce vmx: encapsulate the temporary variables inside the macros
v2: fixed whitespaces and indentation issues

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:04:27 +03:00
Fernando Seiti Furusato
f6a26d0925 vmx: adjust macros when loading vectors on ppc64le
Replaced usage of vec_lvsl to direct unaligned assignment
operation (=). That is because, according to Power ABI Specification,
the usage of lvsl is deprecated on ppc64le.

Changed COMPUTE_SHIFT_{MASK,MASKS,MASKC} macro usage to no-op for powerpc
little endian since unaligned access is supported on ppc64le.

v2:

- replace _LITTLE_ENDIAN with WORDS_BIGENDIAN for consistency
- fixed whitespaces and indentation issues

Signed-off-by: Fernando Seiti Furusato <ferseiti@linux.vnet.ibm.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:04:15 +03:00
Oded Gabbay
b3a61703f4 vmx: fix splat_alpha for ppc64le
The permutation vector isn't correct for LE, so correct its values
in case we are in LE mode.

v2:

- replace _LITTLE_ENDIAN with WORDS_BIGENDIAN for consistency
- change #ifndef to #ifdef for readability

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:03:54 +03:00
Ben Avison
eebc1b7820 mmx/sse2: Use SIMPLE_NEAREST_SOLID_MASK_FAST_PATH for NORMAL repeat
These two architectures were the only place where
SIMPLE_NEAREST_SOLID_MASK_FAST_PATH was used, and in both cases the
equivalent SIMPLE_NEAREST_SOLID_MASK_FAST_PATH_NORMAL macro was used
immediately afterwards, so including the NORMAL case in the main macro
simplifies the fast path table.

[Pekka: removed extra comma from the end of
 SIMPLE_NEAREST_SOLID_MASK_FAST_PATH]

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:57:09 +03:00
Ben Avison
7f66928079 mmx/sse2: Use SIMPLE_NEAREST_FAST_PATH macro
There is some reordering, but the only significant thing to ensure that
the same routine is chosen is that a COVER fast path for a given
combination of operator and source/destination pixel formats must
precede all the variants of repeated fast paths for the same
combination. This patch (and the other mmx/sse2 one) still follows that
rule.

I believe that in every other case, the set of operations that match any
pair of fast paths that are reordered in these patches are mutually
exclusive. While there will be a very subtle timing difference due to
the distance through the table we have to search to find a match
(sometimes faster, sometimes slower) there is no evidence that the tables
have been carefully ordered by frequency of occurrence - just for ease
of copy-and-pasting.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:57:00 +03:00
Ben Avison
dee5000abb mips: Retire PIXMAN_MIPS_SIMPLE_NEAREST_A8_MASK_FAST_PATH
This macro does exactly the same thing as the platform-neutral macro
SIMPLE_NEAREST_A8_MASK_FAST_PATH.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:56:54 +03:00
Ben Avison
4c70d18acc arm: Simplify PIXMAN_ARM_SIMPLE_NEAREST_A8_MASK_FAST_PATH
This macro is a superset of the platform-neutral macro
SIMPLE_NEAREST_A8_MASK_FAST_PATH. In other words, in addition to the
_COVER, _NONE and _PAD suffixes, its expansion includes the _NORMAL suffix.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:56:45 +03:00
Ben Avison
de255e6a5e arm: Retire PIXMAN_ARM_SIMPLE_NEAREST_FAST_PATH
This macro does exactly the same thing as the platform-neutral macro
SIMPLE_NEAREST_FAST_PATH.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:56:29 +03:00
Ben Avison
62a772f2ea test: Fix solid-test for big-endian targets
When generating test data, we need to make sure the interpretation of
the data is the same regardless of endianness. That is, the pixel value
for each channel is the same on both little and big-endians.

This fixes a test failure on ppc64 (big-endian).

Tested-by: Fernando Seiti Furusato <ferseiti@linux.vnet.ibm.com> (ppc64le, ppc64, powerpc)
Tested-by: Ben Avison <bavison@riscosopen.org> (armv6l, armv7l, i686)
[Pekka: added commit message]
Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Tested-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> (x86_64)
2015-06-01 13:11:15 +03:00
Ben Avison
82f9b4faaf test: Add new fuzz tester targeting solid images
This places a heavier emphasis on solid images than the other fuzz testers,
and tests both single-pixel repeating bitmap images as well as those created
using pixman_image_create_solid_fill(). In the former case, it also
exercises the case where the bitmap contents are written to after the
image's first use, which is not a use-case that any other test has
previously covered.

[Pekka: added the default case to the switch in test_solid ().]

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-05-15 16:30:21 +03:00
James Cowgill
cf086d4949 MIPS: Drop #ifdef __ELF__ in definition of LEAF_MIPS32R2
Commit 6d2cf40166 ("MIPS: Fix exported symbols in public API") attempted to
add a .hidden assembly directive, conditional on the code being compiled for an
ELF target. Unfortunately the #ifdef added was already inside a macro and
wasn't expanded properly by the preprocessor.

Fix by removing the check. It's unlikely there are many non-ELF MIPS systems
around anyway.

Fixes: Bug 83358 (https://bugs.freedesktop.org/83358)
Fixes: 6d2cf40166 ("MIPS: Fix exported symbols in public API")
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Cc: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
Cc: Nemanja Lukic <nemanja.lukic@rt-rk.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-05-07 12:49:09 +03:00
Bill Spitzak
6f14bae79e test: Added more demos and tests to .gitignore file
Uses a wildcard to handle the majority which end in "-test".

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-05-05 09:49:25 +03:00
Ben Avison
e0c0153d8e test: Add a new benchmarker targeting affine operations
Affine-bench is written by following the example of lowlevel-blt-bench.

Affine-bench differs from lowlevel-blt-bench in the following:
- does not test different sized operations fitting to specific caches,
  destination is always 1920x1080
- allows defining the affine transformation parameters
- carefully computes operation extents to hit the COVER_CLIP fast paths

Original version by Ben Avison. Changes by Pekka in v3:
- commit message
- style fixes
- more comments
- refactoring (e.g. bench_info_t)
- help output tweak

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-24 10:25:42 +03:00
Pekka Paalanen
58e21d3e45 lowlevel-blt-bench: use a8r8g8b8 for CA solid masks
When doing component alpha with a solid mask, use a mask format that has
all the color channels instead of just a8. As Ben Avison explains it:

"Lowlevel-blt-bench initialises all its images using memset(0xCC) so an
a8 solid image would be converted by _pixman_image_get_solid() to
0xCC000000 whereas an a8r8g8b8 would be 0xCCCCCCCC. When you're not in
component alpha mode, only the alpha byte matters for the mask image,
but in the case of component alpha operations, a fast path might decide
that it can save itself a lot of multiplications if it spots that 3
constant mask components are already 0."

No (default) test so far has a solid mask with CA. This is just
future-proofing lowlevel-blt-bench to do what one would expect.
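
The difference between the two solid values can be illustrated with a
minimal sketch. This is not _pixman_image_get_solid() itself; the helper
names solid_from_a8() and ca_mask_color_is_zero() are hypothetical and
only mirror the arithmetic described in the quote above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen an a8 pixel to a8r8g8b8 layout.
 * Only the alpha byte is populated; the color channels stay 0. */
static uint32_t
solid_from_a8 (uint8_t alpha)
{
    return (uint32_t) alpha << 24;
}

/* Hypothetical helper: the kind of test a component-alpha fast path
 * could use to detect that all three color components of a constant
 * mask are already 0. */
static int
ca_mask_color_is_zero (uint32_t mask)
{
    assert ((mask >> 24) <= 0xff);
    return (mask & 0x00ffffffu) == 0;
}
```

With memset(0xCC) data, solid_from_a8 (0xCC) gives 0xCC000000, for which
ca_mask_color_is_zero() is true, while an a8r8g8b8 pixel 0xCCCCCCCC does
not trigger the shortcut.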

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-20 16:18:18 +03:00
Pekka Paalanen
be49f929b6 lowlevel-blt-bench: use the test pattern parser
Let lowlevel-blt-bench parse the test name string from the command line,
allowing it to run almost infinitely more tests. One is no longer limited
to the tests listed in the big table.

While you can use the old short-hand names like src_8888_8888, you can
also use all possible operators now, and specify pixel formats exactly
rather than just x888, for instance.

This even allows running crazy patterns like
conjoint_over_reverse_a8b8g8r8_n_r8g8b8x8.

All individual patterns are now interpreted through the parser. The
pattern "all" runs the same old default test set as before but through
the parser instead of the hard-coded parameters.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:43:01 +03:00
Pekka Paalanen
5b27912108 lowlevel-blt-bench: add test name parser and self-test
This patch is inspired by "lowlevel-blt-bench: Parse test name strings in
general case" by Ben Avison. From Ben's commit message:

"There are many types of composite operation that are useful to benchmark
but which are omitted from the table. Continually having to add extra
entries to the table is a nuisance and is prone to human error, so this
patch adds the ability to break down unknown strings of the format
  <operation>_<src>[_<mask>]_<dst>[_ca]
where bitmap formats are specified by number of bits of each component
(assumed in ARGB order) or 'n' to indicate a solid source or mask."

Add the parser to lowlevel-blt-bench.c, but do not hook it up to the
command line just yet. Instead, make it run a self-test.

As we now dynamically parse strings similar to the test names in the
huge table 'tests_tbl', we should make sure we can parse the old
well-known test names and produce exactly the same test parameters. The
self-test goes through this old table and verifies the parsing results.

Unfortunately the old table is not exactly consistent, it contains some
special cases that cannot be produced by the parsing rules. Whether
these special cases are intentional or just an oversight is not always
clear. Anyway, add a small table to reproduce the special cases
verbatim.

If we wanted, we could remove the big old table in a follow-up commit,
but then we would also lose the parser self-test.

The point of this whole exercise is to let lowlevel-blt-bench recognize
novel test patterns in the future, following exactly the conventions
used in the old table.

Ben, from what I see, this parser has one major difference to what you
wrote. For a solid mask, your parser uses a8r8g8b8 format, while mine
uses a8 which comes from the old table.
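
As an illustration only, the <operation>_<src>[_<mask>]_<dst>[_ca] layout
could be split roughly as below. This is a simplified sketch, not the
parser added by this patch: pattern_t, known_ops[] and parse_pattern()
are hypothetical names, the operator table is deliberately tiny, and the
format tokens are kept as strings instead of being resolved to pixman
formats.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct
{
    char op[32], src[16], mask[16], dst[16];
    int  has_mask, component_alpha;
} pattern_t;

/* Hypothetical, incomplete operator list. Operator names may contain
 * underscores, so the longest matching prefix wins. */
static const char *const known_ops[] = {
    "src", "add", "over", "over_reverse", "in", "in_reverse",
    "conjoint_over_reverse", NULL
};

/* Returns 0 on success, -1 if the name does not fit the pattern. */
static int
parse_pattern (const char *name, pattern_t *out)
{
    char buf[96], *tok[3], *p;
    size_t best = 0, len;
    int i, n = 0;

    assert (name != NULL && out != NULL);
    memset (out, 0, sizeof *out);

    for (i = 0; known_ops[i]; i++)
    {
        size_t l = strlen (known_ops[i]);
        if (l > best && l < sizeof out->op &&
            strncmp (name, known_ops[i], l) == 0 && name[l] == '_')
            best = l;
    }
    if (best == 0)
        return -1;
    memcpy (out->op, name, best);

    if (strlen (name + best + 1) >= sizeof buf)
        return -1;
    strcpy (buf, name + best + 1);

    /* An optional trailing "_ca" selects component alpha. */
    len = strlen (buf);
    if (len > 3 && strcmp (buf + len - 3, "_ca") == 0)
    {
        out->component_alpha = 1;
        buf[len - 3] = '\0';
    }

    /* What remains is <src>_<dst> or <src>_<mask>_<dst>. */
    for (p = strtok (buf, "_"); p; p = strtok (NULL, "_"))
    {
        if (n == 3 || strlen (p) >= sizeof out->src)
            return -1;
        tok[n++] = p;
    }
    if (n < 2)
        return -1;

    strcpy (out->src, tok[0]);
    if (n == 3)
    {
        strcpy (out->mask, tok[1]);
        out->has_mask = 1;
    }
    strcpy (out->dst, tok[n - 1]);
    return 0;
}
```

Matching the operator first is a design choice of this sketch: it keeps
names such as conjoint_over_reverse in one piece before the remaining
tokens are counted.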

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:42:51 +03:00
Pekka Paalanen
1f45bd6565 test/utils: add format aliases used by lowlevel-blt-bench
Lowlevel-blt-bench uses several pixel format shorthands. Pick them from
the great table in lowlevel-blt-bench.c and add them here so that
format_from_string() can recognize them.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:42:45 +03:00
Pekka Paalanen
ef9c28a0e4 test/utils: add operator aliases for lowlevel-blt-bench
Lowlevel-blt-bench uses the operator alias "outrev". Add an alias for it
in the operator-name table.

Also add aliases for overrev, inrev and atoprev, so that
lowlevel-blt-bench can later recognize them for new test cases.

The aliases are added such that an operator-to-name lookup will never
return them; it returns the proper names instead.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:42:40 +03:00
Pekka Paalanen
f1f6cc23ce test/utils: support format name aliases
Previously there was a flat list of formats, used to iterate over all
formats when looking up a format from name or listing them. This cannot
support name aliases.

To support name aliases (multiple name strings mapping to the same
format), create a format-name mapping table. Functions format_name(),
format_from_string(), and list_formats() should keep on working exactly
like before, except format_from_string() now recognizes the additional
formats that format_name() already supported.

Only the formats from the old format list are added with ENTRY, so
that list_formats() works as before. The whole list is verified against
the authoritative list in pixman.h, entries missing from the old list
are commented out.

The extra formats supported by the old format_name() are added as
ALIASes. A side-effect of that is that now also format_from_string()
recognizes the following new names: x4c4 / c8, x4g4 / g8, c4, g4, g1,
yuy2, yv12, null, solid, pixbuf, rpixbuf, unknown.

Name aliases will be useful in follow-up patches, where
lowlevel-blt-bench.c is converted to parse short-hand format names from
strings.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:42:33 +03:00
Pekka Paalanen
2c5fac9320 test/utils: support operator name aliases
Previously there was a flat list of operators (pixman_op_t), used to
iterate over all operators when looking up an operator from name or
listing them. This cannot support name aliases.

To support name aliases (multiple name strings mapping to the same
operator), create an operator-name mapping table. Functions
operator_name, operator_from_string, and list_operators should keep on
working exactly like before, except operator_from_string now recognizes
a few aliases too.

Name aliases will be useful in follow-up patches, where
lowlevel-blt-bench.c is converted to parse operator names from strings.
Lowlevel-blt-bench uses shorthand names instead of the usual names. This
change allows lowlevel-blt-bench.c to use operator_from_string in the
future.
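
A mapping table of this kind might look like the sketch below. It is an
assumption-laden illustration, not the table in test/utils.c: op_t,
op_names[], op_from_string() and op_name() are hypothetical stand-ins,
and the name strings are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pixman_op_t. */
typedef enum { OP_OVER, OP_OVER_REVERSE, OP_OUT_REVERSE } op_t;

typedef struct
{
    op_t        op;
    const char *name;
    int         is_alias;   /* aliases are accepted but never returned */
} op_name_t;

static const op_name_t op_names[] = {
    { OP_OVER,         "over",         0 },
    { OP_OVER_REVERSE, "over_reverse", 0 },
    { OP_OVER_REVERSE, "overrev",      1 },
    { OP_OUT_REVERSE,  "out_reverse",  0 },
    { OP_OUT_REVERSE,  "outrev",       1 },
};

#define N_OP_NAMES (sizeof op_names / sizeof op_names[0])

/* Name -> operator: proper names and aliases both match.
 * Returns 1 and stores the operator on success, 0 otherwise. */
static int
op_from_string (const char *s, op_t *out)
{
    size_t i;

    assert (s != NULL && out != NULL);
    for (i = 0; i < N_OP_NAMES; i++)
    {
        if (strcmp (op_names[i].name, s) == 0)
        {
            *out = op_names[i].op;
            return 1;
        }
    }
    return 0;
}

/* Operator -> name: alias rows are skipped, so only the proper
 * name can be returned. */
static const char *
op_name (op_t op)
{
    size_t i;

    for (i = 0; i < N_OP_NAMES; i++)
        if (op_names[i].op == op && !op_names[i].is_alias)
            return op_names[i].name;
    return NULL;
}
```

Keeping aliases in the same table, marked by a flag, lets listing and
reverse lookup behave as they did with the flat list while the
string lookup gains the extra spellings.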

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:41:47 +03:00
Ben Avison
f122907dc1 test: Move format and operator string functions to utils.[ch]
This permits format_from_string(), list_formats(), list_operators() and
operator_from_string() to be used from tests other than check-formats.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-04-13 10:11:51 +03:00
Ben Avison
9bc025f7cd pixman.c: Coding style
A few violations of coding style were identified in code copied from here
into affine-bench.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-04-09 12:04:55 +03:00
Ben Avison
978dd9fc65 armv6: Fix typo in preload macro
Missing "lsl" meant that in cases with a 32-bit source and/or mask, and an
8-bit destination, the code would not assemble.
2015-04-01 18:38:36 -07:00
Siarhei Siamashka
594e6a6c93 mmx: Fix _mm_empty problems for over_8888_8888/over_8888_n_8888
Using "--disable-sse2 --disable-ssse3" configure options and
CFLAGS="-m32 -O2 -g" on an x86 system results in pixman "make check"
failures:

    ../test-driver: line 95: 29874 Aborted
    FAIL: affine-test
    ../test-driver: line 95: 29887 Aborted
    FAIL: scaling-test

One _mm_empty () was missing and another one is needed to work around
an old GCC bug https://gcc.gnu.org/PR47759 (GCC may move MMX instructions
around and cause test suite failures).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-24 14:25:30 -07:00
Søren Sandmann Pedersen
a8669137b9 Fix comment about BILINEAR_INTERPOLATION_BITS to say < 8 rather than <= 8
Since a4c79d695d the constant
BILINEAR_INTERPOLATION_BITS must be strictly less than 8, so fix the
comment to say this, and also add a COMPILE_TIME_ASSERT in the
bilinear fetcher in pixman-fast-path.c
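
For illustration, a compile-time check of this constraint can be written
as in the sketch below. STATIC_CHECK and BILINEAR_BITS are hypothetical
names used only here; the value 7 is an assumption for the example, and
the macro is not claimed to match pixman's COMPILE_TIME_ASSERT.

```c
#include <assert.h>

/* Assumed value, for the example only. */
#define BILINEAR_BITS 7

/* Hypothetical compile-time check: a false condition yields an array
 * type of negative size, which the compiler rejects. */
#define STATIC_CHECK(cond)                                      \
    do {                                                        \
        typedef char static_check_failed[(cond) ? 1 : -1];      \
        (void) sizeof (static_check_failed);                    \
    } while (0)

/* Number of distinct interpolation weights for the configured bits. */
static int
bilinear_weight_count (void)
{
    STATIC_CHECK (BILINEAR_BITS < 8);   /* strictly less than 8 */
    assert (BILINEAR_BITS > 0);
    return 1 << BILINEAR_BITS;
}
```

Changing BILINEAR_BITS to 8 in this sketch stops the file from compiling,
which is the point of checking at build time rather than in a comment.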
2014-10-05 12:42:47 -07:00
Matt Turner
f078727f39 mmx: Add nearest over_8888_8888
lowlevel-blt-bench -n, over_8888_8888, 15 iterations on Loongson 2f:

           Before          After
          Mean StdDev     Mean StdDev   Change
    L1    15.8   0.02     24.0   0.06   +52.0%
    L2    14.8   0.15     23.3   0.13   +56.9%
    M     10.3   0.01     13.8   0.03   +33.6%
    HT    10.0   0.02     14.5   0.05   +44.7%
    VT     9.7   0.02     13.5   0.04   +39.2%
    R      9.1   0.01     12.2   0.04   +34.4%
    RT     7.1   0.06      8.9   0.09   +25.2%
2014-09-05 00:22:07 -07:00
Matt Turner
f868ff5e34 mmx: Add nearest over_8888_n_8888
lowlevel-blt-bench -n, over_8888_n_8888, 15 iterations on Loongson 2f:

           Before          After
          Mean StdDev     Mean StdDev   Change
    L1     9.7   0.01     19.2   0.02   +98.2%
    L2     9.6   0.11     19.2   0.16   +99.5%
    M      7.3   0.02     12.5   0.01   +72.0%
    HT     6.6   0.01     13.4   0.02  +103.2%
    VT     6.4   0.01     12.6   0.03   +96.1%
    R      6.3   0.01     11.2   0.01   +76.5%
    RT     4.4   0.01      8.1   0.03   +82.6%
2014-09-05 00:22:04 -07:00
Julien Cristau
b483955605 Upload to unstable 2014-08-23 22:16:47 -07:00
intrigeri
c03d98f8d1 Enable hardening build flags with dpkg-buildflags.
All default dpkg-buildflags, plus the bonus bindnow one, are used.
The last available one (PIE) is not applicable to shared libraries.
2014-08-23 22:16:13 -07:00
Cyril Brulebois
b16d4c7ed7 Upload to unstable 2014-08-18 22:52:38 +02:00
Julien Cristau
f9c2d54a62 Disable vmx on ppc64el (closes: #745547).
Thanks, Breno Leitao!
2014-07-24 22:43:15 +02:00
Julien Cristau
cd23302b1a Upload to unstable 2014-07-13 16:31:09 +02:00
Julien Cristau
98eadfa08b Remove Cyril from Uploaders. 2014-07-13 16:31:02 +02:00
Julien Cristau
9e8362a51f Bump debhelper compat level to 9. 2014-07-13 16:24:29 +02:00
Julien Cristau
6a7a144be1 Bump changelogs 2014-07-13 16:22:42 +02:00
Julien Cristau
fd99d1a9c8 pixman 0.32.6 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQEcBAABAgAGBQJTuIYBAAoJEIWlZJw4kjNuHA8H/0wBuk7d/7uqCAfqyQ3o5Qs9
 q00UvsZVCymC6f1Hh+bgQGtMHgy2Wo1gvw/usSoxlxqc+T4wWeN912RPZwvprVzn
 v9+J7UyjLH28yUVq9NBn91LqHEWfzLK8gf3Y3i3IIQNd9YtIkqjPMyKDuTaQVUYc
 Op6vzXzjzwf0lKjTTZOWsnm9Zh6vvFoqVOajS6hSvA20/xczknAbU3HfUIBI+G4o
 6/br7A6OpIB08vFAJd1XJpAkrHjjIJCECg3wxsfxuCYcoRSWhUPoul4IEkHXn4p4
 mTKjTzBxuDM85FAadTT7PxygABelcQljMlJPKwY4rJwz5t8/yFLc5h5WXft2laI=
 =z2mk
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.32.6' into debian-unstable

pixman 0.32.6 release
2014-07-13 16:21:47 +02:00
Søren Sandmann Pedersen
87eea99e44 Pre-release version bump to 0.32.6 2014-07-05 18:55:43 -04:00
Siarhei Siamashka
9f18ea3483 configure.ac: Check if the compiler supports GCC vector extensions
The Intel Compiler 14.0.0 claims version GCC 4.7.3 compatibility
via __GNUC__/__GNUC_MINOR__ macros, but does not provide the same
level of GCC vector extensions support as the original GCC compiler:
    http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html

Which results in the following compilation failure:

In file included from ../test/utils.h(7),
                 from ../test/utils.c(3):
../test/utils-prng.h(138): error: expression must have integral type
      uint32x4 e = x->a - ((x->b << 27) + (x->b >> (32 - 27)));
                            ^

The problem is fixed by doing a special check in configure for
this feature.
2014-07-04 20:52:59 -04:00
Søren Sandmann
50d7b5fa8e create_bits(): Cast the result of height * stride to size_t
In create_bits() both height and stride are ints, so the result is
also an int, which will overflow if height or stride are big enough
and size_t is bigger than int.

This patch simply casts height to size_t to prevent these overflows,
which prevents the crash in:

    https://bugzilla.redhat.com/show_bug.cgi?id=972647

It's not even close to fixing the full problem of supporting big
images in pixman.

See also

    https://bugs.freedesktop.org/show_bug.cgi?id=69014
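
The effect of the cast can be shown with a small sketch. buffer_size() is
a hypothetical helper, not the code in create_bits(); it only
demonstrates that converting an operand to size_t before multiplying
makes the product a size_t instead of an int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte size of a height x stride buffer.
 * Without the cast, height * stride would be computed as int and could
 * overflow before being widened to size_t. */
static size_t
buffer_size (int height, int stride)
{
    assert (height >= 0 && stride >= 0);
    return (size_t) height * (size_t) stride;
}
```

On a platform where size_t is wider than int, buffer_size (65536, 65536)
yields 4294967296, a value that does not fit in a 32-bit int.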
2014-07-04 20:50:58 -04:00
Nemanja Lukic
6d2cf40166 MIPS: Fix exported symbols in public API. 2014-07-03 13:35:21 -04:00
Nemanja Lukic
c42824ebb5 MIPS: Fix exported symbols in public API. 2014-07-03 13:34:53 -04:00
Søren Sandmann Pedersen
5a2edb3f2c test: Rearrange tests in order of increasing runtime
Making short tests run first is convenient to catch obvious bugs
early.
2014-06-28 19:24:27 -04:00
Søren Sandmann Pedersen
9cd283b2eb pixman-gradient-walker: Make left_x and right_x 64 bit variables
The variables left_x, and right_x in gradient_walker_reset() are
computed from pos, which is a 64 bit quantity, so to avoid overflows,
these variables must be 64 bit as well.

Similarly, the left_x and right_x that are stored in
pixman_gradient_walker_t need to be 64 bit as well; otherwise,
pixman_gradient_walker_pixel() will call reset too often.

This fixes the radial-invalid test, which was generating 'invalid'
floating point exceptions when the overflows caused color values to be
outside of [0, 255].
2014-05-15 13:29:58 -04:00
Søren Sandmann Pedersen
f5f5dbbbc6 test: Add radial-invalid test program
This program demonstrates a bug in gradient walker, where some integer
overflows cause colors outside the range [0, 255] to be generated,
which in turn cause 'invalid' floating point exceptions when those
colors are converted to uint8_t.

The bug was first reported by Owen Taylor on the #cairo IRC channel.
2014-05-15 13:29:38 -04:00
Ben Avison
91f32ce961 ARMv6: Add fast path for src_x888_0565
Benchmark results, "before" is upstream/master
5f661ee719, and "after" contains this
patch on top.

lowlevel-blt-bench, src_8888_0565, 100 iterations:

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1    25.9   0.20    115.6   0.70    100.00%    +347.1%
L2    14.4   0.23     52.7   3.48    100.00%    +265.0%
M     14.1   0.01     79.8   0.17    100.00%    +465.9%
HT    10.2   0.03     32.9   0.31    100.00%    +221.2%
VT     9.8   0.03     29.8   0.25    100.00%    +203.4%
R      9.4   0.03     27.8   0.18    100.00%    +194.7%
RT     4.6   0.04     10.9   0.29    100.00%    +135.9%

At most 19 outliers rejected per test per set.

cairo-perf-trace with trimmed traces results were indifferent.

A system-wide perf_3.10 profile on Raspbian shows significant
differences in the X server CPU usage. The following were measured from
a 130x62 char lxterminal running 'dmesg' every 0.5 seconds for roughly
30 seconds. These profiles are libpixman.so symbols only.

Before:

Samples: 63K of event 'cpu-clock', Event count (approx.): 2941348112, DSO: libpixman-1.so.0.33.1
 37.77%  Xorg  [.] fast_fetch_r5g6b5
 14.39%  Xorg  [.] pixman_composite_over_n_8_8888_asm_armv6
  8.51%  Xorg  [.] fast_write_back_r5g6b5
  7.38%  Xorg  [.] pixman_composite_src_8888_8888_asm_armv6
  4.39%  Xorg  [.] pixman_composite_add_8_8_asm_armv6
  3.69%  Xorg  [.] pixman_composite_src_n_8888_asm_armv6
  2.53%  Xorg  [.] _pixman_image_validate
  2.35%  Xorg  [.] pixman_image_composite32

After:

Samples: 31K of event 'cpu-clock', Event count (approx.): 3619782704, DSO: libpixman-1.so.0.33.1
 22.36%  Xorg  [.] pixman_composite_over_n_8_8888_asm_armv6
 13.59%  Xorg  [.] pixman_composite_src_x888_0565_asm_armv6
 12.75%  Xorg  [.] pixman_composite_src_8888_8888_asm_armv6
  6.79%  Xorg  [.] pixman_composite_add_8_8_asm_armv6
  5.95%  Xorg  [.] pixman_composite_src_n_8888_asm_armv6
  4.12%  Xorg  [.] pixman_image_composite32
  3.69%  Xorg  [.] _pixman_image_validate
  3.65%  Xorg  [.] _pixman_bits_image_setup_accessors

Before, fast_fetch_r5g6b5 + fast_write_back_r5g6b5 took 46% of the
samples in libpixman, and probably incurred some memcpy() load, too.
After, pixman_composite_src_x888_0565_asm_armv6 takes 14%. Note, that
the sample counts are very different before/after, as less time is spent
in Pixman and running time is not exactly the same.

Furthermore, in the above test, the CPU idle function was sampled 9%
before, and 15% after.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Re-benchmarked on Raspberry Pi, commit message.
2014-05-01 15:11:42 -04:00
Pekka Paalanen
5f661ee719 ARM: use pixman_asm_function in internal headers
The two ARM headers contained open-coded copies of pixman_asm_function,
replace these.

Since it seems customary that ARM headers do not use CPP include guards,
rely on the .S files to #include "pixman-arm-asm.h" first. They all
do now.

v2: Fix a build failure on rpi by adding one #include.
2014-04-21 20:38:09 -04:00
Ben Avison
ab587b444c ARMv6: Add fast path for in_reverse_8888_8888
Benchmark results, "before" is the patch
* upstream/master 4b76bbfda6
+ ARMv6: Support for very variable-hungry composite operations
+ ARMv6: Add fast path for over_n_8888_8888_ca
and "after" contains the additional patches on top:
+ ARMv6: Add fast path flag to force no preload of destination buffer
+ ARMv6: Add fast path for in_reverse_8888_8888 (this patch)

lowlevel-blt-bench, in_reverse_8888_8888, 100 iterations:

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1    21.1   0.07     32.3   0.08    100.00%     +52.9%
L2    11.6   0.29     18.0   0.52    100.00%     +54.4%
M     10.5   0.01     16.1   0.03    100.00%     +54.1%
HT     8.2   0.02     12.0   0.04    100.00%     +45.9%
VT     8.1   0.02     11.7   0.04    100.00%     +44.5%
R      8.1   0.02     11.3   0.04    100.00%     +39.7%
RT     4.8   0.04      6.1   0.09    100.00%     +27.3%

At most 12 outliers rejected per test per set.

cairo-perf-trace with trimmed traces, 30 iterations:

                                    Before          After
                                   Mean StdDev     Mean StdDev   Confidence   Change
t-firefox-paintball.trace          18.0   0.01     14.1   0.01    100.00%     +27.4%
t-firefox-chalkboard.trace         36.7   0.03     36.0   0.02    100.00%      +1.9%
t-firefox-canvas-alpha.trace       20.7   0.22     20.3   0.22    100.00%      +1.9%
t-swfdec-youtube.trace              7.8   0.03      7.8   0.03    100.00%      +0.9%
t-firefox-talos-gfx.trace          25.8   0.44     25.6   0.29     93.87%      +0.7%  (insignificant)
t-firefox-talos-svg.trace          20.6   0.04     20.6   0.03    100.00%      +0.2%
t-firefox-fishbowl.trace           21.2   0.04     21.1   0.02    100.00%      +0.2%
t-xfce4-terminal-a1.trace           4.8   0.01      4.8   0.01     98.85%      +0.2%  (insignificant)
t-swfdec-giant-steps.trace         14.9   0.03     14.9   0.02     99.99%      +0.2%
t-poppler-reseau.trace             22.4   0.11     22.4   0.08     86.52%      +0.2%  (insignificant)
t-gnome-system-monitor.trace       17.3   0.03     17.2   0.03     99.74%      +0.2%
t-firefox-scrolling.trace          24.8   0.12     24.8   0.11     70.15%      +0.1%  (insignificant)
t-firefox-particles.trace          27.5   0.18     27.5   0.21     48.33%      +0.1%  (insignificant)
t-grads-heat-map.trace              4.4   0.04      4.4   0.04     16.61%      +0.0%  (insignificant)
t-firefox-fishtank.trace           13.2   0.01     13.2   0.01      7.64%      +0.0%  (insignificant)
t-firefox-canvas.trace             18.0   0.05     18.0   0.05      1.31%      -0.0%  (insignificant)
t-midori-zoomed.trace               8.0   0.01      8.0   0.01     78.22%      -0.0%  (insignificant)
t-firefox-planet-gnome.trace       10.9   0.02     10.9   0.02     64.81%      -0.0%  (insignificant)
t-gvim.trace                       33.2   0.21     33.2   0.18     38.61%      -0.1%  (insignificant)
t-firefox-canvas-swscroll.trace    32.2   0.09     32.2   0.11     73.17%      -0.1%  (insignificant)
t-firefox-asteroids.trace          11.1   0.01     11.1   0.01    100.00%      -0.2%
t-evolution.trace                  13.0   0.05     13.0   0.05     91.99%      -0.2%  (insignificant)
t-gnome-terminal-vim.trace         19.9   0.14     20.0   0.14     97.38%      -0.4%  (insignificant)
t-poppler.trace                     9.8   0.06      9.8   0.04     99.91%      -0.5%
t-chromium-tabs.trace               4.9   0.02      4.9   0.02    100.00%      -0.6%

At most 6 outliers rejected per test per set.

Cairo perf reports the running time, but the change is computed for
operations per second instead (inverse of running time).
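
As an illustration of the note above, the change for a time-based
benchmark can be derived from throughput (1 / running time). This is a
minimal sketch; ops_change_percent is a hypothetical name, not part of
cairo-perf or pixman.

```c
#include <assert.h>

/* Hypothetical helper: percentage change in operations per second,
 * given the running times before and after (same units).
 * Throughput is 1/time, so the throughput ratio is before/after. */
double ops_change_percent(double time_before, double time_after)
{
    assert(time_before > 0.0 && time_after > 0.0);
    return (time_before / time_after - 1.0) * 100.0;
}
```

With the rounded means of t-firefox-paintball.trace (18.0 before, 14.1
after) this gives roughly +27.7%, close to the +27.4% in the table; the
small gap is presumably because the table was computed from unrounded
means.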

Confidence is based on Welch's t-test. Absolute changes of less than 1%
can be attributed to measurement error, even if statistically
significant.

There was a question of why FLAG_NO_PRELOAD_DST is used. It makes
lowlevel-blt-bench results worse except for L1, but improves some
Cairo trace benchmarks.

"Ben Avison" <bavison@riscosopen.org> wrote:

> The thing with the lowlevel-blt-bench benchmarks for the more
> sophisticated composite types (as a general rule, anything that involves
> branches at the per-pixel level) is that they are only profiling the case
> where you have mid-level alpha values in the source/mask/destination.
> Real-world images typically have a disproportionate number of fully
> opaque and fully transparent pixels, which is why when there's a
> discrepancy between which implementation performs best with cairo-perf
> trace versus lowlevel-blt-bench, I usually favour the Cairo winner.
>
> The results of removing FLAG_NO_PRELOAD_DST (in other words, adding
> preload of the destination buffer) are easy to explain in the
> lowlevel-blt-bench results. In the L1 case, the destination buffer is
> already in the L1 cache, so adding the preloads is simply adding extra
> instruction cycles that have no effect on memory operations. The "in"
> compositing operator depends upon the alpha of both source and
> destination, so if you use uniform mid-alpha, then you actually do need
> to read your destination pixels, so you benefit from preloading them. But
> for fully opaque or fully transparent source pixels, you don't need to
> read the corresponding destination pixel - it'll either be left alone or
> overwritten. Since the ARM11 doesn't use write-allocate caching, both of
> these cases avoid both the time taken to load the extra cachelines, as
> well as increasing the efficiency of the cache for other data. If you
> examine the source images being used by the Cairo test, you'll probably
> find they mostly use transparent or opaque pixels.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Rebased, re-benchmarked on Raspberry Pi, commit message.

v5, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Rebased, re-benchmarked on Raspberry Pi due to a fix to
	"ARMv6: Add fast path for over_n_8888_8888_ca" patch.
2014-04-21 20:34:26 -04:00
Ben Avison
68d2f7b486 ARMv6: Add fast path flag to force no preload of destination buffer 2014-04-21 20:34:26 -04:00
Ben Avison
4ad769cbec ARMv6: Add fast path for over_n_8888_8888_ca
Benchmark results, "before" is
* upstream/master 4b76bbfda6
"after" contains the additional patches on top:
+ ARMv6: Support for very variable-hungry composite operations
+ ARMv6: Add fast path for over_n_8888_8888_ca (this patch)

lowlevel-blt-bench, over_n_8888_8888_ca, 100 iterations:

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1     2.7   0.00     16.1   0.06    100.00%    +500.7%
L2     2.4   0.01     14.1   0.15    100.00%    +489.9%
M      2.3   0.00     14.3   0.01    100.00%    +510.2%
HT     2.2   0.00      9.7   0.03    100.00%    +345.0%
VT     2.2   0.00      9.4   0.02    100.00%    +333.4%
R      2.2   0.01      9.5   0.03    100.00%    +331.6%
RT     1.9   0.01      5.5   0.07    100.00%    +192.7%

At most 1 outlier rejected per test per set.

cairo-perf-trace with trimmed traces, 30 iterations:

                                    Before          After
                                   Mean StdDev     Mean StdDev   Confidence   Change
t-firefox-talos-gfx.trace          33.1   0.42     25.8   0.44    100.00%     +28.6%
t-firefox-scrolling.trace          31.4   0.11     24.8   0.12    100.00%     +26.3%
t-gnome-terminal-vim.trace         22.4   0.10     19.9   0.14    100.00%     +12.5%
t-evolution.trace                  13.9   0.07     13.0   0.05    100.00%      +6.5%
t-firefox-planet-gnome.trace       11.6   0.02     10.9   0.02    100.00%      +6.5%
t-gvim.trace                       34.0   0.21     33.2   0.21    100.00%      +2.4%
t-chromium-tabs.trace               4.9   0.02      4.9   0.02    100.00%      +1.0%
t-poppler.trace                     9.8   0.05      9.8   0.06    100.00%      +0.7%
t-firefox-canvas-swscroll.trace    32.3   0.10     32.2   0.09    100.00%      +0.4%
t-firefox-paintball.trace          18.1   0.01     18.0   0.01    100.00%      +0.3%
t-poppler-reseau.trace             22.5   0.09     22.4   0.11     99.29%      +0.3%
t-firefox-canvas.trace             18.1   0.06     18.0   0.05     99.29%      +0.2%
t-xfce4-terminal-a1.trace           4.8   0.01      4.8   0.01     99.77%      +0.2%
t-firefox-fishbowl.trace           21.2   0.03     21.2   0.04    100.00%      +0.2%
t-gnome-system-monitor.trace       17.3   0.03     17.3   0.03     99.54%      +0.1%
t-firefox-asteroids.trace          11.1   0.01     11.1   0.01    100.00%      +0.1%
t-midori-zoomed.trace               8.0   0.01      8.0   0.01     99.98%      +0.1%
t-grads-heat-map.trace              4.4   0.04      4.4   0.04     34.08%      +0.1%  (insignificant)
t-firefox-talos-svg.trace          20.6   0.03     20.6   0.04     54.06%      +0.0%  (insignificant)
t-firefox-fishtank.trace           13.2   0.01     13.2   0.01     52.81%      -0.0%  (insignificant)
t-swfdec-giant-steps.trace         14.9   0.02     14.9   0.03     85.50%      -0.1%  (insignificant)
t-firefox-chalkboard.trace         36.6   0.02     36.7   0.03    100.00%      -0.2%
t-firefox-canvas-alpha.trace       20.7   0.32     20.7   0.22     55.76%      -0.3%  (insignificant)
t-swfdec-youtube.trace              7.8   0.02      7.8   0.03    100.00%      -0.5%
t-firefox-particles.trace          27.4   0.16     27.5   0.18     99.94%      -0.6%

At most 4 outliers rejected per test per set.

Cairo perf reports the running time, but the change is computed for
operations per second instead (inverse of running time).

Confidence is based on Welch's t-test. Absolute changes of less than 1%
can be attributed to measurement error, even if statistically
significant.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Use pixman_asm_function instead of startfunc.
	Rebased. Re-benchmarked on Raspberry Pi.
	Commit message.

v5, Ben Avison <bavison@riscosopen.org> :
	Fixed the bug exposed in blitters-test 4928372.
	15 hours of testing, compared to the 45 minutes to hit
	the bug originally.
    Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Squash the fix, re-benchmark on Raspberry Pi.
2014-04-21 20:34:26 -04:00
Ben Avison
73d2f8b61a ARMv6: Support for very variable-hungry composite operations
Previously, the variable ARGS_STACK_OFFSET was available to extract values
from function arguments during the init macro. Now this changes dynamically
around stack operations in the function as a whole so that arguments can be
accessed at any point. It is also joined by LOCALS_STACK_OFFSET, which
allows access to space reserved on the stack during the init macro.

On top of this, composite macros now have the option of using all of the
WK0-WK3 registers rather than just the subset they were told to use; this
requires the pixel count to be spilled to the stack over the leading pixels
at the start of each line. Thus, at best, each composite operation can use 11 registers,
plus any pointer registers not required for the composite type, plus as much
stack space as it needs, divided up into constants and variables as necessary.
2014-04-21 20:34:26 -04:00
Søren Sandmann
857e40f3d2 create_bits(): Cast the result of height * stride to size_t
In create_bits() both height and stride are ints, so the result is
also an int, which will overflow if height or stride are big enough
and size_t is bigger than int.

This patch simply casts height to size_t to prevent these overflows,
which prevents the crash in:

    https://bugzilla.redhat.com/show_bug.cgi?id=972647

It's not even close to fixing the full problem of supporting big
images in pixman.

See also

    https://bugs.freedesktop.org/show_bug.cgi?id=69014
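
The cast described above can be sketched as follows. This is a minimal
illustration under stated assumptions, not the actual create_bits()
code; buffer_size is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the idea of the fix: widen one operand
 * to size_t before multiplying, so the product is computed in size_t
 * rather than in int. */
size_t buffer_size(int height, int stride)
{
    assert(height >= 0 && stride >= 0);
    return (size_t) height * stride;
}
```

Where size_t is 64 bits wide, buffer_size(65536, 65536) yields 2^32,
whereas the same product in 32-bit int arithmetic would overflow.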
2014-04-15 14:21:14 -04:00
Pekka Paalanen
4b76bbfda6 ARM: share pixman_asm_function definition
Several files contain identical definitions of the asm macro pixman_asm_function.
Merge all these definitions into a new asm header.

The original definition is taken from pixman-arm-simd-asm-scaled.S with
the copyright/licence/author blurb verbatim.
2014-04-02 12:48:26 +03:00
Ben Avison
4ee85b0083 ARMv6: Add fast path for over_reverse_n_8888
Benchmark results, "before" is upstream commit
c343846 lowlevel-blt-bench: add in_reverse_8888_8888 test
and "after" is with this patch only added on top.

lowlevel-blt-bench, over_reverse_n_8888, 100 iterations:

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1    15.1    0.1    274.5    2.3    100.00%   +1718.9%
L2    12.8    0.3    181.8    0.7    100.00%   +1315.5%
M     10.8    0.0     77.9    0.0    100.00%    +621.2%
HT     9.7    0.0     29.4    0.2    100.00%    +204.9%
VT     9.5    0.0     26.7    0.1    100.00%    +179.3%
R      9.3    0.0     25.3    0.1    100.00%    +173.6%
RT     6.0    0.1     11.0    0.2    100.00%     +82.9%

At most 16 outliers rejected per case per set.

cairo-perf-trace with trimmed traces, 30 iterations:

                                    Before          After
                                   Mean StdDev     Mean StdDev   Confidence   Change
t-poppler.trace                    12.9    0.1      9.7    0.0    100.00%     +32.6%
t-firefox-talos-gfx.trace          33.2    0.7     32.9    0.4     95.23%      +0.9%  (insignificant)
t-firefox-particles.trace          27.4    0.1     27.3    0.2     99.65%      +0.4%
t-firefox-canvas-alpha.trace       20.5    0.3     20.5    0.3     57.51%      +0.3%  (insignificant)
t-poppler-reseau.trace             22.4    0.1     22.4    0.1     95.69%      +0.3%  (insignificant)
t-firefox-fishtank.trace           13.2    0.0     13.2    0.0     99.84%      +0.1%
t-swfdec-giant-steps.trace         14.9    0.0     14.9    0.0     87.68%      +0.1%  (insignificant)
t-swfdec-youtube.trace              7.8    0.0      7.8    0.0     35.22%      +0.1%  (insignificant)
t-firefox-planet-gnome.trace       11.5    0.0     11.5    0.0     29.37%      +0.0%  (insignificant)
t-firefox-fishbowl.trace           21.2    0.0     21.2    0.0     18.09%      +0.0%  (insignificant)
t-grads-heat-map.trace              4.4    0.0      4.4    0.0      1.84%      +0.0%  (insignificant)
t-firefox-paintball.trace          18.0    0.0     18.0    0.0     33.43%      -0.0%  (insignificant)
t-firefox-talos-svg.trace          20.5    0.0     20.5    0.1     68.56%      -0.1%  (insignificant)
t-midori-zoomed.trace               8.0    0.0      8.0    0.0     99.98%      -0.1%
t-firefox-canvas-swscroll.trace    32.1    0.1     32.1    0.1     85.27%      -0.1%  (insignificant)
t-gnome-system-monitor.trace       17.2    0.0     17.2    0.0     99.97%      -0.2%
t-firefox-chalkboard.trace         36.5    0.0     36.6    0.0    100.00%      -0.2%
t-firefox-asteroids.trace          11.1    0.0     11.1    0.0    100.00%      -0.2%
t-firefox-canvas.trace             17.9    0.0     18.0    0.0    100.00%      -0.3%
t-chromium-tabs.trace               4.9    0.0      4.9    0.0     97.95%      -0.3%  (insignificant)
t-xfce4-terminal-a1.trace           4.8    0.0      4.8    0.0    100.00%      -0.4%
t-firefox-scrolling.trace          31.1    0.1     31.2    0.1    100.00%      -0.5%
t-evolution.trace                  13.7    0.1     13.8    0.1     99.99%      -0.6%
t-gnome-terminal-vim.trace         22.0    0.2     22.2    0.1     99.99%      -0.7%
t-gvim.trace                       33.2    0.2     33.5    0.2    100.00%      -0.8%

At most 6 outliers rejected per case per set.

Cairo perf reports the running time, but the change is computed for
operations per second instead (inverse of running time).

Changes on the order of +/- 1% can be attributed to measurement errors,
even if they are deemed to be statistically significant. This claim is
based on comparing two 30-iteration identical "before" runs using the
exact same binaries, and observing changes from -0.4% to +0.5% with
>=99% confidence.

Confidence is based on Welch's t-test.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Rebased, re-benchmarked on Raspberry Pi, commit message.
2014-04-02 12:46:24 +03:00
Siarhei Siamashka
56622140e3 test: Fix OpenMP clauses for the tolerance-test
Compiling with the Intel Compiler reveals a problem:

tolerance-test.c(350): error: index variable "i" of for statement following an OpenMP for pragma must be private
  #       pragma omp parallel for default(none) shared(i) private (result)
  ^

In addition to this, the 'result' variable also should not be private
(otherwise its value does not survive after the end of the loop). It
needs to be either shared or use the reduction clause to describe how
the results from multiple threads are combined together. Reduction
seems to be more appropriate here.
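
A minimal sketch of the clause combination argued for above, assuming a
boolean result that must hold for every iteration. The function and
predicate names are hypothetical and the loop body is not the one from
tolerance-test.c.

```c
#include <assert.h>

/* Example predicates (hypothetical, for illustration only). */
int is_non_negative(int i) { return i >= 0; }
int is_below_ten(int i)    { return i < 10; }

/* Returns 1 if check(i) holds for every i in [0, n), else 0.
 * 'result' is combined across threads with a logical-AND reduction;
 * the loop index is private to each thread. When OpenMP is not
 * enabled, the pragma is ignored and the loop runs serially. */
int all_pass(int n, int (*check)(int))
{
    int result = 1;
    int i;

    assert(n >= 0 && check != 0);

#pragma omp parallel for default(none) shared(n, check) private(i) reduction(&&:result)
    for (i = 0; i < n; ++i)
        result = result && check(i);

    return result;
}
```

Each thread starts from the identity value of && (true) and the partial
results are ANDed together at the end of the loop, so the value survives
the parallel region.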
2014-04-02 12:46:09 +03:00
Siarhei Siamashka
840912b311 configure.ac: Check if the compiler supports GCC vector extensions
The Intel Compiler 14.0.0 claims GCC 4.7.3 compatibility via the
__GNUC__/__GNUC_MINOR__ macros, but does not provide the same
level of GCC vector extensions support as the original GCC compiler:
    http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html

Which results in the following compilation failure:

In file included from ../test/utils.h(7),
                 from ../test/utils.c(3):
../test/utils-prng.h(138): error: expression must have integral type
      uint32x4 e = x->a - ((x->b << 27) + (x->b >> (32 - 27)));
                            ^

The problem is fixed by doing a special check in configure for
this feature.
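
A probe of the kind such a configure check might compile could look
like the sketch below. It is an assumption for illustration, not the
test program actually added to configure.ac; vector_probe is a
hypothetical name, and the code needs a compiler with GCC-style vector
extensions.

```c
#include <assert.h>
#include <stdint.h>

/* GCC-style vector of four 32-bit lanes (16 bytes). Compilers without
 * this extension are expected to reject the typedef or the operators. */
typedef uint32_t uint32x4 __attribute__ ((vector_size (16)));

/* Exercises vector subtraction, addition and shifts by a scalar, the
 * operations used in the failing expression, and returns lane 0. */
uint32_t vector_probe(uint32_t a, uint32_t b)
{
    uint32x4 va = { a, a, a, a };
    uint32x4 vb = { b, b, b, b };
    uint32x4 e = va - ((vb << 27) + (vb >> (32 - 27)));

    assert(e[0] == e[3]);   /* all lanes hold the same inputs */
    return e[0];
}
```

If this fails to compile, the build can fall back to a scalar code path
instead of trusting the __GNUC__ version macros.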
2014-04-02 12:46:04 +03:00
Ben Avison
c343846625 lowlevel-blt-bench: add in_reverse_8888_8888 test
in_reverse_8888_8888 is one of the more commonly used operations in the
cairo-perf-trace suite that hasn't been in lowlevel-blt-bench until now.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Split from "Add extra test to lowlevel-blt-bench and fix an
	existing one", new summary.
2014-03-20 08:33:05 -04:00
Ben Avison
898859f3d3 lowlevel-blt-bench: over_reverse_n_8888 needs solid source
v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Split from "Add extra test to lowlevel-blt-bench and fix an
	existing one", new summary.
2014-03-20 08:33:05 -04:00
Ben Avison
38317cbfde ARMv6: remove 1 instr per row in generate_composite_function
This knocks off one instruction per row. The effect is probably too small to
be measurable, but might as well be included. The second occurrence of this
sequence doesn't actually benefit at all, but is changed for consistency.

The saved instruction comes from combining the "and" inside the .if
statement with an earlier "tst". The "and" was normally needed, except
for in one special case, where bits 4-31 were all shifted off the top of
the register later on in preload_leading_step2, so we didn't care about
their values.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Remove "bits 0-3" from the comments, update patch summary, and
	augment message with Ben's suggestion.
2014-03-20 08:33:05 -04:00
Ben Avison
763a6d3e67 ARMv6: Fix indentation in the composite macros 2014-03-20 08:33:05 -04:00
Søren Sandmann
82d094654a Remove all the operators that use division from pixman-combine32.c
These are now handled by floating point combiners.
2014-01-04 16:13:27 -05:00
Søren Sandmann
ccb1df0c5e Copy the comments from pixman-combine32.c to pixman-combine-float.c
An upcoming commit will delete many of the operators from
pixman-combine32.c and rely on the ones in pixman-combine-float.c. The
comments about how the operators were derived are still useful though,
so copy them into pixman-combine-float.c before the deletion.
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
94244b0c40 utils.c: Set DEVIATION to 0.0128
Consider a HARD_LIGHT operation with the following pixels:

- source:           15      (6 bits)
- source alpha:     255     (8 bits)
- mask alpha:       223     (8 bits)
- dest              255     (8 bits)
- dest alpha:       0       (8 bits)

Since 2 times the source is less than source alpha, the first branch
of the hard light blend mode is taken:

        (1 - sa) * d + (1 - da) * s + 2 * s * d

Since da is 0 and d is 1, this degenerates to:

        (1 - sa) + 3 * s

Taking (src IN mask) into account along with the fact that sa is 1,
this becomes:

        (1 - ma) + 3 * s * ma

      = (1 - 223/255.0) + 3 * (15/63.0) * (223/255.0)

      = 0.7501400560224089

When computed with the source converted by bit replication to eight
bits, and additionally with the (src IN mask) part rounded to eight
bits, we get:

        ma = 223/255.0

        s * ma = (60 / 255.0) * (223/255.0) which rounds to 52 / 255

and the result is

        (1 - ma) + 3 * s * ma

      = (1 - 223/255.0) + 3 * 52/255.0

      = 0.7372549019607844

so now we have an error of 0.012885.
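
The arithmetic above can be reproduced with a short sketch. The helper
names are hypothetical, and the rounding helper is a plain
round-to-nearest division rather than pixman's own multiply macro.

```c
#include <assert.h>

/* Expand a 6-bit channel to 8 bits by bit replication. */
unsigned expand_6_to_8(unsigned v6)
{
    assert(v6 < 64);
    return (v6 << 2) | (v6 >> 4);
}

/* a * b / 255, rounded to nearest, for a and b in [0, 255]. */
unsigned mul_un8_rounded(unsigned a, unsigned b)
{
    return (2 * a * b + 255) / 510;
}

/* (1 - ma) + 3 * s * ma with no intermediate rounding. */
double hard_light_exact(void)
{
    double s = 15 / 63.0, ma = 223 / 255.0;
    return (1.0 - ma) + 3.0 * s * ma;
}

/* Same expression with s expanded to 8 bits and s * ma rounded
 * to 8 bits before the final sum. */
double hard_light_8bit(void)
{
    unsigned s8 = expand_6_to_8(15);            /* 60 */
    unsigned s_ma = mul_un8_rounded(s8, 223);   /* 52 */
    return (1.0 - 223 / 255.0) + 3.0 * s_ma / 255.0;
}
```

The difference hard_light_exact() - hard_light_8bit() comes out at about
0.012885, matching the error quoted above.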

Without making changes to the way pixman does integer
rounding/arithmetic, this error must then be considered
acceptable. Due to conservative computations in the test suite we can
however get away with 0.0128 as the acceptable deviation.

This fixes the remaining failures in pixel-test.
2014-01-04 16:13:27 -05:00
Søren Sandmann
15aa37adec Use floating point combiners for all operators that involve divisions
Consider a DISJOINT_ATOP operation with the following pixels:

- source:	0xff (8 bits)
- source alpha:	0x01 (8 bits)
- mask alpha:	0x7b (8 bits)
- dest:		0x00 (8 bits)
- dest alpha:	0xff (8 bits)

When (src IN mask) is computed in 8 bits, the resulting alpha channel
is 0 due to rounding:

     floor ((0x01 * 0x7b) / 255.0 + 0.5) = floor (0.9823) = 0

which means that since Render defines any division by zero as
infinity, the Fa and Fb for this operator end up as follows:

     Fa = max (1 - (1 - 1) / 0, 0) = 0

     Fb = min (1, (1 - 0) / 1) = 1

and so since dest is 0x00, the overall result is 0.

However, when computed in full precision, the alpha value no longer
rounds to 0, and so Fa ends up being

     Fa = max (1 - (1 - 1) / 0.0001, 0) = 1

and so the result is now

     s * ma * Fa + d * Fb

   = (1.0 * (0x7b / 255.0) * 1) + d * 0

   = 0x7b / 255.0

   = 0.4823

so the error in this case ends up being 0.48235294, which is clearly
not something that can be considered acceptable.
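
The two outcomes above can be checked with a small sketch. All names are
hypothetical, and only the Fa term is modelled because d = 0 makes the
d * Fb term vanish in this example.

```c
#include <assert.h>

/* Alpha of (src IN mask) rounded to 8 bits: floor(sa * ma / 255 + 0.5). */
unsigned in_alpha_8bit(unsigned sa, unsigned ma)
{
    assert(sa <= 255 && ma <= 255);
    return (2 * sa * ma + 255) / 510;
}

/* Fa = max(1 - (1 - da) / sa, 0), with division by zero taken as
 * infinity as described above, so sa == 0 yields Fa = 0. */
double disjoint_fa(double sa, double da)
{
    double f;

    if (sa == 0.0)
        return 0.0;
    f = 1.0 - (1.0 - da) / sa;
    return f > 0.0 ? f : 0.0;
}

/* s * ma * Fa for the example pixel. The flag selects whether the
 * alpha of (src IN mask) is rounded to 8 bits or kept in full precision. */
double disjoint_atop_example(int round_alpha_to_8_bits)
{
    double s = 1.0, ma = 0x7b / 255.0, da = 1.0;
    double sa = round_alpha_to_8_bits
        ? in_alpha_8bit(0x01, 0x7b) / 255.0
        : (0x01 / 255.0) * ma;

    return s * ma * disjoint_fa(sa, da);
}
```

disjoint_atop_example(1) returns 0, while disjoint_atop_example(0)
returns 0x7b / 255.0, about 0.4824, which is the discrepancy described
above.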

In order to avoid this problem, we need to do all arithmetic in such a
way that a multiplication of two tiny numbers can never end up being
zero unless one of the input numbers is itself zero.

This patch makes all computations that involve divisions take place in
floating point, which is sufficient to fix the test cases.

This brings the number of failures in pixel-test down to 14.
2014-01-04 16:13:27 -05:00
Søren Sandmann
8f38243163 Soft Light: Consistent approach to division by zero
The Soft Light operator has several branches. One of them is decided
based on whether 2 * s is less than or equal to 2 * sa. In floating
point implementations, when those two values are very close to each
other, it may not be completely predictable which branch we hit.

This is a problem because in one branch, when destination alpha is
zero, we get the result

      r = d * sa

and in the other we get

      r = 0

So when d and sa are not 0, this causes two different results to be
returned from essentially identical input values. In other words,
there is a discontinuity in the current implementation.

This patch arbitrarily changes the second branch such that it now returns
d * sa instead. There is no deep meaning behind this, because
essentially this is an attempt to assign meaning to division by zero,
and all that is required is that that meaning doesn't depend on minute
differences in input values.

This makes the number of failed pixels in pixel-test go down to 347.
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
89662adf77 pixman-combine32.c: Fix bugs related to integer promotion
In the component alpha part of the PDF_SEPARABLE_BLEND_MODE macro, the
expression ~RED_8 (m) is used. Because RED_8(m) gets promoted to int
before ~ is applied, the whole expression typically becomes some
negative value rather than (255 - RED_8(m)) as desired.

Fix this by using unsigned temporary variables.

This reduces the number of failures in pixel-test to 363.
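
The promotion issue can be seen in isolation with the sketch below. The
function names are hypothetical and a plain uint8_t stands in for the
value produced by RED_8 (m).

```c
#include <assert.h>
#include <stdint.h>

/* ~ applied to a uint8_t: the operand is promoted to int first,
 * so the result is -(m + 1), a negative int. */
int invert_promoted(uint8_t m)
{
    return ~m;
}

/* Storing the result in an unsigned 8-bit temporary truncates it
 * back to the intended 255 - m. */
unsigned invert_8bit(uint8_t m)
{
    uint8_t t = (uint8_t) ~m;

    assert(t == 255 - m);
    return t;
}
```

For m = 0x10, invert_promoted() gives -17 while invert_8bit() gives 239,
which is 255 - 16.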
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
e7a99b3b0f pixman/pixman-combine32.c: Bug fixes for separable blend modes
This commit fixes four separate bugs:

1. In the computation

      (1 - sa) * d + (1 - da) * s + sa * da * B(s, d)

   we were using regular addition for all four channels, but for
   superluminescent pixels, the addition could overflow causing
   nonsensical results.

2. The variables and return types used for the results of the blend
   mode calculations were unsigned, but for various blend modes (and
   especially with superluminescent pixels), the blend mode
   calculations could be negative, resulting in underflows.

3. The blend mode computations were returned as 8-bit values, which is
   not sufficient precision (especially considering that we need
   signed results).

4. The value before the final division by 255 was not properly clamped
   to [0, 255].

This patch fixes all those bugs. The blend mode computations are now
returned as signed 16 bit values with 1 represented as 255 * 255.

With these fixes, the number of failing pixels in pixel-test goes down
from 431 to 384.
2014-01-04 16:13:27 -05:00
Søren Sandmann
fe3504d03f pixel-test.c: Add a number of pixels that have failed at some point
This commit adds a large number of pixel regressions to
pixel-test. All of these have at some point been failing in
blend-mode-test, and most of them do fail currently.

To be specific, with this commit, pixel-test reports 431 failed tests.
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
bd94c17937 test/tolerance-test: New test program
This new test program is similar to test/composite in that it relies
on the pixel_checker_t API to do tolerance based verification. But
unlike the composite test, which verifies combinations of a fixed set
of pixels, this one generates random images and verifies that those
composite correctly.

Also unlike composite, tolerance-test supports all the separable blend
mode operators in addition to the original Render operators.

When tests fail, a C struct is printed that can be pasted into
pixel-test for regression purposes.

There is an option "--forever" which causes the random seed to be set
to the current time, and then the test runs until interrupted. This is
useful for overnight runs.

This test currently fails badly due to various bugs in the blend mode
operators. Later commits will fix those.
2014-01-04 16:13:27 -05:00
Søren Sandmann
c2fd65dba3 pixel-test: Command line argument to specify the regression to run
A new command line argument allows the user to specify which one of
the regressions should be run.
2014-01-04 16:13:27 -05:00
Søren Sandmann
a692e01600 pixel-test: Add support for mask pixels
Support is added to pixel-test for verifying operations involving
masks. If a regression includes a mask, it is verified with the
pixel_checker API in both unified and component alpha modes.
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
779ca46e98 test/check-formats.c: Add support for separable blend modes 2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
a42af27fc0 test/utils.c: Add support for separable blend mode ops to do_composite()
The implementations are copied from the floating point pipeline, but
use double precision instead of single precision.
2014-01-04 16:13:27 -05:00
Søren Sandmann
b29d74ef0c configure.ac: Check and use -Wno-unused-local-typedefs GCC option
With GCC 4.8.2 the COMPILE_TIME_ASSERT macro produces a spurious
warning about an unused local typedef:

    In file included from pixman.c:29:0:
    pixman.c: In function 'optimize_operator':
    pixman-private.h:1019:22: warning: typedef 'compile_time_assertion' locally defined but not used [-Wunused-local-typedefs]

The flag -Wno-unused-local-typedefs suppresses that warning.
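
For context, a typedef-based compile-time assertion of the kind the
warning text suggests can be sketched as below. The macro body is an
assumption for illustration; the exact definition in pixman-private.h
may differ, and check_type_sizes is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Typedef-based compile-time assertion: the array size becomes -1,
 * a compile error, when the condition is false. The typedef is never
 * used, which is what -Wunused-local-typedefs reports. */
#define COMPILE_TIME_ASSERT(x) \
    do { typedef int compile_time_assertion[(x) ? 1 : -1]; } while (0)

int check_type_sizes(void)
{
    COMPILE_TIME_ASSERT(sizeof(uint32_t) == 4);
    assert(sizeof(uint32_t) == 4);   /* run-time counterpart of the check */
    return 1;
}
```

The unused typedef is intentional here, so suppressing the warning for
the whole build is a reasonable trade-off.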
2013-12-26 09:41:53 -05:00
Julien Cristau
08ff9fa402 Upload to unstable 2013-12-17 22:04:30 +01:00
Julien Cristau
e66148cda6 Bump changelogs 2013-12-08 15:33:18 +01:00
Julien Cristau
9c9f210896 pixman 0.32.4 release

Merge tag 'pixman-0.32.4' into debian-unstable

pixman 0.32.4 release

Conflicts:
	configure.ac
2013-12-08 15:28:54 +01:00
Søren Sandmann
945ab7a6f3 Soft Light: The first comparison should be <=, not <
According to the definition of soft light, the first comparison is
less-than-or-equal, not less-than.
2013-12-03 18:14:24 -05:00
Søren Sandmann
9ba3a34797 general: Support component alpha for all image types
Currently, if you attempt to use component alpha on source images or
images without RGB channels, Pixman will silently just use unified
alpha instead. This patch makes such images supported for component
alpha.

There is no particularly compelling use case at the moment, but this
patch does get rid of a bit of special-case code both in
pixman-general.c and in test/composite.c.
2013-11-23 20:30:33 -05:00
Maarten Lankhorst
166899c913 release to sid 2013-11-18 15:55:02 +01:00
Maarten Lankhorst
7d8317abd4 Cherry-pick upstream bugfixes for fixing a crash when rendering invalid trapezoids. (LP: #1197921) 2013-11-18 15:54:49 +01:00
Ritesh Khadgaray
f740a26fe1 pixman_trapezoid_valid(): Fix underflow when bottom is close to MIN_INT
If t->bottom is close to MIN_INT (probably invalid value), subtracting
top can lead to underflow which causes crashes.  Attached patch will
fix the issue.

This fixes bug 67484.
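
One overflow-free way to express such a check is to compare the two
coordinates directly instead of subtracting them. The sketch below is an
assumption about the shape of the fix, not the actual
pixman_trapezoid_valid() code; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Overflow-prone form, shown only as a comment because the signed
 * subtraction can overflow when bottom is near INT32_MIN:
 *     return (bottom - top) >= 0;
 * Comparing the operands directly needs no subtraction. */
int height_is_non_negative(int32_t top, int32_t bottom)
{
    int ok = bottom >= top;

    assert(ok == 0 || ok == 1);
    return ok;
}
```

With bottom = INT32_MIN and top = 0 the comparison returns 0 without
evaluating any expression that could overflow.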

(cherry picked from commit 5e14da97f1)
2013-11-18 15:08:42 +01:00
Søren Sandmann Pedersen
f4acde9c71 test/trap-crasher.c: Add trapezoid that demonstrates a crash
This trapezoid causes a crash due to an underflow in
pixman_trapezoid_valid().

Test case from Ritesh Khadgaray.

(cherry picked from commit 2f876cf867)
2013-11-18 15:08:41 +01:00
Matt Turner
dae5a758e2 Post-release version bump to 0.32.5 2013-11-17 17:48:54 -08:00
Matt Turner
4b3a66b05e Pre-release version bump to 0.32.4 2013-11-17 17:46:52 -08:00
Søren Sandmann
97a655d5ca test/utils.c: Make the stack unaligned only on 32 bit Windows
The call_test_function() contains some assembly that deliberately
causes the stack to be aligned to 32 bits rather than 128 bits on
x86-32. The intention is to catch bugs that surface when pixman is
called from code that only uses a 32 bit alignment.

However, recent versions of GCC apparently make the assumption (either
accidentally or deliberately) that the incoming stack is aligned
to 128 bits, where older versions only seemed to make this assumption
when compiling with -msse2. This causes the vector code in the PRNG to
now segfault when called from call_test_function() on x86-32.

This patch fixes that by only making the stack unaligned on 32 bit
Windows, where it would definitely be incorrect for GCC to assume that
the incoming stack is aligned to 128 bits.

V2: Put "defined(...)" around __GNUC__

Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=491110
(cherry picked from commit f473fd1e75)
2013-11-17 17:45:56 -08:00
Jakub Bogusz
5a313af74e Fix the SSSE3 CPUID detection.
SSSE3 is detected by bit 9 of ECX, but we were checking bit 9 of EDX
which is APIC leading to SSSE3 routines being called on CPUs without
SSSE3.

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 8487dfbcd0)
2013-11-17 17:45:54 -08:00
Søren Sandmann
f473fd1e75 test/utils.c: Make the stack unaligned only on 32 bit Windows
The call_test_function() contains some assembly that deliberately
causes the stack to be aligned to 32 bits rather than 128 bits on
x86-32. The intention is to catch bugs that surface when pixman is
called from code that only uses a 32 bit alignment.

However, recent versions of GCC apparently make the assumption (either
accidentally or deliberately) that the incoming stack is aligned
to 128 bits, where older versions only seemed to make this assumption
when compiling with -msse2. This causes the vector code in the PRNG to
now segfault when called from call_test_function() on x86-32.

This patch fixes that by only making the stack unaligned on 32 bit
Windows, where it would definitely be incorrect for GCC to assume that
the incoming stack is aligned to 128 bits.

V2: Put "defined(...)" around __GNUC__

Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=491110
2013-11-17 17:44:51 -08:00
Jakub Bogusz
8487dfbcd0 Fix the SSSE3 CPUID detection.
SSSE3 is detected by bit 9 of ECX, but we were checking bit 9 of EDX
which is APIC leading to SSSE3 routines being called on CPUs without
SSSE3.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-12 12:59:42 -08:00
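The register mix-up described in this commit can be illustrated with a minimal sketch. It assumes the caller has already executed CPUID leaf 1 and passes in the resulting ECX and EDX values; the function and macro names are hypothetical and are not taken from pixman-x86.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1 feature bits.  Both flags sit at bit 9, but in
 * different output registers, which is what makes them easy to mix up. */
#define CPUID1_ECX_SSSE3 (1u << 9)   /* ECX bit 9: SSSE3 */
#define CPUID1_EDX_APIC  (1u << 9)   /* EDX bit 9: on-chip APIC */

/* Correct check: SSSE3 is reported in ECX. */
static bool
has_ssse3 (uint32_t ecx, uint32_t edx)
{
    (void) edx;   /* EDX carries no SSSE3 information */
    return (ecx & CPUID1_ECX_SSSE3) != 0;
}

/* The mistake described above: testing bit 9 of EDX reports the
 * presence of an APIC, not of SSSE3. */
static bool
has_ssse3_buggy (uint32_t ecx, uint32_t edx)
{
    (void) ecx;
    return (edx & CPUID1_EDX_APIC) != 0;
}
```

With ECX = 0 and EDX bit 9 set, the buggy variant reports SSSE3 on a CPU that lacks it, which matches the failure mode in the commit message.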
Søren Sandmann
917a52003d Post-release version bump to 0.32.3 2013-11-11 19:55:18 -05:00
Søren Sandmann
a980f83a68 Pre-release version bump to 0.32.2 2013-11-11 19:44:54 -05:00
Søren Sandmann
7410073110 demos/Makefile.am: Move EXTRA_DIST outside "if HAVE_GTK"
Without this, if tarballs are generated on a system that doesn't have
GTK+ 2 development headers available, the files in EXTRA_DIST will not
be included, which then causes builds from the tarball to fail on
systems that do have GTK+ 2 headers available.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=71465
2013-11-11 19:28:30 -05:00
Søren Sandmann
e2e3817021 demos/Makefile.am: Move EXTRA_DIST outside "if HAVE_GTK"
Without this, if tarballs are generated on a system that doesn't have
GTK+ 2 development headers available, the files in EXTRA_DIST will not
be included, which then causes builds from the tarball to fail on
systems that do have GTK+ 2 headers available.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=71465
2013-11-11 19:13:31 -05:00
Andrea Canciani
950d1310f7 test: Fix the win32 build
The win32 build has no config.h, so HAVE_CONFIG_H should be checked
before including it, as in utils.h.
2013-11-11 19:09:46 -05:00
Andrea Canciani
9bab46e9b8 test: Fix the win32 build
The win32 build has no config.h, so HAVE_CONFIG_H should be checked
before including it, as in utils.h.
2013-11-11 19:09:28 -05:00
Søren Sandmann
7a00965d7a Post-release version bump to 0.32.1 2013-11-11 19:07:35 -05:00
Søren Sandmann
ca5a4dec44 Post-release version bump to 0.33.1 2013-11-10 18:17:12 -05:00
Søren Sandmann
895e7e05b7 Pre-release version bump to 0.32.0 2013-11-10 18:05:47 -05:00
Søren Sandmann Pedersen
8cbc7da4e5 Post-release version bump to 0.31.3 2013-11-01 20:52:00 -04:00
Søren Sandmann Pedersen
99e8605be0 Pre-release version bump to 0.31.2 2013-11-01 20:39:46 -04:00
Ritesh Khadgaray
5e14da97f1 pixman_trapezoid_valid(): Fix underflow when bottom is close to MIN_INT
If t->bottom is close to MIN_INT (probably invalid value), subtracting
top can lead to underflow which causes crashes.  Attached patch will
fix the issue.

This fixes bug 67484.
2013-11-01 20:24:57 -04:00
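A minimal sketch of why the subtraction is dangerous, assuming 32-bit fixed-point coordinates. This is not the patch itself: both helper names are hypothetical, and the wraparound is emulated with unsigned arithmetic because signed overflow is undefined behavior in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: emulates "(bottom - top) > 0" with 32-bit
 * two's-complement wraparound.  Unsigned arithmetic keeps the
 * wraparound well defined. */
static bool
height_positive_by_subtraction (int32_t top, int32_t bottom)
{
    uint32_t diff = (uint32_t) bottom - (uint32_t) top;

    /* The wrapped difference looks positive when it is nonzero and
     * its sign bit is clear. */
    return diff != 0 && diff < 0x80000000u;
}

/* Hypothetical helper: a direct comparison cannot wrap. */
static bool
height_positive_by_comparison (int32_t top, int32_t bottom)
{
    return bottom > top;
}
```

For top = 10 and bottom = INT32_MIN + 1, the subtraction wraps to a large positive value and the trapezoid looks valid, while the direct comparison rejects it.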
Søren Sandmann Pedersen
2f876cf867 test/trap-crasher.c: Add trapezoid that demonstrates a crash
This trapezoid causes a crash due to an underflow in the
pixman_trapezoid_valid().

Test case from Ritesh Khadgaray.
2013-11-01 20:24:27 -04:00
Brad Smith
8ef7e0d18e Fix pixman build with older GCC releases
The following patch fixes building pixman with older GCC releases
such as GCC 3.3 and older (OpenBSD; some older archs use GCC 3.3.6)
by detecting the presence of __builtin_clz with an autoconf check
instead of the previous method. Compilers that pretend to be GCC,
implement __builtin_clz and are already utilizing the intrinsic
include LLVM/Clang, Open64, EKOPath and PCC.
2013-11-01 20:14:33 -04:00
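The configure-time approach can be sketched as follows. The macro name HAVE_BUILTIN_CLZ is an assumption about what the autoconf check defines, and count_leading_zeros is a hypothetical wrapper; the portable loop is only one possible fallback.

```c
#include <stdint.h>

/* Hypothetical wrapper: uses the compiler intrinsic only when the
 * build system has confirmed it exists (HAVE_BUILTIN_CLZ is an assumed
 * macro name), and falls back to a portable loop otherwise. */
static int
count_leading_zeros (uint32_t x)
{
    /* __builtin_clz is undefined for 0, so handle it up front */
    if (x == 0)
        return 32;

#ifdef HAVE_BUILTIN_CLZ
    return __builtin_clz (x);
#else
    {
        int n = 0;

        /* Shift left until the top bit is set, counting the steps */
        while (!(x & 0x80000000u))
        {
            x <<= 1;
            n++;
        }
        return n;
    }
#endif
}
```

Keying on a feature macro rather than on the compiler version means a compiler that merely claims to be GCC is only trusted with the intrinsic after the check has passed.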
Søren Sandmann Pedersen
3c2f4b6517 pixman-glyph.c: Add __force_align_arg_pointer to composite functions
The functions pixman_composite_glyphs_no_mask() and
pixman_composite_glyphs() can call into code compiled with -msse2,
which requires the stack to be aligned to 16 bytes. Since the ABIs on
Windows and Linux for x86-32 don't provide this guarantee, we need to
use this attribute to make GCC generate a prologue that realigns the
stack.

This fixes the crash introduced in the previous commit and also

   https://bugs.freedesktop.org/show_bug.cgi?id=70348

and

   https://bugs.freedesktop.org/show_bug.cgi?id=68300
2013-10-17 11:14:14 -04:00
Søren Sandmann Pedersen
3dce229772 utils.c: On x86-32 unalign the stack before calling test_function
GCC when compiling with -msse2 and -mssse3 will assume that the stack
is aligned to 16 bytes even on x86-32 and accordingly issue movdqa
instructions for stack allocated variables.

But despite what GCC thinks, the standard ABI on x86-32 only requires
a 4-byte aligned stack. This is true at least on Windows, but there
also was (and maybe still is) Linux code in the wild that assumed
this. When such code calls into pixman and hits something compiled
with -msse2, we get a segfault from the unaligned movdqas.

Pixman has worked around this issue in the past with the gcc attribute
"force_align_arg_pointer" but the problem has resurfaced now in

    https://bugs.freedesktop.org/show_bug.cgi?id=68300

because pixman_composite_glyphs() is missing this attribute.

This patch makes fuzzer_test_main() call the test_function through a
trampoline, which, on x86-32, has a bit of assembly that deliberately
avoids aligning the stack to 16 bytes as GCC normally expects. The
result is that glyph-test now crashes.

V2: Mark caller-save registers as clobbered, rather than using
noinline on the trampoline.
2013-10-17 11:14:14 -04:00
Siarhei Siamashka
9e81419ed5 configure.ac: check and use -Wdeclaration-after-statement GCC option
The accidental use of declaration after statement breaks compilation
with C89 compilers such as MSVC. Assuming that MSVC is one of the
supported compilers, it makes sense to ask GCC to at least report
warnings for such problematic code.
2013-10-14 00:27:04 +03:00
Siarhei Siamashka
a863bbcce0 sse2: bilinear fast path for src_x888_8888
Running cairo-perf-trace benchmark on Intel Core2 T7300:

Before:
[  0]    image    t-firefox-canvas-swscroll    1.989    2.008   0.43%    8/8
[  1]    image        firefox-canvas-scroll    4.574    4.609   0.50%    8/8

After:
[  0]    image    t-firefox-canvas-swscroll    1.404    1.418   0.51%    8/8
[  1]    image        firefox-canvas-scroll    4.228    4.259   0.36%    8/8
2013-10-14 00:26:51 +03:00
Søren Sandmann Pedersen
8f75f638ab configure.ac: Add check for pmulhuw assembly
Clang 3.0 chokes on the following bit of assembly

    asm ("pmulhuw %1, %0\n\t"
        : "+y" (__A)
        : "y" (__B)
    );

from pixman-mmx.c with this error message:

    fatal error: error in backend: Unsupported asm: input constraint
        with a matching output constraint of incompatible type!

So add a check in configure to only enable MMX when the compiler can
deal with it.
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
09a62d4dbc scale.c: Use int instead of kernel_t for values in named_int_t
The 'value' field in the 'named_int_t' struct is used for both
pixman_repeat_t and pixman_kernel_t values, so the type should be int,
not pixman_kernel_t.

Fixes some warnings like this

scale.c:124:33: warning: implicit conversion from enumeration
      type 'pixman_repeat_t' to different enumeration type
      'pixman_kernel_t' [-Wconversion]
    { "None",                   PIXMAN_REPEAT_NONE },
    ~                           ^~~~~~~~~~~~~~~~~~

when compiled with clang.
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
9367243801 pixman-combine32.c: Make Color Burn routine follow the math more closely
For superluminescent destinations, the old code could underflow in

    uint32_t r = (ad - d) * as / s;

when (ad - d) was negative. The new code avoids this problem (and
therefore causes changes in the checksums of thread-test and
blitters-test), but it is likely still buggy due to the use of
unsigned variables and other issues in the blend mode code.
2013-10-12 15:04:27 -04:00
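The underflow in the quoted expression can be reproduced in isolation. The sketch below is not the Color Burn routine from the commit; it only evaluates the single term (ad - d) * as / s with unsigned and with signed variables. The function names are hypothetical, and the caller is assumed to pass a nonzero s.

```c
#include <stdint.h>

/* The term as written in the old code: with unsigned variables,
 * (ad - d) wraps around to a huge value when d > ad. */
static uint32_t
burn_term_unsigned (uint32_t ad, uint32_t d, uint32_t as, uint32_t s)
{
    return (ad - d) * as / s;
}

/* The same term with signed variables: a superluminescent destination
 * (d > ad) simply yields a negative result. */
static int32_t
burn_term_signed (int32_t ad, int32_t d, int32_t as, int32_t s)
{
    return (ad - d) * as / s;
}
```

For ad = 100, d = 120, as = 255 and s = 128 the signed version gives -39, while the unsigned version wraps to a value far outside the 8-bit channel range.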
Søren Sandmann Pedersen
105fa74fad pixman-combine32: Make Color Dodge routine follow the math more closely
Change blend_color_dodge() to follow the math in the comment more
closely.

Note, the new code here is in some sense worse than the old code
because it can now underflow the unsigned variables when the source is
superluminescent and (as - s) is therefore negative. The old code was
careful to clamp to 0.

But for superluminescent variables we really need the ability for the
blend function to become negative, and so the solution to the underflow
problem is to just use signed variables. The use of unsigned variables
is a general problem in all of the blend mode code that will have to
be solved later.

The CRC32 values in thread-test and blitters-test are updated to
account for the changes in output.
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
2527a72432 pixman-combine32: Rename a number of variables from sa/sca to as/s
There are no semantic changes, just variable renames. The motivation
for these renames is so that the names are shorter and better match
the ones used in the comments.
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
eaa4778c42 pixman-combine32: Improve documentation for blend mode operators
This commit overhauls the comments in pixman-combine32.c regarding
blend modes:

- Add a link to the PDF supplement that clarifies the specification of
  ColorBurn and ColorDodge

- Clarify how the formulas for premultiplied colors are derived from
  the ones in the PDF specifications

- Write out the derivation of the formulas in each blend routine
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
4bf1502fe8 pixman-combine32.c: Formatting fixes
Fix a bunch of spacing issues.

V2: More spacing issues, in the _ca combiners
2013-10-12 15:04:00 -04:00
Andrea Canciani
54be1a52f7 Fix thread-test on non-OpenMP systems
The non-reentrant versions of prng_* functions are thread-safe only in
OpenMP-enabled builds.

Fixes thread-test failing when compiled with Clang (both on Linux and
on MacOS).
2013-10-09 18:23:27 +02:00
Andrea Canciani
0af2fcaebc Add support for SSSE3 to the MSVC build system
Handle SSSE3 just like MMX and SSE2.
2013-10-09 14:23:12 +02:00
Andrea Canciani
e4d9c623d3 Fix build of check-formats on MSVC
Fixes

check-formats.obj : error LNK2019: unresolved external symbol
_strcasecmp referenced in function _format_from_string

check-formats.obj : error LNK2019: unresolved external symbol
_snprintf referenced in function _list_operators
2013-10-09 14:23:11 +02:00
Andrea Canciani
96ad6ebd8b Fix building of "other" programs on MSVC
In d1434d112c the benchmarks have been
extended to include other programs as well and the variable names have
been updated accordingly in the autotools-based build system, but not
in the MSVC one.
2013-10-09 14:23:11 +02:00
Andrea Canciani
31ac784f34 Fix build on MSVC
After a4c79d695d the MMX and SSE2 code
has some declarations after the beginning of a block, which is not
allowed by MSVC.

Fixes multiple errors like:

pixman-mmx.c(3625) : error C2275: '__m64' : illegal use of this type
as an expression

pixman-sse2.c(5708) : error C2275: '__m128i' : illegal use of this
type as an expression
2013-10-09 14:23:11 +02:00
Søren Sandmann Pedersen
c89f4c8266 fast: Swap image and iter flags in generated fast paths
The generated fast paths that were moved into the 'fast'
implementation in ec0e38cbb7 had their
image and iter flag arguments swapped; as a result, none of the fast
paths were ever called.
2013-10-04 14:11:57 -04:00
Siarhei Siamashka
7d05a7f4dc vmx: there is no need to handle unaligned destination anymore
So the redundant variables, memory reads/writes and reshuffles
can be safely removed. For example, this makes the inner loop
of the 'vmx_combine_add_u_no_mask' function much simpler.

Before:

    7a20:7d a8 48 ce lvx     v13,r8,r9
    7a24:7d 80 48 ce lvx     v12,r0,r9
    7a28:7d 28 50 ce lvx     v9,r8,r10
    7a2c:7c 20 50 ce lvx     v1,r0,r10
    7a30:39 4a 00 10 addi    r10,r10,16
    7a34:10 0d 62 eb vperm   v0,v13,v12,v11
    7a38:10 21 4a 2b vperm   v1,v1,v9,v8
    7a3c:11 2c 6a eb vperm   v9,v12,v13,v11
    7a40:10 21 4a 00 vaddubs v1,v1,v9
    7a44:11 a1 02 ab vperm   v13,v1,v0,v10
    7a48:10 00 0a ab vperm   v0,v0,v1,v10
    7a4c:7d a8 49 ce stvx    v13,r8,r9
    7a50:7c 00 49 ce stvx    v0,r0,r9
    7a54:39 29 00 10 addi    r9,r9,16
    7a58:42 00 ff c8 bdnz+   7a20 <.vmx_combine_add_u_no_mask+0x120>

After:

    76c0:7c 00 48 ce lvx     v0,r0,r9
    76c4:7d a8 48 ce lvx     v13,r8,r9
    76c8:39 29 00 10 addi    r9,r9,16
    76cc:7c 20 50 ce lvx     v1,r0,r10
    76d0:10 00 6b 2b vperm   v0,v0,v13,v12
    76d4:10 00 0a 00 vaddubs v0,v0,v1
    76d8:7c 00 51 ce stvx    v0,r0,r10
    76dc:39 4a 00 10 addi    r10,r10,16
    76e0:42 00 ff e0 bdnz+   76c0 <.vmx_combine_add_u_no_mask+0x120>
2013-10-01 23:43:44 +03:00
Siarhei Siamashka
b6c5ba06f0 vmx: align destination to fix valgrind invalid memory writes
The SIMD optimized inner loops in the VMX/Altivec code are trying
to emulate unaligned accesses to the destination buffer. For each
4 pixels (which fit into a 128-bit register) the current
implementation:
  1. first performs two aligned reads, which cover the needed data
  2. reshuffles bytes to get the needed data in a single vector register
  3. does all the necessary calculations
  4. reshuffles bytes back to their original location in two registers
  5. performs two aligned writes back to the destination buffer

Unfortunately, if the destination buffer is unaligned and the width
is an exact multiple of 4 pixels, some writes may cross the
boundaries of the destination buffer. In a multithreaded
environment this may potentially corrupt the data outside of the
destination buffer if it is concurrently read and written by some
other thread.

The valgrind report for blitters-test is full of:

==23085== Invalid write of size 8
==23085==    at 0x1004B0B4: vmx_combine_add_u (pixman-vmx.c:1089)
==23085==    by 0x100446EF: general_composite_rect (pixman-general.c:214)
==23085==    by 0x10002537: test_composite (blitters-test.c:363)
==23085==    by 0x1000369B: fuzzer_test_main._omp_fn.0 (utils.c:733)
==23085==    by 0x10004943: fuzzer_test_main (utils.c:728)
==23085==    by 0x10002C17: main (blitters-test.c:397)
==23085==  Address 0x5188218 is 0 bytes after a block of size 88 alloc'd
==23085==    at 0x4051DA0: memalign (vg_replace_malloc.c:581)
==23085==    by 0x4051E7B: posix_memalign (vg_replace_malloc.c:709)
==23085==    by 0x10004CFF: aligned_malloc (utils.c:833)
==23085==    by 0x10001DCB: create_random_image (blitters-test.c:47)
==23085==    by 0x10002263: test_composite (blitters-test.c:283)
==23085==    by 0x1000369B: fuzzer_test_main._omp_fn.0 (utils.c:733)
==23085==    by 0x10004943: fuzzer_test_main (utils.c:728)
==23085==    by 0x10002C17: main (blitters-test.c:397)

This patch addresses the problem by first aligning the destination
buffer at a 16 byte boundary in each combiner function. This trick
is borrowed from the pixman SSE2 code.

It makes the new thread-test pass on PowerPC VMX/Altivec systems and
also resolves the "make check" failure reported for POWER7 hardware:
    http://lists.freedesktop.org/archives/pixman/2013-August/002871.html
2013-10-01 23:42:56 +03:00
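The "align the destination first" trick can be sketched in scalar C. This is not the VMX code: combine_add_u and add_saturate_8888 below are hypothetical stand-ins, and the 4-pixel body loop marks the place where a SIMD implementation would issue a single aligned 128-bit store.

```c
#include <stdint.h>

/* Per-channel saturating add of two 32-bit pixels (four 8-bit channels). */
static uint32_t
add_saturate_8888 (uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8)
    {
        uint32_t c = ((a >> shift) & 0xff) + ((b >> shift) & 0xff);

        if (c > 0xff)
            c = 0xff;
        result |= c << shift;
    }
    return result;
}

/* Hypothetical scalar stand-in for a SIMD combiner. */
static void
combine_add_u (uint32_t *dest, const uint32_t *src, int width)
{
    int i;

    /* Head: one pixel at a time until dest is 16-byte aligned */
    while (width > 0 && ((uintptr_t) dest & 15) != 0)
    {
        *dest = add_saturate_8888 (*dest, *src);
        dest++; src++; width--;
    }

    /* Body: whole groups of 4 pixels; every store stays inside
     * dest[0..3], so nothing outside the buffer is touched */
    while (width >= 4)
    {
        for (i = 0; i < 4; ++i)
            dest[i] = add_saturate_8888 (dest[i], src[i]);
        dest += 4; src += 4; width -= 4;
    }

    /* Tail: remaining 0 to 3 pixels, again one at a time */
    while (width > 0)
    {
        *dest = add_saturate_8888 (*dest, *src);
        dest++; src++; width--;
    }
}
```

Because the head and tail loops handle the unaligned edges pixel by pixel, the vectorizable body never needs to read and write back memory that belongs to a neighbouring buffer.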
Søren Sandmann Pedersen
0438435b9c test: Add new thread-test program
This test program allocates an array of 16 * 7 uint32_ts and spawns 16
threads that each use 7 of the allocated uint32_ts as a destination
image for a large number of composite operations. Each thread then
computes and returns a checksum for the image. Finally, the main
thread computes a checksum of the checksums and verifies that it
matches expectations.

The purpose of this test is to catch errors where memory outside images
is read and then written back. Such out-of-bounds accesses are broken
when multiple threads are involved, because the threads will race to
read and write the shared memory.

V2:
- Incorporate fixes from Siarhei for endianness and undefined behavior
  regarding argument evaluation
- Make the images 7 pixels wide since the bug only happens when the
  composite width is greater than 4.
- Compute a checksum of the checksums so that you don't have to
  update 16 values if something changes.

V3: Remove stray dollar sign
2013-10-01 23:33:57 +03:00
Søren Sandmann Pedersen
6582950407 Rename HAVE_PTHREAD_SETSPECIFIC to HAVE_PTHREADS
The test for pthread_setspecific() can be used as a general test for
whether pthreads are available, so rename the variable from
HAVE_PTHREAD_SETSPECIFIC to HAVE_PTHREADS and run the test even when
better support for thread local variables is available.

However, the pthread arguments are still only added to CFLAGS and
LDFLAGS when pthread_setspecific() is used for thread local variables.

V2: AC_SUBST(PTHREAD_CFLAGS)
2013-10-01 23:33:35 +03:00
Søren Sandmann Pedersen
b513b3dffe blitters-test: Remove unused variable 2013-09-29 16:47:53 -04:00
Søren Sandmann Pedersen
fa0559eb71 utils.c: Make image_endian_swap() deal with negative strides
Use a temporary variable s containing the absolute value of the stride
as the upper bound in the inner loops.

V2: Do this for the bpp == 16 case as well
2013-09-27 17:11:08 -04:00
Søren Sandmann Pedersen
ff682089ce utils.c: Make print_image actually cope with negative strides
Commit 4312f07736 claimed to have made
print_image() work with negative strides, but it didn't actually
work. When the stride was negative, the image buffer would be accessed
as if the stride were positive.

Fix the bug by not changing the stride variable and instead using a
temporary, s, that contains the absolute value of stride.
2013-09-26 13:35:29 -04:00
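The "keep the signed stride, use a temporary for its absolute value" pattern might look like the sketch below. It is not print_image() itself: sum_image_bytes and demo_image are hypothetical, and the stride is measured in bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Demo data: 3 rows of 4 bytes each, values 1..12 */
static const uint8_t demo_image[12] = {
    1, 2, 3, 4,
    5, 6, 7, 8,
    9, 10, 11, 12
};

/* Hypothetical helper: sums every byte of an image whose rows are
 * 'stride' bytes apart.  first_row points at row 0, which is the last
 * row in memory when the stride is negative. */
static uint32_t
sum_image_bytes (const uint8_t *first_row, int height, int stride)
{
    /* s is only the per-row byte count; stride keeps its sign */
    int s = stride < 0 ? -stride : stride;
    uint32_t sum = 0;
    int i, j;

    for (i = 0; i < height; ++i)
    {
        /* Row addressing uses the signed stride */
        const uint8_t *row = first_row + (ptrdiff_t) i * stride;

        /* The inner loop bound uses the absolute value */
        for (j = 0; j < s; ++j)
            sum += row[j];
    }
    return sum;
}
```

Calling it with (demo_image, 3, 4) and with (demo_image + 8, 3, -4) visits the same twelve bytes, just in a different row order.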
Søren Sandmann Pedersen
ec0e38cbb7 Move generated affine fetchers into pixman-fast-path.c
The generated fetchers for NEAREST, BILINEAR, and
SEPARABLE_CONVOLUTION filters are fast paths and so they belong in
pixman-fast-path.c
2013-09-26 10:21:29 -04:00
Søren Sandmann Pedersen
96e163d2fd Move bits_image_fetch_bilinear_no_repeat_8888 into pixman-fast-path.c
This iterator is really a fast path, so it belongs in the fast path
implementation.
2013-09-26 10:21:29 -04:00
Søren Sandmann Pedersen
8d465c2a5d fast, ssse3: Simplify logic to fetch lines in the bilinear iterators
Instead of having logic to swap the lines around when one of them
doesn't match, store the two lines in an array and use the least
significant bit of the y coordinate as the index into that
array. Since the two lines always have different least significant
bits, they will never collide.

The effect is that lines corresponding to even y coordinates are
stored in info->lines[0] and lines corresponding to odd y coordinates
are stored in info->lines[1].
2013-09-26 10:20:43 -04:00
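The indexing scheme can be sketched with a two-entry cache. The type and function names are hypothetical, the "fetch" is reduced to a counter, and y & 1 is assumed to be evaluated on a two's-complement int.

```c
/* One cached, horizontally scaled source line */
typedef struct
{
    int y;       /* source row currently held in this slot */
    int valid;   /* 0 until the slot has been filled once */
} line_t;

typedef struct
{
    line_t lines[2];   /* lines[0]: even rows, lines[1]: odd rows */
    int    n_fetches;  /* how many times a line had to be (re)built */
} line_cache_t;

static void
line_cache_init (line_cache_t *cache)
{
    cache->lines[0].valid = 0;
    cache->lines[1].valid = 0;
    cache->n_fetches = 0;
}

/* Returns the slot for source row y, refilling it if it holds another row. */
static line_t *
line_cache_get (line_cache_t *cache, int y)
{
    /* Rows y and y + 1 differ in their lowest bit, so the two lines
     * needed for one destination scanline never share a slot */
    line_t *line = &cache->lines[y & 1];

    if (!line->valid || line->y != y)
    {
        /* A real iterator would scale source row y into the line here */
        line->y = y;
        line->valid = 1;
        cache->n_fetches++;
    }
    return line;
}
```

Requesting rows 4 and 5 and then rows 5 and 6 costs three fetches rather than four, because row 5 is still sitting in lines[1] when the second pair is requested.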
Søren Sandmann Pedersen
aa5c45254e test: Test negative strides
Pixman supports negative strides, but up until now they haven't been
tested outside of stress-test. This commit adds testing of negative
strides to blitters-test, scaling-test, affine-test, rotate-test, and
composite-traps-test.
2013-09-19 21:37:56 -04:00
Søren Sandmann Pedersen
4312f07736 test: Share the image printing code
The affine-test, blitters-test, and scaling-test all have the ability
to print out the bytes of the destination image. Share this code by
moving it to utils.c.

At the same time make the code work correctly with negative strides.
2013-09-19 21:37:56 -04:00
Søren Sandmann Pedersen
51d7135456 {scaling,affine,composite-traps}-test: Use compute_crc32_for_image()
By using this function instead of compute_crc32() the alpha masking
code and the call to image_endian_swap() are not duplicated.
2013-09-19 21:37:56 -04:00
Søren Sandmann Pedersen
75506e6367 pixman-filter.c: Use 65536, not 65535, for fixed point conversion
Converting a double precision number to 16.16 fixed point should be
done by multiplying with 65536.0, not 65535.0.

The bug could potentially cause certain filters that would otherwise
leave the image bit-for-bit unchanged under an identity
transformation, to not do so, but the numbers are close enough that
there weren't any visual differences.
2013-09-16 17:54:46 -04:00
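The off-by-one scale factor is easy to see with a small sketch. fixed_16_16_t is a hypothetical stand-in for a 16.16 fixed-point type, and both function names are made up for illustration.

```c
#include <stdint.h>

typedef int32_t fixed_16_16_t;   /* 16 integer bits, 16 fraction bits */

/* Correct conversion: 1.0 must map to 1 << 16 = 65536 */
static fixed_16_16_t
double_to_fixed (double d)
{
    return (fixed_16_16_t) (d * 65536.0);
}

/* The buggy scale factor: 1.0 maps to 65535, one unit short of
 * fixed-point 1.0 */
static fixed_16_16_t
double_to_fixed_off_by_one (double d)
{
    return (fixed_16_16_t) (d * 65535.0);
}
```

With the wrong factor a filter weight of exactly 1.0 becomes 65535/65536, which is why an identity transformation may no longer reproduce the image bit for bit even though the difference is invisible.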
Søren Sandmann Pedersen
9899a7bae8 demos/scale.ui: Allow subsample_bits to be 0
The separable convolution filter supports a subsample_bits of 0 which
corresponds to no subsampling at all, so allow this value to be used
in the scale demo.
2013-09-16 17:54:46 -04:00
Søren Sandmann Pedersen
58a79dfe6d ssse3: Add iterator for separable bilinear scaling
This new iterator uses the SSSE3 instructions pmaddubsw and pabsw to
implement a fast iterator for bilinear scaling.

There is a graph here recording the per-pixel time for various
bilinear scaling algorithms as reported by scaling-bench:

    http://people.freedesktop.org/~sandmann/ssse3.v2/ssse3.v2.png

As the graph shows, this new iterator is clearly faster than the
existing C iterator, and when used with an SSE2 combiner, it is also
faster than the existing SSE2 fast paths for upscaling, though not for
downscaling.

Another graph:

    http://people.freedesktop.org/~sandmann/ssse3.v2/movdqu.png

shows the difference between writing to iter->buffer with movdqa,
movdqu on an aligned buffer, and movdqu on a deliberately unaligned
buffer. Since the differences are very small, the patch here avoids
using movdqa because imposing alignment restrictions on iter->buffer
may interfere with other optimizations, such as writing directly to
the destination image.

The data was measured with scaling-bench on a Sandy Bridge Core
i3-2350M @ 2.3GHz and is available in this directory:

    http://people.freedesktop.org/~sandmann/ssse3.v2/

where there is also a Gnumeric spreadsheet ssse3.v2.gnumeric
containing the per-pixel values and the graph.

V2:
- Use uintptr_t instead of unsigned long in the ALIGN macro
- Use _mm_storel_epi64 instead of _mm_cvtsi128_si64 as the latter form
  is not available on x86-32.
- Use _mm_storeu_si128() instead of _mm_store_si128() to avoid
  imposing alignment requirements on iter->buffer
2013-09-16 16:50:35 -04:00
Søren Sandmann Pedersen
f1792b3221 Add empty SSSE3 implementation
This commit adds a new, empty SSSE3 implementation and the associated
build system support.

configure.ac:   detect whether the compiler understands SSSE3
                intrinsics and set up the required CFLAGS

Makefile.am:    Add libpixman-ssse3.la

pixman-x86.c:   Add X86_SSSE3 feature flag and detect it in
                detect_cpu_features().

pixman-ssse3.c: New file with an empty SSSE3 implementation

V2: Remove SSSE3_LDFLAGS since it isn't necessary unless Solaris
support is added.
2013-09-16 16:50:35 -04:00
Søren Sandmann Pedersen
f10b5449a8 general: Ensure that iter buffers are aligned to 16 bytes
At the moment iter buffers are only guaranteed to be aligned to a 4
byte boundary. SIMD implementations benefit from the buffers being
aligned to 16 bytes, so ensure this is the case.

V2:
- Use uintptr_t instead of unsigned long
- allocate 3 * SCANLINE_BUFFER_LENGTH bytes on the stack rather than just
  SCANLINE_BUFFER_LENGTH
- use sizeof (stack_scanline_buffer) instead of SCANLINE_BUFFER_LENGTH
  to determine overflow
2013-09-16 16:50:35 -04:00
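Rounding a pointer up to a 16-byte boundary through uintptr_t can be sketched as below. The helper name and the ALIGNMENT constant are hypothetical; the caller is assumed to have over-allocated the buffer by at least ALIGNMENT - 1 bytes so that the aligned pointer still has enough room behind it.

```c
#include <stdint.h>

#define ALIGNMENT 16   /* must be a power of two */

/* Rounds p up to the next multiple of ALIGNMENT.  uintptr_t is used
 * because unsigned long is narrower than a pointer on some 64-bit
 * platforms. */
static uint8_t *
align_up (uint8_t *p)
{
    uintptr_t mask = (uintptr_t) (ALIGNMENT - 1);

    return (uint8_t *) (((uintptr_t) p + mask) & ~mask);
}
```

A stack buffer declared as `uint8_t raw[N + ALIGNMENT - 1]` can then be used through `align_up (raw)`, which always leaves at least N usable bytes.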
Siarhei Siamashka
700db9d872 sse2: faster bilinear scaling (pack 4 pixels to write with MOVDQA)
The loops are already unrolled, so it was just a matter of packing
4 pixels into a single XMM register and doing aligned 128-bit
writes to memory via MOVDQA instructions for the SRC compositing
operator fast path. For the other fast paths, this XMM register
is also directly routed to further processing instead of doing
extra reshuffling. This replaces "8 PACKSSDW/PACKUSWB + 4 MOVD"
instructions with "3 PACKSSDW/PACKUSWB + 1 MOVDQA" per 4 pixels,
which results in a clear performance improvement.

There are also some other (less important) tweaks:

1. Convert 'pixman_fixed_t' to 'intptr_t' before using it as an
   index for addressing memory. The problem is that 'pixman_fixed_t'
   is a 32-bit data type and it has to be extended to 64-bit
   offsets, which needs extra instructions on 64-bit systems.

2. Recalculate the horizontal interpolation weights only
   once per 4 pixels by treating the XMM register as four pairs
   of 16-bit values. Each of these 16-bit/16-bit pairs can be
   replicated to fill the whole 128-bit register by using PSHUFD
   instructions. So we get "3 PADDW/PSRLW + 4 PSHUFD" instructions
   per 4 pixels instead of "12 PADDW/PSRLW" per 4 pixels
   (or "3 PADDW/PSRLW" per each pixel).

   Now a good question is whether replacing "9 PADDW/PSRLW" with
   "4 PSHUFD" is a favourable exchange. As it turns out, PSHUFD
   instructions are very fast on new Intel processors (including
   Atoms), but are rather slow on the first generation of Core2
   (Merom) and on the other processors from that time or older.
   A good instructions latency/throughput table, covering all the
   relevant processors, can be found at:
        http://www.agner.org/optimize/instruction_tables.pdf

   Enabling this optimization is controlled by the PSHUFD_IS_FAST
   define in "pixman-sse2.c".

3. One use of PSHUFD instruction (_mm_shuffle_epi32 intrinsic) in
   the older code has been also replaced by PUNPCKLQDQ equivalent
   (_mm_unpacklo_epi64 intrinsic) in PSHUFD_IS_FAST=0 configuration.
   The PUNPCKLQDQ instruction is usually faster on older processors,
   but has some side effects (instead of fully overwriting the
   destination register like PSHUFD does, it retains half of the
   original value, which may inhibit some compiler optimizations).

Benchmarks with "lowlevel-blt-bench -b src_8888_8888" using GCC 4.8.1 on
x86-64 system and default optimizations. The results are in MPix/s:

====== Intel Core2 T7300 (2GHz) ======

old:                     src_8888_8888 =  L1: 128.69  L2: 125.07  M:124.86
                        over_8888_8888 =  L1:  83.19  L2:  81.73  M: 80.63
                      over_8888_n_8888 =  L1:  79.56  L2:  78.61  M: 77.85
                      over_8888_8_8888 =  L1:  77.15  L2:  75.79  M: 74.63

new (PSHUFD_IS_FAST=0):  src_8888_8888 =  L1: 168.67  L2: 163.26  M:162.44
                        over_8888_8888 =  L1: 102.91  L2: 100.43  M: 99.01
                      over_8888_n_8888 =  L1:  97.40  L2:  95.64  M: 94.24
                      over_8888_8_8888 =  L1:  98.04  L2:  95.83  M: 94.33

new (PSHUFD_IS_FAST=1):  src_8888_8888 =  L1: 154.67  L2: 149.16  M:148.48
                        over_8888_8888 =  L1:  95.97  L2:  93.90  M: 91.85
                      over_8888_n_8888 =  L1:  93.18  L2:  91.47  M: 90.15
                      over_8888_8_8888 =  L1:  95.33  L2:  93.32  M: 91.42

====== Intel Core i7 860 (2.8GHz) ======

old:                     src_8888_8888 =  L1: 323.48  L2: 318.86  M:314.81
                        over_8888_8888 =  L1: 187.38  L2: 186.74  M:182.46

new (PSHUFD_IS_FAST=0):  src_8888_8888 =  L1: 373.06  L2: 370.94  M:368.32
                        over_8888_8888 =  L1: 217.28  L2: 215.57  M:211.32

new (PSHUFD_IS_FAST=1):  src_8888_8888 =  L1: 401.98  L2: 397.65  M:395.61
                        over_8888_8888 =  L1: 218.89  L2: 217.56  M:213.48

The most interesting benchmark is "src_8888_8888" (because this code can
be reused for a generic non-separable SSE2 bilinear fetch iterator).

The results show that PSHUFD instructions are bad for Intel Core2 T7300
(Merom core) and good for Intel Core i7 860 (Nehalem core). Both of these
processors support SSSE3 instructions though, so they are not the primary
targets for SSE2 code. But without having any other more relevant hardware
to test, PSHUFD_IS_FAST=0 seems to be a reasonable default for SSE2 code
and old processors (until the runtime CPU features detection becomes
clever enough to recognize different microarchitectures).

(Rebased on top of patch that removes support for 8-bit bilinear
 filtering -ssp)
2013-09-16 16:48:44 -04:00
Siarhei Siamashka
e43cc9c902 test: safeguard the scaling-bench test against COW
The calloc call from pixman_image_create_bits may still
rely on http://en.wikipedia.org/wiki/Copy-on-write
Explicitly initializing the destination image results in
a more predictable behaviour.

V2:
 - allocate 16 bytes aligned buffer with aligned stride instead
   of delegating this to pixman_image_create_bits
 - use memset for the allocated buffer instead of pixman solid fill
 - repeat tests 3 times and select best results in order to filter
   out even more measurement noise
2013-09-07 17:20:09 -04:00
Søren Sandmann Pedersen
a4c79d695d Drop support for 8-bit precision in bilinear filtering
The default has been 7-bit for a while now, and the quality
improvement with 8-bit precision is not enough to justify keeping the
code around as a compile-time option.
2013-09-07 17:19:50 -04:00
Søren Sandmann Pedersen
80a232db68 Make the first argument to scanline fetchers have type bits_image_t *
Scanline fetchers haven't been used for images other than bits for a
long time, so by making the type reflect this fact, a bit of casting
can be saved in various places.
2013-09-07 17:12:18 -04:00
Matt Turner
8ad63f90cd iwmmxt: Disallow if gcc version is < 4.8.
Later versions of gcc-4.7.x are capable of generating iwMMXt
instructions properly, but gcc-4.8 contains better support and other
fixes, including iwMMXt in conjunction with hardfp. The existing 4.5
requirement was based on attempts to have OLPC use a patched gcc to
build pixman. Let's just require gcc-4.8.
2013-09-04 23:48:52 -07:00
Søren Sandmann Pedersen
02906e57bd fast_bilinear_cover_init: Don't install a finalizer on the error path
No memory is allocated in the error case, so a finalizer is not
necessary, and will cause problems if the data pointer is not
initialized to NULL.
2013-08-31 14:19:58 -04:00
Julien Cristau
d4898ac139 Upload to unstable 2013-08-13 12:08:22 +02:00
Julien Cristau
105c249996 Increase alpha-loop test timeout some more. 2013-08-13 12:03:40 +02:00
Julien Cristau
9b844940ba Includes big-endian matrix-test fix 2013-08-13 12:01:40 +02:00
Julien Cristau
2fc06503f6 Bump changelogs 2013-08-13 12:00:48 +02:00
Julien Cristau
a781ff50e7 pixman 0.30.2 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQEcBAABAgAGBQJSAlYRAAoJEIWlZJw4kjNuBQYIAKwOAc0rKtX5c/z5iuf90akR
 EfEKK5ICQ8iE55Jvmn3e9ny12yrRbP/S6++W2kKkaF6gEmab2/3YswN42/ZPn3gJ
 1RER7b+x/CxsJbJVNPbRBLdkfF2HH8RicJru7cQ98TjR2mSC9uKAyiC/podWQZvO
 96rcnXZZBZMMjZLCUYfhiNz71Frhjh3fZrodx9GUJ6Lbka74bvWJ3fB4PXoTtbbr
 H8OPkxJQw5OjGtqgwB8lbLQZmZLhuZYUGOF0wbSA2+2HvylxlPlpUgC1c3r8yn77
 MQsD/ex+CfswwxxMTrINkHSVllaoJafM8cjk8HFG3EPkW/ohdpDthhtZpmSsM5E=
 =09FF
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.30.2' into debian-unstable

pixman 0.30.2 release
2013-08-13 12:00:07 +02:00
Søren Sandmann Pedersen
3518a0dafa Add an iterator that can fetch bilinearly scaled images
This new iterator works in a separable way; that is, for a destination
scanline, it scales the two involved source scanlines and then caches
them so that they can be reused for the next destination scanlines.

There are two versions of the code, one that uses 64 bit arithmetic,
and one that uses 32 bit arithmetic only. The latter version is
used on 32 bit systems, where it is expected to be faster.

This scheme saves a substantial amount of arithmetic for larger
scalings; the per-pixel times for various configurations as reported
by scaling-bench are graphed here:

	http://people.freedesktop.org/~sandmann/separable.v2/v2.png

The "sse2" graph is the current default on x86, "mmx" is with sse2
disabled, "old c" is with sse2 and mmx disabled. The "new 32" and "new
64" graphs show times for the new code. As the graphs show, the 64 bit
version of the new code beats the "old c" for all scaling ratios.

The data was taken on a Sandy Bridge Core i3-2350M CPU @ 2.0 GHz
running in 64 bit mode.

The data used to generate the graph is available in this directory:

    http://people.freedesktop.org/~sandmann/separable.v2/

There is also a Gnumeric spreadsheet v2.gnumeric containing the
per-pixel values and the graph.

V2:
- Add error message in the OOM/bad matrix case
- Save some shifts by storing the cached scanlines in AGBR order
- Special cased version that uses 32 bit arithmetic when sizeof(long) <= 4
2013-08-10 11:18:23 -04:00
Søren Sandmann Pedersen
146116eff4 Add support for iter finalizers
Iterators may sometimes need to allocate auxiliary memory. In order to
be able to free this memory, optional iterator finalizers are
required.
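
As a rough illustration of the idea (a sketch only, not the code in this
commit), an iterator can carry an optional finalizer next to its data
pointer. The names sketch_iter_t, sketch_iter_init() and
sketch_iter_done() are hypothetical and do not exist in pixman:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical iterator type; the real pixman_iter_t has more fields. */
typedef struct sketch_iter sketch_iter_t;
typedef void (*sketch_iter_fini_t) (sketch_iter_t *iter);

struct sketch_iter
{
    void               *data;  /* auxiliary memory owned by the iterator */
    sketch_iter_fini_t  fini;  /* optional finalizer; NULL if nothing to free */
};

static void
sketch_free_data (sketch_iter_t *iter)
{
    free (iter->data);
    iter->data = NULL;
}

/* Returns 1 on success, 0 on allocation failure. On failure no
 * finalizer is installed, since there is nothing to free. */
static int
sketch_iter_init (sketch_iter_t *iter, size_t n_bytes)
{
    iter->data = malloc (n_bytes);
    iter->fini = iter->data ? sketch_free_data : NULL;
    return iter->data != NULL;
}

/* Called by the owner of the iterator once it is done with it. */
static void
sketch_iter_done (sketch_iter_t *iter)
{
    if (iter->fini)
        iter->fini (iter);
}
```

Keeping the finalizer optional means iterators that allocate nothing pay
only for a NULL check.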
2013-08-10 11:18:23 -04:00
Søren Sandmann Pedersen
1be9208e04 test/scaling-bench.c: New benchmark for bilinear scaling
This new benchmark scales a 320 x 240 a8r8g8b8 test image by all
ratios from 0.1, 0.2, ... up to 10.0 and reports the time it took
to do each of the scaling operations, and the time spent per
destination pixel.

The times reported for the scaling operations are given in
milliseconds, the times-per-pixel are in nanoseconds.

V2: Format output better
2013-08-10 11:18:23 -04:00
Søren Sandmann Pedersen
fedd6b192d RELEASING: Add note about changing the topic of the #cairo IRC channel 2013-08-07 10:22:25 -04:00
Søren Sandmann Pedersen
f8a0812b1c Pre-release version bump to 0.30.2 2013-08-07 10:07:35 -04:00
Siarhei Siamashka
b5167b8a54 test: fix matrix-test on big endian systems 2013-08-05 01:45:59 +03:00
Siarhei Siamashka
d87601ffc3 test: fix matrix-test on big endian systems 2013-08-05 01:42:29 +03:00
Julien Cristau
bbb3765faf Upload to unstable 2013-08-03 10:24:43 +02:00
Julien Cristau
2e13b569cb Increase timeout for the alpha-loop test.
That will hopefully let it pass on the mips buildd.
2013-08-03 10:23:41 +02:00
Andrea Canciani
a82b95a264 test: Fix build on MSVC
The MSVC compiler is very strict about variable declarations after
statements.

Move all the declarations of each block before any statement in the
same block to fix multiple instances of:

alpha-loop.c(XX) : error C2275: 'pixman_image_t' : illegal use of this
type as an expression
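
A minimal sketch of the declaration order that such a compiler accepts,
presumably because the file is compiled under C89 rules. The function
sum_to() is made up for illustration and is not taken from alpha-loop.c:

```c
#include <assert.h>

/* C89 style: every declaration in a block precedes the first statement. */
static int
sum_to (int n)
{
    int total;   /* declarations first ... */
    int i;

    total = 0;   /* ... statements afterwards */
    for (i = 1; i <= n; i++)
        total += i;

    return total;
}
```

Writing "int i;" after "total = 0;" in the same block would be valid C99
but is the pattern this commit removes.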
2013-08-01 09:08:15 -07:00
Søren Sandmann Pedersen
4c04a86c68 Version bump to 0.30.1 2013-08-01 07:19:21 -04:00
Alexander Troosh
6300452952 Require GTK+ version >= 2.16
I hit a bug on my system:

lcc: "scale.c", line 374: warning: function "gtk_scale_add_mark" declared
          implicitly [-Wimplicit-function-declaration]
      gtk_scale_add_mark (GTK_SCALE (widget), 0.0, GTK_POS_LEFT, NULL);
      ^

  CCLD   scale
scale.o: In function `app_new':
(.text+0x23e4): undefined reference to `gtk_scale_add_mark'
scale.o: In function `app_new':
(.text+0x250c): undefined reference to `gtk_scale_add_mark'
scale.o: In function `app_new':
(.text+0x2634): undefined reference to `gtk_scale_add_mark'
make[2]: *** [scale] Error 1
make[2]: Target `all' not remade because of errors.

$ pkg-config --modversion gtk+-2.0
2.12.1

demos/scale.c calls gtk_scale_add_mark(), which is only available in
GTK+ 2.16 and later. We need either to support old GTK+ (by rewriting
scale.c) or simply to require a newer version of GTK+, like this:
2013-07-30 08:18:35 -04:00
Matthieu Herrb
02869a1229 configure.ac: Don't use '+=' since it's not POSIX
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Matthieu Herrb <matthieu.herrb@laas.fr>
2013-07-30 08:18:25 -04:00
Markos Chandras
35da06c828 Use AC_LINK_IFELSE to check if the Loongson MMI code can link
The Loongson code is compiled with -march=loongson2f to enable the MMI
instructions, but binutils refuses to link object code compiled with
different -march settings, leading to link failures later in the
compile. This avoids that problem by checking if we can link code
compiled for Loongson.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
2013-07-30 08:18:02 -04:00
ingmar@irsoft.de
e14f5a739f Fix broken build when HAVE_CONFIG_H is undefined, e.g. on Win32.
Build fix for platforms without a generated config.h, for example Win32.
2013-07-30 08:17:49 -04:00
Julien Cristau
3f0d759608 Upload to unstable 2013-07-27 21:40:50 +02:00
Julien Cristau
3c4dac9a7c Fix matrix-test on big endian
Patch from Siarhei Siamashka.
2013-07-27 21:40:09 +02:00
Julien Cristau
3473a947da Disable arm iwmmxt fast paths. It breaks the build. 2013-07-27 14:48:50 +02:00
Julien Cristau
dc29515934 Disable silent Makefile rules. 2013-07-27 14:37:23 +02:00
Julien Cristau
2084b2d3bd Upload to unstable 2013-07-26 14:58:46 +02:00
Julien Cristau
317b3c3eea Add more test-only exported functions to symbols file 2013-07-26 14:47:35 +02:00
Julien Cristau
73ff58c119 Remove png file missing from the tarball 2013-07-26 14:36:14 +02:00
Julien Cristau
d2fbfbc23c Bump changelog and symbols for 0.30.0 2013-07-26 14:31:38 +02:00
Julien Cristau
5de927bd3e Merge branch 'upstream-merge' into debian-unstable 2013-07-26 14:26:43 +02:00
Julien Cristau
0ef6350c3d Revert "Add 00-unexport-symbol.diff"
This reverts commit 01c2431ef8.
2013-07-26 14:26:30 +02:00
Julien Cristau
07473e703e Merge remote-tracking branch 'origin/debian-experimental' into debian-unstable
Conflicts:
	debian/changelog
2013-07-26 14:26:11 +02:00
Julien Cristau
be9bb76118 Merge remote-tracking branch 'origin/upstream-experimental' into upstream-merge 2013-07-26 14:24:21 +02:00
Andrea Canciani
1e49329333 test: Fix build on MSVC
The MSVC compiler is very strict about variable declarations after
statements.

Move all the declarations of each block before any statement in the
same block to fix multiple instances of:

alpha-loop.c(XX) : error C2275: 'pixman_image_t' : illegal use of this
type as an expression
2013-06-25 16:55:24 +02:00
Alexander Troosh
279bdcda7e Require GTK+ version >= 2.16
I hit a bug on my system:

lcc: "scale.c", line 374: warning: function "gtk_scale_add_mark" declared
          implicitly [-Wimplicit-function-declaration]
      gtk_scale_add_mark (GTK_SCALE (widget), 0.0, GTK_POS_LEFT, NULL);
      ^

  CCLD   scale
scale.o: In function `app_new':
(.text+0x23e4): undefined reference to `gtk_scale_add_mark'
scale.o: In function `app_new':
(.text+0x250c): undefined reference to `gtk_scale_add_mark'
scale.o: In function `app_new':
(.text+0x2634): undefined reference to `gtk_scale_add_mark'
make[2]: *** [scale] Error 1
make[2]: Target `all' not remade because of errors.

$ pkg-config --modversion gtk+-2.0
2.12.1

demos/scale.c calls gtk_scale_add_mark(), which is only available in
GTK+ 2.16 and later. We need either to support old GTK+ (by rewriting
scale.c) or simply to require a newer version of GTK+, like this:
2013-06-11 12:09:49 -04:00
Matthieu Herrb
889f118946 configure.ac: Don't use '+=' since it's not POSIX
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Matthieu Herrb <matthieu.herrb@laas.fr>
2013-06-08 10:21:54 -07:00
Søren Sandmann Pedersen
2acfac5f8e Consolidate all the iter_init_bits_stride functions
The SSE2, MMX, and fast implementations all have a copy of the
function iter_init_bits_stride that computes an image buffer and
stride.

Move that function to pixman-utils.c and share it among all the
implementations.
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
533f54430a Delete the old src/dest_iter_init() functions
Now that we are using the new _pixman_implementation_iter_init(), the
old _src/_dest_iter_init() functions are no longer needed, so they can
be deleted, and the corresponding fields in pixman_implementation_t
can be removed.
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
125a4fd36f Add _pixman_implementation_iter_init() and use instead of _src/_dest_init()
A new field, 'iter_info', is added to the implementation struct, and
all the implementations store a pointer to their iterator tables in
it. A new function, _pixman_implementation_iter_init(), is then added
that searches those tables, and the new function is called in
pixman-general.c and pixman-image.c instead of the old
_pixman_implementation_src_init() and _pixman_implementation_dest_init().
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
245d0090c5 general: Store the iter initializer in a one-entry pixman_iter_info_t table
In preparation for sharing all iterator initialization code from all
the implementations, move the general implementation to use a table of
pixman_iter_info_t.

The existing src_iter_init and dest_iter_init functions are
consolidated into one general_iter_init() function that checks the
iter_flags for whether it is dealing with a source or destination
iterator.

Unlike in the other implementations, the general_iter_init() function
stores its own get_scanline() and write_back() functions in the
iterator, so it relies on the initializer being called after
get_scanline and write_back being copied from the struct to the
iterator.
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
9c15afb105 fast: Replace the fetcher_info_t table with a pixman_iter_info_t table
Similar to the SSE2 and MMX patches, this commit replaces a table of
fetcher_info_t with a table of pixman_iter_info_t, and similar to the
noop patch, both fast_src_iter_init() and fast_dest_iter_init() are
now doing exactly the same thing, so their code can be shared in a new
function called fast_iter_init_common().
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
71c2d519d0 mmx: Replace the fetcher_info_t table with a pixman_iter_info_t table
Similar to the SSE2 commit, information about the iterators is stored
in a table of pixman_iter_info_t.
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
78f437d61e sse2: Replace the fetcher_info_t table with a pixman_iter_info_t table
Similar to the changes to noop, put all the iterators into a table of
pixman_iter_info_t and then do a generic search of that table during
iterator initialization.
2013-05-22 09:43:20 -04:00
Søren Sandmann Pedersen
c7b0da8a96 noop: Keep information about iterators in an array of pixman_iter_info_t
Instead of having a nest of if statements, store the information about
iterators in a table of a new struct type, pixman_iter_info_t, and
then walk that table when initializing iterators.

The new struct contains a format, a set of image flags, and a set of
iter flags, plus a pixman_iter_get_scanline_t, a
pixman_iter_write_back_t, and a new function type
pixman_iter_initializer_t.

If the iterator matches an entry, it is first initialized with the
given get_scanline and write_back functions, and then the provided
iter_initializer (if present) is run. Running the iter_initializer
after setting get_scanline and write_back allows the initializer to
override those fields if it wishes.

The table contains both source and destination iterators,
distinguished based on the recently-added ITER_SRC and ITER_DEST;
similarly, wide iterators are recognized with the ITER_WIDE
flag. Having both source and destination iterators in the table means
the noop_src_iter_init() and noop_dest_iter_init() functions become
identical, so this patch factors out their code in a new function
noop_iter_init_common() that both call.

The following patches in this series will change all the
implementations to use an iterator table, and then move the table
search code to pixman-implementation.c.
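
The table walk described above can be sketched roughly as follows. This
is an illustration under assumptions, not the code from this series: all
sk_* names, the flag values and the integer format codes are
hypothetical stand-ins for the pixman types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical stand-ins for the pixman types. */
enum { SK_ITER_SRC = 1 << 0, SK_ITER_DEST = 1 << 1,
       SK_ITER_NARROW = 1 << 2, SK_ITER_WIDE = 1 << 3 };
#define SK_FORMAT_NULL (-1)  /* table terminator */

typedef struct sk_iter sk_iter_t;
typedef uint32_t *(*sk_get_scanline_t) (sk_iter_t *iter);
typedef void (*sk_write_back_t) (sk_iter_t *iter);
typedef void (*sk_initializer_t) (sk_iter_t *iter);

struct sk_iter
{
    int               format;
    uint32_t          image_flags;
    uint32_t          iter_flags;
    sk_get_scanline_t get_scanline;
    sk_write_back_t   write_back;
};

typedef struct
{
    int               format;       /* required image format */
    uint32_t          image_flags;  /* must all be set on the image */
    uint32_t          iter_flags;   /* must all be set on the iterator */
    sk_initializer_t  initializer;  /* optional, runs last */
    sk_get_scanline_t get_scanline;
    sk_write_back_t   write_back;
} sk_iter_info_t;

/* Walks the table; returns 1 if an entry matched, 0 otherwise. */
static int
sk_iter_init_from_table (const sk_iter_info_t *table, sk_iter_t *iter)
{
    const sk_iter_info_t *info;

    for (info = table; info->format != SK_FORMAT_NULL; info++)
    {
        if (info->format == iter->format &&
            (info->image_flags & iter->image_flags) == info->image_flags &&
            (info->iter_flags & iter->iter_flags) == info->iter_flags)
        {
            iter->get_scanline = info->get_scanline;
            iter->write_back = info->write_back;
            if (info->initializer)
                info->initializer (iter);  /* may override both fields */
            return 1;
        }
    }
    return 0;
}

/* Tiny example table with two source iterators. */
static uint32_t sk_buf_a[1], sk_buf_b[1];
static uint32_t *sk_fetch_a (sk_iter_t *iter) { (void) iter; return sk_buf_a; }
static uint32_t *sk_fetch_b (sk_iter_t *iter) { (void) iter; return sk_buf_b; }
static void sk_override (sk_iter_t *iter) { iter->get_scanline = sk_fetch_b; }

static const sk_iter_info_t sk_table[] =
{
    { 1, 0, SK_ITER_SRC | SK_ITER_NARROW, NULL,        sk_fetch_a, NULL },
    { 2, 0, SK_ITER_SRC | SK_ITER_WIDE,   sk_override, sk_fetch_a, NULL },
    { SK_FORMAT_NULL, 0, 0, NULL, NULL, NULL }
};
```

Because the entry's functions are copied before the initializer runs, the
second entry ends up with sk_fetch_b even though the table lists
sk_fetch_a.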
2013-05-22 09:43:20 -04:00
Søren Sandmann Pedersen
3b96ee4e77 Always set the FAST_PATH_NO_ALPHA_MAP flag for non-BITS images
We only support alpha maps for BITS images, so it's always safe to ignore
the alpha map for non-BITS images. This makes it possible to get rid of
the check for SOLID images since it will now be subsumed by the check
for FAST_PATH_NO_ALPHA_MAP.

Opaque masks are reduced to NULL images in pixman.c, and those can
also safely be treated as not having an alpha map, so set the
FAST_PATH_NO_ALPHA_MAP bit for those as well.
2013-05-22 09:43:12 -04:00
Søren Sandmann Pedersen
52ff5f0cd9 Add ITER_WIDE iter flag
This will be useful for putting iterators into tables where they can
be looked up by iterator flags. Without this flag, wide iterators can
only be recognized by the absence of ITER_NARROW, which makes testing
for a match difficult.
2013-05-22 09:43:03 -04:00
Søren Sandmann Pedersen
e8a180797c Add ITER_SRC and ITER_DEST iter flags
These indicate whether the iterator is for a source or a destination
image. Note iterator initializers are allowed to rely on one of these
being set, so they can't be left out the way it's generally harmless
(aside from potential performance degradation) to leave out a
particular fast path flag.
2013-05-22 09:41:10 -04:00
Søren Sandmann Pedersen
2320f0520b Make use of image flag in noop iterators
Similar to c2230fe2af, simply check against SAMPLES_COVER_CLIP_NEAREST
instead of comparing all the x/y/width/height parameters.
2013-05-22 04:28:41 -04:00
Markos Chandras
d77d75cc6e Use AC_LINK_IFELSE to check if the Loongson MMI code can link
The Loongson code is compiled with -march=loongson2f to enable the MMI
instructions, but binutils refuses to link object code compiled with
different -march settings, leading to link failures later in the
compile. This avoids that problem by checking if we can link code
compiled for Loongson.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
2013-05-19 09:01:34 -07:00
Matt Turner
a74be759a1 mmx: Document implementation(s) of pix_multiply().
I look at that function and can never remember what it does or how it
manages to do it.
2013-05-15 09:51:15 -07:00
ingmar@irsoft.de
cb5d131ff4 Fix broken build when HAVE_CONFIG_H is undefined, e.g. on Win32.
Build fix for platforms without a generated config.h, for example Win32.
2013-05-11 16:09:39 -04:00
Søren Sandmann Pedersen
d70141955e Post-release version bump to 0.31.1 2013-05-08 19:40:12 -04:00
Søren Sandmann Pedersen
41daf50aae Pre-release version bump to 0.30.0 2013-05-08 19:31:22 -04:00
Søren Sandmann Pedersen
5a7179191d Post-release version bump to 0.29.5 2013-04-30 18:57:43 -04:00
Søren Sandmann Pedersen
2714b5d201 Pre-release version bump to 0.29.4 2013-04-30 18:50:04 -04:00
Søren Sandmann Pedersen
7fc2654a1f pixman/refactor: Delete this file
Essentially all of it is obsolete by now.
2013-04-30 16:25:10 -04:00
Nemanja Lukic
cb928a77c0 MIPS: DSPr2: Added rpixbuf fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
       rpixbuf =  L1:  14.63  L2:  13.55  M:  9.91 ( 79.53%)  HT:  8.47  VT:  8.32  R:  8.17  RT:  4.90 (  33Kops/s)

Optimized:
       rpixbuf =  L1:  45.69  L2:  37.30  M: 17.24 (138.31%)  HT: 15.66  VT: 14.88  R: 13.97  RT:  8.38 (  44Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
c6a6fbdcd3 MIPS: DSPr2: Added pixbuf fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        pixbuf =  L1:  18.18  L2:  16.47  M: 13.36 (107.27%)  HT: 10.16  VT: 10.07  R:  9.84  RT:  5.54 (  35Kops/s)

Optimized:
        pixbuf =  L1:  43.54  L2:  36.02  M: 17.08 (137.09%)  HT: 15.58  VT: 14.85  R: 13.87  RT:  8.38 (  44Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
f69335d529 test: add "pixbuf" and "rpixbuf" to lowlevel-blt-bench
Add necessary support to lowlevel-blt benchmark for benchmarking pixbuf and
rpixbuf fast paths. bench_composite function now checks for pixbuf string in
testname, and if that is detected, use same bits for src and mask images.
2013-04-30 15:38:43 -04:00
Nemanja Lukic
3dc9e3827e test: add "src_0888_8888_rev" and "src_0888_0565_rev" to lowlevel-blt-bench 2013-04-30 15:38:43 -04:00
Nemanja Lukic
44174ce51d MIPS: DSPr2: Fix for bug in in_n_8 routine.
The rounding logic was not implemented correctly: logical shifts were
used instead of the rounding version of the 8-bit shift. The code also
used unnecessary multiplications, which can be avoided by packing
4 destination (a8) pixels into one 32-bit register, and spilled to the
stack unnecessarily. The code is rewritten to address these issues.
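
The DSPr2 assembly is not reproduced here. As a plain C illustration of
why a truncating shift and a rounding one give different results, the
sketch below compares the two for an 8-bit multiply. The function names
are made up, and the rounding form is a common idiom assumed to be
equivalent in effect to what the fixed routine computes:

```c
#include <assert.h>
#include <stdint.h>

/* Truncating: a plain logical shift drops the fraction, so 255 * 255
 * does not map back to 255. */
static uint8_t
mul_un8_truncating (uint8_t a, uint8_t b)
{
    return (uint8_t) (((uint32_t) a * b) >> 8);
}

/* Rounding: add one half (0x80) first, then fold the high byte back in
 * to approximate division by 255 rather than by 256. */
static uint8_t
mul_un8_rounding (uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t) a * b + 0x80;

    return (uint8_t) ((t + (t >> 8)) >> 8);
}
```

With these definitions, multiplying 255 by 255 gives 254 when truncating
and 255 when rounding.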

The bug was revealed by increasing the number of iterations in blitters-test.

Performance numbers on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
                   in_n_8 =  L1:  21.20  L2:  22.86  M: 21.42 ( 14.21%)  HT: 15.97  VT: 15.69  R: 15.47  RT:  8.00 (  48Kops/s)
Optimized (first implementation, with bug):
                   in_n_8 =  L1:  89.38  L2:  86.07  M: 65.48 ( 43.44%)  HT: 44.64  VT: 41.50  R: 40.77  RT: 16.94 (  66Kops/s)
Optimized (with bug fix, and code revisited):
                   in_n_8 =  L1: 102.33  L2:  95.65  M: 70.54 ( 46.84%)  HT: 48.35  VT: 45.06  R: 43.20  RT: 17.60 (  66Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
5858f09d26 MIPS: DSPr2: Added src_0565_8888 nearest neighbor fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
         src_0565_8888 =  L1:  20.70  L2:  19.22  M: 12.50 ( 49.79%)  HT: 10.45  VT: 10.18  R:  9.99  RT:  5.31 (  31Kops/s)

Optimized:
         src_0565_8888 =  L1:  62.98  L2:  53.44  M: 23.07 ( 91.87%)  HT: 19.85  VT: 19.15  R: 17.70  RT:  9.68 (  43Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
311d55b6d8 MIPS: DSPr2: Added over_8888_0565 nearest neighbor fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_0565 =  L1:  13.22  L2:  12.02  M:  9.77 ( 38.92%)  HT:  8.58  VT:  8.35  R:  8.38  RT:  5.78 (  35Kops/s)

Optimized:
        over_8888_0565 =  L1:  26.20  L2:  22.97  M: 15.92 ( 63.40%)  HT: 13.33  VT: 13.13  R: 12.72  RT:  7.65 (  39Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
bd487ee34c MIPS: DSPr2: Added over_8888_8888 nearest neighbor fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_8888 =  L1:  19.47  L2:  16.30  M: 11.24 ( 59.69%)  HT:  9.54  VT:  9.29  R:  9.47  RT:  6.24 (  37Kops/s)

Optimized:
        over_8888_8888 =  L1:  43.67  L2:  33.30  M: 16.32 ( 86.65%)  HT: 14.10  VT: 13.78  R: 12.96  RT:  7.85 (  39Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
66def909ad MIPS: DSPr2: Fix bug in over_n_8888_8888_ca/over_n_8888_0565_ca routines
After introducing the new PRNG (pseudorandom number generator), a bug in two
DSPr2 routines was revealed. It manifested as wrong results in the composite
and glyph tests, which caused make check to fail for MIPS DSPr2 optimizations.

Bug was in the calculation of the:
*dst = over (src, *dst) when ma == 0xffffffff

In this case src was not negated and shifted right by 24 bits, it was only
negated. When implementing this routine in the first place, I misplaced those
shifts, which allowed me to combine the code for the over operation and:
    UN8x4_MUL_UN8x4 (s, ma);
    UN8x4_MUL_UN8 (ma, srca);
    ma = ~ma;
    UN8x4_MUL_UN8x4_ADD_UN8x4 (d, ma, s);
So I decided to rewrite that piece of code from scratch. I changed logic, so
now assembly code mimics code from pixman-fast-path.c but processes two pixels
at a time. This code should be easier to debug and maintain.

The bug was revealed in commit b31a6962. Errors were detected by composite
and glyph tests.
2013-04-30 15:38:43 -04:00
Siarhei Siamashka
d768558ce1 sse2: faster bilinear interpolation (get rid of XOR instruction)
The old code was calculating horizontal weights for right pixels
in the following way (for simplicity assume 8-bit interpolation
precision):

  Start with "x = vx" and do increment "x += ux" after each pixel.
  In this case right pixel weight for interpolation can be calculated
  as "((x >> 8) ^ 0xFF) + 1", which is the same as "256 - (x >> 8)".

The new code instead:

  Starts with "x = -(vx + 1)", performs increment "x += -ux" after
  each pixel and calculates right weights as just "(x >> 8) + 1",
  eliminating the need for XOR operation in the inner loop.

So we have one instruction less on the critical path. Benchmarks
with "lowlevel-blt-bench -b src_8888_8888" using GCC 4.7.2 on
x86-64 system and default optimizations:

Intel Core i7 860 (2.8GHz):
    before: src_8888_8888 =  L1: 291.37  L2: 288.58  M:285.38
    after:  src_8888_8888 =  L1: 319.66  L2: 316.47  M:312.06

Intel Core2 T7300 (2GHz):
    before: src_8888_8888 =  L1: 121.95  L2: 118.38  M:118.52
    after:  src_8888_8888 =  L1: 128.82  L2: 125.12  M:124.88

Intel Atom N450 (1.67GHz):
    before: src_8888_8888 =  L1:  64.25  L2:  62.37  M: 61.80
    after:  src_8888_8888 =  L1:  64.23  L2:  62.37  M: 61.82

Inspired by the "sse2_bilinear_interpolation" function (single
pixel interpolation) from:
    http://lists.freedesktop.org/archives/pixman/2013-January/002575.html
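
The equivalence of the two schemes can be checked with a small scalar
sketch. It assumes, as the simplified description above does, 8-bit
weights taken from a 16-bit fraction held in a 16-bit lane; the function
names are made up and this is not the SSE2 code itself:

```c
#include <assert.h>
#include <stdint.h>

/* Old scheme: x tracks the 16-bit fraction; the weight needs an XOR. */
static unsigned
weight_old (uint16_t x)
{
    return ((x >> 8) ^ 0xFFu) + 1;        /* == 256 - (x >> 8) */
}

/* New scheme: x tracks -(fraction + 1), so no XOR is needed. */
static unsigned
weight_new (uint16_t neg_x)
{
    return (neg_x >> 8) + 1u;
}

/* Walks n pixels with both schemes; returns 1 if every weight agrees. */
static int
weights_agree (uint16_t vx, uint16_t ux, int n)
{
    uint16_t x = vx;
    uint16_t neg_x = (uint16_t) -(vx + 1);  /* same bits as ~vx */
    int i;

    for (i = 0; i < n; i++)
    {
        if (weight_old (x) != weight_new (neg_x))
            return 0;
        x = (uint16_t) (x + ux);           /* x += ux  */
        neg_x = (uint16_t) (neg_x - ux);   /* x += -ux */
    }
    return 1;
}
```

The invariant is that neg_x always equals the bitwise complement of x, so
its high byte is 0xFF minus the high byte of x.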
2013-04-28 23:22:41 +03:00
Siarhei Siamashka
59109f3293 test: larger 0xFF/0x00 filled clusters in random images for blitters-test
Current blitters-test program had difficulties detecting a bug in
over_n_8888_8888_ca implementation for MIPS DSPr2:

    http://lists.freedesktop.org/archives/pixman/2013-March/002645.html

In order to hit the buggy code path, two consecutive mask values had
to be equal to 0xFFFFFFFF because of loop unrolling. The current
blitters-test generates random images in such a way that each byte
has 25% probability for having 0xFF value. Hence each 32-bit mask
value has ~0.4% probability for 0xFFFFFFFF. Because we are testing
many compositing operations with many pixels, encountering at least
one 0xFFFFFFFF mask value reasonably fast is not a problem. If a
bug related to 0xFFFFFFFF mask value is artificially introduced into
over_n_8888_8888_ca generic C function, it gets detected on iteration
675591 of blitters-test (out of 2000000).

However two consecutive 0xFFFFFFFF mask values are much less likely
to be generated, so the bug was missed by blitters-test.

This patch addresses the problem by also randomly setting the 32-bit
values in images to either 0xFFFFFFFF or 0x00000000 (also with 25%
probability). This allows larger clusters of consecutive 0x00 or 0xFF
bytes in images, which unrolled or SIMD-optimized code may handle
with special shortcuts.
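
The probabilities quoted above for the old generator can be reproduced
with a few lines of C. This sketch assumes the bytes are independent and
looks at one given pair of consecutive words; the function names are
made up:

```c
#include <assert.h>

/* Probability that one 32-bit value is 0xFFFFFFFF when each of its
 * four bytes is independently 0xFF with probability p_byte. */
static double
p_all_ones_word (double p_byte)
{
    return p_byte * p_byte * p_byte * p_byte;
}

/* Probability that two given consecutive words are both 0xFFFFFFFF. */
static double
p_two_consecutive_words (double p_byte)
{
    double p = p_all_ones_word (p_byte);

    return p * p;
}
```

For p_byte = 0.25 the first function gives 1/256 (about 0.4%) and the
second gives 1/65536 (about 0.0015%), which is why the two-word case was
so much harder to hit.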
2013-04-28 22:14:47 +03:00
Stefan Weil
a99147d1ea Trivial spelling fixes in comments
They were found by codespell.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-04-27 04:08:45 -04:00
Peter Breitenlohner
9d0bb10312 Check for missing sqrtf() as, e.g., for Solaris 9
Signed-off-by: Peter Breitenlohner <peb@mppmu.mpg.de>
2013-04-08 14:33:25 -04:00
Søren Sandmann Pedersen
d8ac35af12 Improve precision of calculations in pixman-gradient-walker.c
The computations in pixman-gradient-walker.c currently take place at
very limited 8 bit precision which results in quite visible artefacts
in gradients. An example is the one produced by demos/linear-gradient
which currently looks like this:

    http://i.imgur.com/kQbX8nd.png

With the changes in this commit, the gradient looks like this:

    http://i.imgur.com/nUlyuKI.png

The images are also available here:

    http://people.freedesktop.org/~sandmann/gradients/before.png
    http://people.freedesktop.org/~sandmann/gradients/after.png

This patch computes pixels using floating point, but uses a faster
algorithm, which makes up for the loss of performance.

== Theory:

In both the new and the old algorithm, the various gradient
implementations compute a parameter x that indicates how far along the
gradient the current scanline is. The current algorithm has a cache of
the two color stops surrounding the last parameter; those are used in
a SIMD-within-register fashion in this way:

    t1 = walker->left_rb * idist + walker->right_rb * dist;

where dist and idist are the distances to the left and right color
stops respectively normalized to the distance between the left and
right stops. The normalization (which involves a division) is captured
in another cached variable "stepper". The cached values are recomputed
whenever the parameter moves in between two different stops (called
"reset" in the implementation).

Because idist and dist are computed in 8 bits only, a lot of
information is lost, which is quite visible as the image linked above
shows.

The new algorithm caches more information in the following way. When
interpolating between stops, the formula to be used is this:

     t = ((x - left) / (right - left));

     result = lc * (1 - t) + rc * t;

where

    - x is the parameter as computed by the main gradient code,
    - left is the position of the left color stop,
    - right is the position of the right color stop
    - lc is the color of the left color stop
    - rc is the color of the right color stop

That formula can also be written like this:

    result
      = lc * (1 - t) + rc * t;
      = lc + (rc - lc) * t
      = lc + (rc - lc) * ((x - left) / (right - left))
      = (rc - lc) / (right - left) * x +
      	       lc - (left * (rc - lc)) / (right - left)
      = s * x + b

where

    s = (rc - lc) / (right - left)

and

    b = lc - left * (rc - lc) / (right - left)
      = (lc * (right - left) - left * (rc - lc)) / (right - left)
      = (lc * right - rc * left) / (right - left)

To summarize, setting w = (right - left):

    s = (rc - lc) / w
    b = (lc * right - rc * left) / w

    r = s * x + b

Since s and b only depend on the two active stops, both can be cached
so that the computation only needs to do one multiplication and one
addition per pixel (followed by premultiplication of the alpha
channel). That is, seven multiplications in total, which is the same
number as the old SIMD-within-register implementation had.
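
For a single channel, the caching scheme can be sketched as below. The
type and function names are hypothetical, double is used instead of the
single precision the patch uses, and right != left is assumed:

```c
#include <assert.h>

/* Hypothetical cache for one channel between two color stops. */
typedef struct
{
    double s;  /* slope:     (rc - lc) / w                */
    double b;  /* intercept: (lc * right - rc * left) / w */
} channel_cache_t;

/* Recomputed only when x moves into a different pair of stops. */
static channel_cache_t
channel_cache_reset (double left, double lc, double right, double rc)
{
    channel_cache_t c;
    double w_rec = 1.0 / (right - left);  /* one reciprocal, reused */

    c.s = (rc - lc) * w_rec;
    c.b = (lc * right - rc * left) * w_rec;
    return c;
}

/* Per-pixel work: one multiplication and one addition. */
static double
channel_eval (const channel_cache_t *c, double x)
{
    return c->s * x + c->b;
}
```

Evaluating at x = left returns lc and at x = right returns rc, which is a
quick sanity check on the algebra above.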

== Implementation notes:

The new formula described above is implemented in single precision
floating point, and the eight divisions necessary to compute the
cached values are done by multiplication with the reciprocal of the
distance between the color stops.

The alpha values used in the cached computation are scaled by 255.0,
whereas the RGB values are kept in the [0, 1] interval. This ensures
that after premultiplication, all values will be in the [0, 255]
interval.

This scaling is done by first dividing all the channels by
257, and then later on dividing the r, g, b channels by 255. It would
be more natural to do all this scaling in only one place, but
inexplicably, that results in a (substantial) slowdown on Sandy Bridge
with GCC v 4.7.

== Performance impact (median of three runs of radial-perf-test):

   == Intel Sandy Bridge, Core i3 @ 1.2GHz

   Before: 0.014553
   After:  0.014410
   Change: 1.0% faster

   == AMD Barcelona @ 1.2 GHz

   Before: 0.021735
   After:  0.021328
   Change: 1.9% faster

I.e., slightly faster, though conceivably there could be a negative
impact on machines with a bigger difference between integer and
floating point performance.

V2:

- Use 's' and 'b' in the variable names instead of 'm' and 'd'. This
  way they match the explanation above

- Move variable declarations to the top of the function

- Remove unused stepper field

- Some formatting fixes

- Don't pointlessly include pixman-combine32.h

- Don't offset x for each pixel; go back to offsetting left_x and
  right_x at reset time. The offsets cancel out in the formula above,
  so there is no impact on the calculations.
2013-03-16 01:14:22 -04:00
Søren Sandmann Pedersen
a1c2331e0e Move the IS_ZERO() to pixman-private.h and rename to FLOAT_IS_ZERO()
Some upcoming changes to pixman-gradient-walker.c will need this
macro.
2013-03-11 22:41:55 -04:00
Søren Sandmann Pedersen
2c953e572f test: Add radial-perf-test, a microbenchmark for radial gradients
This benchmark renders one of the radial gradients used in the
swfdec-youtube cairo trace 500 times and reports the average time it
took.

V2: Update .gitignore
2013-03-11 22:41:45 -04:00
Søren Sandmann Pedersen
460faaa411 demos: Add linear-gradient demo program
This program displays a linear gradient from blue to yellow. Due to
limited precision in pixman-gradient-walker.c, it currently has some
ugly artefacts that gives it a 'brushed metal' appearance.

V2: Update .gitignore
2013-03-11 22:40:05 -04:00
Behdad Esfahbod
aaae3d8eef Remove unused macro 2013-03-08 06:00:00 -05:00
Nemanja Lukic
5feda20fc3 MIPS: DSPr2: Added more fast-paths for SRC operation:
- src_0888_8888_rev
- src_0888_0565_rev

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        src_0888_8888_rev =  L1:  51.88  L2:  42.00  M: 19.04 ( 88.50%)  HT: 15.27  VT: 14.62  R: 14.13  RT:  7.12 (  45Kops/s)
        src_0888_0565_rev =  L1:  31.96  L2:  30.90  M: 22.60 ( 75.03%)  HT: 15.32  VT: 15.11  R: 14.49  RT:  6.64 (  43Kops/s)

Optimized:
        src_0888_8888_rev =  L1: 222.73  L2: 113.70  M: 20.97 ( 97.35%)  HT: 18.31  VT: 17.14  R: 16.71  RT:  9.74 (  54Kops/s)
        src_0888_0565_rev =  L1: 100.37  L2:  74.27  M: 29.43 ( 97.63%)  HT: 22.92  VT: 21.59  R: 20.52  RT: 10.56 (  56Kops/s)
2013-02-27 14:40:51 +01:00
Nemanja Lukic
43914d68d1 MIPS: DSPr2: Added more fast-paths for OVER operation:
- over_8888_0565
- over_n_8_8

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_0565 =  L1:  14.30  L2:  13.22  M: 10.43 ( 41.56%)  HT: 12.51  VT: 12.95  R: 11.82  RT:  7.34 (  49Kops/s)
            over_n_8_8 =  L1:  12.77  L2:  16.93  M: 15.03 ( 29.94%)  HT: 10.78  VT: 10.72  R: 10.29  RT:  4.92 (  33Kops/s)

Optimized:
        over_8888_0565 =  L1:  26.03  L2:  22.92  M: 15.68 ( 62.43%)  HT: 16.19  VT: 16.27  R: 14.93  RT:  8.60 (  52Kops/s)
            over_n_8_8 =  L1:  62.00  L2:  55.17  M: 40.29 ( 80.23%)  HT: 26.77  VT: 25.64  R: 24.13  RT: 10.01 (  47Kops/s)
2013-02-27 14:39:45 +01:00
Julien Cristau
259f681187 Upload to unstable 2013-02-18 20:17:18 +01:00
Søren Sandmann Pedersen
6dfdd8534f Fix for infinite-loop test
The infinite loop detected by "affine-test 212944861" is caused by an
overflow in this expression:

    max_x = pixman_fixed_to_int (vx + (width - 1) * unit_x) + 1;

where (width - 1) * unit_x doesn't fit in a signed int. This causes
max_x to be too small so that this:

    src_width = 0;

    while (src_width < REPEAT_NORMAL_MIN_WIDTH && src_width <= max_x)
        src_width += src_image->bits.width;

results in src_width being 0. Later on when src_width is used for
repeat calculations, we get the infinite loop.

By casting unit_x to int64_t, the expression no longer overflows and
affine-test 212944861 and infinite-loop no longer loop forever.
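
The effect of the cast can be seen in a standalone sketch. The typedef,
macro and function names below are stand-ins rather than pixman's own,
and the input values used to exercise it are made up, not the ones from
affine-test 212944861:

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t;           /* stand-in for pixman_fixed_t */
#define FIXED_TO_INT(f) ((int) ((f) >> 16))

/* What 32-bit arithmetic effectively produced: the product wraps.
 * Done in unsigned math here so the sketch itself has no undefined
 * behavior; the conversion back to int32_t assumes two's complement. */
static int
max_x_wrapping (fixed_16_16_t vx, int width, fixed_16_16_t unit_x)
{
    uint32_t product = (uint32_t) (width - 1) * (uint32_t) unit_x;
    int32_t  sum = (int32_t) ((uint32_t) vx + product);

    return FIXED_TO_INT (sum) + 1;
}

/* With the int64_t cast the intermediate result no longer overflows. */
static int
max_x_widened (fixed_16_16_t vx, int width, fixed_16_16_t unit_x)
{
    int64_t sum = vx + (width - 1) * (int64_t) unit_x;

    return (int) (sum >> 16) + 1;
}
```

With vx = 0, width = 40000 and unit_x = 2.0 in 16.16 fixed point, the
wrapping version yields 14463 while the widened version yields 79999.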
(cherry picked from commit de60e2e0e3)
2013-02-18 19:58:06 +01:00
Søren Sandmann Pedersen
2156fb51b3 gtk-utils.c: Use cairo in show_image() rather than GdkPixbuf
GdkPixbufs are not premultiplied, so when using them to display pixman
images, there are some unnecessary conversions going on: First the image
is converted to non-premultiplied, and then GdkPixbuf premultiplies
before sending the result to the X server. These conversions may cause
the displayed image to not be exactly identical to the original.

This patch just uses a cairo image surface instead, which avoids these
conversions.

Also make the comment about sRGB a little more concise.
2013-02-15 18:57:24 -05:00
Ben Avison
5e207f825b Fix to lowlevel-blt-bench
The source, mask and destination buffers are initialised to 0xCC just after
they are allocated. Between each benchmark, there are a pair of memcpys,
from the destination buffer to the source buffer and back again (there are
no explanatory comments, but presumably this is an effort to flush the
caches). However, it has an unintended consequence, which is to change the
contents of the buffers on entry to subsequent benchmarks. This means it is
not a fair test: for example, with over_n_8888 (featured in the following
patches) it reports L2 and even M tests as being faster than the L1 test,
because after the L1 test, the source buffer is filled with fully opaque
pixels, for which over_n_8888 has a shortcut.

The fix here is simply to reverse the order of the memcpys, so src and
destination are both filled with 0xCC on entry to all tests.
2013-02-13 02:24:34 -05:00
Stefan Weil
d26f922dc1 sse2: Use uintptr_t in type casts from pointer to integral value
Some recent code added new type casts from pointer to unsigned long.
These type casts result in compiler warnings for systems like
MinGW-w64 (64 bit Windows) where sizeof(unsigned long) != sizeof(void *).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
dc80eb09e2 lookup_composite: Don't update cache in case of error
If we fail to find a composite function, don't update the fast path
cache with the dummy compositing function.

Also make the error message state that the bug is likely caused by
issues with thread local storage.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
4dced81c91 Turn on error logging at all times
While releasing 0.29.2 the distcheck run produced a number of error
messages that had to be fixed in 349015e1fc.
These were not caught before so nobody had actually run pixman with
debugging turned on. It's not the first time this has happened, see
5b0563f39e for example.

So this patch makes the return_if_fail() macros use unlikely() around
the expressions and then turns on error logging at all times. The
performance hit should be negligible since we were already evaluating the
expressions.

The place where DEBUG actually does cause a performance hit is in the
region selfcheck code, and that will still only be enabled in
development snapshots.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
f4c9492c12 pixman-compiler.h: Add unlikely() macro
When compiling with GCC this macro expands to __builtin_expect((expr), 0).
On other compilers, it just expands to (expr).
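
A sketch of what such a macro can look like, following the expansion described above. The checked_div() caller is hypothetical and only shows how an error check might be marked as the cold path.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__)
#  define unlikely(expr) (__builtin_expect ((expr), 0))
#else
#  define unlikely(expr) (expr)
#endif

/* Hypothetical caller: the failure branch is hinted as rarely taken. */
static int
checked_div (int a, int b, int *out)
{
    assert (out != NULL);

    if (unlikely (b == 0))
        return 0;          /* failure: *out is left untouched */

    *out = a / b;
    return 1;
}
```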
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
5ebb5ac380 utils.c: Increase acceptable deviation to 0.0064 in pixel_checker_t
The check-formats program reveals that the 8 bit pipeline cannot meet
the current 0.004 acceptable deviation specified in utils.c, so we
have to increase it. Some of the failing pixels were captured in
pixel-test, which with this commit now passes.

== a4r4g4b4 DISJOINT_XOR a8r8g8b8 ==

The DISJOINT_XOR operator applied to an a4r4g4b4 source pixel of
0xd0c0 and a destination pixel of 0x5300ea00 results in the exact
value:

    fa = (1 - da) / sa = (1 - 0x53 / 255.0) / (0xd / 15.0) = 0.7782
    fb = (1 - sa) / da = (1 - 0xd / 15.0) / (0x53 / 255.0) = 0.4096

    r = fa * (0xc / 15.0) + fb * (0xea / 255.0) = 0.99853

But when computing in 8 bits, we get:

    fa8 = ((255 - 0x53) * 255 + 0xdd / 2) / 0xdd = 0xc6
    fb8 = ((255 - 0xdd) * 255 + 0x53 / 2) / 0x53 = 0x68

    r8 = (fa8 * 0xcc + 127) / 255 + (fb8 * 0xea + 127) / 255 = 0xfd

and

    0xfd / 255.0 = 0.9921568627450981

for a deviation of 0.00637118610187, which we then have to consider
acceptable given the current implementation.

By switching to computing the result with

   r = (fa * s + fb * d + 127) / 255

rather than

   r = (fa * s + 127) / 255 + (fb * d + 127) / 255

the deviation would be only 0.00244961747442, so at some point it may
be worth doing either this, or switching to floating point for
operators that involve divisions.
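
The 8-bit arithmetic above can be reproduced with a short sketch. The helper names div8, blend_split and blend_joint are made up for this illustration, and no clamping to 255 is done.

```c
#include <assert.h>
#include <stdint.h>

/* Rounded quotient num / den scaled to 0..255, as in the fa8/fb8 lines. */
static uint32_t
div8 (uint32_t num, uint32_t den)
{
    assert (den != 0);
    return (num * 255 + den / 2) / den;
}

/* Current form: each product is rounded separately, then summed. */
static uint32_t
blend_split (uint32_t fa, uint32_t s, uint32_t fb, uint32_t d)
{
    return (fa * s + 127) / 255 + (fb * d + 127) / 255;
}

/* Alternative form: sum the products first, then round once. */
static uint32_t
blend_joint (uint32_t fa, uint32_t s, uint32_t fb, uint32_t d)
{
    return (fa * s + fb * d + 127) / 255;
}
```

For the pixel pair in this message, blend_split() gives 0xfd and blend_joint() gives 0xfe; 0xfe / 255.0 is about 0.99608, which is where the smaller deviation of roughly 0.00245 comes from.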

Note that the conversion from 4 bits to 8 bits does not cause any
error in this case because both rounding and bit replication produce
an exact result when the number of from-bits divides the number of
to-bits.

== a8r8g8b8 OVER r5g6b5 ==

When OVER compositing the a8r8g8b8 pixel 0x0f00c300 with the x14r6g6b6
pixel 0x03c0, the true floating point value of the resulting green
channel is:

   0xc3 / 255.0 + (1.0 - 0x0f / 255.0) * (0x0f / 63.0) = 0.9887955

but when compositing 8 bit values, where the 6-bit green channel is
converted to 8 bit through bit replication, the 8-bit result is:

   0xc3 + ((255 - 0x0f) * 0x3c + 127) / 255 = 251

which corresponds to a real value of 0.984314. The difference from the
true value is 0.004482 which is bigger than the acceptable deviation
of 0.004. So, if we were to compute all the CONJOINT/DISJOINT
operators in floating point, or otherwise make them more accurate, the
acceptable deviation could be set at 0.0045.

If we were doing the 6-bit conversion with rounding:

   (x / 63.0 * 255.0 + 0.5)

instead of bit replication, the deviation in this particular case
would be only 0.0005, so we may want to consider this at some
point.
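
The two 6-bit to 8-bit conversions compared above, as a small sketch. The function names are invented for this example; over8() handles a single channel only.

```c
#include <assert.h>
#include <stdint.h>

/* 6 -> 8 bits by bit replication: the top two bits fill the low bits. */
static uint32_t
expand6_replicate (uint32_t g6)
{
    assert (g6 < 64);
    return (g6 << 2) | (g6 >> 4);
}

/* 6 -> 8 bits by rounding g6 / 63 * 255 to the nearest integer. */
static uint32_t
expand6_round (uint32_t g6)
{
    assert (g6 < 64);
    return (uint32_t) (g6 / 63.0 * 255.0 + 0.5);
}

/* One channel of OVER in 8 bits: s + (255 - sa) * d / 255, rounded. */
static uint32_t
over8 (uint32_t s, uint32_t sa, uint32_t d)
{
    return s + ((255 - sa) * d + 127) / 255;
}
```

Bit replication turns 0x0f into 0x3c and the composite yields 251; rounding turns 0x0f into 0x3d and the composite yields 252, i.e. about 0.98824, which is within roughly 0.0006 of the true value.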
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
f2ba7fe1d8 test: Add new pixel-test regression test
This test program contains a table of individual operator/pixel
combinations. For each pixel combination, images of various sizes are
filled with the pixels and then composited. The result is then
verified against the output of do_composite(). If the result doesn't
match, detailed error information is printed.

The initial 14 pixel combinations currently all fail.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
6781636740 a1-trap-test: Add tests for operator_name() and format_name()
The check-formats.c test depends on the exact format of the strings
returned from these functions, so add a test here.

a1-trap-test isn't the ideal place, but it seems like overkill to add
a new test just for these trivial checks.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
d1434d112c test: Add new check-formats utility
Given an operator and two formats, this program will composite and
check all pixels where the red and blue channels are 0. That is, if
the two formats are a8r8g8b8 and a4r4g4b4, all source pixels matching
the mask

    0xff00ff00

are composited with the given operator against all destination pixels
matching the mask

    0xf0f0

and the result is then verified against the do_composite() function
that was moved to utils.c earlier.

This program reveals that a number of operators and format
combinations are not computed to within the precision currently
accepted by pixel_checker_t. For example:

    check-formats over a8r8g8b8 r5g6b5 | grep failed | wc -l
    30

reveals that there are 30 pixel combinations where OVER produces
insufficiently precise results for the a8r8g8b8 and r5g6b5 formats.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
1820131fe6 utils.[ch]: Add pixel_checker_get_masks()
This function returns the a, r, g, and b masks corresponding to the
pixel checker's format.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
5eb61f72ea test/utils.[ch]: Add pixel_checker_convert_pixel_to_color()
This function takes a pixel in the format corresponding to the pixel
checker, and converts to a color_t.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
3ae717f71a test: Move do_composite() function from composite.c to utils.c
So that it can be used in other tests.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
958bd334b3 Post-release version bump to 0.29.3 2013-01-29 21:42:02 -05:00
Søren Sandmann Pedersen
a56707e23b Pre-release version bump to 0.29.2 2013-01-29 21:14:51 -05:00
Søren Sandmann Pedersen
349015e1fc stresstest: Ensure that the rasterizer is only given alpha formats
In c2cb303d33, return_if_fail()s were added to
prevent the trapezoid rasterizers from being called with non-alpha
formats. However, stress-test actually does call the rasterizers with
non-alpha formats, but because _pixman_log_error() is disabled in
versions with an odd minor number, the errors never materialized.

Fix this by changing the argument to random format to an enum of three
values DONT_CARE, PREFER_ALPHA, or REQUIRE_ALPHA, and then in the
switch that calls the trapezoid rasterizers, pass the appropriate
value for the function in question.
2013-01-29 20:43:51 -05:00
Søren Sandmann Pedersen
afde862928 Change default GPGKEY to 3892336E, which is soren.sandmann@gmail.com
The old one belongs to the email address sandmann@daimi.au.dk, which
doesn't work anymore.

Also use gpg to get the name and address for the "(Signed by ...)"
line since that works more reliably for me than using git.
2013-01-29 15:24:22 -05:00
Ben Avison
69a7a9b6b6 Improve L1 and L2 benchmark tests for caches that don't use allocate-on-write
In particular this affects single-core ARMs (e.g. ARM11, Cortex-A8), which
are usually configured this way. For other CPUs, this should only add a
constant time, which will be cancelled out by the EXCLUDE_OVERHEAD runs.

The problems were caused by cachelines becoming permanently evicted from
the cache, because the code that was intended to pull them back in again on
each iteration assumed too long a cache line (for the L1 test) or failed to
read memory beyond the first pixel row (for the L2 test). Also, the reloading
of the source buffer was unnecessary.

These issues were identified by Siarhei in this post:
http://lists.freedesktop.org/archives/pixman/2013-January/002543.html
2013-01-29 15:23:05 -05:00
Søren Sandmann Pedersen
1fa67f499d pixman-combine-float.c: Use IS_ZERO() in clip_color() and set_sat()
The clip_color() function has some checks to avoid division by zero,
but they are done by comparing the value to 4 * FLT_EPSILON, where a
better choice is the IS_ZERO() macro that compares to +/- FLT_MIN.

In set_sat(), the check is that *max > *min before dividing by *max -
*min, but that has the potential problem that interactions between GCC
optimizations and 80 bit x87 registers could mean that (*max > *min) is
true in 80 bits, but (*max - *min) is 0 in 32 bits, so that the
division by zero is not prevented. Using IS_ZERO() here as well
prevents this.
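
A sketch of the kind of check described above, assuming IS_ZERO() compares against +/- FLT_MIN as stated. The scale_to_range() helper is hypothetical and only loosely modelled on the division in set_sat().

```c
#include <assert.h>
#include <float.h>

/* True when f lies strictly between -FLT_MIN and +FLT_MIN. */
#define IS_ZERO(f) (-FLT_MIN < (f) && (f) < FLT_MIN)

/* Hypothetical helper: scale (v - min) by sat / (max - min), or return 0
 * when the range is effectively zero. Testing the difference itself,
 * rather than max > min, is the point of the change. */
static float
scale_to_range (float v, float min, float max, float sat)
{
    float range = max - min;

    assert (max >= min);

    if (IS_ZERO (range))
        return 0.0f;

    return (v - min) * sat / range;
}
```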
2013-01-29 15:23:05 -05:00
Ben Avison
7e53e58664 ARMv6: Replacement add_8_8, over_8888_8888, over_8888_n_8888 and over_n_8_8888 routines
Improved by adding preloads, combining writes and using the SEL
instruction.

add_8_8

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  62.1   0.2      543.4  12.4    100.0%      +774.9%
L2  38.7   0.4      116.8  1.7     100.0%      +201.8%
M   40.0   0.1      110.1  0.5     100.0%      +175.3%
HT  30.9   0.2      43.4   0.5     100.0%      +40.4%
VT  30.6   0.3      39.2   0.5     100.0%      +28.0%
R   21.3   0.2      35.4   0.4     100.0%      +66.6%
RT  8.6    0.2      10.2   0.3     100.0%      +19.4%

over_8888_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  32.3   0.1      38.0   0.2     100.0%      +17.7%
L2  15.9   0.4      30.6   0.5     100.0%      +92.8%
M   13.3   0.0      25.6   0.0     100.0%      +92.9%
HT  10.5   0.1      15.5   0.1     100.0%      +47.1%
VT  10.4   0.1      14.6   0.1     100.0%      +40.8%
R   10.3   0.1      15.8   0.1     100.0%      +53.3%
RT  6.0    0.1      7.6    0.1     100.0%      +25.9%

over_8888_n_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  17.6   0.1      21.0   0.1     100.0%      +19.2%
L2  11.2   0.2      19.2   0.1     100.0%      +71.2%
M   10.2   0.0      19.6   0.0     100.0%      +92.6%
HT  8.4    0.0      11.9   0.1     100.0%      +41.7%
VT  8.3    0.0      11.3   0.1     100.0%      +36.4%
R   8.3    0.0      11.8   0.1     100.0%      +43.1%
RT  5.1    0.1      6.2    0.1     100.0%      +21.3%

over_n_8_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  17.5   0.1      22.8   0.8     100.0%      +30.1%
L2  14.2   0.3      21.7   0.2     100.0%      +52.6%
M   12.0   0.0      22.3   0.0     100.0%      +84.8%
HT  10.5   0.1      14.1   0.1     100.0%      +34.5%
VT  10.0   0.1      13.5   0.1     100.0%      +35.3%
R   9.4    0.0      12.9   0.2     100.0%      +37.7%
RT  5.5    0.1      6.5    0.2     100.0%      +19.2%
2013-01-29 21:48:03 +02:00
Ben Avison
f87dfd6f37 ARMv6: New conversion routines
There was no previous attempt at accelerating these specifically for
ARMv6.

src_x888_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  96.7   0.5      270.4  2.6     100.0%      +179.5%
L2  44.6   2.7      110.6  9.7     100.0%      +148.0%
M   26.9   0.1      87.6   0.5     100.0%      +226.1%
HT  19.3   0.2      37.5   0.4     100.0%      +93.7%
VT  18.6   0.1      33.7   0.4     100.0%      +81.6%
R   18.4   0.1      32.2   0.3     100.0%      +75.2%
RT  9.2    0.2      12.1   0.3     100.0%      +31.4%

src_0565_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  37.0   0.3      66.9   0.2     100.0%      +80.8%
L2  30.3   0.2      55.9   0.3     100.0%      +84.4%
M   25.9   0.0      62.3   0.2     100.0%      +140.3%
HT  15.2   0.1      33.1   0.3     100.0%      +116.9%
VT  15.1   0.1      30.7   0.3     100.0%      +103.6%
R   14.2   0.1      27.6   0.3     100.0%      +94.0%
RT  6.0    0.1      11.2   0.3     100.0%      +87.2%
2013-01-29 21:47:59 +02:00
Ben Avison
a0f59f3b28 ARMv6: New blit routines
These are usable either as various composite operations, or via the
top-level function pixman_blt() which now does some blitting for the
first time on an ARMv6 platform (previously it just returned FALSE).

src_8888_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  414.5  9.4      445.8  3.6     100.0%      +7.6%
L2  93.3   20.7     114.5  12.9    100.0%      +22.7%
M   57.0   0.2      89.2   0.5     100.0%      +56.4%
HT  28.7   0.3      39.6   0.4     100.0%      +37.9%
VT  25.5   0.2      35.3   0.4     100.0%      +38.4%
R   20.1   0.1      33.8   0.3     100.0%      +67.8%
RT  7.8    0.2      12.7   0.4     100.0%      +62.7%

src_0565_0565

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  397.4  6.1      412.5  5.2     100.0%      +3.8%
L2  143.2  10.9     141.9  6.5     68.9%       -0.9%  (insignificant)
M   90.7   0.4      133.5  0.7     100.0%      +47.1%
HT  38.6   0.3      53.7   0.7     100.0%      +39.0%
VT  33.0   0.3      47.3   0.6     100.0%      +43.3%
R   25.7   0.2      42.1   0.5     100.0%      +64.1%
RT  8.0    0.2      13.3   0.3     100.0%      +65.6%

src_8_8

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  716.5  9.8      768.2  20.4    100.0%      +7.2%
L2  246.2  12.7     260.5  8.8     100.0%      +5.8%
M   146.8  0.7      227.9  0.7     100.0%      +55.2%
HT  44.9   0.6      62.1   1.0     100.0%      +38.2%
VT  35.6   0.4      53.4   0.7     100.0%      +50.0%
R   29.7   0.3      48.2   0.6     100.0%      +62.2%
RT  8.6    0.2      12.9   0.4     100.0%      +49.3%
2013-01-29 21:47:54 +02:00
Ben Avison
3cff56c5b0 ARMv6: New fill routines
Note that this also effectively accelerates src_n_8888, src_n_0565 and
src_n_8 composite types, because of the fast paths in
pixman-fast-path.c implemented by fast_composite_solid_fill(), which
end up dispatching these platform-specific fill routines.

src_n_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  157.3  1.1      574.2  8.7     100.0%      +265.0%
L2  94.2   0.5      364.8  4.2     100.0%      +287.3%
M   92.7   0.4      358.7  1.1     100.0%      +287.1%
HT  68.5   0.9      133.6  4.0     100.0%      +95.2%
VT  61.3   0.8      111.8  2.6     100.0%      +82.4%
R   61.1   0.9      108.7  2.8     100.0%      +78.1%
RT  24.6   1.0      28.6   1.6     100.0%      +16.0%

src_n_0565

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  157.4  1.0      983.1  38.5    100.0%      +524.6%
L2  93.6   0.5      696.0  14.3    100.0%      +643.4%
M   92.7   0.4      680.5  1.0     100.0%      +634.0%
HT  68.3   0.9      160.3  6.6     100.0%      +134.6%
VT  61.1   0.8      130.1  3.4     100.0%      +112.9%
R   61.0   0.8      125.4  4.1     100.0%      +105.7%
RT  24.9   1.3      29.5   1.5     100.0%      +18.2%

src_n_8

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  154.7  1.0      1324.4 48.5    100.0%      +756.3%
L2  92.4   0.4      1178.4 10.9    100.0%      +1175.6%
M   92.9   0.4      1275.7 2.1     100.0%      +1273.5%
HT  68.2   1.0      169.8  5.5     100.0%      +149.0%
VT  61.2   1.0      138.5  3.6     100.0%      +126.3%
R   61.3   0.9      130.1  3.8     100.0%      +112.4%
RT  25.5   1.3      29.2   1.9     100.0%      +14.6%
2013-01-29 21:47:49 +02:00
Ben Avison
2e173326aa ARMv6: Lay the groundwork for later patches in the series
Move the entire contents of pixman-arm-simd-asm.S to a new file;
ultimately this will only retain the scaled operations, so it is
named pixman-arm-simd-asm-scaled.S. Added new header file
pixman-arm-simd-asm.h, containing the macros which are the basis of
all the new ARMv6 implementations, although at this point in the
series, nothing uses them and the library should be binary-identical.
2013-01-29 21:47:42 +02:00
Søren Sandmann Pedersen
65fc1adb65 demo/scale: Add a spin button to set the number of subsample bits
For large upscalings the level of subsampling for the filter has a
quite visible effect, so make it settable in the UI so that people can
experiment with various values.
2013-01-27 23:06:28 -05:00
Siarhei Siamashka
ed39992564 Use pixman_transform_point_31_16() from pixman_transform_point()
Old functions pixman_transform_point() and pixman_transform_point_3d()
now become just wrappers for pixman_transform_point_31_16() and
pixman_transform_point_31_16_3d(). Eventually their uses should be
completely eliminated in the pixman code and replaced with their
extended range counterparts. This is needed in order to be able
to correctly handle any matrices and parameters that may come
to pixman from the code responsible for XRender implementation.
2013-01-27 20:50:38 +02:00
Siarhei Siamashka
5a78d74ccc test: Added matrix-test for testing projective transform accuracy
This test uses __float128 data type when it is available
for implementing a "perfect" reference implementation. The
output from pixman_transform_point_31_16() and
pixman_transform_point_31_16_affine() is compared with the
reference implementation to make sure that the rounding
errors may only show up in a single least significant bit.

The platforms and compilers, which do not support __float128
data type, can rely on crc32 checksum for the pseudorandom
transform results.
2013-01-27 20:50:31 +02:00
Siarhei Siamashka
09600ae7e3 configure.ac: Added detection for __float128 support
GCC supports 128-bit floating point data type on some platforms (including
but not limited to x86 and x86-64). This may be useful for tests, which
need perfectly accurate reference implementations of certain algorithms.
2013-01-27 20:50:26 +02:00
Siarhei Siamashka
c3deb8334a Add higher precision "pixman_transform_point_*" functions
The following new functions are added:

pixman_transform_point_31_16_3d() -
    Calculates the product of a matrix and a vector.

pixman_transform_point_31_16() -
    Calculates the product of a matrix and a vector, then converts
    the resulting homogeneous vector [x, y, z] to the cartesian
    form [x', y', 1], where x' = x / z, and y' = y / z.

pixman_transform_point_31_16_affine() -
    A faster sibling of the other two functions, which assumes affine
    transformation, where the bottom row of the matrix is [0, 0, 1] and
    the last element of the input vector is set to 1.

These functions transform a point with 31.16 fixed point coordinates from
the destination space to a point with 48.16 fixed point coordinates in
the source space.

The results are accurate and the rounding errors may only show up in
the least significant bit. No overflows are possible for the affine
transformations as long as the input data is provided in 31.16 format.
In the case of projective transformations, some output values may be not
representable using 48.16 fixed point format. In this case the results
are clamped to return maximum or minimum 48.16 values (so that the caller
can at least handle NONE and PAD repeats correctly).
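
As an illustration of the affine case only, here is a rough sketch of multiplying 16.16 matrix entries by 31.16 coordinates without overflowing 64 bits. It is not the library's implementation: the function names are invented, "31.16" is assumed to mean an integer part in [-2^30, 2^30), and each term is rounded separately, so the error bound is looser than the one stated above.

```c
#include <assert.h>
#include <stdint.h>

/* floor ((x + 0x8000) / 65536): drop 16 fraction bits, rounding half up. */
static int64_t
round_shift16 (int64_t x)
{
    int64_t y = x + 0x8000;
    int64_t q = y / 65536;

    if (y % 65536 < 0)
        q--;
    return q;
}

/* Multiply a 16.16 matrix entry by a 31.16 coordinate. The coordinate is
 * split into integer and fraction parts so that every intermediate
 * product fits in 64 bits; the result has 16 fraction bits. */
static int64_t
mul_16_16_by_31_16 (int32_t m, int64_t v)
{
    int64_t lo = v & 0xFFFF;          /* fraction bits, 0..65535 */
    int64_t hi = (v - lo) / 65536;    /* integer part (exact division) */

    /* Assumed input range for this sketch. */
    assert (hi >= -(INT64_C (1) << 30) && hi < (INT64_C (1) << 30));

    return (int64_t) m * hi + round_shift16 ((int64_t) m * lo);
}

/* Affine case: the bottom matrix row is [0, 0, 1] and the input z is 1,
 * so the third column is added directly as a 16.16 translation. */
static void
affine_point (const int32_t m[2][3], int64_t x, int64_t y,
              int64_t *out_x, int64_t *out_y)
{
    *out_x = mul_16_16_by_31_16 (m[0][0], x)
           + mul_16_16_by_31_16 (m[0][1], y) + m[0][2];
    *out_y = mul_16_16_by_31_16 (m[1][0], x)
           + mul_16_16_by_31_16 (m[1][1], y) + m[1][2];
}
```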
2013-01-27 20:49:43 +02:00
Siarhei Siamashka
a47ed2c311 Faster fetch for the C variant of r5g6b5 src/dest iterator
Processing two pixels at once is used to reduce the number of
arithmetic operations.

The speedup relative to the generic fetch_scanline_r5g6b5() from
"pixman-access.c" (pixman was compiled with gcc 4.7.2):

    MIPS 74K        480MHz  :  20.32 MPix/s ->  26.47 MPix/s
    ARM11           700MHz  :  34.95 MPix/s ->  38.22 MPix/s
    ARM Cortex-A8  1000MHz  :  87.44 MPix/s -> 100.92 MPix/s
    ARM Cortex-A9  1700MHz  : 150.95 MPix/s -> 158.13 MPix/s
    ARM Cortex-A15 1700MHz  : 148.91 MPix/s -> 155.42 MPix/s
    IBM Cell PPU   3200MHz  :  75.29 MPix/s ->  98.33 MPix/s
    Intel Core i7  2800MHz  : 257.02 MPix/s -> 376.93 MPix/s

That's the performance for C code (SIMD and assembly optimizations
are disabled via PIXMAN_DISABLE environment variable).
2013-01-27 20:48:31 +02:00
Siarhei Siamashka
e66fd5ccb6 Faster write-back for the C variant of r5g6b5 dest iterator
Unrolling loops improves performance, so just use it here.

Also GCC can't properly optimize this code for RISC processors and
allocate 0x1F001F constant in a register. Because this constant is
too large to be represented as an immediate operand in instructions,
GCC inserts some redundant arithmetic. This problem can be worked around
by explicitly using a variable for 0x1F001F constant and also initializing
it by a read from another volatile variable. In this case GCC is forced
to allocate a register for it, because it is not seen as a constant anymore.
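
The workaround can be sketched roughly as follows. The names are illustrative and the loop is left un-unrolled for brevity; only the idea of loading the 0x1F001F mask from a volatile object into a local variable is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Reading the mask through a volatile object keeps the compiler from
 * treating it as a compile-time constant. */
static const volatile uint32_t volatile_x1F001F = 0x1F001F;

/* Pack one x8r8g8b8 pixel into r5g6b5; bits above 15 are scratch. */
static uint32_t
pack_8888_to_0565 (uint32_t s, uint32_t x1F001F)
{
    uint32_t a = (s >> 3) & x1F001F;   /* R5 in bits 16..20, B5 in bits 0..4 */
    uint32_t b = s & 0xFC00;           /* G6 in bits 10..15 */

    a |= a >> 5;                       /* R5 moves down to bits 11..15 */
    a |= b >> 5;                       /* G6 moves down to bits 5..10 */
    return a;
}

/* Store a scanline, loading the mask once outside the loop. */
static void
store_scanline_0565 (uint16_t *dst, const uint32_t *src, int width)
{
    uint32_t x1F001F = volatile_x1F001F;
    int i;

    assert (width >= 0);
    for (i = 0; i < width; i++)
        dst[i] = (uint16_t) pack_8888_to_0565 (src[i], x1F001F);
}
```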

The speedup relative to the generic store_scanline_r5g6b5() from
"pixman-access.c" (pixman was compiled with gcc 4.7.2):

    MIPS 74K        480MHz  :  33.22 MPix/s ->  43.42 MPix/s
    ARM11           700MHz  :  50.16 MPix/s ->  78.23 MPix/s
    ARM Cortex-A8  1000MHz  : 117.75 MPix/s -> 196.34 MPix/s
    ARM Cortex-A9  1700MHz  : 177.04 MPix/s -> 320.32 MPix/s
    ARM Cortex-A15 1700MHz  : 231.44 MPix/s -> 261.64 MPix/s
    IBM Cell PPU   3200MHz  : 130.25 MPix/s -> 145.61 MPix/s
    Intel Core i7  2800MHz  : 502.21 MPix/s -> 721.73 MPix/s

That's the performance for C code (SIMD and assembly optimizations
are disabled via PIXMAN_DISABLE environment variable).
2013-01-27 20:48:26 +02:00
Siarhei Siamashka
a9f6669416 Added C variants of r5g6b5 fetch/write-back iterators
Adding specialized iterators for r5g6b5 color format allows us to work
on fine tuning performance of r5g6b5 fetch/write-back operations in the
pixman general "fetch -> combine -> store" pipeline.

These iterators also make "src_x888_0565" fast path redundant, so it can
be removed.
2013-01-27 20:48:22 +02:00
Chris Wilson
794033ed43 Eliminate duplicate copies of channel flags for pixman_image_composite32()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:16 +00:00
Chris Wilson
a59f081df4 Always return a valid function from lookup_combiner()
We should always have at least a C combiner available, so we never
expect the search to fail. If it does, emit an error and return a
dummy function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:16 +00:00
Chris Wilson
520230914b Always return a valid function from lookup_composite()
We never expect to fail to find the appropriate function as the
general_composite_rect should always match. So if somehow we fall through
the search, emit a _pixman_log_error() and return a dummy function.

Note that we remove some conditionals and a level of indentation hence a
large amount of code movement. This also reveals that in a few places we
are duplicating stack variables that can be eliminated later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:15 +00:00
Chris Wilson
b283c864a3 sse2: Add fast paths for bilinear source with a solid mask
Based on the existing sse2_8888_n_8888 nearest scaling routines.

fishbowl on an i5-2500: 60.9s -> 56.9s

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:15 +00:00
Chris Wilson
d00ce40912 sse2: Add a fast path for add_n_8_8888
This path is being exercised by compositing of trapezoids for clipmasks, for
instance as used in the firefox-asteroids cairo-trace.

IVB i7-3720qm ./tests/lowlevel-blt-bench add_n_8_8888:

reference memcpy speed = 14846.7MB/s (3711.7MP/s for 32bpp fills)

before: L1: 681.10  L2: 735.14  M:701.44 ( 28.35%)  HT:283.32  VT:213.23  R:208.93  RT: 77.89 ( 793Kops/s)

after:  L1: 992.91  L2:1017.33  M:982.58 ( 39.88%)  HT:458.93  VT:332.32  R:326.13  RT:136.66 (1287Kops/s)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:15 +00:00
Chris Wilson
7ced3beec9 sse2: Add a fast path for add_n_8888
This path is being exercised by inplace compositing of trapezoids, for
instance as used in the firefox-asteroids cairo-trace.

IVB i7-3720qm ./tests/lowlevel-blt-bench add_n_8888:

reference memcpy speed = 14918.3MB/s (3729.6MP/s for 32bpp fills)

before: L1:1752.44  L2:2259.48  M:2215.73 ( 58.80%)  HT:589.49   VT:404.04   R:424.69  RT:134.68 (1182Kops/s)

after:  L1:3931.21  L2:6132.78  M:3440.17 ( 92.24%)  HT:1337.70  VT:1357.64  R:1270.27  RT:359.78 (2161Kops/s)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:15 +00:00
Jeff Muizelaar
b7f523e3bc Add a version of bilinear_interpolation for precision <=4
Having 4 or fewer bits means we can do two components at
a time in a single 32 bit register.
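
A sketch of the idea, assuming 4-bit weights (0..15) and a8r8g8b8 pixels; the function name is made up. The four weights always sum to 256, so an 8-bit channel multiplied by the weights stays within 16 bits and two channels can share one 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t
bilinear_4bit (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
               uint32_t distx, uint32_t disty)
{
    uint32_t w_tl, w_tr, w_bl, w_br, lo, hi;

    assert (distx < 16 && disty < 16);

    w_br = distx * disty;
    w_tr = (distx << 4) - w_br;                       /* distx * (16 - disty) */
    w_bl = (disty << 4) - w_br;                       /* (16 - distx) * disty */
    w_tl = 256 - (distx << 4) - (disty << 4) + w_br;  /* (16 - distx) * (16 - disty) */

    /* Red and blue: one channel in each 16-bit half of the word. */
    lo  = (tl & 0xff00ff) * w_tl;
    lo += (tr & 0xff00ff) * w_tr;
    lo += (bl & 0xff00ff) * w_bl;
    lo += (br & 0xff00ff) * w_br;

    /* Alpha and green, shifted down into the same positions. */
    hi  = ((tl >> 8) & 0xff00ff) * w_tl;
    hi += ((tr >> 8) & 0xff00ff) * w_tr;
    hi += ((bl >> 8) & 0xff00ff) * w_bl;
    hi += ((br >> 8) & 0xff00ff) * w_br;

    /* Divide by 256 and put the channels back in place. */
    return ((lo >> 8) & 0xff00ff) | (hi & 0xff00ff00);
}
```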

Here are the results for firefox-fishtank on a Pandaboard with
4.6.3 and PIXMAN_DISABLE="arm-neon"

Before:
[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image           t-firefox-fishtank    7.841    7.910   0.70%    6/6

After:
[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image           t-firefox-fishtank    6.951    6.995   1.11%    6/6
2013-01-25 13:14:37 -05:00
Ben Avison
24e83cae64 Tweaks to lowlevel-blt-bench
This adds two extra tests, src_n_8 and src_8_8, which I have been
using to benchmark my ARMv6 changes.

I'd also like to propose that it requires an exact test name as the
executable's argument, as achieved by this strstr to strcmp change.
Without this, it is impossible to only benchmark (for example)
add_8_8, add_n_8 or src_n_8, due to those also being substrings of
many other test names.
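
The difference between the two matching rules, as a small sketch with invented helper names:

```c
#include <assert.h>
#include <string.h>

/* Old behaviour: a test runs if the pattern occurs anywhere in its name. */
static int
matches_substring (const char *test_name, const char *pattern)
{
    assert (test_name != NULL && pattern != NULL);
    return strstr (test_name, pattern) != NULL;
}

/* Proposed behaviour: a test runs only if the names are identical. */
static int
matches_exact (const char *test_name, const char *pattern)
{
    assert (test_name != NULL && pattern != NULL);
    return strcmp (test_name, pattern) == 0;
}
```

With substring matching, asking for src_n_8 would also select src_n_8888; with exact matching it selects only src_n_8.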
2013-01-25 11:13:07 -05:00
Søren Sandmann Pedersen
b527a0e615 test: Use operator_name() and format_name() in composite.c
With the operator_name() and format_name() functions there is no
longer any reason for composite.c to have its own table of format and
operator names.
2013-01-23 12:24:31 -05:00
Søren Sandmann Pedersen
4eb9a24aba utils.[ch]: Add new format_name() function
This function returns the name of the given format code, which is
useful for printing out debug information. The function is written as
a switch without a default value so that the compiler will warn if new
formats are added in the future. The fake formats used in the fast
path tables are also recognized.

The function is used in alpha_map.c, where it replaces an existing
format_name() function, and in blitters-test.c, affine-test.c, and
scaling-test.c.
2013-01-23 12:24:31 -05:00
Søren Sandmann Pedersen
1676b49389 test/utils.[ch]: Add new function operator_name()
This function returns the name of the given operator, which is useful
for printing out debug information. The function is done as a switch
without a default value so that the compiler will warn if new
operators are added in the future.
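
A sketch of the pattern, using a hypothetical three-value enum in place of the real operator type. Because the switch has no default label, a compiler option such as GCC's -Wswitch can flag an enumerator that is added later but not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum standing in for the real operator type. */
typedef enum { DEMO_OP_CLEAR, DEMO_OP_SRC, DEMO_OP_OVER } demo_op_t;

static const char *
demo_operator_name (demo_op_t op)
{
    const char *name = NULL;

    /* Deliberately no default: label here. */
    switch (op)
    {
    case DEMO_OP_CLEAR: name = "CLEAR"; break;
    case DEMO_OP_SRC:   name = "SRC";   break;
    case DEMO_OP_OVER:  name = "OVER";  break;
    }

    assert (name != NULL);   /* value outside the enum */
    return name;
}
```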

The function is used in affine-test.c, scaling-test.c, and
blitters-test.c.
2013-01-23 12:24:31 -05:00
Søren Sandmann Pedersen
8d85311143 README: Add guidelines on how to contribute patches
Ben Avison pointed out here:

   http://lists.freedesktop.org/archives/pixman/2013-January/002485.html

that there isn't really any documentation about how to submit patches
to pixman. This patch adds some information to the README file.

v2: Incorporate some comments from Ben Avison
v3: Change gitweb URL to cgit
2013-01-23 12:22:40 -05:00
Matt Turner
61dacffaf4 Convert INCLUDES to AM_CPPFLAGS
INCLUDES has been deprecated starting with automake 1.13. Replace all
occurrences with the recommended AM_CPPFLAGS replacement.
2013-01-22 22:08:30 -08:00
Matt Turner
c7c28f440d Add new demos and tests to .gitignore 2013-01-22 22:08:30 -08:00
Nemanja Lukic
2c6577476e MIPS: DSPr2: Added more fast-paths:
- over_reverse_n_8888
- in_n_8_8

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_reverse_n_8888 =  L1:  19.42  L2:  19.07  M: 15.38 ( 40.80%)  HT: 13.35  VT: 13.10  R: 12.92  RT:  8.27 (  49Kops/s)
                   in_n_8_8 =  L1:  21.20  L2:  22.86  M: 21.42 ( 14.21%)  HT: 15.97  VT: 15.69  R: 15.47  RT:  8.00 (  48Kops/s)

Optimized:
        over_reverse_n_8888 =  L1:  60.09  L2:  47.87  M: 28.65 ( 76.02%)  HT: 23.58  VT: 22.51  R: 21.99  RT: 12.28 (  60Kops/s)
                   in_n_8_8 =  L1:  89.38  L2:  86.07  M: 65.48 ( 43.44%)  HT: 44.64  VT: 41.50  R: 40.77  RT: 16.94 (  66Kops/s)
2013-01-22 03:12:59 +01:00
Nemanja Lukic
a67b0e24d7 MIPS: DSPr2: Added more fast-paths for REVERSE operation:
- out_reverse_8_0565
- out_reverse_8_8888

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        out_reverse_8_0565 =  L1:  14.29  L2:  13.58  M: 12.14 ( 24.16%)  HT:  9.23  VT:  9.12  R:  8.84  RT:  4.75 (  36Kops/s)
        out_reverse_8_8888 =  L1:  27.46  L2:  23.24  M: 17.41 ( 57.73%)  HT: 12.61  VT: 12.47  R: 11.79  RT:  5.86 (  41Kops/s)

Optimized:
        out_reverse_8_0565 =  L1:  28.24  L2:  25.64  M: 20.63 ( 41.05%)  HT: 16.69  VT: 16.14  R: 15.50  RT:  8.69 (  52Kops/s)
        out_reverse_8_8888 =  L1:  52.78  L2:  41.44  M: 23.50 ( 77.94%)  HT: 18.79  VT: 18.16  R: 16.90  RT:  9.11 (  53Kops/s)
2013-01-22 03:10:31 +01:00
Maarten Lankhorst
01c2431ef8 Add 00-unexport-symbol.diff
* Add 00-unexport-symbol.diff
  - remove test-only use of _pixman_internal_only_get_implementation
  - zap the only test requiring the use of this symbol
2013-01-08 18:16:23 +01:00
Maarten Lankhorst
d6b69d4f63 update symbols file and add lintian override for hidden symbol 2013-01-08 17:10:12 +01:00
Maarten Lankhorst
0f8c56fe52 new upstream release 2013-01-08 16:12:25 +01:00
Maarten Lankhorst
818af795d4 pixman 0.28.2 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iEYEABECAAYFAlDF0BkACgkQmxfmIW/3waiEegCcCVDzXL2gGouDGCBqJVOmzUcv
 ZnMAoI50IhP5KXKKEEx2dJlfFkzKVo5N
 =J62R
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.28.2' into debian-experimental

pixman 0.28.2 release
2013-01-08 16:10:57 +01:00
Søren Sandmann Pedersen
35cc965514 pixman-filter.c: Cope with NULL returns from malloc()
v2: Don't return a pointer to uninitialized memory when the allocation
of horz and vert fails, but allocation of params doesn't.
2013-01-06 17:38:23 -05:00
Søren Sandmann Pedersen
58526cfc72 Handle solid images in the noop iterator
The noop src iterator already has code to handle solid images, but
that code never actually runs currently because it is not possible for
an image to have both a format code of PIXMAN_solid and a flag of
FAST_PATH_BITS_IMAGE.

If these two were to be set at the same time, the
fast_composite_tiled_repeat() fast path would trigger for solid images
(because it triggers for PIXMAN_any formats, which includes
PIXMAN_solid), but for solid images we can usually do better than that
fast path.

So this patch removes _pixman_solid_fill_iter_init() and instead
handles such images (along with repeating 1x1 bits images without an
alpha map) in pixman-noop.c.

When a 1x1R image is involved in the general composite path, before
this patch, it would hit this code in repeat() in pixman-inlines.h:

        while (*c >= size)
            *c -= size;
        while (*c < 0)
            *c += size;

and those loops could run for a huge number of iterations (proportional
to the composite width). For such cases, the performance improvement
is really big:

./test/lowlevel-blt-bench -n add_n_8888:

Before:

    add_n_8888 =  L1:   3.86  L2:   3.78  M:  1.40 (  0.06%)  HT:  1.43  VT:  1.41  R:  1.41  RT:  1.38 (  19Kops/s)

After:

    add_n_8888 =  L1:1236.86  L2:2468.49  M:1097.88 ( 49.04%)  HT:476.49  VT:429.05  R:417.04  RT:155.12 ( 817Kops/s)
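
For illustration only, a minimal sketch of why the old path was slow.
repeat_by_loop() mirrors the loops quoted above; repeat_by_modulo() is a
hypothetical constant-time equivalent and is not code from this patch.
With size == 1, the loop version needs roughly c steps for a coordinate c.

```c
/* Wrap coordinate c into [0, size) the way the quoted loops do. */
static int repeat_by_loop(int c, int size)
{
    while (c >= size)
        c -= size;
    while (c < 0)
        c += size;
    return c;
}

/* Hypothetical constant-time equivalent; assumes size > 0. */
static int repeat_by_modulo(int c, int size)
{
    int r = c % size;            /* C99: the result has the sign of c */
    return r < 0 ? r + size : r;
}
```

Both helpers return the same coordinate; only the cost differs.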
2013-01-06 17:30:12 -05:00
Marko Lindqvist
480dd38fd1 Fix build with automake-1.13
Automake-1.13 has removed long obsolete AM_CONFIG_HEADER macro (
http://lists.gnu.org/archive/html/automake/2012-12/msg00038.html )
and autoreconf errors out upon seeing it.

Attached patch replaces obsolete AM_CONFIG_HEADER with now proper
AC_CONFIG_HEADERS.
2013-01-04 01:54:10 +02:00
Siarhei Siamashka
1abde88ae6 Use more appropriate types and remove a magic constant 2013-01-04 01:27:06 +02:00
Siarhei Siamashka
c1fd5a4243 Define SIZE_MAX if it is not provided by the standard C headers
C++ compilers do not define SIZE_MAX. It is also not available
if the code is compiled by some C compilers:
    http://lists.freedesktop.org/archives/pixman/2012-August/002196.html
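
A minimal sketch of such a fallback, assuming the usual definition of the
largest size_t value. The helper mul_would_overflow() is hypothetical and
only shows a typical use of the constant.

```c
#include <stddef.h>
#include <stdint.h>

/* Fallback for toolchains whose headers do not provide SIZE_MAX. */
#ifndef SIZE_MAX
#define SIZE_MAX ((size_t)-1)
#endif

/* Example use: detect n * size products that would overflow size_t. */
static int mul_would_overflow(size_t n, size_t size)
{
    return size != 0 && n > SIZE_MAX / size;
}
```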
2013-01-04 01:26:55 +02:00
Siarhei Siamashka
66c4292822 Rename 'xor' variable to 'filler' (because 'xor' is a C++ keyword) 2012-12-20 03:14:21 +02:00
Søren Sandmann Pedersen
4dfda2adfe float-combiner.c: Change tests for x == 0.0 to -FLT_MIN < x < FLT_MIN
pixman-float-combiner.c currently uses checks like these:

    if (x == 0.0f)
        ...
    else
        ... / x;

to prevent division by 0. In theory this is correct: a division-by-zero
exception is only supposed to happen when the floating point denominator is
exactly equal to a positive or negative zero.

However, in practice, the combination of x87 and gcc optimizations
causes issues. The x87 registers are 80 bits wide, which means the
initial test:

	if (x == 0.0f)

may be false when x is an 80 bit floating point number, but when x is
rounded to a 32 bit single precision number, it becomes equal to
0.0. In principle, gcc should compensate for this quirk of x87, and
there are some options such as -ffloat-store, -fexcess-precision=standard,
and -std=c99 that will make it do so, but these all have a performance
cost.  It is also possible to set the FPU to a mode that makes it do
all computation with single or double precision, but that would
require pixman to save the existing mode before doing anything with
floating point and restore it afterwards.

Instead, this patch side-steps the issue by replacing exact checks for
equality with zero with a new macro that checks whether the value is
between -FLT_MIN and FLT_MIN.

There is extensive reading material about this issue linked off the
infamous gcc bug 323:

    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
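
A sketch of what such a range check could look like. The macro name
IS_ZERO and the helper guarded_div() are assumptions for illustration;
the message above does not name the macro.

```c
#include <float.h>

/* Treat anything strictly between -FLT_MIN and FLT_MIN as zero. */
#define IS_ZERO(f) (-FLT_MIN < (f) && (f) < FLT_MIN)

/* Divide only when the denominator is safely away from zero. */
static float guarded_div(float num, float den, float fallback)
{
    if (IS_ZERO(den))
        return fallback;
    return num / den;
}
```

Because the test is a range rather than an exact comparison, a value that
is tiny in an 80 bit register and zero after rounding to single precision
is rejected in both forms.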
2012-12-19 13:49:32 -05:00
Siarhei Siamashka
2734071d7b ARM: make use of UQADD8 instruction even in generic C code paths
ARMv6 has UQADD8 instruction, which implements unsigned saturated
addition for 8-bit values packed in 32-bit registers. It is very useful
for UN8x4_ADD_UN8x4, UN8_rb_ADD_UN8_rb and ADD_UN8 macros (which would
otherwise need a lot of arithmetic operations to simulate this operation).
Since most of the major ARM linux distros are built for ARMv7, we are
much less dependent on runtime CPU detection and can get practical
benefits from conditional compilation here for a lot of users.

The results of cairo-perf-trace benchmark on ARM Cortex-A15 with pixman
compiled by gcc 4.7.2 and PIXMAN_DISABLE set to "arm-simd arm-neon":

Speedups
========
image    firefox-talos-gfx  (29938.22 0.12%) ->  (27814.76 0.51%) : 1.08x speedup
image    firefox-asteroids  (23241.11 0.07%) ->  (21795.19 0.07%) : 1.07x speedup
image firefox-canvas-alpha (174519.85 0.08%) -> (164788.64 0.20%) : 1.06x speedup
image              poppler   (9464.46 1.61%) ->   (8991.53 0.14%) : 1.05x speedup
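
For readers unfamiliar with the instruction, a portable C model of what
UQADD8 computes: four independent unsigned 8-bit additions that saturate
at 0xff. The function name is hypothetical; this is not the macro code
from the patch.

```c
#include <stdint.h>

/* Add the four packed bytes of a and b, clamping each sum to 0xff. */
static uint32_t uqadd8_portable(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8) {
        uint32_t sum = ((a >> shift) & 0xff) + ((b >> shift) & 0xff);
        if (sum > 0xff)
            sum = 0xff;          /* saturate instead of wrapping */
        result |= sum << shift;
    }
    return result;
}
```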
2012-12-18 20:49:58 +02:00
Siarhei Siamashka
f9a41703b2 Faster conversion from a8r8g8b8 to r5g6b5 in C code
This change reduces 3 shifts, 3 ANDs and 2 ORs (total 8 arithmetic
operations) to 3 shifts, 2 ANDs and 2 ORs (total 7 arithmetic
operations).

We get garbage in the high 16 bits of the result, which might need
to be cleared when casting to uint16_t (it would bring us back to
total 8 arithmetic operations). However, if the result of the
a8r8g8b8->r5g6b5 conversion is immediately stored to memory, no
extra instructions for clearing these garbage bits are needed.

This allows the a8r8g8b8->r5g6b5 conversion code to be compiled
into 4 instructions for ARM instead of 5 (assuming a good optimizing
compiler), which has no pipeline stalls on ARM11 as an additional
bonus.

The change in benchmark results for 'lowlevel-blt-bench src_8888_0565'
with PIXMAN_DISABLE="arm-simd arm-neon mips-dspr2 mmx sse2" and pixman
compiled by gcc-4.7.2:

    MIPS 74K        480MHz  :  40.44 MPix/s ->  40.13 MPix/s
    ARM11           700MHz  :  50.28 MPix/s ->  62.85 MPix/s
    ARM Cortex-A8  1000MHz  : 124.38 MPix/s -> 141.85 MPix/s
    ARM Cortex-A15 1700MHz  : 281.07 MPix/s -> 303.29 MPix/s
    Intel Core i7  2800MHz  : 515.92 MPix/s -> 531.16 MPix/s

The same trick was used in xomap (X server for Nokia N800/N810):
    http://repository.maemo.org/pool/diablo/free/x/xorg-server/
    xorg-server_1.3.99.0~git20070321-0osso20083801.tar.gz
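
A sketch of the two variants, written from the operation counts given
above rather than copied from the patch; the function names are
hypothetical. The second version returns a 32-bit value whose upper 16
bits are garbage, as described.

```c
#include <stdint.h>

/* Straightforward version: 3 shifts, 3 ANDs, 2 ORs. */
static uint32_t convert_8888_to_0565_8ops(uint32_t s)
{
    return ((s >> 3) & 0x001f) | ((s >> 5) & 0x07e0) | ((s >> 8) & 0xf800);
}

/* Reduced version: 3 shifts, 2 ANDs, 2 ORs; high 16 bits are garbage. */
static uint32_t convert_8888_to_0565_7ops(uint32_t s)
{
    uint32_t a = (s >> 3) & 0x1f001f;  /* red and blue, 5 bits each */
    uint32_t b = s & 0xfc00;           /* green, 6 bits */

    a |= a >> 5;                       /* move red down next to green */
    a |= b >> 5;                       /* drop green into bits 5..10 */
    return a;
}
```

Storing the result through a uint16_t pointer discards the garbage bits
without any extra masking instruction.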
2012-12-18 20:45:57 +02:00
Siarhei Siamashka
3922e90c40 Change CONVERT_XXXX_TO_YYYY macros into inline functions
It is easier and safer to modify their code if the
calculations need some temporary variables. And the temporary
variables will be needed soon.
2012-12-18 20:45:47 +02:00
Siarhei Siamashka
e4519360c1 test: add "src_0565_8888" to lowlevel-blt-bench 2012-12-18 20:43:51 +02:00
Søren Sandmann Pedersen
6a6c8c51ed pixman_composite_trapezoids(): Check for NULL return from create_bits()
A check is needed that the creation of the temporary image in
pixman_composite_trapezoids() succeeds.

Fixes crash in stress-test -s 0x313c on my system.
2012-12-13 16:13:11 -05:00
Søren Sandmann Pedersen
c2cb303d33 pixman_composite_trapezoids: Return early if mask_format is not of TYPE_ALPHA
stress-test -s 0x17ee crashes because pixman_composite_trapezoids() is
given a mask_format of PIXMAN_c8, which causes it to create a
temporary image with that format but without a palette. This causes
crashes later.

The only mask_formats that we actually support are those of TYPE_ALPHA,
so this patch adds a return_if_fail() to ensure this.

Similarly, although currently it won't crash if given an invalid
format, alpha-only formats have always been the only thing that made
sense for the pixman_rasterize_edges() functions, so add a
return_if_fail() ensuring that the destination format is of type
PIXMAN_TYPE_ALPHA.
2012-12-13 16:10:41 -05:00
Søren Sandmann Pedersen
1f0c02811e Add testing of trapezoids to stress-test
The entry points add_trapezoids(), rasterize_trapezoid() and
composite_trapezoid() are exercised with random trapezoids.

This uncovers crashes with stress-test seeds 0x17ee and 0x313c.
2012-12-13 15:59:18 -05:00
Søren Sandmann Pedersen
526dc06e56 demos/radial-test: Add checkerboard to display the alpha channel 2012-12-11 09:05:58 -05:00
Søren Sandmann Pedersen
6402b2aa0c demos/conical-test: Use the draw_checkerboard() utility function
Instead of having its own copy.
2012-12-11 09:05:58 -05:00
Søren Sandmann Pedersen
e382e52d67 test/utils.[ch]: Add utility function to draw a checkerboard
This is useful in demo programs to display the alpha channel.
2012-12-11 09:05:58 -05:00
Søren Sandmann Pedersen
b0a6504122 radial: When comparing t to mindr, use >= rather than >
Radial gradients are conceptually rendered as a sequence of circles
generated by linearly extrapolating from the two circles given by the
gradient specification. Any circles in that sequence that would end up
with a negative radius are not drawn, a condition that is enforced by
checking that t * dr is bigger than mindr:

     if (t * dr > mindr)

However, it is legitimate for a circle to have radius exactly 0, so
the test should use >= rather than >.

This gets rid of the dots in demos/radial-test except for when the c2
circle has radius 0 and a repeat mode of either NONE or NORMAL. Both
those dots correspond to a t value of 1.0, which is outside the
defined interval of [0.0, 1.0) and therefore subject to the repeat
algorithm. As a result, in the NONE case, a value of 1.0 turns into
transparent black. In the NORMAL case, 1.0 wraps around and becomes
0.0 which is red, unlike 0.99 which is blue.

Cc: ranma42@gmail.com
2012-12-11 09:05:38 -05:00
Søren Sandmann Pedersen
54aca22058 demos/radial-test: Add zero-radius circles to demonstrate rendering bugs
Add two new gradient columns, one where the start circle has radius
0 and one where the end circle has radius 0. All the new gradients
except for one are rendered with a bright dot in the middle. In most
but not all cases this is incorrect.

Cc: ranma42@gmail.com
2012-12-11 08:20:45 -05:00
Siarhei Siamashka
fdab3c1b6c test: Workaround unaligned MOVDQA bug (http://gcc.gnu.org/PR55614)
Just use SSE2 intrinsics to do unaligned memory accesses as
a workaround for this gcc bug related to vector extensions.
2012-12-10 20:05:15 +02:00
Siarhei Siamashka
2bc59006d7 Improve performance of combine_over_u
The generic C over_u combiner can be a lot faster with the
addition of special shortcuts for 0xFF and 0x00 alpha/mask
values. This is already implemented in C and SSE2 fast paths.

Profiling the run of cairo-perf-trace benchmarks with PIXMAN_DISABLE
environment variable set to "fast mmx sse2" on Intel Core i7:

=== before ===

37.32%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_over_u
21.37%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_no_repeat_8888
13.51%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_none_a8r8g8b8
 2.96%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] radial_compute_color
 2.74%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] fetch_scanline_a8
 2.71%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] fetch_scanline_x8r8g8b8
 2.17%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] _pixman_gradient_walker_pixel
 1.86%  cairo-perf-trac  libcairo.so.2.11200.0 [.] _cairo_tor_scan_converter_generate
 1.57%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_pad_a8r8g8b8
 0.97%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_in_reverse_u
 0.96%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_over_ca

=== after ===

28.79%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_no_repeat_8888
18.44%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_none_a8r8g8b8
15.54%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_over_u
 3.94%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] radial_compute_color
 3.69%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] fetch_scanline_a8
 3.69%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] fetch_scanline_x8r8g8b8
 2.94%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] _pixman_gradient_walker_pixel
 2.52%  cairo-perf-trac  libcairo.so.2.11200.0 [.] _cairo_tor_scan_converter_generate
 2.08%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_pad_a8r8g8b8
 1.31%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_in_reverse_u
 1.29%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_over_ca
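
As an illustration of the shortcuts described above, a per-pixel sketch of
OVER on premultiplied a8r8g8b8 data. It is not the code from this patch:
the names are hypothetical, and testing for a fully transparent source
with src == 0 is an assumption that relies on premultiplied input.

```c
#include <stdint.h>

/* Multiply two 8-bit values as fractions of 255, with rounding. */
static uint32_t mul_un8(uint32_t a, uint32_t b)
{
    uint32_t t = a * b + 0x80;
    return (t + (t >> 8)) >> 8;
}

/* dst = src + dst * (1 - src_alpha), one pixel at a time. */
static uint32_t over_pixel(uint32_t src, uint32_t dst)
{
    uint32_t alpha = src >> 24;
    uint32_t inv, result = 0;
    int shift;

    if (alpha == 0xff)
        return src;              /* opaque source replaces the destination */
    if (src == 0)
        return dst;              /* transparent source leaves it untouched */

    inv = 0xff - alpha;
    for (shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xff;
        uint32_t d = (dst >> shift) & 0xff;
        uint32_t c = s + mul_un8(d, inv);
        if (c > 0xff)
            c = 0xff;
        result |= c << shift;
    }
    return result;
}
```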
2012-12-10 20:02:08 +02:00
Søren Sandmann Pedersen
a5e5179b56 Pre-release version bump to 0.28.2 2012-12-10 06:46:36 -05:00
Benjamin Gilbert
6e270a7968 Fix thread safety on mingw-w64 and clang
After finding a working TLS storage class specifier, configure was
continuing to test other candidates.  This caused it to prefer
__declspec(thread) over __thread.  However, __declspec(thread) is
ignored with a warning by mingw-w64 [1] and silently ignored by clang [2].
The resulting binary behaved as if PIXMAN_NO_TLS was defined.

Bug introduced by a069da6c.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=57591
[2] http://lists.freedesktop.org/archives/pixman/2012-October/002320.html
2012-12-10 06:46:36 -05:00
Stefan Weil
d91f550b2a Always use xmmintrin.h for 64 bit Windows
MinGW-w64 uses the GNU compiler and does not define _MSC_VER.
Nevertheless, it provides xmmintrin.h and must be handled
here like the MS compiler. Otherwise compilation fails due to
conflicting declarations.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-12-10 06:46:36 -05:00
Joshua Root
2092aa0d92 Fix undeclared variable use and sysctlbyname error handling on ppc
Fixes bug 56889.
2012-12-10 06:46:36 -05:00
Søren Sandmann Pedersen
9029026edd Post-release version bump to 0.28.1 2012-12-10 06:46:36 -05:00
Søren Sandmann Pedersen
8ca4e14472 Add fast paths for separable convolution
Similar to the fast paths for general affine access, add some fast
paths for the separable filter for all combinations of formats
x8r8g8b8, a8r8g8b8, r5g6b5, a8 with the four repeat modes.

It is easy to see the speedup in the demos/scale program.
2012-12-08 12:38:58 -05:00
Søren Sandmann Pedersen
4f18ba30ce Add demo program for conical gradients
This new test is derived from radial-test.c and displays conical
gradients at various angles.

It also demonstrates how PIXMAN_REPEAT_NORMAL is supposed to work when
used with a gradient specification where the first stop is not at 0.0:
In this case the gradient is supposed to have a smooth transition from
the last stop back to the first stop with no sharp transitions. It
also shows that the repeat mode is not ignored for conical gradients
as one might be tempted to think.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
3a98787bdd Add demos/zone_plate.png
The zone plate image is a useful test case for image scalers because
it contains all representable frequencies, so any imperfection in
resampling filters will show up as Moire patterns.

This version is symmetric around the midpoint of the image, so since
rotating it is supposed to be a noop, it can also be used to verify
that the resampling filters don't shift the image.

V2: Run the file through OptiPNG to cut the size in half, as suggested
by Siarhei.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
97491ed26c demos: Add new demo program, "scale"
This program allows interactively scaling and rotating images
using various filters and repeat modes. It uses
pixman_filter_create_separable_convolution() to generate the filters.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
7f5bb22d17 demos/gtk-utils.[ch]: Add pixman_image_from_file()
This function uses GdkPixbuf to load various common formats such as
.png and .jpg into a pixman image.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
6915f3e24f Add new pixman_filter_create_separable_convolution() API
This new API is a helper function to create filter parameters suitable
for use with PIXMAN_FILTER_SEPARABLE_CONVOLUTION.

For each dimension, given a scale factor, reconstruction and sample
filter kernels, and a subsampling resolution, this function will
compute a convolution of the two kernels scaled appropriately, then
sample that convolution and return the resulting vectors in a form
suitable for being used as parameters to
PIXMAN_FILTER_SEPARABLE_CONVOLUTION.

The filter kernels offered are the following:

  - IMPULSE:            Dirac delta function, ie., point sampling
  - BOX:                Box filter
  - LINEAR:             Linear filter, aka. "Tent" filter
  - CUBIC:              Cubic filter, currently Mitchell-Netravali
  - GAUSSIAN:           Gaussian function, sigma=1, support=3*sigma
  - LANCZOS2:           Two-lobed Lanczos filter
  - LANCZOS3:           Three-lobed Lanczos filter
  - LANCZOS3_STRETCHED: Three-lobed Lanczos filter, stretched by 4/3.0.
                        This is the "Nice" filter from Dirty Pixels by
                        Jim Blinn.

The intended way to use this function is to extract scaling factors
from the transformation and then pass those to this function to get a
filter suitable for compositing with that transformation. The filter
kernels can be chosen according to quality and performance tradeoffs.

To get equivalent quality to GdkPixbuf for downscalings, use BOX for
both reconstruction and sampling. For upscalings, use LINEAR for
reconstruction and IMPULSE for sampling (though note that for
upscaling in both X and Y directions, simply using
PIXMAN_FILTER_BILINEAR will likely be a better choice).
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
68760d3fe1 rounding.txt: Describe how SEPARABLE_CONVOLUTION filter works
Add some notes on how to compute the convolution matrices to be used
with the SEPARABLE_CONVOLUTION filter.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
6fd480b17c Add new filter PIXMAN_FILTER_SEPARABLE_CONVOLUTION
This filter is a new way to use a convolution matrix for filtering. In
contrast to the existing CONVOLUTION filter, this new variant is
different in two respects:

- It is subsampled: Instead of just one convolution matrix, this
  filter chooses between a number of matrices based on the subpixel
  sample location, allowing the convolution kernel to be sampled at a
  higher resolution.

- It is separable: Each matrix is specified as the tensor product of
  two vectors. This has the advantages that many fewer values have to
  be stored, and that the filtering can be done separately in the x
  and y dimensions (although the initial implementation doesn't
  actually do that).

The motivation for this new filter is to improve image downsampling
quality. Currently, the best pixman can do is the regular convolution
filter which is limited to coarsely sampled convolution kernels.

With this new feature, any separable filter can be used at any desired
resolution.
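
To make the "separable" point concrete, a small sketch of how a full
convolution matrix follows from two vectors. The function name and the
row-major layout are assumptions for illustration, not pixman's
parameter format.

```c
/* Fill the ny-by-nx matrix out (row-major) with out[y][x] = vy[y] * vx[x]. */
static void tensor_product(const double *vx, int nx,
                           const double *vy, int ny,
                           double *out)
{
    int x, y;

    for (y = 0; y < ny; y++)
        for (x = 0; x < nx; x++)
            out[y * nx + x] = vy[y] * vx[x];
}
```

Only nx + ny values need to be stored per subsample position instead of
nx * ny, which is the storage saving mentioned above.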
2012-12-08 10:50:51 -05:00
Benjamin Gilbert
7e39861da3 Fix thread safety on mingw-w64 and clang
After finding a working TLS storage class specifier, configure was
continuing to test other candidates.  This caused it to prefer
__declspec(thread) over __thread.  However, __declspec(thread) is
ignored with a warning by mingw-w64 [1] and silently ignored by clang [2].
The resulting binary behaved as if PIXMAN_NO_TLS was defined.

Bug introduced by a069da6c.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=57591
[2] http://lists.freedesktop.org/archives/pixman/2012-October/002320.html
2012-12-08 16:41:10 +02:00
Siarhei Siamashka
ebedd9a2ad test: Get rid of the obsolete 'prng_rand_N' and 'prng_rand_u32'
They are the same as 'prng_rand_n' and 'prng_rand'
2012-12-06 17:20:38 +02:00
Siarhei Siamashka
b31a696263 test: Switch to the new PRNG instead of old LCG
Wallclock time for running pixman "make check" (compile time not included):

----------------------------+----------------+-----------------------------+
                            | old PRNG (LCG) |   new PRNG (Bob Jenkins)    |
       Processor type       +----------------+------------+----------------+
                            |    gcc 4.5     |  gcc 4.5   | gcc 4.7 (simd) |
----------------------------+----------------+------------+----------------+
quad Intel Core i7  @2.8GHz |    0m49.494s   |  0m43.722s |    0m37.560s   |
dual ARM Cortex-A15 @1.7GHz |     5m8.465s   |  4m37.375s |    3m45.819s   |
     IBM Cell PPU   @3.2GHz |    23m0.821s   | 20m38.316s |   16m37.513s   |
----------------------------+----------------+------------+----------------+

But some tests got a particularly large boost. For example benchmarking and
profiling blitters-test on Core i7:

=== before ===

$ time ./blitters-test

real    0m10.907s
user    0m55.650s
sys     0m0.000s

  70.45%  blitters-test  blitters-test       [.] create_random_image
  15.81%  blitters-test  blitters-test       [.] compute_crc32_for_image_internal
   2.26%  blitters-test  blitters-test       [.] _pixman_implementation_lookup_composite
   1.07%  blitters-test  libc-2.15.so        [.] _int_free
   0.89%  blitters-test  libc-2.15.so        [.] malloc_consolidate
   0.87%  blitters-test  libc-2.15.so        [.] _int_malloc
   0.75%  blitters-test  blitters-test       [.] combine_conjoint_general_u
   0.61%  blitters-test  blitters-test       [.] combine_disjoint_general_u
   0.40%  blitters-test  blitters-test       [.] test_composite
   0.31%  blitters-test  libc-2.15.so        [.] _int_memalign
   0.31%  blitters-test  blitters-test       [.] _pixman_bits_image_setup_accessors
   0.28%  blitters-test  libc-2.15.so        [.] malloc

=== after ===

$ time ./blitters-test

real    0m3.655s
user    0m20.550s
sys     0m0.000s

  41.77%  blitters-test.n  blitters-test.new  [.] compute_crc32_for_image_internal
  15.77%  blitters-test.n  blitters-test.new  [.] prng_randmemset_r
   6.15%  blitters-test.n  blitters-test.new  [.] _pixman_implementation_lookup_composite
   3.09%  blitters-test.n  libc-2.15.so       [.] _int_free
   2.68%  blitters-test.n  libc-2.15.so       [.] malloc_consolidate
   2.39%  blitters-test.n  libc-2.15.so       [.] _int_malloc
   2.27%  blitters-test.n  blitters-test.new  [.] create_random_image
   2.22%  blitters-test.n  blitters-test.new  [.] combine_conjoint_general_u
   1.52%  blitters-test.n  blitters-test.new  [.] combine_disjoint_general_u
   1.40%  blitters-test.n  blitters-test.new  [.] test_composite
   1.02%  blitters-test.n  blitters-test.new  [.] prng_srand_r
   1.00%  blitters-test.n  blitters-test.new  [.] _pixman_image_validate
   0.96%  blitters-test.n  blitters-test.new  [.] _pixman_bits_image_setup_accessors
   0.90%  blitters-test.n  libc-2.15.so       [.] malloc
2012-12-06 17:20:35 +02:00
Siarhei Siamashka
309e66f047 test: Search/replace 'lcg_*' -> 'prng_*'
The 'lcg' prefix is going to be misleading if we replace
PRNG algorithm.
2012-12-06 17:20:31 +02:00
Siarhei Siamashka
d6545a2fc6 test: Added a better PRNG (pseudorandom number generator)
This adds a fast SIMD-optimized variant of a small noncryptographic
PRNG originally developed by Bob Jenkins:
    http://www.burtleburtle.net/bob/rand/smallprng.html

The generated pseudorandom data is good enough to pass "Big Crush"
tests from TestU01 (http://en.wikipedia.org/wiki/TestU01).

SIMD code uses http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html
which is a GCC specific extension. There is also a slower alternative
code path, which should work with any C compiler.

The performance of filling buffer with random data:
   Intel Core i7  @2.8GHz (SSE2)     : ~5.9 GB/s
   ARM Cortex-A15 @1.7GHz (NEON)     : ~2.2 GB/s
   IBM Cell PPU   @3.2GHz (Altivec)  : ~1.7 GB/s
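
For reference, a scalar sketch of the small PRNG as published on the page
linked above, written from memory of that page. It is not the SIMD
variant added by this patch, and the type and function names here are
hypothetical.

```c
#include <stdint.h>

typedef struct { uint32_t a, b, c, d; } smallprng_t;

#define SMALLPRNG_ROT(x, k) (((x) << (k)) | ((x) >> (32 - (k))))

/* Advance the state by one step and return 32 pseudorandom bits. */
static uint32_t smallprng_rand(smallprng_t *x)
{
    uint32_t e = x->a - SMALLPRNG_ROT(x->b, 27);
    x->a = x->b ^ SMALLPRNG_ROT(x->c, 17);
    x->b = x->c + x->d;
    x->c = x->d + e;
    x->d = e + x->a;
    return x->d;
}

/* Seed the state, then discard 20 outputs to mix the seed in. */
static void smallprng_srand(smallprng_t *x, uint32_t seed)
{
    int i;

    x->a = 0xf1ea5eed;
    x->b = x->c = x->d = seed;
    for (i = 0; i < 20; i++)
        (void)smallprng_rand(x);
}
```

The same seed always yields the same sequence, which is the property the
checksum-based tests depend on.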
2012-12-06 17:20:27 +02:00
Siarhei Siamashka
41f98a07fc test: Change is_little_endian() into inline function
Also dropped redundant volatile keyword because any object
can be accessed via char* pointer without breaking aliasing
rules. The compilers are able to optimize this function to either
constant 0 or 1.
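
A sketch of the idea, assuming the check reads the first byte of an
integer set to 1; the variable name is illustrative.

```c
/* Returns 1 on little-endian targets and 0 on big-endian ones. */
static inline int is_little_endian(void)
{
    unsigned long endian_check_var = 1;

    /* Reading any object through an unsigned char pointer is allowed. */
    return *(unsigned char *)&endian_check_var;
}
```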
2012-12-06 17:20:23 +02:00
Cyril Brulebois
97a117ef1d New upstream release. 2012-11-27 14:00:27 +01:00
Cyril Brulebois
e33dbc6c69 Merge branch 'upstream-experimental' into debian-experimental 2012-11-27 13:59:51 +01:00
Søren Sandmann Pedersen
978bab253d Add text file rounding.txt describing how rounding works
It is not entirely obvious how pixman gets from "location in the
source image" to "pixel value stored in the destination". This file
describes how the filters work, and in particular how positions are
rounded to samples.
2012-11-22 01:16:54 -05:00
Søren Sandmann Pedersen
74319e9d39 Convolution filter: round color values instead of truncating
The pixel computed by the convolution filter should be rounded off,
not truncated. As a simple example consider a convolution matrix
consisting of five times 0x3333. If all five input pixels are
0xff, then the result of truncating will be

    (5 * 0x3333 * 255) >> 16 = 254

But the real value of the computation is (5 * 0x3333 / 65536.0) * 255
= 254.9961, so the error is almost 1. If the user isn't very careful
about normalizing the convolution kernel so that it sums to one in
fixed point, such error might cause solid images to change color, or
opaque images to become translucent.

The fix is simply to round instead of truncate.
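
The arithmetic in the example can be checked with a short sketch. The
helper names are hypothetical; weight is the 16.16 fixed-point sum of the
kernel entries and pixel is an 8-bit channel value.

```c
#include <stdint.h>

/* Scale a channel by a 16.16 weight, discarding the fraction. */
static uint32_t scale_truncated(uint32_t weight, uint32_t pixel)
{
    return (weight * pixel) >> 16;
}

/* Same, but add half a unit before shifting so the result rounds. */
static uint32_t scale_rounded(uint32_t weight, uint32_t pixel)
{
    return (weight * pixel + 0x8000) >> 16;
}
```

With weight = 5 * 0x3333 and pixel = 255, the truncated form gives 254
and the rounded form gives 255.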
2012-11-22 01:06:29 -05:00
Søren Sandmann Pedersen
f0816ddaf4 Round fixed-point multiplication
After two fixed-point numbers are multiplied, the result is shifted
into place, but up until now pixman has simply discarded the low-order
bits instead of rounding to the closest number.

Fix that by adding 0x8000 (or 0x2 in one place) before shifting and
update the test checksums to match.
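
A sketch of the 0x8000 case for signed 16.16 values. The type and
function names are assumptions for illustration, and the right shift of
a negative 64-bit value is assumed to be arithmetic, as it is on common
compilers.

```c
#include <stdint.h>

typedef int32_t fixed_16_16;

/* Old behaviour: the low 16 bits of the product are simply dropped. */
static fixed_16_16 fixed_mul_truncated(fixed_16_16 a, fixed_16_16 b)
{
    return (fixed_16_16)(((int64_t)a * b) >> 16);
}

/* New behaviour: add half of the last kept bit before shifting. */
static fixed_16_16 fixed_mul_rounded(fixed_16_16 a, fixed_16_16 b)
{
    return (fixed_16_16)(((int64_t)a * b + 0x8000) >> 16);
}
```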
2012-11-20 03:23:51 -05:00
Stefan Weil
44dd746bb6 test: Fix compiler warnings caused by unused code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-11-14 18:02:14 -05:00
Stefan Weil
5f96022d3b pixman: Use uintptr_t in type casts from pointer to integral value
These modifications fix lots of compiler warnings for systems where
sizeof(unsigned long) != sizeof(void *).
This is especially true for MinGW-w64 (64 bit Windows).
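
A typical cast of this kind is an alignment check. The helper below is a
hypothetical example, not a function from the patch; alignment is assumed
to be a power of two.

```c
#include <stdint.h>

/* uintptr_t is wide enough for a pointer even where unsigned long is not. */
static int is_aligned(const void *p, uintptr_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```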

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-11-14 18:02:14 -05:00
Stefan Weil
a96efd02d6 Always use xmmintrin.h for 64 bit Windows
MinGW-w64 uses the GNU compiler and does not define _MSC_VER.
Nevertheless, it provides xmmintrin.h and must be handled
here like the MS compiler. Otherwise compilation fails due to
conflicting declarations.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-11-14 18:02:13 -05:00
Nemanja Lukic
899e0d6052 MIPS: DSPr2: Added several nearest neighbor fast paths with a8 mask:
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench -n

Referent (before):
        over_8888_8_0565 =  L1:   9.62  L2:   8.85  M:  7.40 ( 39.27%)  HT:  5.67  VT:  5.61  R:  5.45  RT:  2.98 (  22Kops/s)
        over_0565_8_0565 =  L1:   7.90  L2:   7.49  M:  6.72 ( 26.75%)  HT:  5.24  VT:  5.20  R:  5.06  RT:  2.90 (  22Kops/s)

Optimized:
        over_8888_8_0565 =  L1:  18.51  L2:  16.82  M: 12.13 ( 64.43%)  HT: 10.06  VT:  9.88  R:  9.54  RT:  5.63 (  31Kops/s)
        over_0565_8_0565 =  L1:  14.82  L2:  13.94  M: 11.34 ( 45.20%)  HT:  9.45  VT:  9.35  R:  9.03  RT:  5.50 (  31Kops/s)
2012-11-14 18:01:18 -05:00
Nemanja Lukic
a432bdce66 MIPS: DSPr2: Added more fast-paths for OVER operation:
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_n_0565 =  L1:  14.48  L2:  21.36  M: 17.57 ( 23.30%)  HT:  6.95  VT:  6.44  R:  6.39  RT:  2.16 (  22Kops/s)
        over_n_8888 =  L1:  92.60  L2:  86.13  M: 24.41 ( 64.74%)  HT:  8.94  VT:  8.06  R:  8.00  RT:  2.53 (  25Kops/s)

Optimized:
        over_n_0565 =  L1:  27.65  L2: 189.22  M: 58.19 ( 77.12%)  HT: 52.80  VT: 49.88  R: 47.53  RT: 23.67 (  72Kops/s)
        over_n_8888 =  L1: 235.99  L2: 230.86  M: 29.09 ( 77.11%)  HT: 27.95  VT: 27.24  R: 26.58  RT: 18.10 (  67Kops/s)
2012-11-14 18:01:18 -05:00
Nemanja Lukic
e33e9d3f55 MIPS: DSPr2: Added more fast-paths for SRC operation:
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        src_n_8_8888 =  L1:  13.79  L2:  22.47  M: 17.55 ( 58.28%)  HT:  6.95  VT:  6.46  R:  6.34  RT:  2.07 (  20Kops/s)
           src_n_8_8 =  L1:  20.22  L2:  20.21  M: 18.20 ( 24.17%)  HT:  6.65  VT:  6.22  R:  6.11  RT:  2.03 (  20Kops/s)

Optimized:
        src_n_8_8888 =  L1:  58.31  L2:  53.34  M: 25.69 ( 85.29%)  HT: 22.55  VT: 21.44  R: 19.91  RT: 10.34 (  48Kops/s)
           src_n_8_8 =  L1: 102.60  L2:  89.43  M: 65.01 ( 86.32%)  HT: 37.87  VT: 37.02  R: 32.43  RT: 12.41 (  51Kops/s)
2012-11-14 18:01:18 -05:00
Søren Sandmann Pedersen
d881e1f580 Allow src and dst to be identical in pixman_f_transform_invert()
It is useful to be able to invert a matrix in place, but currently
pixman_f_transform_invert() will produce wrong results if you pass the
same matrix as both source and destination.

Fix that by inverting into a temporary matrix and then copying that to
the destination.
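
The pattern can be sketched with a generic 3x3 inverse. This is not
pixman's implementation; the struct and function names are hypothetical.
The point is only that results go into tmp first, so dst may alias src.

```c
struct mat3 { double m[3][3]; };

/* Invert src into dst; returns 0 if src is singular. dst may equal src. */
static int mat3_invert(struct mat3 *dst, const struct mat3 *src)
{
    const double (*a)[3] = src->m;
    struct mat3 tmp;
    double det;
    int i, j;

    det = a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
        - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
        + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]);
    if (det == 0.0)
        return 0;

    for (i = 0; i < 3; i++) {
        for (j = 0; j < 3; j++) {
            /* Cofactor of element (j, i); cyclic indices carry the sign. */
            int r0 = (j + 1) % 3, r1 = (j + 2) % 3;
            int c0 = (i + 1) % 3, c1 = (i + 2) % 3;

            tmp.m[i][j] = (a[r0][c0] * a[r1][c1] - a[r0][c1] * a[r1][c0]) / det;
        }
    }

    *dst = tmp;   /* copy only after every read of src is done */
    return 1;
}
```

Writing straight into dst inside the loop would overwrite entries of src
that later iterations still need when the two pointers are the same.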
2012-11-11 14:09:22 -05:00
Søren Sandmann Pedersen
614e7aaf14 pixman.h: Add typedefs for pixman_f_transform and pixman_f_vector 2012-11-10 01:46:17 -05:00
Joshua Root
b2e0e240fe Fix undeclared variable use and sysctlbyname error handling on ppc
Fixes bug 56889.
2012-11-09 16:13:31 -05:00
Søren Sandmann Pedersen
400436dc52 pixman_image_composite: Reduce opaque masks to NULL
When the mask is known to be opaque, we might as well reduce it to
NULL to take advantage of the various fast paths that operate on NULL
masks.
2012-11-09 16:13:31 -05:00
Søren Sandmann Pedersen
f2ada9e63f Post-release version bump to 0.29.1 2012-11-07 13:45:09 -05:00
Søren Sandmann Pedersen
8a2ff3e0ef Pre-release version bump to 0.28.0 2012-11-07 13:41:15 -05:00
Søren Sandmann Pedersen
4b91f6ca72 Post-release version bump to 0.27.5 2012-10-25 10:42:26 -04:00
Søren Sandmann Pedersen
0de3f33449 Pre-release version bump to 0.27.4 2012-10-25 10:35:27 -04:00
Nemanja Lukic
f075025845 MIPS: DSPr2: Added more fast-paths for ADD operation: - add_8888_8888_8888 - add_8_8 - add_8888_8888
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        add_8888_8888_8888 =  L1:  17.55  L2:  13.35  M:  8.13 ( 93.95%)  HT:  6.60  VT:  6.64  R:  6.45  RT:  3.47 (  26Kops/s)
        add_8_8            =  L1:  86.07  L2:  84.89  M: 62.36 ( 90.11%)  HT: 36.36  VT: 34.74  R: 29.56  RT: 11.56 (  52Kops/s)
        add_8888_8888      =  L1:  95.59  L2:  73.05  M: 17.62 (101.84%)  HT: 15.46  VT: 15.01  R: 13.94  RT:  6.71 (  42Kops/s)

Optimized:
        add_8888_8888_8888 =  L1:  41.52  L2:  33.21  M: 11.97 (138.45%)  HT: 10.47  VT: 10.19  R:  9.42  RT:  4.86 (  32Kops/s)
        add_8_8            =  L1: 135.06  L2: 104.82  M: 57.13 ( 82.58%)  HT: 34.79  VT: 36.60  R: 28.28  RT: 10.54 (  51Kops/s)
        add_8888_8888      =  L1: 176.36  L2:  67.82  M: 17.48 (101.06%)  HT: 15.16  VT: 14.62  R: 13.88  RT:  8.05 (  45Kops/s)
2012-10-25 10:04:30 -04:00
Nemanja Lukic
ca83717c63 MIPS: DSPr2: Added more fast-paths for ADD operation: - add_0565_8_0565 - add_8888_8_8888 - add_8888_n_8888
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        add_0565_8_0565 =  L1:   8.89  L2:   8.37  M:  7.35 ( 29.22%)  HT:  5.90  VT:  5.85  R:  5.67  RT:  3.31 (  26Kops/s)
        add_8888_8_8888 =  L1:  17.22  L2:  14.17  M:  9.89 ( 65.56%)  HT:  7.57  VT:  7.50  R:  7.36  RT:  4.10 (  30Kops/s)
        add_8888_n_8888 =  L1:  17.79  L2:  14.87  M: 10.35 ( 54.89%)  HT:  5.19  VT:  4.93  R:  4.92  RT:  1.90 (  19Kops/s)

Optimized:
        add_0565_8_0565 =  L1:  21.72  L2:  20.01  M: 14.96 ( 59.54%)  HT: 12.03  VT: 11.81  R: 11.26  RT:  6.33 (  37Kops/s)
        add_8888_8_8888 =  L1:  47.42  L2:  38.64  M: 15.90 (105.48%)  HT: 13.34  VT: 13.03  R: 11.84  RT:  6.63 (  38Kops/s)
        add_8888_n_8888 =  L1:  54.83  L2:  42.66  M: 17.36 ( 92.11%)  HT: 15.20  VT: 14.82  R: 13.66  RT:  7.83 (  41Kops/s)
2012-10-25 10:04:30 -04:00
Nemanja Lukic
52d20e692e MIPS: DSPr2: Added fast-paths for ADD operation: - add_n_8_8 - add_n_8_8888 - add_8_8_8
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        add_n_8_8    =  L1:  41.37  L2:  37.83  M: 30.38 ( 60.45%)  HT: 23.70  VT: 22.85  R: 21.51  RT: 10.32 (  45Kops/s)
        add_n_8_8888 =  L1:  16.01  L2:  14.46  M: 11.64 ( 46.32%)  HT:  5.50  VT:  5.18  R:  5.06  RT:  1.89 (  18Kops/s)
        add_8_8_8    =  L1:  13.26  L2:  12.47  M: 11.16 ( 29.61%)  HT:  8.09  VT:  8.04  R:  7.68  RT:  3.90 (  29Kops/s)

Optimized:
        add_n_8_8    =  L1:  96.03  L2:  79.37  M: 51.89 (103.31%)  HT: 32.59  VT: 31.29  R: 28.52  RT: 11.08 (  46Kops/s)
        add_n_8_8888 =  L1:  53.61  L2:  46.92  M: 23.78 ( 94.70%)  HT: 19.06  VT: 18.64  R: 17.30  RT:  9.15 (  43Kops/s)
        add_8_8_8    =  L1:  89.65  L2:  66.82  M: 37.10 ( 98.48%)  HT: 22.10  VT: 21.74  R: 20.12  RT:  8.12 (  41Kops/s)
2012-10-25 10:04:30 -04:00
Siarhei Siamashka
9df645dfb0 Workaround for FTBFS with gcc 4.6 (http://gcc.gnu.org/PR54965)
GCC 4.6 has problems with force_inline, so just use normal inline instead.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=55630
2012-10-25 00:39:41 +03:00
Søren Sandmann Pedersen
31e5a0a393 pixman_composite_trapezoids(): don't clip to extents for some operators
pixman_composite_trapezoids() is supposed to composite across the
entire destination, but it actually only composites across the extent
of the trapezoids. For operators such as ADD or OVER this doesn't
matter since a zero source has no effect on the destination. But for
operators such as SRC or IN, it does matter.

So for such operators where a zero source has an effect, don't clip to
the trap extents.
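
The distinction between operators can be illustrated with a minimal sketch, assuming a single alpha channel in [0, 1] and textbook formulas for four operators. The names op_t, combine_alpha() and zero_source_is_noop() are hypothetical and are not part of pixman.

```c
#include <assert.h>

/* Hypothetical operator subset, alpha channel only; not pixman's types. */
typedef enum { OP_SRC, OP_OVER, OP_ADD, OP_IN } op_t;

/* Combine one source alpha value s with one destination alpha value d. */
static float
combine_alpha (op_t op, float s, float d)
{
    assert (s >= 0.0f && s <= 1.0f && d >= 0.0f && d <= 1.0f);

    switch (op)
    {
    case OP_SRC:  return s;
    case OP_OVER: return s + d * (1.0f - s);
    case OP_ADD:  return (s + d > 1.0f) ? 1.0f : s + d;
    case OP_IN:   return s * d;
    }
    return d;
}

/* Nonzero if compositing a fully transparent source leaves d unchanged,
 * i.e. if it is safe to skip pixels outside the trapezoid extents. */
static int
zero_source_is_noop (op_t op, float d)
{
    return combine_alpha (op, 0.0f, d) == d;
}
```

With d = 0.5, zero_source_is_noop() holds for OP_OVER and OP_ADD but not for OP_SRC or OP_IN, which is why the latter group cannot be clipped to the trap extents.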
2012-10-21 04:13:36 -04:00
Søren Sandmann Pedersen
65db2362e2 pixman_composite_trapezoids(): Factor out extents computation
The computation of the extents rectangle is moved to its own
function.
2012-10-21 04:13:36 -04:00
Søren Sandmann Pedersen
2d9cb563b4 Add new pixman_image_create_bits_no_clear() API
When pixman_image_create_bits() function is given NULL for bits, it
will allocate a new buffer and initialize it to zero. However, in some
cases, only a small region of the image is actually used; in that case
it is wasteful to touch all of the memory.

The new pixman_image_create_bits_no_clear() works exactly like
_create_bits() except that it doesn't initialize any newly allocated
memory.
2012-10-21 04:13:36 -04:00
Benny Siegert
af803be17b configure.ac: PIXMAN_LINK_WITH_ENV fix
(fixes bug #52101)

On MirBSD, the compiler produces a (harmless) warning when the compiler
is called without the standard CFLAGS:

foo.c:0: note: someone does not honour COPTS correctly, passed 0 times

However, PIXMAN_LINK_WITH_ENV considers _any_ output on stderr as an
error, even if the exit status of the compiler is 0. Furthermore, it
resets CFLAGS and LDFLAGS at the start. On MirBSD, this will lead to a
warning in each test, making all such tests fail. In particular, the
pthread_setspecific test fails, thus pixman is compiled without thread
support. This leads to compile errors later on, or at least it did when
I tried this on pkgsrc. Re-adding the saved CFLAGS, LDFLAGS and LIBS
before the test makes it work.

The second hunk inverts the order of the pthread flag checks. On BSD
systems (this is true at least on OpenBSD and MirBSD), both -lpthread
and -pthread work but the latter is "preferred", whatever this means.
2012-10-17 14:42:56 -04:00
Siarhei Siamashka
6e56098c03 Add missing force_inline to in() function used for C fast paths 2012-10-16 22:31:38 +03:00
Siarhei Siamashka
90bcafa495 MIPS: skip runtime detection for DSPr2 if -mdspr2 option is in CFLAGS
This provides a way to enable MIPS DSP ASE optimizations if running
under qemu-user (where /proc/cpuinfo contains information about the
host processor instead of the emulated one). Can be used for running
pixman test suite in qemu-user when having no access to real MIPS
hardware.
2012-10-16 18:27:45 +03:00
Søren Sandmann Pedersen
d5f2f39319 region: Remove overlap argument from pixman_op()
This is used to compute whether the regions in question overlap, but
nothing makes use of this information, so it can be removed.
2012-10-11 05:09:19 -04:00
Søren Sandmann Pedersen
cb4f325ec0 region: Formatting fix
The while part of a do/while loop was formatted as if it were a while
loop with an empty body. Probably some indent tool misinterpreted the
code at some point.
2012-10-11 04:08:48 -04:00
Søren Sandmann Pedersen
15b153d633 Only regard images as pixbufs if they have identity transformations
In order for a src/mask pair to be considered a pixbuf, they have to
have identical transformations, but we don't check for that. Since the
only fast paths we have for pixbufs require identity transformations,
it suffices to check that both source and mask are
untransformed.

This is also the reason that this bug can't be triggered by any test
code - if the source and mask had different transformations, we would
consider them a pixbuf, but then wouldn't take the fast path because
at least one of the transformations would be different from the
identity.
2012-10-07 18:00:09 -04:00
Søren Sandmann Pedersen
3d81d89c29 Remove BUILT_SOURCES
pixman-combine32.[ch] were the only built sources, so BUILT_SOURCES
can now be removed.
2012-10-04 12:44:22 -04:00
Søren Sandmann Pedersen
ec7aa11a6e Speed up pixman_expand_to_float()
GCC doesn't move the divisions out of the loop, so do it manually by
looking up the four (1.0f / mask) values in a table. Table lookups are
used under the theory that one L2 hit plus three L1 hits is preferable
to four floating point divisions.
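
A minimal sketch of the idea, assuming one channel is extracted per call and that the table is indexed by channel width in bits. The names reciprocal, init_reciprocals() and expand_channel() are hypothetical; the real pixman_expand_to_float() handles all four channels and differs in detail.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BITS 16   /* assumed upper bound on channel width */

/* reciprocal[bits] holds 1.0f / ((1 << bits) - 1). */
static float reciprocal[MAX_BITS + 1];

static void
init_reciprocals (void)
{
    reciprocal[0] = 0.0f;   /* a zero-width channel contributes nothing */
    for (int bits = 1; bits <= MAX_BITS; bits++)
        reciprocal[bits] = 1.0f / (float) ((1u << bits) - 1);
}

/* Expand one channel of n packed pixels to floats in [0, 1]. */
static void
expand_channel (const uint32_t *src, float *dst, int n, int shift, int bits)
{
    assert (bits >= 1 && bits <= MAX_BITS);

    const uint32_t mask = (1u << bits) - 1;
    const float scale = reciprocal[bits];   /* looked up once, outside the loop */

    for (int i = 0; i < n; i++)
        dst[i] = (float) ((src[i] >> shift) & mask) * scale;   /* multiply, no divide */
}
```

After init_reciprocals(), calling expand_channel (pixels, out, n, 24, 8) would extract the top 8 bits of each pixel using one multiplication per pixel.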
2012-10-04 03:34:05 -04:00
Søren Sandmann Pedersen
8ccda2be30 Don't auto-generate pixman-combine32.[ch] anymore
Since pixman-combine64.[ch] are not used anymore, there is no point
generating these files from pixman-combine.[ch].template.

Also get rid of dependency on perl in configure.ac.
2012-10-04 03:33:50 -04:00
Søren Sandmann Pedersen
4afd20cc71 Remove 64 bit pipeline
The 64 bit pipeline is not used anymore, so it can now be removed.

Don't generate pixman-combine64.[ch] anymore. Don't generate the
pixman-srgb.c anymore. Delete all the 64 bit fetchers in
pixman-access.c, all the 64 bit iterator functions in
pixman-bits-image.c and all the functions that expand from 8 to 16
bits.
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
5ff0bbd972 Switch the wide pipeline over to using floating point
In pixman-bits-image.c, remove bits_image_fetch_untransformed_64() and
add bits_image_fetch_untransformed_float(); change
dest_get_scanline_wide() to produce a floating point buffer.

In the gradients, change *_get_scanline_wide() to call
pixman_expand_to_float() instead of pixman_expand().

In pixman-general.c change the wide Bpp to 16 instead of 8, and
initialize the buffers to 0 to prevent NaNs from causing trouble.

In pixman-noop.c make the wide solid iterator generate floating point
pixels.

In pixman-solid-fill.c, cache a floating point pixel, and make the
wide iterator generate floating point pixels.

Bug fix in bits_image_fetch_untransformed_repeat_normal
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
e75bacc5f9 pixman-access.c: Add floating point accessor functions
Three new function pointer fields are added to bits_image_t:

      fetch_scanline_float
      fetch_pixel_float
      store_scanline_float

similar to the existing 32 and 64 bit accessors. The fetcher_info_t
struct in pixman_access similarly gets a new get_scanline_float field.

For most formats, the new get_scanline_float field is set to a new
function fetch_scanline_generic_float() that first calls the 32 bit
scanline fetcher and then expands these pixels to floating point.

For the 10 bpc formats, new floating point accessors are added that
use pixman_unorm_to_float() and pixman_float_to_unorm() to convert
back and forth.

The PIXMAN_a8r8g8b8_sRGB format is handled with a 256-entry table that
maps 8 bit sRGB channels to linear single precision floating point
numbers. The sRGB->linear direction can then be done with a simple
table lookup.

The other direction is currently done with a 4096-entry table, which
works fine for 16 bit integers, but not so well for floating
point. So instead this patch uses a binary search in the sRGB->linear
table. The existing 32 bit accessors for the sRGB format are also
converted to use this method.
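
The inverse lookup can be sketched as a binary search over any strictly increasing table. This is an illustration under stated assumptions, not pixman's code: nearest_index() is a hypothetical name, the table is supplied by the caller, and ties are resolved toward the lower index.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the entry in a strictly increasing table that is
 * closest to f. Values outside the table's range map to the nearest end. */
static size_t
nearest_index (const float *table, size_t n, float f)
{
    assert (n >= 2);

    size_t lo = 0, hi = n - 1;

    /* Narrow [lo, hi] until the two entries are adjacent. */
    while (hi - lo > 1)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (table[mid] > f)
            hi = mid;
        else
            lo = mid;
    }

    /* Pick whichever neighbor is closer to f. */
    return (table[hi] - f < f - table[lo]) ? hi : lo;
}
```

With a 256-entry sRGB->linear table as input, the returned index would be the 8 bit sRGB channel value, and looking up a table entry's own value returns that entry's index.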
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
23252393a2 pixman-utils.c, pixman-private.h: Add floating point conversion routines
A new struct argb_t containing a floating point pixel is added to
pixman-private.h and conversion routines are added to pixman-utils.c
to convert normalized integers to and from that struct.

New functions:

  - pixman_expand_to_float()
    Expands a buffer of integer pixels to a buffer of argb_t pixels

  - pixman_contract_from_float()
    Converts a buffer of argb_t pixels to a buffer of integer pixels

  - pixman_float_to_unorm()
    Converts a floating point number to an unsigned normalized integer

  - pixman_unorm_to_float()
    Converts an unsigned normalized integer to a floating point number
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
4760599ff3 Add combiner test
This test runs the new floating point combiners on random input with
divide-by-zero exceptions turned on.

With the floating point combiners the only thing we guarantee is that
divide-by-zero exceptions are not generated, so change
enable_fp_exceptions() to only enable those, and rename accordingly.
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
a5b459114e Add pixman-combine-float.c
This file contains floating point implementations of combiners for all
pixman operators. These combiners operate on buffers containing single
precision floating point pixels stored in (a, r, g, b) order.

The combiners are added to the pixman_implementation_t struct, but
nothing uses them yet.

This commit incorporates a number of bug fixes contributed by Andrea
Canciani.

Some notes:

- The combiners are making sure to never divide by zero regardless of
  input, so an application could enable divide-by-zero exceptions and
  pixman wouldn't generate any.

- The operators are implemented according to the Render spec. I.e.,

    - If the input pixels are between 0 and 1, then so is the output.

    - The source and destination coefficients for the conjoint and
      disjoint operators are clamped to [0, 1].

- The PDF operators are not described in the render spec, and the
  implementation here doesn't do any clamping except in the final
  conversion from floating point to destination format.

All of the above will need to be rethought if we add support for pixel
formats that can support negative and greater-than-one pixels. It is
in fact already the case in principle that convolution filters can
produce pixels with negative values, but since these go through the
broken "wide" path that narrows everything to 32 bits, these negative
values don't currently survive to the combiners.
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
7a9c2d586b blitters-test: Prepare for floating point
Comment out some formats in blitters-test that are going to rely on
floating point in some upcoming patches.
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
600a06c81d glyph-test: Prepare for floating point
In preparation for an upcoming change of the wide pipe to use floating
point, comment out some formats in glyph-test that are going to be
using floating point and update the CRC32 value to match.
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
2e17b6dd4e Make pixman.h more const-correct
Add const to pointer arguments when the function doesn't change the
pointed-to data.

Also make 'white' in add_glyphs() in pixman-glyph.c static and
const.
2012-10-01 12:52:58 -04:00
Matt Turner
183afcf1d9 iwmmxt: Don't define dummy _mm_empty for >=gcc-4.8
Definition was not present in <4.8.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55451
2012-09-30 11:59:23 -07:00
Søren Sandmann Pedersen
d4b72eb6cc rotate-test: Call image_endian_swap() in make_image()
Otherwise the test fails on big-endian.

Tested-by: Matt Turner <mattst88@gmail.com>
2012-09-29 18:15:54 -04:00
Siarhei Siamashka
aff796d6ce Add scaled nearest repeat fast paths
Before this patch it was often faster to scale and repeat
in two passes because each pass used a fast path vs.
the slow path that the single pass approach takes. This
makes it so that the single pass approach has competitive
performance.
2012-09-26 00:03:10 -04:00
Matt Turner
05560828c4 sse2: mark pack_565_2x128_128 as static force_inline 2012-09-25 14:41:24 -07:00
Søren Sandmann Pedersen
de60e2e0e3 Fix for infinite-loop test
The infinite loop detected by "affine-test 212944861" is caused by an
overflow in this expression:

    max_x = pixman_fixed_to_int (vx + (width - 1) * unit_x) + 1;

where (width - 1) * unit_x doesn't fit in a signed int. This causes
max_x to be too small so that this:

    src_width = 0;

    while (src_width < REPEAT_NORMAL_MIN_WIDTH && src_width <= max_x)
        src_width += src_image->bits.width;

results in src_width being 0. Later on when src_width is used for
repeat calculations, we get the infinite loop.

By casting unit_x to int64_t, the expression no longer overflows and
affine-test 212944861 and infinite-loop no longer loop forever.
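
The effect of the cast can be sketched outside pixman. In this illustration fixed_16_16_t stands in for pixman_fixed_t, the function names are hypothetical, and the 32 bit wraparound is modelled with unsigned arithmetic because signed overflow is undefined behavior in C.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t;   /* stand-in for pixman_fixed_t (16.16) */

/* Pre-fix arithmetic: the product is truncated to 32 bits. The conversion
 * back to int32_t and the right shift of a negative value are
 * implementation-defined; two's complement and an arithmetic shift are
 * assumed here. */
static int32_t
max_x_32bit (fixed_16_16_t vx, int32_t width, fixed_16_16_t unit_x)
{
    assert (width >= 1);

    uint32_t sum = (uint32_t) vx + (uint32_t) (width - 1) * (uint32_t) unit_x;

    return ((int32_t) sum >> 16) + 1;
}

/* Post-fix arithmetic: widening one operand makes the whole expression
 * 64 bit, so the product no longer wraps. */
static int64_t
max_x_64bit (fixed_16_16_t vx, int32_t width, fixed_16_16_t unit_x)
{
    assert (width >= 1);

    int64_t sum = vx + (int64_t) (width - 1) * unit_x;

    return (sum >> 16) + 1;
}
```

For width = 1001 and unit_x = 0x400000 (64.0 in 16.16), the 64 bit version yields 64001 while the 32 bit version goes negative, which is the kind of too-small max_x that leaves src_width at 0.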
2012-09-24 18:43:31 -04:00
Søren Sandmann Pedersen
aa311a4641 test: Add infinite-loop test
This test demonstrates a bug where a certain transformation matrix can
result in an infinite loop. It was extracted as a standalone version
of "affine-test 212944861".

If given the option -nf, the test program will not call fail_after()
and therefore potentially run forever.
2012-09-24 18:29:30 -04:00
Søren Sandmann Pedersen
d5c721768c affine-test: Print out the transformation matrix when verbose
Printing out the translation and scale is a bit misleading because the
actual transformation matrix can be modified in various other ways.

Instead simply print the whole transformation matrix that is actually
used.
2012-09-24 18:27:10 -04:00
Nemanja Lukic
292fce7a23 MIPS: DSPr2: Added OVER combiner and two new fast paths: - over_8888_8888 - over_8888_8888_8888
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
          over_8888_8888 =  L1:  19.61  L2:  17.10  M: 11.16 ( 59.20%)  HT: 16.47  VT: 15.81  R: 14.82  RT:  8.90 (  50Kops/s)
     over_8888_8888_8888 =  L1:  13.56  L2:  11.22  M:  7.46 ( 79.18%)  HT:  6.24  VT:  6.20  R:  6.11  RT:  3.95 (  29Kops/s)

Optimized:
          over_8888_8888 =  L1:  46.42  L2:  36.70  M: 16.69 ( 88.57%)  HT: 17.11  VT: 16.55  R: 15.31  RT:  9.48 (  52Kops/s)
     over_8888_8888_8888 =  L1:  26.06  L2:  22.53  M: 11.49 (121.91%)  HT:  9.93  VT:  9.62  R:  9.19  RT:  5.75 (  36Kops/s)
2012-09-24 17:13:46 -04:00
Nemanja Lukic
28c9bd4866 MIPS: DSPr2: Added fast-paths for OVER operation: - over_0565_n_0565 - over_0565_8_0565
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_0565_n_0565 =  L1:   7.56  L2:   7.24  M:  6.16 ( 16.38%)  HT:  4.01  VT:  3.84  R:  3.79  RT:  1.66 (  18Kops/s)
        over_0565_8_0565 =  L1:   7.43  L2:   7.05  M:  5.98 ( 23.85%)  HT:  5.27  VT:  5.23  R:  5.09  RT:  3.14 (  28Kops/s)

Optimized:
        over_0565_n_0565 =  L1:  15.47  L2:  14.52  M: 12.30 ( 32.65%)  HT: 10.76  VT: 10.57  R: 10.27  RT:  6.63 (  46Kops/s)
        over_0565_8_0565 =  L1:  15.47  L2:  14.61  M: 11.78 ( 46.92%)  HT: 10.00  VT:  9.84  R:  9.40  RT:  5.81 (  43Kops/s)
2012-09-24 17:12:57 -04:00
Nemanja Lukic
b660eb30b4 MIPS: DSPr2: Added fast-paths for OVER operation: - over_8888_n_0565 - over_8888_8_0565
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_n_0565 =  L1:   8.95  L2:   8.33  M:  6.95 ( 27.74%)  HT:  4.27  VT:  4.07  R:  4.01  RT:  1.74 (  19Kops/s)
        over_8888_8_0565 =  L1:   8.86  L2:   8.11  M:  6.72 ( 35.71%)  HT:  5.68  VT:  5.62  R:  5.47  RT:  3.35 (  30Kops/s)

Optimized:
        over_8888_n_0565 =  L1:  18.76  L2:  17.55  M: 13.11 ( 52.19%)  HT: 11.35  VT: 11.10  R: 10.88  RT:  6.94 (  47Kops/s)
        over_8888_8_0565 =  L1:  18.14  L2:  16.79  M: 12.10 ( 64.25%)  HT: 10.24  VT:  9.98  R:  9.63  RT:  5.89 (  43Kops/s)
2012-09-24 17:12:57 -04:00
Nemanja Lukic
37e3368e20 MIPS: DSPr2: Added fast-paths for OVER operation: - over_8888_n_8888 - over_8888_8_8888
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_n_8888 =  L1:   9.92  L2:  11.27  M:  8.50 ( 45.23%)  HT:  4.70  VT:  4.45  R:  4.49  RT:  1.85 (  20Kops/s)
        over_8888_8_8888 =  L1:  12.54  L2:  10.86  M:  8.18 ( 54.36%)  HT:  6.53  VT:  6.45  R:  6.41  RT:  3.83 (  33Kops/s)

Optimized:
        over_8888_n_8888 =  L1:  28.02  L2:  24.92  M: 14.72 ( 78.15%)  HT: 13.03  VT: 12.65  R: 12.00  RT:  7.49 (  49Kops/s)
        over_8888_8_8888 =  L1:  26.92  L2:  23.93  M: 13.65 ( 90.58%)  HT: 11.68  VT: 11.29  R: 10.56  RT:  6.37 (  45Kops/s)
2012-09-24 17:12:56 -04:00
Søren Sandmann Pedersen
f580c4c5b2 pixman-combine.c.template: Formatting clean-ups
Various formatting fixes, and removal of some obsolete comments about
strength reduction of operators.
2012-09-22 23:41:19 -04:00
Søren Sandmann Pedersen
58f8704664 Fix bugs in pixman-image.c
In the checks for whether the transforms are rotation matrices "-1"
and "1" were used instead of the correct -pixman_fixed_1 and
pixman_fixed_1.

Fixes test suite failure for rotate-test.
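
Why the plain integers never matched can be seen in a small sketch. In 16.16 fixed point the value 1.0 is 0x10000, so comparing a matrix entry against the integer 1 tests for 1/65536 instead. The types and function names below are hypothetical, and which sign pattern counts as a 90 degree rotation is an assumption.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t;                 /* stand-in for pixman_fixed_t */
#define FIXED_1 ((fixed_16_16_t) 0x10000)      /* 1.0 in 16.16, like pixman_fixed_1 */

typedef struct { fixed_16_16_t m[2][2]; } mat2_t;

/* Correct check: compare against 1.0 and -1.0 in fixed point. */
static int
is_rotate_90_fixed (const mat2_t *t)
{
    return t->m[0][0] == 0 && t->m[1][1] == 0 &&
           t->m[0][1] == -FIXED_1 && t->m[1][0] == FIXED_1;
}

/* Buggy check: compares against the raw integers -1 and 1. */
static int
is_rotate_90_buggy (const mat2_t *t)
{
    return t->m[0][0] == 0 && t->m[1][1] == 0 &&
           t->m[0][1] == -1 && t->m[1][0] == 1;
}
```

A genuine rotation matrix passes the first check and fails the second, so with the bug the rotation was never classified as intended.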
2012-09-22 23:41:19 -04:00
Søren Sandmann Pedersen
550dfc5e7e Add rotate-test.c test program
This program exercises a bug in pixman-image.c where "-1" and "1" were
used instead of the correct "- pixman_fixed_1" and "pixman_fixed_1".

With the fast implementation enabled:

     % ./rotate-test
     rotate test failed! (checksum=35A01AAB, expected 03A24D51)

Without it:

     % env PIXMAN_DISABLE=fast ./rotate-test
     pixman: Disabled fast implementation
     rotate test passed (checksum=03A24D51)

V2: The first version didn't have lcg_srand (testnum) in test_transform().
2012-09-22 23:41:19 -04:00
Søren Sandmann Pedersen
2ab77c97a5 Fix bugs in component alpha combiners for separable PDF operators
In general, the component alpha version of an operator is supposed to
do this:

       - multiply source with mask in all channels
       - multiply mask with source alpha in all channels
       - compute the regular operator in all channels using the
         mask value whenever source alpha is called for

The first two steps are usually accomplished with the function
combine_mask_ca(), but for operators where source alpha is not used,
such as SRC, ADD and OUT, the simpler function
combine_mask_value_ca(), which doesn't compute the new mask values,
can be used.

However, the PDF blend modes generally *do* make use of source alpha,
so they can't use combine_mask_value_ca() as they do now. They have to
use combine_mask_ca().

This patch fixes this in combine_multiply_ca() and the CA combiners
generated by PDF_SEPARABLE_BLEND_MODE.
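
The difference between the two helpers can be sketched on a single 8 bit pixel. This is an illustration only: px_t, mul_un8() and the _sketch functions are hypothetical stand-ins, and the real combine_mask_ca() and combine_mask_value_ca() operate on packed 32 bit pixels.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t a, r, g, b; } px_t;

/* Rounded x * y / 255 for 8 bit normalized values. */
static uint8_t
mul_un8 (uint8_t x, uint8_t y)
{
    unsigned t = (unsigned) x * y + 0x80;

    return (uint8_t) ((t + (t >> 8)) >> 8);
}

/* Step 1 only: source *= mask in all channels; the mask is left alone.
 * Enough for operators that never read source alpha. */
static void
mask_value_ca_sketch (px_t *src, const px_t *mask)
{
    src->a = mul_un8 (src->a, mask->a);
    src->r = mul_un8 (src->r, mask->r);
    src->g = mul_un8 (src->g, mask->g);
    src->b = mul_un8 (src->b, mask->b);
}

/* Steps 1 and 2: additionally mask *= original source alpha, so the
 * mask can stand in wherever the operator calls for source alpha. */
static void
mask_ca_sketch (px_t *src, px_t *mask)
{
    uint8_t sa = src->a;   /* save before step 1 overwrites it */

    mask_value_ca_sketch (src, mask);

    mask->a = mul_un8 (mask->a, sa);
    mask->r = mul_un8 (mask->r, sa);
    mask->g = mul_un8 (mask->g, sa);
    mask->b = mul_un8 (mask->b, sa);
}
```

An operator that reads source alpha after mask_value_ca_sketch() would see unscaled mask values, which corresponds to the bug described above.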
2012-09-22 23:41:19 -04:00
Søren Sandmann Pedersen
c4b69e706e Fix bug in fast_composite_scaled_nearest()
The fast_composite_scaled_nearest() function can be called when the
format is x8b8g8r8. In that case pixels fetched in fetch_nearest()
need to have their alpha channel set to 0xff.

Fixes test suite failure in scaling-test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-22 23:40:52 -04:00
Søren Sandmann Pedersen
35be7acb66 Add PIXMAN_x8b8g8r8 and PIXMAN_a8b8g8r8 formats to scaling-test
Update the CRC values based on what the general implementation
reports. This reveals a bug in the fast implementation:

    % env PIXMAN_DISABLE="mmx sse2" ./test/scaling-test
    pixman: Disabled mmx implementation
    pixman: Disabled sse2 implementation
    scaling test failed! (checksum=AA722B06, expected 03A23E0C)

vs.

    % env PIXMAN_DISABLE="mmx sse2 fast" ./test/scaling-test
    pixman: Disabled fast implementation
    pixman: Disabled mmx implementation
    pixman: Disabled sse2 implementation
    scaling test passed (checksum=03A23E0C)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-22 23:40:52 -04:00
Søren Sandmann Pedersen
9decb9a979 implementation: Rename delegate to fallback
At this point the chain of implementations has nothing to do with the
delegation design pattern anymore, so rename the delegate pointer to
'fallback'.
2012-09-19 12:22:59 -04:00
Søren Sandmann Pedersen
b96599ccf3 _pixman_implementation_create(): Initialize implementation with memset()
All the function pointers are NULL by default now, so we can just zero
the struct. Also write the function a little more compactly.
2012-09-19 12:22:59 -04:00
Søren Sandmann Pedersen
9539a18832 Rename _pixman_lookup_composite_function() to _pixman_implementation_lookup_composite()
And move it into pixman-implementation.c which is where it belongs
logically.
2012-09-19 12:22:59 -04:00
Søren Sandmann Pedersen
ee6af72dad Move delegation of src/dest iter init into pixman-implementation.c
Instead of relying on each implementation to delegate when an iterator
can't be initialized, change the type of iterator initializers to
boolean and make pixman-implementation.c do the delegation whenever an
iterator initializer returns FALSE.
2012-09-19 12:22:58 -04:00
Søren Sandmann Pedersen
c710d0fae2 Move fill delegation into pixman-implementation.c
As in the blt commit, do the delegation in pixman-implementation.c
whenever the implementation fill returns FALSE instead of relying on
each implementation to do it by itself.

With this change there is no longer any reason for the implementations
to have one fill function that delegates and one that actually blits,
so consolidate those in the NEON, DSPr2, SSE2, and MMX
implementations.
2012-09-19 12:22:58 -04:00
Søren Sandmann Pedersen
534507ba3b Move blt delegation into pixman-implementation.c
Rather than require each individual implementation to do the
delegation for blt, just do it in pixman-implementation.c whenever the
implementation blt returns FALSE.

With this change, there is no longer any reason for the
implementations to have one blt function that delegates and one that
actually blits, so consolidate those in the NEON, DSPr2, SSE2, and MMX
implementations.
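
The centralized delegation can be sketched as a walk along a chain of implementations. This is a minimal sketch with hypothetical types and names (impl_t, simd_blt(), general_blt(), implementation_blt()); the real blt signature takes many more arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct impl impl_t;
typedef bool (*blt_func_t) (impl_t *imp, int bpp);

struct impl
{
    impl_t     *fallback;   /* next implementation to try, or NULL */
    blt_func_t  blt;        /* may be NULL if not provided */
    int         calls;      /* bookkeeping for this example only */
};

/* A specialized implementation that only handles 32 bpp. */
static bool
simd_blt (impl_t *imp, int bpp)
{
    imp->calls++;
    return bpp == 32;
}

/* A general implementation that handles everything. */
static bool
general_blt (impl_t *imp, int bpp)
{
    (void) bpp;
    imp->calls++;
    return true;
}

/* Central dispatcher: try each implementation in turn and move on to
 * the next one whenever a blt returns false. */
static bool
implementation_blt (impl_t *imp, int bpp)
{
    for (; imp != NULL; imp = imp->fallback)
    {
        if (imp->blt && imp->blt (imp, bpp))
            return true;
    }
    return false;
}
```

Each implementation now needs only the function that does the work and reports whether it could; none of them has to know what comes next in the chain.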
2012-09-19 12:22:58 -04:00
Søren Sandmann Pedersen
7ef4436abb implementation: Write lookup_combiner() in a less convoluted way.
Instead of initializing an array on the stack, just use a simple
switch to select which set of combiners to look up in.
2012-09-19 12:22:58 -04:00
Matt Turner
3124a51abb build: Remove useless DEP_CFLAGS/DEP_LIBS variables 2012-09-15 23:46:21 -07:00
Andrea Canciani
46e4faf8ef build: Improve win32 build system
Handle cross-directory dependencies using PHONY targets and clean up
some redundancies.
2012-09-15 07:49:53 +02:00
Andrea Canciani
c89efdd211 mmx: Fix x86 build on MSVC
The MSVC compiler is very strict about variable declarations after
statements.

Move all the declarations of each block before any statement in
the same block to fix multiple instances of:

pixman-mmx.c(xxxx) : error C2275: '__m64' : illegal use of this type
as an expression
2012-09-15 07:49:52 +02:00
Søren Sandmann Pedersen
1e3e569b04 test/utils.c: Use pow(), not powf() in sRGB conversion routines
These functions are operating on double precision values, so use pow()
instead of powf().
2012-08-29 15:05:49 -04:00
Søren Sandmann Pedersen
8577daba04 pixel_checker: Move sRGB conversion into get_limits()
The sRGB conversion has to be done every time the limits are being
computed. Without this fix, pixel_checker_get_min/max() will produce
the wrong results when called from somewhere other than
pixel_checker_check().
2012-08-26 18:13:47 -04:00
Søren Sandmann Pedersen
62eb6e5e05 Remove obsolete TODO file 2012-08-25 17:17:24 -04:00
Søren Sandmann Pedersen
384846b38c Remove pointless declaration of _pixman_image_get_scanline_generic_64()
This declaration used to be necessary when
_pixman_image_get_scanline_generic_64() referred to a structure that
itself referred back to _pixman_image_get_scanline_generic_64().
2012-08-19 13:45:21 -04:00
Søren Sandmann Pedersen
09cb1ae10b demos: Add srgb_trap_test.c
This demo program composites a bunch of trapezoids side by side with
and without gamma aware compositing.
2012-08-09 11:24:37 -04:00
Søren Sandmann Pedersen
04e878c231 Make show_image() cope with more formats
This makes show_image() deal with more formats than just a8r8g8b8, in
particular, a8r8g8b8_sRGB can now be handled.

Images that are passed to show_image with a format of a8r8g8b8_sRGB
are displayed without modification under the assumption that the
monitor is approximately sRGB.

Images with a format of a8r8g8b8 are also displayed without
modification since many other users of show_image() have been
generating essentially sRGB data with this format. Other formats are
also assumed to be gamma compressed; these are converted to a8r8g8b8
before being displayed.

With these changes, srgb-test.c doesn't need to do its own conversion
anymore.
2012-08-09 11:24:37 -04:00
Søren Sandmann Pedersen
8db9ec9814 Define TIMER_BEGIN and TIMER_END even when timers are not enabled
This allows code that uses these macros to build when timers are
disabled.
2012-08-09 11:23:45 -04:00
Søren Sandmann Pedersen
da5268cc19 Post-release version bump to 0.27.3 2012-08-01 15:56:13 -04:00
Søren Sandmann Pedersen
e8ddef78b6 Pre-release version bump to 0.27.2 2012-08-01 15:22:57 -04:00
Sebastian Bauer
c214ca51a0 Use angle brackets form of including config.h 2012-08-01 15:21:51 -04:00
Sebastian Bauer
98617b3796 Added HAVE_CONFIG_H check before including config.h 2012-08-01 15:21:51 -04:00
Søren Sandmann Pedersen
5b0563f39e glyph-test: Avoid setting solid images as alpha maps.
glyph-test would sometimes set a solid image as an alpha map, which is
not allowed. When this happened and the debug spew was enabled,
messages like this one would be generated:

    *** BUG ***
    In pixman_image_set_alpha_map: The expression
            !alpha_map || alpha_map->type == BITS was false
    Set a breakpoint on '_pixman_log_error' to debug

Fix this by not passing the ALLOW_SOLID flag to create_image() when
the resulting image is to be used as an alpha map.
2012-07-31 23:51:53 -04:00
Søren Sandmann Pedersen
38fe7cd7be stress-test: Avoid overflows in clip rectangles
The rectangles in the clip region set in set_general_properties()
would sometimes overflow, which would lead to messages like these:

      *** BUG ***
      In pixman_region32_union_rect: Invalid rectangle passed
      Set a breakpoint on '_pixman_log_error' to debug

when the micro version number of pixman is even.

Fix this by detecting the overflow and clamping such that the x2/y2
coordinates are less than INT32_MAX.
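
The clamping can be sketched by computing the far edge in 64 bits before narrowing. This is a minimal sketch with hypothetical names (box32_t, clamped_end(), make_box()); the actual code in stress-test may be structured differently.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int32_t x1, y1, x2, y2; } box32_t;

/* Return start + size, clamped so the result stays below INT32_MAX. */
static int32_t
clamped_end (int32_t start, uint32_t size)
{
    int64_t end = (int64_t) start + (int64_t) size;   /* cannot overflow in 64 bits */

    return end >= INT32_MAX ? INT32_MAX - 1 : (int32_t) end;
}

/* Build a rectangle from an origin and a size without overflowing x2/y2. */
static box32_t
make_box (int32_t x, int32_t y, uint32_t w, uint32_t h)
{
    box32_t b = { x, y, clamped_end (x, w), clamped_end (y, h) };

    return b;
}
```

For example, make_box (INT32_MAX - 10, 5, 100, 20) produces x2 = INT32_MAX - 1 and y2 = 25 instead of a wrapped-around x2.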
2012-07-31 23:51:53 -04:00
Søren Sandmann Pedersen
24d83cbf3d Add make-srgb.pl to EXTRA_DIST
Otherwise make distcheck doesn't pass.
2012-07-31 23:51:52 -04:00
Antti S. Lankila
72ba0b9555 Add tests to validate new sRGB behavior
Composite checks random combinations of operations that now also have
sRGB sources, masks and destinations, and stress-test validates the
read/write primitives.
2012-07-30 15:44:38 -04:00
Antti S. Lankila
a161a6ba23 Add sRGB blending demo program
A simple sRGB color blender test that can be used to determine whether sRGB
processing works as expected. It blends alpha ramps of purple and green together
such that a 50% blend of both is realized at the midpoint of the image. At that
point, sRGB-aware processing yields a result close to #bbb rather than #888,
which is the linear light blending result.

The demo also contains the sample computation for sRGB premultiplied alpha.
2012-07-30 15:40:16 -04:00
Antti S. Lankila
7460457f80 Add support for sRGB surfaces
sRGB format is defined as a new format type, PIXMAN_TYPE_ARGB_SRGB. One form of
this type is provided, PIXMAN_a8r8g8b8_sRGB. Use of an sRGB format triggers
wide processing, and the pixel fetch/store functions handle the relevant
conversion between color spaces. Pixman itself is thought to compose in the
linearized sRGB color space.

sRGB conversion is tabularized. For sRGB to linear, we are using only 256
values because the current source format uses 8 bits per component precision.
For linear to sRGB, it turns out that only 4096 brightness levels are required
to generate all of the 256 sRGB color values, and therefore only 12 bits per
component are considered during store. As a special case, a no-op
sRGB->linear->sRGB conversion is constructed to be lossless by adjusting the
sRGB->linear conversion table where necessary.
2012-07-30 15:37:26 -04:00
Antti S. Lankila
1dcca0f7ae Remove unnecessary dst initialization
The initialization work is already performed correctly in image_init().
2012-07-29 11:01:11 -04:00
Cyril Brulebois
1713a099d6 Upload to unstable. 2012-06-27 12:11:58 +02:00
Cyril Brulebois
9026e61d84 Disable loongson2f optimizations, fix FTBFS on mipsel. 2012-06-27 11:21:54 +02:00
Søren Sandmann Pedersen
56321eff65 Make pixman-mmx.c compile on x86-32 without optimization
When not optimizing, write _mm_shuffle_pi16() as a statement
expression with inline assembly. That way we avoid
__builtin_ia32_pshufw(), which is only available when compiling with
-msse, while still allowing the non-optimizing gcc to understand that
the second argument is a compile time constant.

Tested-by: Knut Petersen <knut_petersen@t-online.de>
2012-06-20 02:53:31 -04:00
Søren Sandmann Pedersen
0c81957e9b Cleanups and simplifications in x86 CPU feature detection
A new function pixman_cpuid() is added that runs the cpuid instruction
and returns the results. On GCC this function uses inline assembly; on
MSVC, the function calls the __cpuid intrinsic.

There is also a new function called have_cpuid() which detects whether
cpuid is available. On x86-64 and MSVC, it simply returns TRUE; on
x86-32 bit, it checks whether the 22nd bit of eflags can be
modified. On MSVC this does have the consequence that pixman will no
longer work CPUS without cpuid (ie., older than 486 and some 486
models).

These two functions together make it possible to write a generic
detect_cpu_features() in plain C. This function is then used in a new
have_feature() function that checks whether a specific set of feature
bits is available.

Aside from the cleanups and simplifications, the main benefit from
this patch is that pixman now can do feature detection on x86-64, so
that newer instruction sets such as SSSE3 and SSE4.1 can be used. (And
apparently the assumption that x86-64 CPUs always have MMX and SSE2 is
no longer correct: Knight's Corner is x86-64, but doesn't have them).

V2: Rename the constants in the getisax() code, as pointed out by Alan
Coopersmith. Also reinstate the result variable and initialize
features to 0.

V3: Fixes for the fact that the upper 32 bits of a 64 bit register are
zeroed whenever the corresponding 32 bit register is written to.

V4: Fixes for the fact that in 32 bit mode, when gcc is not optimizing
there were not enough registers available. The new code uses the "a",
"b", "c", and "d" constraints instead, and has two separate versions
for 32 and 64 bit modes.
2012-06-20 02:51:04 -04:00
Sebastian Bauer
4d641c3803 Changed the style of two function headers
Declare functions *_inverse() and *_contains_rectangle() in the same
way as the other functions are declared. This doesn't imply any semantic
changes. It's just a unification of coding styles.
2012-07-08 18:49:24 -04:00
Nemanja Lukic
86ad09b548 MIPS: DSPr2: Added more bilinear fast paths (without mask)
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench -b

Referent (before):
  src_8888_8888 =  L1:   8.18  L2:   7.79  M:  6.32 ( 33.51%)  HT:  5.78  VT:  5.70  R:  5.61  RT:  3.79 (  29Kops/s)
  src_8888_0565 =  L1:   6.90  L2:   7.14  M:  6.47 ( 25.75%)  HT:  5.54  VT:  5.51  R:  5.46  RT:  3.53 (  28Kops/s)
  src_0565_x888 =  L1:   3.76  L2:   3.71  M:  3.37 ( 13.41%)  HT:  3.26  VT:  3.22  R:  3.20  RT:  2.58 (  23Kops/s)
  src_0565_0565 =  L1:   3.59  L2:   3.56  M:  3.47 (  9.19%)  HT:  3.19  VT:  3.18  R:  3.16  RT:  2.46 (  22Kops/s)
 over_8888_8888 =  L1:   5.99  L2:   5.66  M:  4.95 ( 26.28%)  HT:  4.40  VT:  4.38  R:  4.31  RT:  3.02 (  26Kops/s)
  add_8888_8888 =  L1:   6.84  L2:   6.39  M:  5.48 ( 29.09%)  HT:  4.80  VT:  4.79  R:  4.70  RT:  3.20 (  27Kops/s)

Optimized:
  src_8888_8888 =  L1:  18.27  L2:  16.69  M: 12.87 ( 68.25%)  HT: 11.80  VT: 11.61  R: 10.60  RT:  7.05 (  41Kops/s)
  src_8888_0565 =  L1:  15.18  L2:  14.10  M: 11.75 ( 46.71%)  HT: 10.64  VT: 10.50  R: 10.03  RT:  7.15 (  41Kops/s)
  src_0565_x888 =  L1:  10.45  L2:   9.96  M:  9.23 ( 36.72%)  HT:  8.39  VT:  8.29  R:  8.02  RT:  5.75 (  37Kops/s)
  src_0565_0565 =  L1:   9.37  L2:   8.98  M:  8.50 ( 22.53%)  HT:  7.71  VT:  7.66  R:  7.52  RT:  5.59 (  37Kops/s)
 over_8888_8888 =  L1:  12.21  L2:  11.01  M:  8.56 ( 45.36%)  HT:  7.71  VT:  7.64  R:  7.43  RT:  5.51 (  36Kops/s)
  add_8888_8888 =  L1:  17.72  L2:  15.16  M: 10.78 ( 57.13%)  HT:  9.46  VT:  9.30  R:  9.00  RT:  6.03 (  38Kops/s)
2012-07-08 21:38:14 +03:00
Nemanja Lukic
707a8be112 MIPS: DSPr2: Added several bilinear fast paths with a8 mask
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench -b

Reference (before):

  src_8888_8_8888 =  L1:   6.37  L2:   6.08  M:  5.46 ( 32.57%)  HT:  4.64  VT:  4.61  R:  4.52  RT:  2.85 (  23Kops/s)
  src_8888_8_0565 =  L1:   5.89  L2:   5.66  M:  5.11 ( 23.71%)  HT:  4.36  VT:  4.34  R:  4.26  RT:  2.71 (  22Kops/s)
  src_0565_8_x888 =  L1:   3.32  L2:   3.27  M:  3.17 ( 14.71%)  HT:  2.86  VT:  2.84  R:  2.81  RT:  2.07 (  19Kops/s)
  src_0565_8_0565 =  L1:   3.19  L2:   3.15  M:  3.05 ( 10.11%)  HT:  2.75  VT:  2.74  R:  2.71  RT:  2.00 (  18Kops/s)
 over_8888_8_8888 =  L1:   4.99  L2:   4.71  M:  4.11 ( 27.22%)  HT:  3.59  VT:  3.58  R:  3.50  RT:  2.36 (  21Kops/s)
  add_8888_8_8888 =  L1:   5.60  L2:   5.26  M:  4.52 ( 29.95%)  HT:  3.92  VT:  3.89  R:  3.80  RT:  2.49 (  21Kops/s)

Optimized:

  src_8888_8_8888 =  L1:  13.19  L2:  12.13  M:  9.75 ( 58.22%)  HT:  8.60  VT:  8.44  R:  7.90  RT:  5.06 (  33Kops/s)
  src_8888_8_0565 =  L1:  11.64  L2:  10.81  M:  9.18 ( 42.63%)  HT:  8.04  VT:  7.90  R:  7.57  RT:  5.02 (  32Kops/s)
  src_0565_8_x888 =  L1:   8.34  L2:   7.95  M:  7.29 ( 33.85%)  HT:  6.55  VT:  6.48  R:  6.25  RT:  4.35 (  30Kops/s)
  src_0565_8_0565 =  L1:   7.71  L2:   7.35  M:  6.90 ( 22.90%)  HT:  6.14  VT:  6.10  R:  5.94  RT:  4.07 (  29Kops/s)
 over_8888_8_8888 =  L1:   9.73  L2:   8.99  M:  7.15 ( 47.41%)  HT:  6.40  VT:  6.30  R:  6.11  RT:  4.28 (  30Kops/s)
  add_8888_8_8888 =  L1:  13.01  L2:  11.72  M:  8.70 ( 57.68%)  HT:  7.59  VT:  7.46  R:  7.20  RT:  4.74 (  32Kops/s)
2012-07-08 21:38:09 +03:00
Søren Sandmann Pedersen
6aac8e8570 Simplify CPU detection on PPC.
Get rid of the initialized and have_vmx static variables in
pixman-ppc.c. There is no point to them since CPU detection only
happens once per process.

On Linux, just read /proc/self/auxv instead of generating the filename
with getpid() and don't bother with the stack buffer. Instead just
read the aux entries one by one.
2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
4b78d78537 Simplifications to ARM CPU detection
Organize pixman-arm.c such that each operating system/compiler exports
a detect_cpu_features() function that returns a bitmask with the
various features that we are interested in. A new function
have_feature() then calls this function, caches the result, and returns
whether the given feature is available.

The result is that all the pixman_have_arm_<feature> functions become
redundant and can be deleted.
2012-07-07 01:09:23 -04:00
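The detect-once-and-cache pattern described in 4b78d78537 can be sketched as follows. This is a minimal illustration, not the code from pixman-arm.c: the feature bit names, the fixed result returned by the stand-in detect_cpu_features(), and the detect_calls counter are all assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical feature bits; the names are illustrative only. */
typedef enum
{
    FEATURE_V7     = (1 << 0),
    FEATURE_NEON   = (1 << 1),
    FEATURE_IWMMXT = (1 << 2)
} cpu_features_t;

/* Counts how often detection runs (only here to show the caching). */
static int detect_calls = 0;

/* Stand-in for the per-OS/compiler detection routine. A real one would
 * query the operating system; this one reports a fixed set of bits. */
static cpu_features_t
detect_cpu_features (void)
{
    detect_calls++;
    return (cpu_features_t) (FEATURE_V7 | FEATURE_NEON);
}

/* Runs detection on first use, caches the bitmask, and reports whether
 * all requested bits are set. */
static int
have_feature (cpu_features_t feature)
{
    static int initialized = 0;
    static cpu_features_t features;

    assert (feature != 0);

    if (!initialized)
    {
        features = detect_cpu_features ();
        initialized = 1;
    }

    return (features & feature) == feature;
}
```

With the fixed bits above, have_feature (FEATURE_NEON) is true and have_feature (FEATURE_IWMMXT) is false, and detection runs only once however many queries are made. The sketch is not thread-safe; it assumes the first query happens before any threads are started.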
Søren Sandmann Pedersen
8b795a9c17 Simplify MIPS CPU detection
There is no reason to have pixman_have_<feature> functions when all
they do is call pixman_have_mips_feature().

Instead rename pixman_have_mips_feature() to have_feature() and call
it directly from _pixman_mips_get_implementations(). Also on
non-Linux, just make have_feature() return FALSE.
2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
16502dd3ae Move the remaining bits of pixman-cpu into pixman-implementation.c 2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
5813bb96ae Move MIPS specific CPU detection to its own file, pixman-mips.c 2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
4ac0a1d60f Move PowerPC specific CPU detection to its own file pixman-ppc.c 2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
8590415f0e Move ARM specific CPU detection to a new file pixman-arm.c
Similar to the x86 commit, this moves the ARM specific CPU detection
to its own file which exports a pixman_arm_get_implementations()
function that is supposed to be a noop on non-ARM.
2012-07-07 01:09:22 -04:00
Søren Sandmann Pedersen
39ac18570a Move x86 specific CPU detection to a new file pixman-x86.c
Extract the x86 specific parts of pixman-cpu.c and put them in their
own file called pixman-x86.c which exports one function
pixman_x86_get_implementations() that creates the MMX and SSE2
implementations. This file is supposed to be compiled on all
architectures, but pixman_x86_get_implementations() should be a noop
on non-x86.
2012-07-06 23:53:19 -04:00
Søren Sandmann Pedersen
1a3b7614a9 pixman-cpu.c: Rename disabled to _pixman_disabled() and export it 2012-07-06 23:52:14 -04:00
Sebastian Bauer
d4aa82fb91 Qualify the static variables in pixman_f_transform_invert() with the const keyword.
Their contents are not overwritten.
2012-07-06 23:50:21 -04:00
Søren Sandmann Pedersen
f9c91ee2f2 Use a compile-time constant for the "K" constraint in the MMX detection.
When compiling with -O0, gcc doesn't understand that in

     signed char x = 0;

     ...

     asm ("...",
     	  : "K" (x));

x is constant. Fix this by using an immediate constant instead of a
variable.
2012-07-02 18:21:21 -04:00
Søren Sandmann Pedersen
cd7ecf548a In fast_composite_tiled_repeat() don't clone images with a palette
In fast_composite_tiled_repeat() if the source image is less than a
certain constant width, a clone is created which is then
pre-repeated. However, the source image's palette, if it has one, is
not cloned, so for indexed images, the pre-repeating would crash.

Fix this by not doing any pre-repeating for images with a palette set.
2012-07-02 18:21:21 -04:00
Søren Sandmann Pedersen
7b20ad39f7 test: Make stress-test more likely to actually composite something
stress-test currently almost never composites anything because the clip
rectangles and transformations are such that either
_pixman_compute_composite_region32() or analyze_extent() will return
FALSE.

Fix this by:

- making log_rand() return smaller numbers so that the clip rectangles
  are more likely to be within the destination image

- adding rand_x() and rand_y() functions that pick positions within an
  image and using them for positioning alpha maps and source/mask
  positions.

- making it less likely that clip regions are used in general

These changes make the test take longer, so speed it up a little by
making most images smaller and by reducing the maximum convolution
filter from 17x19 to 3x4.

With these changes, stress-test reveals a crash in iteration 0xd39
where fast_composite_tiled_repeat() creates an indexed image without a
palette.
2012-07-02 18:21:21 -04:00
Matt Turner
4cdf8e9f3a sse2: add missing ABGR entries for bilinear src_8888_8888 2012-07-01 16:35:46 -04:00
Matt Turner
ef99f9e972 loongson: optimize _mm_set_pi* functions with shuffle instructions 2012-07-01 16:34:45 -04:00
Matt Turner
9aa8e3a260 mmx: optimize bilinear function when using 7-bit precision
Loongson:
image             firefox-fishtank 1037.738 1040.218   0.19%    3/3
image             firefox-fishtank 1056.611 1057.581   0.20%    3/3

ARM/iwMMXt:
image             firefox-fishtank 1487.282 1492.640   0.17%    3/3
image             firefox-fishtank 1363.913 1364.366   0.11%    3/3
2012-07-01 16:34:21 -04:00
Matt Turner
1ad6ae6ee8 mmx: add scaled bilinear over_8888_8_8888
Loongson:
image             firefox-fishtank 1665.163 1670.370   0.17%    3/3
image             firefox-fishtank 1037.738 1040.218   0.19%    3/3

ARM/iwMMXt:
image             firefox-fishtank 2042.723 2045.308   0.10%    3/3
image             firefox-fishtank 1487.282 1492.640   0.17%    3/3
2012-07-01 16:34:14 -04:00
Matt Turner
c43de364cb mmx: add scaled bilinear over_8888_8888
Loongson:
image         firefox-planet-gnome  157.012  158.087   0.30%    6/6
image         firefox-planet-gnome  156.617  157.109   0.15%    5/6

ARM/iwMMXt:
image         firefox-planet-gnome  148.086  149.339   0.76%    6/6
image         firefox-planet-gnome  144.939  146.123   0.61%    6/6
2012-07-01 16:33:19 -04:00
Matt Turner
9209cd746b mmx: add scaled bilinear src_8888_8888
Loongson:
image         firefox-planet-gnome  170.025  170.229   0.09%    3/4
image         firefox-planet-gnome  157.012  158.087   0.30%    6/6

ARM/iwMMXt:
image         firefox-planet-gnome  164.192  164.875   0.34%    3/4
image         firefox-planet-gnome  148.086  149.339   0.76%    6/6
2012-07-01 16:33:08 -04:00
Matt Turner
51f27d7364 mmx: Use expand_alpha instead of mask/shift 2012-07-01 16:25:30 -04:00
Siarhei Siamashka
b0855f095a Change default bilinear interpolation precision to 7 bits
This improves performance for the current SSE2 code. Further
reduction to 4 bits may be considered later if it proves
to allow additional speedup.
2012-07-01 23:00:34 +03:00
Siarhei Siamashka
c430b1dba7 sse2: _mm_madd_epi16 for faster bilinear scaling with 7-bit precision
Reducing interpolation precision allows the use of PMADDWD instruction.
This makes bilinear scaling much faster (on Intel Core i7):

8-bit: image             firefox-fishtank   57.584   58.349   0.74%    3/3
7-bit: image             firefox-fishtank   51.139   51.229   0.30%    3/3

8-bit: src_8888_8888 =  L1: 228.71  L2: 226.52  M:224.82 ( 14.95%)  HT:183.22  VT:154.02  R:171.72  RT:109.36
7-bit: src_8888_8888 =  L1: 320.45  L2: 317.43  M:314.38 ( 20.77%)  HT:215.13  VT:177.35  R:204.46  RT:121.93
2012-07-01 22:40:23 +03:00
Siarhei Siamashka
ccd31896bc Bilinear interpolation precision is now configurable at compile time
Macro BILINEAR_INTERPOLATION_BITS in pixman-private.h selects
the number of fractional bits used for bilinear interpolation.

scaling-test and affine-test have checksums for 4-bit, 7-bit
and 8-bit configurations.
2012-07-01 21:45:43 +03:00
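To illustrate what a compile-time BILINEAR_INTERPOLATION_BITS setting controls (ccd31896bc), here is a scalar sketch of one-channel bilinear interpolation whose weights carry that many fractional bits. The helper names sketch_fixed_to_weight() and sketch_bilinear_channel(), the 16.16 fixed-point input, and the default of 7 are assumptions for the example; the real fast paths are vectorized and differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fractional bits in the weights (assumed range: 1..8). */
#ifndef BILINEAR_INTERPOLATION_BITS
#define BILINEAR_INTERPOLATION_BITS 7
#endif

#define BILINEAR_RANGE (1 << BILINEAR_INTERPOLATION_BITS)

/* Reduces the fractional part of a 16.16 fixed-point coordinate to a
 * weight in [0, BILINEAR_RANGE). */
static int
sketch_fixed_to_weight (uint32_t fixed)
{
    return (int) ((fixed >> (16 - BILINEAR_INTERPOLATION_BITS)) &
                  (BILINEAR_RANGE - 1));
}

/* Interpolates one 8-bit channel between the four neighbouring pixels
 * (top-left, top-right, bottom-left, bottom-right). */
static uint8_t
sketch_bilinear_channel (uint8_t tl, uint8_t tr, uint8_t bl, uint8_t br,
                         int distx, int disty)
{
    uint32_t wx, wy, iwx, iwy, sum;

    assert (distx >= 0 && distx < BILINEAR_RANGE);
    assert (disty >= 0 && disty < BILINEAR_RANGE);

    wx = (uint32_t) distx;
    wy = (uint32_t) disty;
    iwx = BILINEAR_RANGE - wx;
    iwy = BILINEAR_RANGE - wy;

    sum = tl * iwx * iwy + tr * wx * iwy + bl * iwx * wy + br * wx * wy;

    /* The four weights always add up to BILINEAR_RANGE squared. */
    return (uint8_t) (sum >> (2 * BILINEAR_INTERPOLATION_BITS));
}
```

Fewer bits give coarser weights but smaller intermediate products, which is what lets a vectorized implementation fit them into narrower multiplies.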
Matt Turner
ad9f1d0201 Fix distcheck due to custom iwMMXt rules 2012-06-29 14:24:30 -04:00
Siarhei Siamashka
ff5d041b88 sse2: faster bilinear scaling (use _mm_loadl_epi64)
Using _mm_loadl_epi64() to load two pixels at once (pairs of top
and bottom pixels) is faster than loading each pixel separately
and combining them with _mm_set_epi32().

=== cairo-perf-trace ===

before: image             firefox-fishtank   66.912   66.931   0.13%    3/3
after:  image             firefox-fishtank   57.584   58.349   0.74%    3/3

=== lowlevel-blt-bench ===

before: src_8888_8888 =  L1: 181.10  L2: 179.14  M:178.08 ( 11.02%)  HT:153.22  VT:133.45  R:142.24  RT: 95.32
after:  src_8888_8888 =  L1: 228.68  L2: 225.75  M:223.98 ( 14.23%)  HT:185.32  VT:155.06  R:162.73  RT:102.52

This improvement was suggested by Matt Turner on irc.
2012-06-29 03:29:32 +03:00
Siarhei Siamashka
fc162bad56 test: support nearest/bilinear scaling in lowlevel-blt-bench
Scale factor is selected to be nearly 1x, so that the MPix/s results
can be directly compared with the results of non-scaled compositing
operations.
2012-06-29 03:24:29 +03:00
Siarhei Siamashka
387e9bcddb test: Fix for strict aliasing issue in 'get_random_seed'
Gets rid of gcc warning when compiled with -fstrict-aliasing option in CFLAGS
2012-06-29 03:23:09 +03:00
Andrea Canciani
4cbeb0aedc build: Fix compilation on win32
When compiling using the win32 build system, config.h is not
available nor needed.

Fixes:

pixman-glyph.c(26) : fatal error C1083: Cannot open include file:
'config.h': No such file or directory
2012-06-20 17:13:33 +02:00
Matt Turner
21077e1b83 sse2: add src_x888_0565
Port of 2ddd1c498b to SSE2.

Uses the pmadd technique described in
http://software.intel.com/sites/landingpage/legacy/mmx/MMX_App_24-16_Bit_Conversion.pdf

Works around lack of packusdw instruction by first sign extending the
values.

fast:	src_8888_0565 =  L1: 681.40  L2: 689.20  M: 644.76 ( 25.51%)  HT:404.42  VT:288.04  R:306.07  RT:150.80 (1619Kops/s)
mmx:	src_8888_0565 =  L1:2056.03  L2:1985.44  M:1574.91 ( 61.87%)  HT:533.10  VT:376.35  R:416.10  RT:178.79 (1833Kops/s)
sse2:	src_8888_0565 =  L1:3793.42  L2:3653.44  M:1878.83 ( 73.94%)  HT:535.03  VT:407.96  R:421.46  RT:163.31 (1727Kops/s)

and for reference, using packusdw
sse4:	src_8888_0565 =  L1:4396.18  L2:4229.25  M:1904.04 ( 75.18%)  HT:559.79  VT:427.96  R:440.06  RT:165.71 (1744Kops/s)

Notice that MMX is faster in the RT case because it can operate on
8 bytes instead of the current 16 bytes for SSE2.
2012-06-16 16:00:00 -04:00
Cyril Brulebois
3acc1ffc32 Upload to unstable. 2012-06-15 01:25:23 +02:00
Cyril Brulebois
1952e2a77b Document the cherry-pick, fixing FTBFS on *i386. 2012-06-15 01:20:14 +02:00
Matt Turner
1701defb49 mmx: add missing _mm_empty calls
Fixes spurious test failures on x86-32.
(cherry picked from commit da6193b1fc)
2012-06-15 01:19:04 +02:00
Cyril Brulebois
8940c5222e Upload to unstable. 2012-06-15 00:16:59 +02:00
Cyril Brulebois
0181d422ab Bump changelogs. 2012-06-15 00:15:43 +02:00
Cyril Brulebois
f53c40a739 Merge branch 'upstream-unstable' into debian-unstable 2012-06-15 00:15:23 +02:00
Matt Turner
7db07cb731 sse2: enable over_n_0565 for b5g6r5
Same as b950bb12 for MMX.
2012-06-13 19:32:21 -04:00
Matt Turner
45946c5fa1 .gitignore: add test/glyph-test 2012-06-13 19:32:21 -04:00
Søren Sandmann Pedersen
eadb442b5c test: Add missing break in stress-test.c
Found by coverity:

https://bugzilla.redhat.com/show_bug.cgi?id=756069
2012-06-13 07:30:06 -04:00
Siarhei Siamashka
492dac7593 test: fix bisecting issue in fuzzer-find-diff.pl
Before bisecting to find the exact test which has failed, we
first need to make sure that the first test is fine (the first
test is "good" and the whole range is "bad"). Otherwise
test 2 gets incorrectly flagged as problematic in the case
if we already got a failure on test 1 right from the start.
2012-06-12 04:21:57 +03:00
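The bisection precondition described in 492dac7593 can be sketched in C (the original script is Perl). The predicate type, the function names, and the model in which "running tests 1..n fails" is monotone in n are assumptions made for this illustration.

```c
#include <assert.h>

/* Hypothetical predicate: nonzero if running tests 1..n shows a mismatch. */
typedef int (*prefix_fails_fn) (int n, void *data);

/* Returns the number of the first failing test in 1..n_tests,
 * or 0 if the whole range passes. */
static int
find_first_failure (int n_tests, prefix_fails_fn prefix_fails, void *data)
{
    int good, bad;

    assert (n_tests >= 1);

    if (!prefix_fails (n_tests, data))
        return 0;

    /* Bisection needs a known-good lower end. Without this check, a
     * failure in test 1 would be reported as a failure in test 2. */
    if (prefix_fails (1, data))
        return 1;

    good = 1;
    bad = n_tests;
    while (bad - good > 1)
    {
        int mid = good + (bad - good) / 2;

        if (prefix_fails (mid, data))
            bad = mid;
        else
            good = mid;
    }

    return bad;
}

/* Example predicate: every prefix reaching the broken test fails. */
static int
fails_from (int n, void *data)
{
    return n >= *(const int *) data;
}

/* Convenience wrapper used to exercise the search. */
static int
first_failure_when_broken_at (int first_broken, int n_tests)
{
    return find_first_failure (n_tests, fails_from, &first_broken);
}
```

The loop maintains the invariant that prefix `good` passes and prefix `bad` fails, so it can only be entered once both ends have been checked.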
Siarhei Siamashka
40a0d10eea test: OpenMP 2.5 requires signed loop iteration variables
Unsigned loop variables are only supported since version 3.0
of OpenMP specification. Changing loop variables to use int32_t
type fixes pixman build problems with path64 compiler.
2012-06-12 04:21:07 +03:00
Søren Sandmann Pedersen
619a60d201 test: Make glyph test pass on big endian
The destination buffer was initialized with random uint32_t values, so
it started out different on big endian vs. little endian. Fix that by
initializing the buffer with random uint8_t values instead.
2012-06-11 19:19:23 -04:00
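The endianness point in 619a60d201 can be illustrated with a byte-wise fill. This is a sketch under assumptions: the small LCG and the function names are hypothetical and are not the test suite's own random number helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Small illustrative LCG; returns the next value and updates the state. */
static uint32_t
sketch_lcg_next (uint32_t *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fff;
}

/* Fills the buffer one byte at a time, so the bytes in memory are the
 * same on every host. Storing whole uint32_t values instead would lay
 * each value's bytes out in host byte order. */
static void
fill_random_bytes (void *buf, size_t n_bytes, uint32_t seed)
{
    uint8_t *p = (uint8_t *) buf;
    uint32_t state = seed;
    size_t i;

    for (i = 0; i < n_bytes; i++)
        p[i] = (uint8_t) (sketch_lcg_next (&state) & 0xff);
}

/* Checks that a uint32_t buffer and a byte buffer filled from the same
 * seed hold identical bytes. */
static int
byte_fill_is_layout_independent (uint32_t seed)
{
    uint32_t words[4];
    uint8_t bytes[16];

    fill_random_bytes (words, sizeof words, seed);
    fill_random_bytes (bytes, sizeof bytes, seed);

    return memcmp (words, bytes, sizeof bytes) == 0;
}
```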
Søren Sandmann Pedersen
f80e7ad3cb bits-image: Turn all the fetchers into iterator getters
Instead of caching these fetchers in the image structure, and then
have the iterator getter call them from there, simply change them to
be iterator getters themselves.

This avoids an extra indirect function call and lets us get rid of the
get_scanline_32/64 fields in pixman_image_t.
2012-06-11 07:15:00 -04:00
Antti S. Lankila
fd175f9d02 Faster unorm_to_unorm for wide processing.
Optimizing the unorm_to_unorm functions allows a speedup from:

src_8888_2x10 =  L1:  62.08  L2:  60.73  M: 59.61 (  4.30%)  HT: 46.81
	VT: 42.17  R: 43.18  RT: 26.01 (325Kops/s)

to:

src_8888_2x10 =  L1:  76.94  L2:  78.43  M: 75.87 (  5.59%)  HT: 56.73
	VT: 52.39  R: 53.00  RT: 29.29 (363Kops/s)

on an i7 Q720-based laptop.

The key of the patch is the observation that unorm_to_unorm's work can
more easily be done with a simple multiplication and shift, when the
function is applied repeatedly and the parameters are not compile-time
constants. For instance, converting from 0xfe to 0xfefe (expanding
from 8 bits to 16 bits) can be done by calculating

c = c * 0x101

However, sometimes the result is not a neat replication of all the
bits. For instance, going from 10 bits to 16 bits can be done by
calculating

c = c * 0x401UL >> 4

where the intermediate result is a 20-bit-wide repetition of the 10-bit
pattern, followed by shifting off the unnecessary lowest bits.

The patch has the algorithm to calculate the factor and the shift, and
converts the code to use it.
2012-06-10 14:23:17 -04:00
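One way to derive the factor and shift described in fd175f9d02 is sketched below. It is written from the description in the message, not copied from the patch, so the names (unorm_scale_t, compute_unorm_scale, sketch_unorm_to_unorm) and the 16-bit width limit are assumptions.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    uint32_t factor;   /* multiplier that replicates the bit pattern */
    int      shift;    /* surplus low bits to drop afterwards */
} unorm_scale_t;

/* Builds a factor with a 1 bit every from_bits positions until the
 * replicated pattern is at least to_bits wide; the surplus is the shift. */
static unorm_scale_t
compute_unorm_scale (int from_bits, int to_bits)
{
    unorm_scale_t s = { 0, 0 };
    int total = 0;

    assert (from_bits >= 1 && from_bits <= 16);
    assert (to_bits >= 1 && to_bits <= 16);

    while (total < to_bits)
    {
        s.factor |= (uint32_t) 1 << total;
        total += from_bits;
    }

    s.shift = total - to_bits;
    return s;
}

/* Converts a from_bits-wide normalized value to to_bits of precision. */
static uint32_t
sketch_unorm_to_unorm (uint32_t value, int from_bits, int to_bits)
{
    unorm_scale_t s = compute_unorm_scale (from_bits, to_bits);

    assert (value < ((uint32_t) 1 << from_bits));

    return (value * s.factor) >> s.shift;
}
```

For 8 to 16 bits this yields factor 0x101 and shift 0, and for 10 to 16 bits factor 0x401 and shift 4, matching the two cases in the message. When narrowing, the loop runs once, so the factor is 1 and the value is simply shifted right. In a hot loop the scale would be computed once and reused rather than recomputed per value.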
Matt Turner
367b78fd5c configure.ac: add iwmmxt2 configure flag
The flag allows the user to select whether pixman-mmx.c is compiled with
-march=iwmmxt or -march=iwmmxt2.

gcc has scheduling support for the Marvell CPU in the XO 1.75 when
building with -march=iwmmxt2.
2012-06-09 16:57:16 -04:00
Matt Turner
31a6563ec5 autotools: use custom build rule to build iwMMXt code
gcc has no sane way of enabling iwmmxt code generation, like -msse for
SSE, so you have to use -march=iwmmxt{,2}. User CFLAGS are placed after
-march=iwmmxt and override the march value, so we have to use a custom
build rule to order the CFLAGS such that pixman-mmx.c will be built with
the necessary CFLAGS.
2012-06-09 16:57:16 -04:00
Søren Sandmann Pedersen
706bf8264c Speed up _pixman_image_get_solid() in common cases
Make _pixman_image_get_solid() faster by special-casing the common
cases where the image is SOLID or a repeating a8r8g8b8 image.

This optimization together with the previous one results in a small
but reproducible performance improvement on the xfce4-terminal-a1
cairo trace:

[ # ]  backend                         test   min(s) median(s) stddev. count
Before:
[  0]    image            xfce4-terminal-a1    1.221    1.239   1.21%  100/100
After:
[  0]    image            xfce4-terminal-a1    1.170    1.199   1.26%  100/100

Either optimization by itself is difficult to separate from noise.
2012-06-02 08:19:38 -04:00
Søren Sandmann Pedersen
934c9d8546 Speed up _pixman_composite_glyphs_no_mask()
Bypass much of the overhead of pixman_image_composite32() by only
computing the composite region once instead of once per glyph, and by
only looking up the composite function whenever the glyph format or
flags change.

As part of this, the pixman_compute_composite_region32() was renamed
to _pixman_compute_composite_region32() and exported in
pixman-private.h.

I couldn't find a trace that would reliably demonstrate that this is
actually an improvement by itself (since _pixman_composite_glyphs_no_mask()
is called so rarely), but together with the following optimization for
solid sources, there is a small but reliable improvement to the
xfce4-terminal-a1 cairo trace.
2012-06-02 08:19:38 -04:00
Søren Sandmann Pedersen
a162189dc0 Speed up pixman_composite_glyphs()
When adding glyphs to the mask, bypass most of the overhead of
pixman_image_composite32() by:

- Only looking up the composite function when the glyph changes either
  format or flags.

- Only using a white source when the glyph format is different from
  the mask format.

- Simply intersecting the glyph rectangle with the destination
  rectangle instead of doing the full _pixman_composite_region32().

Performance results:

[ # ]  backend                         test   min(s) median(s) stddev. count
Before:
[  0]    image            firefox-talos-gfx    6.570    6.577   0.13%    8/10
After:
[  0]    image            firefox-talos-gfx    4.272    4.289   0.28%   10/10

V2: Changes to deal with white sources
2012-06-02 08:19:30 -04:00
Søren Sandmann Pedersen
d9710442b4 test: Add glyph-test
This test exercises the new glyph cache and compositing API. Much of this
test is intended to make sure that clipping and alpha map handling
survive any optimizations that may be added to the glyph compositing.

V2: Evaluating lcg_rand_n() multiple times in an argument list led
    to undefined behavior.
2012-06-02 07:55:11 -04:00
Søren Sandmann Pedersen
dc92374727 Add support for alpha maps to compute_crc32_for_image().
When a destination image I has an alpha map A, the following rules apply:

   - If I has an alpha channel itself, the content of that channel is
     undefined

   - If A has RGB channels, the content of those channels is
     undefined.

Hence in order to compute the CRC32 for such an image, we have to mask
off the alpha channel of the image, and the RGB channels of the alpha
map.

V2: Shifting by 32 is undefined in C
2012-06-02 07:55:11 -04:00
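The "shifting by 32 is undefined" note in dc92374727 concerns building channel masks. A hedged sketch of a mask helper that avoids the problem follows; the function names and the (bit count, shift) parameterization are assumptions for illustration and are not the code in utils.c.

```c
#include <assert.h>
#include <stdint.h>

/* Returns a mask with n_bits set, starting at bit position shift.
 * The obvious ((1 << n_bits) - 1) is undefined for n_bits == 32 on a
 * 32-bit type, so an all-ones value is shifted right instead. */
static uint32_t
channel_mask (int n_bits, int shift)
{
    assert (n_bits >= 0 && n_bits <= 32);
    assert (shift >= 0 && shift + n_bits <= 32);

    if (n_bits == 0)
        return 0;

    return (0xffffffffu >> (32 - n_bits)) << shift;
}

/* Clears one channel of a pixel so that undefined content in that
 * channel cannot influence a checksum. */
static uint32_t
strip_channel (uint32_t pixel, int n_bits, int shift)
{
    return pixel & ~channel_mask (n_bits, shift);
}
```

For an a8r8g8b8 pixel, strip_channel (pixel, 8, 24) clears the alpha byte and leaves the color channels untouched.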
Søren Sandmann Pedersen
43e029d525 Move CRC32 computation from blitters-test.c into utils.c
This way it can be used in other tests.
2012-06-02 07:55:11 -04:00
Søren Sandmann Pedersen
fce31a5ef8 Add pixman_glyph_cache_t API
This new API allows entire glyph strings to be composited in one go
which reduces overhead compared to multiple calls to
pixman_image_composite32().

The pixman_glyph_cache_t is a hash table that maps two keys (a "font"
and a "glyph" key, but they are just keys; there is no distinction
between them as far as pixman is concerned) to a glyph. Glyphs in the
cache can be composited through two new entry points
pixman_glyph_cache_composite_glyphs() and
pixman_glyph_cache_composite_glyphs_no_mask().

A glyph cache may only be inserted into when it is "frozen", which is
achieved by calling pixman_glyph_cache_freeze(). When
pixman_glyph_cache_thaw() is later called, if the cache has become too
crowded, some glyphs (currently the least-recently-used) will
automatically be evicted. This means that a user must ensure that all
the required glyphs are present in the cache before compositing a
string. The intended way to use the cache is like this:

        pixman_glyph_t glyphs[MAX_GLYPHS];

        pixman_glyph_cache_freeze (cache);

        for (i = 0; i < n_glyphs; ++i)
        {
            const void *g;

            if (!(g = pixman_glyph_cache_lookup (cache, font_key, glyph_key)))
            {
                img = <rasterize glyph as a pixman_image_t>;

                g = pixman_glyph_cache_insert (cache, font_key, glyph_key,
                                               glyph_origin_x, glyph_origin_y,
                                               img);

                if (!g)
                {
                    /* Clean up out-of-memory condition */
                    goto oom;
                }
            }

            /* Record the glyph whether it was already cached or just inserted */
            glyphs[i].pos_x = glyph_x_pos;
            glyphs[i].pos_y = glyph_y_pos;
            glyphs[i].glyph = g;
        }

        pixman_composite_glyphs (op, src, dest, ..., cache, n_glyphs, glyphs);

        pixman_glyph_cache_thaw (cache);

V2:
- Move glyphs to front of the MRU list when they are used. Pointed
  out by Behdad Esfahbod.
- Composite glyphs with (white IN glyph) ADD mask in order to support
  mixed a8 and a8r8g8b8 glyphs. Also pointed out by Behdad.
- Add pixman_glyph_get_mask_format
2012-06-02 07:55:11 -04:00
Søren Sandmann Pedersen
a3ae88b71b Add doubly linked lists
This commit adds some new inline functions to maintain a doubly linked
list.

The way to use them is to embed a pixman_link_t into the structures
that should be linked, and use a pixman_list_t as the head of the
list.

The new functions are

    pixman_list_init (pixman_list_t *list);
    pixman_list_prepend (pixman_list_t *list, pixman_link_t *link);
    pixman_list_move_to_front (pixman_list_t *list, pixman_link_t *link);

There is also a new macro:

    CONTAINER_OF(type, member, data);

that can be used to get from a pointer to a member to the containing
structure.

V2: Use the C89 macro offsetof() instead of rolling our own -
suggested by Alan Coopersmith.
2012-06-02 07:54:48 -04:00
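An intrusive list of the kind described in a3ae88b71b can be sketched as follows. All names carry a sketch_ prefix to make clear that this is an illustration under assumptions (a circular list with a sentinel head), not the pixman implementation; only the idea of embedding a link in the element and recovering the element with offsetof() comes from the message.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sketch_link sketch_link_t;
struct sketch_link
{
    sketch_link_t *next;
    sketch_link_t *prev;
};

/* The head is a sentinel link; an empty list points at itself. */
typedef struct
{
    sketch_link_t head;
} sketch_list_t;

/* Gets from a pointer to an embedded member back to its container. */
#define SKETCH_CONTAINER_OF(type, member, data) \
    ((type *) ((char *) (data) - offsetof (type, member)))

static void
sketch_list_init (sketch_list_t *list)
{
    list->head.next = &list->head;
    list->head.prev = &list->head;
}

static void
sketch_list_prepend (sketch_list_t *list, sketch_link_t *link)
{
    link->next = list->head.next;
    link->prev = &list->head;
    list->head.next->prev = link;
    list->head.next = link;
}

static void
sketch_list_move_to_front (sketch_list_t *list, sketch_link_t *link)
{
    /* Unlink from the current position, then reinsert at the front. */
    link->prev->next = link->next;
    link->next->prev = link->prev;
    sketch_list_prepend (list, link);
}

/* Example element type with an embedded link. */
typedef struct
{
    int           id;
    sketch_link_t link;
} sketch_item_t;

/* Prepends items 1, 2, 3, moves item 1 to the front, and returns the
 * resulting order encoded as decimal digits. */
static int
sketch_demo_order (void)
{
    sketch_list_t list;
    sketch_item_t a = { 1, { NULL, NULL } };
    sketch_item_t b = { 2, { NULL, NULL } };
    sketch_item_t c = { 3, { NULL, NULL } };
    sketch_link_t *l;
    int order = 0;

    sketch_list_init (&list);
    sketch_list_prepend (&list, &a.link);
    sketch_list_prepend (&list, &b.link);
    sketch_list_prepend (&list, &c.link);
    sketch_list_move_to_front (&list, &a.link);

    for (l = list.head.next; l != &list.head; l = l->next)
        order = order * 10 + SKETCH_CONTAINER_OF (sketch_item_t, link, l)->id;

    return order;
}
```

Moving an element to the front on every use keeps the list in most-recently-used order, which is the kind of bookkeeping a least-recently-used eviction policy needs.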
Søren Sandmann Pedersen
c2230fe2af Make use of image flags in mmx and sse2 iterators
Now that we have the full image flags available, the SSE2 and MMX
iterators can simply check against SAMPLES_COVER_CLIP_NEAREST (which
is computed in pixman_image_composite32()) instead of comparing all
the x/y/width/height parameters.
2012-05-30 04:42:29 -04:00
Søren Sandmann Pedersen
c1065a9cb4 Pass the full image flags to iterators
When pixman_image_composite32() is called some flags are computed that
indicate various things about the composite operation that can't be
deduced from the image flags themselves. These additional flags are
not currently available to iterators. All they can do is read the
image flags in image->common.flags.

Fix that by passing the info->{src, mask, dest}_flags on to the
iterator initialization and store the flags in the iter struct as
"image_flags". At the same time rename the *iterator* flags variable
to "iter_flags" to avoid confusion.
2012-05-30 04:34:29 -04:00
Matt Turner
da6193b1fc mmx: add missing _mm_empty calls
Fixes spurious test failures on x86-32.
2012-05-27 14:59:56 -04:00
Matt Turner
62c4bdc94f mmx: add over_reverse_n_8888
Loongson:
over_reverse_n_8888 =  L1:  16.04  L2:  15.35  M: 10.20 ( 27.96%)  HT: 10.95  VT: 10.45  R:  9.18  RT:  6.99 (  76Kops/s)
over_reverse_n_8888 =  L1:  27.40  L2:  26.67  M: 16.97 ( 45.78%)  HT: 16.66  VT: 15.38  R: 14.15  RT:  9.44 (  97Kops/s)

image                      poppler   34.106   35.500   1.48%    6/6
image                      poppler   29.598   30.835   1.70%    6/6

ARM/iwMMXt:
over_reverse_n_8888 =  L1:  15.63  L2:  14.33  M: 10.83 ( 27.55%)  HT:  9.78  VT:  9.91  R:  9.49  RT:  6.96 (  69Kops/s)
over_reverse_n_8888 =  L1:  22.79  L2:  19.40  M: 13.76 ( 34.19%)  HT: 11.66  VT: 11.86  R: 11.17  RT:  7.85 (  75Kops/s)

image                      poppler   38.040   38.606   1.10%    6/6
image                      poppler   31.686   32.278   0.80%    5/6
2012-05-26 20:32:27 -04:00
Matt Turner
17acc7a4c7 mmx: add add_0565_0565
Loongson:
add_0565_0565 =  L1:  15.37  L2:  14.91  M: 11.83 ( 16.06%)  HT: 10.53  VT: 10.15  R:  9.74  RT:  6.19 (  68Kops/s)
add_0565_0565 =  L1:  45.06  L2:  46.71  M: 27.45 ( 38.00%)  HT: 23.76  VT: 22.84  R: 18.96  RT:  9.79 ( 104Kops/s)

ARM/iwMMXt:
add_0565_0565 =  L1:  12.87  L2:  11.58  M: 10.11 ( 12.50%)  HT:  9.06  VT:  8.66  R:  7.70  RT:  5.62 (  58Kops/s)
add_0565_0565 =  L1:  31.14  L2:  28.87  M: 22.46 ( 28.60%)  HT: 18.61  VT: 17.04  R: 15.21  RT:  9.35 (  90Kops/s)
2012-05-26 20:32:27 -04:00
Matt Turner
d551dc0494 fast: add add_0565_0565 function
I'll need this code for header and tail alignment loops in MMX, so I
might as well implement a fast path here.
2012-05-26 20:32:27 -04:00
Matt Turner
f8dc0e9834 mmx: implement expand_4x565 in terms of expand_4xpacked565
Loongson:
        over_n_0565 =  L1:  38.57  L2:  38.88  M: 30.01 ( 20.97%)  HT: 23.60  VT: 23.88  R: 21.95  RT: 11.65 ( 113Kops/s)
        over_n_0565 =  L1:  56.28  L2:  55.90  M: 34.20 ( 23.82%)  HT: 25.66  VT: 26.60  R: 23.78  RT: 11.80 ( 115Kops/s)

     over_8888_0565 =  L1:  35.89  L2:  36.11  M: 21.56 ( 45.47%)  HT: 18.33  VT: 17.90  R: 16.27  RT:  9.07 (  98Kops/s)
     over_8888_0565 =  L1:  40.91  L2:  41.06  M: 23.13 ( 48.46%)  HT: 19.24  VT: 18.71  R: 16.82  RT:  9.18 (  99Kops/s)

      over_n_8_0565 =  L1:  28.92  L2:  29.12  M: 21.42 ( 30.00%)  HT: 18.37  VT: 17.75  R: 16.15  RT:  8.79 (  91Kops/s)
      over_n_8_0565 =  L1:  32.32  L2:  32.13  M: 22.44 ( 31.27%)  HT: 19.15  VT: 18.66  R: 16.62  RT:  8.86 (  92Kops/s)

over_n_8888_0565_ca =  L1:  29.33  L2:  29.22  M: 18.99 ( 66.69%)  HT: 16.69  VT: 16.22  R: 14.63  RT:  8.42 (  88Kops/s)
over_n_8888_0565_ca =  L1:  34.97  L2:  34.14  M: 20.32 ( 71.73%)  HT: 17.67  VT: 17.19  R: 15.23  RT:  8.50 (  89Kops/s)

ARM/iwMMXt:
        over_n_0565 =  L1:  29.70  L2:  30.53  M: 24.47 ( 14.84%)  HT: 22.28  VT: 21.72  R: 21.13  RT: 12.58 ( 105Kops/s)
        over_n_0565 =  L1:  41.42  L2:  40.00  M: 30.95 ( 19.13%)  HT: 27.06  VT: 27.28  R: 23.43  RT: 14.44 ( 114Kops/s)

     over_8888_0565 =  L1:  12.73  L2:  11.53  M:  9.07 ( 16.47%)  HT:  9.00  VT:  9.25  R:  8.44  RT:  7.27 (  76Kops/s)
     over_8888_0565 =  L1:  23.72  L2:  21.76  M: 15.89 ( 29.51%)  HT: 14.36  VT: 14.05  R: 12.44  RT:  8.94 (  86Kops/s)

      over_n_8_0565 =  L1:   6.80  L2:   7.15  M:  6.37 (  7.90%)  HT:  6.58  VT:  6.24  R:  6.49  RT:  5.94 (  59Kops/s)
      over_n_8_0565 =  L1:  12.06  L2:  11.02  M: 10.16 ( 13.43%)  HT:  9.57  VT:  8.49  R:  9.10  RT:  6.86 (  69Kops/s)

over_n_8888_0565_ca =  L1:   7.62  L2:   7.01  M:  6.27 ( 20.52%)  HT:  6.00  VT:  6.07  R:  5.68  RT:  5.53 (  57Kops/s)
over_n_8888_0565_ca =  L1:  13.54  L2:  11.96  M:  9.76 ( 30.66%)  HT:  9.72  VT:  8.45  R:  9.37  RT:  6.85 (  67Kops/s)
2012-05-26 20:32:27 -04:00
Matt Turner
51681a052f mmx: add and use expand_4xpacked565 function
Loongson:
add_0565_0565 =  L1:  14.39  L2:  13.98  M: 11.28 ( 15.22%)  HT: 10.11  VT:  9.74  R:  9.39  RT:  6.05 (  67Kops/s)
add_0565_0565 =  L1:  15.37  L2:  14.91  M: 11.83 ( 16.06%)  HT: 10.53  VT: 10.15  R:  9.74  RT:  6.19 (  68Kops/s)

ARM/iwMMXt:
add_0565_0565 =  L1:  11.12  L2:  10.40  M:  8.82 ( 10.65%)  HT:  7.98  VT:  7.41  R:  7.57  RT:  5.21 (  54Kops/s)
add_0565_0565 =  L1:  12.87  L2:  11.58  M: 10.11 ( 12.50%)  HT:  9.06  VT:  8.66  R:  7.70  RT:  5.62 (  58Kops/s)
2012-05-26 20:32:27 -04:00
Søren Sandmann Pedersen
6491c70e3a Post-release version bump to 0.27.1 2012-05-26 16:34:13 -04:00
Søren Sandmann Pedersen
b1a401e6c9 Pre-release version bump to 0.26.0 2012-05-26 16:17:14 -04:00
Ingmar Runge
f71e3dba97 Fix MSVC compilation
Only up to three SSE intrinsics supported in function declaration.
2012-05-25 20:10:31 -04:00
Søren Sandmann Pedersen
1e59e18d73 test: Composite with solid images instead of using pixman_image_fill_*
There are a couple of places where the test suite uses the
pixman_image_fill_* functions to initialize images. These functions
can fail, and will do so if the "fast" implementation is disabled.

So to make sure the test suite passes even using
PIXMAN_DISABLE="fast", use pixman_image_composite32() with a solid
image instead of pixman_image_fill_*.
2012-05-24 15:30:41 -04:00
Nemanja Lukic
30816e3068 MIPS: DSPr2: Added bilinear over_8888_8_8888 fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):

cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.3
[  0]    image             firefox-fishtank 2289.180 2290.567   0.05%    5/6

Optimized:

cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.3
[  0]    image             firefox-fishtank 1700.925 1708.314   0.22%    5/6
2012-05-23 13:50:05 -04:00
Nemanja Lukic
aea0522f6f MIPS: DSPr2: Fix bug in over_n_8888_8888_ca/over_n_8888_0565_ca routines
In the main loop (unrolled by a factor of 2), instead of negating the mask
values multiplied by srca, the value of srca was negated and passed as the
alpha argument for

    UN8x4_MUL_UN8x4_ADD_UN8x4 macro.

Instead of:

    ma = ~ma;
    UN8x4_MUL_UN8x4_ADD_UN8x4 (d, ma, s);

Code was doing this:

    ma = ~srca;
    UN8x4_MUL_UN8x4_ADD_UN8x4 (d, ma, s);

The key is substituting registers s0/s1 (containing the srca value) with
t0/t1, which contain the mask values multiplied by srca.  Register usage is
also improved (fewer registers are saved on the stack for the
over_n_8888_8888_ca routine).

The bug was introduced in commit d2ee5631 and revealed by composite test.
2012-05-23 13:41:44 -04:00
Søren Sandmann Pedersen
74bf5dc2f9 demos: Add parrot.jpg to EXTRA_DIST
Pointed out by Cyril Brulebois.
2012-05-20 13:09:16 -04:00
Cyril Brulebois
ae5a109768 Upload to experimental. 2012-05-20 17:56:41 +02:00
Cyril Brulebois
a2283057a6 Remove demos/parrot.jpg before building the source package.
Let's avoid “binary file contents changed” until it's shipped in the
upstream tarball.
2012-05-20 17:56:18 +02:00
Cyril Brulebois
5cb7202a34 Bump changelogs. 2012-05-20 17:41:34 +02:00
Cyril Brulebois
4ed6f63c09 Merge branch 'upstream-experimental' into debian-experimental 2012-05-20 17:40:56 +02:00
Matt Turner
55698584be configure.ac: Fail the ARM/iwMMXt test if not compiling with -march=iwmmxt
If not compiling with -march=iwmmxt, the configure test will still pass,
thinking that the __builtin_arm_* intrinsic is a function instead of
generating a single instruction. Since no linking is done, the configure
test doesn't catch this, and we get linking errors in the build.
2012-05-15 16:41:22 -04:00
Søren Sandmann Pedersen
3682b61515 Post-release version bump to 0.25.7 2012-05-15 13:38:44 -04:00
Søren Sandmann Pedersen
1e1a00e964 Pre-release version bump to 0.25.6
Note that 0.25.4 was a botched release that doesn't have a tag and
doesn't correspond to any commit ID. It was however uploaded and
announced, so I'll just use the 0.25.6 version number.
2012-05-15 13:20:09 -04:00
Søren Sandmann Pedersen
b2c16aaadf demos/Makefile.am: Add parrot.c to EXTRA_DIST
To get 'make distcheck' to pass.
2012-05-15 13:19:19 -04:00
Matt Turner
50d3088d78 configure.ac: Rename loongson -> loongson-mmi
Make it match with the other fast paths, and the PIXMAN_DISABLE value is
already loongson-mmi.
2012-05-11 21:59:13 -04:00
Matt Turner
a0a40cb822 configure.ac: Fix loongson-mmi out-of-tree builds
When building out-of-tree, gcc wasn't able to find loongson-mmintrin.h
to compile the test program. Add -I$srcdir to CFLAGS to point gcc to it.
2012-05-11 21:49:42 -04:00
Nemanja Lukic
618a08e6aa MIPS: DSPr2: Added over_n_8_8888 and over_n_8_0565 fast paths.
Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):

lowlevel-blt-bench:
     over_n_8_8888 =  L1:  10.40  L2:   9.79  M:  8.47 ( 33.62%)  HT:  7.64  VT:  7.59  R:  7.48  RT:  5.30 (  40Kops/s)
     over_n_8_0565 =  L1:   7.40  L2:   7.23  M:  6.78 ( 17.94%)  HT:  6.23  VT:  6.17  R:  6.14  RT:  4.62 (  37Kops/s)

Optimized:

lowlevel-blt-bench:
     over_n_8_8888 =  L1:  27.25  L2:  26.24  M: 18.15 ( 72.12%)  HT: 14.52  VT: 14.31  R: 13.83  RT:  7.57 (  48Kops/s)
     over_n_8_0565 =  L1:  18.91  L2:  17.59  M: 15.06 ( 39.90%)  HT: 12.18  VT: 11.98  R: 11.83  RT:  6.80 (  46Kops/s)
2012-05-11 17:11:27 -04:00
Matt Turner
7d4beedc61 mmx: add and use pack_4x565 function
The pack_4x565 makes use of the pack_4xpacked565 function which uses pmadd.

Some of the speed up is probably attributable to removing the artificial
serialization imposed by the
	vdest = pack_565 (..., vdest, 0);
	vdest = pack_565 (..., vdest, 1);
	...
pattern.
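
A scalar sketch of the multiply-add packing idea, for illustration only. The function names and the constants (0x2000 and 0x4 as multipliers, 0xf8 and 0xfc00 as masks) are chosen for this example and are not quoted from pixman-mmx.c; the point is that one multiply-add moves red and blue into place at once, so only green needs a separate mask.

```c
#include <stdint.h>

/* Pack one x8r8g8b8 pixel to r5g6b5 with a multiply-add:
 * red and blue are masked to their top 5 bits, each is multiplied by a
 * power of two and the products are summed (the scalar analogue of a
 * pmadd on two 16-bit lanes), green is OR-ed in, and one final shift
 * aligns all three fields. */
static uint16_t pack_565_madd(uint32_t p)
{
    uint32_t r = (p >> 16) & 0xf8;
    uint32_t b = p & 0xf8;
    uint32_t g = p & 0xfc00;
    uint32_t t = r * 0x2000 + b * 0x4;  /* multiply-add step */

    t |= g;
    return (uint16_t)(t >> 5);
}

/* Reference conversion using plain shifts and masks. */
static uint16_t pack_565_shift(uint32_t p)
{
    return (uint16_t)(((p >> 8) & 0xf800) |
                      ((p >> 5) & 0x07e0) |
                      ((p >> 3) & 0x001f));
}
```

Both functions give the same result for any input pixel; the vector version processes several pixels per instruction with no dependency between them.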

Loongson:
        over_n_0565 =  L1:  16.44  L2:  16.42  M: 13.83 (  9.85%)  HT: 12.83  VT: 12.61  R: 12.34  RT:  8.90 (  93Kops/s)
        over_n_0565 =  L1:  42.48  L2:  42.53  M: 29.83 ( 21.20%)  HT: 23.39  VT: 23.72  R: 21.80  RT: 11.60 ( 113Kops/s)

     over_8888_0565 =  L1:  15.61  L2:  15.42  M: 12.11 ( 25.79%)  HT: 11.07  VT: 10.70  R: 10.37  RT:  7.25 (  82Kops/s)
     over_8888_0565 =  L1:  35.01  L2:  35.20  M: 21.42 ( 45.57%)  HT: 18.12  VT: 17.61  R: 16.09  RT:  9.01 (  97Kops/s)

      over_n_8_0565 =  L1:  15.17  L2:  14.94  M: 12.57 ( 17.86%)  HT: 11.96  VT: 11.52  R: 10.79  RT:  7.31 (  79Kops/s)
      over_n_8_0565 =  L1:  29.83  L2:  29.79  M: 21.85 ( 30.94%)  HT: 18.82  VT: 18.25  R: 16.15  RT:  8.72 (  91Kops/s)

over_n_8888_0565_ca =  L1:  15.25  L2:  15.02  M: 11.64 ( 41.39%)  HT: 11.08  VT: 10.72  R: 10.02  RT:  7.00 (  77Kops/s)
over_n_8888_0565_ca =  L1:  30.12  L2:  29.99  M: 19.47 ( 68.99%)  HT: 17.05  VT: 16.55  R: 14.67  RT:  8.38 (  88Kops/s)

ARM/iwMMXt:
        over_n_0565 =  L1:  19.29  L2:  19.88  M: 17.38 ( 10.54%)  HT: 15.53  VT: 16.11  R: 13.69  RT: 11.00 (  96Kops/s)
        over_n_0565 =  L1:  36.02  L2:  34.85  M: 28.04 ( 16.97%)  HT: 22.12  VT: 24.21  R: 22.36  RT: 12.22 ( 103Kops/s)

     over_8888_0565 =  L1:  18.38  L2:  16.59  M: 12.34 ( 22.29%)  HT: 11.67  VT: 11.71  R: 11.02  RT:  6.89 (  72Kops/s)
     over_8888_0565 =  L1:  24.96  L2:  22.17  M: 15.11 ( 26.81%)  HT: 14.14  VT: 13.71  R: 13.18  RT:  8.13 (  78Kops/s)

      over_n_8_0565 =  L1:  14.65  L2:  12.44  M: 11.56 ( 14.50%)  HT: 10.93  VT: 10.39  R: 10.06  RT:  7.05 (  70Kops/s)
      over_n_8_0565 =  L1:  18.37  L2:  14.98  M: 13.97 ( 16.51%)  HT: 12.67  VT: 10.35  R: 11.80  RT:  8.14 (  74Kops/s)

over_n_8888_0565_ca =  L1:  14.27  L2:  12.93  M: 10.52 ( 33.23%)  HT:  9.70  VT:  9.90  R:  9.31  RT:  6.34 (  65Kops/s)
over_n_8888_0565_ca =  L1:  19.69  L2:  17.58  M: 13.40 ( 42.35%)  HT: 11.75  VT: 11.33  R: 11.17  RT:  7.49 (  73Kops/s)
2012-05-10 16:21:07 -04:00
Matt Turner
2beabd9fed configure.ac: make -march=loongson2f come before CFLAGS
Otherwise we'd have -march=loongson2f being overridden by automake's
CFLAGS ordering which causes build failures when -march=<not loongson2f>
is specified by the user.
2012-05-10 16:15:34 -04:00
Søren Sandmann Pedersen
dadb9a318b Add Makefile.win32 and Makefile.win32.common to EXTRA_DIST
https://bugs.freedesktop.org/show_bug.cgi?id=46905
2012-05-10 15:54:32 -04:00
Matt Turner
3c57ec471e .gitignore: add demos/checkerboard and demos/quad2quad 2012-05-09 22:50:50 -04:00
Matt Turner
2d431b53d3 mmx: Use wpackhus in src_x888_0565 on iwMMXt
iwMMXt has an unsigned saturating pack instruction, while MMX/EXT
and Loongson don't.

ARM/iwMMXt:
src_8888_0565 =  L1: 110.38  L2:  82.33  M: 40.92 ( 73.22%)  HT: 35.63  VT: 32.22  R: 30.07  RT: 18.40 ( 132Kops/s)
src_8888_0565 =  L1: 117.91  L2:  83.05  M: 41.52 ( 75.58%)  HT: 37.63  VT: 35.40  R: 29.37  RT: 19.39 ( 134Kops/s)
2012-04-27 16:39:13 -04:00
Matt Turner
2ddd1c498b mmx: add src_8888_0565
Uses the pmadd technique described in
http://software.intel.com/sites/landingpage/legacy/mmx/MMX_App_24-16_Bit_Conversion.pdf

The technique uses the packssdw instruction which uses signed
saturation. This works in their example because they pack 888 to 555
leaving the high bit as zero. For packing to 565, it is unsuitable, so
we replace it with an or+shuffle.
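
To see why signed saturation is a problem for 565 but not for 555, here is a small scalar model. The helpers saturate_s16 and saturate_u16 are hypothetical names for this sketch; they mimic what a signed and an unsigned saturating 32-to-16-bit pack do to a single lane.

```c
#include <stdint.h>

/* Signed saturation: clamp a 32-bit value to [-32768, 32767]. */
static int16_t saturate_s16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Unsigned saturation: clamp a 32-bit value to [0, 65535]. */
static uint16_t saturate_u16(int32_t v)
{
    if (v > UINT16_MAX)
        return UINT16_MAX;
    if (v < 0)
        return 0;
    return (uint16_t)v;
}
```

A 555 pixel never exceeds 0x7fff, so it survives the signed pack. A 565 pixel with the top red bit set (0xf800, for example) is larger than 32767 and is clamped to 0x7fff, which corrupts the colour; an unsigned saturating pack leaves it intact.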

Loongson:
src_8888_0565 =  L1: 106.13  L2:  83.57  M: 33.46 ( 68.90%)  HT: 30.29  VT: 27.67  R: 26.11  RT: 15.06 ( 135Kops/s)
src_8888_0565 =  L1: 122.10  L2: 117.53  M: 37.97 ( 78.58%)  HT: 33.14  VT: 30.09  R: 29.01  RT: 15.76 ( 139Kops/s)

ARM/iwMMXt:
src_8888_0565 =  L1:  67.88  L2:  56.61  M: 31.20 ( 56.74%)  HT: 29.22  VT: 27.01  R: 25.39  RT: 19.29 ( 130Kops/s)
src_8888_0565 =  L1: 110.38  L2:  82.33  M: 40.92 ( 73.22%)  HT: 35.63  VT: 32.22  R: 30.07  RT: 18.40 ( 132Kops/s)
2012-04-27 14:12:28 -04:00
Matt Turner
3e8fe65a08 mmx: add x8r8g8b8 fetcher
Loongson:
   add_x888_x888 =  L1:  29.36  L2:  27.81  M: 14.05 ( 38.74%)  HT: 12.45  VT: 11.78  R: 11.52  RT:  7.23 (  75Kops/s)
   add_x888_x888 =  L1:  36.06  L2:  34.55  M: 14.81 ( 41.03%)  HT: 14.01  VT: 13.41  R: 13.06  RT:  9.06 (  90Kops/s)

 src_x888_8_x888 =  L1:  21.92  L2:  20.15  M: 13.35 ( 41.42%)  HT: 11.70  VT: 10.95  R: 10.53  RT:  6.18 (  65Kops/s)
 src_x888_8_x888 =  L1:  25.43  L2:  23.51  M: 14.12 ( 44.00%)  HT: 13.14  VT: 12.50  R: 11.86  RT:  7.49 (  76Kops/s)

over_x888_8_0565 =  L1:  10.64  L2:  10.17  M:  7.74 ( 21.35%)  HT:  6.83  VT:  6.55  R:  6.34  RT:  4.03 (  46Kops/s)
over_x888_8_0565 =  L1:  11.41  L2:  10.97  M:  8.07 ( 22.36%)  HT:  7.42  VT:  7.18  R:  6.92  RT:  4.62 (  52Kops/s)

ARM/iwMMXt:
   add_x888_x888 =  L1:  22.10  L2:  18.93  M: 13.48 ( 32.29%)  HT: 11.32  VT: 10.64  R: 10.36  RT:  6.51 (  61Kops/s)
   add_x888_x888 =  L1:  24.26  L2:  20.83  M: 14.52 ( 35.64%)  HT: 12.66  VT: 12.98  R: 11.34  RT:  7.69 (  72Kops/s)

 src_x888_8_x888 =  L1:  19.33  L2:  17.66  M: 14.26 ( 38.43%)  HT: 11.53  VT: 10.83  R: 10.57  RT:  6.12 (  58Kops/s)
 src_x888_8_x888 =  L1:  21.23  L2:  19.60  M: 15.41 ( 42.55%)  HT: 12.66  VT: 13.30  R: 11.55  RT:  7.32 (  67Kops/s)

over_x888_8_0565 =  L1:   8.15  L2:   7.56  M:  6.50 ( 15.58%)  HT:  5.73  VT:  5.49  R:  5.50  RT:  3.53 (  38Kops/s)
over_x888_8_0565 =  L1:   8.35  L2:   7.85  M:  6.68 ( 16.40%)  HT:  6.12  VT:  5.97  R:  5.78  RT:  4.03 (  43Kops/s)
2012-04-27 13:42:36 -04:00
Matt Turner
c2b1630d96 mmx: add a8 fetcher
oprofile of xfce4-terminal-a1
210535    9.0407  libpixman-1.so.0.25.3    fetch_scanline_a8
144802    6.0054  libpixman-1.so.0.25.3    mmx_fetch_a8

Loongson:
       add_8_8_8 =  L1:  17.98  L2:  17.28  M: 14.28 ( 19.79%)  HT: 11.11  VT: 10.38  R:  9.97  RT:  5.14 (  55Kops/s)
       add_8_8_8 =  L1:  20.44  L2:  19.65  M: 15.62 ( 21.53%)  HT: 12.86  VT: 11.98  R: 11.32  RT:  6.13 (  64Kops/s)

 src_8888_8_0565 =  L1:  19.97  L2:  18.59  M: 13.42 ( 32.55%)  HT: 11.46  VT: 10.78  R: 10.33  RT:  5.87 (  61Kops/s)
 src_8888_8_0565 =  L1:  21.16  L2:  19.68  M: 13.94 ( 33.64%)  HT: 12.31  VT: 11.52  R: 11.02  RT:  6.54 (  68Kops/s)

 src_x888_8_x888 =  L1:  20.54  L2:  18.88  M: 13.07 ( 40.74%)  HT: 11.05  VT: 10.36  R: 10.02  RT:  5.68 (  60Kops/s)
 src_x888_8_x888 =  L1:  21.92  L2:  20.15  M: 13.35 ( 41.42%)  HT: 11.70  VT: 10.95  R: 10.53  RT:  6.18 (  65Kops/s)

over_x888_8_0565 =  L1:  10.32  L2:   9.85  M:  7.63 ( 21.13%)  HT:  6.56  VT:  6.30  R:  6.12  RT:  3.80 (  43Kops/s)
over_x888_8_0565 =  L1:  10.64  L2:  10.17  M:  7.74 ( 21.35%)  HT:  6.83  VT:  6.55  R:  6.34  RT:  4.03 (  46Kops/s)

ARM/iwMMXt:
       add_8_8_8 =  L1:  13.10  L2:  11.67  M: 10.74 ( 13.46%)  HT:  8.62  VT:  8.15  R:  7.94  RT:  4.39 (  44Kops/s)
       add_8_8_8 =  L1:  13.81  L2:  12.79  M: 11.63 ( 13.93%)  HT:  9.33  VT:  9.20  R:  9.04  RT:  5.43 (  52Kops/s)

 src_8888_8_0565 =  L1:  16.62  L2:  15.07  M: 12.52 ( 27.46%)  HT: 10.07  VT: 10.17  R:  9.95  RT:  5.64 (  54Kops/s)
 src_8888_8_0565 =  L1:  16.84  L2:  16.11  M: 13.22 ( 27.71%)  HT: 11.74  VT: 10.90  R: 10.80  RT:  6.66 (  62Kops/s)

 src_x888_8_x888 =  L1:  17.49  L2:  16.22  M: 13.73 ( 38.73%)  HT: 10.10  VT: 10.33  R:  9.55  RT:  5.21 (  52Kops/s)
 src_x888_8_x888 =  L1:  19.33  L2:  17.66  M: 14.26 ( 38.43%)  HT: 11.53  VT: 10.83  R: 10.57  RT:  6.12 (  58Kops/s)

over_x888_8_0565 =  L1:   7.57  L2:   7.29  M:  6.37 ( 15.97%)  HT:  5.53  VT:  5.33  R:  5.21  RT:  3.22 (  35Kops/s)
over_x888_8_0565 =  L1:   8.15  L2:   7.56  M:  6.50 ( 15.58%)  HT:  5.73  VT:  5.49  R:  5.50  RT:  3.53 (  38Kops/s)
2012-04-27 13:42:26 -04:00
Matt Turner
20bad64d9a mmx: add r5g6b5 fetcher
Loongson:
add_0565_0565 =  L1:  12.73  L2:  12.26  M: 10.05 ( 13.87%)  HT:  8.77  VT:  8.50  R:  8.25  RT:  5.28 (  58Kops/s)
add_0565_0565 =  L1:  14.04  L2:  13.63  M: 10.96 ( 15.19%)  HT:  9.73  VT:  9.43  R:  9.11  RT:  5.93 (  64Kops/s)

ARM/iwMMXt:
add_0565_0565 =  L1:  10.36  L2:  10.03  M:  9.04 ( 10.88%)  HT:  3.11  VT:  7.16  R:  7.72  RT:  5.12 (  51Kops/s)
add_0565_0565 =  L1:  10.84  L2:  10.20  M:  9.15 ( 11.46%)  HT:  7.60  VT:  7.82  R:  7.70  RT:  5.41 (  53Kops/s)
2012-04-27 13:42:16 -04:00
Matt Turner
c136e535ad mmx: Use Loongson pextrh instruction in expand565
Same story as pinsrh in the previous commit.

 text	data	bss	dec	hex filename
25336	1952	  0   27288    6a98 .libs/libpixman_loongson_mmi_la-pixman-mmx.o
25072	1952	  0   27024    6990 .libs/libpixman_loongson_mmi_la-pixman-mmx.o

-dsll: 95
+dsll: 70
-dsrl: 135
+dsrl: 105
-ldc1: 462
+ldc1: 445
-lw: 721
+lw: 700
+pextrh: 30
2012-04-27 13:42:07 -04:00
Matt Turner
facceb4a1f mmx: Use Loongson pinsrh instruction in pack_565
The pinsrh instruction is analogous to MMX EXT's pinsrw, except like
other Loongson vector instructions it cannot access the general purpose
registers. In the cases of other Loongson vector instructions, this is a
headache, but it is actually a good thing here. Since the instruction is
different from MMX, I've named the intrinsic loongson_insert_pi16.
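
For readers unfamiliar with this kind of instruction, a scalar model of a 16-bit lane insert and extract on a 64-bit value is sketched below. It follows the pinsrw/pextrw semantics; insert_lane16 and extract_lane16 are names invented for this example and are not the intrinsics added by this series.

```c
#include <stdint.h>

/* Replace 16-bit lane `lane` (0..3) of v with x, leaving other lanes. */
static uint64_t insert_lane16(uint64_t v, uint16_t x, unsigned lane)
{
    unsigned shift = (lane & 3) * 16;
    uint64_t mask = (uint64_t)0xffff << shift;

    return (v & ~mask) | ((uint64_t)x << shift);
}

/* Return 16-bit lane `lane` (0..3) of v. */
static uint16_t extract_lane16(uint64_t v, unsigned lane)
{
    unsigned shift = (lane & 3) * 16;

    return (uint16_t)(v >> shift);
}
```

Done with general shifts, masks, and ORs, each of these takes several instructions; a dedicated insert or extract does it in one, which matches the drop in and/dsll/dsrl counts reported for these commits.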

 text	data	bss	dec	 hex filename
25976	1952	  0   27928	6d18 .libs/libpixman_loongson_mmi_la-pixman-mmx.o
25336	1952	  0   27288	6a98 .libs/libpixman_loongson_mmi_la-pixman-mmx.o

-and: 181
+and: 147
-dsll: 143
+dsll: 95
-dsrl: 87
+dsrl: 135
-ldc1: 523
+ldc1: 462
-lw: 767
+lw: 721
+pinsrh: 35
2012-04-27 13:41:47 -04:00
Matt Turner
6d29b7d755 mmx: don't pack and unpack src unnecessarily
The combine function was store8888'ing the result, and all consumers
were immediately load8888'ing it, causing lots of unnecessary pack and
unpack instructions.

It's a very straightforward conversion, except for mmx_combine_over_u
and mmx_combine_saturate_u. mmx_combine_over_u was testing the integer
result to skip pixels, so we use the is_* functions to test the __m64
data directly without loading it into an integer register.

For mmx_combine_saturate_u there's not a lot we can do, since it uses
DIV_UN8.
2012-04-27 13:35:31 -04:00
Matt Turner
ee75003425 mmx: introduce is_equal, is_opaque, and is_zero functions
To be used by the next commit.
2012-04-27 13:35:25 -04:00
Matt Turner
10c77b339f mmx: simplify srcsrcsrcsrc calculation in over_n_8_0565 2012-04-27 13:35:19 -04:00
Matt Turner
e06947d101 mmx: remove unnecessary uint64_t<->__m64 conversions
Loongson:
add_8888_8888 =  L1:  68.73  L2:  55.09  M: 25.39 ( 68.18%)  HT: 25.28 VT: 22.42  R: 20.74  RT: 13.26 ( 131Kops/s)
add_8888_8888 =  L1: 159.19  L2: 114.10  M: 30.74 ( 77.91%)  HT: 27.63 VT: 24.99  R: 24.61  RT: 14.49 ( 141Kops/s)
2012-04-27 13:35:14 -04:00
Matt Turner
c78e986085 mmx: compile on MIPS for Loongson MMI optimizations
image               image16
           evolution   32.985 ->  29.667    27.314 ->  23.870
firefox-planet-gnome  197.982 -> 180.437   220.986 -> 205.057
gnome-system-monitor   48.482 ->  49.752    52.820 ->  49.528
  gnome-terminal-vim   60.799 ->  50.528    51.655 ->  44.131
      grads-heat-map    3.167 ->   3.181     3.328 ->   3.321
                gvim   38.646 ->  32.552    38.126 ->  34.453
       midori-zoomed   44.371 ->  43.338    28.860 ->  28.865
           ocitysmap   23.065 ->  18.057    23.046 ->  18.055
             poppler   43.676 ->  36.077    43.065 ->  36.090
  swfdec-giant-steps   20.166 ->  20.365    22.354 ->  16.578
      swfdec-youtube   31.502 ->  28.118    44.052 ->  41.771
   xfce4-terminal-a1   69.517 ->  51.288    62.225 ->  53.309
2012-04-27 13:35:05 -04:00
Matt Turner
4e0c7902b2 mmx: make ldq_u take __m64* directly
Before, if __m64 is allocated in vector or floating-point registers,

	__m64 vs = ldq_u((uint64_t *)src);

would cause src to be loaded into an integer register and then
transferred to an __m64 register. By switching ldq_u's argument type to
__m64 * we give the compiler enough information to recognize that it can
load to the vector register directly.

This patch is necessary for the Loongson optimizations when __m64 is
typedef'd as double.
2012-04-27 13:34:59 -04:00
Matt Turner
2e54b76a2d mmx: add load function and use it in add_8888_8888 2012-04-27 13:34:53 -04:00
Matt Turner
084e3f2f4b mmx: add store function and use it in add_8888_8888 2012-04-27 13:34:45 -04:00
Søren Sandmann Pedersen
e24c1c849d bits_image_fetch_pixel_convolution(): Make sure channels are signed
In the computation:

    srtot += RED_8 (pixel) * f

RED_8 (pixel) is an unsigned quantity, which means the signed filter
coefficient f gets converted to an unsigned integer before the
multiplication. We get away with this because when the 32 bit unsigned
result is converted to int32_t, the correct sign is produced. But if
srtot had been an int64_t, the result would have been a very large
positive number.

Fix this by explicitly casting the channels to int.
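
The effect can be reproduced in a few lines. This is a standalone sketch, assuming a platform where int is 32 bits; the RED_8 macro is redefined locally and the red_term_* function names are illustrative, not taken from pixman.

```c
#include <stdint.h>

#define RED_8(x) (((x) >> 16) & 0xff)

/* Unsigned product converted to a 32-bit signed result: the wrap-around
 * happens to restore the right sign (the conversion is
 * implementation-defined, but wraps on common compilers). */
static int32_t red_term_32(uint32_t pixel, int32_t f)
{
    return (int32_t)(RED_8(pixel) * f);
}

/* Same unsigned product widened to 64 bits: it is zero-extended, so a
 * negative coefficient yields a very large positive number. */
static int64_t red_term_64_unsigned(uint32_t pixel, int32_t f)
{
    return RED_8(pixel) * f;
}

/* With the channel cast to int the product is signed from the start. */
static int64_t red_term_64_signed(uint32_t pixel, int32_t f)
{
    return (int)RED_8(pixel) * f;
}
```

For a red value of 2 and a coefficient of -3, the first and third functions return -6, while the second returns 4294967290.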
2012-04-20 10:17:13 -04:00
Søren Sandmann Pedersen
4d2fee1406 test/utils.c: Clip values to the [0, 255] interval
Unpremultiplying a superluminescent pixel can result in values greater
than 255.
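
A minimal sketch of the clipping, assuming round-to-nearest division; unpremultiply_channel is a hypothetical helper and not the code in test/utils.c. A pixel is "superluminescent" when a colour channel is larger than its alpha, which is what pushes the quotient past 255.

```c
#include <stdint.h>

/* Undo premultiplication for one 8-bit channel c with alpha a,
 * clipping the result to the [0, 255] interval. */
static uint8_t unpremultiply_channel(uint8_t c, uint8_t a)
{
    uint32_t v;

    if (a == 0)
        return 0;               /* fully transparent: nothing to recover */

    v = (c * 255u + a / 2u) / a;
    return (uint8_t)(v > 255 ? 255 : v);
}
```

For example, channel 128 with alpha 64 would give 510 without the clip and gives 255 with it.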
2012-04-20 10:17:13 -04:00
Matt Turner
e291764584 configure.ac: fix iwMMXt/gcc version error message 2012-04-18 18:14:13 -04:00
Matt Turner
b87cd1f605 mmx: fix _mm_shuffle_pi16 function when compiling without optimization
The last argument must be an immediate value, and when compiling without
optimization the compiler might not recognize this. So use a macro if
not optimizing.
2012-04-15 14:03:08 -04:00
Matt Turner
e927d23971 configure.ac: require >= gcc-4.5 for ARM iwMMXt
We're using a patched gcc-4.5, and having to modify configure.ac and
autoreconf between changes is annoying. And besides, 4.5, 4.6, and 4.7's
iwMMXt intrinsic support is equally broken, and we test a known broken
intrinsic in the configure test program, so the version check is rather
meaningless.
2012-04-15 14:00:17 -04:00
Matt Turner
0531170436 mmx: Use force_inline instead of __inline__ (bug 46906)
Fixes the build on MSVC.
2012-04-05 17:36:05 -04:00
Matt Turner
b950bb12dc mmx: enable over_n_0565 for b5g6r5
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-04-05 17:34:26 -04:00
Søren Sandmann Pedersen
87ecec8d72 gtk-utils.c: In pixbuf_from_argb32() use a8r8g8b8_to_rgba_np()
Instead of inlining a copy of that functionality.
2012-04-02 15:25:00 -04:00
Søren Sandmann Pedersen
d1ec1467f6 test/utils.c: Rename and export the pngify_pixels() function.
This function converts from a8r8g8b8 to non-premultiplied RGBA (the
PNG or GdkPixbuf format that has the channels in this order: R, G, B,
A in memory regardless of the computer's endianness). The function's
new name is a8r8g8b8_to_rgba_np().
2012-04-02 15:24:56 -04:00
Søren Sandmann Pedersen
b16ddf1782 gtk-utils.c: Don't include pixman-private.h
Use pixman_image_get_format() instead of image->bits.format.
2012-04-02 14:59:02 -04:00
Søren Sandmann Pedersen
b9ca23a9c7 Rename fast_composite_add_1000_1000 to _add_1_1()
The 1000_1000 name is a relic from before the refactoring.
2012-03-27 22:04:37 -04:00
Søren Sandmann Pedersen
746291a19e Add the original parrot image.
This is the Parrot image that was downscaled and cropped before being
used in the composite-test.c demo.
2012-03-27 22:04:36 -04:00
Søren Sandmann Pedersen
451b25ae90 composite-test.c: Add a parrot image
Instead of the yellow square, use a parrot as the source image. This
demonstrates the various blend modes much better.

The parrot is a cropped version of finger painting by Rubens LP:

    http://www.flickr.com/photos/dorubens/4030604504/in/set-72157622586088192/

where the background has been removed. Used here under Creative
Commons Attribution. The artist's web site:

     http://www.rubenslp.com.br/
2012-03-27 22:04:32 -04:00
Søren Sandmann Pedersen
3aa45d62e4 composite-test.c: Use similar gradient to the one in the PDF spec. 2012-03-24 16:41:47 -04:00
Søren Sandmann Pedersen
e1b8969e78 demos: Add checkerboard demo
This is a simple demo that displays a checkerboard with a projective
transformation.
2012-03-24 16:29:36 -04:00
Søren Sandmann Pedersen
41863fbabb demos: Add quad2quad program
This program can compute the projective transformation that transforms
one quadrilateral into another. The code is basically maxima[1] output
translated into C.

[1] http://maxima.sourceforge.net/
2012-03-24 16:29:27 -04:00
Søren Sandmann Pedersen
cf0d0d6364 Use "=a" and "=d" constraints for rdtsc inline assembly
In 32 bit mode the "=A" constraint refers to the register pair
edx:eax, but according to GCC developers this is not the case in 64
bit mode, where it refers to "rax".

Hence, using "=A" for rdtsc is incorrect in 64 bit mode.

See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21249
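
A sketch of the pattern this describes, for GCC-compatible compilers on x86; read_tsc is an illustrative name and the non-x86 fallback is an assumption made so the example compiles elsewhere. The two 32-bit halves are requested separately through "=a" (eax) and "=d" (edx) and combined by hand, which behaves the same in 32-bit and 64-bit mode.

```c
#include <stdint.h>

/* Read the time stamp counter using separate "=a" and "=d" outputs. */
static uint64_t read_tsc(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t lo, hi;

    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0;   /* no rdtsc on this target; placeholder value */
#endif
}
```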
2012-03-24 16:26:07 -04:00
Jeremy Huddleston
8a8aabf05c configure.ac: Fix a copy-paste-o in TLS detection
Regression from: a069da6c66

Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-03-16 12:41:14 -07:00
Matt Turner
ee6bac11c2 Use AC_LANG_SOURCE for DSPr2 configure program
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-03-15 16:49:29 -04:00
Chun-wei Fan
21eeecffa9 Just include xmmintrin.h on MSVC as well
The xmmintrin.h as shipped with recent Visual C++ (2003+) provides
_mm_shuffle_pi16 and _mm_mulhi_pu16, so including that header
will do for using these functions, and MSVC does not like the GCC-specific
implementations of _mm_shuffle_pi16 and _mm_mulhi_pu16 that are
currently in the code.

_MM_SHUFFLE is declared in the same way in MSVC's xmmintrin.h, so don't
re-define it here to avoid a compilation warning.
2012-03-15 15:18:11 -04:00
Jeremy Huddleston
94aea2e868 Fix a false-negative in MMX check
Silence warnings that could make -Werror give a false negative
Use signed char to avoid cases where int8_t isn't declared

Reported-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-14 19:10:22 -07:00
Nemanja Lukic
d2ee5631ae MIPS: DSPr2: Added over_n_8888_8888_ca and over_n_8888_0565_ca fast paths.
Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):

lowlevel-blt-bench:
     over_n_8888_8888_ca =  L1:   8.32  L2:   7.65  M:  6.38 ( 51.08%)  HT:  5.78  VT:  5.74  R:  5.84  RT:  4.39 (  37Kops/s)
     over_n_8888_0565_ca =  L1:   7.40  L2:   6.95  M:  6.16 ( 41.06%)  HT:  5.72  VT:  5.52  R:  5.63  RT:  4.28 (  36Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.3
[  0]    image            xfce4-terminal-a1  138.223  139.070   0.33%    6/6
[ # ]  image16: pixman 0.25.3
[  0]  image16            xfce4-terminal-a1  132.763  132.939   0.06%    5/6

Optimized:

lowlevel-blt-bench:
     over_n_8888_8888_ca =  L1:  19.35  L2:  23.84  M: 13.68 (109.39%)  HT: 11.39  VT: 11.19  R: 11.27  RT:  6.90 (  47Kops/s)
     over_n_8888_0565_ca =  L1:  18.68  L2:  17.00  M: 12.56 ( 83.70%)  HT: 10.72  VT: 10.45  R: 10.43  RT:  5.79 (  43Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.3
[  0]    image            xfce4-terminal-a1  130.400  131.720   0.46%    6/6
[ # ]  image16: pixman 0.25.3
[  0]  image16            xfce4-terminal-a1  125.830  126.604   0.34%    6/6
2012-03-13 18:04:31 -04:00
Jeremy Huddleston
a069da6c66 Expand TLS support beyond __thread to __declspec(thread)
This code was pretty much copied from a similar commit that I made to
xorg-server in April.
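
Roughly, the idea is to pick whichever thread-local storage keyword the compiler understands. The sketch below selects it with preprocessor checks for brevity, whereas the commit detects it at configure time; EXAMPLE_THREAD_LOCAL and bump_counter are names invented for this example.

```c
/* Choose a thread-local storage specifier for the current compiler. */
#if defined(_MSC_VER)
#  define EXAMPLE_THREAD_LOCAL __declspec(thread)
#elif defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_THREAD_LOCAL __thread
#else
#  define EXAMPLE_THREAD_LOCAL /* no known keyword: plain static storage */
#endif

/* Each thread gets its own copy of this counter when TLS is available. */
static EXAMPLE_THREAD_LOCAL int counter;

static int bump_counter(void)
{
    return ++counter;
}
```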

cf: xorg/xserver: bb4d145bd25e2aee988b100ecf1105ea3b6a40b8

Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13 18:02:26 -04:00
Jeremy Huddleston
61d999b910 Disable MMX when incompatible clang is being used.
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13 18:02:26 -04:00
Jeremy Huddleston
ad4b6922f2 Silence a warning about unused pixman_have_mmx
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13 18:02:25 -04:00
Jeremy Huddleston
bb5ff26878 Revert "Disable MMX when Clang is being used."
This reverts commit 5eb4c12a79.
2012-03-13 18:02:25 -04:00
Søren Sandmann Pedersen
a6ad5120f7 Post-release version bump to 0.25.3 2012-03-08 10:11:20 -05:00
227 changed files with 50340 additions and 29514 deletions

11
.editorconfig Normal file

@ -0,0 +1,11 @@
# To use this config on your editor, follow the instructions at:
# http://editorconfig.org
root = true
[*]
tab_width = 8
[meson.build,meson_options.txt]
indent_style = space
indent_size = 2

49
.gitignore vendored

@ -26,51 +26,28 @@ stamp-h?
config.h
config.h.in
.*.swp
demos/alpha-test
demos/*-test
demos/checkerboard
demos/clip-in
demos/clip-test
demos/composite-test
demos/convolution-test
demos/gradient-test
demos/radial-test
demos/screen-test
demos/trap-test
demos/tri-test
pixman/pixman-combine32.c
pixman/pixman-combine32.h
pixman/pixman-combine64.c
pixman/pixman-combine64.h
demos/linear-gradient
demos/quad2quad
demos/scale
demos/dither
pixman/pixman-srgb.c
pixman/pixman-version.h
test/a1-trap-test
test/affine-test
test/*-test
test/affine-bench
test/alpha-loop
test/alphamap
test/alpha-test
test/blitters-test
test/check-formats
test/clip-in
test/clip-test
test/composite
test/composite-test
test/composite-traps-test
test/convolution-test
test/fetch-test
test/gradient-crash-test
test/gradient-test
test/infinite-loop
test/lowlevel-blt-bench
test/oob-test
test/pdf-op-test
test/region-contains-test
test/region-test
test/radial-invalid
test/region-translate
test/region-translate-test
test/scaling-crash-test
test/scaling-helpers-test
test/scaling-test
test/screen-test
test/stress-test
test/scaling-bench
test/trap-crasher
test/trap-test
test/window-test
*.pdb
*.dll
*.lib

View File

@ -0,0 +1,80 @@
# Docker build stage
#
# It builds a multi-arch image for all required architectures. Each image can be
# later easily used with properly configured Docker (which uses binfmt and QEMU
# underneath).
docker:
  stage: docker
  image: quay.io/buildah/stable
  rules:
    - if: "$CI_PIPELINE_SOURCE == 'merge_request_event' && $TARGET =~ $ACTIVE_TARGET_PATTERN"
      changes:
        paths:
          - .gitlab-ci.d/01-docker.yml
          - .gitlab-ci.d/01-docker/**/*
      variables:
        DOCKER_TAG: $CI_COMMIT_REF_SLUG
        DOCKER_IMAGE_NAME: ${CI_REGISTRY_IMAGE}/pixman:${DOCKER_TAG}
    - if: "$CI_PIPELINE_SOURCE == 'schedule' && $TARGET =~ $ACTIVE_TARGET_PATTERN"
    - if: "$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH && $TARGET =~ $ACTIVE_TARGET_PATTERN"
    - if: "$CI_COMMIT_TAG && $TARGET =~ $ACTIVE_TARGET_PATTERN"
  variables:
    # Use vfs with buildah. Docker offers overlayfs as a default, but Buildah
    # cannot stack overlayfs on top of another overlayfs filesystem.
    STORAGE_DRIVER: vfs
    # Write all image metadata in the docker format, not the standard OCI
    # format. Newer versions of docker can handle the OCI format, but older
    # versions, like the one shipped with Fedora 30, cannot handle the format.
    BUILDAH_FORMAT: docker
    BUILDAH_ISOLATION: chroot
    CACHE_IMAGE: ${CI_REGISTRY_IMAGE}/cache
    CACHE_ARGS: --cache-from ${CACHE_IMAGE} --cache-to ${CACHE_IMAGE}
  before_script:
    # Login to the target registry.
    - echo "${CI_REGISTRY_PASSWORD}" |
        buildah login -u "${CI_REGISTRY_USER}" --password-stdin ${CI_REGISTRY}
    # Docker Hub login is optional, and can be used to circumvent image pull
    # quota for anonymous pulls for base images.
    - echo "${DOCKERHUB_PASSWORD}" |
        buildah login -u "${DOCKERHUB_USER}" --password-stdin docker.io ||
        echo "Failed to login to Docker Hub."
  parallel:
    matrix:
      - TARGET:
          - linux-386
          - linux-amd64
          - linux-arm-v5
          - linux-arm-v7
          - linux-arm64-v8
          - linux-mips
          - linux-mips64el
          - linux-mipsel
          - linux-ppc
          - linux-ppc64
          - linux-ppc64le
          - linux-riscv64
          - windows-686
          - windows-amd64
          - windows-arm64-v8
  script:
    # Prepare environment.
    - ${LOAD_TARGET_ENV}
    - FULL_IMAGE_NAME=${DOCKER_IMAGE_NAME}-${TARGET}
    # Build and push the image.
    - buildah bud
        --tag ${FULL_IMAGE_NAME}
        --layers ${CACHE_ARGS}
        --target ${TARGET}
        --platform=${DOCKER_PLATFORM}
        --build-arg BASE_IMAGE=${BASE_IMAGE}
        --build-arg BASE_IMAGE_TAG=${BASE_IMAGE_TAG}
        --build-arg LLVM_VERSION=${LLVM_VERSION}
        -f Dockerfile .gitlab-ci.d/01-docker/
    - buildah images
    - buildah push ${FULL_IMAGE_NAME}

View File

@ -0,0 +1,150 @@
ARG BASE_IMAGE=docker.io/debian
ARG BASE_IMAGE_TAG=bookworm-slim
FROM ${BASE_IMAGE}:${BASE_IMAGE_TAG} AS base
LABEL org.opencontainers.image.title="Pixman build environment for platform coverage" \
org.opencontainers.image.authors="Marek Pikuła <m.pikula@partner.samsung.com>"
ARG DEBIAN_FRONTEND=noninteractive
ENV APT_UPDATE="apt-get update" \
APT_INSTALL="apt-get install -y --no-install-recommends" \
APT_CLEANUP="rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*"
ARG GCOVR_VERSION="~=7.2"
ARG MESON_VERSION="~=1.6"
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} \
# Build dependencies.
build-essential \
ninja-build \
pkg-config \
qemu-user \
# pipx dependencies.
python3-argcomplete \
python3-packaging \
python3-pip \
python3-platformdirs \
python3-userpath \
python3-venv \
# gcovr dependencies.
libxml2-dev \
libxslt-dev \
python3-dev \
&& ${APT_CLEANUP} \
# Install pipx using pip to have a more recent version of pipx, which
# supports the `--global` flag.
&& pip install pipx --break-system-packages \
# Install a recent version of meson and gcovr using pipx to have the same
# version across all variants regardless of base.
&& pipx install --global \
gcovr${GCOVR_VERSION} \
meson${MESON_VERSION} \
&& gcovr --version \
&& echo Meson version: \
&& meson --version
FROM base AS llvm-base
# LLVM 16 is the highest available in Bookworm. Preferably, we should use the
# same version for all platforms, but it's not possible at the moment.
ARG LLVM_VERSION=16
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} \
clang-${LLVM_VERSION} \
libclang-rt-${LLVM_VERSION}-dev \
lld-${LLVM_VERSION} \
llvm-${LLVM_VERSION} \
&& ${APT_CLEANUP} \
&& ln -f /usr/bin/clang-${LLVM_VERSION} /usr/bin/clang \
&& ln -f /usr/bin/lld-${LLVM_VERSION} /usr/bin/lld \
&& ln -f /usr/bin/llvm-ar-${LLVM_VERSION} /usr/bin/llvm-ar \
&& ln -f /usr/bin/llvm-strip-${LLVM_VERSION} /usr/bin/llvm-strip
FROM llvm-base AS native-base
ARG LLVM_VERSION=16
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} \
# Runtime library dependencies.
libglib2.0-dev \
libgtk-3-dev \
libpng-dev \
# Install libomp-dev if available (OpenMP support for LLVM). It's done only
# for the native images, as OpenMP support in cross-build environment is
# tricky for LLVM.
&& (${APT_INSTALL} libomp-${LLVM_VERSION}-dev \
|| echo "OpenMP not available on this platform.") \
&& ${APT_CLEANUP}
# The following targets differ in BASE_IMAGE.
FROM native-base AS linux-386
FROM native-base AS linux-amd64
FROM native-base AS linux-arm-v5
FROM native-base AS linux-arm-v7
FROM native-base AS linux-arm64-v8
FROM native-base AS linux-mips64el
FROM native-base AS linux-mipsel
FROM native-base AS linux-ppc64le
FROM native-base AS linux-riscv64
# The following targets should have a common BASE_IMAGE.
FROM llvm-base AS linux-mips
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} gcc-multilib-mips-linux-gnu \
&& ${APT_CLEANUP}
FROM llvm-base AS linux-ppc
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} gcc-multilib-powerpc-linux-gnu \
&& ${APT_CLEANUP}
FROM llvm-base AS linux-ppc64
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} gcc-multilib-powerpc64-linux-gnu \
&& ${APT_CLEANUP}
# We use a common image for Windows i686 and amd64, as it doesn't make sense to
# make them separate in terms of build time and image size. After two runs they
# should use the same cache layers, so in the end it makes the collective image
# size smaller.
FROM base AS windows-base
ARG LLVM_MINGW_RELEASE=20240619
ARG LLVM_MINGW_VARIANT=llvm-mingw-${LLVM_MINGW_RELEASE}-msvcrt-ubuntu-20.04-x86_64
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} wget \
&& ${APT_CLEANUP} \
&& cd /opt \
&& wget https://github.com/mstorsjo/llvm-mingw/releases/download/${LLVM_MINGW_RELEASE}/${LLVM_MINGW_VARIANT}.tar.xz \
&& tar -xf ${LLVM_MINGW_VARIANT}.tar.xz \
&& rm -f ${LLVM_MINGW_VARIANT}.tar.xz
ENV PATH=${PATH}:/opt/${LLVM_MINGW_VARIANT}/bin
FROM windows-base AS windows-x86-base
RUN dpkg --add-architecture i386 \
&& ${APT_UPDATE} \
&& ${APT_INSTALL} \
gcc-mingw-w64-i686 \
gcc-mingw-w64-x86-64 \
mingw-w64-tools \
procps \
wine \
wine32 \
wine64 \
&& ${APT_CLEANUP} \
# Inspired by https://code.videolan.org/videolan/docker-images
&& wine wineboot --init \
&& while pgrep wineserver > /dev/null; do \
echo "waiting ..."; \
sleep 1; \
done \
&& rm -rf /tmp/wine-*
FROM windows-x86-base AS windows-686
FROM windows-x86-base AS windows-amd64
# aarch64 image requires linaro/wine-arm64 as a base.
FROM windows-base AS windows-arm64-v8
RUN wine-arm64 wineboot --init \
&& while pgrep wineserver > /dev/null; do \
echo "waiting ..."; \
sleep 1; \
done \
&& rm -rf /tmp/wine-*

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/386
BASE_IMAGE=docker.io/i386/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/amd64
BASE_IMAGE=docker.io/amd64/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/arm/v5
BASE_IMAGE=docker.io/arm32v5/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/arm/v7
BASE_IMAGE=docker.io/arm32v7/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/arm64/v8
BASE_IMAGE=docker.io/arm64v8/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/amd64
BASE_IMAGE=docker.io/amd64/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/mips64el
BASE_IMAGE=docker.io/mips64le/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/mipsel
BASE_IMAGE=docker.io/serenitycode/debian-debootstrap
BASE_IMAGE_TAG=mipsel-bookworm-slim
LLVM_VERSION=14

View File

@ -0,0 +1 @@
linux-amd64.env

View File

@ -0,0 +1 @@
linux-amd64.env

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/ppc64le
BASE_IMAGE=docker.io/ppc64le/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16

View File

@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/riscv64
BASE_IMAGE=docker.io/riscv64/debian
BASE_IMAGE_TAG=sid-slim
LLVM_VERSION=18

View File

@ -0,0 +1 @@
linux-amd64.env

View File

@ -0,0 +1 @@
linux-amd64.env

View File

@ -0,0 +1,3 @@
DOCKER_PLATFORM=linux/amd64
BASE_IMAGE=docker.io/linaro/wine-arm64
BASE_IMAGE_TAG=latest

107
.gitlab-ci.d/02-build.yml Normal file

@ -0,0 +1,107 @@
# Build stage
#
# This stage builds pixman with coverage enabled for all supported
# architectures.
#
# Some targets don't support atomic profile updates, so to decrease the number
# of gcov errors, they need to be built without OpenMP (single-threaded) by
# adding the `-Dopenmp=disabled` Meson argument.
variables:
# Used in test stage as well.
BUILD_DIR: build-${TOOLCHAIN}
# Applicable to all build targets.
include:
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-386
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-amd64
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-arm-v5
qemu_cpu: arm1136
# Disable coverage, as the tests take too long to run with a single thread.
enable_gnu_coverage: false
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-arm-v7
qemu_cpu: max
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-arm64-v8
qemu_cpu: max
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-mips
toolchain: [gnu]
qemu_cpu: 74Kf
enable_gnu_coverage: false
# TODO: Merge with the one above once the following issue is resolved:
# https://gitlab.freedesktop.org/pixman/pixman/-/issues/105
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-mips
toolchain: [llvm]
qemu_cpu: 74Kf
job_name_prefix: "."
job_name_suffix: ":failing"
allow_failure: true
retry: 0
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-mips64el
qemu_cpu: Loongson-3A4000
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-mipsel
toolchain: [gnu]
qemu_cpu: 74Kf
# Disable coverage, as the tests take too long to run with a single thread.
enable_gnu_coverage: false
# TODO: Merge with the one above once the following issue is resolved:
# https://gitlab.freedesktop.org/pixman/pixman/-/issues/105
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-mipsel
toolchain: [llvm]
qemu_cpu: 74Kf
job_name_prefix: "."
job_name_suffix: ":failing"
allow_failure: true
retry: 0
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-ppc
qemu_cpu: g4
enable_gnu_coverage: false
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-ppc64
qemu_cpu: ppc64
enable_gnu_coverage: false
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-ppc64le
qemu_cpu: power10
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: linux-riscv64
qemu_cpu: rv64
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: windows-686
enable_gnu_coverage: false
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: windows-amd64
enable_gnu_coverage: false
- local: .gitlab-ci.d/templates/build.yml
inputs:
target: windows-arm64-v8
toolchain: [llvm] # GNU toolchain doesn't seem to support Windows on ARM.
qemu_cpu: max
enable_gnu_coverage: false

175
.gitlab-ci.d/03-test.yml Normal file
@ -0,0 +1,175 @@
# Test stage
#
# This stage executes the test suite for pixman for all architectures in
# different configurations. Build and test are split, as some architectures can
# have different QEMU configurations or multiple supported pixman backends,
# which are executed as a job matrix.
#
# Mind that `PIXMAN_ENABLE` variable in matrix runs does nothing, but it looks
# better in CI to indicate what is actually being tested.
#
# Some emulated targets are really slow or cannot be run in multithreaded mode
# (mipsel, arm-v5). Thus coverage reporting is disabled for them.
variables:
# Used in summary stage as well.
COVERAGE_BASE_DIR: coverage
COVERAGE_OUT: ${COVERAGE_BASE_DIR}/${CI_JOB_ID}
TEST_NAME: "" # Allows selecting a subset of tests to run via pipeline run variables.
include:
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-386
toolchain: [gnu]
pixman_disable:
- "sse2 ssse3" # Testing "mmx"
- "mmx ssse3" # Testing "sse2"
- "mmx sse2" # Testing "ssse3"
# TODO: Merge up after resolving
# https://gitlab.freedesktop.org/pixman/pixman/-/issues/106
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-386
toolchain: [llvm]
pixman_disable:
# Same as above.
- "sse2 ssse3"
- "mmx ssse3"
- "mmx sse2"
job_name_prefix: "."
job_name_suffix: ":failing"
allow_failure: true
retry: 0
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-amd64
pixman_disable:
- ""
- "fast"
- "wholeops"
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-arm-v5
toolchain: [gnu]
qemu_cpu: [arm1136]
pixman_disable: ["arm-neon"] # Test only arm-simd.
timeout: 3h
test_timeout_multiplier: 40
# TODO: Merge up after resolving
# https://gitlab.freedesktop.org/pixman/pixman/-/issues/107
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-arm-v5
toolchain: [llvm]
qemu_cpu: [arm1136]
pixman_disable: ["arm-neon"] # Test only arm-simd.
timeout: 3h
test_timeout_multiplier: 40
job_name_prefix: "."
job_name_suffix: ":failing"
allow_failure: true
retry: 0
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-arm-v7
qemu_cpu: [max]
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-arm64-v8
qemu_cpu: [max]
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-mips
toolchain: [gnu] # TODO: Add llvm once the build is fixed.
qemu_cpu: [74Kf]
job_name_prefix: "."
job_name_suffix: ":failing"
allow_failure: true # Some tests seem to fail.
retry: 0
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-mips64el
toolchain: [gnu]
qemu_cpu: [Loongson-3A4000]
# TODO: Merge up after resolving
# https://gitlab.freedesktop.org/pixman/pixman/-/issues/108
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-mips64el
toolchain: [llvm]
qemu_cpu: [Loongson-3A4000]
job_name_prefix: "."
job_name_suffix: ":failing"
allow_failure: true
retry: 0
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-mipsel
toolchain: [gnu] # TODO: Add llvm once the build is fixed.
qemu_cpu: [74Kf]
timeout: 2h
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-ppc
qemu_cpu: [g4]
job_name_prefix: "."
job_name_suffix: ":failing"
allow_failure: true # SIGILL for some tests
retry: 0
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-ppc64
qemu_cpu: [ppc64]
job_name_prefix: "."
job_name_suffix: ":failing"
allow_failure: true # SIGSEGV for some tests
retry: 0
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-ppc64le
toolchain: [gnu]
qemu_cpu: [power10]
# TODO: Merge up after resolving
# https://gitlab.freedesktop.org/pixman/pixman/-/issues/109
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-ppc64le
toolchain: [llvm]
qemu_cpu: [power10]
job_name_prefix: "."
job_name_suffix: ":failing"
allow_failure: true
retry: 0
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: linux-riscv64
qemu_cpu:
# Test on target without RVV (verify no autovectorization).
- rv64,v=false
# Test correctness for different VLENs.
- rv64,v=true,vext_spec=v1.0,vlen=128,elen=64
- rv64,v=true,vext_spec=v1.0,vlen=256,elen=64
- rv64,v=true,vext_spec=v1.0,vlen=512,elen=64
- rv64,v=true,vext_spec=v1.0,vlen=1024,elen=64
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: windows-686
pixman_disable:
# The same as for linux-386.
- "sse2 ssse3"
- "mmx ssse3"
- "mmx sse2"
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: windows-amd64
pixman_disable:
# The same as for linux-amd64.
- ""
- "fast"
- "wholeops"
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: windows-arm64-v8
toolchain: [llvm]
qemu_cpu: [max]
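
As a rough illustration of how the job matrix in this file expands, the sketch below enumerates the combinations for the linux-riscv64 test include. It assumes the template defaults (toolchain [gnu, llvm] and a single empty pixman_disable entry) and that GitLab creates one job per combination of matrix values. The function name riscv64_matrix is hypothetical and not part of the pixman CI.

```shell
#!/bin/sh
# Hypothetical helper: print one line per (TOOLCHAIN, QEMU_CPU) combination
# that the linux-riscv64 test include would expand to.
riscv64_matrix() {
    for toolchain in gnu llvm; do
        for cpu in \
            "rv64,v=false" \
            "rv64,v=true,vext_spec=v1.0,vlen=128,elen=64" \
            "rv64,v=true,vext_spec=v1.0,vlen=256,elen=64" \
            "rv64,v=true,vext_spec=v1.0,vlen=512,elen=64" \
            "rv64,v=true,vext_spec=v1.0,vlen=1024,elen=64"
        do
            printf 'TOOLCHAIN=%s QEMU_CPU=%s\n' "$toolchain" "$cpu"
        done
    done
}

riscv64_matrix
```

With two toolchains and five CPU definitions this prints ten lines, one per test job for that target.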

@ -0,0 +1,47 @@
# Summary stage
#
# This stage takes coverage reports from test runs for all architectures and
# merges them into a single report, with GitLab visualization. An HTML report
# is also generated as a separate artifact.
summary:
  extends: .target:all
  stage: summary
  variables:
    TARGET: linux-amd64
    COVERAGE_SUMMARY_DIR: ${COVERAGE_BASE_DIR}/summary
  needs:
    - job: test:linux-386
      optional: true
    - job: test:linux-amd64
      optional: true
    - job: test:linux-arm-v7
      optional: true
    - job: test:linux-arm64-v8
      optional: true
    - job: test:linux-mips64el
      optional: true
    - job: test:linux-ppc64le
      optional: true
    - job: test:linux-riscv64
      optional: true
  script:
    - echo "Input coverage reports:" && ls ${COVERAGE_BASE_DIR}/*.json || (echo "No coverage reports available." && exit)
    - |
      args=( )
      for f in ${COVERAGE_BASE_DIR}/*.json; do
          args+=( "-a" "$f" )
      done
    - mkdir -p ${COVERAGE_SUMMARY_DIR}
    - gcovr "${args[@]}"
        --cobertura-pretty --cobertura ${COVERAGE_SUMMARY_DIR}/coverage.xml
        --html-details ${COVERAGE_SUMMARY_DIR}/coverage.html
        --txt --print-summary
  coverage: '/^TOTAL.*\s+(\d+\%)$/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: ${COVERAGE_SUMMARY_DIR}/coverage.xml
    paths:
      - ${COVERAGE_SUMMARY_DIR}/
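
The argument-building step of the summary script can be tried outside CI with a sketch like the one below. It is a variant, not the project's script: the helper name collect_tracefile_args and the TRACEFILE_ARGS array are hypothetical, the no-report case is handled explicitly, and the gcovr command is only echoed rather than executed. It requires bash because it uses arrays, as the original script does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fill the TRACEFILE_ARGS array with "-a FILE" pairs,
# one pair per JSON report found directly inside the given directory.
collect_tracefile_args() {
    local dir="$1" f
    TRACEFILE_ARGS=()
    for f in "$dir"/*.json; do
        # Without a match the glob stays literal, so skip non-existent paths.
        [ -e "$f" ] || continue
        TRACEFILE_ARGS+=( "-a" "$f" )
    done
}

collect_tracefile_args "${COVERAGE_BASE_DIR:-coverage}"
if [ "${#TRACEFILE_ARGS[@]}" -eq 0 ]; then
    echo "No coverage reports available."
else
    # Dry run: show the gcovr invocation instead of executing it.
    echo gcovr "${TRACEFILE_ARGS[@]}" --txt --print-summary
fi
```

Keeping the arguments in an array means each report path stays a single argument even if it contains spaces.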

@ -0,0 +1 @@
native-gnu.meson

@ -0,0 +1 @@
native-llvm.meson

@ -0,0 +1 @@
native-gnu.meson

@ -0,0 +1 @@
native-llvm.meson

@ -0,0 +1 @@
native-gnu-noopenmp.meson

@ -0,0 +1 @@
native-llvm-noopenmp.meson

@ -0,0 +1 @@
native-gnu.meson

@ -0,0 +1 @@
native-llvm.meson

@ -0,0 +1 @@
native-gnu.meson

@ -0,0 +1 @@
native-llvm.meson

@ -0,0 +1,11 @@
[binaries]
c = ['mips-linux-gnu-gcc', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'mips-linux-gnu-ar'
strip = 'mips-linux-gnu-strip'
exe_wrapper = ['qemu-mips', '-L', '/usr/mips-linux-gnu/']
[host_machine]
system = 'linux'
cpu_family = 'mips32'
cpu = 'mips32'
endian = 'big'

@ -0,0 +1,14 @@
[binaries]
c = ['clang', '-target', 'mips-linux-gnu', '-fPIC', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'llvm-ar'
strip = 'llvm-strip'
exe_wrapper = ['qemu-mips', '-L', '/usr/mips-linux-gnu/']
[built-in options]
c_link_args = ['-target', 'mips-linux-gnu', '-fuse-ld=lld']
[host_machine]
system = 'linux'
cpu_family = 'mips32'
cpu = 'mips32'
endian = 'big'

@ -0,0 +1,8 @@
[binaries]
c = ['gcc', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'ar'
strip = 'strip'
pkg-config = 'pkg-config'
[project options]
mips-dspr2 = 'disabled'

@ -0,0 +1,8 @@
[binaries]
c = ['clang', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'llvm-ar'
strip = 'llvm-strip'
pkg-config = 'pkg-config'
[project options]
mips-dspr2 = 'disabled'

@ -0,0 +1 @@
native-gnu-noopenmp.meson

@ -0,0 +1 @@
native-llvm-noopenmp.meson

@ -0,0 +1,11 @@
[binaries]
c = 'powerpc-linux-gnu-gcc'
ar = 'powerpc-linux-gnu-ar'
strip = 'powerpc-linux-gnu-strip'
exe_wrapper = ['qemu-ppc', '-L', '/usr/powerpc-linux-gnu']
[host_machine]
system = 'linux'
cpu_family = 'ppc'
cpu = 'ppc'
endian = 'big'

@ -0,0 +1,15 @@
[binaries]
c = ['clang', '-target', 'powerpc-linux-gnu']
ar = 'llvm-ar'
strip = 'llvm-strip'
exe_wrapper = ['qemu-ppc', '-L', '/usr/powerpc-linux-gnu/']
[built-in options]
# We cannot use LLD, as it doesn't support big-endian PPC.
c_link_args = ['-target', 'powerpc-linux-gnu']
[host_machine]
system = 'linux'
cpu_family = 'ppc'
cpu = 'ppc'
endian = 'big'

@ -0,0 +1,11 @@
[binaries]
c = 'powerpc64-linux-gnu-gcc'
ar = 'powerpc64-linux-gnu-ar'
strip = 'powerpc64-linux-gnu-strip'
exe_wrapper = ['qemu-ppc64', '-L', '/usr/powerpc64-linux-gnu/']
[host_machine]
system = 'linux'
cpu_family = 'ppc64'
cpu = 'ppc64'
endian = 'big'

@ -0,0 +1,15 @@
[binaries]
c = ['clang', '-target', 'powerpc64-linux-gnu']
ar = 'llvm-ar'
strip = 'llvm-strip'
exe_wrapper = ['qemu-ppc64', '-L', '/usr/powerpc64-linux-gnu/']
[built-in options]
# We cannot use LLD, as it doesn't support big-endian PPC.
c_link_args = ['-target', 'powerpc64-linux-gnu']
[host_machine]
system = 'linux'
cpu_family = 'ppc64'
cpu = 'ppc64'
endian = 'big'

@ -0,0 +1 @@
native-gnu.meson

@ -0,0 +1 @@
native-llvm.meson

@ -0,0 +1 @@
native-gnu.meson

@ -0,0 +1 @@
native-llvm.meson

@ -0,0 +1,8 @@
[binaries]
c = ['gcc', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'ar'
strip = 'strip'
pkg-config = 'pkg-config'
[project options]
openmp = 'disabled'

@ -0,0 +1,5 @@
[binaries]
c = 'gcc'
ar = 'ar'
strip = 'strip'
pkg-config = 'pkg-config'

@ -0,0 +1,8 @@
[binaries]
c = ['clang', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'llvm-ar'
strip = 'llvm-strip'
pkg-config = 'pkg-config'
[project options]
openmp = 'disabled'

@ -0,0 +1,5 @@
[binaries]
c = 'clang'
ar = 'llvm-ar'
strip = 'llvm-strip'
pkg-config = 'pkg-config'

@ -0,0 +1,18 @@
[binaries]
c = 'i686-w64-mingw32-gcc'
ar = 'i686-w64-mingw32-ar'
strip = 'i686-w64-mingw32-strip'
windres = 'i686-w64-mingw32-windres'
exe_wrapper = 'wine'
[built-in options]
c_link_args = ['-static-libgcc']
[host_machine]
system = 'windows'
cpu_family = 'x86'
cpu = 'i686'
endian = 'little'
[project options]
openmp = 'disabled'

@ -0,0 +1,18 @@
[binaries]
c = 'i686-w64-mingw32-clang'
ar = 'i686-w64-mingw32-llvm-ar'
strip = 'i686-w64-mingw32-strip'
windres = 'i686-w64-mingw32-windres'
exe_wrapper = 'wine'
[built-in options]
c_link_args = ['-static']
[project options]
openmp = 'disabled'
[host_machine]
system = 'windows'
cpu_family = 'x86'
cpu = 'i686'
endian = 'little'

@ -0,0 +1,15 @@
[binaries]
c = 'x86_64-w64-mingw32-gcc'
ar = 'x86_64-w64-mingw32-ar'
strip = 'x86_64-w64-mingw32-strip'
windres = 'x86_64-w64-mingw32-windres'
exe_wrapper = 'wine'
[built-in options]
c_link_args = ['-static-libgcc']
[host_machine]
system = 'windows'
cpu_family = 'x86_64'
cpu = 'x86_64'
endian = 'little'

@ -0,0 +1,20 @@
[binaries]
c = 'x86_64-w64-mingw32-clang'
ar = 'x86_64-w64-mingw32-llvm-ar'
strip = 'x86_64-w64-mingw32-strip'
windres = 'x86_64-w64-mingw32-windres'
exe_wrapper = 'wine'
[built-in options]
# Static linking is a workaround for `libwinpthread-1` not being discovered correctly.
c_link_args = ['-static']
[project options]
# OpenMP is disabled as it is not being discovered correctly during tests.
openmp = 'disabled'
[host_machine]
system = 'windows'
cpu_family = 'x86_64'
cpu = 'x86_64'
endian = 'little'

@ -0,0 +1,18 @@
[binaries]
c = 'aarch64-w64-mingw32-clang'
ar = 'aarch64-w64-mingw32-llvm-ar'
strip = 'aarch64-w64-mingw32-strip'
windres = 'aarch64-w64-mingw32-windres'
exe_wrapper = 'wine-arm64'
[built-in options]
c_link_args = ['-static']
[project options]
openmp = 'disabled'
[host_machine]
system = 'windows'
cpu_family = 'aarch64'
cpu = 'aarch64'
endian = 'little'

@ -0,0 +1,65 @@
# This file contains the set of jobs run by the pixman project:
# https://gitlab.freedesktop.org/pixman/pixman/-/pipelines
stages:
  - docker
  - build
  - test
  - summary
variables:
  # Make it possible to change RUNNER_TAG from GitLab variables. The default
  # `kvm` tag has been tested with FDO infrastructure.
  RUNNER_TAG: kvm
  # Docker image global configuration.
  DOCKER_TAG: latest
  DOCKER_IMAGE_NAME: registry.freedesktop.org/pixman/pixman/pixman:${DOCKER_TAG}
  # Execute to load a target-specific environment.
  LOAD_TARGET_ENV: source .gitlab-ci.d/01-docker/target-env/${TARGET}.env
  # Enable/disable specific targets for code and platform coverage targets.
  ACTIVE_TARGET_PATTERN: '/linux-386|linux-amd64|linux-arm-v5|linux-arm-v7|linux-arm64-v8|linux-mips|linux-mips64el|linux-mipsel|linux-ppc|linux-ppc64|linux-ppc64le|linux-riscv64|windows-686|windows-amd64|windows-arm64-v8/i'
workflow:
  rules:
    # Use modified Docker image if building in MR and Docker image is affected
    # by the MR.
    - if: $CI_PIPELINE_SOURCE == 'merge_request_event'
      changes:
        paths:
          - .gitlab-ci.d/01-docker.yml
          - .gitlab-ci.d/01-docker/**/*
      variables:
        DOCKER_TAG: $CI_COMMIT_REF_SLUG
        DOCKER_IMAGE_NAME: ${CI_REGISTRY_IMAGE}/pixman:${DOCKER_TAG}
    # A standard set of GitLab CI triggers (i.e., MR, schedule, default branch,
    # and tag).
    - if: $CI_PIPELINE_SOURCE == 'merge_request_event'
    - if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS
      when: never
    - if: $CI_PIPELINE_SOURCE == 'schedule'
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
    - if: $CI_COMMIT_BRANCH
    - if: $CI_COMMIT_TAG
  auto_cancel:
    on_new_commit: conservative
    on_job_failure: all
default:
  tags:
    - $RUNNER_TAG
  # Retry in case the runner is misconfigured for multi-arch builds or some
  # random unexpected runner error occurs (it happened during testing).
  retry: 1
include:
  - local: "/.gitlab-ci.d/templates/targets.yml"
  - local: "/.gitlab-ci.d/01-docker.yml"
  - local: "/.gitlab-ci.d/02-build.yml"
  - local: "/.gitlab-ci.d/03-test.yml"
  - local: "/.gitlab-ci.d/04-summary.yml"

@ -0,0 +1,80 @@
spec:
inputs:
target:
description:
Build target in the form of an "OS-ARCH" pair (e.g., linux-amd64). Mostly
the same as the Docker platform string, but with a hyphen instead of a slash.
toolchain:
description:
An array of toolchains to test with. Each toolchain should have an
appropriate Meson cross file.
type: array
default: [gnu, llvm]
qemu_cpu:
description:
QEMU_CPU environment variable used by Docker (which uses QEMU
underneath). It is not used by x86 targets, as they are executed
natively on the host.
default: ""
enable_gnu_coverage:
description:
Enable coverage build flags. It can be later used to compile a coverage
report for all the jobs. Should be enabled only for native build
environments as they have all the optional dependencies, and are the
most reliable and uniform (so disable for cross environments).
type: boolean
default: true
job_name_prefix:
description:
Additional prefix for the job name. Can be used to disable a job with a
"." prefix.
default: ""
job_name_suffix:
description:
Additional suffix for the job name. Can be used to prevent job
duplication for jobs for the same target.
default: ""
allow_failure:
description:
Set the `allow_failure` flag for jobs that are expected to fail.
Remember to set `retry` argument to 0 to prevent unnecessary retries.
type: boolean
default: false
retry:
description:
Set the `retry` flag for a job. Usually used together with
`allow_failure`.
type: number
default: 1
---
"$[[ inputs.job_name_prefix ]]build:$[[ inputs.target ]]$[[ inputs.job_name_suffix ]]":
  extends: .target:all
  stage: build
  allow_failure: $[[ inputs.allow_failure ]]
  retry: $[[ inputs.retry ]]
  needs:
    - job: docker
      optional: true
      parallel:
        matrix:
          - TARGET: $[[ inputs.target ]]
  variables:
    TARGET: $[[ inputs.target ]]
    QEMU_CPU: $[[ inputs.qemu_cpu ]]
  parallel:
    matrix:
      - TOOLCHAIN: $[[ inputs.toolchain ]]
  script:
    - |
      if [ "$[[ inputs.enable_gnu_coverage ]]" == "true" ] && [ "${TOOLCHAIN}" == "gnu" ]; then
        COV_C_ARGS=-fprofile-update=atomic
        COV_MESON_BUILD_ARGS=-Db_coverage=true
      fi
    - meson setup ${BUILD_DIR}
        --cross-file .gitlab-ci.d/meson-cross/${TARGET}-${TOOLCHAIN}.meson
        -Dc_args="${COV_C_ARGS}" ${COV_MESON_BUILD_ARGS}
    - meson compile -C ${BUILD_DIR}
  artifacts:
    paths:
      - ${BUILD_DIR}/
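
To see which configure command a given build job ends up running, the coverage decision in the script above can be sketched as a small dry-run helper. The function name meson_setup_command is hypothetical; it assumes BUILD_DIR resolves to build-${TOOLCHAIN} as set in 02-build.yml, and it only prints the command instead of invoking Meson.

```shell
#!/bin/sh
# Hypothetical helper: print the "meson setup" command the build job would run
# for a given target, toolchain, and coverage switch (nothing is executed).
meson_setup_command() {
    target="$1"
    toolchain="$2"
    enable_gnu_coverage="$3"
    cov_c_args=""
    cov_meson_args=""
    # Coverage flags are only added for the GNU toolchain, as in the template.
    if [ "$enable_gnu_coverage" = "true" ] && [ "$toolchain" = "gnu" ]; then
        cov_c_args="-fprofile-update=atomic"
        cov_meson_args=" -Db_coverage=true"
    fi
    printf 'meson setup build-%s --cross-file .gitlab-ci.d/meson-cross/%s-%s.meson -Dc_args="%s"%s\n' \
        "$toolchain" "$target" "$toolchain" "$cov_c_args" "$cov_meson_args"
}

meson_setup_command linux-ppc64le gnu true
meson_setup_command linux-ppc64le llvm true
```

The second call shows that the llvm variant of the same target is configured without coverage flags even when enable_gnu_coverage is true.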

@ -0,0 +1,9 @@
# General target templates.
.target:all:
  image:
    name: $DOCKER_IMAGE_NAME-$TARGET
  rules:
    - if: "$TARGET =~ $ACTIVE_TARGET_PATTERN"
  before_script:
    - ${LOAD_TARGET_ENV}
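
The effect of the `$TARGET =~ $ACTIVE_TARGET_PATTERN` rule can be approximated locally with grep, as in the sketch below. This is only an approximation: GitLab evaluates the pattern with its own regex engine, the shortened pattern value is a hypothetical example, and the helper name is_active_target is not part of the pixman CI.

```shell
#!/bin/sh
# Approximation of the `$TARGET =~ $ACTIVE_TARGET_PATTERN` rule using grep.
# The pattern below is a shortened, hypothetical value for illustration.
ACTIVE_TARGET_PATTERN='linux-amd64|linux-riscv64'

is_active_target() {
    # -E: extended regex, -i: ignore case (the /i flag), -q: exit status only.
    printf '%s\n' "$1" | grep -Eiq -- "$ACTIVE_TARGET_PATTERN"
}

for target in linux-amd64 linux-riscv64 windows-amd64; do
    if is_active_target "$target"; then
        echo "$target: jobs are created"
    else
        echo "$target: jobs are skipped"
    fi
done
```

Because the pattern is not anchored, a substring match is enough in this sketch: an alternative such as linux-mips would also match linux-mipsel and linux-mips64el, which is worth keeping in mind when narrowing the list.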

@ -0,0 +1,112 @@
spec:
inputs:
target:
description:
Build target in the form of an "OS-ARCH" pair (e.g., linux-amd64). Mostly
the same as the Docker platform string, but with a hyphen instead of a slash.
toolchain:
description:
An array of toolchains to test with. Each toolchain should have an
appropriate Meson cross file.
type: array
default: [gnu, llvm]
qemu_cpu:
description:
An array of QEMU_CPU environment variable values used as a job matrix
variable, and in turn by Docker (which uses QEMU underneath). It is not
used by x86 targets, as they are executed natively on the host.
type: array
default: [""]
pixman_disable:
description:
An array of PIXMAN_DISABLE targets used as a job matrix variable.
type: array
default: [""]
timeout:
description:
GitLab job timeout property. May need to be increased for slow
targets.
default: 1h
test_timeout_multiplier:
description:
Test timeout multiplier flag used for Meson test execution. May need to
be increased for slow targets.
type: number
default: 20
meson_testthreads:
description:
Sets the MESON_TESTTHREADS environment variable. For some platforms, the
tests should be executed one by one (without multithreading) to prevent
gcovr errors.
type: number
default: 0
gcovr_flags:
description:
Additional flags passed to gcovr tool.
default: ""
job_name_prefix:
description:
Additional prefix for the job name. Can be used to disable a job with a
"." prefix.
default: ""
job_name_suffix:
description:
Additional suffix for the job name. Can be used to prevent job
duplication for jobs for the same target.
default: ""
allow_failure:
description:
Set the `allow_failure` flag for jobs that are expected to fail.
Remember to set `retry` argument to 0 to prevent unnecessary retries.
type: boolean
default: false
retry:
description:
Set the `retry` flag for a job. Usually used together with
`allow_failure`.
type: number
default: 1
---
"$[[ inputs.job_name_prefix ]]test:$[[ inputs.target ]]$[[ inputs.job_name_suffix ]]":
  extends: .target:all
  stage: test
  allow_failure: $[[ inputs.allow_failure ]]
  retry: $[[ inputs.retry ]]
  timeout: $[[ inputs.timeout ]]
  needs:
    - job: docker
      optional: true
      parallel:
        matrix:
          - TARGET: $[[ inputs.target ]]
    - job: build:$[[ inputs.target ]]
      parallel:
        matrix:
          - TOOLCHAIN: $[[ inputs.toolchain ]]
  variables:
    TARGET: $[[ inputs.target ]]
    TEST_TIMEOUT_MULTIPLIER: $[[ inputs.test_timeout_multiplier ]]
    GCOVR_FLAGS: $[[ inputs.gcovr_flags ]]
    MESON_ARGS: -t ${TEST_TIMEOUT_MULTIPLIER} --no-rebuild -v ${TEST_NAME}
    MESON_TESTTHREADS: $[[ inputs.meson_testthreads ]]
  parallel:
    matrix:
      - TOOLCHAIN: $[[ inputs.toolchain ]]
        PIXMAN_DISABLE: $[[ inputs.pixman_disable ]]
        QEMU_CPU: $[[ inputs.qemu_cpu ]]
  script:
    - meson test -C ${BUILD_DIR} ${MESON_ARGS}
  after_script:
    - mkdir -p ${COVERAGE_OUT}
    - gcovr ${GCOVR_FLAGS} -r ./ ${BUILD_DIR} -e ./subprojects
        --json ${COVERAGE_OUT}.json
        --html-details ${COVERAGE_OUT}/coverage.html
        --print-summary || echo "No coverage data available."
  artifacts:
    paths:
      - ${BUILD_DIR}/meson-logs/testlog.txt
      - ${COVERAGE_BASE_DIR}/
    reports:
      junit:
        - ${BUILD_DIR}/meson-logs/testlog.junit.xml
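
For reproducing a single matrix cell by hand, the test invocation above can be sketched as a dry-run helper. The name test_cell_command is hypothetical; the sketch assumes the build directory is build-${TOOLCHAIN} and that TEST_NAME is left empty, and it prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the environment and command for one matrix cell
# of the test job (nothing is executed).
test_cell_command() {
    toolchain="$1"
    pixman_disable="$2"
    multiplier="${3:-20}"   # template default for test_timeout_multiplier
    printf 'PIXMAN_DISABLE="%s" meson test -C build-%s -t %s --no-rebuild -v\n' \
        "$pixman_disable" "$toolchain" "$multiplier"
}

test_cell_command gnu "sse2 ssse3"
test_cell_command gnu "arm-neon" 40
```

The first call corresponds to the cell that exercises the mmx backend on x86 (sse2 and ssse3 disabled); the second uses the larger timeout multiplier configured for linux-arm-v5.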

16
.gitlab-ci.yml Normal file
@ -0,0 +1,16 @@
#
# This is the GitLab CI configuration file for the mainstream pixman project:
# https://gitlab.freedesktop.org/pixman/pixman/-/pipelines
#
# !!! DO NOT ADD ANY NEW CONFIGURATION TO THIS FILE !!!
#
# Only documentation or comments are accepted.
#
# To use a different set of jobs than the mainstream project, you need to set
# the location of your custom yml file at "custom CI/CD configuration path", on
# your GitLab CI namespace:
# https://docs.gitlab.com/ee/ci/pipelines/settings.html#custom-cicd-configuration-path
#
include:
- local: '/.gitlab-ci.d/pixman-project.yml'

16309
ChangeLog

File diff suppressed because it is too large

@ -1,133 +0,0 @@
SUBDIRS = pixman demos test
pkgconfigdir=$(libdir)/pkgconfig
pkgconfig_DATA=pixman-1.pc
$(pkgconfig_DATA): pixman-1.pc.in
snapshot:
distdir="$(distdir)-`date '+%Y%m%d'`"; \
test -d "$(srcdir)/.git" && distdir=$$distdir-`cd "$(srcdir)" && git rev-parse HEAD | cut -c 1-6`; \
$(MAKE) $(AM_MAKEFLAGS) distdir="$$distdir" dist
GPGKEY=6FF7C1A8
USERNAME=$$USER
RELEASE_OR_SNAPSHOT = $$(if test "x$(PIXMAN_VERSION_MINOR)" = "x$$(echo "$(PIXMAN_VERSION_MINOR)/2*2" | bc)" ; then echo release; else echo snapshot; fi)
RELEASE_CAIRO_HOST = $(USERNAME)@cairographics.org
RELEASE_CAIRO_DIR = /srv/cairo.freedesktop.org/www/$(RELEASE_OR_SNAPSHOT)s
RELEASE_CAIRO_URL = http://cairographics.org/$(RELEASE_OR_SNAPSHOT)s
RELEASE_XORG_URL = http://xorg.freedesktop.org/archive/individual/lib
RELEASE_XORG_HOST = $(USERNAME)@xorg.freedesktop.org
RELEASE_XORG_DIR = /srv/xorg.freedesktop.org/archive/individual/lib
RELEASE_ANNOUNCE_LIST = cairo-announce@cairographics.org, xorg-announce@lists.freedesktop.org, pixman@lists.freedesktop.org
tar_gz = $(PACKAGE)-$(VERSION).tar.gz
tar_bz2 = $(PACKAGE)-$(VERSION).tar.bz2
sha1_tgz = $(tar_gz).sha1
md5_tgz = $(tar_gz).md5
sha1_tbz2 = $(tar_bz2).sha1
md5_tbz2 = $(tar_bz2).md5
gpg_file = $(sha1_tgz).asc
$(sha1_tgz): $(tar_gz)
sha1sum $^ > $@
$(md5_tgz): $(tar_gz)
md5sum $^ > $@
$(sha1_tbz2): $(tar_bz2)
sha1sum $^ > $@
$(md5_tbz2): $(tar_bz2)
md5sum $^ > $@
$(gpg_file): $(sha1_tgz)
@echo "Please enter your GPG password to sign the checksum."
gpg --armor --sign $^
HASHFILES = $(sha1_tgz) $(sha1_tbz2) $(md5_tgz) $(md5_tbz2)
release-verify-newer:
@echo -n "Checking that no $(VERSION) release already exists at $(RELEASE_XORG_HOST)..."
@ssh $(RELEASE_XORG_HOST) test ! -e $(RELEASE_XORG_DIR)/$(tar_gz) \
|| (echo "Ouch." && echo "Found: $(RELEASE_XORG_HOST):$(RELEASE_XORG_DIR)/$(tar_gz)" \
&& echo "Refusing to try to generate a new release of the same name." \
&& false)
@ssh $(RELEASE_CAIRO_HOST) test ! -e $(RELEASE_CAIRO_DIR)/$(tar_gz) \
|| (echo "Ouch." && echo "Found: $(RELEASE_CAIRO_HOST):$(RELEASE_CAIRO_DIR)/$(tar_gz)" \
&& echo "Refusing to try to generate a new release of the same name." \
&& false)
@echo "Good."
release-remove-old:
$(RM) $(tar_gz) $(tar_bz2) $(HASHFILES) $(gpg_file)
ensure-prev:
@if [[ "$(PREV)" == "" ]]; then \
echo "" && \
echo "You must set the PREV variable on the make command line to" && \
echo "the last version." && \
echo "" && \
echo "For example:" && \
echo " make PREV=0.7.3" && \
echo "" && \
false; \
fi
release-check: ensure-prev release-verify-newer release-remove-old distcheck
release-tag:
git tag -u $(GPGKEY) -m "$(PACKAGE) $(VERSION) release" $(PACKAGE)-$(VERSION)
release-upload: release-check $(tar_gz) $(tar_bz2) $(sha1_tgz) $(sha1_tbz2) $(md5_tgz) $(gpg_file)
scp $(tar_gz) $(sha1_tgz) $(gpg_file) $(RELEASE_CAIRO_HOST):$(RELEASE_CAIRO_DIR)
scp $(tar_gz) $(tar_bz2) $(RELEASE_XORG_HOST):$(RELEASE_XORG_DIR)
ssh $(RELEASE_CAIRO_HOST) "rm -f $(RELEASE_CAIRO_DIR)/LATEST-$(PACKAGE)-[0-9]* && ln -s $(tar_gz) $(RELEASE_CAIRO_DIR)/LATEST-$(PACKAGE)-$(VERSION)"
RELEASE_TYPE = $$(if test "x$(PIXMAN_VERSION_MINOR)" = "x$$(echo "$(PIXMAN_VERSION_MINOR)/2*2" | bc)" ; then echo "stable release in the" ; else echo "development snapshot leading up to a stable"; fi)
release-publish-message: $(HASHFILES) ensure-prev
@echo "Please follow the instructions in RELEASING to push stuff out and"
@echo "send out the announcement mails. Here is the excerpt you need:"
@echo ""
@echo "Lists: $(RELEASE_ANNOUNCE_LIST)"
@echo "Subject: [ANNOUNCE] $(PACKAGE) release $(VERSION) now available"
@echo "============================== CUT HERE =============================="
@echo "A new $(PACKAGE) release $(VERSION) is now available. This is a $(RELEASE_TYPE)"
@echo ""
@echo "tar.gz:"
@echo " $(RELEASE_CAIRO_URL)/$(tar_gz)"
@echo " $(RELEASE_XORG_URL)/$(tar_gz)"
@echo ""
@echo "tar.bz2:"
@echo " $(RELEASE_XORG_URL)/$(tar_bz2)"
@echo ""
@echo "Hashes:"
@echo -n " MD5: "
@cat $(md5_tgz)
@echo -n " MD5: "
@cat $(md5_tbz2)
@echo -n " SHA1: "
@cat $(sha1_tgz)
@echo -n " SHA1: "
@cat $(sha1_tbz2)
@echo ""
@echo "GPG signature:"
@echo " $(RELEASE_CAIRO_URL)/$(gpg_file)"
@echo " (signed by `git config --get user.name` <`git config --get user.email`>)"
@echo ""
@echo "Git:"
@echo " git://git.freedesktop.org/git/pixman"
@echo " tag: $(PACKAGE)-$(VERSION)"
@echo ""
@echo "Log:"
@git log --no-merges "$(PACKAGE)-$(PREV)".."$(PACKAGE)-$(VERSION)" | git shortlog | awk '{ printf "\t"; print ; }' | cut -b1-80
@echo "============================== CUT HERE =============================="
@echo ""
release-publish: release-upload release-tag release-publish-message
.PHONY: release-upload release-publish release-publish-message release-tag

@ -1,25 +0,0 @@
default: all
top_srcdir = .
include $(top_srcdir)/Makefile.win32.common
# Recursive targets
pixman_r:
@$(MAKE) -C pixman -f Makefile.win32
test_r:
@$(MAKE) -C test -f Makefile.win32
clean_r:
@$(MAKE) -C pixman -f Makefile.win32 clean
@$(MAKE) -C test -f Makefile.win32 clean
check_r:
@$(MAKE) -C test -f Makefile.win32 check
# Base targets
all: test_r
clean: clean_r
check: check_r

@ -1,54 +0,0 @@
LIBRARY = pixman-1
CC = cl
LD = link
AR = lib
PERL = perl
ifeq ($(top_builddir),)
top_builddir = $(top_srcdir)
endif
CFG_VAR = $(CFG)
ifeq ($(CFG_VAR),)
CFG_VAR = release
endif
ifeq ($(CFG_VAR),debug)
CFG_CFLAGS = -MDd -Od -Zi
CFG_LDFLAGS = -DEBUG
else
CFG_CFLAGS = -MD -O2
CFG_LDFLAGS =
endif
# Package definitions, to be used instead of those provided in config.h
PKG_CFLAGS = -DPACKAGE=$(LIBRARY) -DPACKAGE_VERSION="" -DPACKAGE_BUGREPORT=""
BASE_CFLAGS = -nologo -I. -I$(top_srcdir) -I$(top_srcdir)/pixman
PIXMAN_CFLAGS = $(BASE_CFLAGS) $(PKG_CFLAGS) $(CFG_CFLAGS) $(CFLAGS)
PIXMAN_LDFLAGS = -nologo $(CFG_LDFLAGS) $(LDFLAGS)
PIXMAN_ARFLAGS = -nologo $(LDFLAGS)
inform:
ifneq ($(CFG),release)
ifneq ($(CFG),debug)
ifneq ($(CFG),)
@echo "Invalid specified configuration option: "$(CFG)"."
@echo
@echo "Possible choices for configuration are 'release' and 'debug'"
@exit 1
endif
@echo "Using default RELEASE configuration... (use CFG=release or CFG=debug)"
endif
endif
$(CFG_VAR)/%.obj: %.c $(BUILT_SOURCES)
@mkdir -p $(CFG_VAR)
@$(CC) -c $(PIXMAN_CFLAGS) -Fo"$@" $<
clean: inform
@$(RM) $(CFG_VAR)/*.{exe,ilk,lib,obj,pdb} $(BUILT_SOURCES) || exit 0

136
README
@ -1,22 +1,134 @@
pixman is a library that provides low-level pixel manipulation
Pixman
======
Pixman is a library that provides low-level pixel manipulation
features such as image compositing and trapezoid rasterization.
All questions regarding this software should be directed to the pixman
mailing list:
Questions should be directed to the pixman mailing list:
http://lists.freedesktop.org/mailman/listinfo/pixman
https://lists.freedesktop.org/mailman/listinfo/pixman
Please send patches and bug reports either to the mailing list above,
or file them at the freedesktop bug tracker:
You can also file bugs at
https://bugs.freedesktop.org/enter_bug.cgi?product=pixman
https://gitlab.freedesktop.org/pixman/pixman/-/issues/new
The master development code repository can be found at:
or submit improvements in form of a Merge Request via
git://anongit.freedesktop.org/git/pixman
https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests
http://gitweb.freedesktop.org/?p=pixman;a=summary
For real time discussions about pixman, feel free to join the IRC
channels #cairo and #xorg-devel on the FreeNode IRC network.
For more information on the git code manager, see:
http://wiki.x.org/wiki/GitPage
Contributing
------------
In order to contribute to pixman, you will need a working knowledge of
the git version control system. For a quick getting started guide,
there is the "Everyday Git With 20 Commands Or So guide"
https://www.kernel.org/pub/software/scm/git/docs/everyday.html
from the Git homepage. For more in depth git documentation, see the
resources on the Git community documentation page:
https://git-scm.com/documentation
Pixman uses the infrastructure from the freedesktop.org umbrella
project. For instructions about how to use the git service on
freedesktop.org, see:
https://www.freedesktop.org/wiki/Infrastructure/git/Developers
The Pixman master repository can be found at:
https://gitlab.freedesktop.org/pixman/pixman
Sending patches
---------------
Patches should be submitted in form of Merge Requests via Gitlab.
You will first need to create a fork of the main pixman repository at
https://gitlab.freedesktop.org/pixman/pixman
via the Fork button on the top right. Once that is done you can add your
personal repository as a remote to your local pixman development git checkout:
git remote add my-gitlab git@gitlab.freedesktop.org:YOURUSERNAME/pixman.git
git fetch my-gitlab
Make sure to have added ssh keys to your gitlab profile at
https://gitlab.freedesktop.org/profile/keys
Once that is set up, the general workflow for sending patches is to create a
new local branch with your improvements and once it's ready push it to your
personal pixman fork:
git checkout -b fix-some-bug
...
git push my-gitlab
The output of the `git push` command will include a link that allows you to
create a Merge Request against the official pixman repository.
Whenever you make changes to your branch (add new commits or fix up commits)
you push them back to your personal pixman fork:
git push -f my-gitlab
If there is an open Merge Request Gitlab will automatically pick up the
changes from your branch and pixman developers can review them anew.
In order for your patches to be accepted, please consider the
following guidelines:
- At each point in the series, pixman should compile and the test
suite should pass.
The exception here is if you are changing the test suite to
demonstrate a bug. In this case, make one commit that makes the
test suite fail due to the bug, and then another commit that fixes
the bug.
You can run the test suite with
meson test -C builddir
It will take around two minutes to run on a modern PC.
- Follow the coding style described in the CODING_STYLE file
- For bug fixes, include an update to the test suite to make sure
the bug doesn't reappear.
- For new features, add tests of the feature to the test
suite. Also, add a program demonstrating the new feature to the
demos/ directory.
- Write descriptive commit messages. Useful information to include:
- Benchmark results, before and after
- Description of the bug that was fixed
- Detailed rationale for any new API
- Alternative approaches that were rejected (and why they
don't work)
- If review comments were incorporated, a brief version
history describing what those changes were.
- For big patch series, write an introductory post with an overall
description of the patch series, including benchmarks and
motivation. Each commit message should still be descriptive and
include enough information to understand why this particular commit
was necessary.
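
One possible way to check the first guideline (every point in the series builds and passes the tests) is to replay the series with `git rebase --exec`, which runs a command after each commit and stops at the first failure. The sketch below is only an illustration in a throwaway repository: the repository name, the `base-point` branch and the `grep` check are stand-ins invented for the example. In a real pixman checkout the command would be the `meson test -C builddir` call mentioned above, and the base would be the upstream branch.

```shell
#!/bin/sh
# Sketch: run a check after every commit of a patch series.
# Everything here (repo name, file names, the grep "test") is a stand-in;
# in a pixman checkout the --exec command would be: meson test -C builddir
set -eu

work=$(mktemp -d)
cd "$work"
git init -q series-demo
cd series-demo
git config user.name "Demo User"
git config user.email "demo@example.invalid"

# One base commit, marked by a branch that plays the role of upstream.
echo "base" > file.txt
git add file.txt
git commit -q -m "Base commit"
git branch base-point

# A three-commit series on top of the base.
for n in 1 2 3; do
    echo "change $n" >> file.txt
    git commit -q -a -m "Commit $n of the series"
done

# Replay the series; the --exec command runs once after each commit and
# the rebase stops at the first commit where the command fails.
git rebase -q --exec 'grep -q base file.txt && echo ok >> ../checked.log' base-point

# Count how many commits were checked (one log line per commit).
checked=$(wc -l < ../checked.log | tr -d ' ')
echo "checked $checked commits"
```

If the command fails at some commit, the rebase pauses there so the commit can be fixed and amended before continuing with `git rebase --continue`.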
Pixman has high standards for code quality and so almost everybody
should expect to have the first versions of their patches rejected.
If you think that the reviewers are wrong about something, or that the
guidelines above are wrong, feel free to discuss the issue. The purpose
of the guidelines and code review is to ensure high code quality; it is
not an exercise in compliance.


@ -10,12 +10,11 @@ Here are the steps to follow to create a new pixman release:
git log master...origin (no output; note: *3* dots)
2) Increment pixman_(major|minor|micro) in configure.ac according to
the directions in that file.
2) Increment the version in meson.build.
3) Make sure that new version works, including
- make distcheck passes
- meson test passes
- the X server still works with the new pixman version
installed
@ -55,3 +54,5 @@ Here are the steps to follow to create a new pixman release:
You must use "--tags" here; otherwise the new tag will not
be pushed out.
8) Change the topic of the #cairo IRC channel on freenode to advertise
the new version.
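
The version-increment step refers to the even/odd numbering scheme documented in the old configure.ac: an odd MICRO marks a version in git, an even MICRO marks a release, and for releases an odd MINOR means development while an even MINOR means stable. A minimal sketch of that rule follows. The function name `classify_version` is hypothetical, and whether meson.build keeps the same convention is not stated here.

```shell
#!/bin/sh
# Sketch of the even/odd rule from the configure.ac comments.
# classify_version is a hypothetical helper, not part of pixman.
classify_version() {
    version=$1

    # Split MAJOR.MINOR.MICRO on dots into the positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $version
    IFS=$old_ifs
    minor=$2
    micro=$3

    if [ $((micro % 2)) -eq 1 ]; then
        echo "git version"            # odd MICRO: unreleased, lives in git
    elif [ $((minor % 2)) -eq 1 ]; then
        echo "development release"    # even MICRO, odd MINOR
    else
        echo "stable release"         # even MICRO, even MINOR
    fi
}

classify_version 0.25.2
classify_version 0.44.0
classify_version 0.25.3
```

The three calls cover one case each: 0.25.2 is the value set in the removed configure.ac, 0.44.0 is the release named in the commit list, and 0.25.3 is a made-up value with an odd MICRO.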

TODO

@ -1,271 +0,0 @@
- Testing
- Test implementations against each other
- Test both with and without the operator strength reduction.
They should be identical.
- SSE 2 issues:
- Use MM_HINT_NTA instead of MM_HINT_T0
- Use of fbCompositeOver_x888x8x8888sse2()
- Update the RELEASING file
- Things to keep in mind if breaking ABI:
- There should be a guard #ifndef I_AM_EITHER_CAIRO_OR_THE_X_SERVER
- X server will require 16.16 essentially forever. Can we get
the required precision by simply adding offset_x/y to the
relevant rendering API?
- Get rid of workaround for X server bug.
- pixman_image_set_indexed() should copy its argument, and X
should be ported over to use a pixman_image as the
representation of a Picture, rather than creating one on each
operation.
- We should get rid of pixman_set_static_pointers()
- We should get rid of the various trapezoid helper functions().
(They only exist because they are theoretically available to
drivers).
- 16 bit regions should be deleted
- There should only be one trap rasterization API.
- The PIXMAN_g8/c8/etc formats should use the A channel
to indicate the actual depth. That way PIXMAN_x4c4 and PIXMAN_c8
won't collide.
- Maybe bite the bullet and make configure.ac generate a pixman-types.h
file that can be included from pixman.h to avoid the #ifdef magic
in pixman.h
- Make pixman_region_point_in() survive a NULL box, then fix up
pixman-compose.c
- Possibly look into inlining the fetch functions
- There is a bug with source clipping demonstrated by clip-test in the
test directory. If we interpret source clipping as given in
destination coordinates, which is probably the only sane choice,
then the result should have two red bars down the sides.
- Test suite
- Add a general way of dealing with architecture specific
fast-paths. The current idea is to have each operation that can
be optimized called through a function pointer that is
initially set to an initialization function that is responsible for
setting the function pointer to the appropriate fast-path.
- Go through things marked FIXME
- Add calls to prepare and finish access where necessary. grep for
ACCESS_MEM, and make sure they are correctly wrapped in prepare
and finish.
- restore READ/WRITE in the fbcompose combiners since they sometimes
store directly to destination drawables.
- It probably makes sense to move the more strange X region API
into pixman as well, but guarded with PIXMAN_XORG_COMPATIBILITY
- Reinstate the FbBits typedef? At the moment we don't
even have the FbBits type; we just use uint32_t everywhere.
Keith says in bug 2335:
The 64-bit code in fb (pixman) is probably broken; it hasn't been
used in quite some time as PCI (and AGP) is 32-bits wide, so
doing things 64-bits at a time is a net loss. To quickly fix
this, I suggest just using 32-bit datatypes by setting
IC_SHIFT to 5 for all machines.
- Consider optimizing the 8/16 bit solid fills in pixman-util.c by
storing more than one value at a time.
- Add an image cache to prevent excessive malloc/free. Note that pixman
needs to be thread safe when used from cairo.
- Moving to 24.8 coordinates. This is tricky because X is still
defined as 16.16 and will be basically forever. It's possible we
could do this by adding extra offset_x/y parameters to the
trapezoid calls. The X server could then just call the API with
(0, 0). Cairo would have to make sure that the delta *within* a
batch of trapezoids does not exceed 16 bit.
- Consider adding actual backends. Brain dump:
A backend is something that knows how to
- Create images
- Composite three images
- Rasterize trapezoids
- Do solid fills and blits
These operations are provided by a vtable that the backend will
create when it is initialized. Initial backends:
- VMX
- SSE2
- MMX
- Plain Old C
When the SIMD backends are initialized, they will be passed a
pointer to the Plain Old C backend that they can use for fallback
purposes.
Images would gain a vtable as well that would contain things like
- Read scanline
- Write scanline
(Or even read_patch/write_patch as suggested by Keith a while
back).
This could simplify the compositing code considerably.
- Review the pixman_format_code_t enum to make sure it will support
future formats. Some formats we will probably need:
ARGB/ABGR with 16/32/64 bit integer/floating channels
YUV2,
YV12
Also we may need the ability to distinguish between PICT_c8 and
PICT_x4c4. (This could be done by interpreting the A channel as
the depth for TYPE_COLOR and TYPE_GRAY formats).
A possibility may be to reserve the two top bits and make them
encode "number of places to shift the channel widths given" Since
these bits are 00 at the moment everything will continue to work,
but these additional widths will be allowed:
All even widths between 18-32
All multiples of four widths between 33 and 64
All multiples of eight between 64 and 128
This means things like r21g22b21 won't work - is that worth
worrying about? I don't think so. And of course the bpp field
can't handle a depth of over 256, so > 64 bit channels aren't
really all that useful.
We could reserve one extra bit to indicate floating point, but
we may also just add
PIXMAN_TYPE_ARGB_FLOAT
PIXMAN_TYPE_BGRA_FLOAT
PIXMAN_TYPE_A_FLOAT
image types. With five bits we can support up to 32 different
format types, which should be enough for everybody, even if we
decide to support all the various video formats here:
http://www.fourcc.org/yuv.php
It may make sense to have a PIXMAN_TYPE_YUV, and then use the
channel bits to specify the exact subtype.
Another possibility is to add
PIXMAN_TYPE_ARGB_W
PIXMAN_TYPE_ARGB_WW
where the channel widths would get 16 and 32 added to them,
respectively.
What about color spaces such as linear vs. sRGB etc.?
done:
- Use pixmanFillsse2 and pixmanBltsse2
- Be consistent about calling sse2 sse2
- Rename "SSE" to "MMX_EXTENSIONS". (Deleted mmx extensions).
- Commented-out uses of fbCompositeCopyAreasse2()
- Consider whether calling regions region16 is really such a great
idea. Vlad wants 32 bit regions for Cairo. This will break X server
ABI, but should otherwise be mostly harmless, though a
pixman_region_get_boxes16() may be useful.
- Altivec signal issue (Company has fix, there is also a patch by
dwmw2 in rawhide).
- Behdad's MMX issue - see list
- SSE2 issues:
- Crashes in Mozilla because of unaligned stack. Possible fixes
- Make use of gcc 4.2 feature to align the stack
- Write some sort of trampoline that aligns the stack
before calling SSE functions.
- Get rid of the switch-of-doom; replace it with a big table
describing the various fast paths.
- Make source clipping optional.
- done: source clipping happens through an indirection.
still needs to make the indirection settable. (And call it
from X)
- Run cairo test suite; fix bugs
- one bug in source-scale-clip
- Remove the warning suppression in the ACCESS_MEM macro and fix the
warnings that are real
- irrelevant now.
- make the wrapper functions global instead of image specific
- this won't work since pixman is linked to both fb and wfb
- Add non-mmx solid fill
- Make sure the endian-ness macros are defined correctly.
- The rectangles in a region probably shouldn't be returned const as
the X server will be changing them.
- Right now we _always_ have a clip region, which is empty by default.
Why does this work at all? It probably doesn't. The server
distinguishes two cases, one where nothing is clipped (CT_NONE), and
one where there is a clip region (CT_REGION).
- Default clip region should be the full image
- Test if pseudo color still works. It does, but it also shows that
copying a pixman_indexed_t on every composite operation is not
going to fly. So, for now set_indexed() does not copy the
indexed table.
Also just the malloc() to allocate a pixman image shows up pretty
high.
Options include
- Make all the setters not copy their arguments
- Possibly combined with going back to the stack allocated
approach that we already use for regions.
- Keep a cached pixman_image_t around for every picture. It would
have to be kept up to date every time something changes about the
picture.
- Break the X server ABI and simply have the relevant parameter
stored in the pixman image. This would have the additional benefits
that:
- We can get rid of the annoying repeat field which is duplicated
elsewhere.
- We can use pixman_color_t and pixman_gradient_stop_t
etc. instead of the types that are defined in
renderproto.h

a64-neon-test.S

@ -0,0 +1,5 @@
.text
.arch armv8-a
.altmacro
prfm pldl2strm, [x0]
xtn v0.8b, v0.8h

arm-simd-test.S

@ -0,0 +1,10 @@
.text
.arch armv6
.object_arch armv4
.arm
.altmacro
#ifndef __ARM_EABI__
#error EABI is required (to be sure that calling conventions are compatible)
#endif
pld [r0]
uqadd8 r0, r0, r0


@ -1,14 +0,0 @@
#! /bin/sh
srcdir=`dirname $0`
test -z "$srcdir" && srcdir=.
ORIGDIR=`pwd`
cd $srcdir
autoreconf -v --install || exit 1
cd $ORIGDIR || exit $?
if test -z "$NOCONFIGURE"; then
$srcdir/configure "$@"
fi


@ -1,951 +0,0 @@
dnl Copyright 2005 Red Hat, Inc.
dnl
dnl Permission to use, copy, modify, distribute, and sell this software and its
dnl documentation for any purpose is hereby granted without fee, provided that
dnl the above copyright notice appear in all copies and that both that
dnl copyright notice and this permission notice appear in supporting
dnl documentation, and that the name of Red Hat not be used in
dnl advertising or publicity pertaining to distribution of the software without
dnl specific, written prior permission. Red Hat makes no
dnl representations about the suitability of this software for any purpose. It
dnl is provided "as is" without express or implied warranty.
dnl
dnl RED HAT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
dnl INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO
dnl EVENT SHALL RED HAT BE LIABLE FOR ANY SPECIAL, INDIRECT OR
dnl CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE,
dnl DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
dnl TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
dnl PERFORMANCE OF THIS SOFTWARE.
dnl
dnl Process this file with autoconf to create configure.
AC_PREREQ([2.57])
# Pixman versioning scheme
#
# - The version in git has an odd MICRO version number
#
# - Released versions, both development and stable, have an
# even MICRO version number
#
# - Released development versions have an odd MINOR number
#
# - Released stable versions have an even MINOR number
#
# - Versions that break ABI must have a new MAJOR number
#
# - If you break the ABI, then at least this must be done:
#
# - increment MAJOR
#
# - In the first development release where you break ABI, find
# all instances of "pixman-n" and change them to pixman-(n+1)
#
# This needs to be done at least in
# configure.ac
# all Makefile.am's
# pixman-n.pc.in
#
# This ensures that binary incompatible versions can be installed
# in parallel. See http://www106.pair.com/rhp/parallel.html for
# more information
#
m4_define([pixman_major], 0)
m4_define([pixman_minor], 25)
m4_define([pixman_micro], 2)
m4_define([pixman_version],[pixman_major.pixman_minor.pixman_micro])
AC_INIT(pixman, pixman_version, [pixman@lists.freedesktop.org], pixman)
AM_INIT_AUTOMAKE([foreign dist-bzip2])
# Suppress verbose compile lines
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])
AM_CONFIG_HEADER(config.h)
AC_CANONICAL_HOST
test_CFLAGS=${CFLAGS+set} # We may override autoconf default CFLAGS.
AC_PROG_CC
AM_PROG_AS
AC_PROG_LIBTOOL
AC_CHECK_FUNCS([getisax])
AC_C_BIGENDIAN
AC_C_INLINE
dnl PIXMAN_LINK_WITH_ENV(env-setup, program, true-action, false-action)
dnl
dnl Compiles and links the given program in the environment setup by env-setup
dnl and executes true-action on success and false-action on failure.
AC_DEFUN([PIXMAN_LINK_WITH_ENV],[dnl
save_CFLAGS="$CFLAGS"
save_LDFLAGS="$LDFLAGS"
save_LIBS="$LIBS"
CFLAGS=""
LDFLAGS=""
LIBS=""
$1
AC_LINK_IFELSE(
[AC_LANG_SOURCE([$2])],
[pixman_cc_stderr=`test -f conftest.err && cat conftest.err`
pixman_cc_flag=yes],
[pixman_cc_stderr=`test -f conftest.err && cat conftest.err`
pixman_cc_flag=no])
if test "x$pixman_cc_stderr" != "x"; then
pixman_cc_flag=no
fi
if test "x$pixman_cc_flag" = "xyes"; then
ifelse([$3], , :, [$3])
else
ifelse([$4], , :, [$4])
fi
CFLAGS="$save_CFLAGS"
LDFLAGS="$save_LDFLAGS"
LIBS="$save_LIBS"
])
dnl Find a -Werror for catching warnings.
WERROR=
for w in -Werror -errwarn; do
if test "z$WERROR" = "z"; then
AC_MSG_CHECKING([whether the compiler supports $w])
PIXMAN_LINK_WITH_ENV(
[CFLAGS=$w],
[int main(int c, char **v) { (void)c; (void)v; return 0; }],
[WERROR=$w; yesno=yes], [yesno=no])
AC_MSG_RESULT($yesno)
fi
done
dnl PIXMAN_CHECK_CFLAG(flag, [program])
dnl Adds flag to CFLAGS if the given program links without warnings or errors.
AC_DEFUN([PIXMAN_CHECK_CFLAG], [dnl
AC_MSG_CHECKING([whether the compiler supports $1])
PIXMAN_LINK_WITH_ENV(
[CFLAGS="$WERROR $1"],
[$2
int main(int c, char **v) { (void)c; (void)v; return 0; }
],
[_yesno=yes],
[_yesno=no])
if test "x$_yesno" = xyes; then
CFLAGS="$CFLAGS $1"
fi
AC_MSG_RESULT($_yesno)
])
AC_CHECK_SIZEOF(long)
# Checks for Sun Studio compilers
AC_CHECK_DECL([__SUNPRO_C], [SUNCC="yes"], [SUNCC="no"])
AC_CHECK_DECL([__amd64], [AMD64_ABI="yes"], [AMD64_ABI="no"])
# Default CFLAGS to -O -g rather than just the -g from AC_PROG_CC
# if we're using Sun Studio and neither the user nor a config.site
# has set CFLAGS.
if test $SUNCC = yes && \
test "x$test_CFLAGS" = "x" && \
test "$CFLAGS" = "-g"
then
CFLAGS="-O -g"
fi
#
# We ignore pixman_major in the version here because the major version should
# always be encoded in the actual library name. Ie., the soname is:
#
# pixman-$(pixman_major).0.minor.micro
#
m4_define([lt_current], [pixman_minor])
m4_define([lt_revision], [pixman_micro])
m4_define([lt_age], [pixman_minor])
LT_VERSION_INFO="lt_current:lt_revision:lt_age"
PIXMAN_VERSION_MAJOR=pixman_major()
AC_SUBST(PIXMAN_VERSION_MAJOR)
PIXMAN_VERSION_MINOR=pixman_minor()
AC_SUBST(PIXMAN_VERSION_MINOR)
PIXMAN_VERSION_MICRO=pixman_micro()
AC_SUBST(PIXMAN_VERSION_MICRO)
AC_SUBST(LT_VERSION_INFO)
# Check for dependencies
PIXMAN_CHECK_CFLAG([-Wall])
PIXMAN_CHECK_CFLAG([-fno-strict-aliasing])
AC_PATH_PROG(PERL, perl, no)
if test "x$PERL" = xno; then
AC_MSG_ERROR([Perl is required to build pixman.])
fi
AC_SUBST(PERL)
dnl =========================================================================
dnl OpenMP for the test suite?
dnl
# Check for OpenMP support only when autoconf support that (require autoconf >=2.62)
OPENMP_CFLAGS=
m4_ifdef([AC_OPENMP], [AC_OPENMP])
if test "x$enable_openmp" = "xyes" && test "x$ac_cv_prog_c_openmp" = "xunsupported" ; then
AC_MSG_WARN([OpenMP support requested but found unsupported])
fi
dnl May not fail to link without -Wall -Werror added
dnl So try to link only when openmp is supported
dnl ac_cv_prog_c_openmp is not defined when --disable-openmp is used
if test "x$ac_cv_prog_c_openmp" != "xunsupported" && test "x$ac_cv_prog_c_openmp" != "x"; then
m4_define([openmp_test_program],[dnl
#include <stdio.h>
extern unsigned int lcg_seed;
#pragma omp threadprivate(lcg_seed)
unsigned int lcg_seed;
unsigned function(unsigned a, unsigned b)
{
lcg_seed ^= b;
return ((a + b) ^ a ) + lcg_seed;
}
int main(int argc, char **argv)
{
int i;
int n1 = 0, n2 = argc;
unsigned checksum = 0;
int verbose = argv != NULL;
unsigned (*test_function)(unsigned, unsigned);
test_function = function;
#pragma omp parallel for reduction(+:checksum) default(none) \
shared(n1, n2, test_function, verbose)
for (i = n1; i < n2; i++)
{
unsigned crc = test_function (i, 0);
if (verbose)
printf ("%d: %08X\n", i, crc);
checksum += crc;
}
printf("%u\n", checksum);
return 0;
}
])
PIXMAN_LINK_WITH_ENV(
[CFLAGS="$OPENMP_CFLAGS" LDFLAGS="$OPENMP_CFLAGS"],
[openmp_test_program],
[have_openmp=yes],
[have_openmp=no])
if test "x$have_openmp" = "xyes" ; then
AC_DEFINE(USE_OPENMP, 1, [use OpenMP in the test suite])
fi
fi
AC_SUBST(OPENMP_CFLAGS)
dnl =========================================================================
dnl -fvisibility stuff
PIXMAN_CHECK_CFLAG([-fvisibility=hidden], [dnl
#if defined(__GNUC__) && (__GNUC__ >= 4)
#ifdef _WIN32
#error Have -fvisibility but it is ignored and generates a warning
#endif
#else
#error Need GCC 4.0 for visibility
#endif
])
PIXMAN_CHECK_CFLAG([-xldscope=hidden], [dnl
#if defined(__SUNPRO_C) && (__SUNPRO_C >= 0x550)
#else
#error Need Sun Studio 8 for visibility
#endif
])
dnl ===========================================================================
dnl Check for MMX
if test "x$MMX_CFLAGS" = "x" ; then
if test "x$SUNCC" = "xyes"; then
# Sun Studio doesn't have an -xarch=mmx flag, so we have to use sse
# but if we're building 64-bit, mmx & sse support is on by default and
# -xarch=sse throws an error instead
if test "$AMD64_ABI" = "no" ; then
MMX_CFLAGS="-xarch=sse"
fi
else
MMX_CFLAGS="-mmmx -Winline"
fi
fi
have_mmx_intrinsics=no
AC_MSG_CHECKING(whether to use MMX intrinsics)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="$MMX_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
#if defined(__GNUC__) && (__GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 4))
#error "Need GCC >= 3.4 for MMX intrinsics"
#endif
#if defined(__clang__)
#error "clang chokes on the inline assembly in pixman-mmx.c"
#endif
#include <mmintrin.h>
int main () {
__m64 v = _mm_cvtsi32_si64 (1);
return _mm_cvtsi64_si32 (v);
}]])], have_mmx_intrinsics=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(mmx,
[AC_HELP_STRING([--disable-mmx],
[disable x86 MMX fast paths])],
[enable_mmx=$enableval], [enable_mmx=auto])
if test $enable_mmx = no ; then
have_mmx_intrinsics=disabled
fi
if test $have_mmx_intrinsics = yes ; then
AC_DEFINE(USE_X86_MMX, 1, [use x86 MMX compiler intrinsics])
else
MMX_CFLAGS=
fi
AC_MSG_RESULT($have_mmx_intrinsics)
if test $enable_mmx = yes && test $have_mmx_intrinsics = no ; then
AC_MSG_ERROR([x86 MMX intrinsics not detected])
fi
AM_CONDITIONAL(USE_X86_MMX, test $have_mmx_intrinsics = yes)
dnl ===========================================================================
dnl Check for SSE2
if test "x$SSE2_CFLAGS" = "x" ; then
if test "x$SUNCC" = "xyes"; then
# SSE2 is enabled by default in the Sun Studio 64-bit environment
if test "$AMD64_ABI" = "no" ; then
SSE2_CFLAGS="-xarch=sse2"
fi
else
SSE2_CFLAGS="-msse2 -Winline"
fi
fi
have_sse2_intrinsics=no
AC_MSG_CHECKING(whether to use SSE2 intrinsics)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="$SSE2_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
#if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 2))
# if !defined(__amd64__) && !defined(__x86_64__)
# error "Need GCC >= 4.2 for SSE2 intrinsics on x86"
# endif
#endif
#include <mmintrin.h>
#include <xmmintrin.h>
#include <emmintrin.h>
int main () {
__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
c = _mm_xor_si128 (a, b);
return 0;
}]])], have_sse2_intrinsics=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(sse2,
[AC_HELP_STRING([--disable-sse2],
[disable SSE2 fast paths])],
[enable_sse2=$enableval], [enable_sse2=auto])
if test $enable_sse2 = no ; then
have_sse2_intrinsics=disabled
fi
if test $have_sse2_intrinsics = yes ; then
AC_DEFINE(USE_SSE2, 1, [use SSE2 compiler intrinsics])
fi
AC_MSG_RESULT($have_sse2_intrinsics)
if test $enable_sse2 = yes && test $have_sse2_intrinsics = no ; then
AC_MSG_ERROR([SSE2 intrinsics not detected])
fi
AM_CONDITIONAL(USE_SSE2, test $have_sse2_intrinsics = yes)
dnl ===========================================================================
dnl Other special flags needed when building code using MMX or SSE instructions
case $host_os in
solaris*)
# When building 32-bit binaries, apply a mapfile to ensure that the
# binaries aren't flagged as only able to run on MMX+SSE capable CPUs
# since they check at runtime before using those instructions.
# Not all linkers grok the mapfile format so we check for that first.
if test "$AMD64_ABI" = "no" ; then
use_hwcap_mapfile=no
AC_MSG_CHECKING(whether to use a hardware capability map file)
hwcap_save_LDFLAGS="$LDFLAGS"
HWCAP_LDFLAGS='-Wl,-M,$(srcdir)/solaris-hwcap.mapfile'
LDFLAGS="$LDFLAGS -Wl,-M,pixman/solaris-hwcap.mapfile"
AC_LINK_IFELSE([AC_LANG_SOURCE([[int main() { return 0; }]])],
use_hwcap_mapfile=yes,
HWCAP_LDFLAGS="")
LDFLAGS="$hwcap_save_LDFLAGS"
AC_MSG_RESULT($use_hwcap_mapfile)
fi
if test "x$MMX_LDFLAGS" = "x" ; then
MMX_LDFLAGS="$HWCAP_LDFLAGS"
fi
if test "x$SSE2_LDFLAGS" = "x" ; then
SSE2_LDFLAGS="$HWCAP_LDFLAGS"
fi
;;
esac
AC_SUBST(IWMMXT_CFLAGS)
AC_SUBST(MMX_CFLAGS)
AC_SUBST(MMX_LDFLAGS)
AC_SUBST(SSE2_CFLAGS)
AC_SUBST(SSE2_LDFLAGS)
dnl ===========================================================================
dnl Check for VMX/Altivec
if test -n "`$CC -v 2>&1 | grep version | grep Apple`"; then
VMX_CFLAGS="-faltivec"
else
VMX_CFLAGS="-maltivec -mabi=altivec"
fi
have_vmx_intrinsics=no
AC_MSG_CHECKING(whether to use VMX/Altivec intrinsics)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="$VMX_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
#if defined(__GNUC__) && (__GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 4))
#error "Need GCC >= 3.4 for sane altivec support"
#endif
#include <altivec.h>
int main () {
vector unsigned int v = vec_splat_u32 (1);
v = vec_sub (v, v);
return 0;
}]])], have_vmx_intrinsics=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(vmx,
[AC_HELP_STRING([--disable-vmx],
[disable VMX fast paths])],
[enable_vmx=$enableval], [enable_vmx=auto])
if test $enable_vmx = no ; then
have_vmx_intrinsics=disabled
fi
if test $have_vmx_intrinsics = yes ; then
AC_DEFINE(USE_VMX, 1, [use VMX compiler intrinsics])
else
VMX_CFLAGS=
fi
AC_MSG_RESULT($have_vmx_intrinsics)
if test $enable_vmx = yes && test $have_vmx_intrinsics = no ; then
AC_MSG_ERROR([VMX intrinsics not detected])
fi
AC_SUBST(VMX_CFLAGS)
AM_CONDITIONAL(USE_VMX, test $have_vmx_intrinsics = yes)
dnl ==========================================================================
dnl Check if assembler is gas compatible and supports ARM SIMD instructions
have_arm_simd=no
AC_MSG_CHECKING(whether to use ARM SIMD assembler)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="-x assembler-with-cpp $CFLAGS"
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
.text
.arch armv6
.object_arch armv4
.arm
.altmacro
#ifndef __ARM_EABI__
#error EABI is required (to be sure that calling conventions are compatible)
#endif
pld [r0]
uqadd8 r0, r0, r0]])], have_arm_simd=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(arm-simd,
[AC_HELP_STRING([--disable-arm-simd],
[disable ARM SIMD fast paths])],
[enable_arm_simd=$enableval], [enable_arm_simd=auto])
if test $enable_arm_simd = no ; then
have_arm_simd=disabled
fi
if test $have_arm_simd = yes ; then
AC_DEFINE(USE_ARM_SIMD, 1, [use ARM SIMD assembly optimizations])
fi
AM_CONDITIONAL(USE_ARM_SIMD, test $have_arm_simd = yes)
AC_MSG_RESULT($have_arm_simd)
if test $enable_arm_simd = yes && test $have_arm_simd = no ; then
AC_MSG_ERROR([ARM SIMD intrinsics not detected])
fi
dnl ==========================================================================
dnl Check if assembler is gas compatible and supports NEON instructions
have_arm_neon=no
AC_MSG_CHECKING(whether to use ARM NEON assembler)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="-x assembler-with-cpp $CFLAGS"
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
.text
.fpu neon
.arch armv7a
.object_arch armv4
.eabi_attribute 10, 0
.arm
.altmacro
#ifndef __ARM_EABI__
#error EABI is required (to be sure that calling conventions are compatible)
#endif
pld [r0]
vmovn.u16 d0, q0]])], have_arm_neon=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(arm-neon,
[AC_HELP_STRING([--disable-arm-neon],
[disable ARM NEON fast paths])],
[enable_arm_neon=$enableval], [enable_arm_neon=auto])
if test $enable_arm_neon = no ; then
have_arm_neon=disabled
fi
if test $have_arm_neon = yes ; then
AC_DEFINE(USE_ARM_NEON, 1, [use ARM NEON assembly optimizations])
fi
AM_CONDITIONAL(USE_ARM_NEON, test $have_arm_neon = yes)
AC_MSG_RESULT($have_arm_neon)
if test $enable_arm_neon = yes && test $have_arm_neon = no ; then
AC_MSG_ERROR([ARM NEON intrinsics not detected])
fi
dnl ===========================================================================
dnl Check for IWMMXT
if test "x$IWMMXT_CFLAGS" = "x" ; then
IWMMXT_CFLAGS="-march=iwmmxt -flax-vector-conversions -Winline"
fi
have_iwmmxt_intrinsics=no
AC_MSG_CHECKING(whether to use ARM IWMMXT intrinsics)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="$IWMMXT_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
#ifndef __arm__
#error "IWMMXT is only available on ARM"
#endif
#if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 6))
#error "Need GCC >= 4.6 for IWMMXT intrinsics"
#endif
#include <mmintrin.h>
int main () {
union {
__m64 v;
char c[8];
} a = { .c = {1, 2, 3, 4, 5, 6, 7, 8} };
int b = 4;
__m64 c = _mm_srli_si64 (a.v, b);
}]])], have_iwmmxt_intrinsics=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(arm-iwmmxt,
[AC_HELP_STRING([--disable-arm-iwmmxt],
[disable ARM IWMMXT fast paths])],
[enable_iwmmxt=$enableval], [enable_iwmmxt=auto])
if test $enable_iwmmxt = no ; then
have_iwmmxt_intrinsics=disabled
fi
if test $have_iwmmxt_intrinsics = yes ; then
AC_DEFINE(USE_ARM_IWMMXT, 1, [use ARM IWMMXT compiler intrinsics])
else
IWMMXT_CFLAGS=
fi
AC_MSG_RESULT($have_iwmmxt_intrinsics)
if test $enable_iwmmxt = yes && test $have_iwmmxt_intrinsics = no ; then
AC_MSG_ERROR([IWMMXT intrinsics not detected])
fi
AM_CONDITIONAL(USE_ARM_IWMMXT, test $have_iwmmxt_intrinsics = yes)
dnl ==========================================================================
dnl Check if assembler is gas compatible and supports MIPS DSPr2 instructions
have_mips_dspr2=no
AC_MSG_CHECKING(whether to use MIPS DSPr2 assembler)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="-mdspr2 $CFLAGS"
AC_COMPILE_IFELSE([[
#if !(defined(__mips__) && __mips_isa_rev >= 2)
#error MIPS DSPr2 is currently only available on MIPS32r2 platforms.
#endif
int
main ()
{
int c = 0, a = 0, b = 0;
__asm__ __volatile__ (
"precr.qb.ph %[c], %[a], %[b] \n\t"
: [c] "=r" (c)
: [a] "r" (a), [b] "r" (b)
);
return c;
}]], have_mips_dspr2=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(mips-dspr2,
[AC_HELP_STRING([--disable-mips-dspr2],
[disable MIPS DSPr2 fast paths])],
[enable_mips_dspr2=$enableval], [enable_mips_dspr2=auto])
if test $enable_mips_dspr2 = no ; then
have_mips_dspr2=disabled
fi
if test $have_mips_dspr2 = yes ; then
AC_DEFINE(USE_MIPS_DSPR2, 1, [use MIPS DSPr2 assembly optimizations])
fi
AM_CONDITIONAL(USE_MIPS_DSPR2, test $have_mips_dspr2 = yes)
AC_MSG_RESULT($have_mips_dspr2)
if test $enable_mips_dspr2 = yes && test $have_mips_dspr2 = no ; then
AC_MSG_ERROR([MIPS DSPr2 instructions not detected])
fi
dnl =========================================================================================
dnl Check for GNU-style inline assembly support
have_gcc_inline_asm=no
AC_MSG_CHECKING(whether to use GNU-style inline assembler)
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
int main () {
/* Most modern architectures have a NOP instruction, so this is a fairly generic test. */
asm volatile ( "\tnop\n" : : : "cc", "memory" );
return 0;
}]])], have_gcc_inline_asm=yes)
AC_ARG_ENABLE(gcc-inline-asm,
[AC_HELP_STRING([--disable-gcc-inline-asm],
[disable GNU-style inline assembler])],
[enable_gcc_inline_asm=$enableval], [enable_gcc_inline_asm=auto])
if test $enable_gcc_inline_asm = no ; then
have_gcc_inline_asm=disabled
fi
if test $have_gcc_inline_asm = yes ; then
AC_DEFINE(USE_GCC_INLINE_ASM, 1, [use GNU-style inline assembler])
fi
AC_MSG_RESULT($have_gcc_inline_asm)
if test $enable_gcc_inline_asm = yes && test $have_gcc_inline_asm = no ; then
AC_MSG_ERROR([GNU-style inline assembler not detected])
fi
AM_CONDITIONAL(USE_GCC_INLINE_ASM, test $have_gcc_inline_asm = yes)
dnl ==============================================
dnl Static test programs
AC_ARG_ENABLE(static-testprogs,
[AC_HELP_STRING([--enable-static-testprogs],
[build test programs as static binaries [default=no]])],
[enable_static_testprogs=$enableval], [enable_static_testprogs=no])
TESTPROGS_EXTRA_LDFLAGS=
if test "x$enable_static_testprogs" = "xyes" ; then
TESTPROGS_EXTRA_LDFLAGS="-all-static"
fi
AC_SUBST(TESTPROGS_EXTRA_LDFLAGS)
dnl ==============================================
dnl Timers
AC_ARG_ENABLE(timers,
[AC_HELP_STRING([--enable-timers],
[enable TIMER_BEGIN and TIMER_END macros [default=no]])],
[enable_timers=$enableval], [enable_timers=no])
if test $enable_timers = yes ; then
AC_DEFINE(PIXMAN_TIMERS, 1, [enable TIMER_BEGIN/TIMER_END macros])
fi
AC_SUBST(PIXMAN_TIMERS)
dnl ===================================
dnl GTK+
AC_ARG_ENABLE(gtk,
[AC_HELP_STRING([--enable-gtk],
[enable tests using GTK+ [default=auto]])],
[enable_gtk=$enableval], [enable_gtk=auto])
PKG_PROG_PKG_CONFIG
if test $enable_gtk = yes ; then
AC_CHECK_LIB([pixman-1], [pixman_version_string])
PKG_CHECK_MODULES(GTK, [gtk+-2.0 pixman-1])
fi
if test $enable_gtk = auto ; then
AC_CHECK_LIB([pixman-1], [pixman_version_string], [enable_gtk=auto], [enable_gtk=no])
fi
if test $enable_gtk = auto ; then
PKG_CHECK_MODULES(GTK, [gtk+-2.0 pixman-1], [enable_gtk=yes], [enable_gtk=no])
fi
AM_CONDITIONAL(HAVE_GTK, [test "x$enable_gtk" = xyes])
AC_SUBST(GTK_CFLAGS)
AC_SUBST(GTK_LIBS)
AC_SUBST(DEP_CFLAGS)
AC_SUBST(DEP_LIBS)
dnl =====================================
dnl posix_memalign, sigaction, alarm, gettimeofday
AC_CHECK_FUNC(posix_memalign, have_posix_memalign=yes, have_posix_memalign=no)
if test x$have_posix_memalign = xyes; then
AC_DEFINE(HAVE_POSIX_MEMALIGN, 1, [Whether we have posix_memalign()])
fi
AC_CHECK_FUNC(sigaction, have_sigaction=yes, have_sigaction=no)
if test x$have_sigaction = xyes; then
AC_DEFINE(HAVE_SIGACTION, 1, [Whether we have sigaction()])
fi
AC_CHECK_FUNC(alarm, have_alarm=yes, have_alarm=no)
if test x$have_alarm = xyes; then
AC_DEFINE(HAVE_ALARM, 1, [Whether we have alarm()])
fi
AC_CHECK_HEADER([sys/mman.h],
[AC_DEFINE(HAVE_SYS_MMAN_H, [1], [Define to 1 if we have <sys/mman.h>])])
AC_CHECK_FUNC(mmap, have_mmap=yes, have_mmap=no)
if test x$have_mmap = xyes; then
AC_DEFINE(HAVE_MMAP, 1, [Whether we have mmap()])
fi
AC_CHECK_FUNC(mprotect, have_mprotect=yes, have_mprotect=no)
if test x$have_mprotect = xyes; then
AC_DEFINE(HAVE_MPROTECT, 1, [Whether we have mprotect()])
fi
AC_CHECK_FUNC(getpagesize, have_getpagesize=yes, have_getpagesize=no)
if test x$have_getpagesize = xyes; then
AC_DEFINE(HAVE_GETPAGESIZE, 1, [Whether we have getpagesize()])
fi
AC_CHECK_HEADER([fenv.h],
[AC_DEFINE(HAVE_FENV_H, [1], [Define to 1 if we have <fenv.h>])])
AC_CHECK_LIB(m, feenableexcept, have_feenableexcept=yes, have_feenableexcept=no)
if test x$have_feenableexcept = xyes; then
AC_DEFINE(HAVE_FEENABLEEXCEPT, 1, [Whether we have feenableexcept()])
fi
AC_CHECK_FUNC(gettimeofday, have_gettimeofday=yes, have_gettimeofday=no)
AC_CHECK_HEADER(sys/time.h, have_sys_time_h=yes, have_sys_time_h=no)
if test x$have_gettimeofday = xyes && test x$have_sys_time_h = xyes; then
AC_DEFINE(HAVE_GETTIMEOFDAY, 1, [Whether we have gettimeofday()])
fi
dnl =====================================
dnl Thread local storage
support_for__thread=no
AC_MSG_CHECKING(for __thread)
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#if defined(__MINGW32__) && !(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5))
#error This MinGW version has broken __thread support
#endif
#ifdef __OpenBSD__
#error OpenBSD has broken __thread support
#endif
static __thread int x ;
int main () { x = 123; return x; }
]])], support_for__thread=yes)
if test $support_for__thread = yes; then
AC_DEFINE([TOOLCHAIN_SUPPORTS__THREAD],[],[Whether the tool chain supports __thread])
fi
AC_MSG_RESULT($support_for__thread)
dnl
dnl posix tls
dnl
m4_define([pthread_test_program],AC_LANG_SOURCE([[dnl
#include <stdlib.h>
#include <pthread.h>
static pthread_once_t once_control = PTHREAD_ONCE_INIT;
static pthread_key_t key;
static void
make_key (void)
{
pthread_key_create (&key, NULL);
}
int
main ()
{
void *value = NULL;
if (pthread_once (&once_control, make_key) != 0)
{
value = NULL;
}
else
{
value = pthread_getspecific (key);
if (!value)
{
value = malloc (100);
pthread_setspecific (key, value);
}
}
return 0;
}
]]))
AC_DEFUN([PIXMAN_CHECK_PTHREAD],[dnl
if test "z$support_for_pthread_setspecific" != "zyes"; then
PIXMAN_LINK_WITH_ENV(
[$1], [pthread_test_program],
[PTHREAD_CFLAGS="$CFLAGS"
PTHREAD_LIBS="$LIBS"
PTHREAD_LDFLAGS="$LDFLAGS"
support_for_pthread_setspecific=yes])
fi
])
if test $support_for__thread = no; then
support_for_pthread_setspecific=no
AC_MSG_CHECKING(for pthread_setspecific)
PIXMAN_CHECK_PTHREAD([CFLAGS="-D_REENTRANT"; LIBS="-lpthread"])
PIXMAN_CHECK_PTHREAD([CFLAGS="-pthread"; LDFLAGS="-pthread"])
PIXMAN_CHECK_PTHREAD([CFLAGS="-D_REENTRANT"; LDFLAGS="-lroot"])
if test $support_for_pthread_setspecific = yes; then
CFLAGS="$CFLAGS $PTHREAD_CFLAGS"
AC_DEFINE([HAVE_PTHREAD_SETSPECIFIC], [], [Whether pthread_setspecific() is supported])
fi
AC_MSG_RESULT($support_for_pthread_setspecific);
fi
AC_SUBST(TOOLCHAIN_SUPPORTS__THREAD)
AC_SUBST(HAVE_PTHREAD_SETSPECIFIC)
AC_SUBST(PTHREAD_LDFLAGS)
AC_SUBST(PTHREAD_LIBS)
dnl =====================================
dnl __attribute__((constructor))
support_for_attribute_constructor=no
AC_MSG_CHECKING(for __attribute__((constructor)))
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#if defined(__GNUC__) && (__GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 7))
/* attribute 'constructor' is supported since gcc 2.7, but some compilers
* may only pretend to be gcc, so let's try to actually use it
*/
static int x = 1;
static void __attribute__((constructor)) constructor_function () { x = 0; }
int main (void) { return x; }
#else
#error not gcc or gcc version is older than 2.7
#endif
]])], support_for_attribute_constructor=yes)
if test x$support_for_attribute_constructor = xyes; then
AC_DEFINE([TOOLCHAIN_SUPPORTS_ATTRIBUTE_CONSTRUCTOR],
[],[Whether the tool chain supports __attribute__((constructor))])
fi
AC_MSG_RESULT($support_for_attribute_constructor)
AC_SUBST(TOOLCHAIN_SUPPORTS_ATTRIBUTE_CONSTRUCTOR)
dnl ==================
dnl libpng
AC_ARG_ENABLE(libpng, AS_HELP_STRING([--enable-libpng], [Build support for libpng (default: auto)]),
[have_libpng=$enableval], [have_libpng=auto])
case x$have_libpng in
xyes) PKG_CHECK_MODULES(PNG, [libpng]) ;;
xno) ;;
*) PKG_CHECK_MODULES(PNG, [libpng], have_libpng=yes, have_libpng=no) ;;
esac
if test x$have_libpng = xyes; then
AC_DEFINE([HAVE_LIBPNG], [1], [Whether we have libpng])
fi
AC_SUBST(HAVE_LIBPNG)
AC_OUTPUT([pixman-1.pc
pixman-1-uninstalled.pc
Makefile
pixman/Makefile
pixman/pixman-version.h
demos/Makefile
test/Makefile])
m4_if(m4_eval(pixman_minor % 2), [1], [
echo
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo
echo " Thanks for testing this development snapshot of pixman. Please"
echo " report any problems you find, either by sending email to "
echo
echo " pixman@lists.freedesktop.org"
echo
echo " or by filing a bug at "
echo
echo " https://bugs.freedesktop.org/enter_bug.cgi?product=pixman "
echo
echo " If you are looking for a stable release of pixman, please note "
echo " that stable releases have _even_ minor version numbers. Ie., "
echo " pixman-0.]m4_eval(pixman_minor & ~1)[.x are stable releases, whereas pixman-$PIXMAN_VERSION_MAJOR.$PIXMAN_VERSION_MINOR.$PIXMAN_VERSION_MICRO is a "
echo " development snapshot that may contain bugs and experimental "
echo " features. "
echo
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo
])

226
debian/changelog vendored

@ -1,3 +1,229 @@
pixman (0.44.0-4) UNRELEASED; urgency=medium
* Team upload.
* debian/copyright: Convert to machine-readable format
-- Dylan Aïssi <daissi@debian.org> Thu, 31 Jul 2025 22:16:23 +0200
pixman (0.44.0-3) unstable; urgency=medium
* Replace timeout bump patch by using a multiplier option instead.
Thanks, Aurelien Jarno! (Closes: #1086999)
-- Timo Aaltonen <tjaalton@debian.org> Sat, 09 Nov 2024 11:02:55 +0200
pixman (0.44.0-2) unstable; urgency=medium
* patches: Increase test timeout 120->240s. (Closes: #1086999)
-- Timo Aaltonen <tjaalton@debian.org> Fri, 08 Nov 2024 09:58:04 +0200
pixman (0.44.0-1) unstable; urgency=medium
* New upstream release.
* patches: Refresh patch.
* control, rules: Build with meson.
* symbols: Updated.
* control: Migrate to pkgconf.
* rules: Drop obsolete dbgsym-migration.
-- Timo Aaltonen <tjaalton@debian.org> Thu, 07 Nov 2024 16:48:29 +0200
pixman (0.42.2-1) unstable; urgency=medium
* New upstream release.
* d/p/Avoid-integer-overflow-leading-to-out-of-bounds-writ.diff:
- Removed, fixed upstream.
-- Emilio Pozuelo Monfort <pochu@debian.org> Fri, 11 Nov 2022 13:42:25 +0100
pixman (0.40.0-1.1) unstable; urgency=medium
* Non-maintainer upload.
* Avoid integer overflow leading to out-of-bounds write (CVE-2022-44638)
(Closes: #1023427)
-- Salvatore Bonaccorso <carnil@debian.org> Thu, 03 Nov 2022 23:07:46 +0100
pixman (0.40.0-1) unstable; urgency=medium
* New upstream release. (Closes: #958298, #832579, #838650)
* control, rules: Migrate to debhelper-compat, bump to 13.
* symbols: Updated, bump shlibs.
-- Timo Aaltonen <tjaalton@debian.org> Thu, 03 Dec 2020 15:28:13 +0200
pixman (0.36.0-1) unstable; urgency=medium
* New upstream release.
* Update to my Debian address.
* Update Vcs-* URLs to point to salsa.debian.org.
* Use https URL in debian/copyright.
* Set source format to 1.0.
* Bump debhelper compat to 11.
* Bump standards version to 4.2.1.
-- Andreas Boll <aboll@debian.org> Wed, 12 Dec 2018 22:02:44 +0100
pixman (0.34.0-2) unstable; urgency=medium
* Declare Multi-Arch: same for libpixman-1-dev (Closes: #884166).
* Switch to dbgsym package.
* Stop passing --disable-silent-rules to configure, debhelper does it
now.
* Bump standards version to 4.1.2.
-- Andreas Boll <andreas.boll.dev@gmail.com> Sun, 17 Dec 2017 13:33:55 +0100
pixman (0.34.0-1) unstable; urgency=medium
* Team upload.
* New upstream release (no actual changes)
* Use https URL in debian/watch.
-- Julien Cristau <jcristau@debian.org> Sat, 24 Sep 2016 13:25:16 +0200
pixman (0.33.6-1) unstable; urgency=medium
* New upstream release candidate.
* Add myself to Uploaders.
-- Andreas Boll <andreas.boll.dev@gmail.com> Thu, 14 Jan 2016 13:46:28 +0100
pixman (0.33.4-1) unstable; urgency=medium
* Team upload.
* New upstream release candidate.
-- Andreas Boll <andreas.boll.dev@gmail.com> Wed, 04 Nov 2015 13:26:18 +0100
pixman (0.33.2-2) sid; urgency=medium
* Run tests with VERBOSE=1.
-- Julien Cristau <jcristau@debian.org> Sat, 12 Sep 2015 20:31:06 +0200
pixman (0.33.2-1) sid; urgency=medium
[ Andreas Boll ]
* New upstream release candidate.
* Enable vmx on ppc64el (closes: #786345).
* Update Vcs-* fields.
* Add upstream url.
* Drop XC- prefix from Package-Type field.
* Bump standards version to 3.9.6.
[ intrigeri ]
* Simplify hardening build flags handling (closes: #760100).
Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
* Enable all hardening build flags. Thanks to Simon Ruderich too.
-- Julien Cristau <jcristau@debian.org> Sat, 12 Sep 2015 13:08:02 +0200
pixman (0.32.6-3) sid; urgency=medium
[ intrigeri ]
* Enable hardening build flags with dpkg-buildflags.
-- Julien Cristau <jcristau@debian.org> Sat, 23 Aug 2014 22:16:40 -0700
pixman (0.32.6-2) sid; urgency=medium
[ Julien Cristau ]
* Disable vmx on ppc64el (closes: #745547). Thanks, Breno Leitao!
-- Cyril Brulebois <kibi@debian.org> Mon, 18 Aug 2014 22:50:39 +0200
pixman (0.32.6-1) sid; urgency=medium
* New upstream release.
* Bump debhelper compat level to 9.
* Remove Cyril from Uploaders.
-- Julien Cristau <jcristau@debian.org> Sun, 13 Jul 2014 16:31:06 +0200
pixman (0.32.4-1) sid; urgency=low
* New upstream release.
-- Julien Cristau <jcristau@debian.org> Tue, 17 Dec 2013 22:04:15 +0100
pixman (0.30.2-2) sid; urgency=low
* Cherry-pick upstream bugfixes for fixing a crash when rendering
invalid trapezoids. (LP: #1197921)
Addresses CVE-2013-6425.
-- Maarten Lankhorst <maarten.lankhorst@ubuntu.com> Mon, 18 Nov 2013 15:08:56 +0100
pixman (0.30.2-1) sid; urgency=low
* New upstream release
- includes big-endian matrix-test fix
* Increase alpha-loop test timeout some more.
-- Julien Cristau <jcristau@debian.org> Tue, 13 Aug 2013 12:08:18 +0200
pixman (0.30.0-3) sid; urgency=low
* Increase timeout for the alpha-loop test. That will hopefully let it pass
on the mips buildd.
-- Julien Cristau <jcristau@debian.org> Sat, 03 Aug 2013 10:24:29 +0200
pixman (0.30.0-2) sid; urgency=low
* Disable silent Makefile rules.
* Disable arm iwmmxt fast paths. It breaks the build.
* Fix matrix-test on big endian (patch from Siarhei Siamashka).
-- Julien Cristau <jcristau@debian.org> Sat, 27 Jul 2013 21:40:48 +0200
pixman (0.30.0-1) sid; urgency=low
[ Maarten Lankhorst, Cyril Brulebois, Julien Cristau ]
* New upstream release.
-- Julien Cristau <jcristau@debian.org> Fri, 26 Jul 2013 14:58:25 +0200
pixman (0.26.0-4) sid; urgency=high
* Fix for CVE-2013-1591 (stack-based buffer overflow), cherry-picked from
0.27.4 (closes: #700308).
-- Julien Cristau <jcristau@debian.org> Mon, 18 Feb 2013 19:58:33 +0100
pixman (0.26.0-3) unstable; urgency=low
* Pass LS_CFLAGS=" " to configure to prevent -march=loongson2f from
being passed to gcc, which would break on loongson2e (see fdo bug
#51451). This fixes the test suite failures on mipsel, and should
avoid any crashes depending on user systems.
-- Cyril Brulebois <kibi@debian.org> Wed, 27 Jun 2012 12:11:54 +0200
pixman (0.26.0-2) unstable; urgency=low
* Cherry-pick from upstream master branch to fix FTBFS on *i386:
- da6193b1fc “mmx: add missing _mm_empty calls”
-- Cyril Brulebois <kibi@debian.org> Fri, 15 Jun 2012 01:25:20 +0200
pixman (0.26.0-1) unstable; urgency=low
* New upstream release.
-- Cyril Brulebois <kibi@debian.org> Fri, 15 Jun 2012 00:16:47 +0200
pixman (0.25.6-1) experimental; urgency=low
* New upstream release candidate.
* Remove demos/parrot.jpg before building the source package to avoid
“binary file contents changed” until it's shipped in the upstream
tarball.
-- Cyril Brulebois <kibi@debian.org> Sun, 20 May 2012 17:56:35 +0200
pixman (0.25.2-1) experimental; urgency=low
* New upstream release candidate.

1
debian/compat vendored

@ -1 +0,0 @@
8

30
debian/control vendored

@ -2,15 +2,16 @@ Source: pixman
Section: devel
Priority: optional
Maintainer: Debian X Strike Force <debian-x@lists.debian.org>
Uploaders: Cyril Brulebois <kibi@debian.org>
Uploaders: Andreas Boll <aboll@debian.org>
Build-Depends:
debhelper (>= 8.1.3),
dh-autoreconf,
pkg-config,
debhelper-compat (= 13),
meson,
pkgconf,
quilt,
Standards-Version: 3.9.2
Vcs-Git: git://git.debian.org/git/pkg-xorg/lib/pixman
Vcs-Browser: http://git.debian.org/?p=pkg-xorg/lib/pixman.git
Standards-Version: 4.2.1
Vcs-Git: https://salsa.debian.org/xorg-team/lib/pixman.git
Vcs-Browser: https://salsa.debian.org/xorg-team/lib/pixman
Homepage: http://pixman.org/
Package: libpixman-1-0
Section: libs
@ -28,7 +29,7 @@ Description: pixel-manipulation library for X and cairo
Package: libpixman-1-0-udeb
Section: debian-installer
XC-Package-Type: udeb
Package-Type: udeb
Architecture: any
Depends:
${shlibs:Depends},
@ -37,24 +38,13 @@ Description: pixel-manipulation library for X and cairo
This package contains a minimal set of libraries needed for the Debian
installer. Do not install it on a normal system.
Package: libpixman-1-0-dbg
Section: debug
Priority: extra
Architecture: any
Depends:
libpixman-1-0 (= ${binary:Version}),
${misc:Depends},
Multi-Arch: same
Description: pixel-manipulation library for X and cairo (debugging symbols)
Debugging symbols for the Cairo/X pixel manipulation library. This is
needed to debug programs linked against libpixman0.
Package: libpixman-1-dev
Section: libdevel
Architecture: any
Depends:
libpixman-1-0 (= ${binary:Version}),
${misc:Depends},
Multi-Arch: same
Description: pixel-manipulation library for X and cairo (development files)
Development libraries, header files and documentation needed by
programs that want to compile with the Cairo/X pixman library.

89
debian/copyright vendored

@ -1,47 +1,48 @@
This package was downloaded from
http://xorg.freedesktop.org/releases/individual/lib/
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: pixman
Source: https://gitlab.freedesktop.org/pixman/pixman
License: Expat
Debian packaging by Julien Cristau <jcristau@debian.org>, 18 May 2007.
Files: *
Copyright: 1987-1998 The Open Group
1987-1989 Digital Equipment Corporation
1999-2008 Keith Packard
2000 SuSE, Inc.
2000 Keith Packard, member of The XFree86 Project, Inc.
2004-2010 Red Hat, Inc.
2004 Nicholas Miell
2005 Lars Knoll & Zack Rusin, Trolltech
2005 Trolltech AS
2007 Luca Barbato
2008 Aaron Plattner, NVIDIA Corporation
2008 Rodrigo Kumpera
2008 André Tupinambá
2008 Mozilla Corporation
2008 Frederic Plourde
2009, Oracle and/or its affiliates. All rights reserved.
2009-2010 Nokia Corporation
License: Expat
The following is the MIT license, agreed upon by most contributors.
Copyright holders of new code should use this license statement where
possible. They may also add themselves to the list below.
Files: debian/*
Copyright: 2007 Julien Cristau <jcristau@debian.org>
License: Expat
/*
* Copyright 1987, 1988, 1989, 1998 The Open Group
* Copyright 1987, 1988, 1989 Digital Equipment Corporation
* Copyright 1999, 2004, 2008 Keith Packard
* Copyright 2000 SuSE, Inc.
* Copyright 2000 Keith Packard, member of The XFree86 Project, Inc.
* Copyright 2004, 2005, 2007, 2008, 2009, 2010 Red Hat, Inc.
* Copyright 2004 Nicholas Miell
* Copyright 2005 Lars Knoll & Zack Rusin, Trolltech
* Copyright 2005 Trolltech AS
* Copyright 2007 Luca Barbato
* Copyright 2008 Aaron Plattner, NVIDIA Corporation
* Copyright 2008 Rodrigo Kumpera
* Copyright 2008 André Tupinambá
* Copyright 2008 Mozilla Corporation
* Copyright 2008 Frederic Plourde
* Copyright 2009, Oracle and/or its affiliates. All rights reserved.
* Copyright 2009, 2010 Nokia Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
License: Expat
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:
.
The above copyright notice and this permission notice (including the next
paragraph) shall be included in all copies or substantial portions of the
Software.
.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.


@ -0,0 +1,2 @@
libpixman-1-0: symbols-declares-dependency-on-other-package libpixman-1-0-private


@ -1,8 +1,12 @@
libpixman-1.so.0 libpixman-1-0 #MINVER#
| libpixman-1-0-private
_pixman_internal_only_get_implementation@Base 0 1
pixman_add_trapezoids@Base 0
pixman_add_traps@Base 0
pixman_add_triangles@Base 0.21.6
pixman_blt@Base 0
pixman_composite_glyphs@Base 0.27.2
pixman_composite_glyphs_no_mask@Base 0.27.2
pixman_composite_trapezoids@Base 0.21.6
pixman_composite_triangles@Base 0.21.6
pixman_compute_composite_region@Base 0
@ -23,9 +27,22 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_f_transform_scale@Base 0.13.2
pixman_f_transform_translate@Base 0.13.2
pixman_fill@Base 0
pixman_filter_create_separable_convolution@Base 0.30.0
pixman_format_supported_destination@Base 0.15.16
pixman_format_supported_source@Base 0.15.16
pixman_glyph_cache_create@Base 0.27.2
pixman_glyph_cache_destroy@Base 0.27.2
pixman_glyph_cache_freeze@Base 0.27.2
pixman_glyph_cache_insert@Base 0.27.2
pixman_glyph_cache_lookup@Base 0.27.2
pixman_glyph_cache_remove@Base 0.27.2
pixman_glyph_cache_thaw@Base 0.27.2
pixman_glyph_get_extents@Base 0.27.2
pixman_glyph_get_mask_format@Base 0.27.2
pixman_image_composite@Base 0.15.14
pixman_image_composite32@Base 0.18.0
pixman_image_create_bits@Base 0.15.12
pixman_image_create_bits_no_clear@Base 0.27.4
pixman_image_create_conical_gradient@Base 0
pixman_image_create_linear_gradient@Base 0
pixman_image_create_radial_gradient@Base 0
@ -47,7 +64,9 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_image_set_clip_region@Base 0
pixman_image_set_component_alpha@Base 0
pixman_image_set_destroy_function@Base 0.15.12
pixman_image_set_filter@Base 0
pixman_image_set_dither@Base 0.40.0
pixman_image_set_dither_offset@Base 0.40.0
pixman_image_set_filter@Base 0.30.0
pixman_image_set_has_client_clip@Base 0
pixman_image_set_indexed@Base 0
pixman_image_set_repeat@Base 0
@ -61,6 +80,7 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_region32_contains_point@Base 0.11.2
pixman_region32_contains_rectangle@Base 0.11.2
pixman_region32_copy@Base 0.11.2
pixman_region32_empty@Base 0.44.0
pixman_region32_equal@Base 0.11.2
pixman_region32_extents@Base 0.11.2
pixman_region32_fini@Base 0.11.2
@ -85,6 +105,7 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_region_contains_point@Base 0
pixman_region_contains_rectangle@Base 0
pixman_region_copy@Base 0
pixman_region_empty@Base 0.44.0
pixman_region_equal@Base 0
pixman_region_extents@Base 0
pixman_region_fini@Base 0
@ -121,11 +142,12 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_transform_is_scale@Base 0.13.2
pixman_transform_multiply@Base 0.13.2
pixman_transform_point@Base 0.13.2
pixman_transform_point_31_16@Base 0 1
pixman_transform_point_31_16_3d@Base 0 1
pixman_transform_point_31_16_affine@Base 0 1
pixman_transform_rotate@Base 0.13.2
pixman_transform_scale@Base 0.13.2
pixman_transform_translate@Base 0.13.2
pixman_transform_point_3d@Base 0
pixman_version@Base 0.10.0
pixman_version_string@Base 0.10.0
pixman_format_supported_destination@Base 0.15.16
pixman_format_supported_source@Base 0.15.16


@ -1,4 +1,3 @@
usr/lib/*/libpixman-1.so
usr/lib/*/libpixman-1.a
usr/lib/*/pkgconfig
usr/include/pixman-1


@ -1 +1 @@
# placeholder
test-increase-timeout.diff


@ -0,0 +1,11 @@
--- a/test/alpha-loop.c
+++ b/test/alpha-loop.c
@@ -22,7 +22,7 @@ main (int argc, char **argv)
d = pixman_image_create_bits (PIXMAN_a8r8g8b8, WIDTH, HEIGHT, dest, WIDTH * 4);
s = pixman_image_create_bits (PIXMAN_a2r10g10b10, WIDTH, HEIGHT, src, WIDTH * 4);
- fail_after (5, "Infinite loop detected: 5 seconds without progress\n");
+ fail_after (50, "Infinite loop detected: 50 seconds without progress\n");
pixman_image_set_alpha_map (s, a, 0, 0);
pixman_image_set_alpha_map (a, s, 0, 0);

22
debian/rules vendored

@ -1,14 +1,16 @@
#!/usr/bin/make -f
PACKAGE = libpixman-1-0
SHLIBS = 0.25.2
SHLIBS = 0.40.0
DEB_HOST_MULTIARCH ?= $(shell dpkg-architecture -qDEB_HOST_MULTIARCH)
export DEB_BUILD_MAINT_OPTIONS = hardening=+all
# Disable Gtk+ autodetection:
override_dh_auto_configure:
dh_auto_configure -- --disable-gtk \
--libdir=\$${prefix}/lib/$(DEB_HOST_MULTIARCH)
# also avoid loongson2f optimizations on mipsel, see 0.26.0-3
# changelog entry:
LS_CFLAGS=" " dh_auto_configure -- \
-Dgtk=disabled
# Install in debian/tmp to retain control through dh_install:
override_dh_auto_install:
@ -17,16 +19,14 @@ override_dh_auto_install:
# Kill *.la files, and forget no-one:
override_dh_install:
find debian/tmp -name '*.la' -delete
dh_install --fail-missing
# Debug package:
override_dh_strip:
dh_strip -p$(PACKAGE) --dbg-package=$(PACKAGE)-dbg
dh_strip -N$(PACKAGE)
dh_install
# Shlibs:
override_dh_makeshlibs:
dh_makeshlibs -p$(PACKAGE) --add-udeb $(PACKAGE)-udeb -V"$(PACKAGE) (>= $(SHLIBS))" -- -c4
override_dh_auto_test:
dh_auto_test -- --verbose --timeout-multiplier 3
%:
dh $@ --with quilt,autoreconf --builddirectory=build/ --parallel
dh $@ --with quilt --builddirectory=build/

1
debian/source/format vendored Normal file

@ -0,0 +1 @@
1.0

2
debian/watch vendored

@ -1,3 +1,3 @@
#git=git://anongit.freedesktop.org/pixman
version=3
http://xorg.freedesktop.org/releases/individual/lib/ pixman-(.*)\.tar\.gz
https://xorg.freedesktop.org/releases/individual/lib/ pixman-(.*)\.tar\.gz


@ -1,36 +0,0 @@
if HAVE_GTK
AM_CFLAGS = $(OPENMP_CFLAGS)
AM_LDFLAGS = $(OPENMP_CFLAGS)
LDADD = $(top_builddir)/pixman/libpixman-1.la -lm $(GTK_LIBS) $(PNG_LIBS)
INCLUDES = -I$(top_srcdir)/pixman -I$(top_builddir)/pixman $(GTK_CFLAGS) $(PNG_CFLAGS)
GTK_UTILS = gtk-utils.c gtk-utils.h
DEMOS = \
clip-test \
clip-in \
composite-test \
gradient-test \
radial-test \
alpha-test \
screen-test \
convolution-test \
trap-test \
tri-test
gradient_test_SOURCES = gradient-test.c $(GTK_UTILS)
alpha_test_SOURCES = alpha-test.c $(GTK_UTILS)
composite_test_SOURCES = composite-test.c $(GTK_UTILS)
clip_test_SOURCES = clip-test.c $(GTK_UTILS)
clip_in_SOURCES = clip-in.c $(GTK_UTILS)
trap_test_SOURCES = trap-test.c $(GTK_UTILS)
screen_test_SOURCES = screen-test.c $(GTK_UTILS)
convolution_test_SOURCES = convolution-test.c $(GTK_UTILS)
radial_test_SOURCES = radial-test.c ../test/utils.c ../test/utils.h $(GTK_UTILS)
tri_test_SOURCES = tri-test.c ../test/utils.c ../test/utils.h $(GTK_UTILS)
noinst_PROGRAMS = $(DEMOS)
endif

71
demos/checkerboard.c Normal file

@ -0,0 +1,71 @@
#include <stdio.h>
#include <stdlib.h>
#include "pixman.h"
#include "gtk-utils.h"
int
main (int argc, char **argv)
{
#define WIDTH 400
#define HEIGHT 400
#define TILE_SIZE 25
pixman_image_t *checkerboard;
pixman_image_t *destination;
#define D2F(d) (pixman_double_to_fixed(d))
pixman_transform_t trans = { {
{ D2F (-1.96830), D2F (-1.82250), D2F (512.12250)},
{ D2F (0.00000), D2F (-7.29000), D2F (1458.00000)},
{ D2F (0.00000), D2F (-0.00911), D2F (0.59231)},
}};
int i, j;
checkerboard = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
NULL, 0);
destination = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
NULL, 0);
for (i = 0; i < HEIGHT / TILE_SIZE; ++i)
{
for (j = 0; j < WIDTH / TILE_SIZE; ++j)
{
double u = (double)(j + 1) / (WIDTH / TILE_SIZE);
double v = (double)(i + 1) / (HEIGHT / TILE_SIZE);
pixman_color_t black = { 0, 0, 0, 0xffff };
pixman_color_t white = {
v * 0xffff,
u * 0xffff,
(1 - (double)u) * 0xffff,
0xffff };
pixman_color_t *c;
pixman_image_t *fill;
if ((j & 1) != (i & 1))
c = &black;
else
c = &white;
fill = pixman_image_create_solid_fill (c);
pixman_image_composite (PIXMAN_OP_SRC, fill, NULL, checkerboard,
0, 0, 0, 0, j * TILE_SIZE, i * TILE_SIZE,
TILE_SIZE, TILE_SIZE);
}
}
pixman_image_set_transform (checkerboard, &trans);
pixman_image_set_filter (checkerboard, PIXMAN_FILTER_BEST, NULL, 0);
pixman_image_set_repeat (checkerboard, PIXMAN_REPEAT_NONE);
pixman_image_composite (PIXMAN_OP_SRC,
checkerboard, NULL, destination,
0, 0, 0, 0, 0, 0,
WIDTH, HEIGHT);
show_image (destination);
return 0;
}


@ -3,9 +3,10 @@
#include <stdio.h>
#include "pixman.h"
#include "gtk-utils.h"
#include "parrot.c"
#define WIDTH 60
#define HEIGHT 60
#define WIDTH 80
#define HEIGHT 80
typedef struct {
const char *name;
@ -87,26 +88,24 @@ int
main (int argc, char **argv)
{
#define d2f pixman_double_to_fixed
GtkWidget *window, *swindow;
GtkWidget *table;
uint32_t *dest = malloc (WIDTH * HEIGHT * 4);
uint32_t *src = malloc (WIDTH * HEIGHT * 4);
pixman_image_t *src_img;
pixman_image_t *gradient, *parrot;
pixman_image_t *dest_img;
pixman_point_fixed_t p1 = { -10 << 0, 0 };
pixman_point_fixed_t p2 = { WIDTH << 16, (HEIGHT - 10) << 16 };
uint16_t full = 0xcfff;
uint16_t low = 0x5000;
uint16_t alpha = 0xffff;
pixman_point_fixed_t p1 = { -10 << 16, 10 << 16 };
pixman_point_fixed_t p2 = { (WIDTH + 10) << 16, (HEIGHT - 10) << 16 };
uint16_t alpha = 0xdddd;
pixman_gradient_stop_t stops[6] =
{
{ d2f (0.0), { full, low, low, alpha } },
{ d2f (0.25), { full, full, low, alpha } },
{ d2f (0.4), { low, full, low, alpha } },
{ d2f (0.6), { low, full, full, alpha } },
{ d2f (0.8), { low, low, full, alpha } },
{ d2f (1.0), { full, low, full, alpha } },
{ d2f (0.0), { 0xf2f2, 0x8787, 0x7d7d, alpha } },
{ d2f (0.22), { 0xf3f3, 0xeaea, 0x8383, alpha } },
{ d2f (0.42), { 0x6b6b, 0xc0c0, 0x7777, alpha } },
{ d2f (0.57), { 0x4b4b, 0xc9c9, 0xf5f5, alpha } },
{ d2f (0.75), { 0x6a6a, 0x7f7f, 0xbebe, alpha } },
{ d2f (1.0), { 0xeded, 0x8282, 0xb0b0, alpha } },
};
int i;
@ -116,20 +115,20 @@ main (int argc, char **argv)
window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
gtk_window_set_default_size (GTK_WINDOW (window), 800, 600);
g_signal_connect (window, "delete-event",
G_CALLBACK (gtk_main_quit),
NULL);
table = gtk_table_new (G_N_ELEMENTS (operators) / 6, 6, TRUE);
src_img = pixman_image_create_linear_gradient (&p1, &p2, stops,
G_N_ELEMENTS (stops));
gradient = pixman_image_create_linear_gradient (&p1, &p2, stops, G_N_ELEMENTS (stops));
parrot = pixman_image_create_bits (PIXMAN_a8r8g8b8, WIDTH, HEIGHT, (uint32_t *)parrot_bits, WIDTH * 4);
pixman_image_set_repeat (gradient, PIXMAN_REPEAT_PAD);
pixman_image_set_repeat (src_img, PIXMAN_REPEAT_PAD);
dest_img = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
dest,
NULL,
WIDTH * 4);
pixman_image_set_accessors (dest_img, reader, writer);
@ -139,7 +138,6 @@ main (int argc, char **argv)
GdkPixbuf *pixbuf;
GtkWidget *vbox;
GtkWidget *label;
int j, k;
vbox = gtk_vbox_new (FALSE, 0);
@ -147,14 +145,11 @@ main (int argc, char **argv)
gtk_box_pack_start (GTK_BOX (vbox), label, FALSE, FALSE, 6);
gtk_widget_show (label);
for (j = 0; j < HEIGHT; ++j)
{
for (k = 0; k < WIDTH; ++k)
dest[j * WIDTH + k] = 0x7f6f6f00;
}
pixman_image_composite (operators[i].op, src_img, NULL, dest_img,
pixman_image_composite (PIXMAN_OP_SRC, gradient, NULL, dest_img,
0, 0, 0, 0, 0, 0, WIDTH, HEIGHT);
pixbuf = pixbuf_from_argb32 (pixman_image_get_data (dest_img), TRUE,
pixman_image_composite (operators[i].op, parrot, NULL, dest_img,
0, 0, 0, 0, 0, 0, WIDTH, HEIGHT);
pixbuf = pixbuf_from_argb32 (pixman_image_get_data (dest_img),
WIDTH, HEIGHT, WIDTH * 4);
image = gtk_image_new_from_pixbuf (pixbuf);
gtk_box_pack_start (GTK_BOX (vbox), image, FALSE, FALSE, 0);
@ -167,7 +162,7 @@ main (int argc, char **argv)
g_object_unref (pixbuf);
}
pixman_image_unref (src_img);
pixman_image_unref (gradient);
free (src);
pixman_image_unref (dest_img);
free (dest);
@ -176,7 +171,7 @@ main (int argc, char **argv)
gtk_scrolled_window_set_policy (GTK_SCROLLED_WINDOW (swindow),
GTK_POLICY_AUTOMATIC,
GTK_POLICY_AUTOMATIC);
gtk_scrolled_window_add_with_viewport (GTK_SCROLLED_WINDOW (swindow), table);
gtk_widget_show (table);

100
demos/conical-test.c Normal file
View File

@ -0,0 +1,100 @@
#include "utils.h"
#include "gtk-utils.h"
#define SIZE 128
#define GRADIENTS_PER_ROW 7
#define NUM_ROWS ((NUM_GRADIENTS + GRADIENTS_PER_ROW - 1) / GRADIENTS_PER_ROW)
#define WIDTH (SIZE * GRADIENTS_PER_ROW)
#define HEIGHT (SIZE * NUM_ROWS)
#define NUM_GRADIENTS 35
#define double_to_color(x) \
(((uint32_t) ((x)*65536)) - (((uint32_t) ((x)*65536)) >> 16))
#define PIXMAN_STOP(offset,r,g,b,a) \
{ pixman_double_to_fixed (offset), \
{ \
double_to_color (r), \
double_to_color (g), \
double_to_color (b), \
double_to_color (a) \
} \
}
static const pixman_gradient_stop_t stops[] = {
PIXMAN_STOP (0.25, 1, 0, 0, 0.7),
PIXMAN_STOP (0.5, 1, 1, 0, 0.7),
PIXMAN_STOP (0.75, 0, 1, 0, 0.7),
PIXMAN_STOP (1.0, 0, 0, 1, 0.7)
};
#define NUM_STOPS (sizeof (stops) / sizeof (stops[0]))
static pixman_image_t *
create_conical (int index)
{
pixman_point_fixed_t c;
double angle;
c.x = pixman_double_to_fixed (0);
c.y = pixman_double_to_fixed (0);
angle = (0.5 / NUM_GRADIENTS + index / (double)NUM_GRADIENTS) * 720 - 180;
return pixman_image_create_conical_gradient (
&c, pixman_double_to_fixed (angle), stops, NUM_STOPS);
}
int
main (int argc, char **argv)
{
pixman_transform_t transform;
pixman_image_t *src_img, *dest_img;
int i;
enable_divbyzero_exceptions ();
dest_img = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
NULL, 0);
draw_checkerboard (dest_img, 25, 0xffaaaaaa, 0xff888888);
pixman_transform_init_identity (&transform);
pixman_transform_translate (NULL, &transform,
pixman_double_to_fixed (0.5),
pixman_double_to_fixed (0.5));
pixman_transform_scale (NULL, &transform,
pixman_double_to_fixed (SIZE),
pixman_double_to_fixed (SIZE));
pixman_transform_translate (NULL, &transform,
pixman_double_to_fixed (0.5),
pixman_double_to_fixed (0.5));
for (i = 0; i < NUM_GRADIENTS; i++)
{
int column = i % GRADIENTS_PER_ROW;
int row = i / GRADIENTS_PER_ROW;
src_img = create_conical (i);
pixman_image_set_repeat (src_img, PIXMAN_REPEAT_NORMAL);
pixman_image_set_transform (src_img, &transform);
pixman_image_composite32 (
PIXMAN_OP_OVER, src_img, NULL,dest_img,
0, 0, 0, 0, column * SIZE, row * SIZE,
SIZE, SIZE);
pixman_image_unref (src_img);
}
show_image (dest_img);
pixman_image_unref (dest_img);
return 0;
}
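The `double_to_color` macro and the angle formula in `create_conical` can be checked in isolation. The sketch below re-implements both as plain C functions without linking pixman. The names `channel_from_double` and `conical_angle_degrees` are hypothetical and exist only for illustration; they are not part of the demo or of pixman.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-alone copy of the double_to_color macro.
 * Maps x in [0, 1] to a 16-bit channel. Subtracting (scaled >> 16)
 * makes 1.0 come out as 0xffff instead of overflowing to 0x10000. */
static uint16_t
channel_from_double (double x)
{
    uint32_t scaled;

    assert (x >= 0.0 && x <= 1.0);
    scaled = (uint32_t) (x * 65536);

    return (uint16_t) (scaled - (scaled >> 16));
}

/* Hypothetical copy of the angle expression in create_conical:
 * gradient number `index` out of `count` gets an angle in degrees
 * that sweeps two full turns, starting just above -180. */
static double
conical_angle_degrees (int index, int count)
{
    assert (count > 0 && index >= 0 && index < count);

    return (0.5 / count + index / (double) count) * 720 - 180;
}
```

With `NUM_GRADIENTS` set to 35, the first gradient lands near -169.7 degrees and the middle one (index 17) near 180 degrees, so the 35 tiles cover the 720-degree range in even steps.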

277
demos/dither.c Normal file

@ -0,0 +1,277 @@
/*
* Copyright 2012, Red Hat, Inc.
* Copyright 2012, Soren Sandmann
* Copyright 2018, Basile Clement
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
#ifdef HAVE_CONFIG_H
#include "pixman-config.h"
#endif
#include <math.h>
#include <gtk/gtk.h>
#include <stdlib.h>
#include "utils.h"
#include "gtk-utils.h"
#define WIDTH 1024
#define HEIGHT 640
typedef struct
{
GtkBuilder * builder;
pixman_image_t * original;
pixman_format_code_t format;
pixman_dither_t dither;
int width;
int height;
} app_t;
static GtkWidget *
get_widget (app_t *app, const char *name)
{
GtkWidget *widget = GTK_WIDGET (gtk_builder_get_object (app->builder, name));
if (!widget)
g_error ("Widget %s not found\n", name);
return widget;
}
typedef struct
{
char name [20];
int value;
} named_int_t;
static const named_int_t formats[] =
{
{ "a8r8g8b8", PIXMAN_a8r8g8b8 },
{ "rgb", PIXMAN_rgb_float },
{ "sRGB", PIXMAN_a8r8g8b8_sRGB },
{ "r5g6b5", PIXMAN_r5g6b5 },
{ "a4r4g4b4", PIXMAN_a4r4g4b4 },
{ "a2r2g2b2", PIXMAN_a2r2g2b2 },
{ "r3g3b2", PIXMAN_r3g3b2 },
{ "r1g2b1", PIXMAN_r1g2b1 },
{ "a1r1g1b1", PIXMAN_a1r1g1b1 },
};
static const named_int_t dithers[] =
{
{ "None", PIXMAN_DITHER_NONE },
{ "Bayer 8x8", PIXMAN_DITHER_ORDERED_BAYER_8 },
{ "Blue noise 64x64", PIXMAN_DITHER_ORDERED_BLUE_NOISE_64 },
};
static int
get_value (app_t *app, const named_int_t table[], const char *box_name)
{
GtkComboBox *box = GTK_COMBO_BOX (get_widget (app, box_name));
return table[gtk_combo_box_get_active (box)].value;
}
static void
rescale (GtkWidget *may_be_null, app_t *app)
{
app->dither = get_value (app, dithers, "dithering_combo_box");
app->format = get_value (app, formats, "target_format_combo_box");
gtk_widget_set_size_request (
get_widget (app, "drawing_area"), app->width + 0.5, app->height + 0.5);
gtk_widget_queue_draw (
get_widget (app, "drawing_area"));
}
static gboolean
on_draw (GtkWidget *widget, cairo_t *cr, gpointer user_data)
{
app_t *app = user_data;
GdkRectangle area;
cairo_surface_t *surface;
pixman_image_t *tmp, *final;
uint32_t *pixels;
gdk_cairo_get_clip_rectangle(cr, &area);
tmp = pixman_image_create_bits (
app->format, area.width, area.height, NULL, 0);
pixman_image_set_dither (tmp, app->dither);
pixman_image_composite (
PIXMAN_OP_SRC,
app->original, NULL, tmp,
area.x, area.y, 0, 0, 0, 0,
app->width - area.x,
app->height - area.y);
pixels = calloc (1, area.width * area.height * 4);
final = pixman_image_create_bits (
PIXMAN_a8r8g8b8, area.width, area.height, pixels, area.width * 4);
pixman_image_composite (
PIXMAN_OP_SRC,
tmp, NULL, final,
area.x, area.y, 0, 0, 0, 0,
app->width - area.x,
app->height - area.y);
surface = cairo_image_surface_create_for_data (
(uint8_t *)pixels, CAIRO_FORMAT_ARGB32,
area.width, area.height, area.width * 4);
cairo_set_source_surface (cr, surface, area.x, area.y);
cairo_paint (cr);
cairo_surface_destroy (surface);
free (pixels);
pixman_image_unref (final);
pixman_image_unref (tmp);
return TRUE;
}
static void
set_up_combo_box (app_t *app, const char *box_name,
int n_entries, const named_int_t table[])
{
GtkWidget *widget = get_widget (app, box_name);
GtkListStore *model;
GtkCellRenderer *cell;
int i;
model = gtk_list_store_new (1, G_TYPE_STRING);
cell = gtk_cell_renderer_text_new ();
gtk_cell_layout_pack_start (GTK_CELL_LAYOUT (widget), cell, TRUE);
gtk_cell_layout_set_attributes (GTK_CELL_LAYOUT (widget), cell,
"text", 0,
NULL);
gtk_combo_box_set_model (GTK_COMBO_BOX (widget), GTK_TREE_MODEL (model));
for (i = 0; i < n_entries; ++i)
{
const named_int_t *info = &(table[i]);
GtkTreeIter iter;
gtk_list_store_append (model, &iter);
gtk_list_store_set (model, &iter, 0, info->name, -1);
}
gtk_combo_box_set_active (GTK_COMBO_BOX (widget), 0);
g_signal_connect (widget, "changed", G_CALLBACK (rescale), app);
}
static app_t *
app_new (pixman_image_t *original)
{
GtkWidget *widget;
app_t *app = g_malloc (sizeof *app);
GError *err = NULL;
app->builder = gtk_builder_new ();
app->original = original;
if (original->type == BITS)
{
app->width = pixman_image_get_width (original);
app->height = pixman_image_get_height (original);
}
else
{
app->width = WIDTH;
app->height = HEIGHT;
}
if (!gtk_builder_add_from_file (app->builder, "dither.ui", &err))
g_error ("Could not read file dither.ui: %s", err->message);
widget = get_widget (app, "drawing_area");
g_signal_connect (widget, "draw", G_CALLBACK (on_draw), app);
set_up_combo_box (app, "target_format_combo_box",
G_N_ELEMENTS (formats), formats);
set_up_combo_box (app, "dithering_combo_box",
G_N_ELEMENTS (dithers), dithers);
app->dither = get_value (app, dithers, "dithering_combo_box");
app->format = get_value (app, formats, "target_format_combo_box");
rescale (NULL, app);
return app;
}
int
main (int argc, char **argv)
{
GtkWidget *window;
pixman_image_t *image;
app_t *app;
gtk_init (&argc, &argv);
if (argc < 2)
{
pixman_gradient_stop_t stops[] = {
/* These colors make it very obvious that dithering
* is useful even for 8-bit gradients
*/
{ 0x00000, { 0x1b1b, 0x5d5d, 0x7c7c, 0xffff } },
{ 0x10000, { 0x3838, 0x3232, 0x1010, 0xffff } },
};
pixman_point_fixed_t p1, p2;
p1.x = p1.y = 0x0000;
p2.x = WIDTH << 16;
p2.y = HEIGHT << 16;
if (!(image = pixman_image_create_linear_gradient (
&p1, &p2, stops, ARRAY_LENGTH (stops))))
{
printf ("Could not create gradient\n");
return -1;
}
}
else if (!(image = pixman_image_from_file (argv[1], PIXMAN_a8r8g8b8)))
{
printf ("Could not load image \"%s\"\n", argv[1]);
return -1;
}
app = app_new (image);
window = get_widget (app, "main");
g_signal_connect (window, "delete_event", G_CALLBACK (gtk_main_quit), NULL);
gtk_window_set_default_size (GTK_WINDOW (window), 1024, 768);
gtk_widget_show_all (window);
gtk_main ();
return 0;
}
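The comment on the default gradient stops in `main` says these colors make the benefit of dithering obvious. One rough way to see why is to count how few 8-bit levels each channel passes through. The sketch below assumes a 16-bit channel is reduced to 8 bits by keeping its top byte, which is a simplification and not necessarily what pixman does internally; `levels_8bit` and `band_width` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct 8-bit values a channel passes through when it is
 * interpolated linearly from c0 to c1 (both 16-bit), assuming the
 * 8-bit value is simply the top byte of the 16-bit one. */
static int
levels_8bit (uint16_t c0, uint16_t c1)
{
    int lo = c0 >> 8;
    int hi = c1 >> 8;

    return (hi > lo ? hi - lo : lo - hi) + 1;
}

/* Approximate width in pixels of one color band when `levels`
 * values are spread evenly over `length` pixels. */
static int
band_width (int length, int levels)
{
    assert (levels > 0);

    return length / levels;
}
```

For the red channel, 0x1b1b to 0x3838 gives 30 levels. If those were spread over 1024 pixels, each band would be about 34 pixels wide, which is wide enough to be visible as stripes when no dithering is applied.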

147
demos/dither.ui Normal file

@ -0,0 +1,147 @@
<?xml version="1.0" encoding="UTF-8"?>
<interface>
<requires lib="gtk+" version="2.12"/>
<object class="GtkWindow" id="main">
<property name="can_focus">False</property>
<child>
<placeholder/>
</child>
<child>
<object class="GtkHBox" id="u">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="spacing">12</property>
<child>
<object class="GtkScrolledWindow" id="scrolledwindow1">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="shadow_type">in</property>
<child>
<object class="GtkViewport" id="viewport1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<child>
<object class="GtkDrawingArea" id="drawing_area">
<property name="visible">True</property>
<property name="can_focus">False</property>
</object>
</child>
</object>
</child>
</object>
<packing>
<property name="expand">True</property>
<property name="fill">True</property>
<property name="position">0</property>
</packing>
</child>
<child>
<object class="GtkVBox" id="box1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="spacing">12</property>
<child>
<object class="GtkVBox" id="box6">
<property name="visible">True</property>
<property name="can_focus">False</property>
<child>
<object class="GtkTable" id="grid1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="n_rows">2</property>
<property name="n_columns">2</property>
<property name="column_spacing">8</property>
<property name="row_spacing">6</property>
<child>
<object class="GtkLabel" id="label4">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="label" translatable="yes">&lt;b&gt;Target format:&lt;/b&gt;</property>
<property name="use_markup">True</property>
<property name="xalign">1</property>
</object>
</child>
<child>
<object class="GtkLabel" id="label5">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="label" translatable="yes">&lt;b&gt;Dithering:&lt;/b&gt;</property>
<property name="use_markup">True</property>
<property name="xalign">1</property>
</object>
<packing>
<property name="top_attach">1</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="target_format_combo_box">
<property name="visible">True</property>
<property name="can_focus">False</property>
</object>
<packing>
<property name="left_attach">1</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="dithering_combo_box">
<property name="visible">True</property>
<property name="can_focus">False</property>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">1</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="fill">True</property>
<property name="padding">6</property>
<property name="position">1</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="fill">True</property>
<property name="position">0</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="fill">True</property>
<property name="position">1</property>
</packing>
</child>
</object>
</child>
</object>
<object class="GtkAdjustment" id="rotate_adjustment">
<property name="lower">-180</property>
<property name="upper">190</property>
<property name="step_increment">1</property>
<property name="page_increment">10</property>
<property name="page_size">10</property>
</object>
<object class="GtkAdjustment" id="scale_x_adjustment">
<property name="lower">-32</property>
<property name="upper">42</property>
<property name="step_increment">1</property>
<property name="page_increment">10</property>
<property name="page_size">10</property>
</object>
<object class="GtkAdjustment" id="scale_y_adjustment">
<property name="lower">-32</property>
<property name="upper">42</property>
<property name="step_increment">1</property>
<property name="page_increment">10</property>
<property name="page_size">10</property>
</object>
<object class="GtkAdjustment" id="subsample_adjustment">
<property name="upper">12</property>
<property name="value">4</property>
<property name="step_increment">1</property>
<property name="page_increment">1</property>
</object>
</interface>


@ -1,13 +1,78 @@
#include <gtk/gtk.h>
#include <config.h>
#include "pixman-private.h" /* For image->bits.format
* FIXME: there should probably be public API for this
*/
#ifdef HAVE_CONFIG_H
#include <pixman-config.h>
#endif
#include "utils.h"
#include "gtk-utils.h"
pixman_image_t *
pixman_image_from_file (const char *filename, pixman_format_code_t format)
{
GdkPixbuf *pixbuf;
pixman_image_t *image;
int width, height;
uint32_t *data, *d;
uint8_t *gdk_data;
int n_channels;
int j, i;
int stride;
if (!(pixbuf = gdk_pixbuf_new_from_file (filename, NULL)))
return NULL;
image = NULL;
width = gdk_pixbuf_get_width (pixbuf);
height = gdk_pixbuf_get_height (pixbuf);
n_channels = gdk_pixbuf_get_n_channels (pixbuf);
gdk_data = gdk_pixbuf_get_pixels (pixbuf);
stride = gdk_pixbuf_get_rowstride (pixbuf);
if (!(data = malloc (width * height * sizeof (uint32_t))))
goto out;
d = data;
for (j = 0; j < height; ++j)
{
uint8_t *gdk_line = gdk_data;
for (i = 0; i < width; ++i)
{
int r, g, b, a;
uint32_t pixel;
r = gdk_line[0];
g = gdk_line[1];
b = gdk_line[2];
if (n_channels == 4)
a = gdk_line[3];
else
a = 0xff;
r = (r * a + 127) / 255;
g = (g * a + 127) / 255;
b = (b * a + 127) / 255;
pixel = ((uint32_t) a << 24) | (r << 16) | (g << 8) | b;
*d++ = pixel;
gdk_line += n_channels;
}
gdk_data += stride;
}
image = pixman_image_create_bits (
format, width, height, data, width * 4);
out:
g_object_unref (pixbuf);
return image;
}
GdkPixbuf *
pixbuf_from_argb32 (uint32_t *bits,
gboolean has_alpha,
int width,
int height,
int stride)
@ -16,56 +81,43 @@ pixbuf_from_argb32 (uint32_t *bits,
8, width, height);
int p_stride = gdk_pixbuf_get_rowstride (pixbuf);
guint32 *p_bits = (guint32 *)gdk_pixbuf_get_pixels (pixbuf);
int w, h;
for (h = 0; h < height; ++h)
int i;
for (i = 0; i < height; ++i)
{
for (w = 0; w < width; ++w)
{
uint32_t argb = bits[h * (stride / 4) + w];
guint r, g, b, a;
char *pb = (char *)p_bits;
uint32_t *src_row = &bits[i * (stride / 4)];
uint32_t *dst_row = p_bits + i * (p_stride / 4);
pb += h * p_stride + w * 4;
r = (argb & 0x00ff0000) >> 16;
g = (argb & 0x0000ff00) >> 8;
b = (argb & 0x000000ff) >> 0;
a = has_alpha? (argb & 0xff000000) >> 24 : 0xff;
if (a)
{
r = (r * 255) / a;
g = (g * 255) / a;
b = (b * 255) / a;
}
if (r > 255) r = 255;
if (g > 255) g = 255;
if (b > 255) b = 255;
pb[0] = r;
pb[1] = g;
pb[2] = b;
pb[3] = a;
}
a8r8g8b8_to_rgba_np (dst_row, src_row, width);
}
return pixbuf;
}
static gboolean
on_expose (GtkWidget *widget, GdkEventExpose *expose, gpointer data)
on_draw (GtkWidget *widget, cairo_t *cr, gpointer user_data)
{
GdkPixbuf *pixbuf = data;
gdk_draw_pixbuf (widget->window, NULL,
pixbuf, 0, 0, 0, 0,
gdk_pixbuf_get_width (pixbuf),
gdk_pixbuf_get_height (pixbuf),
GDK_RGB_DITHER_NONE,
0, 0);
pixman_image_t *pimage = user_data;
int width = pixman_image_get_width (pimage);
int height = pixman_image_get_height (pimage);
int stride = pixman_image_get_stride (pimage);
cairo_surface_t *cimage;
cairo_format_t format;
if (pixman_image_get_format (pimage) == PIXMAN_x8r8g8b8)
format = CAIRO_FORMAT_RGB24;
else
format = CAIRO_FORMAT_ARGB32;
cimage = cairo_image_surface_create_for_data (
(uint8_t *)pixman_image_get_data (pimage),
format, width, height, stride);
cairo_rectangle (cr, 0, 0, width, height);
cairo_set_source_surface (cr, cimage, 0, 0);
cairo_fill (cr);
cairo_surface_destroy (cimage);
return TRUE;
}
@ -74,13 +126,12 @@ void
show_image (pixman_image_t *image)
{
GtkWidget *window;
GdkPixbuf *pixbuf;
int width, height, stride;
int width, height;
int argc;
char **argv;
char *arg0 = g_strdup ("pixman-test-program");
gboolean has_alpha;
pixman_format_code_t format;
pixman_image_t *copy;
argc = 1;
argv = (char **)&arg0;
@ -90,23 +141,34 @@ show_image (pixman_image_t *image)
window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
width = pixman_image_get_width (image);
height = pixman_image_get_height (image);
stride = pixman_image_get_stride (image);
gtk_window_set_default_size (GTK_WINDOW (window), width, height);
format = image->bits.format;
if (format == PIXMAN_a8r8g8b8)
has_alpha = TRUE;
else if (format == PIXMAN_x8r8g8b8)
has_alpha = FALSE;
else
g_error ("Can't deal with this format: %x\n", format);
pixbuf = pixbuf_from_argb32 (pixman_image_get_data (image), has_alpha,
width, height, stride);
g_signal_connect (window, "expose_event", G_CALLBACK (on_expose), pixbuf);
format = pixman_image_get_format (image);
/* We always display the image as if it contains sRGB data. That
* means that no conversion should take place when the image
* has the a8r8g8b8_sRGB format.
*/
switch (format)
{
case PIXMAN_a8r8g8b8_sRGB:
case PIXMAN_a8r8g8b8:
case PIXMAN_x8r8g8b8:
copy = pixman_image_ref (image);
break;
default:
copy = pixman_image_create_bits (PIXMAN_a8r8g8b8,
width, height, NULL, -1);
pixman_image_composite32 (PIXMAN_OP_SRC,
image, NULL, copy,
0, 0, 0, 0, 0, 0,
width, height);
break;
}
g_signal_connect (window, "draw", G_CALLBACK (on_draw), copy);
g_signal_connect (window, "delete_event", G_CALLBACK (gtk_main_quit), NULL);
gtk_widget_show (window);
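The inner loop of `pixman_image_from_file` converts straight (non-premultiplied) RGBA bytes from GdkPixbuf into premultiplied a8r8g8b8 pixels. A minimal stand-alone sketch of that per-pixel step follows; the function name `premultiply_to_argb32` is hypothetical, and the alpha value is widened to `uint32_t` before shifting so that the shift stays well defined for alpha values of 128 and above.

```c
#include <assert.h>
#include <stdint.h>

/* Pack straight 8-bit RGBA into a premultiplied a8r8g8b8 pixel.
 * Each color is scaled by alpha with (c * a + 127) / 255, which
 * rounds rather than truncates, as in pixman_image_from_file. */
static uint32_t
premultiply_to_argb32 (uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    uint32_t pr = (r * a + 127u) / 255u;
    uint32_t pg = (g * a + 127u) / 255u;
    uint32_t pb = (b * a + 127u) / 255u;

    /* Premultiplied channels can never exceed alpha. */
    assert (pr <= a && pg <= a && pb <= a);

    return ((uint32_t) a << 24) | (pr << 16) | (pg << 8) | pb;
}
```

An opaque white pixel stays 0xffffffff, and any pixel with alpha 0 collapses to 0x00000000 regardless of its color bytes.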


@ -6,8 +6,10 @@
void show_image (pixman_image_t *image);
pixman_image_t *
pixman_image_from_file (const char *filename, pixman_format_code_t format);
GdkPixbuf *pixbuf_from_argb32 (uint32_t *bits,
gboolean has_alpha,
int width,
int height,
int stride);

50
demos/linear-gradient.c Normal file
View File

@ -0,0 +1,50 @@
#include "utils.h"
#include "gtk-utils.h"
#define WIDTH 1024
#define HEIGHT 640
int
main (int argc, char **argv)
{
pixman_image_t *src_img, *dest_img;
pixman_gradient_stop_t stops[] = {
{ 0x00000, { 0x0000, 0x0000, 0x4444, 0xdddd } },
{ 0x10000, { 0xeeee, 0xeeee, 0x8888, 0xdddd } },
#if 0
/* These colors make it very obvious that dithering
* is useful even for 8-bit gradients
*/
{ 0x00000, { 0x6666, 0x3333, 0x3333, 0xffff } },
{ 0x10000, { 0x3333, 0x6666, 0x6666, 0xffff } },
#endif
};
pixman_point_fixed_t p1, p2;
enable_divbyzero_exceptions ();
dest_img = pixman_image_create_bits (PIXMAN_x8r8g8b8,
WIDTH, HEIGHT,
NULL, 0);
p1.x = p1.y = 0x0000;
p2.x = WIDTH << 16;
p2.y = HEIGHT << 16;
src_img = pixman_image_create_linear_gradient (&p1, &p2, stops, ARRAY_LENGTH (stops));
pixman_image_composite32 (PIXMAN_OP_OVER,
src_img,
NULL,
dest_img,
0, 0,
0, 0,
0, 0,
WIDTH, HEIGHT);
show_image (dest_img);
pixman_image_unref (src_img);
pixman_image_unref (dest_img);
return 0;
}
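The gradient endpoints and stop offsets in this demo are written directly as 16.16 fixed-point values: `WIDTH << 16` is the integer 1024 in fixed point, and the stop offset `0x10000` is 1.0. The sketch below shows that encoding without depending on pixman; `fixed_16_16`, `fixed_from_int` and `fixed_to_double` are hypothetical stand-ins written for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16.16 fixed-point type:
 * upper 16 bits hold the integer part, lower 16 bits the fraction. */
typedef int32_t fixed_16_16;

/* Convert an integer to 16.16; equivalent to i << 16 for values
 * that fit in the signed 16-bit integer part. */
static fixed_16_16
fixed_from_int (int i)
{
    assert (i >= -32768 && i <= 32767);

    return (fixed_16_16) (i * 65536);
}

/* Convert a 16.16 value back to a double. */
static double
fixed_to_double (fixed_16_16 f)
{
    return f / 65536.0;
}
```

With `WIDTH` at 1024 and `HEIGHT` at 640, the end point `p2` is (0x4000000, 0x2800000) in this representation.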

66
demos/meson.build Normal file

@ -0,0 +1,66 @@
# Copyright © 2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
extra_demo_cflags = []
if cc.get_argument_syntax() == 'msvc'
extra_demo_cflags = ['-D_USE_MATH_DEFINES']
endif
demos = [
'gradient-test',
'alpha-test',
'composite-test',
'clip-test',
'trap-test',
'screen-test',
'convolution-test',
'radial-test',
'linear-gradient',
'conical-test',
'tri-test',
'checkerboard',
'srgb-test',
'srgb-trap-test',
'scale',
'dither',
]
if dep_gtk.found()
libdemo = static_library(
'demo',
['gtk-utils.c', config_h, version_h],
dependencies : [libtestutils_dep, dep_gtk, dep_glib, dep_png, dep_m, dep_openmp],
include_directories : inc_pixman,
)
foreach d : demos
executable(
d,
[d + '.c', config_h, version_h],
c_args : extra_demo_cflags,
link_with : [libdemo],
dependencies : [idep_pixman, libtestutils_dep, dep_glib, dep_gtk, dep_openmp, dep_png],
)
endforeach
endif

1079
demos/parrot.c Normal file

File diff suppressed because it is too large

BIN
demos/parrot.jpg Normal file

Binary file not shown. Size: 71 KiB

2183
demos/quad2quad.c Normal file

File diff suppressed because it is too large


@ -1,7 +1,7 @@
#include "../test/utils.h"
#include "utils.h"
#include "gtk-utils.h"
#define NUM_GRADIENTS 7
#define NUM_GRADIENTS 9
#define NUM_STOPS 3
#define NUM_REPEAT 4
#define SIZE 128
@ -28,6 +28,9 @@
* centers (0, 0) and (1, 0), but with different radiuses. From left
* to right:
*
* - Degenerate start circle completely inside the end circle
* 0.00 -> 1.75; dr = 1.75 > 0; a = 1 - 1.75^2 < 0
*
* - Small start circle completely inside the end circle
* 0.25 -> 1.75; dr = 1.5 > 0; a = 1 - 1.50^2 < 0
*
@ -49,15 +52,20 @@
* - Small end circle completely inside the start circle
* 1.75 -> 0.25; dr = -1.5 < 0; a = 1 - 1.50^2 < 0
*
* - Degenerate end circle completely inside the start circle
* 1.75 -> 0.00; dr = -1.75 < 0; a = 1 - 1.75^2 < 0
*
*/
const static double radiuses[NUM_GRADIENTS] = {
0.00,
0.25,
0.50,
0.50,
1.00,
1.00,
1.50,
1.75,
1.75
};
@ -133,12 +141,14 @@ main (int argc, char **argv)
pixman_image_t *src_img, *dest_img;
int i, j;
enable_fp_exceptions ();
enable_divbyzero_exceptions ();
dest_img = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
NULL, 0);
draw_checkerboard (dest_img, 25, 0xffaaaaaa, 0xffbbbbbb);
pixman_transform_init_identity (&transform);
/*

454
demos/scale.c Normal file

@ -0,0 +1,454 @@
/*
* Copyright 2012, Red Hat, Inc.
* Copyright 2012, Soren Sandmann
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*
* Author: Soren Sandmann <soren.sandmann@gmail.com>
*/
#ifdef HAVE_CONFIG_H
#include "pixman-config.h"
#endif
#include <math.h>
#include <gtk/gtk.h>
#include <pixman.h>
#include <stdlib.h>
#include "gtk-utils.h"
typedef struct
{
GtkBuilder * builder;
pixman_image_t * original;
GtkAdjustment * scale_x_adjustment;
GtkAdjustment * scale_y_adjustment;
GtkAdjustment * rotate_adjustment;
GtkAdjustment * subsample_adjustment;
int scaled_width;
int scaled_height;
} app_t;
static GtkWidget *
get_widget (app_t *app, const char *name)
{
GtkWidget *widget = GTK_WIDGET (gtk_builder_get_object (app->builder, name));
if (!widget)
g_error ("Widget %s not found\n", name);
return widget;
}
/* Figure out the boundary of a diameter=1 circle transformed into an ellipse
* by trans. Proof that this is the correct calculation:
*
* Transform x,y to u,v by this matrix calculation:
*
* |u| |a c| |x|
* |v| = |b d|*|y|
*
* Horizontal component:
*
* u = ax+cy (1)
*
* For each x,y on a radius-1 circle (p is angle to the point):
*
* x^2+y^2 = 1
* x = cos(p)
* y = sin(p)
* dx/dp = -sin(p) = -y
* dy/dp = cos(p) = x
*
* Figure out derivative of (1) relative to p:
*
* du/dp = a(dx/dp) + c(dy/dp)
* = -ay + cx
*
* The min and max u are when du/dp is zero:
*
* -ay + cx = 0
* cx = ay
* c = ay/x (2)
* y = cx/a (3)
*
* Substitute (2) into (1) and simplify:
*
* u = ax + ay^2/x
* = a(x^2+y^2)/x
* = a/x (because x^2+y^2 = 1)
* x = a/u (4)
*
* Substitute (4) into (3) and simplify:
*
* y = c(a/u)/a
* y = c/u (5)
*
* Square (4) and (5) and add:
*
* x^2+y^2 = (a^2+c^2)/u^2
*
* But x^2+y^2 is 1:
*
* 1 = (a^2+c^2)/u^2
* u^2 = a^2+c^2
* u = hypot(a,c)
*
* Similarly, the max/min of v is at:
*
* v = hypot(b,d)
*
*/
static void
compute_extents (pixman_f_transform_t *trans, double *sx, double *sy)
{
*sx = hypot (trans->m[0][0], trans->m[0][1]) / trans->m[2][2];
*sy = hypot (trans->m[1][0], trans->m[1][1]) / trans->m[2][2];
}
typedef struct
{
char name [20];
int value;
} named_int_t;
static const named_int_t filters[] =
{
{ "Box", PIXMAN_KERNEL_BOX },
{ "Impulse", PIXMAN_KERNEL_IMPULSE },
{ "Linear", PIXMAN_KERNEL_LINEAR },
{ "Cubic", PIXMAN_KERNEL_CUBIC },
{ "Lanczos2", PIXMAN_KERNEL_LANCZOS2 },
{ "Lanczos3", PIXMAN_KERNEL_LANCZOS3 },
{ "Lanczos3 Stretched", PIXMAN_KERNEL_LANCZOS3_STRETCHED },
{ "Gaussian", PIXMAN_KERNEL_GAUSSIAN },
};
static const named_int_t repeats[] =
{
{ "None", PIXMAN_REPEAT_NONE },
{ "Normal", PIXMAN_REPEAT_NORMAL },
{ "Reflect", PIXMAN_REPEAT_REFLECT },
{ "Pad", PIXMAN_REPEAT_PAD },
};
static int
get_value (app_t *app, const named_int_t table[], const char *box_name)
{
GtkComboBox *box = GTK_COMBO_BOX (get_widget (app, box_name));
return table[gtk_combo_box_get_active (box)].value;
}
static void
copy_to_counterpart (app_t *app, GObject *object)
{
static const char *xy_map[] =
{
"reconstruct_x_combo_box", "reconstruct_y_combo_box",
"sample_x_combo_box", "sample_y_combo_box",
"scale_x_adjustment", "scale_y_adjustment",
};
GObject *counterpart = NULL;
int i;
for (i = 0; i < G_N_ELEMENTS (xy_map); i += 2)
{
GObject *x = gtk_builder_get_object (app->builder, xy_map[i]);
GObject *y = gtk_builder_get_object (app->builder, xy_map[i + 1]);
if (object == x)
counterpart = y;
if (object == y)
counterpart = x;
}
if (!counterpart)
return;
if (GTK_IS_COMBO_BOX (counterpart))
{
gtk_combo_box_set_active (
GTK_COMBO_BOX (counterpart),
gtk_combo_box_get_active (
GTK_COMBO_BOX (object)));
}
else if (GTK_IS_ADJUSTMENT (counterpart))
{
gtk_adjustment_set_value (
GTK_ADJUSTMENT (counterpart),
gtk_adjustment_get_value (
GTK_ADJUSTMENT (object)));
}
}
static double
to_scale (double v)
{
return pow (1.15, v);
}
static void
rescale (GtkWidget *may_be_null, app_t *app)
{
pixman_f_transform_t ftransform;
pixman_transform_t transform;
double new_width, new_height;
double fscale_x, fscale_y;
double rotation;
pixman_fixed_t *params;
int n_params;
double sx, sy;
pixman_f_transform_init_identity (&ftransform);
if (may_be_null && gtk_toggle_button_get_active (
GTK_TOGGLE_BUTTON (get_widget (app, "lock_checkbutton"))))
{
copy_to_counterpart (app, G_OBJECT (may_be_null));
}
fscale_x = gtk_adjustment_get_value (app->scale_x_adjustment);
fscale_y = gtk_adjustment_get_value (app->scale_y_adjustment);
rotation = gtk_adjustment_get_value (app->rotate_adjustment);
fscale_x = to_scale (fscale_x);
fscale_y = to_scale (fscale_y);
new_width = pixman_image_get_width (app->original) * fscale_x;
new_height = pixman_image_get_height (app->original) * fscale_y;
pixman_f_transform_scale (&ftransform, NULL, fscale_x, fscale_y);
pixman_f_transform_translate (&ftransform, NULL, - new_width / 2.0, - new_height / 2.0);
rotation = (rotation / 360.0) * 2 * M_PI;
pixman_f_transform_rotate (&ftransform, NULL, cos (rotation), sin (rotation));
pixman_f_transform_translate (&ftransform, NULL, new_width / 2.0, new_height / 2.0);
pixman_f_transform_invert (&ftransform, &ftransform);
compute_extents (&ftransform, &sx, &sy);
pixman_transform_from_pixman_f_transform (&transform, &ftransform);
pixman_image_set_transform (app->original, &transform);
params = pixman_filter_create_separable_convolution (
&n_params,
sx * 65536.0 + 0.5,
sy * 65536.0 + 0.5,
get_value (app, filters, "reconstruct_x_combo_box"),
get_value (app, filters, "reconstruct_y_combo_box"),
get_value (app, filters, "sample_x_combo_box"),
get_value (app, filters, "sample_y_combo_box"),
gtk_adjustment_get_value (app->subsample_adjustment),
gtk_adjustment_get_value (app->subsample_adjustment));
pixman_image_set_filter (app->original, PIXMAN_FILTER_SEPARABLE_CONVOLUTION, params, n_params);
pixman_image_set_repeat (
app->original, get_value (app, repeats, "repeat_combo_box"));
free (params);
app->scaled_width = ceil (new_width);
app->scaled_height = ceil (new_height);
gtk_widget_set_size_request (
get_widget (app, "drawing_area"), new_width + 0.5, new_height + 0.5);
gtk_widget_queue_draw (
get_widget (app, "drawing_area"));
}
static gboolean
on_draw (GtkWidget *widget, cairo_t *cr, gpointer user_data)
{
app_t *app = user_data;
GdkRectangle area;
cairo_surface_t *surface;
pixman_image_t *tmp;
uint32_t *pixels;
gdk_cairo_get_clip_rectangle(cr, &area);
pixels = calloc (1, area.width * area.height * 4);
tmp = pixman_image_create_bits (
PIXMAN_a8r8g8b8, area.width, area.height, pixels, area.width * 4);
if (area.x < app->scaled_width && area.y < app->scaled_height)
{
pixman_image_composite (
PIXMAN_OP_SRC,
app->original, NULL, tmp,
area.x, area.y, 0, 0, 0, 0,
app->scaled_width - area.x, app->scaled_height - area.y);
}
surface = cairo_image_surface_create_for_data (
(uint8_t *)pixels, CAIRO_FORMAT_ARGB32,
area.width, area.height, area.width * 4);
cairo_set_source_surface (cr, surface, area.x, area.y);
cairo_paint (cr);
cairo_surface_destroy (surface);
free (pixels);
pixman_image_unref (tmp);
return TRUE;
}
static void
set_up_combo_box (app_t *app, const char *box_name,
int n_entries, const named_int_t table[])
{
GtkWidget *widget = get_widget (app, box_name);
GtkListStore *model;
GtkCellRenderer *cell;
int i;
model = gtk_list_store_new (1, G_TYPE_STRING);
cell = gtk_cell_renderer_text_new ();
gtk_cell_layout_pack_start (GTK_CELL_LAYOUT (widget), cell, TRUE);
gtk_cell_layout_set_attributes (GTK_CELL_LAYOUT (widget), cell,
"text", 0,
NULL);
gtk_combo_box_set_model (GTK_COMBO_BOX (widget), GTK_TREE_MODEL (model));
for (i = 0; i < n_entries; ++i)
{
const named_int_t *info = &(table[i]);
GtkTreeIter iter;
gtk_list_store_append (model, &iter);
gtk_list_store_set (model, &iter, 0, info->name, -1);
}
gtk_combo_box_set_active (GTK_COMBO_BOX (widget), 0);
g_signal_connect (widget, "changed", G_CALLBACK (rescale), app);
}
static void
set_up_filter_box (app_t *app, const char *box_name)
{
set_up_combo_box (app, box_name, G_N_ELEMENTS (filters), filters);
}
static char *
format_value (GtkWidget *widget, double value)
{
return g_strdup_printf ("%.4f", to_scale (value));
}
static app_t *
app_new (pixman_image_t *original)
{
GtkWidget *widget;
app_t *app = g_malloc (sizeof *app);
GError *err = NULL;
app->builder = gtk_builder_new ();
app->original = original;
if (!gtk_builder_add_from_file (app->builder, "scale.ui", &err))
g_error ("Could not read file scale.ui: %s", err->message);
app->scale_x_adjustment =
GTK_ADJUSTMENT (gtk_builder_get_object (app->builder, "scale_x_adjustment"));
app->scale_y_adjustment =
GTK_ADJUSTMENT (gtk_builder_get_object (app->builder, "scale_y_adjustment"));
app->rotate_adjustment =
GTK_ADJUSTMENT (gtk_builder_get_object (app->builder, "rotate_adjustment"));
app->subsample_adjustment =
GTK_ADJUSTMENT (gtk_builder_get_object (app->builder, "subsample_adjustment"));
g_signal_connect (app->scale_x_adjustment, "value_changed", G_CALLBACK (rescale), app);
g_signal_connect (app->scale_y_adjustment, "value_changed", G_CALLBACK (rescale), app);
g_signal_connect (app->rotate_adjustment, "value_changed", G_CALLBACK (rescale), app);
g_signal_connect (app->subsample_adjustment, "value_changed", G_CALLBACK (rescale), app);
widget = get_widget (app, "scale_x_scale");
gtk_scale_add_mark (GTK_SCALE (widget), 0.0, GTK_POS_LEFT, NULL);
g_signal_connect (widget, "format_value", G_CALLBACK (format_value), app);
widget = get_widget (app, "scale_y_scale");
gtk_scale_add_mark (GTK_SCALE (widget), 0.0, GTK_POS_LEFT, NULL);
g_signal_connect (widget, "format_value", G_CALLBACK (format_value), app);
widget = get_widget (app, "rotate_scale");
gtk_scale_add_mark (GTK_SCALE (widget), 0.0, GTK_POS_LEFT, NULL);
widget = get_widget (app, "drawing_area");
g_signal_connect (widget, "draw", G_CALLBACK (on_draw), app);
set_up_filter_box (app, "reconstruct_x_combo_box");
set_up_filter_box (app, "reconstruct_y_combo_box");
set_up_filter_box (app, "sample_x_combo_box");
set_up_filter_box (app, "sample_y_combo_box");
set_up_combo_box (
app, "repeat_combo_box", G_N_ELEMENTS (repeats), repeats);
g_signal_connect (
gtk_builder_get_object (app->builder, "lock_checkbutton"),
"toggled", G_CALLBACK (rescale), app);
rescale (NULL, app);
return app;
}
int
main (int argc, char **argv)
{
GtkWidget *window;
pixman_image_t *image;
app_t *app;
gtk_init (&argc, &argv);
if (argc < 2)
{
printf ("%s <image file>\n", argv[0]);
return -1;
}
if (!(image = pixman_image_from_file (argv[1], PIXMAN_a8r8g8b8)))
{
printf ("Could not load image \"%s\"\n", argv[1]);
return -1;
}
app = app_new (image);
window = get_widget (app, "main");
g_signal_connect (window, "delete_event", G_CALLBACK (gtk_main_quit), NULL);
gtk_window_set_default_size (GTK_WINDOW (window), 1024, 768);
gtk_widget_show_all (window);
gtk_main ();
return 0;
}

334
demos/scale.ui Normal file

@@ -0,0 +1,334 @@
<?xml version="1.0" encoding="UTF-8"?>
<interface>
<!-- interface-requires gtk+ 2.12 -->
<!-- interface-naming-policy toplevel-contextual -->
<object class="GtkAdjustment" id="rotate_adjustment">
<property name="lower">-180</property>
<property name="upper">190</property>
<property name="step_increment">1</property>
<property name="page_increment">10</property>
<property name="page_size">10</property>
</object>
<object class="GtkAdjustment" id="scale_y_adjustment">
<property name="lower">-32</property>
<property name="upper">42</property>
<property name="step_increment">1</property>
<property name="page_increment">10</property>
<property name="page_size">10</property>
</object>
<object class="GtkAdjustment" id="scale_x_adjustment">
<property name="lower">-32</property>
<property name="upper">42</property>
<property name="step_increment">1</property>
<property name="page_increment">10</property>
<property name="page_size">10</property>
</object>
<object class="GtkAdjustment" id="subsample_adjustment">
<property name="lower">0</property>
<property name="upper">12</property>
<property name="step_increment">1</property>
<property name="page_increment">1</property>
<property name="page_size">0</property>
<property name="value">4</property>
</object>
<object class="GtkWindow" id="main">
<child>
<object class="GtkHBox" id="u">
<property name="visible">True</property>
<property name="spacing">12</property>
<child>
<object class="GtkScrolledWindow" id="scrolledwindow1">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="shadow_type">in</property>
<child>
<object class="GtkViewport" id="viewport1">
<property name="visible">True</property>
<child>
<object class="GtkDrawingArea" id="drawing_area">
<property name="visible">True</property>
</object>
</child>
</object>
</child>
</object>
<packing>
<property name="position">0</property>
</packing>
</child>
<child>
<object class="GtkVBox" id="box1">
<property name="visible">True</property>
<property name="spacing">12</property>
<child>
<object class="GtkHBox" id="box2">
<property name="visible">True</property>
<property name="homogeneous">True</property>
<child>
<object class="GtkVBox" id="box3">
<property name="visible">True</property>
<property name="spacing">6</property>
<child>
<object class="GtkLabel" id="label1">
<property name="visible">True</property>
<property name="label" translatable="yes">&lt;b&gt;Scale X&lt;/b&gt;</property>
<property name="use_markup">True</property>
</object>
<packing>
<property name="expand">False</property>
<property name="position">0</property>
</packing>
</child>
<child>
<object class="GtkVScale" id="scale_x_scale">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="adjustment">scale_x_adjustment</property>
<property name="fill_level">32</property>
<property name="value_pos">right</property>
</object>
<packing>
<property name="position">1</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="position">0</property>
</packing>
</child>
<child>
<object class="GtkVBox" id="box4">
<property name="visible">True</property>
<property name="spacing">6</property>
<child>
<object class="GtkLabel" id="label2">
<property name="visible">True</property>
<property name="label" translatable="yes">&lt;b&gt;Scale Y&lt;/b&gt;</property>
<property name="use_markup">True</property>
</object>
<packing>
<property name="expand">False</property>
<property name="position">0</property>
</packing>
</child>
<child>
<object class="GtkVScale" id="scale_y_scale">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="adjustment">scale_y_adjustment</property>
<property name="fill_level">32</property>
<property name="value_pos">right</property>
</object>
<packing>
<property name="position">1</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="position">1</property>
</packing>
</child>
<child>
<object class="GtkVBox" id="box5">
<property name="visible">True</property>
<property name="spacing">6</property>
<child>
<object class="GtkLabel" id="label3">
<property name="visible">True</property>
<property name="label" translatable="yes">&lt;b&gt;Rotate&lt;/b&gt;</property>
<property name="use_markup">True</property>
</object>
<packing>
<property name="expand">False</property>
<property name="position">0</property>
</packing>
</child>
<child>
<object class="GtkVScale" id="rotate_scale">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="adjustment">rotate_adjustment</property>
<property name="fill_level">180</property>
<property name="value_pos">right</property>
</object>
<packing>
<property name="position">1</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="position">2</property>
</packing>
</child>
</object>
<packing>
<property name="padding">6</property>
<property name="position">0</property>
</packing>
</child>
<child>
<object class="GtkVBox" id="box6">
<property name="visible">True</property>
<child>
<object class="GtkCheckButton"
id="lock_checkbutton">
<property name="label" translatable="yes">Lock X and Y Dimensions</property>
<property name="xalign">0.0</property>
<property name="active">True</property>
</object>
<packing>
<property name="expand">False</property>
<property name="fill">False</property>
<property name="padding">6</property>
<property name="position">1</property>
</packing>
</child>
<child>
<object class="GtkTable" id="grid1">
<property name="visible">True</property>
<property name="column_spacing">8</property>
<property name="row_spacing">6</property>
<child>
<object class="GtkLabel" id="label4">
<property name="visible">True</property>
<property name="xalign">1</property>
<property name="label" translatable="yes">&lt;b&gt;Reconstruct X:&lt;/b&gt;</property>
<property name="use_markup">True</property>
</object>
</child>
<child>
<object class="GtkLabel" id="label5">
<property name="visible">True</property>
<property name="xalign">1</property>
<property name="label" translatable="yes">&lt;b&gt;Reconstruct Y:&lt;/b&gt;</property>
<property name="use_markup">True</property>
</object>
<packing>
<property name="top_attach">1</property>
</packing>
</child>
<child>
<object class="GtkLabel" id="label6">
<property name="visible">True</property>
<property name="xalign">1</property>
<property name="label" translatable="yes">&lt;b&gt;Sample X:&lt;/b&gt;</property>
<property name="use_markup">True</property>
</object>
<packing>
<property name="top_attach">2</property>
</packing>
</child>
<child>
<object class="GtkLabel" id="label7">
<property name="visible">True</property>
<property name="xalign">1</property>
<property name="label" translatable="yes">&lt;b&gt;Sample Y:&lt;/b&gt;</property>
<property name="use_markup">True</property>
</object>
<packing>
<property name="top_attach">3</property>
</packing>
</child>
<child>
<object class="GtkLabel" id="label8">
<property name="visible">True</property>
<property name="xalign">1</property>
<property name="label" translatable="yes">&lt;b&gt;Repeat:&lt;/b&gt;</property>
<property name="use_markup">True</property>
</object>
<packing>
<property name="top_attach">4</property>
</packing>
</child>
<child>
<object class="GtkLabel" id="label9">
<property name="visible">True</property>
<property name="xalign">1</property>
<property name="label" translatable="yes">&lt;b&gt;Subsample:&lt;/b&gt;</property>
<property name="use_markup">True</property>
</object>
<packing>
<property name="top_attach">5</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="reconstruct_x_combo_box">
<property name="visible">True</property>
</object>
<packing>
<property name="left_attach">1</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="reconstruct_y_combo_box">
<property name="visible">True</property>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">1</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="sample_x_combo_box">
<property name="visible">True</property>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">2</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="sample_y_combo_box">
<property name="visible">True</property>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">3</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="repeat_combo_box">
<property name="visible">True</property>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">4</property>
</packing>
</child>
<child>
<object class="GtkSpinButton" id="subsample_spin_button">
<property name="visible">True</property>
<property name="adjustment">subsample_adjustment</property>
<property name="value">4</property>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">5</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="padding">6</property>
<property name="position">1</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="position">0</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="position">1</property>
</packing>
</child>
</object>
</child>
</object>
</interface>

87
demos/srgb-test.c Normal file

@@ -0,0 +1,87 @@
#include <math.h>
#include <stdlib.h> /* malloc, free */
#include "pixman.h"
#include "gtk-utils.h"
static uint32_t
linear_argb_to_premult_argb (float a,
float r,
float g,
float b)
{
r *= a;
g *= a;
b *= a;
return (uint32_t) (a * 255.0f + 0.5f) << 24
| (uint32_t) (r * 255.0f + 0.5f) << 16
| (uint32_t) (g * 255.0f + 0.5f) << 8
| (uint32_t) (b * 255.0f + 0.5f) << 0;
}
static float
lin2srgb (float linear)
{
if (linear < 0.0031308f)
return linear * 12.92f;
else
return 1.055f * powf (linear, 1.0f/2.4f) - 0.055f;
}
static uint32_t
linear_argb_to_premult_srgb_argb (float a,
float r,
float g,
float b)
{
r = lin2srgb (r * a);
g = lin2srgb (g * a);
b = lin2srgb (b * a);
return (uint32_t) (a * 255.0f + 0.5f) << 24
| (uint32_t) (r * 255.0f + 0.5f) << 16
| (uint32_t) (g * 255.0f + 0.5f) << 8
| (uint32_t) (b * 255.0f + 0.5f) << 0;
}
int
main (int argc, char **argv)
{
#define WIDTH 400
#define HEIGHT 200
int y, x, p;
float alpha;
uint32_t *dest = malloc (WIDTH * HEIGHT * 4);
uint32_t *src1 = malloc (WIDTH * HEIGHT * 4);
pixman_image_t *dest_img, *src1_img;
dest_img = pixman_image_create_bits (PIXMAN_a8r8g8b8_sRGB,
WIDTH, HEIGHT,
dest,
WIDTH * 4);
src1_img = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
src1,
WIDTH * 4);
for (y = 0; y < HEIGHT; y ++)
{
p = WIDTH * y;
for (x = 0; x < WIDTH; x ++)
{
alpha = (float) x / WIDTH;
src1[p + x] = linear_argb_to_premult_argb (alpha, 1, 0, 1);
dest[p + x] = linear_argb_to_premult_srgb_argb (1-alpha, 0, 1, 0);
}
}
pixman_image_composite (PIXMAN_OP_ADD, src1_img, NULL, dest_img,
0, 0, 0, 0, 0, 0, WIDTH, HEIGHT);
pixman_image_unref (src1_img);
free (src1);
show_image (dest_img);
pixman_image_unref (dest_img);
free (dest);
return 0;
}
