1777 Commits

Author SHA1 Message Date
Dylan Aïssi
b85678a8de debian/copyright: Convert to machine-readable format 2025-07-31 22:22:45 +02:00
Timo Aaltonen
7d26aad890 releasing package pixman version 0.44.0-3 2024-11-09 11:03:01 +02:00
Timo Aaltonen
07627e9f31 Replace timeout bump patch by using a multiplier option instead. Thanks, Aurelien Jarno! (Closes: #1086999) 2024-11-09 11:02:51 +02:00
Timo Aaltonen
dc43d37962 releasing package pixman version 0.44.0-2 2024-11-08 09:58:31 +02:00
Timo Aaltonen
c05da7d917 patches: Increase test timeout 120->240s. (Closes: #1086999) 2024-11-08 09:53:41 +02:00
Timo Aaltonen
e55fd151a2 releasing package pixman version 0.44.0-1 2024-11-07 16:48:34 +02:00
Timo Aaltonen
7d5149536f rules: Drop obsolete dbgsym-migration. 2024-11-07 15:54:27 +02:00
Timo Aaltonen
2ad078304f control: Migrate to pkgconf. 2024-11-07 15:53:40 +02:00
Timo Aaltonen
7cca9d2d9a symbols: Updated. 2024-11-07 15:45:25 +02:00
Timo Aaltonen
c8cb00a5ad control, rules: Build with meson. 2024-11-07 15:45:17 +02:00
Timo Aaltonen
b87363cd49 patches: Refresh patch. 2024-11-07 14:31:18 +02:00
Timo Aaltonen
2e58ff85bd version bump 2024-11-07 14:30:41 +02:00
Timo Aaltonen
31b00cc770 Merge branch 'upstream-unstable' into debian-unstable 2024-11-07 14:29:36 +02:00
Matt Turner
ae6646f159 Pre-release version bump to 0.44.0 2024-11-05 11:51:31 -05:00
Lance Arsenault
126d61e796 pixman: Add library destructor
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/111
2024-11-05 04:31:04 +00:00
f wasil
a987256be8 Fixed memory leak in tests 2024-11-05 03:39:54 +00:00
f wasil
0e424031bd RISC-V floating point operations 2024-10-30 03:39:37 +00:00
Changqing Li
643f098a39 pixman-combine-float.c: fix inlining failed error
As noted in [1], always_inline is not recommended when there are indirect
calls, so replace force_inline with inline to fix errors like:
In function ‘combine_inner’,
    inlined from ‘combine_soft_light_ca_float’ at ../pixman/pixman-combine-float.c:655:511:
../pixman/pixman-combine-float.c:655:211: error: inlining failed in call to ‘always_inline’ ‘combine_soft_light_c’: function not considered for inlining

Tested with gcc-9 and gcc-14; both work well.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115679

Signed-off-by: Changqing Li <changqing.li@windriver.com>
2024-10-30 01:34:41 +00:00
Marek Pikuła
90f9cf1726
ci: Disable coverage for arm-v5 and mipsel targets
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 16:49:50 +02:00
Marek Pikuła
bc2ec45d3b
ci: Add auto_cancel policy
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 16:49:41 +02:00
Marek Pikuła
de59d1a9fb
ci: Don't execute failing jobs
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 16:49:40 +02:00
Marek Pikuła
15336dc7cd
ci: Pin gcovr version to 7.x
Temporary version pin of gcovr due to errors in coverage report
generation when running with newly released version 8.x.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 13:17:47 +02:00
Marek Pikuła
0476eda33a
ci: Remove MESON_TESTTHREADS workaround
https://github.com/mesonbuild/meson/pull/13604 got merged and released
with Meson 1.6.0, which we already use in the Docker images, so the
workaround can be dropped.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-21 13:17:25 +02:00
Marek Pikuła
11e51bc72f
ci: Disable OpenMP for Win32 target
OpenMP introduces random stack overflow errors for 32-bit Windows
target.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-10-14 16:12:44 +02:00
Marek Pikuła
277f485a9c
ci: Add missing ":failing" suffix for linux-ppc job
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-27 00:22:55 +02:00
Marek Pikuła
126b083142
ci: Add option to use different version of LLVM
Some targets require a different version of LLVM, so it is now possible to
set it in the target's environment. Mind that the highest available
version depends on the base Debian image.

The change bumps LLVM version for all Linux targets:
- by default from 14 to 16,
- from 16 to 18 for riscv64 (based on Sid; for now, LLVM 19 doesn't have
  libomp packaged),
- mipsel stays at 14 as there seem to be some missing packages for
  higher versions.

Windows targets stay the same, as they use a different source of LLVM
(MinGW-compatible, which is currently version 18).

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-27 00:22:54 +02:00
Marek Pikuła
a3d297fa46
ci: riscv64: Verify if tests run on target without RVV
This ensures that runtime discovery works correctly and that RVV code is
disabled for targets without the RVV extension.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-26 23:33:52 +02:00
Marek Pikuła
9176847f1d
ci: riscv64: Don't force enable RVV globally
RVV compilation will be enabled for RVV implementation alone, similar to
other platforms. This prevents introducing autovectorized code in the
main library, thus making pixman compatible with RISC-V targets without
RVV.
2024-09-26 23:33:52 +02:00
Marek Pikuła
76b133f293
ci: Fix active target rule for Docker stage
The `if` rule condition for selectively running Docker image builds was
ill-formed. It resulted in all images being built even when not all targets
were selected with the ACTIVE_TARGET_PATTERN variable.
2024-09-26 21:54:21 +02:00
Marek Pikuła
b7ac7cd122
ci: Fix Docker image source for MRs
If the MR doesn't modify the Docker context, the pipeline should use the
image from upstream.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-25 20:20:08 +02:00
Marek Pikuła
ffa5645a2d
ci: Add support for Windows on ARM
It uses LLVM MinGW pre-built toolchain, and wine-arm64 base Docker image
from Linaro.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:21:02 +02:00
Marek Pikuła
51dcfb8027
ci: Add support for LLVM for Windows targets
It uses LLVM MinGW project to get the precompiled LLVM toolchain for
cross-compilation.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:21:01 +02:00
Marek Pikuła
c0ee08aab0
ci: Add LLVM support to the CI workflow
Add support for LLVM for all targets. Mind that in the current state,
some targets fail either build or test stage. For the time being, these
jobs are marked with `:failing` job name suffix.

Relevant issues:
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/105
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/106
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/107
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/108
- https://gitlab.freedesktop.org/pixman/pixman/-/issues/109
2024-09-03 18:21:00 +02:00
Marek Pikuła
44927bf1e1
ci: Unify build and test stage as job templates
This commit unifies codecov and pltcov build and test stages as single
parametrizable GitLab job templates. This cleans up the pipeline flow in
preparation for LLVM support in the pipeline.

Each target now has a Meson cross file, even when using a native
compiler, so that the job template can be better generalized. This also
allows moving architecture-specific build configuration to the cross
file instead of using additional Meson flags in the job declaration.
2024-09-03 18:20:59 +02:00
Marek Pikuła
19b1a98e8d
ci: Unify Docker image as multi-stage build
This commit merges codecov and pltcov Dockerfiles into a single,
multi-stage Dockerfile. This results in more streamlined Docker image
builds with some common layers which can be reused by multiple images.

Also, by making a common Dockerfile, all common dependencies have the
same exact description, which decreases disparity between different
images for all the supported architectures. Mind that package version
disparity cannot be prevented 100%, as different base images may be used
for different architectures (e.g., Debian Sid for riscv64 instead of
Bookworm).

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:20:58 +02:00
Marek Pikuła
028213b588
ci: Unify target enable flag
It replaces the CODE- and PLT-specific target enable variables. It lays
the groundwork for unifying the codecov and pltcov flows.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:20:57 +02:00
Marek Pikuła
05b5ecd934
ci: Use env files instead of awk script
It makes per-target environment declaration more extensible, as it's
now possible to set custom env variables only for the selected target
for the entire pipeline workflow in a centralized way.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-09-03 18:20:56 +02:00
Julia DeMille
726d77f6fe mmx: Fix compilation with clang-cl 2024-09-03 00:35:47 +00:00
Marek Pikuła
0cb4fbe324
ci: Fix Docker change detection
There was a missing wildcard for Docker directory
change detection, so basically this rule was not
checked correctly.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-21 18:46:07 +02:00
Marek Pikuła
4047a553d9
ci: Add platform coverage targets
Platform coverage checks if the code builds and executes properly for
architectures that are not officially supported by Debian. They don't
contribute to general code coverage report but provide a valuable
insight if all supported platforms are working correctly.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-20 18:05:44 +02:00
Marek Pikuła
cbf9d7e0d3
ci: Add architecture coverage Docker images
Add images providing an environment for architecture coverage tests.
There is a separate build for Linux and Windows, as the Windows image is
really large compared to the Linux one. This decreases the execution time
of both targets, as the images that runners need to pull are smaller.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:15:30 +02:00
Marek Pikuła
c35e47bd88
ci: Increase granularity of Docker build selection
Now, it's possible to selectively disable Docker image builds. Before,
it was only possible to disable build/test jobs for a given
architecture.
2024-08-16 20:10:21 +02:00
Marek Pikuła
e7ef051a6d
ci: Build and test on the supported platforms
This commit introduces a build and test CI workflow, which tests the
correctness of execution for nearly all configurations supported by
pixman. The notable exception is ARM iWMMXt, which is omitted as it's
soon to be deprecated as mentioned in #98.

The build and test stage is separated, as a single build can be used to
test multiple configurations for a given platform (e.g., MMX, SSE2,
SSSE3 for x86).

Execution is performed using multi-arch Docker images built in the
`docker` stage. The important thing to note is that the runner needs to
have a relatively recent version of Docker and QEMU, and needs to have
the qemu-user-static+binfmt execution enabled.

Once all tests are complete, coverage reports are merged together in the
`summary` stage. Then the result can be used in a GitLab-native coverage
report summary.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:04:49 +02:00
Marek Pikuła
2d35a8769c
mips: Add option to force MIPS CPU feature discovery
Used to force feature discovery in CI where /proc/cpuinfo is unreliable.
It can happen, e.g., if executed in qemu-user-static mode.

For such a build, MIPS-specific features need to be manually disabled by
using `PIXMAN_DISABLE` env variable.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:03:29 +02:00
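The commit above relies on disabling features by name through the `PIXMAN_DISABLE` environment variable. As a rough illustration of how such a lookup can work, the sketch below checks whether a name appears in a space-separated list. The helper `name_in_list` is hypothetical and is not pixman's implementation, and the space-separated format is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of pixman): return true if `name`
 * appears as a whole token in the space-separated string `list`. */
static bool
name_in_list (const char *list, const char *name)
{
    size_t name_len = strlen (name);

    if (list == NULL || name_len == 0)
        return false;

    while (*list != '\0')
    {
        size_t token_len;

        /* Skip any separators before the next token. */
        while (*list == ' ')
            list++;

        /* Length of the token up to the next space or end of string. */
        token_len = strcspn (list, " ");

        if (token_len == name_len && strncmp (list, name, name_len) == 0)
            return true;

        list += token_len;
    }

    return false;
}
```

A caller could pass the result of `getenv ("PIXMAN_DISABLE")` as `list`; a `NULL` result (variable unset) simply yields `false`.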
Marek Pikuła
15af6fd0bc
mips: Widen CPU family check for DSPr2
DSPr2 can be available for targets other than mips32. Some distros
(e.g., Debian) don't support mips32 but still support mipsel. Extending
the check enables use of such images for testing.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:03:28 +02:00
Marek Pikuła
a7263190c2
ci: Add multiarch Docker image build
The image is used in CI pipeline to build and test on different
architectures.

This commit introduces more extensible GitLab CI scheme borrowed from
qemu project.

Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-16 20:03:19 +02:00
Marek Pikuła
b753a6f49b
mips: Fix a typo in mips_dspr2_flags
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
2024-08-14 14:13:07 +02:00
Even Rouault
6410ec79bd pixman-combine-float.c: fix typo in MAKE_NON_SEPARABLE_PDF_COMBINERS()
There's a copy&paste typo updating sc.g twice when there's a mask
2024-08-14 02:48:25 +00:00
Marco Trevisan
5b8e928139 pixman-region: Make translate a no-op when using 0 offsets
This spares callers from having to optimize this code path themselves when this scenario occurs.
And it can certainly occur when the function is not called explicitly.
2024-08-14 02:41:08 +00:00
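The early return described in the commit above can be sketched on a single box. The type `toy_box_t` and the function `toy_box_translate` are hypothetical stand-ins for illustration only; pixman's region translate operates on a full region, not one box.

```c
/* Hypothetical stand-in for one rectangle of a region. */
typedef struct
{
    int x1, y1, x2, y2;
} toy_box_t;

/* Shift a box by (dx, dy). A zero offset returns immediately, so
 * callers do not need to special-case it themselves. */
static void
toy_box_translate (toy_box_t *box, int dx, int dy)
{
    if (dx == 0 && dy == 0)
        return;

    box->x1 += dx;
    box->x2 += dx;
    box->y1 += dy;
    box->y2 += dy;
}
```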
Matt Turner
2e29b7c43d iwmmxt: Drop support
In all likelihood unused for at least many years, and possibly ever.

Support is deprecated and will be removed in gcc-15. See deprecation
notice in https://gcc.gnu.org/gcc-13/changes.html

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/98
2024-08-13 13:51:36 -04:00
Peter Hutterer
e5f8efc4c7 ci: add workflow rules to allow for MR pipelines
See
https://gitlab.freedesktop.org/freedesktop/freedesktop/-/wikis/GitLab-CI#for-project-developers
2024-08-07 09:59:34 +10:00
Bill Roberts
7ed0f8d04d
aarch64: support PAC and BTI
Enable Pointer Authentication Codes (PAC) and Branch Target
Identification (BTI) support for ARM 64 targets.

PAC works by signing the LR with either an A key or B key and verifying
the return address. There are quite a few instructions capable of doing
this, however, the Linux ARM ABI is to use hint compatible instructions
that can be safely NOP'd on older hardware and can be assembled and
linked with older binutils. This limits the instruction set to paciasp,
pacibsp, autiasp and autibsp. Instructions prefixed with pac are for
signing and instructions prefixed with aut are for authenticating. Both
instructions are then followed with an a or b to indicate which signing
key they are using. The keys can be controlled using
-mbranch-protection=pac-ret for the A key and
-mbranch-protection=pac-ret+b-key for the B key.

BTI works by marking all call and jump positions with bti c and bti
j instructions. If execution control transfers to an instruction other
than a BTI instruction, the execution is killed via SIGILL. Note that
to remove one instruction, the aforementioned pac instructions will
also work as a BTI landing pad for bti c usages.

For BTI to work, all object files linked for a unit of execution,
whether an executable or a library, must have the GNU Notes section of
the ELF file marked to indicate BTI support. This is so loaders/linkers
can apply the proper permission bits (PROT_BTI) on the memory region.

PAC can also be annotated in the GNU ELF notes section, but it's not
required for enablement, as interleaved PAC and non-pac code works as
expected since it's the callee that performs all the checking. The
linker follows the same rules as BTI for discarding the PAC flag from
the GNU Notes section.

Testing was done under the following CFLAGS and CXXFLAGS for all
combinations:
1. -mbranch-protection=none
2. -mbranch-protection=standard
3. -mbranch-protection=pac-ret
4. -mbranch-protection=pac-ret+b-key
5. -mbranch-protection=bti

Signed-off-by: Bill Roberts <bill.roberts@arm.com>
2024-07-22 16:57:13 -05:00
Bill Roberts
3a32506877
arm: add include guards on header
Prevent double inclusion of header file.

Signed-off-by: Bill Roberts <bill.roberts@arm.com>
2024-07-22 16:57:13 -05:00
Mike Hommey
865e6ce00b pixman: Adjust arm assembly for binutils change
A change in the latest version of binutils broke building pixman for arm.

The binutils change:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=226749d5a6ff0d5c607d6428d6c81e1e7e7a994b

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/96
2024-07-12 15:55:33 -04:00
Matt Turner
b252d40714 Post-release version bump to 0.43.5 2024-02-29 11:19:46 -05:00
Matt Turner
54cad71674 Pre-release version bump to 0.43.4 2024-02-29 11:13:20 -05:00
Matt Turner
add7c8db45 pixman-arm: Use unified syntax
Allows us to use the same assembly without a bunch of #ifdef __clang__.
2024-02-29 10:47:07 -05:00
Makoto Kato
63ae6af9a6 pixman-arm: Fix build on clang/arm32
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/74
2024-02-29 10:47:00 -05:00
Matt Turner
033716e99a Revert "Allow to build pixman on clang/arm32"
This reverts merge request !78
2024-02-29 15:41:37 +00:00
Heiko Lewin
74130e84c5 Allow to build pixman on clang/arm32 2024-02-29 14:46:55 +00:00
Matt Turner
63332b4e72 pixman-x86: Move #include "cpuid.h" inside conditionals
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/93
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/94
2024-02-25 17:28:14 -05:00
Matt Turner
8c6d59a9f8 pixman-x86: Use cpuid.h header 2024-02-24 12:36:53 -05:00
Gayathri Berli
ac485a9b66 Revert the changes to fix the problem in big-endian architectures
This reverts commit b4a105d772.

There is an endianness issue in pixman-fast-path.c. In the function
bits_image_fetch_separable_convolution_affine we have this code:

#ifdef WORDS_BIGENDIAN
	buffer[k] = (satot << 0) | (srtot << 8) | (sgtot << 16) | (sbtot << 24);
#else
	buffer[k] = (satot << 24) | (srtot << 16) | (sgtot << 8) | (sbtot << 0);
#endif

This will write out the pixels as BGRA on big endian systems but
obviously that's wrong. Pixel order should be ARGB on big endian systems
so we don't need any #ifdef for big endian here at all. Instead, the
code should be the same on little and big endian, i.e. it should be just
this line instead of the 5 lines above:

	buffer[k] = (satot << 24) | (srtot << 16) | (sgtot << 8) | (sbtot << 0);

Changing the code like this fixes the wrong colors that I get with
pixman on my PowerPC/s390x system.

Here is what cairo.h has to say (which is rooted in pixman):

 * @CAIRO_FORMAT_ARGB32: each pixel is a 32-bit quantity, with
 *   alpha in the upper 8 bits, then red, then green, then blue.
 *   The 32-bit quantities are stored native-endian. Pre-multiplied
 *   alpha is used. (That is, 50% transparent red is 0x80800000,
 *   not 0x80ff0000.) (Since 1.0)

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/78
Signed-off-by: Gayathri Berli <gayathri.berli@ibm.com>
2024-02-24 12:28:30 -05:00
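The point of the commit above is that a native-endian 32-bit ARGB value is built with the same shifts on every host. A minimal sketch follows; `pack_argb32` is a hypothetical helper, and the `& 0xff` masks are an addition for safety in this sketch rather than something taken from the pixman code quoted above.

```c
#include <stdint.h>

/* Hypothetical helper: pack four 8-bit channels into one 32-bit pixel
 * with alpha in the upper 8 bits, then red, green and blue. The value
 * is stored native-endian, so no byte-order #ifdef is needed here. */
static uint32_t
pack_argb32 (uint32_t a, uint32_t r, uint32_t g, uint32_t b)
{
    return ((a & 0xff) << 24) |
           ((r & 0xff) << 16) |
           ((g & 0xff) << 8)  |
           ((b & 0xff) << 0);
}
```

With premultiplied alpha, 50% transparent red is `pack_argb32 (0x80, 0x80, 0x00, 0x00)`, which gives `0x80800000` as in the cairo.h comment quoted above.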
Simon Ser
fdd7161097 Post-release version bump to 0.43.3 2024-01-28 13:32:42 +01:00
Simon Ser
91b8526c1e Pre-release version bump to 0.43.2 2024-01-28 13:26:31 +01:00
Simon Ser
e8bb34e302 Drop contrib/ci.sh
This is unused and outdated (Autotools is no longer supported).

Signed-off-by: Simon Ser <contact@emersion.fr>
2024-01-28 12:23:29 +00:00
Simon Ser
43773c69db Drop ChangeLog
This file is empty and unused.

Signed-off-by: Simon Ser <contact@emersion.fr>
2024-01-28 12:22:00 +00:00
Simon Ser
8c39ce2437 Drop automatic DEBUG define
We don't use the historical odd stable release numbering scheme
anymore.

Developers can still enable this debugging code via CFLAGS=-DDEBUG.

Signed-off-by: Simon Ser <contact@emersion.fr>
Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/87
2024-01-27 13:15:28 +01:00
Simon Ser
8e4be8c2db Post-release version bump to 0.43.1
Signed-off-by: Simon Ser <contact@emersion.fr>
2024-01-04 11:48:38 +01:00
Simon Ser
6c2e4a0dd9 Pre-release version bump to 0.43.0
Signed-off-by: Simon Ser <contact@emersion.fr>
2024-01-04 11:01:05 +01:00
Matt Turner
396e1a76ed test: Use fabsl on float128 2024-01-03 21:40:12 -05:00
Matt Turner
7e76c96281 pixman-access: Mark __dummy__ variables with MAYBE_UNUSED 2024-01-03 21:24:46 -05:00
Matt Turner
af101d3c21 pixman-mmx: Don't redefine _MM_SHUFFLE 2024-01-03 21:24:46 -05:00
Matt Turner
20cc4ee0e9 pixman-sse2: Remove unused functions 2024-01-03 21:24:46 -05:00
Simon Ser
7883ab8d63 ci: upgrade to Fedora 39
Fedora 28 is super old.

Signed-off-by: Simon Ser <contact@emersion.fr>
2023-12-15 13:21:09 +01:00
Pavel Labath
86f9162332 Fix alignment problem in pixman-fast-path.c
The variable is accessed through uint32_t pointer, so it needs to be
aligned to avoid undefined behavior (crashes on architectures which
require aligned accesses).

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/84
2023-12-15 13:10:52 +01:00
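One way to avoid the kind of misaligned access described in the commit above is to declare scratch storage with the wider type and take a byte view of it, rather than the other way round. This is a generic sketch and not the change made in pixman-fast-path.c; `first_word_of_scratch` is a hypothetical function.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical example: fill a scratch buffer byte by byte, then read
 * it back as a 32-bit word. Declaring the storage as uint32_t
 * guarantees suitable alignment for the word read; viewing it through
 * a byte pointer is always permitted. */
static uint32_t
first_word_of_scratch (uint8_t fill)
{
    uint32_t words[4];
    uint8_t *bytes = (uint8_t *) words;

    memset (bytes, fill, sizeof words);

    return words[0];
}
```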
Benjamin Gilbert
b4b789df5b meson: avoid linking with -pthread if we don't have pthreads
Meson always returns -pthread in dependency('threads') on non-MSVC
compilers.  Fix a link error when building on MinGW without winpthreads.
2023-11-08 18:43:10 +00:00
Sam James
08115a4217
pixman-bits-image: fix -Walloc-size
GCC 14 introduces a new -Walloc-size included in -Wextra which gives (when forced
to be an error):
```
../pixman/pixman-bits-image.c: In function ‘create_bits’:
../pixman/pixman-bits-image.c:1273:16: error: allocation of insufficient size ‘1’ for type ‘uint32_t’ {aka ‘unsigned int’} with size ‘4’ [-Werror=alloc-size]
 1273 |         return calloc (buf_size, 1);
      |                ^~~~~~~~~~~~~~~~~~~~
```

The calloc prototype is:
```
void *calloc(size_t nmemb, size_t size);
```

So, just swap the number of members and size arguments to match the prototype, as
we're initialising 1 element of size `buf_size`. GCC then sees we're not
doing anything wrong.

Signed-off-by: Sam James <sam@gentoo.org>
2023-11-07 22:31:05 +00:00
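The argument order described in the commit above can be shown in isolation. `alloc_zeroed_buffer` is a hypothetical helper for illustration and is not pixman's `create_bits`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate one zero-initialised object of
 * buf_size bytes.
 *
 * calloc's prototype is calloc (nmemb, size), so the element count
 * comes first. calloc (buf_size, 1) would request the same number of
 * bytes, but it declares an element size of 1 for a uint32_t pointer,
 * which is what -Walloc-size reports. */
static uint32_t *
alloc_zeroed_buffer (size_t buf_size)
{
    return calloc (1, buf_size);
}
```

The caller owns the returned memory and releases it with `free ()`.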
Havard Eidnes
47a1c3d330 vmx: Reimplement create_mask_32_128 and use it in vmx_fill
Based on suggestion from @siamashka.

This lets the compiler pick the vector instruction to use which is
usually the best idea.

Use create_mask_32_128() instead of create_mask_1x32_128() in
vmx_fill(), avoiding loading memory beyond the filler argument on the
stack.

Remove the now-unused create_mask_1x32_128(). This gets rid of some
(correct) warnings from the compiler about indexing beyond the variable
in question.
2023-08-30 12:14:40 -04:00
Havard Eidnes
634b8196d2 vmx: Simplify scaled_nearest_scanline_vmx_8888_8888_OVER
Since combine4() does not take vector variables as arguments, there's no
need to use a vector variable and casts back and forth to normal scalars
for the arguments.
2023-08-30 12:14:26 -04:00
Matt Turner
753f5e095e meson: Fix syntax 2023-08-30 11:58:04 -04:00
Simon Ser
7aeeb501ad Fix const warnings in pixman_image_set_clip_region()
Fixes the following warnings:

    pixman-image.c: In function 'pixman_image_set_clip_region':
    pixman-image.c:601:81: warning: passing argument 2 of 'pixman_region32_copy_from_region16' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
      601 |         if ((result = pixman_region32_copy_from_region16 (&common->clip_region, region)))
          |                                                                                 ^~~~~~
    In file included from pixman-image.c:32:
    pixman-private.h:859:56: note: expected 'pixman_region16_t *' {aka 'struct pixman_region16 *'} but argument is of type 'const pixman_region16_t *' {aka 'const struct pixman_region16 *'}
      859 |                                     pixman_region16_t *src);
          |                                     ~~~~~~~~~~~~~~~~~~~^~~
    pixman-utils.c:240:1: error: conflicting types for 'pixman_region16_copy_from_region32'; have 'pixman_bool_t(pixman_region16_t *, pixman_region32_t *)' {aka 'int(struct pixman_region16 *, struct pixman_region32 *)'}
      240 | pixman_region16_copy_from_region32 (pixman_region16_t *dst,
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from pixman-utils.c:31:
    pixman-private.h:862:1: note: previous declaration of 'pixman_region16_copy_from_region32' with type 'pixman_bool_t(pixman_region16_t *, const pixman_region32_t *)' {aka 'int(struct pixman_region16 *, const struct pixman_region32 *)'}
      862 | pixman_region16_copy_from_region32 (pixman_region16_t *dst,
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    pixman-utils.c:270:1: error: conflicting types for 'pixman_region32_copy_from_region16'; have 'pixman_bool_t(pixman_region32_t *, pixman_region16_t *)' {aka 'int(struct pixman_region32 *, struct pixman_region16 *)'}
      270 | pixman_region32_copy_from_region16 (pixman_region32_t *dst,
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from pixman-utils.c:31:
    pixman-private.h:858:1: note: previous declaration of 'pixman_region32_copy_from_region16' with type 'pixman_bool_t(pixman_region32_t *, const pixman_region16_t *)' {aka 'int(struct pixman_region32 *, const struct pixman_region16 *)'}
      858 | pixman_region32_copy_from_region16 (pixman_region32_t *dst,
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Simon Ser <contact@emersion.fr>
2023-08-30 15:49:50 +00:00
Matt Turner
7169c0404f Use more Markdown-friendly syntax 2023-08-30 11:15:00 -04:00
Matt Turner
f1072b07eb Remove generic build system information 2023-08-30 11:14:04 -04:00
Gauthier Östervall
2cf9ae1cea Update build instructions to meson and ninja 2023-08-30 11:12:41 -04:00
Dylan Baker
72c4245b2e delete win32 make files
meson can handle building for win32 (including using visual studio, and
mingw), and does a good deal more than these could. Since we're dropping
autotools, we might as well drop these too.
2023-08-30 10:54:46 -04:00
Dylan Baker
55eb680a1f autotools: remove autotools
At this point meson is pretty well tested and seems to pretty much work,
so we can consider dropping an extra build system.

This doesn't solve the problem that pixman's release scripts are part of
the autotools build system (as make targets). One solution might be to
use xorg's release.sh instead.
2023-08-30 10:51:27 -04:00
Matt Turner
593a970266 test: Revert to including pixman-private.h
This broke the Visual Studio builds in GTK's CI system.
2023-07-19 15:08:22 -04:00
Heiko Lewin
67490a8bc1
pixman-arma64: Adjustments to build with llvm integrated assembler
This enables building the aarch64 assembly with clang.
Changes:
1. Use `.func` or `.endfunc` only if available
2. Prefix macro arg names with `\` 
3. Use `\()` instead of `&`
4. Always use commas to separate macro arguments
5. Prefix asm symbols with an underscore if necessary
2023-07-18 07:20:01 +02:00
Benjamin Gilbert
47d3fbe38f mmx: use xmmintrin.h if building with SSE2
As of mingw-w64 commit 463f00975, winnt.h includes emmintrin.h when
compiling with SSE2, causing redefinition errors for our copied MMX
intrinsics.  If the build is assuming SSE2 anyway, just use the system
header instead.
2023-07-09 01:56:40 +00:00
Simon Ser
55845c3dd3 Constify pixman_image_set_clip_region()
This function copies the region passed in.

Signed-off-by: Simon Ser <contact@emersion.fr>
2023-07-09 01:53:48 +00:00
Simon Ser
672f67db96 Add pixman_region{,32}_empty()
Inverse of pixman_region32_not_empty().

Most of the time, callers want to check whether a region is empty,
not whether a region is not empty. This results in code with
double-negatives such as !pixman_region32_not_empty(), which is
confusing to read.

Signed-off-by: Simon Ser <contact@emersion.fr>
2023-07-09 01:48:29 +00:00
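The readability argument in the commit above can be illustrated with a toy type. `toy_region_t` and both functions below are hypothetical stand-ins, not pixman's region API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a region: only tracks its rectangle count. */
typedef struct
{
    int n_rects;
} toy_region_t;

/* The older style of predicate: "is it not empty?" */
static bool
toy_region_not_empty (const toy_region_t *region)
{
    return region->n_rects > 0;
}

/* The inverse predicate, so callers can write toy_region_empty (r)
 * instead of the double negative !toy_region_not_empty (r). */
static bool
toy_region_empty (const toy_region_t *region)
{
    return !toy_region_not_empty (region);
}
```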
Benjamin Gilbert
48d5df1f37 meson: don't dllexport when built as static library
If a static Pixman is linked with a dynamic library, Pixman shouldn't
export its own symbols into the latter's ABI.
2023-07-08 17:36:00 -04:00
Emanuel Schmidt
e4c878d179 Fixed missing dependency in libdemo
After the latest changes and separation of demo- and test-targets,
it was visible that a dependency towards `libtestutils_dep` was
missing in one of the demo-dependencies. This change will fix
this particular problem.
2023-02-17 18:52:14 +01:00
Emanuel Schmidt
ee145e53d1 Changed name of the config-header to "pixman-config.h" 2023-02-14 22:20:12 +01:00
Emanuel Schmidt
eb998d7b65 Separate meson build options for demos and tests 2023-02-08 20:56:05 +01:00
Emilio Pozuelo Monfort
a7a919b881 Release to sid 2022-11-11 13:42:32 +01:00
Emilio Pozuelo Monfort
a4e8d8901f Remove patch for CVE-2022-44638 included in 0.42.2 2022-11-08 13:24:10 +01:00
Emilio Pozuelo Monfort
590b8eb08f New upstream release 2022-11-08 13:11:47 +01:00
Emilio Pozuelo Monfort
dbe5c715e6 Merge branch 'upstream-unstable' into debian-unstable 2022-11-08 13:11:17 +01:00
Emilio Pozuelo Monfort
e71a54d0f0 Import 0.40.0-1.1 NMU
* Avoid integer overflow leading to out-of-bounds write (CVE-2022-44638)
  (Closes: #1023427)
2022-11-08 13:03:18 +01:00
Heiko Lewin
713077d0a3 Fix signed-unsigned semantics in reduce_32 2022-11-03 19:13:41 +00:00
Matt Turner
618e3d4283 Post-release version bump to 0.42.3 2022-11-03 09:53:12 -04:00
Claude Heiland-Allen
40d6c9b256 add r8g8b8 sRGB to test suite
Signed-off-by: Claude Heiland-Allen <claude@mathr.co.uk>
2022-11-03 12:51:47 +00:00
Claude Heiland-Allen
83ba024483 implement r8g8b8 sRGB (without alpha)
Signed-off-by: Claude Heiland-Allen <claude@mathr.co.uk>
2022-11-03 12:51:47 +00:00
Matt Turner
37216a3283 Pre-release version bump to 0.42.2 2022-11-02 13:25:48 -04:00
Matt Turner
a1f88e842e Avoid integer overflow leading to out-of-bounds write
Thanks to Maddie Stone and Google's Project Zero for discovering this
issue, providing a proof-of-concept, and a great analysis.

Closes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/63
2022-11-02 13:25:48 -04:00
Matt Turner
c3bbb94b4c Revert "Fix signed-unsigned semantics in reduce_32"
This reverts commit aaf59b0338.

This commit regressed the scaling-test unit test, by apparently allowing
the compiler to emit fused multiply-add instructions in cases they
wouldn't have been allowed before. While using gcc's -ffp-contract=...
flag avoids the issue on amd64, it does not on at least aarch64 and
ppc64.

This is unfortunate, because the commit being reverted resolved
https://gitlab.freedesktop.org/pixman/pixman/-/issues/43 so we will
reintroduce this failure, but after more than a year without a fix for
the unit test, I think it's time to bite the bullet.

Fixes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/49
2022-10-27 15:10:30 -04:00
Matt Turner
ca7bb8894e build: Add a64-neon-test.S to EXTRA_DIST
Fixes: https://gitlab.freedesktop.org/pixman/pixman/-/issues/66
2022-10-27 14:36:54 -04:00
Simon Ser
1a0d50ce70 meson: explicitly set C standard to gnu99
This explicitly indicates that GNU extensions (like asm) are used.
This fixes build errors when Pixman is used as a Meson subproject.

Signed-off-by: Simon Ser <contact@emersion.fr>
2022-10-27 18:21:37 +00:00
Simon Ser
0cf92877a9 meson: override pixman-1 dependency
This eases usage as a Meson subproject.

Signed-off-by: Simon Ser <contact@emersion.fr>
2022-10-27 18:17:26 +00:00
Thomas Klausner
4ee322c4e2 Makefile.am: increase shell portability
Use standard test(1) instead of bash's '[['.

Signed-off-by: Thomas Klausner <wiz@gatalith.at>
2022-10-18 17:48:49 +02:00
Thomas Klausner
b5b3243792 configure.ac: avoid unportable test(1) operator
"==" is only supported by bash, POSIX mandates "="

Signed-off-by: Thomas Klausner <wiz@gatalith.at>
2022-10-18 17:48:24 +02:00
Simon Ser
7df9e162c6 Post-release version bump to 0.42.1
Signed-off-by: Simon Ser <contact@emersion.fr>
2022-10-18 11:01:24 +02:00
Simon Ser
8d6d7f44f4 Pre-release version bump to 0.42.0
Signed-off-by: Simon Ser <contact@emersion.fr>
2022-10-18 09:44:04 +02:00
Benjamin Gilbert
421fc252ab meson: Add feature to disable compiler TLS support
When compiling with MinGW, use of the __thread attribute causes pixman
to gain a dependency on the winpthread DLL.  With Autotools, this could
be avoided by configuring with ac_cv_tls=none, causing pixman to fall
back to TlsSetValue() instead.

Add a Meson 'tls' option that can be 'disabled' to skip support for TLS
compiler attributes, or 'enabled' to require a working TLS attribute.
2022-10-18 01:02:43 +00:00
Alan Coopersmith
7989483929 configure.ac: allow x64 libraries on Solaris to run on non-SSSE3 machines
Override the x64 hardware capability autodetection by Solaris Studio
compilers for x64 libraries the same way we do for x86 libraries.

Also fix configure test for this override to work in out-of-tree builds.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2022-10-13 20:58:57 +00:00
Jocelyn Falempe
b4a105d772 Fix inverted colors on big endian system
bits_image_fetch_separable_convolution_affine() didn't take care
of big endian system

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
2022-06-29 11:00:04 +02:00
Alan Coopersmith
285b9a907c configure: replace bugzilla URL with gitlab issues
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2022-02-19 13:37:54 -08:00
Nirbheek Chauhan
adc07d4618 meson: Fix usage of pkgconfig.generate()
The library that the pkgconfig file is for should be the first
positional argument. The `libraries:` kwarg is for libraries that the
user must also link against, and which meson does not know about (and
hence cannot automatically add to the `Libs:` or `Requires:` section
in the .pc file).

Fixes:
```
subprojects/pixman/meson.build:564: DEPRECATION: Library pixman-1 was
passed to the "libraries" keyword argument of a previous call to
generate() method instead of first positional argument. Adding
pixman-1 to "Requires" field, but this is a deprecated behaviour that
will change in a future version of Meson. Please report the issue if
this warning cannot be avoided in your case.
```
2022-01-22 13:25:57 +05:30
Nirbheek Chauhan
3563dfe436 meson: Fix warning about extract_all_objects usage
We use this because of a meson bug that was fixed in 0.52:

https://mesonbuild.com/Release-notes-for-0-52-0.html#improved-support-for-static-libraries

Bump the requirement and remove the extract_all_objects workaround.
This gets rid of a meson warning:

WARNING: extract_all_objects called without setting recursive
keyword argument. Meson currently defaults to
non-recursive to maintain backward compatibility but
the default will be changed in the future.
2022-01-21 09:07:53 +00:00
Manuel Stoeckl
c6e1af995e demos: port to Gtk3
GTK2 has reached end of life, and GTK3 has been available for
almost a decade.

Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
2022-01-12 23:19:39 -05:00
Mizuki Asakura
eadb82866b added aarch64 bilinear implementations (ver.4.1)
Since aarch64 has different neon syntax from aarch32 and has no
support for (older) arm-simd,
there are no SIMD accelerations for pixman on aarch64.

We need new implementations.

This patch also contains Ben Avison's series of patches for aarch32,
and the benchmark results now look fine on aarch64.

Please find the result at the below ticket.

Added: https://bugs.freedesktop.org/show_bug.cgi?id=94758
Signed-off-by: Mizuki Asakura <ed6e117f@gmail.com>
2021-09-17 17:03:02 +00:00
Simon Ser
36001032b7 Constify region APIs
This allows callers to pass around const Pixman region in their
APIs, improving type safety and documentation.

Signed-off-by: Simon Ser <contact@emersion.fr>
2021-09-17 16:22:51 +00:00
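A minimal sketch of what commit 36001032b7 enables for callers, assuming nothing about pixman's actual declarations: `box_t` and `box_area` are hypothetical names used only to show a read-only accessor that accepts a `const` pointer.

```c
/* Hypothetical rectangle type; it stands in for a region-like object. */
typedef struct { int x1, y1, x2, y2; } box_t;

/* A read-only accessor takes a pointer to const, so callers holding a
 * `const box_t *` can use it without casting the qualifier away. */
static int
box_area (const box_t *box)
{
    return (box->x2 - box->x1) * (box->y2 - box->y1);
}
```

With a non-const parameter, the same call from code holding a `const box_t *` would draw a discarded-qualifier diagnostic.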
Nirbheek Chauhan
bd4e7a9b9e tests: Fix undefined symbol build error on macOS
prng_state and prng_state_data are getting classified as a "Common
symbol" by the compiler due to the convoluted way in which it is
`#include`-ed in various test sources, and that's not read as a valid
symbol by the linker later.

Initializing the symbol clarifies it to the compiler that this
specific declaration is the canonical location for this variable, and
that it's not a "Common symbol".

Fixes https://gitlab.freedesktop.org/pixman/pixman/-/issues/42
2021-09-17 16:08:04 +00:00
Alex Richardson
e0d4403e78 Fix -Wincompatible-function-pointer-types warning
Adding const to the return type does nothing and means that the function
pointer types do not match exactly:

error: incompatible function pointer types passing 'const float (int, int)' to parameter of type 'dither_factor_t' (aka 'float (*)(int, int)')
2021-09-17 16:03:48 +00:00
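The mismatch fixed by commit e0d4403e78 can be sketched as follows. `dither_factor_t` is spelled as in the quoted error message; `factor_half` and `apply_factor` are hypothetical names, not pixman functions.

```c
/* Function pointer type as it appears in the compiler error above. */
typedef float (*dither_factor_t) (int x, int y);

/* Hypothetical factor function. Writing `const float` as the return type
 * would have no effect on the returned value, but it would make the
 * function's type differ from dither_factor_t. */
static float
factor_half (int x, int y)
{
    (void) x;
    (void) y;
    return 0.5f;
}

/* Hypothetical caller that accepts any dither_factor_t. */
static float
apply_factor (dither_factor_t f, int x, int y)
{
    return f (x, y);
}
```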
Manuel Stoeckl
5f5e752f15 Fix masked pixel fetching with wide format
In __bits_image_fetch_affine_no_alpha and __bits_image_fetch_general,
when `wide` is true, the mask is actually an array of argb_t instead
of the array of uint32_t it was cast to, and the access to `mask[i]`
does not correctly detect when the pixel is nontrivial. The code now
uses a check appropriate for argb_t when `wide` is true.

One caveat: this new check only skips entries when the mask pixel data
is binary all zero; this misses cases like `-0.f` which would be caught
by the FLOAT_IS_ZERO macro. As the mask check only appears to be a
performance optimization to avoid loading inconsequential pixels, it
erring on the side of loading more pixels is safe.

Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
2021-08-09 21:43:58 -04:00
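The caveat in commit 5f5e752f15 can be illustrated with a small sketch. It assumes IEEE 754 floats; `wide_pixel_t` and `wide_pixel_is_all_zero_bits` are hypothetical stand-ins, not pixman's actual types or helpers.

```c
#include <string.h>

/* Hypothetical stand-in for a wide (floating point) pixel. */
typedef struct { float a, r, g, b; } wide_pixel_t;

/* Returns 1 only when every byte of the pixel is zero. A pixel holding
 * -0.0f compares equal to zero as a float, but has its sign bit set, so
 * this check reports it as non-zero and the pixel would still be loaded. */
static int
wide_pixel_is_all_zero_bits (const wide_pixel_t *p)
{
    static const wide_pixel_t zero; /* static storage: zero-initialized */

    return memcmp (p, &zero, sizeof (zero)) == 0;
}
```

Treating `-0.0f` as non-zero only costs an unnecessary load, which matches the commit's argument that the check errs on the safe side.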
Heiko Lewin
aaf59b0338 Fix signed-unsigned semantics in reduce_32 2021-07-21 14:50:52 +00:00
pkubaj
4251202d9d Fix AltiVec detection on FreeBSD. 2021-05-07 15:58:56 +00:00
Jonathan Kew
e93eaff517 Avoid out-of-bounds read when accessing individual bytes from mask.
The important changes here are a handful of places where we replace

            memcpy(&m, mask++, sizeof(uint32_t));

or similar code with

            uint8_t m = *mask++;

because we're only supposed to be reading a single byte from *mask,
and accessing a 32-bit value may read out of bounds (besides that
it reads values we don't actually want; whether this matters would
depend exactly how the value in m is subsequently used).

I've also changed a bunch of other places to use this same pattern
(a local 8-bit variable) when reading individual bytes from the mask;
the code was inconsistent about this, sometimes casting the byte to
a uint32_t instead. This makes no actual difference, it just seemed
better to use a consistent pattern throughout the file.
2021-05-07 09:37:28 -04:00
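A small sketch of the pattern adopted in commit e93eaff517, assuming an 8-bit mask with one byte per pixel. `sum_mask_bytes` is a hypothetical helper written only to show the single-byte read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums the coverage values of an 8-bit mask.
 * Each iteration reads exactly one byte, so the loop never touches
 * memory beyond the n bytes the caller provided. A 4-byte memcpy at the
 * same position could read up to 3 bytes past the end of the buffer. */
static uint32_t
sum_mask_bytes (const uint8_t *mask, size_t n)
{
    uint32_t sum = 0;

    while (n--)
    {
        uint8_t m = *mask++;

        sum += m;
    }

    return sum;
}
```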
Timo Aaltonen
52a3693957 release to sid 2020-12-03 15:38:38 +02:00
Timo Aaltonen
16f9268369 symbols: Updated, bump shlibs 2020-12-03 15:25:18 +02:00
Timo Aaltonen
ad3904afb6 control, rules: Migrate to debhelper-compat, bump to 13. 2020-12-03 15:19:07 +02:00
Timo Aaltonen
8b58485eb3 bump the version 2020-12-03 15:15:54 +02:00
Timo Aaltonen
4772386a28 Merge branch 'upstream-unstable' into debian-unstable 2020-12-03 15:14:50 +02:00
Érico Rolim
d93ec57138 meson: update option descriptions.
- gtk is only used in demos
- libpng is only used in tests
- openmp is only used in tests (in the standard build)
2020-10-22 20:43:26 -03:00
Dylan Baker
9b49f4e087 meson: remove pixman dependency
AFAICT from the git history, what happened is that the gtk demos rely on
gtk being built with pixman support. pkg-config isn't really expressive
enough to have that information, so the solution that was come up with
was to search for pixman as well as gtk+ and hope that pixman being
installed was.

This isn't actually used anywhere in the meson build anyway, and it's
causing problems for projects that want to use pixman as a subproject
(there's a port of cairo underway that's hitting this), because it
confuses meson.
2020-06-18 14:21:09 -07:00
Tim-Philipp Müller
606f5c15b0 meson: add option to skip building of tests and demos
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2020-06-02 02:30:39 +00:00
Tim-Philipp Müller
15e0668616 meson: add cpu-features-path option for Android
Add option to include cpu-features.[ch] from a given path
into the build for platforms that don't provide this out
of the box. This is needed on Android.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2020-06-02 01:15:33 +01:00
Tim-Philipp Müller
0ba6cbe1ac Update README a little
- bugzilla -> gitlab
- convert links to https
- suggest issues and patches be filed via gitlab
2020-05-30 11:34:26 +01:00
Tom Stellard
c2fe1568ff Add -ftrapping-math to default cflags
This should resolve https://gitlab.freedesktop.org/pixman/pixman/-/issues/22
and make the tests pass with clang.

-ftrapping-math is already the default[1] for gcc, so this should not change
behavior when compiling with gcc.  However, clang defaults[2] to -fno-trapping-math,
so -ftrapping-math is needed to avoid floating-point exceptions when running the
combiner and stress tests.

The root cause of this issue is that pixman-combine-float.c guards floating-point
division operations with a FLOAT_IS_ZERO check e.g.

if (FLOAT_IS_ZERO (sa))
	f = 1.0f;
else
	f = CLAMP (da / sa);

With -fno-trapping-math, the compiler assumes that division will never trap, so it may
re-order the division and the guard and execute the division first.  In most cases,
this would not be an issue, because floating-point exceptions are ignored.  However,
these tests call enable_divbyzero_exceptions() which causes the SIGFPE signal to
be sent to the program when a divide by zero exception is raised.

[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
[2] https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior
2020-05-11 22:33:49 +00:00
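The guard pattern quoted in commit c2fe1568ff can be written as a self-contained sketch. The macro definitions here are assumptions made for illustration and may differ from pixman's own; `guarded_ratio` is a hypothetical name.

```c
#include <float.h>

/* Assumed stand-ins for the macros named in the commit message. */
#define FLOAT_IS_ZERO(f) (-FLT_MIN < (f) && (f) < FLT_MIN)
#define CLAMP(f) ((f) < 0.0f ? 0.0f : ((f) > 1.0f ? 1.0f : (f)))

/* Hypothetical helper: the division is only meant to run when sa is
 * non-zero. With -fno-trapping-math the compiler may hoist the division
 * above the check, which is harmless unless divide-by-zero exceptions
 * have been unmasked, as the tests do. */
static float
guarded_ratio (float da, float sa)
{
    float f;

    if (FLOAT_IS_ZERO (sa))
        f = 1.0f;
    else
        f = CLAMP (da / sa);

    return f;
}
```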
Michael Forney
3b1fefda7f Prevent empty top-level declaration
The expansion of PIXMAN_DEFINE_THREAD_LOCAL(...) may end in a
function definition, so the following semicolon is considered an
empty top-level declaration, which is not allowed in ISO C.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2020-04-26 13:46:43 -07:00
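The issue addressed by commit 3b1fefda7f can be reproduced with any macro that expands to a function definition. `DEFINE_GETTER` and `get_answer` are hypothetical names; this is not the expansion of PIXMAN_DEFINE_THREAD_LOCAL.

```c
/* Hypothetical macro whose expansion ends in a function definition. */
#define DEFINE_GETTER(name, value) \
    static int name (void) { return (value); }

/* No trailing semicolon here: after a function definition, a stray ';'
 * at file scope is an empty declaration, which ISO C does not allow. */
DEFINE_GETTER (get_answer, 42)
```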
Matt Turner
10a057e27f Post-release version bump to 0.40.1
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 15:01:30 -07:00
Matt Turner
244383bf9f Pre-release version bump to 0.40.0
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 14:52:22 -07:00
Matt Turner
405f26068c Move from MD5/SHA1 to SHA256/SHA512 digests
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 14:52:22 -07:00
Matt Turner
88b167d18c Build xz tarballs instead of bzip2
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 14:49:46 -07:00
Matt Turner
54a13221ee Distribute the blue-noise files
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-19 14:46:56 -07:00
Ghabry
eb0c3d26ed Enabled armv6 SIMD for 3DS (devkitARM) and arm neon SIMD for PS Vita (vitasdk) and Switch (devkitA64) 2020-04-14 00:08:57 +00:00
Matt Turner
9976d2c099 loongson: Avoid C90 mixing-code-and-decls warning 2020-04-07 15:18:09 -07:00
Shiyou Yin
5330640025 configure.ac: use '-mloongson-mmi' for Loongson MMI
It's recommended to use '-mloongson-mmi' for MMI.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2020-04-07 15:18:03 -07:00
Adam Jackson
348e99b52f fast-path: Fix some sketchy pointer arithmetic
We want a uint8_t * at the end of this math, because that's what the
function we're about to pass it to takes. But ->bits is a uint32_t, so
if we just do the math in units of that we can avoid the explicit factor
of four which would risk an integer overflow.

Fixes: pixman/pixman#14
2020-04-02 14:58:52 +00:00
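A minimal sketch of the idea in commit 348e99b52f, assuming rows are `stride_words` 32-bit units apart. `row_byte_ptr` is a hypothetical helper, and the widening cast to `ptrdiff_t` is an extra precaution of this sketch rather than something stated in the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: address of byte x on row y. Pointer arithmetic on
 * a uint32_t * already scales by 4, so there is no explicit `* 4` that
 * could overflow an int; only the final result is viewed as uint8_t *. */
static uint8_t *
row_byte_ptr (uint32_t *bits, int stride_words, int y, int x)
{
    uint32_t *row = bits + (ptrdiff_t) stride_words * y;

    return (uint8_t *) row + x;
}
```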
Matt Turner
ba5d794515 lowlevel-blt-bench: Remove unused variable
Closes: https://gitlab.freedesktop.org/pixman/pixman/issues/7
2020-03-20 12:42:45 -07:00
Federico Mena Quintero
6fe0131394 Initialize temporary buffers in general_composite_rect()
Otherwise, Valgrind shows things like "conditional jump or move
depends on uninitialised values" errors much later in calling code.
For example, see https://gitlab.gnome.org/GNOME/librsvg/issues/572

Fixes https://gitlab.freedesktop.org/pixman/pixman/issues/9
2020-03-18 18:52:16 -06:00
Antonio Ospite
3344f507dd pixman-compiler.h: fix building tests with MinGW
MinGW supports __declspec(dllexport) but the current logic that sets
PIXMAN_EXPORT only uses it when building with MSVC, leaving some symbols
hidden when building with MinGW.

This results in an error when trying to link the tests:

-----------------------------------------------------------------------
FAILED: subprojects/pixman/test/combiner-test.exe
x86_64-w64-mingw32-gcc  -o subprojects/pixman/test/combiner-test.exe 'subprojects/pixman/test/f48fa9c@@combiner-test@exe/combiner-test.c.obj' -Wl,--allow-shlib-undefined -Wl,--start-group subprojects/pixman/test/libtestutils.a subprojects/pixman/pixman/libpixman-1.dll.a -pthread -fopenmp -fopenmp -lm -mconsole -lkernel32 -luser32 -lgdi32 -lwinspool -lshell32 -lole32 -loleaut32 -luuid -lcomdlg32 -ladvapi32 -Wl,--end-group
/usr/bin/x86_64-w64-mingw32-ld: subprojects/pixman/test/f48fa9c@@combiner-test@exe/combiner-test.c.obj: in function `main':
.../build/../subprojects/pixman/test/combiner-test.c:124: undefined reference to `_pixman_internal_only_get_implementation'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
-----------------------------------------------------------------------

By using PIXMAN_API also when building with MinGW, the tests can link
successfully and the build succeeds.

Tested with x86_64-w64-mingw32-gcc (GCC) 8.3-win32 20191201.
2020-03-15 00:19:56 +01:00
Yin Shiyou
127d9525d6 pixman-combine: Fix wrong value of RB_MASK_PLUS_ONE.
No functional change, as explained by Søren in
https://lists.freedesktop.org/archives/pixman/2020-February/004902.html
2020-02-20 09:55:17 -08:00
Mathieu Duponchelle
e8321503c6 meson: add missing function check (getisax)
.. and add gettimeofday to the list of funcs to check instead
of having a separate check for it.
2020-01-30 23:31:35 +01:00
Mathieu Duponchelle
8992d5b4fc meson: finish porting over mmx and ssse2 flags for sun and msvc
Those flags are set by the configure.ac script
2020-01-30 23:29:20 +01:00
Khem Raj
364760cd3d test/utils: Check for FE_INVALID definition before use
Some architectures e.g. nios2 do not support all exceptions.
2019-12-19 23:34:38 +00:00
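The check described in commit 364760cd3d can be sketched like this. `supported_exception_mask` is a hypothetical helper; it only relies on the standard rule that an `FE_*` macro is defined only when the platform supports that exception.

```c
#include <fenv.h>

/* Hypothetical helper: builds a mask from the exceptions this platform
 * actually defines, instead of assuming all of them exist. */
static int
supported_exception_mask (void)
{
    int mask = 0;

#ifdef FE_DIVBYZERO
    mask |= FE_DIVBYZERO;
#endif
#ifdef FE_INVALID
    mask |= FE_INVALID;
#endif

    return mask;
}
```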
Chun-wei Fan
7331d2b4e3 thread-test.c: Use Windows Threading API on Windows
...When we don't have a pthreads implementation available, which is
normally the case on Windows.  This attempts to make it easier for people
on Windows to verify whether their builds of Pixman (and Cairo component,
if applicable) are thread-safe.  Also, make the number of threads
a #define, so if we need to change it at some point, it's easier.

This re-enables the thread-test program on Windows in Meson builds.
2019-11-19 05:50:28 +08:00
Chun-wei Fan
1dd3bc0a35 demos: Define _USE_MATH_DEFINES on MSVC-style compilers
This is required for the use of M_PI.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
3bceb3a9d3 test/solid-test.c: Include stdint.h
We need that to make sure we have UINT16_MAX.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
c608e9663e pixman/meson.build: Define PIXMAN_API on MSVC-style compilers
This will make the public APIs exported from the DLL, so that we have an
import library that we can use.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
9d8dd17ada pixman-[compiler|private].h: Export symbols for tests
Define the existing PIXMAN_EXPORT to be PIXMAN_API, which can be overridden
to be __declspec(dllexport) during the build of the pixman DLL on MSVC
builds, which will be in the next patch.

Also, export more private symbols as they are needed for the test
programs.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
21d8ded566 pixman/pixman.h: Mark public APIs with PIXMAN_API
We can override PIXMAN_API with a CFLAG or config.h define to export
the symbols with compiler directives, if needed.
2019-11-19 05:49:35 +08:00
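The overridable export macro introduced by this series (commits b7eea54028 through c608e9663e) follows a common pattern, sketched below. `MYLIB_API`, `MYLIB_BUILDING_DLL` and `mylib_version_major` are hypothetical names; pixman's real definition of PIXMAN_API is not shown in these commit messages.

```c
/* The build system may predefine MYLIB_API (for example through a
 * compiler flag); otherwise pick a default for the current compiler. */
#ifndef MYLIB_API
# if defined (_WIN32) && defined (MYLIB_BUILDING_DLL)
#  define MYLIB_API __declspec (dllexport)
# elif defined (__GNUC__) && __GNUC__ >= 4
#  define MYLIB_API __attribute__ ((visibility ("default")))
# else
#  define MYLIB_API
# endif
#endif

/* Public declarations carry the macro; the definition needs no repeat. */
MYLIB_API int mylib_version_major (void);

int
mylib_version_major (void)
{
    return 0;
}
```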
Chun-wei Fan
b7eea54028 pixman/pixman-version.h.in: Add a PIXMAN_API macro
This prepares to mark the public APIs that we have in pixman.h so that
we can use compiler directives such as __declspec(dllexport) to export
those symbols.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
06a3f6e60b meson.build: Improve libpng search on MSVC
The build system for libpng for MSVC does not generate a pkg-config file
for us, and CMake support in Meson does not work very well.  So, look
for libpng manually on MSVC builds if dependency discovery did not work
out via pkg-config or the CMake config files.
2019-11-19 05:49:35 +08:00
Chun-wei Fan
7661b1fae9 build: Don't assume PThreads if threading support is found
Look also for pthread.h if threading support is found by Meson, as the
underlying threading support may not be PThreads, depending on platform.

For now, disable the thread-test test program if pthread.h and if
necessary, the PThreads library, cannot be found, as the current
implementation assumes the use of PThreads.

Also bump the required Meson version to 0.50.0 since we need it for
-cc.get_argument_syntax()
-For a later commit, the has_headers sub-method for cc.find_library()
2019-11-19 05:49:35 +08:00
Chun-wei Fan
e9db26898b meson.build: Disable OpenMP on MSVC builds
The implementation of OpenMP is not compliant for our uses, so disable
it for now by just not checking for it on MSVC builds, as we implicitly
add an /openmp switch to the build, which will cause linking the tests
programs to fail, as the OpenMP implementation is not enough.
2019-11-19 05:49:34 +08:00
Chun-wei Fan
f251c12f8a meson.build: Fix MMX, SSE2 and SSSE3 checks on MSVC
-For MSVC builds, do not use the GCC-specific CFlags when checking for
 these features.

-For the MMX check, assume that we have good enough MMX intrinsics and
 inline assembly support (on ix86), since MSVC provides sufficient
 support for those since before the times of MSVC 2008, and 2008 is the
 oldest version that we can support, as with the pre-C99 GTK+ stack.

Unfortunately due to x64 compiler issues, pre-Visual Studio 2010 will
crash when building SSSE3 code, so we do not enable building SSSE3 code
on pre-2010 Visual Studio.

Also, for all x64 Visual Studio builds, we do not enable USE_X86_MMX
as inline assembly is not allowed for x64 Visual Studio builds, and
instead use the compatibility intrinsics that we already have in the
code.
2019-11-18 16:19:36 +08:00
Adam Jackson
32a55aa8ac pixman-sse2: Fix undefined unaligned loads 2019-11-13 20:00:20 +00:00
Adam Jackson
47bec681d9 pixman-mmx: Fix undefined unaligned loads 2019-11-13 20:00:20 +00:00
Adam Jackson
baed75faa9 pixman-mmx: Fix undefined left-shifts 2019-11-13 20:00:20 +00:00
Adam Jackson
85acb0a933 test: Fix unrepresentable subtraction in stress-test
Does not make the test pass, but does fix this error:

../test/stress-test.c:538:25: runtime error: signed integer overflow: 2147483647 - -2 cannot be represented in type 'int'
2019-11-01 14:36:54 -04:00
Adam Jackson
1f5b20c4aa pixman-matrix: Fix left shift of a negative number
../pixman/pixman-matrix.c:276:35: runtime error: left shift of negative value -32768
2019-11-01 14:36:54 -04:00
Adam Jackson
bcfb3490db pixman-bits-image: Fix left shift of a negative number
../pixman/pixman-bits-image.c:678:33: runtime error: left shift of negative value -32768
2019-11-01 14:36:52 -04:00
Adam Jackson
fef82109eb pixman-bits-image: Fix various undefined left shifts
../pixman/pixman-bits-image.c:221:20: runtime error: left shift of 204 by 24 places cannot be represented in type 'int'
2019-10-15 16:35:25 -04:00
Adam Jackson
7d6b71b315 pixman-fast-path: Fix various undefined left shifts
../pixman/pixman-fast-path.c:3089:23: runtime error: left shift of 154 by 24 places cannot be represented in type 'int'
2019-10-15 16:34:56 -04:00
Adam Jackson
880f48b2b4 pixman-sse2: Fix an undefined left shift
../pixman/pixman-sse2.c:3346:14: runtime error: left shift of 41891 by 16 places cannot be represented in type 'int'
2019-10-15 16:33:46 -04:00
Adam Jackson
4897ad0a3f pixman-gradient-walker: Fix undefined left shift
../pixman/pixman-gradient-walker.c:216:35: runtime error: left shift of 163 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:45 -04:00
Adam Jackson
7eb9c8c004 pixman-image: Fix undefined left shift
../pixman/pixman-image.c:963:46: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:45 -04:00
Adam Jackson
81c87543d1 pixman-combine: Fix various undefined left shifts
../pixman/pixman-combine32.c:657:1: runtime error: left shift of 128 by 24 places cannot be represented in type 'int'
../pixman/pixman-combine32.c:694:1: runtime error: left shift of 232 by 24 places cannot be represented in type 'int'
../pixman/pixman-combine32.c:712:1: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
../pixman/pixman-combine32.c:786:1: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
../pixman/pixman-combine32.c:805:1: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:45 -04:00
Adam Jackson
6d0a930b14 pixman-access: Fix various undefined left shifts
../pixman/pixman-access.c:389:2: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
../pixman/pixman-access.c:1101:2: runtime error: left shift of 2 by 30 places cannot be represented in type 'int'
../pixman/pixman-access.c:1152:2: runtime error: left shift of 2 by 30 places cannot be represented in type 'int'
2019-10-15 16:31:43 -04:00
Adam Jackson
a09bcc062f pixman: Fix undefined left shift in pixel_contract_from_float
../pixman/pixman-utils.c:216:14: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:40 -04:00
Adam Jackson
f6040f56da test: Fix undefined left shift in pixel_checker_init
../test/utils.c:2070:57: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
2019-10-15 16:31:38 -04:00
Adam Jackson
52c27c82de test: Fix undefined left shift in affine-test
../test/affine-test.c:174:34: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
2019-10-15 16:31:33 -04:00
Jonathan Kew
d60b0af5e3 Avoid undefined behavior (left-shifting negative value) in pixman_int_to_fixed
Reported in https://bugzilla.mozilla.org/show_bug.cgi?id=1580352. Casting the argument to uint32_t should avoid invoking undefined behavior here. We'll still have *implementation-defined* behavior when casting the result back to pixman_fixed_t, but that's better than *undefined*.
2019-09-11 12:07:46 +00:00
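The approach described in commit d60b0af5e3 can be sketched as follows, assuming a 16.16 fixed-point layout. `fixed_16_16_t` and `int_to_fixed` are hypothetical names, not the pixman macro itself.

```c
#include <stdint.h>

/* Hypothetical 16.16 fixed-point type. */
typedef int32_t fixed_16_16_t;

/* The shift is done on an unsigned value, which is well defined for any
 * input. Converting the result back to a signed type is
 * implementation-defined when it does not fit, but it is not undefined. */
static fixed_16_16_t
int_to_fixed (int i)
{
    return (fixed_16_16_t) ((uint32_t) i << 16);
}
```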
Dylan Baker
afc6c935f1 meson: don't use link_with for library()
Meson doesn't do the expected thing when library() creates a static
library. Instead of combining the libraries together into a single
archive it effectively discards them, resulting in missing symbols.

To work around this we manually unpack the archives and shove the .o
files into the final library. This doesn't affect the shared library at
all, but makes the static library have the necessary symbols

Fixes #33
2019-09-09 16:06:18 -07:00
Jonathan Kew
c558647fdf Explicitly cast byte to uint32_t before left-shifting.
To avoid potential signed integer overflow (undefined behavior), as implicit integer promotion means the operand becomes a (signed) int.

(Issue originally reported at https://bugzilla.mozilla.org/show_bug.cgi?id=1577669)
2019-08-30 10:42:45 +00:00
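The promotion issue behind commit c558647fdf, and behind the "left shift of 255 by 24 places" reports above, can be sketched as follows. `pack_argb` is a hypothetical helper.

```c
#include <stdint.h>

/* Hypothetical helper: packs four 8-bit channels into one 32-bit pixel.
 * Without the casts each uint8_t is promoted to (signed) int, and
 * shifting a value of 128 or more left by 24 overflows that int. */
static uint32_t
pack_argb (uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t) a << 24) |
           ((uint32_t) r << 16) |
           ((uint32_t) g << 8) |
           (uint32_t) b;
}
```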
Christoph Reiter
fd5c0da579 meson: fix TLS support under mingw
GCC on Windows complains that "__declspec(thread)" doesn't work, but still
compiles it, so the meson check doesn't work. The warning printed by gcc:
"warning: 'thread' attribute directive ignored [-Wattributes]"

Pass -Werror=attributes to make the check fail instead.

This fixes the test suite (minus gtk tests) on Windows with mingw.
2019-06-10 16:42:59 +00:00
Christoph Reiter
4851d4e20f meson: allow building a static library
So that passing "-Ddefault_library=both" also creates a static lib.

Note that Libs.private in the .pc file will still be wrong because of
https://github.com/mesonbuild/meson/issues/3934 (it contains things like
-lpixman-mmx)
2019-06-10 16:38:39 +00:00
Christoph Reiter
be0d3e6994 meson: define SIZEOF_LONG and use -Wundef
meson builds defaulted to SIZEOF_LONG=0 in various places
2019-06-10 16:34:06 +00:00
Basile Clement
0ee0ad23de Don't use GNU extension for binary numbers
The dithering code (specifically `dither_factor_bayer_8`) uses a GNU
extension for binary notation, eg 0b001.  This is not supported by MSVC
(at least) and breaks the build on this platform [1].

This patch uses hexadecimal notation instead, fixing the build.

[1]: https://lists.freedesktop.org/archives/pixman/2019-June/004883.html

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-10 09:32:12 -07:00
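To show the kind of bit manipulation where the portable hexadecimal masks from commit 0ee0ad23de apply, here is a textbook construction of the 8x8 Bayer index. It is a sketch under that assumption and not a copy of `dither_factor_bayer_8`; `bayer8_index` is a hypothetical name.

```c
/* Hypothetical helper: position (x, y) -> rank 0..63 in the classic 8x8
 * Bayer matrix. The rank is the bit-reversed interleaving of (x ^ y) and
 * y. Masks use 0x1, 0x2, 0x4 rather than the GNU-only 0b001, 0b010,
 * 0b100 spellings. */
static unsigned
bayer8_index (unsigned x, unsigned y)
{
    unsigned xc = (x ^ y) & 0x7;
    unsigned yc = y & 0x7;

    return ((xc & 0x1) << 5) | ((yc & 0x1) << 4) |
           ((xc & 0x2) << 2) | ((yc & 0x2) << 1) |
           ((xc & 0x4) >> 1) | ((yc & 0x4) >> 2);
}
```

Dividing the rank by 64 gives a threshold in [0, 1) that repeats every 8 pixels in both directions.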
Basile Clement
cb2ec4268f Ordered dithering with blue noise, v2
On some screens (typically low quality laptop screens), using Bayer
ordered dithering has been observed to cause color changes depending on
*where the gradient is rendered on the screen*, causing visible
flickering when moving an image on the screen.

To alleviate the issue, this patch adds support for ordered dithering
using a 64x64 matrix tuned from blue noise.  In addition to being devoid
of the positional dependency on screen, the blue noise matrix also
generates more pleasing and less discernable patterns.  As such, it is
now the method used for PIXMAN_DITHER_GOOD and PIXMAN_DITHER_BEST
dithering methods.

The 64x64 blue noise matrix has been generated using the provided
`pixman/dither/make-blue-noise.c` script, which uses the
void-and-cluster method.

Changes since v1 (thanks Bill):
 - Use uint16_t for the blue noise matrix for lower memory usage
 - Use bitwise computation for array index
2019-05-25 07:30:19 -07:00
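The "bitwise computation for array index" mentioned in commit cb2ec4268f can be sketched for a 64x64 table stored row by row. `NOISE_SIZE` and `noise_index` are hypothetical names, and the table contents are omitted.

```c
/* Hypothetical size of the square noise table (a power of two). */
#define NOISE_SIZE 64u

/* Hypothetical helper: wraps x and y into 0..63 with a mask instead of a
 * modulo, then combines them into a row-major index (y * 64 + x). The
 * conversion to unsigned makes negative coordinates wrap as well. */
static unsigned
noise_index (int x, int y)
{
    unsigned xi = (unsigned) x & (NOISE_SIZE - 1u);
    unsigned yi = (unsigned) y & (NOISE_SIZE - 1u);

    return (yi << 6) | xi;
}
```

An index in 0..4095 would then select one `uint16_t` entry of the matrix.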
Basile Clement
98b5ec74ca demos: Add a dithering demo
This adds a dither.c which provides a demo of the dithering feature.
This is based on the scale.c demo for scaling and provides a selection
of intermediate formats and dithering operators (currently, only
PIXMAN_DITHER_ORDERED_BAYER_8) to use.  Images are first blitted onto a
surface of the intermediate format with the requested dither setup, then
blitted back onto a a8r8g8b8 surface for display.
2019-05-25 07:30:11 -07:00
Basile Clement
37d2e681b3 test: Check the dithering path in tolerance-test
This adds support for testing dithered destinations in tolerance-test.
When dithering is enabled, the pixel checker allows for an additional
quantization error.
2019-05-25 07:30:02 -07:00
Basile Clement
ddcc41b999 Implement basic dithering for the wide pipeline, v3
This patch implements dithering in pixman.  A "dither" property is added
to BITS images, which is used to:

 - Force rendering to the image to go through the floating point
   pipeline.  Note that this is different from FAST_PATH_NARROW_FORMAT
   as it should not enable the floating point pipeline when reading from
   the image.

 - Enable dithering in dest_write_back_wide.  The dithering uses the
   destination format to determine noise amplitude.

This does not change pixman's behavior when dithering is disabled (the
default).

Additional types and functions are added to the public API:

 - The `pixman_dither_t` enum exposes the available dithering methods.
   Currently a single dithering method based on 8x8 Bayer matrices is
   implemented (PIXMAN_DITHER_ORDERED_BAYER_8).  The PIXMAN_DITHER_FAST,
   PIXMAN_DITHER_GOOD and PIXMAN_DITHER_BEST aliases are provided and
   should be used to benefit from future specializations.

 - The `pixman_image_set_dither` function allows to set the dithering
   method to use when rendering to a bits image.

 - The `pixman_image_set_dither_offset` function allows to set a
   vertical and horizontal offsets for the dither matrix.  This can be
   used after scrolling to ensure a consistent spatial positioning of
   the dither matrix.

Changes since previous version (v2):
 - linear_gradient_is_horizontal optimization is still compatible with
   the wide pipeline.  The code disabling it was a remnant of a previous
   patch which performed dithering directly inside linear_get_scanline,
   and thus needed to be called independently for each scanline.

Changes since v1:
 - Renamed PIXMAN_DITHER_BAYER_8 to PIXMAN_DITHER_ORDERED_BAYER_8
 - Disable dithering for channels with 32bpp or more (since they can
   represent exactly the wide values already).  This makes the patches
   compatible with the newly added floating point format.

Dithering is compatible with linear_gradient_is_horizontal
2019-05-25 07:29:55 -07:00
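As a rough illustration of what ordered dithering does when a wide value is written back to a narrower destination (commit ddcc41b999), here is a generic sketch. It is the textbook formulation, assuming a threshold in [0, 1) and a channel width of 1 to 16 bits; `quantize_with_dither` is a hypothetical name and pixman's actual scaling is not described in the commit message.

```c
/* Hypothetical helper: quantizes v in [0, 1] to `bits` bits. Adding the
 * per-pixel threshold before truncating makes neighbouring pixels round
 * in different directions, which hides banding. */
static unsigned
quantize_with_dither (float v, unsigned bits, float threshold)
{
    unsigned max = (1u << bits) - 1u;
    float scaled;

    if (v < 0.0f)
        v = 0.0f;
    if (v > 1.0f)
        v = 1.0f;

    scaled = v * (float) max + threshold;

    /* scaled is non-negative, so the conversion below acts like floor. */
    if (scaled > (float) max)
        return max;

    return (unsigned) scaled;
}
```

With a 1-bit channel, a value of 0.5 becomes 0 for thresholds below 0.5 and 1 otherwise, so half the pixels of a flat mid-grey area are lit.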
Fan Jinke
85bfa8b4f9 add Hygon Dhyana support to enable X86_MMX_EXTENSIONS feature
Signed-off-by: Fan Jinke <fanjinke@hygon.cn>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-05-02 16:07:19 -07:00
Basile Clement
8256c235d9 Fix bilinear filter computation in wide pipeline
The recently introduced wide pipeline for filters has a typo which
causes it to improperly compute bilinear interpolation positions,
causing various glitches when enabled.

This patch uses the proper computation for bilinear interpolation in the
wide pipeline.  It also makes related `if` statements conformant to the
CODING_STYLE:

* If a substatement spans multiple lines, then there must be braces
  around it.

* If one substatement of an if statement has braces, then the other
  must too.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-04-11 10:59:00 +02:00
Matt Turner
e21ebfb13f Post-release version bump to 0.38.5
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-10 10:25:18 -07:00
Matt Turner
e8df10eea9 Pre-release version bump to 0.38.4
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-10 10:17:47 -07:00
Matt Turner
23f036d461 Makefile.am: Ship Meson assembly test files in the tarball
These were forgotten in commit 0ea37df428 (meson: store ARM SIMD and
NEON tests as text files) and since autotools doesn't use them make
distcheck still succeeded.

Fixes #30

Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-10 10:10:47 -07:00
Matt Turner
e7058fe49d Makefile.am: Update download links
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-07 13:43:57 -07:00
Matt Turner
8888e752bf Post-release version bump to 0.38.3
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-07 13:34:44 -07:00
Matt Turner
a7ffb3e617 Pre-release version bump to 0.38.2
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-07 13:13:30 -07:00
Matt Turner
4c4753c407 meson: Correct copy-and-paste mistake
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-04-07 12:31:40 -07:00
Niveditha Rau
72959837ab void function should not return a value
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-03-27 15:14:05 -07:00
Simon Richter
ef4fb03248 Windows: Support building with SHELL=cmd.exe
When GNU Make is not from msys, the startup cost for sh.exe is massive
compared to cmd.exe.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-03-27 15:12:52 -07:00
Simon Richter
55d8f956c2 Windows: Show compiler invocation
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-03-27 15:12:52 -07:00
Dylan Baker
0ea37df428 meson: store ARM SIMD and NEON tests as text files
This is unfortunately required to make the tests work correctly, as
otherwise meson assumes that the files are C code not assembly. I've
opened https://github.com/mesonbuild/meson/issues/5151, to discuss
fixing the issue in meson upstream.

Fixes #29
2019-03-27 10:54:50 -07:00
Dylan Baker
2065a07e98 meson: simplify and fix mmx library compilation
This simplifies the logic and fixes the loongson-mmi implementation to
build correctly.
2019-03-27 10:54:50 -07:00
Dylan Baker
6e206cf7fc meson: Add proper include paths for the loongson check 2019-03-27 10:53:34 -07:00
Dylan Baker
9ed0576a73 meson: fix copy-n-paste error for arm simd assembly
mentioned in #29
2019-03-27 10:53:34 -07:00
Dylan Baker
d13f6a8b1d meson: fix typo which breaks loongson checks
mach -> march
2019-03-27 10:53:34 -07:00
Dylan Baker
e7ac62c3c7 meson: work around meson issue #5115
This issue causes openmp arguments to be injected into compilers that
can support openmp, even if they don't. This issue will be fixed in
0.51 (code already landed in mesonbuild#5116); for older versions let's
work around the issue.
2019-03-27 10:53:33 -07:00
Maarten Lankhorst
5d2cf8fc21 Bump version to 0.38.0
And update RELEASING for the new meson build system.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-02-11 13:27:25 +01:00
Maarten Lankhorst
6240ad15c6 pixman: Use maximum precision for pixman-bits-image, v2.
pixman-bits-image's wide helpers first obtain the 8-bit image,
then convert it to float. This destroys all the precision that
the wide path was offering.

Fix this by making get_pixel() take a pointer instead of returning
a value. Floating point will fill in an argb_t, while the 8-bit path
will fill a 32-bit ARGB value. This also requires writing a floating
point bilinear interpolator. With this change pixman can use the full
floating point precision internally in all paths.

Changes since v1:
- Make accum and reduce an argument to convolution functions,
  to remove duplication.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Basile Clement <basile-pixman@clement.pm>
2019-02-11 12:48:57 +01:00
Basile Clement
a32fc4faf9 Implement floating point gradient computation, v2.
This patch modifies the gradient walker to be able to generate floating
point values directly in addition to a8r8g8b8 32 bit values.  This is
then used by the various gradient implementations to render in floating
point when asked to do so, instead of rendering to a8r8g8b8 and then
expanding to floating point as they were doing previously.

Changes since v1 (mlankhorst):
- Implement pixman_gradient_walker_pixel_32 without calling
  pixman_gradient_walker_pixel_float, to prevent performance degradation.
  Suggested by Adam Jackson.
- Fix whitespace errors.
- Remove unnecessary function prototypes in pixman-private.h

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[mlankhorst: Add comment about pixman_contract_from_float,
             based on Basile's suggestion]
Acked-by: Basile Clement <basile-pixman@clement.pm>
2019-02-11 12:48:21 +01:00
Dylan Baker
b40d5495ec build: Add meson files to EXTRA_DIST
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-01-15 19:13:24 -08:00
Dylan Baker
16eacf19a3 editorconfig: use tabs for Makefiles
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-01-15 19:13:14 -08:00
Andreas Boll
60eec33554 Upload to unstable. 2018-12-12 22:02:53 +01:00
Andreas Boll
431e754de7 Bump standards version to 4.2.1. 2018-12-12 21:55:48 +01:00
Andreas Boll
8c5411a23b Bump debhelper compat to 11. 2018-12-12 21:55:28 +01:00
Andreas Boll
ea26aeb957 Set source format to 1.0. 2018-12-12 21:54:43 +01:00
Andreas Boll
da6e874f81 Use https URL in debian/copyright. 2018-12-12 21:53:40 +01:00
Andreas Boll
2d7f5e5831 Update Vcs-* URLs to point to salsa.debian.org. 2018-12-12 21:52:18 +01:00
Andreas Boll
c8e824af7b Update to my Debian address. 2018-12-12 21:51:21 +01:00
Andreas Boll
a26c93f936 Bump changelogs 2018-12-12 21:07:41 +01:00
Andreas Boll
51baef77fb Merge branch 'debian-unstable' into debian-unstable-new 2018-12-12 21:01:26 +01:00
Andreas Boll
0c08dcfc0c pixman 0.34.0 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWrh2cAAoJEGUdTbirWueAfxAH/1sf8P0SHY1y9KBKCw0enM4Y
 60sZYAgTgLa5prITcPeTb11bw877WAF73bAVjzL+6pNkT+Xs1ytvckwmbDoKDRZi
 zlptf0vPCnPX95Fh2X2PSO/1G0EErNWbqP5dUtLJ8L4sEaAj5TtDC9r9BouXpFaR
 qdipAmC1dVQNsbheBUinnfIjQ7H7i0NXXoUADFoP+X9V3WW95Hjkbwyoa4IUeYsY
 lPLVKfMRTZfQLksAAViDDpAhQxIrwMYQYApuMlbYXvX3tsW6zZCTeDfjqwRfxkdX
 Nnsz3lKBGvbS2ZJQBx2Xp9YC7+eu12IlxFA8cn3Exa96VngPJK5bR8Qn1ZJlUH8=
 =hex7
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.34.0' into debian-unstable-new

pixman 0.34.0 release
2018-12-12 21:01:18 +01:00
Maarten Lankhorst
146fa64351 Merge remote-tracking branch 'origin/master'
And bump meson version to 37.1 as well. Seems my push to upstream failed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2018-12-07 14:20:44 +01:00
Maarten Lankhorst
0202f0d89d Post release version bump to 37.1
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2018-12-07 13:44:38 +01:00
Dylan Baker
eb0dfaa0c6 gitlab-ci: Add meson build to pipeline test 2018-11-29 16:57:01 +00:00
Dylan Baker
199a3bd275 meson: Add a meson build system
This commit adds a meson build system for pixman. It carries the usual
improvements of meson, better clean build time, much better incremental
build times, while being simpler and easier to understand.

This takes advantage of some features from the most recent versions of
meson: the builtin openmp dependency and the feature option type.

There are a couple of things that I've done a bit differently than the
autotools build system: I've built a libdemos, which is the utilities
from the demos folder, and I've linked the demos with libtestutils from
tests. Otherwise I expect that most things will be the same.

I've tested so far cross compiling from x86_64 -> x86, x86_64 ->
Aarch64, and Linux to Windows via mingw, as well as native x86_64 Linux
builds which all work. I've also built with mingw natively; there are
some test failures there. An MSVC build can be generated, but fails.

v2: - set WORDS_BIGENDIAN in the config for big endian systems.
2018-11-29 16:57:01 +00:00
Dylan Baker
761f36c3c8 Add .editorconfig file
This sets the style for meson (which uses the upstream style, 2 space
indent with no tabs), and sets the tab_width to 8 per the CODING_STYLE
document.
2018-11-29 16:57:01 +00:00
Maarten Lankhorst
0313f35ab9 Bump version to 0.36.0
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2018-11-21 12:40:26 +01:00
Maarten Lankhorst
8a5d44c420 pixman: Update git repository to the one at gitlab.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2018-11-21 12:39:33 +01:00
Maarten Lankhorst
489fa0df11 pixman: Add tests for (a)rgb floating point formats.
Add some basic tests to ensure that the newly added formats work as
intended.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-11-06 14:25:49 +01:00
Maarten Lankhorst
a4b8a26d2b pixman: Add support for argb/xrgb float formats, v5.
Pixman is already using the floating point formats internally, expose
this capability in case someone wants to support higher bit per
component formats.

This is useful for igt which depends on cairo to do the rendering.
It can use it to convert floats internally to planar Y'CbCr formats,
or to F16.

We add a new type PIXMAN_TYPE_RGBA_FLOAT for this format, which is an
all float array of R, G, B, and A. Formats that use mixed float/int
RGBA aren't supported, and will probably need their own type.

Changes since v1:
- Use RGBA 128 bits and RGB 96 bits memory layouts, to better match the opengl format.
Changes since v2:
- Add asserts in accessor and for strides to force alignment.
- Move test changes to their own commit.
Changes since v3:
- Define 32bpc as PIXMAN_FORMAT_PACKED_C32
- Rename pixman accessors from rgb*_float_float to rgb*f_float
Changes since v4:
- Create a new PIXMAN_FORMAT_BYTE for fitting up to 64 bits per component.
  (based on Siarhei Siamashka's suggestion)
- Use new format type PIXMAN_TYPE_RGBA_FLOAT

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v4
[mlankhorst: Fix missing braces in PIXMAN_FORMAT_RESHIFT macro]
2018-11-06 14:24:05 +01:00
Siarhei Siamashka
018bf2f230 test: Fix stride calculation in stress-test
Currently the number of bits per pixel is used instead of the
number of bytes per pixel when calculating image strides. This
does not cause any real problems, but the gaps between scanlines
are excessively large.

This patch actually converts bits to bytes and rounds up the result
to the nearest byte boundary.
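
The conversion described above can be sketched as follows. This is a minimal illustration, not the code from stress-test; the helper name `min_stride_bytes` is hypothetical, and it ignores any extra alignment the test may apply on top.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes needed for one scanline of
 * `width` pixels at `bpp` bits per pixel, rounded up to a whole byte.
 * Overflow for very large widths is not handled in this sketch. */
static size_t min_stride_bytes(size_t width, unsigned bpp)
{
    return (width * bpp + 7) / 8;
}
```

Using `width * bpp` directly as the stride would still be large enough, which is why the old code worked, but it allocates eight times more space per scanline than necessary.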

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: soren.sandmann@gmail.com
2018-07-06 14:44:22 -04:00
Vladimir Smirnov
bd2b49185b test: Adjust for clang's removal of __builtin_shuffle
__builtin_shuffle was removed in clang 5.0.

Build log says:
test/utils-prng.c:207:27: error: use of unknown builtin '__builtin_shuffle' [-Wimplicit-function-declaration]
            randdata.vb = __builtin_shuffle (randdata.vb, bswap_shufflemask);
                          ^
test/utils-prng.c:207:25: error: assigning to 'uint8x16' (vector of 16 'uint8_t' values) from incompatible type 'int'
            randdata.vb = __builtin_shuffle (randdata.vb, bswap_shufflemask);
                        ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated

Link to original discussion:
http://lists.llvm.org/pipermail/cfe-dev/2017-August/055140.html

It's possible to build pixman if the attached patch is applied. Basically
the patch adds a check for __builtin_shuffle support and, in case there is
none, falls back to the clang-specific __builtin_shufflevector, which does
the same but has a different API.
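
A rough sketch of such a fallback is shown below. It is an assumption-laden illustration rather than the actual patch: it selects the builtin with a compile-time `__clang__` check instead of a configure-time test, and the function name `bswap32_lanes` and the mask values are made up for the example.

```c
#include <stdint.h>

typedef uint8_t uint8x16 __attribute__ ((vector_size (16)));

/* Reverse the bytes inside each 32-bit lane of a 16-byte vector. */
static uint8x16 bswap32_lanes(uint8x16 v)
{
#if defined(__clang__)
    /* clang: the shuffle indices are passed as constant arguments */
    return __builtin_shufflevector(v, v,
                                   3, 2, 1, 0, 7, 6, 5, 4,
                                   11, 10, 9, 8, 15, 14, 13, 12);
#else
    /* gcc: the shuffle indices are passed as a mask vector */
    const uint8x16 mask = { 3, 2, 1, 0, 7, 6, 5, 4,
                            11, 10, 9, 8, 15, 14, 13, 12 };
    return __builtin_shuffle(v, mask);
#endif
}
```

The two builtins produce the same permutation; only the way the indices are supplied differs.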

Bugzilla: https://bugs.gentoo.org/646360
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104886
Tested-by: Philip Chimento <philip.chimento@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-06-05 12:35:07 -04:00
Adam Jackson
a75c69f122 Merge branch 'ci' into 'master'
ci: Add .gitlab-ci.yml

See merge request pixman/pixman!1
2018-06-05 16:33:50 +00:00
Adam Jackson
9034d0cc32 ci: Add .gitlab-ci.yml
Just builds on Fedora 28 for x86_64 at the moment, but it's a start.
Credit to Daniel Stone for eliminating the nested docker image.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2018-06-05 12:13:35 -04:00
Dan Horák
ddf42d627c vmx: Fix vector loads on ppc64le
Use vector intrinsic for loading possibly unaligned data instead of a
typecast.

Bugzilla: https://bugzilla.redhat.com/1572540
Signed-off-by: Dan Horák <dan@danny.cz>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2018-05-14 16:31:49 -04:00
Behdad Esfahbod
8b95e0e460 Promote unsigned short to unsigned int explicitly
...to avoid default promotion to signed int, which causes undefined
behaviour in the shift expression.
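
A small illustration of the issue, assuming a platform with 32-bit `int`; the function name `high_half` is hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* Place a 16-bit quantity in the upper half of a 32-bit value. */
static uint32_t high_half(uint16_t v)
{
    /* Writing (v << 16) would first promote v to signed int; for
     * v >= 0x8000 the result does not fit in int, which is undefined
     * behaviour. Casting to an unsigned type before shifting avoids it. */
    return (uint32_t) v << 16;
}
```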
2018-01-09 10:26:29 +01:00
Andreas Boll
31381b7057 Upload to unstable. 2017-12-17 13:34:07 +01:00
Andreas Boll
f0178c049c Bump standards version to 4.1.2. 2017-12-17 13:19:45 +01:00
Andreas Boll
9684e88c21 Stop passing --disable-silent-rules to configure, debhelper does it now. 2017-12-17 13:19:23 +01:00
Andreas Boll
397047255e Switch to dbgsym package. 2017-12-17 13:18:03 +01:00
Andreas Boll
34c1784503 Declare Multi-Arch: same for libpixman-1-dev (Closes: #884166). 2017-12-17 13:17:44 +01:00
Julien Cristau
87934b6b4f Upload to unstable 2016-09-24 13:25:26 +02:00
Julien Cristau
4daa9a4c6b Use https URL in debian/watch. 2016-09-24 13:23:41 +02:00
Søren Sandmann Pedersen
85467ec308 Revert "demos/scale: Added pulldown to choose PIXMAN_FILTER_* value"
This reverts commit 375f5ec5c5.

This patch was accidentally pushed.
2016-09-03 15:09:12 -04:00
Bill Spitzak
17c4ce2e39 pixman-filter: Made Gaussian a bit wider
Expanded the size slightly (from ~4.25 to 5) to make the cutoff less
noticeable.  Previously the value at the cutoff was
gaussian_filter(sqrt(2)*3/2) = 0.00626 which is larger than the
difference between 8-bit pixels (1/255 = 0.003921). New cutoff is
gaussian_filter(2.5) = 0.001089 which is smaller.
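
The two cutoff values can be checked by hand. The worked numbers below assume the kernel is a normal density with sigma = sqrt(2)/2; the message only says that SIGMA was kept, so that value is an assumption, but it reproduces both quoted figures.

```latex
\begin{align*}
g(x) &= \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{x^2}{2\sigma^2}\right)
      = \frac{e^{-x^2}}{\sqrt{\pi}}, \qquad \sigma = \tfrac{\sqrt{2}}{2} \\
g\!\left(\tfrac{3\sqrt{2}}{2}\right) &= \frac{e^{-4.5}}{\sqrt{\pi}} \approx 0.00627 > \tfrac{1}{255} \approx 0.00392 \\
g(2.5) &= \frac{e^{-6.25}}{\sqrt{\pi}} \approx 0.00109 < \tfrac{1}{255}
\end{align*}
```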

v11: added some math to commit message
v14: left SIGMA in there
Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-03 14:53:07 -04:00
Bill Spitzak
d286078b28 pixman-filter: Nested polynomial for cubic
v11: Restored range checks

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-09-03 14:53:07 -04:00
Søren Sandmann Pedersen
133142449b pixman-filter: Fix several issues related to normalization
There are a few bugs in the current normalization code

(1) The normalization is based on the sum of the *floating point*
    values generated by integral(). But in order to get the sum to be
    close to pixman_fixed_1, the sum of the rounded fixed point values
    should be used.

(2) The multiplications in the normalization loops often round the
    same way, so the residual error can be fairly large.

(3) The residual error is added to the sample located at index
    (width - width / 2), which is not the midpoint for odd widths (and
    for width 1 is in fact outside the array).

This patch fixes these issues by (1) using the sum of the fixed point
values as the total to divide by, (2) doing error diffusion in the
normalization loop, and (3) putting any residual error (which is now
guaranteed to be less than pixman_fixed_e) at the first sample, which
is the only one that didn't get any error diffused into it.
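
The three fixes can be sketched in C roughly as follows. This is a simplified illustration under stated assumptions, not the code in pixman-filter.c: `fixed_16_16`, `FIXED_1` and `normalize_fixed` are stand-in names, and the input is assumed to hold at least one weight with a nonzero sum.

```c
#include <stdint.h>

typedef int32_t fixed_16_16;              /* stand-in for pixman_fixed_t */
#define FIXED_1 ((fixed_16_16) 0x10000)   /* stand-in for pixman_fixed_1 */

/* Normalize n fixed-point weights in place so they sum to FIXED_1. */
static void normalize_fixed(fixed_16_16 *w, int n)
{
    double total = 0.0;   /* (1) sum of the rounded fixed-point values */
    double err = 0.0;     /* (2) rounding error carried to the next sample */
    fixed_16_16 sum = 0;
    int i;

    for (i = 0; i < n; i++)
        total += w[i];

    for (i = 0; i < n; i++)
    {
        double v = w[i] * ((double) FIXED_1 / total) + err;
        /* round to nearest without relying on libm */
        fixed_16_16 t = (fixed_16_16) (v < 0 ? v - 0.5 : v + 0.5);

        err = v - t;      /* diffuse the error into the next sample */
        w[i] = t;
        sum += t;
    }

    /* (3) whatever is left goes to the first sample, which received
     * no diffused error */
    w[0] += FIXED_1 - sum;
}
```

Because of the final adjustment, the weights always sum to exactly `FIXED_1` after the call, whatever rounding happened along the way.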

Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-03 14:53:06 -04:00
Søren Sandmann Pedersen
3b46fce6fe pixman-filter: Speed up BOX/BOX filter
The convolution of two BOX filters is simply the length of the
interval where both are non-zero, so we can simply return width from
the integral() function because the integration region has already
been restricted to be such that both functions are non-zero on it.

This is both faster and more accurate than doing numerical integration.

This patch is based on one by Bill Spitzak

    https://lists.freedesktop.org/archives/pixman/2016-March/004446.html

with these changes:

- Rebased to not assume any changes in the arguments to integral().

- Dropped the multiplication by scale

- Added more details in the commit message.

Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:12 -04:00
Bill Spitzak
8855b3a2a2 pixman-filter: integral splitting is only needed for triangle filter
Only the triangle is discontinuous at 0. The other filters resemble a
cubic closely enough that Simpson's integration works without
splitting.

Changes by Søren: Rebase without the changes to the integral function,
update comment to match the new code.

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:12 -04:00
Bill Spitzak
6ae281fbb7 pixman-filter: Correct Simpsons integration
Simpson's rule uses cubic curve fitting, with 3 samples defining each
cubic. This makes the weights of the samples be in a pattern of
1,4,2,4,2...4,1, and then dividing the result by 3.
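
The weight pattern above is the composite Simpson's rule. A minimal C sketch of it follows; it is not the code from pixman-filter.c, and the names `simpson` and `cube` are illustrative only.

```c
/* Composite Simpson's rule on [a, b] using n subintervals.
 * n must be even and at least 2. The samples are weighted
 * 1,4,2,4,...,2,4,1 and the sum is scaled by h/3. */
static double simpson(double (*f)(double), double a, double b, int n)
{
    double h = (b - a) / n;
    double sum = f(a) + f(b);        /* end points have weight 1 */
    int i;

    for (i = 1; i < n; i++)          /* odd samples 4, even samples 2 */
        sum += (i % 2 ? 4.0 : 2.0) * f(a + i * h);

    return sum * h / 3.0;
}

/* Sample integrand: x^3, whose integral over [0, 2] is 4. */
static double cube(double x)
{
    return x * x * x;
}
```

Since the rule is exact for polynomials up to degree three, `simpson(cube, 0.0, 2.0, 12)` returns 4 up to floating-point rounding, even with only 12 samples.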

The previous code was using weights of 1,2,0,6,0,6...,2,1.

With this fix the integration is accurate enough that the number of
samples could be reduced a lot. Multiples of 12 seem to work best.

v7: Merged with patch to reduce from 128 samples to 16
v9: Changed samples from 16 to 12
v10: Fixed rebase error that made it not compile
v11: minor whitespace change
v14: more whitespace changes

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:12 -04:00
Bill Spitzak
6acaf2bcb1 pixman-filter: reduce amount of malloc/free/memcpy to generate filter
Rearranged so that the entire block of memory for the filter pair
is allocated first, and then filled in. Previous version allocated
and freed two temporary buffers for each filter and did an extra
memcpy.

v8: small refactor to remove the filter_width function

v10: Restored filter_width function but with arguments changed to
     match later patches

v11: Removed unused arg and pointer from filter_width function
     Whitespace fixes.

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:12 -04:00
Bill Spitzak
d0e6c9f4f6 pixman-image: Added enable-gnuplot config to view filters in gnuplot
If enable-gnuplot is configured, then you can pipe the output of a
pixman-using program to gnuplot and get a continuously-updated plot of
the horizontal filter. This works well with demos/scale to test the
filter generation.

The plot is all the different subposition filters shuffled
together. This is misleading in a few cases:

  IMPULSE.BOX - goes up and down as the subfilters have different
                numbers of non-zero samples

  IMPULSE.TRIANGLE - somewhat crooked for the same reason

  1-wide filters - looks triangular, but a 1-wide box would be more
                   accurate

Changes by Søren: Rewrote the pixman-filter.c part to
     - make it generate correct coordinates
     - add a comment on how coordinates are generated
     - in rounding.txt, add a ceil() variant of the first-sample
       formula
     - make the gnuplot output slightly prettier

v7: First time this ability was included

v8: Use config option
    Moved code to the filter generator
    Modified scale demo to not call filter generator a second time.

v10: Only print if successful generation of plots
     Use #ifdef, not #if

v11: small whitespace fixes
v12: output range from -width/2 to width/2 and include y==0, to avoid misleading plots
     for subsample_bits==0 and for box filters which may have no small values.

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:11 -04:00
Bill Spitzak
375f5ec5c5 demos/scale: Added pulldown to choose PIXMAN_FILTER_* value
This is very useful for comparing the results of SEPARABLE_CONVOLUTION
with BILINEAR and NEAREST.

v14: Removed good/best items
v15: Skip filter generation so gnuplot output continues showing previous value

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-09-02 00:40:11 -04:00
Bill Spitzak
afee2adc1e demos/scale: Default to locked axis
Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:11 -04:00
Bill Spitzak
1e1af34d3b demos/scale: fix blank subsamples spin box
It now shows the initial value of 4 when the demo is started

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Søren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:11 -04:00
Bill Spitzak
99b574109d demos/scale: Compute filter size using boundary of xformed ellipse
Instead of using the boundary of xformed rectangle, use the boundary
of xformed ellipse. This is much more accurate and less blurry. In
particular the filtering does not change as the image is rotated.

Signed-off-by: Bill Spitzak <spitzak@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Soren Sandmann <soren.sandmann@gmail.com>
2016-09-02 00:40:11 -04:00
Søren Sandmann Pedersen
b9ead7ddf7 More general BILINEAR=>NEAREST reduction
Generalize and simplify the code that reduces BILINEAR to NEAREST so
that the reduction happens for all affine transformations where
t00...t12 are integers and (t00 + t01) and (t10 + t11) are both
odd. This is a sufficient condition for the resulting transformed
coordinates to be exactly at the center of a pixel so that BILINEAR
becomes identical to NEAREST.
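
The condition can be expressed as a short check on 16.16 fixed-point matrix entries. The sketch below is illustrative only: `fixed_16_16` stands in for pixman_fixed_t and the function names are hypothetical, not those used in pixman.

```c
#include <stdint.h>

typedef int32_t fixed_16_16;   /* stand-in for pixman_fixed_t (16.16) */

/* True if the fractional 16 bits are all zero. */
static int is_integer(fixed_16_16 v)
{
    return (v & 0xffff) == 0;
}

/* True if v is an integer in 16.16 and that integer is odd. */
static int is_odd_integer(int64_t v)
{
    return (v & 0x1ffff) == 0x10000;
}

/* t[0] = { t00, t01, t02 }, t[1] = { t10, t11, t12 } */
static int bilinear_reduces_to_nearest(fixed_16_16 t[2][3])
{
    int i, j;

    for (i = 0; i < 2; i++)
        for (j = 0; j < 3; j++)
            if (!is_integer(t[i][j]))
                return 0;

    /* Pixel centres sit at (x + 0.5, y + 0.5), so the transformed
     * coordinate is an integer plus (t00 + t01) / 2; that is a pixel
     * centre again only if the sum is odd. */
    return is_odd_integer((int64_t) t[0][0] + t[0][1]) &&
           is_odd_integer((int64_t) t[1][0] + t[1][1]);
}
```

Under this check an identity transform or a 90 degree rotation with integer translation qualifies, while a 2x scale or a half-pixel translation does not.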

V2: Address some comments by Bill Spitzak

Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:11 -04:00
Søren Sandmann Pedersen
7612369013 Add new test of filter reduction from BILINEAR to NEAREST
This new test tests a bunch of bilinear downscalings, where many have
a transformation such that the BILINEAR filter can be reduced to
NEAREST (and many don't).

A CRC32 is computed for all the resulting images and compared to a
known-good value for both 4-bit and 7-bit interpolation.

V2: Remove leftover comment, some minor formatting fixes, use a
timestamp as the PRNG seed.

Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:11 -04:00
Søren Sandmann Pedersen
eb4a832ec2 pixman-fast-path.c: Pick NEAREST affine fast paths before BILINEAR ones
When a BILINEAR filter is reduced to NEAREST, it is possible for both
types of fast paths to run; in this case, the NEAREST ones should be
preferred as that is the simpler filter.

Signed-off-by: Soren Sandmann <soren.sandmann@gmail.com>
Reviewed-by: Bill Spitzak <spitzak@gmail.com>
2016-09-02 00:40:11 -04:00
Julien Cristau
0f4e087031 Bump changelogs 2016-05-13 12:50:41 +02:00
Julien Cristau
5672fa0f82 pixman 0.34.0 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWrh2cAAoJEGUdTbirWueAfxAH/1sf8P0SHY1y9KBKCw0enM4Y
 60sZYAgTgLa5prITcPeTb11bw877WAF73bAVjzL+6pNkT+Xs1ytvckwmbDoKDRZi
 zlptf0vPCnPX95Fh2X2PSO/1G0EErNWbqP5dUtLJ8L4sEaAj5TtDC9r9BouXpFaR
 qdipAmC1dVQNsbheBUinnfIjQ7H7i0NXXoUADFoP+X9V3WW95Hjkbwyoa4IUeYsY
 lPLVKfMRTZfQLksAAViDDpAhQxIrwMYQYApuMlbYXvX3tsW6zZCTeDfjqwRfxkdX
 Nnsz3lKBGvbS2ZJQBx2Xp9YC7+eu12IlxFA8cn3Exa96VngPJK5bR8Qn1ZJlUH8=
 =hex7
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.34.0' into debian-unstable

pixman 0.34.0 release
2016-05-13 12:49:33 +02:00
Oded Gabbay
1727aa4ab6 Pre-release version bump to 0.34.0
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-01-31 16:39:23 +02:00
Thomas Petazzoni
7c6066b700 pixman-private: include <float.h> only in C code
<float.h> is included unconditionally by pixman-private.h, which in
turn gets included by assembler files. Unfortunately, with certain C
libraries (like the musl C library), <float.h> cannot be included in
assembler files:

  CCLD     libpixman-arm-simd.la
/home/test/buildroot/output/host/usr/arm-buildroot-linux-musleabihf/sysroot/usr/include/float.h: Assembler messages:
/home/test/buildroot/output/host/usr/arm-buildroot-linux-musleabihf/sysroot/usr/include/float.h:8: Error: bad instruction `int __flt_rounds(void)'
/home/test/buildroot/output/host/usr/arm-buildroot-linux-musleabihf/sysroot/usr/include/float.h: Assembler messages:
/home/test/buildroot/output/host/usr/arm-buildroot-linux-musleabihf/sysroot/usr/include/float.h:8: Error: bad instruction `int __flt_rounds(void)'

It turns out however that <float.h> is not needed by assembly files,
so we move its inclusion within the #ifndef __ASSEMBLER__ condition,
which solves the problem.
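
The guard pattern looks roughly like this. The header below is a made-up miniature, not pixman-private.h; `EXAMPLE_PRIVATE_H`, `EXAMPLE_MAX_IMAGES` and `example_float_digits` are hypothetical names.

```c
/* Sketch of a header shared by C and assembler sources. */
#ifndef EXAMPLE_PRIVATE_H
#define EXAMPLE_PRIVATE_H

/* Plain macro constants are safe for both C and assembler. */
#define EXAMPLE_MAX_IMAGES 8

#ifndef __ASSEMBLER__
/* Everything in here is seen by the C compiler only, so C
 * declarations pulled in by <float.h> never reach the assembler. */
#include <float.h>

static inline int example_float_digits(void)
{
    return FLT_DIG;
}
#endif /* __ASSEMBLER__ */

#endif /* EXAMPLE_PRIVATE_H */
```

`__ASSEMBLER__` is predefined by gcc and clang when preprocessing assembly sources, which is what makes the guard work for `.S` files.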

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2016-01-31 16:15:26 +02:00
Andreas Boll
af451ab328 Upload to unstable. 2016-01-14 13:46:57 +01:00
Andreas Boll
cae8b2a893 Add myself to Uploaders. 2016-01-14 13:21:08 +01:00
Andreas Boll
e22e142165 Bump changelogs. 2016-01-14 13:19:45 +01:00
Andreas Boll
5e030aac41 pixman 0.33.6 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWeVNeAAoJEGUdTbirWueAZVUIAIMrz8RGz2t/6Y16CPx8Kfat
 NJFe9k0gVxTCBGYcAOtZJxeqcl/RryGuEGrdcN1UiAeCsjDxTCEwefHO1ablC6A6
 Zc57mkxbknM1eOHiU/D59+JFC5cvLM3WlsQSAi2CyUIdlSq/b7vK/ADWas7kn8y9
 AdDd/MEfGXwVKumQqSN+h5GZxLwhOYw6Y9Ew6srR5EX3jzGQ8GQY3cfd3tzXpYYN
 aZ3EME3EUkhrT3DdUg/byoQu1YIppGm5Vb405gqe/1B+QZLMHUsKP3dwMk++jcdn
 4vcZAhs3s5VrVlPkfng6HLdRHmHI//AfwRBktcrEoirGfGGtPF3NKfk9B4KgPRk=
 =FhAa
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.33.6' into debian-unstable

pixman 0.33.6 release
2016-01-14 13:17:22 +01:00
Andrea Canciani
342cbf1644 build: Distinguish SKIP and FAIL on Win32
The `check` target in test/Makefile.win32 assumed that any non-0 exit
code from the tests was an error, but the testsuite is currently using
77 as a SKIP exit code (based on the convention used in autotools).
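
The convention amounts to a three-way classification of the exit code. The sketch below states it in C for illustration only; the actual change is in test/Makefile.win32, and the enum and function names here are hypothetical.

```c
enum test_result { TEST_PASS, TEST_SKIP, TEST_FAIL };

/* Classify a test program's exit code using the autotools convention:
 * 0 is a pass, 77 is a skip, anything else is a failure. */
static enum test_result classify_exit_code(int code)
{
    if (code == 0)
        return TEST_PASS;
    if (code == 77)
        return TEST_SKIP;
    return TEST_FAIL;
}
```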

Fixes fence-image-self-test and cover-test (now reported as SKIP).

Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-30 14:06:40 +01:00
Simon Richter
af0689716a build: Use del instead of rm on cmd.exe shells
The `rm` command is not usually available when running on Win32 in a
`cmd.exe` shell. Instead the shell provides the `del` builtin, which
has somewhat more limited wildcard expansion and error handling.

This makes all of the Makefile targets work on Win32 both using
`cmd.exe` and using the MSYS environment.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 21:24:17 +01:00
Andrea Canciani
93b876c110 build: Do not use mkdir -p on Windows
When the build is performed using `cmd.exe` as shell, the `mkdir`
command does not support the `-p` flag. The ability to create multiple
nested folders is not used, hence it can be easily replaced by only
creating the directory if it does not exist.

This makes the build work on the `cmd.exe` shell, except for the
`clean` targets.

Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 21:24:06 +01:00
Andrea Canciani
cc35d01980 build: Avoid phony pixman target in test/Makefile.win32
Instead of explicitly depending on "pixman" for the "all" and "check"
targets, rely on the dependency to the .lib file

Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 21:23:57 +01:00
Andrea Canciani
ceb49cbda9 build: Remove use of BUILT_SOURCES from Makefile.win32
Since 3d81d89c29 BUILT_SOURCES is not
used anymore, but it was unintentionally left in Win32 Makefiles.

Signed-off-by: Andrea Canciani <ranma42@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 21:23:46 +01:00
Oded Gabbay
ba1868a854 Post 0.34 branch creation version bump to 0.35.1
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-23 10:46:40 +02:00
Oded Gabbay
0e72e78086 Post-release version bump to 0.33.7
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-22 15:55:32 +02:00
Oded Gabbay
65f35270e4 Pre-release version bump to 0.33.6
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-22 15:30:10 +02:00
Oded Gabbay
a566f627db configure.ac: fix test for SSE2 & SSSE3 assembler support
This patch modifies the SSE2 & SSSE3 tests in configure.ac to use a
global variable to initialize vector variables. In addition, we now
return the value of the computation instead of 0.

This is done so gcc 4.9 (and lower) won't optimize the SSE assembly
instructions (when using -O1 and higher), because then the configure test
might incorrectly pass even though the assembler doesn't support the
SSE instructions (the test will pass because the compiler does support
the intrinsics).

v2: instead of using volatile, use a global variable as input

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-12-22 11:19:01 +02:00
Andrea Canciani
d24b415f3e mmx: Improve detection of support for "K" constraint
Older versions of clang emitted an error on the "K" constraint, but at
least since version 3.7 it is supported. Just like gcc, this
constraint is only allowed for constants, but apparently clang
requires them to be known before inlining.

Using the macro definition _mm_shuffle_pi16(A, N) ensures that the "K"
constraint is always applied to a literal constant, independently from
the compiler optimizations and allows building pixman-mmx on modern
clang.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrea Canciani <ranma42@gmail.com>
2015-11-18 14:19:58 -08:00
Matt Turner
312e381523 Revert "mmx: Use MMX2 intrinsics from xmmintrin.h directly."
This reverts commit 7de61d8d14.

Newer versions of gcc allow inclusion of xmmintrin.h without -msse, but
still won't allow usage of the intrinsics.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=564024
2015-11-18 14:19:12 -08:00
Andreas Boll
017a59ec26 Upload to unstable 2015-11-04 13:26:38 +01:00
Andreas Boll
c193730083 Bump changelogs. 2015-11-04 10:30:58 +01:00
Andreas Boll
51c330400f pixman 0.33.4 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWKk4tAAoJEGUdTbirWueAIDkH/0YQj9943iFVJFEWhQdhLJe6
 PeHsiZgNjhPTNK2gpuudtOK2yda1akQTCfjGeNzN0nKQ0qPOaDiF71jt/C4Duppx
 rX9M6lkyMEPlCrM27+pZUCJitL+e7j8qYjapAdfvx8lCqvl8Mkq2t5JCsr1PWkte
 5w83kNhWf35eWN0zgRem9tTgVQ0LMYdO5IYPasAnqKHUUaIHO/r2dTNdc8bBFvD7
 k7X3Qz/kqAodraTWpieT59mwttUI0x/CiaNjlXfMDC4KKtbzkZJQlc0Oys74EG17
 Oag2Bvi4vnkTj+lvoixhu8dBGR/LPyEzZHbZyNWfjsDYL2RM2FuovUDxaYYM5nQ=
 =11P2
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.33.4' into debian-unstable

pixman 0.33.4 release
2015-11-04 10:28:32 +01:00
Oded Gabbay
3a50806cbe Post-release version bump to 0.33.5
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-10-23 18:33:55 +03:00
Oded Gabbay
fa71d08a81 Pre-release version bump to 0.33.4
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-10-23 17:58:49 +03:00
Andrea Canciani
9728241bd0 test: Fix fence-image-self-test on Mac
On MacOS X, according to the manpage of mprotect(), "When a program
violates the protections of a page, it gets a SIGBUS or SIGSEGV
signal.", but fence-image-self-test was only accepting a SIGSEGV as
notification of invalid access.
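
In code, the relaxed check boils down to accepting either signal. A minimal sketch for POSIX systems follows; the function name `is_protection_fault` is hypothetical and not taken from the test.

```c
#include <signal.h>

/* Return nonzero if `sig` is one of the signals that may report a
 * page-protection violation: SIGSEGV on most systems, SIGBUS as an
 * alternative on some (for example Mac OS X). */
static int is_protection_fault(int sig)
{
    return sig == SIGSEGV || sig == SIGBUS;
}
```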

Fixes fence-image-self-test

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-10-16 15:05:02 +03:00
Matt Turner
7de61d8d14 mmx: Use MMX2 intrinsics from xmmintrin.h directly.
We had lots of hacks to handle the inability to include xmmintrin.h
without compiling with -msse (lest SSE instructions be used in
pixman-mmx.c). Some recent version of gcc relaxed this restriction.

Change configure.ac to test that xmmintrin.h can be included and that we
can use some intrinsics from it, and remove the work-around code from
pixman-mmx.c.

Evidently allows gcc 4.9.3 to optimize better as well:

   text	   data	    bss	    dec	    hex	filename
 657078	  30848	    680	 688606	  a81de	libpixman-1.so.0.33.3 before
 656710	  30848	    680	 688238	  a806e	libpixman-1.so.0.33.3 after

Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Tested-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2015-10-13 09:40:42 -07:00
Siarhei Siamashka
90e62c0867 vmx: implement fast path vmx_composite_over_n_8888
Running "lowlevel-blt-bench over_n_8888" on Playstation3 3.2GHz,
Gentoo ppc (32-bit userland) gave the following results:

before:  over_n_8888 =  L1: 147.47  L2: 205.86  M:121.07
after:   over_n_8888 =  L1: 287.27  L2: 261.09  M:133.48

Cairo non-trimmed benchmarks on POWER8, 3.4GHz 8 Cores:

ocitysmap          659.69  -> 611.71   :  1.08x speedup
xfce4-terminal-a1  2725.22 -> 2547.47  :  1.07x speedup

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-29 14:21:46 +03:00
Ben Avison
2876d8d3dd affine-bench: remove 8e margin from COVER area
Patch "Remove the 8e extra safety margin in COVER_CLIP analysis" reduced
the required image area for setting the COVER flags in
pixman.c:analyze_extent(). Do the same reduction in affine-bench.

Leaving the old calculations in place would be very confusing for anyone
reading the code.

Also add a comment that explains how affine-bench wants to hit the COVER
paths. This explains why the intricate extent calculations are copied
from pixman.c.

[Pekka: split patch, change comments, write commit message]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-25 14:26:04 +03:00
Ben Avison
0e2e975128 Remove the 8e extra safety margin in COVER_CLIP analysis
As discussed in
http://lists.freedesktop.org/archives/pixman/2015-August/003905.html

the 8 * pixman_fixed_e (8e) adjustment which was applied to the transformed
coordinates is a legacy of rounding errors which used to occur in old
versions of Pixman, but which no longer apply. For any affine transform,
you are now guaranteed to get the same result by transforming the upper
coordinate as though you transform the lower coordinate and add (size-1)
steps of the increment in source coordinate space. No projective
transform routines use the COVER_CLIP flags, so they cannot be affected.

Proof by Siarhei Siamashka:

Let's take a look at the following affine transformation matrix (with 16.16
fixed point values) and two vectors:

         | a   b     c    |
M      = | d   e     f    |
         | 0   0  0x10000 |

         |  x_dst  |
P     =  |  y_dst  |
         | 0x10000 |

         | 0x10000 |
ONE_X  = |    0    |
         |    0    |

The current matrix multiplication code does the following calculations:

             | (a * x_dst + b * y_dst + 0x8000) / 0x10000 + c |
    M * P =  | (d * x_dst + e * y_dst + 0x8000) / 0x10000 + f |
             |                   0x10000                      |

These calculations are not perfectly exact and we may get rounding
because the integer coordinates are adjusted by 0.5 (or 0x8000 in the
16.16 fixed point format) before doing matrix multiplication. For
example, if the 'a' coefficient is an odd number and 'b' is zero,
then we are losing some of the least significant bits when dividing by
0x10000.

So we need to strictly prove that the following expression is always
true even though we have to deal with rounding:

                                          | a |
    M * (P + ONE_X) - M * P = M * ONE_X = | d |
                                          | 0 |

or

   ((a * (x_dst + 0x10000) + b * y_dst + 0x8000) / 0x10000 + c)
  -
   ((a * x_dst             + b * y_dst + 0x8000) / 0x10000 + c)
  =
    a

It's easy to see that this is equivalent to

    a + ((a * x_dst + b * y_dst + 0x8000) / 0x10000 + c)
      - ((a * x_dst + b * y_dst + 0x8000) / 0x10000 + c)
  =
    a

Which means that stepping exactly by one pixel horizontally in the
destination image space (advancing 'x_dst' by 0x10000) is the same as
changing the transformed 'x_src' coordinate in the source image space
exactly by 'a'. The same applies to the vertical direction too.
Repeating these steps, we can reach any pixel in the source image
space and get exactly the same fixed point coordinates as when doing
a matrix multiplication for each pixel.
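
The stepping property can also be checked numerically. The following is a
minimal sketch, not pixman code: transform_x() and count_step_mismatches()
are hypothetical helpers that mirror the formula quoted above, and the
division by 0x10000 is assumed to round toward minus infinity (as an
arithmetic right shift by 16 would).

```c
#include <assert.h>
#include <stdint.h>

/* Floor division by 0x10000, so that negative values round toward minus
 * infinity (assumption of this sketch). */
static int64_t floor_div_0x10000(int64_t n)
{
    int64_t q = n / 0x10000;
    if (n % 0x10000 < 0)
        q -= 1;
    return q;
}

/* x component of M * P, as written in the text above. */
static int64_t transform_x(int64_t a, int64_t b, int64_t c,
                           int64_t x_dst, int64_t y_dst)
{
    return floor_div_0x10000(a * x_dst + b * y_dst + 0x8000) + c;
}

/* Walks 'steps' destination pixels to the right and counts how often the
 * incrementally stepped coordinate differs from the directly transformed
 * one. */
static int count_step_mismatches(int64_t a, int64_t b, int64_t c,
                                 int64_t x_dst, int64_t y_dst, int steps)
{
    int mismatches = 0;
    int64_t stepped = transform_x(a, b, c, x_dst, y_dst);

    assert(steps >= 0);
    for (int i = 1; i <= steps; i++) {
        stepped += a;   /* one destination pixel == 'a' in source space */
        if (stepped != transform_x(a, b, c,
                                   x_dst + (int64_t)i * 0x10000, y_dst))
            mismatches++;
    }
    return mismatches;
}
```

For any coefficients that do not overflow the 64-bit intermediates, the
mismatch count is expected to be zero, including for odd or negative 'a'.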

By the way, the older matrix multiplication implementation, which was
relying on less accurate calculations with three intermediate roundings
"((a + 0x8000) >> 16) + ((b + 0x8000) >> 16) + ((c + 0x8000) >> 16)",
also has the same properties. However reverting
    http://cgit.freedesktop.org/pixman/commit/?id=ed39992564beefe6b12f81e842caba11aff98a9c
and applying this "Remove the 8e extra safety margin in COVER_CLIP
analysis" patch makes the cover test fail. The real reason why it fails
is that the old pixman code was using "pixman_transform_point_3d()"
function
    http://cgit.freedesktop.org/pixman/tree/pixman/pixman-matrix.c?id=pixman-0.28.2#n49
for getting the transformed coordinate of the top left corner pixel
in the image scaling code, but at the same time using a different
"pixman_transform_point()" function
    http://cgit.freedesktop.org/pixman/tree/pixman/pixman-matrix.c?id=pixman-0.28.2#n82
in the extents calculation code for setting the cover flag. And these
functions did the intermediate rounding differently. That's why the 8e
safety margin was needed.

** proof ends

However, for COVER_CLIP_NEAREST, the actual margins added were not 8e.
Because the half-way cases round down, that is, coordinate 0 hits pixel
index -1 while coordinate e hits pixel index 0, the extra safety margins
were actually 7e to the left and up, and 9e to the right and down. This
patch removes the 7e and 9e margins and restores the -e adjustment
required for NEAREST sampling in Pixman. For reference, see
pixman/rounding.txt.

For COVER_CLIP_BILINEAR, the margins were exactly 8e as there are no
additional offsets to be restored, so simply removing the 8e additions
is enough.

Proof:

All implementations must give the same numerical results as
bits_image_fetch_pixel_nearest() / bits_image_fetch_pixel_bilinear().

The former does
    int x0 = pixman_fixed_to_int (x - pixman_fixed_e);
which maps directly to the new test for the nearest flag, when you consider
that x0 must fall in the interval [0,width).

The latter does
    x1 = x - pixman_fixed_1 / 2;
    x1 = pixman_fixed_to_int (x1);
    x2 = x1 + 1;
When you write a COVER path, you take advantage of the assumption that
both x1 and x2 fall in the interval [0, width).

As samplers are allowed to fetch the pixel at x2 unconditionally, we
require
    x1 >= 0
    x2 < width
so
    x - pixman_fixed_1 / 2 >= 0
    x - pixman_fixed_1 / 2 + pixman_fixed_1 < width * pixman_fixed_1
so
    pixman_fixed_to_int (x - pixman_fixed_1 / 2) >= 0
    pixman_fixed_to_int (x + pixman_fixed_1 / 2) < width
which matches the source code lines for the bilinear case, once you delete
the lines that add the 8e margin.
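
As an illustration only, the two conditions can be written as standalone
predicates for one axis. This sketch does not use the pixman headers:
fixed_t, FIXED_E, FIXED_1, fixed_to_int_floor(), covers_nearest() and
covers_bilinear() are hypothetical stand-ins for pixman_fixed_t,
pixman_fixed_e, pixman_fixed_1, pixman_fixed_to_int() and the flag tests.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;               /* 16.16 fixed point */
#define FIXED_E ((fixed_t) 1)          /* smallest representable step */
#define FIXED_1 ((fixed_t) 0x10000)    /* 1.0 */

/* Integer part, rounding toward minus infinity. */
static int fixed_to_int_floor(int64_t f)
{
    int64_t q = f / FIXED_1;
    if (f % FIXED_1 < 0)
        q -= 1;
    return (int) q;
}

/* x_min / x_max: smallest and largest transformed source x coordinate. */
static int covers_nearest(fixed_t x_min, fixed_t x_max, int width)
{
    assert(width > 0);
    return fixed_to_int_floor((int64_t) x_min - FIXED_E) >= 0 &&
           fixed_to_int_floor((int64_t) x_max - FIXED_E) < width;
}

/* Bilinear fetches both x1 and x2 = x1 + 1, so both must be in range. */
static int covers_bilinear(fixed_t x_min, fixed_t x_max, int width)
{
    assert(width > 0);
    return fixed_to_int_floor((int64_t) x_min - FIXED_1 / 2) >= 0 &&
           fixed_to_int_floor((int64_t) x_max + FIXED_1 / 2) < width;
}
```

With width 4, coordinate 0 fails the nearest test (it hits pixel -1) while
coordinate e passes, which is the half-way rounding described above.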

Signed-off-by: Ben Avison <bavison@riscosopen.org>
[Pekka: adjusted commit message, left affine-bench changes for another patch]
[Pekka: add commit message parts from Siarhei]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-25 14:24:17 +03:00
Ben Avison
23525b4ea5 pixman-general: Tighten up calculation of temporary buffer sizes
Each of the aligns can only add a maximum of 15 bytes to the space
requirement. This permits some edge cases to use the stack buffer where
previously it would have deduced that a heap buffer was required.
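
The 15-byte bound follows from rounding an address up to a multiple of 16.
A minimal sketch, assuming 16-byte alignment; align_padding() and
worst_case_space() are hypothetical names, not functions from
pixman-general.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALIGNMENT 16u

/* Bytes that must be skipped to round 'addr' up to a 16-byte boundary.
 * The result is always in the range 0..15. */
static size_t align_padding(uintptr_t addr)
{
    return (size_t) ((ALIGNMENT - (addr % ALIGNMENT)) % ALIGNMENT);
}

/* Worst-case space for 'count' separately aligned buffers of 'bytes_each'
 * bytes: each alignment costs at most ALIGNMENT - 1 bytes. */
static size_t worst_case_space(size_t count, size_t bytes_each)
{
    assert(bytes_each <= SIZE_MAX / (count ? count : 1) - (ALIGNMENT - 1));
    return count * (bytes_each + (ALIGNMENT - 1));
}
```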

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-09-25 14:19:15 +03:00
Siarhei Siamashka
8b49d4b6b4 pixman-general: Fix stack related pointer arithmetic overflow
As https://bugs.freedesktop.org/show_bug.cgi?id=92027#c6 explains,
the stack is allocated at the very top of the process address space
in some configurations (32-bit x86 systems with ASLR disabled).
And the careless computations done with the 'dest_buffer' pointer
may overflow, failing the buffer upper limit check.

The problem can be reproduced using the 'stress-test' program,
which segfaults when executed via setarch:

    export CFLAGS="-O2 -m32" && ./autogen.sh
    ./configure --disable-libpng --disable-gtk && make
    setarch i686 -R test/stress-test

This patch introduces the required corrections. The extra check
for negative 'width' may be redundant (the invalid 'width' value
is not supposed to reach here), but it's better to play safe
when dealing with the buffers allocated on stack.
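
One way to avoid this class of overflow is to compare sizes instead of
forming "pointer + size". The sketch below is an assumption about the
general technique, not the code from this patch; scanline_fits() and its
parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if 'width' pixels of 'bytes_per_pixel' bytes fit into what is
 * left of a buffer with 'capacity' bytes, of which 'used' are taken.
 * Only sizes are compared, so no address near the top of the address
 * space is ever computed and nothing can wrap around. */
static int scanline_fits(size_t capacity, size_t used,
                         int width, size_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    if (width < 0 || used > capacity)
        return 0;   /* reject invalid input instead of trusting it */
    return (size_t) width <= (capacity - used) / bytes_per_pixel;
}
```

Dividing the remaining space by the pixel size, rather than multiplying
the width, also keeps the size arithmetic itself from overflowing.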

Reported-by: Ludovic Courtès <ludo@gnu.org>
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: soren.sandmann@gmail.com
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-22 13:19:06 +03:00
Thomas Petazzoni
4297e9058d test: add a check for FE_DIVBYZERO
Some architectures, such as Microblaze and Nios2, currently do not
implement FE_DIVBYZERO, even though they have <fenv.h> and
feenableexcept(). This commit adds a configure.ac check to verify
whether FE_DIVBYZERO is defined or not, and if not, disables the
problematic code in test/utils.c.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-20 15:50:04 +03:00
Oded Gabbay
8189fad961 vmx: Remove unused expensive functions
Now that we have replaced the expensive functions with better performing
alternatives, we should remove them so they will not be used again.

Running Cairo benchmark on trimmed traces gave the following results:

POWER8, 8 cores, 3.4GHz, RHEL 7.2 ppc64le.

Speedups
========
t-firefox-scrolling     1232.30 -> 1096.55 :  1.12x
t-gnome-terminal-vim    613.86  -> 553.10  :  1.11x
t-evolution             405.54  -> 371.02  :  1.09x
t-firefox-talos-gfx     919.31  -> 862.27  :  1.07x
t-gvim                  653.02  -> 616.85  :  1.06x
t-firefox-canvas-alpha  941.29  -> 890.42  :  1.06x

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:07:13 +03:00
Oded Gabbay
6b1b8b2b90 vmx: implement fast path vmx_composite_over_n_8_8888
POWER8, 8 cores, 3.4GHz, RHEL 7.2 ppc64le.

reference memcpy speed = 25008.9MB/s (6252.2MP/s for 32bpp fills)

                Before         After           Change
              ---------------------------------------------
L1              91.32          182.84         +100.22%
L2              94.94          182.83         +92.57%
M               95.55          181.51         +89.96%
HT              88.96          162.09         +82.21%
VT              87.4           168.35         +92.62%
R               83.37          146.23         +75.40%
RT              66.4           91.5           +37.80%
Kops/s          683            859            +25.77%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:07:08 +03:00
Oded Gabbay
8d8caa55a3 vmx: optimize vmx_composite_over_n_8888_8888_ca
This patch optimizes vmx_composite_over_n_8888_8888_ca by removing use
of expand_alpha_1x128, unpack/pack and in_over_2x128 in favor of
splat_alpha, in_over and MUL/ADD macros from pixman_combine32.h.

Running "lowlevel-blt-bench -n over_8888_8888" on POWER8, 8 cores,
3.4GHz, RHEL 7.2 ppc64le gave the following results:

reference memcpy speed = 23475.4MB/s (5868.8MP/s for 32bpp fills)

                Before          After           Change
              --------------------------------------------
L1              244.97          474.05         +93.51%
L2              243.74          473.05         +94.08%
M               243.29          467.16         +92.02%
HT              144.03          252.79         +75.51%
VT              174.24          279.03         +60.14%
R               109.86          149.98         +36.52%
RT              47.96           53.18          +10.88%
Kops/s          524             576            +9.92%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:07:03 +03:00
Oded Gabbay
857880f0e4 vmx: optimize scaled_nearest_scanline_vmx_8888_8888_OVER
This patch optimizes scaled_nearest_scanline_vmx_8888_8888_OVER and all
the functions it calls (combine1, combine4 and
core_combine_over_u_pixel_vmx).

The optimization is done by removing use of expand_alpha_1x128 and
expand_alpha_2x128 in favor of splat_alpha and MUL/ADD macros from
pixman_combine32.h.

Running "lowlevel-blt-bench -n over_8888_8888" on POWER8, 8 cores,
3.4GHz, RHEL 7.2 ppc64le gave the following results:

reference memcpy speed = 24847.3MB/s (6211.8MP/s for 32bpp fills)

                Before          After           Change
              --------------------------------------------
L1              182.05          210.22         +15.47%
L2              180.6           208.92         +15.68%
M               180.52          208.22         +15.34%
HT              130.17          178.97         +37.49%
VT              145.82          184.22         +26.33%
R               104.51          129.38         +23.80%
RT              48.3            61.54          +27.41%
Kops/s          430             504            +17.21%

v2: Check *pm is not NULL before dereferencing it in combine1()

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-09-18 10:06:50 +03:00
Pekka Paalanen
73e586efb3 armv6: enable over_n_8888
Enable the fast path added in the previous patch by moving the lookup
table entries to their proper locations.

Lowlevel-blt-bench benchmark statistics with 30 iterations, showing the
effect of adding this one patch on top of
"armv6: Add over_n_8888 fast path (disabled)", which was applied on
fd59569294.

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1    12.5   0.04     45.2   0.10    100.00%    +263.1%
L2    11.1   0.02     43.2   0.03    100.00%    +289.3%
M      9.4   0.00     42.4   0.02    100.00%    +351.7%
HT     8.5   0.02     25.4   0.10    100.00%    +198.8%
VT     8.4   0.02     22.3   0.07    100.00%    +167.0%
R      8.2   0.02     23.1   0.09    100.00%    +183.6%
RT     5.4   0.05     11.4   0.21    100.00%    +110.3%

At most 3 outliers rejected per test per set.

Iterating here means that lowlevel-blt-bench was executed 30 times, and
the statistics above were computed from the output.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-09-17 14:40:39 +03:00
Ben Avison
9eb6889b15 armv6: Add over_n_8888 fast path (disabled)
This new fast path is initially disabled by putting the entries in the
lookup table after the sentinel. The compiler cannot tell the new code
is not used, so it cannot eliminate the code. Also the lookup table size
will include the new fast path. When the follow-up patch then enables
the new fast path, the binary layout (alignments, size, etc.) will stay
the same compared to the disabled case.
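
The sentinel trick can be sketched as follows. This is an illustration
with made-up names (enum op, struct fast_path, table, lookup), not the
actual pixman fast path table:

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_NONE = 0, OP_SRC, OP_OVER };

typedef int (*fast_path_fn)(void);

struct fast_path {
    enum op      op;
    fast_path_fn func;
};

static int src_path(void)  { return 1; }
static int over_path(void) { return 2; }

/* OP_NONE is the sentinel. The OVER entry sits after it, so it is still
 * referenced by the table and counted in sizeof(table), but lookup()
 * never reaches it. */
static const struct fast_path table[] = {
    { OP_SRC,  src_path  },
    { OP_NONE, NULL      },   /* sentinel: lookup stops here */
    { OP_OVER, over_path },   /* disabled: placed after the sentinel */
};

static fast_path_fn lookup(enum op op)
{
    assert(op != OP_NONE);
    for (const struct fast_path *p = table; p->op != OP_NONE; p++)
        if (p->op == op)
            return p->func;
    return NULL;   /* caller would fall back to a general path */
}
```

Enabling the path then only means moving the entry above the sentinel;
the table size and the set of referenced functions stay the same.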

Keeping the binary layout identical is important for benchmarking on
Raspberry Pi 1. The addresses at which functions are loaded will have a
significant impact on benchmark results, causing unexpected performance
changes. Keeping all function addresses the same across the patch
enabling a new fast path improves the reliability of benchmarks.

Benchmark results are included in the patch enabling this fast path.

[Pekka: disabled the fast path, commit message]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-09-17 14:40:39 +03:00
Ben Avison
4c71f595e3 test: Add cover-test v5
This test aims to verify both numerical correctness and the honouring of
array bounds for scaled plots (both nearest-neighbour and bilinear) at or
close to the boundary conditions for applicability of "cover" type fast paths
and iter fetch routines.

It has a secondary purpose: by setting the env var EXACT (to any value) it
will only test plots that are exactly on the boundary condition. This makes
it possible to ensure that "cover" routines are being used to the maximum,
although this requires the use of a debugger or code instrumentation to
verify.

Changes in v4:

  Check the fence page size and skip the test if it is too large. Since
  we need to deal with pixman_fixed_t coordinates that go beyond the
  real image width, make the page size limit 16 kB. A 32 kB or larger
  page size would cause an a8 image width to be 32k or more, which is no
  longer representable in pixman_fixed_t.

  Use a shorthand variable 'filter' in test_cover().

  Whitespace adjustments.

Changes in v5:

  Skip if fenced memory is not supported. Do you know of any such
  platform?

Signed-off-by: Ben Avison <bavison@riscosopen.org>
[Pekka: changes in v4 and v5]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-16 15:34:43 +03:00
Julien Cristau
f9a49b3783 Run tests with VERBOSE=1. 2015-09-12 20:31:08 +02:00
Julien Cristau
4b4898e073 Upload to unstable 2015-09-12 13:08:19 +02:00
Pekka Paalanen
812c9c9758 implementation: add PIXMAN_DISABLE=wholeops
Add a new option to PIXMAN_DISABLE: "wholeops". This option disables all
whole-operation fast paths regardless of implementation level, except
the general path (general_composite_rect).

The purpose is to add a debug option that allows us to test optimized
iterator paths specifically. With this, it is possible to:
- see if fast paths mask bugs in iterators
- compare fast paths with iterator paths for performance

The effect was tested on x86_64 by running:
$ PIXMAN_DISABLE='' ./test/lowlevel-blt-bench over_8888_8888
$ PIXMAN_DISABLE='wholeops' ./test/lowlevel-blt-bench over_8888_8888

In the first case time is spent in sse2_composite_over_8888_8888(), and
in the latter in sse2_combine_over_u().
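
For illustration, checking whether an option such as "wholeops" is set can
be sketched as a whole-word search in the environment variable. This
assumes PIXMAN_DISABLE holds a space-separated list; word_in_list() and
disabled_by_env() are hypothetical helpers, not the pixman implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if 'name' appears as a whole word in the space-separated
 * 'list', 0 otherwise (including when 'list' is NULL). */
static int word_in_list(const char *list, const char *name)
{
    size_t len;

    assert(name != NULL);
    len = strlen(name);
    if (list == NULL || len == 0)
        return 0;

    while (*list) {
        const char *end;

        while (*list == ' ')          /* skip separators */
            list++;
        end = list;
        while (*end && *end != ' ')   /* find the end of this word */
            end++;
        if ((size_t) (end - list) == len && memcmp(list, name, len) == 0)
            return 1;
        list = end;
    }
    return 0;
}

/* Hypothetical helper: is 'name' listed in PIXMAN_DISABLE? */
static int disabled_by_env(const char *name)
{
    return word_in_list(getenv("PIXMAN_DISABLE"), name);
}
```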

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-09-09 11:42:55 +03:00
Pekka Paalanen
e9ef2cc4de utils.[ch]: add fence_get_page_size()
Add a function to get the page size used for memory fence purposes, and
use it everywhere where getpagesize() was used.

This offers a single point in code to override the page size, in case
one wants to experiment with how the tests work with a larger page size than
what the developer's machine has.

This also offers a clean API, without adding #ifdefs, to tests for
checking the page size.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-09 11:30:51 +03:00
Pekka Paalanen
82f8c997df utils.c: fix fallback code for fence_image_create_bits()
Used a wrong variable name, causing:
/home/pq/git/pixman/demos/../test/utils.c: In function ‘fence_image_create_bits’:
/home/pq/git/pixman/demos/../test/utils.c:562:46: error: ‘width’ undeclared (first use in this function)

Use the correct variable.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-09 11:29:44 +03:00
Andreas Boll
42fab57651 Bump standards version to 3.9.6. 2015-09-04 13:40:42 +02:00
Andreas Boll
56432ef5e5 Drop XC- prefix from Package-Type field. 2015-09-04 13:39:55 +02:00
Andreas Boll
c0f98e1cf4 Add upstream url. 2015-09-04 12:30:27 +02:00
Andreas Boll
03e2d2138b Update Vcs-* fields. 2015-09-04 12:30:27 +02:00
intrigeri
e6fce5e4e4 Update changelog.
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2015-09-04 12:30:26 +02:00
intrigeri
7bc925aa50 Enable all hardening build flags. Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
Quoting Simon again: "It currently has the same effect as hardening=+bindnow,
but will automatically enable future hardening options and in case the package
will ever build binaries those are immediately protected with PIE as well."

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2015-09-04 12:29:51 +02:00
intrigeri
2fb4da778c Simplify hardening build flags handling. Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
Quoting Simon Ruderich <simon@ruderich.org>:
"There's no need to use dpkg-buildflags manually in debian/rules.
Debhelper with compat=9 automatically enables the hardening flags when
dh_auto_configure is used. So just by calling dh_auto_configure [...]
the hardening flags get automatically passed to the build system.
DEB_BUILD_MAINT_OPTIONS is also respected."

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2015-09-04 12:29:51 +02:00
Andreas Boll
e47fb32ae3 Enable vmx on ppc64el (closes: #786345). 2015-09-04 12:29:49 +02:00
Andreas Boll
18e4bdcadf Bump changelogs. 2015-09-04 12:28:52 +02:00
Andreas Boll
40eb5d8140 Merge branch 'debian-unstable' into debian-unstable-new 2015-09-04 11:24:59 +02:00
Andreas Boll
266eaac369 Merge branch 'upstream-unstable' into debian-unstable-new 2015-09-04 11:24:48 +02:00
Pekka Paalanen
0700685382 test: add fence-image-self-test
Tests that fence_malloc and fence_image_create_bits actually work: that
out-of-bounds and out-of-row (unused stride area) accesses trigger
SIGSEGV.
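
The idea behind such a self-test can be sketched on a POSIX system with
mmap(), mprotect() and fork(). This is an assumption about the general
technique and not the code in test/utils.c; fence_catches_overrun() is a
hypothetical name.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Maps one usable page followed by one inaccessible guard page, then lets
 * a child process write to the first byte of the guard page.
 * Returns 1 if the child was killed by SIGSEGV (or SIGBUS, which some
 * systems raise instead), 0 if it survived, -1 on setup failure. */
static int fence_catches_overrun(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *base;
    pid_t pid, waited;
    int status = 0;

    if (page <= 0)
        return -1;

    base = mmap(NULL, 2 * (size_t) page, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;
    if (mprotect(base + page, (size_t) page, PROT_NONE) != 0) {
        munmap(base, 2 * (size_t) page);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        munmap(base, 2 * (size_t) page);
        return -1;
    }
    if (pid == 0) {
        volatile unsigned char *p = base;
        p[page] = 1;   /* out of bounds: first byte of the guard page */
        _exit(0);      /* only reached if the fence did not work */
    }

    waited = waitpid(pid, &status, 0);
    assert(waited == pid);
    munmap(base, 2 * (size_t) page);

    return WIFSIGNALED(status) &&
           (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
}
```

Running the faulting access in a child keeps the test process itself
alive, so it can report the result instead of crashing.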

If fence_malloc is a dummy (FENCE_MALLOC_ACTIVE not defined), this test
is skipped.

Changes in v2:

- check FENCE_MALLOC_ACTIVE value, not whether it is defined
- test that reading bytes near the fence pages does not cause a
  segmentation fault

Changes in v3:

- Do not print progress messages unless VERBOSE environment variable is
  set. Avoid spamming the terminal output of 'make check' on some
  versions of autotools.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-03 14:00:32 +03:00
Pekka Paalanen
13d93aa120 utils.[ch]: add fence_image_create_bits ()
Useful for detecting out-of-bounds accesses in composite operations.

This will be used by follow-up patches adding new tests.

Changes in v2:

- fix style on fence_image_create_bits args
- add page to stride only if stride_fence
- add comment on the fallback definition about freeing storage

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-01 17:06:46 +03:00
Pekka Paalanen
c70ddd5c9e utils.[ch]: add FENCE_MALLOC_ACTIVE
Define a new token to simplify checking whether fence_malloc() actually
can catch out-of-bounds access.

This will be used in the future to skip tests that rely on fence_malloc
checking functionality.

Changes in v2:

- #define FENCE_MALLOC_ACTIVE always, but change its value to help catch
  use of it without including utils.h

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-09-01 17:05:58 +03:00
Ben Avison
a82e519944 scaling-test: list more details when verbose
Add mask details to the output.

[Pekka: redo whitespace and print src,dst,mask x and y.]
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-08-28 14:24:28 +03:00
Pekka Paalanen
fd59569294 lowlevel-blt-bench: make extra arguments an error
If a user gives multiple patterns or extra arguments, only the last one
was used as the pattern while the former were just ignored. This is a
user error silently converted to something possibly unexpected.

In presence of extra arguments, complain and quit.

Cc: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-08-18 10:23:27 +03:00
Oded Gabbay
69611473c5 Post-release version bump to 0.33.3
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-08-01 23:01:43 +03:00
Oded Gabbay
ee790044b0 Pre-release version bump to 0.33.2
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-08-01 22:34:53 +03:00
Oded Gabbay
8d9be3619a vmx: implement fast path iterator vmx_fetch_a8
No changes were observed when running cairo trimmed benchmarks.

Running "lowlevel-blt-bench src_8_8888" on POWER8, 8 cores,
3.4GHz, RHEL 7.1 ppc64le gave the following results:

reference memcpy speed = 25197.2MB/s (6299.3MP/s for 32bpp fills)

                Before          After           Change
              --------------------------------------------
L1              965.34          3936           +307.73%
L2              942.99          3436.29        +264.40%
M               902.24          2757.77        +205.66%
HT              448.46          784.99         +75.04%
VT              430.05          819.78         +90.62%
R               412.9           717.04         +73.66%
RT              168.93          220.63         +30.60%
Kops/s          1025            1303           +27.12%

It was benchmarked against commit id e2d211a from pixman/master

Siarhei Siamashka reported that on PlayStation 3, it shows the following
results:

== before ==

              src_8_8888 =  L1: 194.37  L2: 198.46  M:155.90 (148.35%)
              HT: 59.18  VT: 36.71  R: 38.93  RT: 12.79 ( 106Kops/s)

== after ==

              src_8_8888 =  L1: 373.96  L2: 391.10  M:245.81 (233.88%)
              HT: 80.81  VT: 44.33  R: 48.10  RT: 14.79 ( 122Kops/s)

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
47f74ca946 vmx: implement fast path iterator vmx_fetch_x8r8g8b8
It was benchmarked against commit id 2be523b from pixman/master

POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.

cairo trimmed benchmarks :

Speedups
========
t-firefox-asteroids  533.92  -> 489.94 :  1.09x

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
fcbb97d445 vmx: implement fast path scaled nearest vmx_8888_8888_OVER
It was benchmarked against commit id 2be523b from pixman/master

POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              134.36          181.68          +35.22%
L2              135.07          180.67          +33.76%
M               134.6           180.51          +34.11%
HT              121.77          128.79          +5.76%
VT              120.49          145.07          +20.40%
R               93.83           102.3           +9.03%
RT              50.82           46.93           -7.65%
Kops/s          448             422             -5.80%

cairo trimmed benchmarks :

Speedups
========
t-firefox-asteroids  533.92 -> 497.92 :  1.07x
    t-midori-zoomed  692.98 -> 651.24 :  1.06x

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
ad612c4205 vmx: implement fast path vmx_composite_src_x888_8888
It was benchmarked against commit id 2be523b from pixman/master

POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              1115.4          5006.49         +348.85%
L2              1112.26         4338.01         +290.02%
M               1110.54         2524.15         +127.29%
HT              745.41          1140.03         +52.94%
VT              749.03          1287.13         +71.84%
R               423.91          547.6           +29.18%
RT              205.79          194.98          -5.25%
Kops/s          1414            1361            -3.75%

cairo trimmed benchmarks :

Speedups
========
t-gnome-system-monitor  1402.62  -> 1212.75 :  1.16x
   t-firefox-asteroids   533.92  ->  474.50 :  1.13x

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
fafc1d403b vmx: implement fast path vmx_composite_over_n_8888_8888_ca
It was benchmarked against commit id 2be523b from pixman/master

POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.

reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              61.92            244.91          +295.53%
L2              62.74            243.3           +287.79%
M               63.03            241.94          +283.85%
HT              59.91            144.22          +140.73%
VT              59.4             174.39          +193.59%
R               53.6             111.37          +107.78%
RT              37.99            46.38           +22.08%
Kops/s          436              506             +16.06%

cairo trimmed benchmarks :

Speedups
========
t-xfce4-terminal-a1  1540.37 -> 1226.14 :  1.26x
t-firefox-talos-gfx  1488.59 -> 1209.19 :  1.23x

Slowdowns
=========
        t-evolution  553.88  -> 581.63  :  1.05x
          t-poppler  364.99  -> 383.79  :  1.05x
t-firefox-scrolling  1223.65 -> 1304.34 :  1.07x

The slowdowns can be explained by cases where the images are small and
not aligned to a 16-byte boundary. In that case, the function first
processes the unaligned area, in operations as small as 1 byte. For
small images, the overhead of such operations can exceed the savings
from the vmx instructions that process the aligned part of the image.
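
To make the overhead concrete, the split between the scalar head and the
vector body can be sketched as below. This assumes 32-bit pixels and
16-byte vectors; scalar_head_pixels() and vector_body_pixels() are
hypothetical helpers, not functions from pixman-vmx.c.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit pixels handled one at a time before 'addr' reaches a
 * 16-byte boundary (or the scanline ends). */
static int scalar_head_pixels(uintptr_t addr, int width)
{
    int n = 0;

    assert(addr % 4 == 0 && width >= 0);
    while (width > 0 && (addr & 15) != 0) {
        addr += 4;
        width--;
        n++;
    }
    return n;
}

/* Number of pixels left for full 4-pixel vector iterations after the
 * scalar head; any remainder would again be handled one at a time. */
static int vector_body_pixels(uintptr_t addr, int width)
{
    int rest = width - scalar_head_pixels(addr, width);
    return rest - (rest % 4);
}
```

For a 3-pixel scanline starting 4 bytes past a boundary, every pixel is
processed in the scalar head and the vector body is empty, so the setup
cost is paid without any vector work.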

In the C fast-path implementation, there is no special treatment for the
un-aligned part, as it works in 4 byte quantities on the entire image.

Because llbb is a synthetic test, I would assume it has far fewer
alignment issues than a "real-world" scenario, such as the cairo benchmarks,
which are basically recorded traces of real application activity.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
a3e914407e vmx: implement fast path composite_add_8888_8888
Copied impl. from sse2 file and edited to use vmx functions

It was benchmarked against commit id 2be523b from pixman/master

POWER8, 16 cores, 3.4GHz, ppc64le :

reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              248.76          3284.48         +1220.34%
L2              264.09          2826.47         +970.27%
M               261.24          2405.06         +820.63%
HT              217.27          857.3           +294.58%
VT              213.78          980.09          +358.46%
R               176.61          442.95          +150.81%
RT              107.54          150.08          +39.56%
Kops/s          917             1125            +22.68%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
d5b5343c7d vmx: implement fast path composite_add_8_8
Copied impl. from sse2 file and edited to use vmx functions

It was benchmarked against commit id 2be523b from pixman/master

POWER8, 16 cores, 3.4GHz, ppc64le :

reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              687.63          9140.84         +1229.33%
L2              715             7495.78         +948.36%
M               717.39          8460.14         +1079.29%
HT              569.56          1020.12         +79.11%
VT              520.3           1215.56         +133.63%
R               514.81          874.35          +69.84%
RT              341.28          305.42          -10.51%
Kops/s          1621            1579            -2.59%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
339eeaf095 vmx: implement fast path composite_over_8888_8888
Copied impl. from sse2 file and edited to use vmx functions

It was benchmarked against commit id 2be523b from pixman/master

POWER8, 16 cores, 3.4GHz, ppc64le :

reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)

                Before           After           Change
              ---------------------------------------------
L1              129.47          1054.62         +714.57%
L2              138.31          1011.02         +630.98%
M               139.99          1008.65         +620.52%
HT              122.11          468.45          +283.63%
VT              121.06          532.21          +339.62%
R               108.48          240.5           +121.70%
RT              77.87           116.7           +49.87%
Kops/s          758             981             +29.42%

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
0cc8a2e971 vmx: implement fast path vmx_fill
Based on sse2 impl.

It was benchmarked against commit id e2d211a from pixman/master

Tested cairo trimmed benchmarks on POWER8, 8 cores, 3.4GHz,
RHEL 7.1 ppc64le :

speedups
========
     t-swfdec-giant-steps  1383.09 ->  718.63  :  1.92x speedup
   t-gnome-system-monitor  1403.53 ->  918.77  :  1.53x speedup
              t-evolution  552.34  ->  415.24  :  1.33x speedup
      t-xfce4-terminal-a1  1573.97 ->  1351.46 :  1.16x speedup
      t-firefox-paintball  847.87  ->  734.50  :  1.15x speedup
      t-firefox-asteroids  565.99  ->  492.77  :  1.15x speedup
t-firefox-canvas-swscroll  1656.87 ->  1447.48 :  1.14x speedup
          t-midori-zoomed  724.73  ->  642.16  :  1.13x speedup
   t-firefox-planet-gnome  975.78  ->  911.92  :  1.07x speedup
          t-chromium-tabs  292.12  ->  274.74  :  1.06x speedup
     t-firefox-chalkboard  690.78  ->  653.93  :  1.06x speedup
      t-firefox-talos-gfx  1375.30 ->  1303.74 :  1.05x speedup
   t-firefox-canvas-alpha  1016.79 ->  967.24  :  1.05x speedup

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
c12ee95089 vmx: add helper functions
This patch adds the following helper functions for reuse of code,
hiding BE/LE differences and maintainability.

All of the functions were defined as static force_inline.

Names were copied from pixman-sse2.c so conversion of fast-paths between
sse2 and vmx would be easier from now on. Therefore, I tried to keep the
input/output of the functions to be as close as possible to the sse2
definitions.

The functions are:

- load_128_aligned       : load 128-bit from a 16-byte aligned memory
                           address into a vector

- load_128_unaligned     : load 128-bit from memory into a vector,
                           without guarantee of alignment for the
                           source pointer

- save_128_aligned       : save 128-bit vector into a 16-byte aligned
                           memory address

- create_mask_16_128     : take a 16-bit value and fill with it
                           a new vector

- create_mask_1x32_128   : take a 32-bit pointer and fill a new
                           vector with the 32-bit value from that pointer

- create_mask_32_128     : take a 32-bit value and fill with it
                           a new vector

- unpack_32_1x128        : unpack 32-bit value into a vector

- unpacklo_128_16x8      : unpack the eight low 8-bit values of a vector

- unpackhi_128_16x8      : unpack the eight high 8-bit values of a vector

- unpacklo_128_8x16      : unpack the four low 16-bit values of a vector

- unpackhi_128_8x16      : unpack the four high 16-bit values of a vector

- unpack_128_2x128       : unpack the eight low 8-bit values of a vector
                           into one vector and the eight high 8-bit
                           values into another vector

- unpack_128_2x128_16    : unpack the four low 16-bit values of a vector
                           into one vector and the four high 16-bit
                           values into another vector

- unpack_565_to_8888     : unpack an RGB_565 vector to 8888 vector

- pack_1x128_32          : pack a vector and return the LSB 32-bit of it

- pack_2x128_128         : pack two vectors into one and return it

- negate_2x128           : xor two vectors with mask_00ff (separately)

- is_opaque              : returns whether all the pixels contained in
                           the vector are opaque

- is_zero                : returns whether the vector equals 0

- is_transparent         : returns whether all the pixels
                           contained in the vector are transparent

- expand_pixel_8_1x128   : expand an 8-bit pixel into lower 8 bytes of a
                           vector

- expand_alpha_1x128     : expand alpha from vector and return the new
                           vector

- expand_alpha_2x128     : expand alpha from one vector and another alpha
                           from a second vector

- expand_alpha_rev_2x128 : expand a reversed alpha from one vector and
                           another reversed alpha from a second vector

- pix_multiply_2x128     : do pix_multiply for two vectors (separately)

- over_2x128             : perform over op. on two vectors

- in_over_2x128          : perform in-over op. on two vectors
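
As a rough illustration of what a few of these helpers compute, here is a scalar C model. It is only a sketch of the semantics described in the list, not the VMX code: the sketch_* names and the sketch_pixels_t type are invented for this example, and pixels are assumed to be a8r8g8b8 with alpha in the top byte.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector of four a8r8g8b8 pixels. */
typedef struct { uint32_t px[4]; } sketch_pixels_t;

/* Model of is_opaque: every pixel has alpha == 0xff. */
static int
sketch_is_opaque (sketch_pixels_t v)
{
    int i;

    for (i = 0; i < 4; i++)
        if ((v.px[i] >> 24) != 0xff)
            return 0;
    return 1;
}

/* Model of is_transparent: every pixel has alpha == 0. */
static int
sketch_is_transparent (sketch_pixels_t v)
{
    int i;

    for (i = 0; i < 4; i++)
        if ((v.px[i] >> 24) != 0)
            return 0;
    return 1;
}

/* Model of is_zero: every bit of the vector is 0. */
static int
sketch_is_zero (sketch_pixels_t v)
{
    int i;

    for (i = 0; i < 4; i++)
        if (v.px[i] != 0)
            return 0;
    return 1;
}

/* Model of negate on one unpacked vector: each 16-bit lane holds an
 * 8-bit channel, and xor with 0x00ff yields 255 - channel. */
static void
sketch_negate (uint16_t lanes[8])
{
    int i;

    for (i = 0; i < 8; i++)
        lanes[i] ^= 0x00ff;
}
```

The unpacked form (one channel per 16-bit lane) is what leaves headroom for the multiplications done by pix_multiply and the over operations.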

v2: removed expand_pixel_32_1x128 as it was not used by any function and
its implementation was erroneous

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Oded Gabbay
034149537b vmx: add LOAD_VECTOR macro
This patch adds a macro for loading a single vector.
It also makes the other LOAD_VECTORx macros use this macro as a base so
that code is reused.

In addition, I fixed minor coding style issues.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-07-16 16:13:35 +03:00
Nemanja Lukic
7441340256 MIPS: update author's e-mail address
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-11 23:08:02 +03:00
Pekka Paalanen
e2d211ac49 lowlevel-blt-bench: add option to skip memcpy measurement
The memcpy speed measurement takes several seconds. When you are running
single tests in a harness that iterates dozens or hundreds of times, the
repeated measurements are redundant and take a lot of time. It is also
an open question whether the measured speed changes over long test runs
due to unidentified platform reasons (Raspberry Pi).

Add a command line option to set the reference memcpy speed, skipping
the measuring.

The speed is mainly used to compute how many iterations to run inside
the bench_*() functions, so for repeated testing on the same hardware,
it makes sense to lock that number to a constant.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:50 +03:00
Pekka Paalanen
31cb0d4267 lowlevel-blt-bench: add CSV output mode
Add a command line option for choosing CSV output mode.

In CSV mode, only the results in Mpixels/s are printed in an easily
machine-parseable format. All user-friendly printing is suppressed.

This is intended for cases where you benchmark one particular operation
at a time. Running the "all" set of benchmarks will print just fine, but
you may have trouble matching rows to operations as you have to look at
the tests_tbl[] to see what row is which.

Reviewed-by: Ben Avison <bavison@riscosopen.org>

v2: don't add a space after comma in CSV.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-06 12:04:32 +03:00
Pekka Paalanen
9a7e0bc6d0 lowlevel-blt-bench: refactor to Mpx_per_sec()
Refactor the Mpixels/s computations into a function. Easier to read and
better documents what is being computed.
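
A minimal sketch of such a helper, assuming the caller already has the pixel count and the net elapsed time in seconds. The name and signature below are illustrative and may not match the function added by this patch.

```c
/* Hypothetical helper: convert a pixel count and an elapsed time in
 * seconds into megapixels per second. */
static double
sketch_mpx_per_sec (double pix_cnt, double seconds)
{
    return pix_cnt / seconds / 1e6;
}
```

For example, 2e6 pixels processed in one second gives 2.0 Mpixels/s.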

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:27 +03:00
Pekka Paalanen
6e9c48c579 lowlevel-blt-bench: all bench funcs to return pix_cnt
The bench_* functions, that did not already do it, are modified to
return the number of pixels processed during the benchmark. This moves
the computation to the site that actually determines the number, and
simplifies bench_composite() a bit.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:22 +03:00
Pekka Paalanen
9e8f2bcaf5 lowlevel-blt-bench: move speed and scaling printing
Move the printing of the memory speed and scaling mode into a new
function. This will help with implementing a machine-readable output
option.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:18 +03:00
Pekka Paalanen
a33c2e6853 lowlevel-blt-bench: print single pattern details
When given just a single test pattern instead of "all", print the test
details. This can be used to verify the pattern parser agrees with the
user, just like scaling settings are printed.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:12 +03:00
Pekka Paalanen
3ac7ae2017 lowlevel-blt-bench: make test_entry::testname const
We assign string literals to it, so it better be const.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:07 +03:00
Pekka Paalanen
56d8b365f5 lowlevel-blt-bench: move explanation printing
Move explanation printing to a new function. This will help with
implementing a machine-readable output option.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:04:03 +03:00
Pekka Paalanen
bddff993ed lowlevel-blt-bench: move usage to a function
Move printing of usage into a new function and use argv[0] as the
program name. This will help printing usage from multiple places.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-07-06 12:03:28 +03:00
Oded Gabbay
2be523b204 vmx: fix pix_multiply for ppc64le
vec_mergeh/l operates differently for BE and LE, because of the order of
the vector elements (l->r in BE and r->l in LE).
To fix that, we simply need to swap between the input parameters, in case
we are working in LE.

v2:

- replace _LITTLE_ENDIAN with WORDS_BIGENDIAN for consistency
- fixed whitespaces and indentation issues

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:04:41 +03:00
Oded Gabbay
8d379ad88e vmx: fix unused var warnings
v2: don't put ';' at the end of macro definition. Instead, move it to
    each line the macro is used.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:04:34 +03:00
Oded Gabbay
ff66a4a3ce vmx: encapsulate the temporary variables inside the macros
v2: fixed whitespaces and indentation issues

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:04:27 +03:00
Fernando Seiti Furusato
f6a26d0925 vmx: adjust macros when loading vectors on ppc64le
Replaced usage of vec_lvsl with a direct unaligned assignment
operation (=). That is because, according to Power ABI Specification,
the usage of lvsl is deprecated on ppc64le.

Changed COMPUTE_SHIFT_{MASK,MASKS,MASKC} macro usage to no-op for powerpc
little endian since unaligned access is supported on ppc64le.

v2:

- replace _LITTLE_ENDIAN with WORDS_BIGENDIAN for consistency
- fixed whitespaces and indentation issues

Signed-off-by: Fernando Seiti Furusato <ferseiti@linux.vnet.ibm.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:04:15 +03:00
Oded Gabbay
b3a61703f4 vmx: fix splat_alpha for ppc64le
The permutation vector isn't correct for LE, so correct its values
in case we are in LE mode.

v2:

- replace _LITTLE_ENDIAN with WORDS_BIGENDIAN for consistency
- change #ifndef to #ifdef for readability

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-07-02 10:03:54 +03:00
Ben Avison
eebc1b7820 mmx/sse2: Use SIMPLE_NEAREST_SOLID_MASK_FAST_PATH for NORMAL repeat
These two architectures were the only place where
SIMPLE_NEAREST_SOLID_MASK_FAST_PATH was used, and in both cases the
equivalent SIMPLE_NEAREST_SOLID_MASK_FAST_PATH_NORMAL macro was used
immediately afterwards, so including the NORMAL case in the main macro
simplifies the fast path table.

[Pekka: removed extra comma from the end of
 SIMPLE_NEAREST_SOLID_MASK_FAST_PATH]

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:57:09 +03:00
Ben Avison
7f66928079 mmx/sse2: Use SIMPLE_NEAREST_FAST_PATH macro
There is some reordering, but the only significant thing to ensure that
the same routine is chosen is that a COVER fast path for a given
combination of operator and source/destination pixel formats must
precede all the variants of repeated fast paths for the same
combination. This patch (and the other mmx/sse2 one) still follows that
rule.

I believe that in every other case, the set of operations that match any
pair of fast paths that are reordered in these patches are mutually
exclusive. While there will be a very subtle timing difference due to
the distance through the table we have to search to find a match
(sometimes faster, sometimes slower) there is no evidence that the tables
have been carefully ordered by frequency of occurrence - just for ease
of copy-and-pasting.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:57:00 +03:00
Ben Avison
dee5000abb mips: Retire PIXMAN_MIPS_SIMPLE_NEAREST_A8_MASK_FAST_PATH
This macro does exactly the same thing as the platform-neutral macro
SIMPLE_NEAREST_A8_MASK_FAST_PATH.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:56:54 +03:00
Ben Avison
4c70d18acc arm: Simplify PIXMAN_ARM_SIMPLE_NEAREST_A8_MASK_FAST_PATH
This macro is a superset of the platform-neutral macro
SIMPLE_NEAREST_A8_MASK_FAST_PATH. In other words, in addition to the
_COVER, _NONE and _PAD suffixes, its expansion includes the _NORMAL suffix.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:56:45 +03:00
Ben Avison
de255e6a5e arm: Retire PIXMAN_ARM_SIMPLE_NEAREST_FAST_PATH
This macro does exactly the same thing as the platform-neutral macro
SIMPLE_NEAREST_FAST_PATH.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
2015-06-01 13:56:29 +03:00
Ben Avison
62a772f2ea test: Fix solid-test for big-endian targets
When generating test data, we need to make sure the interpretation of
the data is the same regardless of endianness. That is, the pixel value
for each channel is the same on both little- and big-endian targets.

This fixes a test failure on ppc64 (big-endian).
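
One way to get endian-independent test data, sketched here under the assumption of an a8r8g8b8 pixel, is to build each pixel from channel values with shifts and store it as a 32-bit word, rather than writing raw bytes in memory order. The helper names are invented for this example and the actual change to solid-test may differ.

```c
#include <stdint.h>

/* Hypothetical helper: compose an a8r8g8b8 pixel from its channels.
 * The resulting 32-bit value is the same on any host; only its byte
 * layout in memory differs between little- and big-endian machines. */
static uint32_t
sketch_pack_a8r8g8b8 (uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t) a << 24) | ((uint32_t) r << 16) |
           ((uint32_t) g << 8) | (uint32_t) b;
}

/* Extract the red channel again, also independent of byte order. */
static uint8_t
sketch_red (uint32_t pixel)
{
    return (uint8_t) ((pixel >> 16) & 0xff);
}
```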

Tested-by: Fernando Seiti Furusato <ferseiti@linux.vnet.ibm.com> (ppc64le, ppc64, powerpc)
Tested-by: Ben Avison <bavison@riscosopen.org> (armv6l, armv7l, i686)
[Pekka: added commit message]
Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Tested-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> (x86_64)
2015-06-01 13:11:15 +03:00
Ben Avison
82f9b4faaf test: Add new fuzz tester targeting solid images
This places a heavier emphasis on solid images than the other fuzz testers,
and tests both single-pixel repeating bitmap images as well as those created
using pixman_image_create_solid_fill(). In the former case, it also
exercises the case where the bitmap contents are written to after the
image's first use, which is not a use-case that any other test has
previously covered.

[Pekka: added the default case to the switch in test_solid ().]

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-05-15 16:30:21 +03:00
James Cowgill
cf086d4949 MIPS: Drop #ifdef __ELF__ in definition of LEAF_MIPS32R2
Commit 6d2cf40166 ("MIPS: Fix exported symbols in public API") attempted to
add a .hidden assembly directive, conditional on the code being compiled for an
ELF target. Unfortunately the #ifdef added was already inside a macro and
wasn't expanded properly by the preprocessor.

Fix by removing the check. It's unlikely there are many non-ELF MIPS systems
around anyway.
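
For background: a preprocessor directive cannot appear inside the replacement list of a #define, so a conditional there is never evaluated. This commit simply drops the check. The sketch below only illustrates the usual alternative, which is to resolve the condition outside the macro and use the result inside it. All SKETCH_* names are invented, and the visibility attribute assumes a GCC-compatible compiler.

```c
/* Resolve the condition once, outside of any macro body. */
#ifdef __ELF__
#define SKETCH_HIDDEN __attribute__ ((visibility ("hidden")))
#else
#define SKETCH_HIDDEN
#endif

/* The macro body then only refers to the already-resolved name. */
#define SKETCH_DECLARE(sym) SKETCH_HIDDEN int sym (void)

SKETCH_DECLARE (sketch_leaf);

int
sketch_leaf (void)
{
    return 42;
}
```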

Fixes: Bug 83358 (https://bugs.freedesktop.org/83358)
Fixes: 6d2cf40166 ("MIPS: Fix exported symbols in public API")
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Cc: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
Cc: Nemanja Lukic <nemanja.lukic@rt-rk.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-05-07 12:49:09 +03:00
Bill Spitzak
6f14bae79e test: Added more demos and tests to .gitignore file
Uses a wildcard to handle the majority which end in "-test".

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-05-05 09:49:25 +03:00
Ben Avison
e0c0153d8e test: Add a new benchmarker targeting affine operations
Affine-bench is written by following the example of lowlevel-blt-bench.

Affine-bench differs from lowlevel-blt-bench in the following:
- does not test different sized operations fitting to specific caches,
  destination is always 1920x1080
- allows defining the affine transformation parameters
- carefully computes operation extents to hit the COVER_CLIP fast paths

Original version by Ben Avison. Changes by Pekka in v3:
- commit message
- style fixes
- more comments
- refactoring (e.g. bench_info_t)
- help output tweak

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-24 10:25:42 +03:00
Pekka Paalanen
58e21d3e45 lowlevel-blt-bench: use a8r8g8b8 for CA solid masks
When doing component alpha with a solid mask, use a mask format that has
all the color channels instead of just a8. As Ben Avison explains it:

"Lowlevel-blt-bench initialises all its images using memset(0xCC) so an
a8 solid image would be converted by _pixman_image_get_solid() to
0xCC000000 whereas an a8r8g8b8 would be 0xCCCCCCCC. When you're not in
component alpha mode, only the alpha byte matters for the mask image,
but in the case of component alpha operations, a fast path might decide
that it can save itself a lot of multiplications if it spots that 3
constant mask components are already 0."

No (default) test so far has a solid mask with CA. This is just
future-proofing lowlevel-blt-bench to do what one would expect.
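
The difference Ben describes can be sketched as follows. The helper names are invented for illustration and do not reproduce _pixman_image_get_solid(); they only model widening an a8 value to a8r8g8b8 and the kind of check a component-alpha fast path might make.

```c
#include <stdint.h>

/* An a8 solid only carries alpha, so widening it to a8r8g8b8 leaves
 * the colour channels at 0. */
static uint32_t
sketch_solid_from_a8 (uint8_t a)
{
    return (uint32_t) a << 24;
}

/* Hypothetical check: are all three colour components of the mask 0? */
static int
sketch_mask_rgb_is_zero (uint32_t mask)
{
    return (mask & 0x00ffffffu) == 0;
}
```

With memset(0xCC) data, the a8 mask becomes 0xCC000000 and passes the check, while an a8r8g8b8 mask stays 0xCCCCCCCC and does not, so the two would exercise different code in such a fast path.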

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-20 16:18:18 +03:00
Pekka Paalanen
be49f929b6 lowlevel-blt-bench: use the test pattern parser
Let lowlevel-blt-bench parse the test name string from the command line,
allowing it to run almost infinitely more tests. One is no longer limited
to the tests listed in the big table.

While you can use the old short-hand names like src_8888_8888, you can
also use all possible operators now, and specify pixel formats exactly
rather than just x888, for instance.

This even allows running crazy patterns like
conjoint_over_reverse_a8b8g8r8_n_r8g8b8x8.

All individual patterns are now interpreted through the parser. The
pattern "all" runs the same old default test set as before but through
the parser instead of the hard-coded parameters.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:43:01 +03:00
Pekka Paalanen
5b27912108 lowlevel-blt-bench: add test name parser and self-test
This patch is inspired by "lowlevel-blt-bench: Parse test name strings in
general case" by Ben Avison. From Ben's commit message:

"There are many types of composite operation that are useful to benchmark
but which are omitted from the table. Continually having to add extra
entries to the table is a nuisance and is prone to human error, so this
patch adds the ability to break down unknown strings of the format
  <operation>_<src>[_<mask>]_<dst>[_ca]
where bitmap formats are specified by number of bits of each component
(assumed in ARGB order) or 'n' to indicate a solid source or mask."
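
To make the pattern concrete, here is a simplified sketch of how such a string could be split. It is not the parser added by this patch: the operator list is a small made-up subset, format tokens are kept as plain strings instead of being looked up, and all sketch_* names are hypothetical.

```c
#include <string.h>

/* Hypothetical operator subset; longer names come first so that
 * "over_reverse" is tried before "over". */
static const char *const sketch_ops[] = {
    "conjoint_over_reverse", "over_reverse", "over", "src", "add"
};

typedef struct
{
    char op[32], src[16], mask[16], dst[16];
    int  has_mask, component_alpha;
} sketch_pattern_t;

/* Returns 1 on success, 0 if the string does not fit the pattern. */
static int
sketch_parse_pattern (const char *name, sketch_pattern_t *out)
{
    char   buf[128];
    char  *tok[3];
    char  *p;
    size_t len = strlen (name), oplen = 0, i;
    int    ntok = 0;

    if (len >= sizeof buf)
        return 0;
    strcpy (buf, name);
    memset (out, 0, sizeof *out);

    /* Optional "_ca" suffix selects component alpha. */
    if (len > 3 && strcmp (buf + len - 3, "_ca") == 0)
    {
        out->component_alpha = 1;
        buf[len - 3] = '\0';
    }

    /* Operator: first table entry that is a prefix followed by '_'. */
    for (i = 0; i < sizeof sketch_ops / sizeof sketch_ops[0]; i++)
    {
        size_t n = strlen (sketch_ops[i]);

        if (strncmp (buf, sketch_ops[i], n) == 0 && buf[n] == '_')
        {
            oplen = n;
            break;
        }
    }
    if (oplen == 0)
        return 0;
    memcpy (out->op, buf, oplen);

    /* Remaining tokens: <src>_<dst> or <src>_<mask>_<dst>. */
    for (p = strtok (buf + oplen + 1, "_"); p; p = strtok (NULL, "_"))
    {
        if (ntok == 3 || strlen (p) >= sizeof out->src)
            return 0;
        tok[ntok++] = p;
    }
    if (ntok < 2)
        return 0;

    strcpy (out->src, tok[0]);
    if (ntok == 3)
    {
        out->has_mask = 1;
        strcpy (out->mask, tok[1]);
    }
    strcpy (out->dst, tok[ntok - 1]);
    return 1;
}
```

Because operator names can themselves contain underscores, the sketch matches the operator against a table first and only then splits the rest on '_'.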

Add the parser to lowlevel-blt-bench.c, but do not hook it up to the
command line just yet. Instead, make it run a self-test.

As we now dynamically parse strings similar to the test names in the
huge table 'tests_tbl', we should make sure we can parse the old
well-known test names and produce exactly the same test parameters. The
self-test goes through this old table and verifies the parsing results.

Unfortunately the old table is not exactly consistent, it contains some
special cases that cannot be produced by the parsing rules. Whether
these special cases are intentional or just an oversight is not always
clear. Anyway, add a small table to reproduce the special cases
verbatim.

If we wanted, we could remove the big old table in a follow-up commit,
but then we would also lose the parser self-test.

The point of this whole exercise is to let lowlevel-blt-bench recognize
novel test patterns in the future, following exactly the conventions
used in the old table.

Ben, from what I see, this parser has one major difference to what you
wrote. For a solid mask, your parser uses a8r8g8b8 format, while mine
uses a8 which comes from the old table.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:42:51 +03:00
Pekka Paalanen
1f45bd6565 test/utils: add format aliases used by lowlevel-blt-bench
Lowlevel-blt-bench uses several pixel format shorthands. Pick them from
the great table in lowlevel-blt-bench.c and add them here so that
format_from_string() can recognize them.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:42:45 +03:00
Pekka Paalanen
ef9c28a0e4 test/utils: add operator aliases for lowlevel-blt-bench
Lowlevel-blt-bench uses the operator alias "outrev". Add an alias for it
in the operator-name table.

Also add aliases for overrev, inrev and atoprev, so that
lowlevel-blt-bench can later recognize them for new test cases.

The aliases are added such that an operator-to-name lookup will never
return them; it returns the proper names instead.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:42:40 +03:00
Pekka Paalanen
f1f6cc23ce test/utils: support format name aliases
Previously there was a flat list of formats, used to iterate over all
formats when looking up a format from name or listing them. This cannot
support name aliases.

To support name aliases (multiple name strings mapping to the same
format), create a format-name mapping table. Functions format_name(),
format_from_string(), and list_formats() should keep on working exactly
like before, except format_from_string() now recognizes the additional
formats that format_name() already supported.

Only the formats from the old format list are added with ENTRY, so
that list_formats() works as before. The whole list is verified against
the authoritative list in pixman.h, entries missing from the old list
are commented out.

The extra formats supported by the old format_name() are added as
ALIASes. A side-effect of that is that now also format_from_string()
recognizes the following new names: x4c4 / c8, x4g4 / g8, c4, g4, g1,
yuy2, yv12, null, solid, pixbuf, rpixbuf, unknown.
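
The ENTRY/ALIAS idea can be sketched roughly as below. The enum, table contents and sketch_* names are made up for illustration and do not reproduce the table in test/utils.c: name lookup returns the first non-alias entry, string lookup accepts any entry, and listing skips aliases.

```c
#include <stddef.h>
#include <string.h>

typedef enum { SK_A8R8G8B8, SK_X8R8G8B8, SK_R5G6B5 } sketch_format_t;

typedef struct
{
    sketch_format_t format;
    int             is_alias;
    const char     *name;
} sketch_entry_t;

#define SKETCH_ENTRY(f, n) { f, 0, n }
#define SKETCH_ALIAS(f, n) { f, 1, n }

static const sketch_entry_t sketch_list[] = {
    SKETCH_ENTRY (SK_A8R8G8B8, "a8r8g8b8"),
    SKETCH_ALIAS (SK_A8R8G8B8, "8888"),
    SKETCH_ENTRY (SK_X8R8G8B8, "x8r8g8b8"),
    SKETCH_ALIAS (SK_X8R8G8B8, "x888"),
    SKETCH_ENTRY (SK_R5G6B5,   "r5g6b5"),
    SKETCH_ALIAS (SK_R5G6B5,   "0565"),
};

#define SKETCH_COUNT (sizeof sketch_list / sizeof sketch_list[0])

/* Format to name: aliases are never returned. */
static const char *
sketch_format_name (sketch_format_t f)
{
    size_t i;

    for (i = 0; i < SKETCH_COUNT; i++)
        if (sketch_list[i].format == f && !sketch_list[i].is_alias)
            return sketch_list[i].name;
    return NULL;
}

/* Name to format: proper names and aliases both match.
 * Returns 1 and stores the format on success, 0 otherwise. */
static int
sketch_format_from_string (const char *s, sketch_format_t *out)
{
    size_t i;

    for (i = 0; i < SKETCH_COUNT; i++)
        if (strcmp (sketch_list[i].name, s) == 0)
        {
            *out = sketch_list[i].format;
            return 1;
        }
    return 0;
}

/* Listing: count only the proper entries, as a listing function would. */
static size_t
sketch_count_listed (void)
{
    size_t i, n = 0;

    for (i = 0; i < SKETCH_COUNT; i++)
        if (!sketch_list[i].is_alias)
            n++;
    return n;
}
```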

Name aliases will be useful in follow-up patches, where
lowlevel-blt-bench.c is converted to parse short-hand format names from
strings.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:42:33 +03:00
Pekka Paalanen
2c5fac9320 test/utils: support operator name aliases
Previously there was a flat list of operators (pixman_op_t), used to
iterate over all operators when looking up an operator from name or
listing them. This cannot support name aliases.

To support name aliases (multiple name strings mapping to the same
operator), create an operator-name mapping table. Functions
operator_name, operator_from_string, and list_operators should keep on
working exactly like before, except operator_from_string now recognizes
a few aliases too.

Name aliases will be useful in follow-up patches, where
lowlevel-blt-bench.c is converted to parse operator names from strings.
Lowlevel-blt-bench uses shorthand names instead of the usual names. This
change allows lowlevel-blt-bench.c to use operator_from_string in the
future.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Ben Avison <bavison@riscosopen.org>
2015-04-15 12:41:47 +03:00
Ben Avison
f122907dc1 test: Move format and operator string functions to utils.[ch]
This permits format_from_string(), list_formats(), list_operators() and
operator_from_string() to be used from tests other than check-formats.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-04-13 10:11:51 +03:00
Ben Avison
9bc025f7cd pixman.c: Coding style
A few violations of coding style were identified in code copied from here
into affine-bench.

Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2015-04-09 12:04:55 +03:00
Ben Avison
978dd9fc65 armv6: Fix typo in preload macro
Missing "lsl" meant that in cases with a 32-bit source and/or mask, and an
8-bit destination, the code would not assemble.
2015-04-01 18:38:36 -07:00
Siarhei Siamashka
594e6a6c93 mmx: Fix _mm_empty problems for over_8888_8888/over_8888_n_8888
Using "--disable-sse2 --disable-ssse3" configure options and
CFLAGS="-m32 -O2 -g" on an x86 system results in pixman "make check"
failures:

    ../test-driver: line 95: 29874 Aborted
    FAIL: affine-test
    ../test-driver: line 95: 29887 Aborted
    FAIL: scaling-test

One _mm_empty () was missing and another one is needed to workaround
an old GCC bug https://gcc.gnu.org/PR47759 (GCC may move MMX instructions
around and cause test suite failures).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-24 14:25:30 -07:00
Søren Sandmann Pedersen
a8669137b9 Fix comment about BILINEAR_INTERPOLATION_BITS to say < 8 rather than <= 8
Since a4c79d695d the constant
BILINEAR_INTERPOLATION_BITS must be strictly less than 8, so fix the
comment to say this, and also add a COMPILE_TIME_ASSERT in the
bilinear fetcher in pixman-fast-path.c
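
A compile-time check of this kind can be sketched as below. The macro is a generic negative-array-size assertion written for this example, and the value 7 is an assumption; neither is copied from the pixman sources.

```c
/* Generic compile-time assertion: the typedef has array size -1, and
 * so fails to compile, when the condition is false. */
#define SKETCH_COMPILE_TIME_ASSERT(x)                          \
    do {                                                       \
        typedef int sketch_assertion[(x) ? 1 : -1];            \
        (void) sizeof (sketch_assertion);                      \
    } while (0)

/* Assumed value for the example only. */
#define SKETCH_BILINEAR_INTERPOLATION_BITS 7

static int
sketch_interpolation_bits (void)
{
    /* Changing the constant to 8 would break the build here. */
    SKETCH_COMPILE_TIME_ASSERT (SKETCH_BILINEAR_INTERPOLATION_BITS < 8);
    return SKETCH_BILINEAR_INTERPOLATION_BITS;
}
```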
2014-10-05 12:42:47 -07:00
Matt Turner
f078727f39 mmx: Add nearest over_8888_8888
lowlevel-blt-bench -n, over_8888_8888, 15 iterations on Loongson 2f:

           Before          After
          Mean StdDev     Mean StdDev   Change
    L1    15.8   0.02     24.0   0.06   +52.0%
    L2    14.8   0.15     23.3   0.13   +56.9%
    M     10.3   0.01     13.8   0.03   +33.6%
    HT    10.0   0.02     14.5   0.05   +44.7%
    VT     9.7   0.02     13.5   0.04   +39.2%
    R      9.1   0.01     12.2   0.04   +34.4%
    RT     7.1   0.06      8.9   0.09   +25.2%
2014-09-05 00:22:07 -07:00
Matt Turner
f868ff5e34 mmx: Add nearest over_8888_n_8888
lowlevel-blt-bench -n, over_8888_n_8888, 15 iterations on Loongson 2f:

           Before          After
          Mean StdDev     Mean StdDev   Change
    L1     9.7   0.01     19.2   0.02   +98.2%
    L2     9.6   0.11     19.2   0.16   +99.5%
    M      7.3   0.02     12.5   0.01   +72.0%
    HT     6.6   0.01     13.4   0.02  +103.2%
    VT     6.4   0.01     12.6   0.03   +96.1%
    R      6.3   0.01     11.2   0.01   +76.5%
    RT     4.4   0.01      8.1   0.03   +82.6%
2014-09-05 00:22:04 -07:00
Julien Cristau
b483955605 Upload to unstable 2014-08-23 22:16:47 -07:00
intrigeri
c03d98f8d1 Enable hardening build flags with dpkg-buildflags.
All default dpkg-buildflags, plus the bonus bindnow one, are used.
The last available one (PIE) is not applicable to shared libraries.
2014-08-23 22:16:13 -07:00
Cyril Brulebois
b16d4c7ed7 Upload to unstable 2014-08-18 22:52:38 +02:00
Julien Cristau
f9c2d54a62 Disable vmx on ppc64el (closes: #745547).
Thanks, Breno Leitao!
2014-07-24 22:43:15 +02:00
Julien Cristau
cd23302b1a Upload to unstable 2014-07-13 16:31:09 +02:00
Julien Cristau
98eadfa08b Remove Cyril from Uploaders. 2014-07-13 16:31:02 +02:00
Julien Cristau
9e8362a51f Bump debhelper compat level to 9. 2014-07-13 16:24:29 +02:00
Julien Cristau
6a7a144be1 Bump changelogs 2014-07-13 16:22:42 +02:00
Julien Cristau
fd99d1a9c8 pixman 0.32.6 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQEcBAABAgAGBQJTuIYBAAoJEIWlZJw4kjNuHA8H/0wBuk7d/7uqCAfqyQ3o5Qs9
 q00UvsZVCymC6f1Hh+bgQGtMHgy2Wo1gvw/usSoxlxqc+T4wWeN912RPZwvprVzn
 v9+J7UyjLH28yUVq9NBn91LqHEWfzLK8gf3Y3i3IIQNd9YtIkqjPMyKDuTaQVUYc
 Op6vzXzjzwf0lKjTTZOWsnm9Zh6vvFoqVOajS6hSvA20/xczknAbU3HfUIBI+G4o
 6/br7A6OpIB08vFAJd1XJpAkrHjjIJCECg3wxsfxuCYcoRSWhUPoul4IEkHXn4p4
 mTKjTzBxuDM85FAadTT7PxygABelcQljMlJPKwY4rJwz5t8/yFLc5h5WXft2laI=
 =z2mk
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.32.6' into debian-unstable

pixman 0.32.6 release
2014-07-13 16:21:47 +02:00
Søren Sandmann Pedersen
87eea99e44 Pre-release version bump to 0.32.6 2014-07-05 18:55:43 -04:00
Siarhei Siamashka
9f18ea3483 configure.ac: Check if the compiler supports GCC vector extensions
The Intel Compiler 14.0.0 claims GCC 4.7.3 compatibility
via the __GNUC__/__GNUC_MINOR__ macros, but does not provide the same
level of GCC vector extensions support as the original GCC compiler:
    http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html

Which results in the following compilation failure:

In file included from ../test/utils.h(7),
                 from ../test/utils.c(3):
../test/utils-prng.h(138): error: expression must have integral type
      uint32x4 e = x->a - ((x->b << 27) + (x->b >> (32 - 27)));
                            ^

The problem is fixed by doing a special check in configure for
this feature.
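
A feature probe of this kind might look like the sketch below, which exercises the same vector shift and arithmetic expression that failed to compile. It assumes a compiler with GCC-style vector extensions (vector_size attribute, shift by scalar, element subscripting); the names are invented and the test program actually used by configure may differ.

```c
/* Four 32-bit lanes in one 16-byte vector (GCC vector extension). */
typedef unsigned int sketch_uint32x4 __attribute__ ((vector_size (16)));

/* Uses the operations the failing expression needs: vector shifts by a
 * scalar, vector add and subtract, and subscripting a vector. */
static unsigned int
sketch_vector_probe (unsigned int a0, unsigned int b0)
{
    sketch_uint32x4 a = { a0, a0, a0, a0 };
    sketch_uint32x4 b = { b0, b0, b0, b0 };
    sketch_uint32x4 e = a - ((b << 27) + (b >> (32 - 27)));

    return e[0];
}
```

If this compiles, the compiler supports what the vectorized PRNG code needs; a version check on __GNUC__ alone cannot tell.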
2014-07-04 20:52:59 -04:00
Søren Sandmann
50d7b5fa8e create_bits(): Cast the result of height * stride to size_t
In create_bits() both height and stride are ints, so the result is
also an int, which will overflow if height or stride are big enough
and size_t is bigger than int.

This patch simply casts height to size_t to prevent these overflows,
which prevents the crash in:

    https://bugzilla.redhat.com/show_bug.cgi?id=972647

It's not even close to fixing the full problem of supporting big
images in pixman.
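
The effect of the cast can be sketched like this (the function name is made up for the example). Converting one operand to size_t before multiplying makes the whole product be computed in size_t rather than in int.

```c
#include <stddef.h>

/* Hypothetical helper: buffer size for height rows of stride bytes.
 * Casting height first promotes the multiplication to size_t, so the
 * product no longer overflows int where size_t is wider than int.
 * Assumes both arguments are non-negative. */
static size_t
sketch_buffer_size (int height, int stride)
{
    return (size_t) height * stride;
}
```

Without the cast, height * stride is evaluated as int and overflows for, say, 65536 rows of 65536 bytes.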

See also

    https://bugs.freedesktop.org/show_bug.cgi?id=69014
2014-07-04 20:50:58 -04:00
Nemanja Lukic
6d2cf40166 MIPS: Fix exported symbols in public API. 2014-07-03 13:35:21 -04:00
Nemanja Lukic
c42824ebb5 MIPS: Fix exported symbols in public API. 2014-07-03 13:34:53 -04:00
Søren Sandmann Pedersen
5a2edb3f2c test: Rearrange tests in order of increasing runtime
Making short tests run first is convenient to catch obvious bugs
early.
2014-06-28 19:24:27 -04:00
Søren Sandmann Pedersen
9cd283b2eb pixman-gradient-walker: Make left_x and right_x 64 bit variables
The variables left_x, and right_x in gradient_walker_reset() are
computed from pos, which is a 64 bit quantity, so to avoid overflows,
these variables must be 64 bit as well.

Similarly, the left_x and right_x that are stored in
pixman_gradient_walker_t need to be 64 bit as well; otherwise,
pixman_gradient_walker_pixel() will call reset too often.

This fixes the radial-invalid test, which was generating 'invalid'
floating point exceptions when the overflows caused color values to be
outside of [0, 255].
2014-05-15 13:29:58 -04:00
Søren Sandmann Pedersen
f5f5dbbbc6 test: Add radial-invalid test program
This program demonstrates a bug in gradient walker, where some integer
overflows cause colors outside the range [0, 255] to be generated,
which in turn cause 'invalid' floating point exceptions when those
colors are converted to uint8_t.

The bug was first reported by Owen Taylor on the #cairo IRC channel.
2014-05-15 13:29:38 -04:00
Ben Avison
91f32ce961 ARMv6: Add fast path for src_x888_0565
Benchmark results, "before" is upstream/master
5f661ee719, and "after" contains this
patch on top.

lowlevel-blt-bench, src_8888_0565, 100 iterations:

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1    25.9   0.20    115.6   0.70    100.00%    +347.1%
L2    14.4   0.23     52.7   3.48    100.00%    +265.0%
M     14.1   0.01     79.8   0.17    100.00%    +465.9%
HT    10.2   0.03     32.9   0.31    100.00%    +221.2%
VT     9.8   0.03     29.8   0.25    100.00%    +203.4%
R      9.4   0.03     27.8   0.18    100.00%    +194.7%
RT     4.6   0.04     10.9   0.29    100.00%    +135.9%

At most 19 outliers rejected per test per set.

cairo-perf-trace with trimmed traces results were indifferent.

A system-wide perf_3.10 profile on Raspbian shows significant
differences in the X server CPU usage. The following were measured from
a 130x62 char lxterminal running 'dmesg' every 0.5 seconds for roughly
30 seconds. These profiles are libpixman.so symbols only.

Before:

Samples: 63K of event 'cpu-clock', Event count (approx.): 2941348112, DSO: libpixman-1.so.0.33.1
 37.77%  Xorg  [.] fast_fetch_r5g6b5
 14.39%  Xorg  [.] pixman_composite_over_n_8_8888_asm_armv6
  8.51%  Xorg  [.] fast_write_back_r5g6b5
  7.38%  Xorg  [.] pixman_composite_src_8888_8888_asm_armv6
  4.39%  Xorg  [.] pixman_composite_add_8_8_asm_armv6
  3.69%  Xorg  [.] pixman_composite_src_n_8888_asm_armv6
  2.53%  Xorg  [.] _pixman_image_validate
  2.35%  Xorg  [.] pixman_image_composite32

After:

Samples: 31K of event 'cpu-clock', Event count (approx.): 3619782704, DSO: libpixman-1.so.0.33.1
 22.36%  Xorg  [.] pixman_composite_over_n_8_8888_asm_armv6
 13.59%  Xorg  [.] pixman_composite_src_x888_0565_asm_armv6
 12.75%  Xorg  [.] pixman_composite_src_8888_8888_asm_armv6
  6.79%  Xorg  [.] pixman_composite_add_8_8_asm_armv6
  5.95%  Xorg  [.] pixman_composite_src_n_8888_asm_armv6
  4.12%  Xorg  [.] pixman_image_composite32
  3.69%  Xorg  [.] _pixman_image_validate
  3.65%  Xorg  [.] _pixman_bits_image_setup_accessors

Before, fast_fetch_r5g6b5 + fast_write_back_r5g6b5 took 46% of the
samples in libpixman, and probably incurred some memcpy() load, too.
After, pixman_composite_src_x888_0565_asm_armv6 takes 14%. Note that
the sample counts are very different before/after, as less time is spent
in Pixman and running time is not exactly the same.

Furthermore, in the above test, the CPU idle function was sampled 9%
before, and 15% after.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Re-benchmarked on Raspberry Pi, commit message.
2014-05-01 15:11:42 -04:00
Pekka Paalanen
5f661ee719 ARM: use pixman_asm_function in internal headers
The two ARM headers contained open-coded copies of pixman_asm_function,
replace these.

Since it seems customary that ARM headers do not use CPP include guards,
rely on the .S files to #include "pixman-arm-asm.h" first. They all
do now.

v2: Fix a build failure on rpi by adding one #include.
2014-04-21 20:38:09 -04:00
Ben Avison
ab587b444c ARMv6: Add fast path for in_reverse_8888_8888
Benchmark results, "before" is the patch
* upstream/master 4b76bbfda6
+ ARMv6: Support for very variable-hungry composite operations
+ ARMv6: Add fast path for over_n_8888_8888_ca
and "after" contains the additional patches on top:
+ ARMv6: Add fast path flag to force no preload of destination buffer
+ ARMv6: Add fast path for in_reverse_8888_8888 (this patch)

lowlevel-blt-bench, in_reverse_8888_8888, 100 iterations:

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1    21.1   0.07     32.3   0.08    100.00%     +52.9%
L2    11.6   0.29     18.0   0.52    100.00%     +54.4%
M     10.5   0.01     16.1   0.03    100.00%     +54.1%
HT     8.2   0.02     12.0   0.04    100.00%     +45.9%
VT     8.1   0.02     11.7   0.04    100.00%     +44.5%
R      8.1   0.02     11.3   0.04    100.00%     +39.7%
RT     4.8   0.04      6.1   0.09    100.00%     +27.3%

At most 12 outliers rejected per test per set.

cairo-perf-trace with trimmed traces, 30 iterations:

                                    Before          After
                                   Mean StdDev     Mean StdDev   Confidence   Change
t-firefox-paintball.trace          18.0   0.01     14.1   0.01    100.00%     +27.4%
t-firefox-chalkboard.trace         36.7   0.03     36.0   0.02    100.00%      +1.9%
t-firefox-canvas-alpha.trace       20.7   0.22     20.3   0.22    100.00%      +1.9%
t-swfdec-youtube.trace              7.8   0.03      7.8   0.03    100.00%      +0.9%
t-firefox-talos-gfx.trace          25.8   0.44     25.6   0.29     93.87%      +0.7%  (insignificant)
t-firefox-talos-svg.trace          20.6   0.04     20.6   0.03    100.00%      +0.2%
t-firefox-fishbowl.trace           21.2   0.04     21.1   0.02    100.00%      +0.2%
t-xfce4-terminal-a1.trace           4.8   0.01      4.8   0.01     98.85%      +0.2%  (insignificant)
t-swfdec-giant-steps.trace         14.9   0.03     14.9   0.02     99.99%      +0.2%
t-poppler-reseau.trace             22.4   0.11     22.4   0.08     86.52%      +0.2%  (insignificant)
t-gnome-system-monitor.trace       17.3   0.03     17.2   0.03     99.74%      +0.2%
t-firefox-scrolling.trace          24.8   0.12     24.8   0.11     70.15%      +0.1%  (insignificant)
t-firefox-particles.trace          27.5   0.18     27.5   0.21     48.33%      +0.1%  (insignificant)
t-grads-heat-map.trace              4.4   0.04      4.4   0.04     16.61%      +0.0%  (insignificant)
t-firefox-fishtank.trace           13.2   0.01     13.2   0.01      7.64%      +0.0%  (insignificant)
t-firefox-canvas.trace             18.0   0.05     18.0   0.05      1.31%      -0.0%  (insignificant)
t-midori-zoomed.trace               8.0   0.01      8.0   0.01     78.22%      -0.0%  (insignificant)
t-firefox-planet-gnome.trace       10.9   0.02     10.9   0.02     64.81%      -0.0%  (insignificant)
t-gvim.trace                       33.2   0.21     33.2   0.18     38.61%      -0.1%  (insignificant)
t-firefox-canvas-swscroll.trace    32.2   0.09     32.2   0.11     73.17%      -0.1%  (insignificant)
t-firefox-asteroids.trace          11.1   0.01     11.1   0.01    100.00%      -0.2%
t-evolution.trace                  13.0   0.05     13.0   0.05     91.99%      -0.2%  (insignificant)
t-gnome-terminal-vim.trace         19.9   0.14     20.0   0.14     97.38%      -0.4%  (insignificant)
t-poppler.trace                     9.8   0.06      9.8   0.04     99.91%      -0.5%
t-chromium-tabs.trace               4.9   0.02      4.9   0.02    100.00%      -0.6%

At most 6 outliers rejected per test per set.

Cairo perf reports the running time, but the change is computed for
operations per second instead (inverse of running time).
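
As a rough illustration of the note above, the change column can be
approximated from the rounded means in the table. The function name is
hypothetical; this is only a sketch of the arithmetic, not the script
that produced the tables.

```c
#include <assert.h>

/* Hypothetical helper: percentage change in operations per second,
 * given the running times before and after a patch. */
double ops_per_second_change_percent(double time_before, double time_after)
{
    assert(time_before > 0.0 && time_after > 0.0);
    /* ops/s is 1/time, so (1/after) / (1/before) - 1 == before/after - 1. */
    return (time_before / time_after - 1.0) * 100.0;
}
```

For t-firefox-paintball.trace, the rounded means 18.0 and 14.1 give
roughly +27.7%, close to the +27.4% in the table, which was presumably
computed from unrounded means.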

Confidence is based on Welch's t-test. Absolute changes of less than 1%
can be attributed to measurement error, even if statistically
significant.

There was a question of why FLAG_NO_PRELOAD_DST is used. It makes
lowlevel-blt-bench results worse except for L1, but improves some
Cairo trace benchmarks.

"Ben Avison" <bavison@riscosopen.org> wrote:

> The thing with the lowlevel-blt-bench benchmarks for the more
> sophisticated composite types (as a general rule, anything that involves
> branches at the per-pixel level) is that they are only profiling the case
> where you have mid-level alpha values in the source/mask/destination.
> Real-world images typically have a disproportionate number of fully
> opaque and fully transparent pixels, which is why when there's a
> discrepancy between which implementation performs best with cairo-perf
> trace versus lowlevel-blt-bench, I usually favour the Cairo winner.
>
> The results of removing FLAG_NO_PRELOAD_DST (in other words, adding
> preload of the destination buffer) are easy to explain in the
> lowlevel-blt-bench results. In the L1 case, the destination buffer is
> already in the L1 cache, so adding the preloads is simply adding extra
> instruction cycles that have no effect on memory operations. The "in"
> compositing operator depends upon the alpha of both source and
> destination, so if you use uniform mid-alpha, then you actually do need
> to read your destination pixels, so you benefit from preloading them. But
> for fully opaque or fully transparent source pixels, you don't need to
> read the corresponding destination pixel - it'll either be left alone or
> overwritten. Since the ARM11 doesn't use write-allocate caching, both of
> these cases avoid the time taken to load the extra cachelines and
> increase the efficiency of the cache for other data. If you
> examine the source images being used by the Cairo test, you'll probably
> find they mostly use transparent or opaque pixels.
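
The point about opaque and transparent source pixels can be sketched in
plain C. This is an illustrative scalar loop, assuming premultiplied
a8r8g8b8 pixels and the Render definition of IN_REVERSE (dest = dest *
source alpha); the function names are hypothetical and this is not the
ARMv6 assembly from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply an 8-bit channel by an 8-bit alpha, rounding to nearest. */
static uint32_t scale_channel(uint32_t channel, uint32_t alpha)
{
    return (channel * alpha + 127) / 255;
}

/* Sketch of IN_REVERSE: dest = dest * src_alpha, per channel. */
void in_reverse_sketch(uint32_t *dst, const uint32_t *src, size_t count)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < count; i++) {
        uint32_t sa = src[i] >> 24;
        uint32_t d, out = 0;
        int shift;

        if (sa == 0xff)
            continue;       /* opaque source: dest left alone, never read */
        if (sa == 0) {
            dst[i] = 0;     /* transparent source: dest overwritten, never read */
            continue;
        }
        d = dst[i];         /* only mid-alpha pixels need the destination */
        for (shift = 0; shift < 32; shift += 8)
            out |= scale_channel((d >> shift) & 0xff, sa) << shift;
        dst[i] = out;
    }
}
```

Only the last branch reads the destination, which is why preloading it
helps the uniform mid-alpha benchmark but not images dominated by opaque
or transparent pixels.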

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Rebased, re-benchmarked on Raspberry Pi, commit message.

v5, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Rebased, re-benchmarked on Raspberry Pi due to a fix to
	"ARMv6: Add fast path for over_n_8888_8888_ca" patch.
2014-04-21 20:34:26 -04:00
Ben Avison
68d2f7b486 ARMv6: Add fast path flag to force no preload of destination buffer 2014-04-21 20:34:26 -04:00
Ben Avison
4ad769cbec ARMv6: Add fast path for over_n_8888_8888_ca
Benchmark results, "before" is
* upstream/master 4b76bbfda6
"after" contains the additional patches on top:
+ ARMv6: Support for very variable-hungry composite operations
+ ARMv6: Add fast path for over_n_8888_8888_ca (this patch)

lowlevel-blt-bench, over_n_8888_8888_ca, 100 iterations:

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1     2.7   0.00     16.1   0.06    100.00%    +500.7%
L2     2.4   0.01     14.1   0.15    100.00%    +489.9%
M      2.3   0.00     14.3   0.01    100.00%    +510.2%
HT     2.2   0.00      9.7   0.03    100.00%    +345.0%
VT     2.2   0.00      9.4   0.02    100.00%    +333.4%
R      2.2   0.01      9.5   0.03    100.00%    +331.6%
RT     1.9   0.01      5.5   0.07    100.00%    +192.7%

At most 1 outliers rejected per test per set.

cairo-perf-trace with trimmed traces, 30 iterations:

                                    Before          After
                                   Mean StdDev     Mean StdDev   Confidence   Change
t-firefox-talos-gfx.trace          33.1   0.42     25.8   0.44    100.00%     +28.6%
t-firefox-scrolling.trace          31.4   0.11     24.8   0.12    100.00%     +26.3%
t-gnome-terminal-vim.trace         22.4   0.10     19.9   0.14    100.00%     +12.5%
t-evolution.trace                  13.9   0.07     13.0   0.05    100.00%      +6.5%
t-firefox-planet-gnome.trace       11.6   0.02     10.9   0.02    100.00%      +6.5%
t-gvim.trace                       34.0   0.21     33.2   0.21    100.00%      +2.4%
t-chromium-tabs.trace               4.9   0.02      4.9   0.02    100.00%      +1.0%
t-poppler.trace                     9.8   0.05      9.8   0.06    100.00%      +0.7%
t-firefox-canvas-swscroll.trace    32.3   0.10     32.2   0.09    100.00%      +0.4%
t-firefox-paintball.trace          18.1   0.01     18.0   0.01    100.00%      +0.3%
t-poppler-reseau.trace             22.5   0.09     22.4   0.11     99.29%      +0.3%
t-firefox-canvas.trace             18.1   0.06     18.0   0.05     99.29%      +0.2%
t-xfce4-terminal-a1.trace           4.8   0.01      4.8   0.01     99.77%      +0.2%
t-firefox-fishbowl.trace           21.2   0.03     21.2   0.04    100.00%      +0.2%
t-gnome-system-monitor.trace       17.3   0.03     17.3   0.03     99.54%      +0.1%
t-firefox-asteroids.trace          11.1   0.01     11.1   0.01    100.00%      +0.1%
t-midori-zoomed.trace               8.0   0.01      8.0   0.01     99.98%      +0.1%
t-grads-heat-map.trace              4.4   0.04      4.4   0.04     34.08%      +0.1%  (insignificant)
t-firefox-talos-svg.trace          20.6   0.03     20.6   0.04     54.06%      +0.0%  (insignificant)
t-firefox-fishtank.trace           13.2   0.01     13.2   0.01     52.81%      -0.0%  (insignificant)
t-swfdec-giant-steps.trace         14.9   0.02     14.9   0.03     85.50%      -0.1%  (insignificant)
t-firefox-chalkboard.trace         36.6   0.02     36.7   0.03    100.00%      -0.2%
t-firefox-canvas-alpha.trace       20.7   0.32     20.7   0.22     55.76%      -0.3%  (insignificant)
t-swfdec-youtube.trace              7.8   0.02      7.8   0.03    100.00%      -0.5%
t-firefox-particles.trace          27.4   0.16     27.5   0.18     99.94%      -0.6%

At most 4 outliers rejected per test per set.

Cairo perf reports the running time, but the change is computed for
operations per second instead (inverse of running time).

Confidence is based on Welch's t-test. Absolute changes of less than 1%
can be attributed to measurement error, even if statistically
significant.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Use pixman_asm_function instead of startfunc.
	Rebased. Re-benchmarked on Raspberry Pi.
	Commit message.

v5, Ben Avison <bavison@riscosopen.org> :
	Fixed the bug exposed in blitters-test 4928372.
	15 hours of testing, compared to the 45 minutes to hit
	the bug originally.
    Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Squash the fix, re-benchmark on Raspberry Pi.
2014-04-21 20:34:26 -04:00
Ben Avison
73d2f8b61a ARMv6: Support for very variable-hungry composite operations
Previously, the variable ARGS_STACK_OFFSET was available to extract values
from function arguments during the init macro. Now this changes dynamically
around stack operations in the function as a whole so that arguments can be
accessed at any point. It is also joined by LOCALS_STACK_OFFSET, which
allows access to space reserved on the stack during the init macro.

On top of this, composite macros now have the option of using all of WK0-WK3
registers rather than just the subset it was told to use; this requires the
pixel count to be spilled to the stack over the leading pixels at the start
of each line. Thus, at best, each composite operation can use 11 registers,
plus any pointer registers not required for the composite type, plus as much
stack space as it needs, divided up into constants and variables as necessary.
2014-04-21 20:34:26 -04:00
Søren Sandmann
857e40f3d2 create_bits(): Cast the result of height * stride to size_t
In create_bits() both height and stride are ints, so the result is
also an int, which will overflow if height or stride are big enough
and size_t is bigger than int.

This patch simply casts height to size_t to prevent these overflows,
which prevents the crash in:

    https://bugzilla.redhat.com/show_bug.cgi?id=972647

It's not even close to fixing the full problem of supporting big
images in pixman.

See also

    https://bugs.freedesktop.org/show_bug.cgi?id=69014
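
A minimal sketch of the cast described above. The helper name is
hypothetical and this is not the body of create_bits(); it only shows
where the cast has to go for the multiplication to happen in size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size in bytes of a height x stride buffer. */
size_t buffer_size_bytes(int height, int stride)
{
    assert(height >= 0 && stride >= 0);
    /* "height * stride" alone would be evaluated in int and could
     * overflow; casting one operand first makes the product size_t. */
    return (size_t) height * stride;
}
```

Where size_t is no wider than int, the product can still wrap, which is
consistent with the remark that this does not solve the general problem
of big images.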
2014-04-15 14:21:14 -04:00
Pekka Paalanen
4b76bbfda6 ARM: share pixman_asm_function definition
Several files define identically the asm macro pixman_asm_function.
Merge all these definitions into a new asm header.

The original definition is taken from pixman-arm-simd-asm-scaled.S with
the copyright/licence/author blurb verbatim.
2014-04-02 12:48:26 +03:00
Ben Avison
4ee85b0083 ARMv6: Add fast path for over_reverse_n_8888
Benchmark results, "before" is upstream commit
c343846 lowlevel-blt-bench: add in_reverse_8888_8888 test
and "after" is with this patch only added on top.

lowlevel-blt-bench, over_reverse_n_8888, 100 iterations:

       Before          After
      Mean StdDev     Mean StdDev   Confidence   Change
L1    15.1    0.1    274.5    2.3    100.00%   +1718.9%
L2    12.8    0.3    181.8    0.7    100.00%   +1315.5%
M     10.8    0.0     77.9    0.0    100.00%    +621.2%
HT     9.7    0.0     29.4    0.2    100.00%    +204.9%
VT     9.5    0.0     26.7    0.1    100.00%    +179.3%
R      9.3    0.0     25.3    0.1    100.00%    +173.6%
RT     6.0    0.1     11.0    0.2    100.00%     +82.9%

At most 16 outliers rejected per case per set.

cairo-perf-trace with trimmed traces, 30 iterations:

                                    Before          After
                                   Mean StdDev     Mean StdDev   Confidence   Change
t-poppler.trace                    12.9    0.1      9.7    0.0    100.00%     +32.6%
t-firefox-talos-gfx.trace          33.2    0.7     32.9    0.4     95.23%      +0.9%  (insignificant)
t-firefox-particles.trace          27.4    0.1     27.3    0.2     99.65%      +0.4%
t-firefox-canvas-alpha.trace       20.5    0.3     20.5    0.3     57.51%      +0.3%  (insignificant)
t-poppler-reseau.trace             22.4    0.1     22.4    0.1     95.69%      +0.3%  (insignificant)
t-firefox-fishtank.trace           13.2    0.0     13.2    0.0     99.84%      +0.1%
t-swfdec-giant-steps.trace         14.9    0.0     14.9    0.0     87.68%      +0.1%  (insignificant)
t-swfdec-youtube.trace              7.8    0.0      7.8    0.0     35.22%      +0.1%  (insignificant)
t-firefox-planet-gnome.trace       11.5    0.0     11.5    0.0     29.37%      +0.0%  (insignificant)
t-firefox-fishbowl.trace           21.2    0.0     21.2    0.0     18.09%      +0.0%  (insignificant)
t-grads-heat-map.trace              4.4    0.0      4.4    0.0      1.84%      +0.0%  (insignificant)
t-firefox-paintball.trace          18.0    0.0     18.0    0.0     33.43%      -0.0%  (insignificant)
t-firefox-talos-svg.trace          20.5    0.0     20.5    0.1     68.56%      -0.1%  (insignificant)
t-midori-zoomed.trace               8.0    0.0      8.0    0.0     99.98%      -0.1%
t-firefox-canvas-swscroll.trace    32.1    0.1     32.1    0.1     85.27%      -0.1%  (insignificant)
t-gnome-system-monitor.trace       17.2    0.0     17.2    0.0     99.97%      -0.2%
t-firefox-chalkboard.trace         36.5    0.0     36.6    0.0    100.00%      -0.2%
t-firefox-asteroids.trace          11.1    0.0     11.1    0.0    100.00%      -0.2%
t-firefox-canvas.trace             17.9    0.0     18.0    0.0    100.00%      -0.3%
t-chromium-tabs.trace               4.9    0.0      4.9    0.0     97.95%      -0.3%  (insignificant)
t-xfce4-terminal-a1.trace           4.8    0.0      4.8    0.0    100.00%      -0.4%
t-firefox-scrolling.trace          31.1    0.1     31.2    0.1    100.00%      -0.5%
t-evolution.trace                  13.7    0.1     13.8    0.1     99.99%      -0.6%
t-gnome-terminal-vim.trace         22.0    0.2     22.2    0.1     99.99%      -0.7%
t-gvim.trace                       33.2    0.2     33.5    0.2    100.00%      -0.8%

At most 6 outliers rejected per case per set.

Cairo perf reports the running time, but the change is computed for
operations per second instead (inverse of running time).

Changes on the order of +/- 1% can be attributed to measurement error,
even if they are deemed to be statistically significant. This claim is
based on comparing two 30-iteration identical "before" runs using the
exact same binaries, and observing changes from -0.4% to +0.5% with
>=99% confidence.

Confidence is based on Welch's t-test.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Rebased, re-benchmarked on Raspberry Pi, commit message.
2014-04-02 12:46:24 +03:00
Siarhei Siamashka
56622140e3 test: Fix OpenMP clauses for the tolerance-test
Compiling with the Intel Compiler reveals a problem:

tolerance-test.c(350): error: index variable "i" of for statement following an OpenMP for pragma must be private
  #       pragma omp parallel for default(none) shared(i) private (result)
  ^

In addition to this, the 'result' variable also should not be private
(otherwise its value does not survive after the end of the loop). It
needs to be either shared or use the reduction clause to describe how
the results from multiple threads are combined together. Reduction
seems to be more appropriate here.
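
A sketch of the clause layout described above, using a hypothetical
failure counter rather than the real loop in tolerance-test.c. The loop
index is private and the accumulated value uses a reduction, so each
thread's partial count is summed at the end of the loop.

```c
#include <assert.h>

/* Hypothetical example: count entries of ok[] that are zero. */
int count_failures(const int *ok, int n)
{
    int result = 0;
    int i;

    assert(n >= 0);
#pragma omp parallel for default(none) shared(ok, n) private(i) reduction(+:result)
    for (i = 0; i < n; i++) {
        if (!ok[i])
            result += 1;
    }
    return result;
}
```

Without OpenMP support the pragma is ignored and the loop simply runs
serially with the same result.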
2014-04-02 12:46:09 +03:00
Siarhei Siamashka
840912b311 configure.ac: Check if the compiler supports GCC vector extensions
The Intel Compiler 14.0.0 claims GCC 4.7.3 compatibility
via __GNUC__/__GNUC_MINOR__ macros, but does not provide the same
level of GCC vector extensions support as the original GCC compiler:
    http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html

Which results in the following compilation failure:

In file included from ../test/utils.h(7),
                 from ../test/utils.c(3):
../test/utils-prng.h(138): error: expression must have integral type
      uint32x4 e = x->a - ((x->b << 27) + (x->b >> (32 - 27)));
                            ^

The problem is fixed by doing a special check in configure for
this feature.
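
For illustration, a configure-time probe could try to compile something
like the following, which uses the same kind of vector-by-scalar shift
that failed above. This is a sketch with hypothetical function names;
the actual test program in configure.ac may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GCC vector extension: four 32-bit lanes in one 16-byte value. */
typedef uint32_t uint32x4 __attribute__ ((vector_size (16)));

/* Rotate every lane left by 27 bits using vector shifts by a scalar. */
uint32x4 rotate_left_27 (uint32x4 v)
{
    return (v << 27) | (v >> (32 - 27));
}

/* Copy lane 0 out with memcpy, so no vector subscripting is needed. */
uint32_t first_lane (uint32x4 v)
{
    uint32_t lanes[4];

    assert (sizeof lanes == sizeof v);
    memcpy (lanes, &v, sizeof lanes);
    return lanes[0];
}
```

A compiler that advertises GCC compatibility but rejects this source
lacks the level of vector extension support the PRNG code relies on.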
2014-04-02 12:46:04 +03:00
Ben Avison
c343846625 lowlevel-blt-bench: add in_reverse_8888_8888 test
in_reverse_8888_8888 is one of the more commonly used operations in the
cairo-perf-trace suite that hasn't been in lowlevel-blt-bench until now.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Split from "Add extra test to lowlevel-blt-bench and fix an
	existing one", new summary.
2014-03-20 08:33:05 -04:00
Ben Avison
898859f3d3 lowlevel-blt-bench: over_reverse_n_8888 needs solid source
v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Split from "Add extra test to lowlevel-blt-bench and fix an
	existing one", new summary.
2014-03-20 08:33:05 -04:00
Ben Avison
38317cbfde ARMv6: remove 1 instr per row in generate_composite_function
This knocks off one instruction per row. The effect is probably too small to
be measurable, but might as well be included. The second occurrence of this
sequence doesn't actually benefit at all, but is changed for consistency.

The saved instruction comes from combining the "and" inside the .if
statement with an earlier "tst". The "and" was normally needed, except
for in one special case, where bits 4-31 were all shifted off the top of
the register later on in preload_leading_step2, so we didn't care about
their values.

v4, Pekka Paalanen <pekka.paalanen@collabora.co.uk> :
	Remove "bits 0-3" from the comments, update patch summary, and
	augment message with Ben's suggestion.
2014-03-20 08:33:05 -04:00
Ben Avison
763a6d3e67 ARMv6: Fix indentation in the composite macros 2014-03-20 08:33:05 -04:00
Søren Sandmann
82d094654a Remove all the operators that use division from pixman-combine32.c
These are now handled by floating point combiners.
2014-01-04 16:13:27 -05:00
Søren Sandmann
ccb1df0c5e Copy the comments from pixman-combine32.c to pixman-combine-float.c
An upcoming commit will delete many of the operators from
pixman-combine32.c and rely on the ones in pixman-combine-float.c. The
comments about how the operators were derived are still useful though,
so copy them into pixman-combine-float.c before the deletion.
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
94244b0c40 utils.c: Set DEVIATION to 0.0128
Consider a HARD_LIGHT operation with the following pixels:

- source:           15      (6 bits)
- source alpha:     255     (8 bits)
- mask alpha:       223     (8 bits)
- dest              255     (8 bits)
- dest alpha:       0       (8 bits)

Since 2 times the source is less than source alpha, the first branch
of the hard light blend mode is taken:

        (1 - sa) * d + (1 - da) * s + 2 * s * d

Since da is 0 and d is 1, this degenerates to:

        (1 - sa) + 3 * s

Taking (src IN mask) into account along with the fact that sa is 1,
this becomes:

        (1 - ma) + 3 * s * ma

      = (1 - 223/255.0) + 3 * (15/63.0) * (223/255.0)

      = 0.7501400560224089

When computed with the source converted by bit replication to eight
bits, and additionally with the (src IN mask) part rounded to eight
bits, we get:

        ma = 223/255.0

        s * ma = (60 / 255.0) * (223/255.0) which rounds to 52 / 255

and the result is

        (1 - ma) + 3 * s * ma

      = (1 - 223/255.0) + 3 * 52/255.0

      = 0.7372549019607844

so now we have an error of 0.012885.

Without making changes to the way pixman does integer
rounding/arithmetic, this error must then be considered
acceptable. Due to conservative computations in the test suite we can
however get away with 0.0128 as the acceptable deviation.

This fixes the remaining failures in pixel-test.
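
The arithmetic above can be reproduced with a short sketch. The function
names are hypothetical, and the integer rounding shown is a plain
round-to-nearest, assumed here to stand in for pixman's own 8-bit
multiply.

```c
#include <assert.h>

/* Full precision: (1 - ma) + 3 * s * ma with a 6-bit source of 15. */
double hard_light_exact(void)
{
    double ma = 223 / 255.0;
    double s = 15 / 63.0;

    return (1.0 - ma) + 3.0 * s * ma;
}

/* 8-bit path: source widened by bit replication, (src IN mask) rounded. */
double hard_light_8bit(void)
{
    unsigned s8 = (15u << 2) | (15u >> 4);          /* 6 -> 8 bits: 60 */
    unsigned s_in_mask = (s8 * 223u + 127u) / 255u; /* rounds to 52 */
    double ma = 223 / 255.0;

    assert(s8 == 60 && s_in_mask == 52);
    return (1.0 - ma) + 3.0 * s_in_mask / 255.0;
}
```

The difference between the two results is about 0.012885, matching the
error derived above.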
2014-01-04 16:13:27 -05:00
Søren Sandmann
15aa37adec Use floating point combiners for all operators that involve divisions
Consider a DISJOINT_ATOP operation with the following pixels:

- source:	0xff (8 bits)
- source alpha:	0x01 (8 bits)
- mask alpha:	0x7b (8 bits)
- dest:		0x00 (8 bits)
- dest alpha:	0xff (8 bits)

When (src IN mask) is computed in 8 bits, the resulting alpha channel
is 0 due to rounding:

     floor ((0x01 * 0x7b) / 255.0 + 0.5) = floor (0.9823) = 0

which means that since Render defines any division by zero as
infinity, the Fa and Fb for this operator end up as follows:

     Fa = max (1 - (1 - 1) / 0, 0) = 0

     Fb = min (1, (1 - 0) / 1) = 1

and so since dest is 0x00, the overall result is 0.

However, when computed in full precision, the alpha value no longer
rounds to 0, and so Fa ends up being

     Fa = max (1 - (1 - 1) / 0.0001, 0) = 1

and so the result is now

     s * ma * Fa + d * Fb

   = (1.0 * (0x7b / 255.0) * 1) + d * 0

   = 0x7b / 255.0

   = 0.4823

so the error in this case ends up being 0.48235294, which is clearly
not something that can be considered acceptable.

In order to avoid this problem, we need to do all arithmetic in such a
way that a multiplication of two tiny numbers can never end up being
zero unless one of the input numbers is itself zero.

This patch makes all computations that involve divisions take place in
floating point, which is sufficient to fix the test cases.

This brings the number of failures in pixel-test down to 14.
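
The rounding problem at the root of this can be shown in isolation. The
helpers below are hypothetical and use a plain round-to-nearest 8-bit
multiply as a stand-in for pixman's integer arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 8-bit normalized values, rounding to nearest. */
uint8_t mul_8bit(uint8_t a, uint8_t b)
{
    unsigned product = (unsigned) a * b;

    assert(product <= 255u * 255u);
    return (uint8_t) ((product + 127u) / 255u);
}

/* The same product in floating point, with both inputs scaled to [0, 1]. */
float mul_float(uint8_t a, uint8_t b)
{
    return (a / 255.0f) * (b / 255.0f);
}
```

With source alpha 0x01 and mask alpha 0x7b, mul_8bit() returns 0 while
mul_float() returns a small nonzero value, so only the floating point
path avoids the division by zero.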
2014-01-04 16:13:27 -05:00
Søren Sandmann
8f38243163 Soft Light: Consistent approach to division by zero
The Soft Light operator has several branches. One of them is decided
based on whether 2 * s is less than or equal to 2 * sa. In floating
point implementations, when those two values are very close to each
other, it may not be completely predictable which branch we hit.

This is a problem because in one branch, when destination alpha is
zero, we get the result

      r = d * sa

and in the other we get

      r = 0

So when d and sa are not 0, this causes two different results to be
returned from essentially identical input values. In other words,
there is a discontinuity in the current implementation.

This patch arbitrarily changes the second branch such that it now returns
d * sa instead. There is no deep meaning behind this, because
essentially this is an attempt to assign meaning to division by zero,
and all that is required is that that meaning doesn't depend on minute
differences in input values.

This makes the number of failed pixels in pixel-test go down to 347.
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
89662adf77 pixman-combine32.c: Fix bugs related to integer promotion
In the component alpha part of the PDF_SEPARABLE_BLEND_MODE macro, the
expression ~RED_8 (m) is used. Because RED_8(m) gets promoted to int
before ~ is applied, the whole expression typically becomes some
negative value rather than (255 - RED_8(m)) as desired.

Fix this by using unsigned temporary variables.

This reduces the number of failures in pixel-test to 363.
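
The promotion issue can be reproduced in a few lines. The function names
are hypothetical, and the uint8_t temporary is just one way of keeping
the result unsigned; the actual macro may do it differently.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy pattern: m is promoted to int before ~, so the result is negative. */
int inverse_promoted(uint8_t m)
{
    return ~m;
}

/* Fixed pattern: truncate back into an unsigned 8-bit temporary. */
unsigned inverse_unsigned(uint8_t m)
{
    uint8_t t = (uint8_t) ~m;

    assert(t == 255 - m);
    return t;
}
```

For m = 0x40 the first function returns -65, while the second returns
191, which is 255 - 64 as intended.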
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
e7a99b3b0f pixman/pixman-combine32.c: Bug fixes for separable blend modes
This commit fixes four separate bugs:

1. In the computation

      (1 - sa) * d + (1 - da) * s + sa * da * B(s, d)

   we were using regular addition for all four channels, but for
   superluminescent pixels, the addition could overflow causing
   nonsensical results.

2. The variables and return types used for the results of the blend
   mode calculations were unsigned, but for various blend modes (and
   especially with superluminescent pixels), the blend mode
   calculations could be negative, resulting in underflows.

3. The blend mode computations were returned as 8-bit values, which is
   not sufficient precision (especially considering that we need
   signed results).

4. The value before the final division by 255 was not properly clamped
   to [0, 255].

This patch fixes all those bugs. The blend mode computations are now
returned as signed 16 bit values with 1 represented as 255 * 255.

With these fixes, the number of failing pixels in pixel-test goes down
from 431 to 384.
2014-01-04 16:13:27 -05:00
Søren Sandmann
fe3504d03f pixel-test.c: Add a number of pixels that have failed at some point
This commit adds a large number of pixel regressions to
pixel-test. All of these have at some point been failing in
blend-mode-test, and most of them do fail currently.

To be specific, with this commit, pixel-test reports 431 failed tests.
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
bd94c17937 test/tolerance-test: New test program
This new test program is similar to test/composite in that it relies
on the pixel_checker_t API to do tolerance based verification. But
unlike the composite test, which verifies combinations of a fixed set
of pixels, this one generates random images and verifies that those
composite correctly.

Also unlike composite, tolerance-test supports all the separable blend
mode operators in addition to the original Render operators.

When tests fail, a C struct is printed that can be pasted into
pixel-test for regression purposes.

There is an option "--forever" which causes the random seed to be set
to the current time, and then the test runs until interrupted. This is
useful for overnight runs.

This test currently fails badly due to various bugs in the blend mode
operators. Later commits will fix those.
2014-01-04 16:13:27 -05:00
Søren Sandmann
c2fd65dba3 pixel-test: Command line argument to specify the regression to run
A new command line argument allows the user to specify which one of
the regressions should be run.
2014-01-04 16:13:27 -05:00
Søren Sandmann
a692e01600 pixel-test: Add support for mask pixels
Support is added to pixel-test for verifying operations involving
masks. If a regression includes a mask, it is verified with the
pixel_checker API in both unified and component alpha modes.
2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
779ca46e98 test/check-formats.c: Add support for separable blend modes 2014-01-04 16:13:27 -05:00
Søren Sandmann Pedersen
a42af27fc0 test/utils.c: Add support for separable blend mode ops to do_composite()
The implementations are copied from the floating point pipeline, but
use double precision instead of single precision.
2014-01-04 16:13:27 -05:00
Søren Sandmann
b29d74ef0c configure.ac: Check and use -Wno-unused-local-typedefs GCC option
With GCC 4.8.2 the COMPILE_TIME_ASSERT macro produces a spurious
warning about an unused local typedef:

    In file included from pixman.c:29:0:
    pixman.c: In function 'optimize_operator':
    pixman-private.h:1019:22: warning: typedef 'compile_time_assertion' locally defined but not used [-Wunused-local-typedefs]

The flag -Wno-unused-local-typedefs suppresses that warning.
2013-12-26 09:41:53 -05:00
Julien Cristau
08ff9fa402 Upload to unstable 2013-12-17 22:04:30 +01:00
Julien Cristau
e66148cda6 Bump changelogs 2013-12-08 15:33:18 +01:00
Julien Cristau
9c9f210896 pixman 0.32.4 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJSiXHSAAoJEA/daC2XTKcqUtQQALogcIuKShzPrZCnNke9jXJF
 Ujq4M0fHMBru4Uzqq+MCp02ssWLnoBvW8emwzalzt3xulZU+fUeYs1u56Epi1SnG
 oHt5ah1ZSicAwNBlDdflKgqnBGdsFJg5yj9F09zwZeBEBYwhJBaTQfIK6i0sww3s
 MQ66uANWsJQsW8/wFq5pJLmmmSWlelEHXz5pcjLavaYkOIITSzTeZF+xOvhBUwv2
 1zTsv9c2k05cR+8UKDpDURrEn5Cp5uQo0iV9FpKsyKL01ukqCbuBRWVxjSbXCmtu
 GWZ4qDLjScM8sCAQbZF4/MZuGoytC2cKxaWnjKn4h1L4+qZMIvjmcAlsP7CfJ14o
 AtWkYvU6rlY5m4je8Lh3QMbLkSTNFR8ix97jDhFmZlEQA3EXnPvme2YFecOmlVgF
 c1mVhVBR2Je/Hav0LiIne7151dFJ+THCAPOLcVqDCzRw2BMjAfp0Kx7qnFiXyvEt
 zgpoAmybf1kHOCpEugHGKwe4elCTvjq7xv3+JwkzqvV7uIvk1/J0ctIkBsboeMsP
 nvIJ8nBj9fNuJdP++jNX1xsi3C0LM16Bhd5n8wZcX4sqekSVj+LDht4JBPalMC7A
 m50kD9XlFSJ8UyoKrKMGx71XLnkGgT1hbQgE9ML8MumXZZMpjwIb9p7g7D2A1hXM
 /1kzDHmAaqbLcmFBTyO9
 =klDd
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.32.4' into debian-unstable

pixman 0.32.4 release

Conflicts:
	configure.ac
2013-12-08 15:28:54 +01:00
Søren Sandmann
945ab7a6f3 Soft Light: The first comparison should be <=, not <
According to the definition of soft light, the first comparison is
less-than-or-equal, not less-than.
2013-12-03 18:14:24 -05:00
Søren Sandmann
9ba3a34797 general: Support component alpha for all image types
Currently, if you attempt to use component alpha on source images or
images without RGB channels, Pixman will silently just use unified
alpha instead. This patch makes such images supported for component
alpha.

There is no particularly compelling usecase at the moment, but this
patch does get rid of a bit of special-case code both in
pixman-general.c and in test/composite.c.
2013-11-23 20:30:33 -05:00
Maarten Lankhorst
166899c913 release to sid 2013-11-18 15:55:02 +01:00
Maarten Lankhorst
7d8317abd4 Cherry-pick upstream bugfixes for fixing a crash when rendering invalid trapezoids. (LP: #1197921) 2013-11-18 15:54:49 +01:00
Ritesh Khadgaray
f740a26fe1 pixman_trapezoid_valid(): Fix underflow when bottom is close to MIN_INT
If t->bottom is close to MIN_INT (probably an invalid value), subtracting
top can lead to underflow, which causes crashes. The attached patch
fixes the issue.

This fixes bug 67484.
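
A sketch of the underflow and one overflow-free way to test for a
positive height. The type and function names are hypothetical stand-ins
(a 32-bit signed fixed-point value is assumed), and the real check in
pixman_trapezoid_valid() may be written differently.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16;   /* hypothetical stand-in for the fixed-point type */

/* True difference, computed in 64 bits so it cannot overflow. */
int64_t height_wide(fixed_16_16 top, fixed_16_16 bottom)
{
    return (int64_t) bottom - (int64_t) top;
}

/* Overflow-free test: compare the two values instead of subtracting. */
int height_is_positive(fixed_16_16 top, fixed_16_16 bottom)
{
    int positive = bottom > top;

    assert(positive == (height_wide(top, bottom) > 0));
    return positive;
}
```

With bottom at INT32_MIN and top at 1, the true difference does not fit
in 32 bits, so a 32-bit "bottom - top > 0" check is unreliable, while
the comparison correctly reports a non-positive height.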

(cherry picked from commit 5e14da97f1)
2013-11-18 15:08:42 +01:00
Søren Sandmann Pedersen
f4acde9c71 test/trap-crasher.c: Add trapezoid that demonstrates a crash
This trapezoid causes a crash due to an underflow in the
pixman_trapezoid_valid().

Test case from Ritesh Khadgaray.

(cherry picked from commit 2f876cf867)
2013-11-18 15:08:41 +01:00
Matt Turner
dae5a758e2 Post-release version bump to 0.32.5 2013-11-17 17:48:54 -08:00
Matt Turner
4b3a66b05e Pre-release version bump to 0.32.4 2013-11-17 17:46:52 -08:00
Søren Sandmann
97a655d5ca test/utils.c: Make the stack unaligned only on 32 bit Windows
The call_test_function() contains some assembly that deliberately
causes the stack to be aligned to 32 bits rather than 128 bits on
x86-32. The intention is to catch bugs that surface when pixman is
called from code that only uses a 32 bit alignment.

However, recent versions of GCC apparently make the assumption (either
accidentally or deliberately) that the incoming stack is aligned
to 128 bits, where older versions only seemed to make this assumption
when compiling with -msse2. This causes the vector code in the PRNG to
now segfault when called from call_test_function() on x86-32.

This patch fixes that by only making the stack unaligned on 32 bit
Windows, where it would definitely be incorrect for GCC to assume that
the incoming stack is aligned to 128 bits.

V2: Put "defined(...)" around __GNUC__

Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=491110
(cherry picked from commit f473fd1e75)
2013-11-17 17:45:56 -08:00
Jakub Bogusz
5a313af74e Fix the SSSE3 CPUID detection.
SSSE3 is detected by bit 9 of ECX, but we were checking bit 9 of EDX,
which is APIC, leading to SSSE3 routines being called on CPUs without
SSSE3.

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 8487dfbcd0)
2013-11-17 17:45:54 -08:00
Søren Sandmann
f473fd1e75 test/utils.c: Make the stack unaligned only on 32 bit Windows
The call_test_function() contains some assembly that deliberately
causes the stack to be aligned to 32 bits rather than 128 bits on
x86-32. The intention is to catch bugs that surface when pixman is
called from code that only uses a 32 bit alignment.

However, recent versions of GCC apparently make the assumption (either
accidentally or deliberately) that the incoming stack is aligned
to 128 bits, where older versions only seemed to make this assumption
when compiling with -msse2. This causes the vector code in the PRNG to
now segfault when called from call_test_function() on x86-32.

This patch fixes that by only making the stack unaligned on 32 bit
Windows, where it would definitely be incorrect for GCC to assume that
the incoming stack is aligned to 128 bits.

V2: Put "defined(...)" around __GNUC__

Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=491110
2013-11-17 17:44:51 -08:00
Jakub Bogusz
8487dfbcd0 Fix the SSSE3 CPUID detection.
SSSE3 is detected by bit 9 of ECX, but we were checking bit 9 of EDX,
which is APIC, leading to SSSE3 routines being called on CPUs without
SSSE3.
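
For illustration only (a minimal sketch, not the code in pixman-x86.c):
the struct, macro and function names below are hypothetical. It takes
the register values returned by CPUID leaf 1 as plain integers and tests
bit 9 of ECX, which is the SSSE3 flag; the same bit position in EDX is
the APIC flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the CPUID (EAX = 1) output registers. */
typedef struct
{
    uint32_t ecx;
    uint32_t edx;
} cpuid_leaf1_t;

#define LEAF1_ECX_SSSE3 (1u << 9) /* ECX bit 9: SSSE3 */
#define LEAF1_EDX_APIC  (1u << 9) /* EDX bit 9: on-chip APIC */

/* Returns nonzero only when the SSSE3 bit in ECX is set. */
static int
have_ssse3 (const cpuid_leaf1_t *regs)
{
    assert (regs != NULL);
    return (regs->ecx & LEAF1_ECX_SSSE3) != 0;
}
```

A CPU that reports only the APIC bit in EDX is correctly rejected by
have_ssse3(), which is the case the old check got wrong.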

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-12 12:59:42 -08:00
Søren Sandmann
917a52003d Post-release version bump to 0.32.3 2013-11-11 19:55:18 -05:00
Søren Sandmann
a980f83a68 Pre-release version bump to 0.32.2 2013-11-11 19:44:54 -05:00
Søren Sandmann
7410073110 demos/Makefile.am: Move EXTRA_DIST outside "if HAVE_GTK"
Without this, if tarballs are generated on a system that doesn't have
GTK+ 2 development headers available, the files in EXTRA_DIST will not
be included, which then causes builds from the tarball to fail on
systems that do have GTK+ 2 headers available.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=71465
2013-11-11 19:28:30 -05:00
Søren Sandmann
e2e3817021 demos/Makefile.am: Move EXTRA_DIST outside "if HAVE_GTK"
Without this, if tarballs are generated on a system that doesn't have
GTK+ 2 development headers available, the files in EXTRA_DIST will not
be included, which then causes builds from the tarball to fail on
systems that do have GTK+ 2 headers available.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=71465
2013-11-11 19:13:31 -05:00
Andrea Canciani
950d1310f7 test: Fix the win32 build
The win32 build has no config.h, so HAVE_CONFIG_H should be checked
before including it, as in utils.h.
2013-11-11 19:09:46 -05:00
Andrea Canciani
9bab46e9b8 test: Fix the win32 build
The win32 build has no config.h, so HAVE_CONFIG_H should be checked
before including it, as in utils.h.
2013-11-11 19:09:28 -05:00
Søren Sandmann
7a00965d7a Post-release version bump to 0.32.1 2013-11-11 19:07:35 -05:00
Søren Sandmann
ca5a4dec44 Post-release version bump to 0.33.1 2013-11-10 18:17:12 -05:00
Søren Sandmann
895e7e05b7 Pre-release version bump to 0.32.0 2013-11-10 18:05:47 -05:00
Søren Sandmann Pedersen
8cbc7da4e5 Post-release version bump to 0.31.3 2013-11-01 20:52:00 -04:00
Søren Sandmann Pedersen
99e8605be0 Pre-release version bump to 0.31.2 2013-11-01 20:39:46 -04:00
Ritesh Khadgaray
5e14da97f1 pixman_trapezoid_valid(): Fix underflow when bottom is close to MIN_INT
If t->bottom is close to MIN_INT (probably invalid value), subtracting
top can lead to underflow which causes crashes.  Attached patch will
fix the issue.

This fixes bug 67484.
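
As an illustration of the underflow (a minimal sketch, not the patch
itself): the struct and function names below are hypothetical stand-ins
for the top and bottom fields of a trapezoid. The true difference
bottom - top does not fit in 32 bits when bottom is near the minimum
integer, whereas comparing the two values directly cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two trapezoid fields involved. */
typedef struct
{
    int32_t top;
    int32_t bottom;
} trap_span_t;

/* The true difference, computed in 64 bits so that it cannot wrap. */
static int64_t
span_height (const trap_span_t *t)
{
    assert (t != NULL);
    return (int64_t) t->bottom - (int64_t) t->top;
}

/* Overflow-free validity check: compare instead of subtracting. */
static int
span_is_valid (const trap_span_t *t)
{
    assert (t != NULL);
    return t->bottom > t->top;
}
```

With top = 100 and bottom = INT32_MIN + 1, span_height() returns a value
below INT32_MIN, so a 32 bit subtraction would overflow, while
span_is_valid() simply reports the span as invalid.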
2013-11-01 20:24:57 -04:00
Søren Sandmann Pedersen
2f876cf867 test/trap-crasher.c: Add trapezoid that demonstrates a crash
This trapezoid causes a crash due to an underflow in the
pixman_trapezoid_valid().

Test case from Ritesh Khadgaray.
2013-11-01 20:24:27 -04:00
Brad Smith
8ef7e0d18e Fix pixman build with older GCC releases
The following patch fixes building pixman with older GCC releases
such as GCC 3.3 and older (OpenBSD; some older archs use GCC 3.3.6)
by changing the method of detecting the presence of __builtin_clz
to utilizing an autoconf check to determine its presence. Compilers
that pretend to be GCC, implement __builtin_clz and are already
utilizing the intrinsic include LLVM/Clang, Open64, EKOPath and
PCC.
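
A sketch of how code can consume such a configure result, assuming the
check defines a macro named HAVE_BUILTIN_CLZ (the macro name and the
function below are assumptions for illustration): use the intrinsic when
it is available and fall back to a portable loop otherwise.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leading zero bits in a nonzero 32 bit value. */
static int
count_leading_zeros (uint32_t x)
{
    assert (x != 0); /* the result is undefined for 0 in both variants */
#ifdef HAVE_BUILTIN_CLZ
    return __builtin_clz (x);
#else
    {
        int n = 0;

        /* Shift left until the top bit is set, counting the steps. */
        while (!(x & 0x80000000u))
        {
            x <<= 1;
            n++;
        }
        return n;
    }
#endif
}
```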
2013-11-01 20:14:33 -04:00
Søren Sandmann Pedersen
3c2f4b6517 pixman-glyph.c: Add __force_align_arg_pointer to composite functions
The functions pixman_composite_glyphs_no_mask() and
pixman_composite_glyphs() can call into code compiled with -msse2,
which requires the stack to be aligned to 16 bytes. Since the ABIs on
Windows and Linux for x86-32 don't provide this guarantee, we need to
use this attribute to make GCC generate a prologue that realigns the
stack.

This fixes the crash introduced in the previous commit and also

   https://bugs.freedesktop.org/show_bug.cgi?id=70348

and

   https://bugs.freedesktop.org/show_bug.cgi?id=68300
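
For reference, a minimal sketch of how the GCC attribute can be applied
to an entry point. The macro and function names are hypothetical, and
the guard assumes the attribute is only wanted on 32 bit x86.

```c
#include <assert.h>

/* The attribute is x86-specific, so only apply it on 32 bit x86. */
#if defined(__GNUC__) && defined(__i386__)
#define FORCE_ALIGN_ARG_POINTER __attribute__ ((force_align_arg_pointer))
#else
#define FORCE_ALIGN_ARG_POINTER
#endif

int composite_entry_point (int n_glyphs);

/* Hypothetical public entry point that may run SSE2 code internally.
 * On x86-32 GCC emits a prologue here that realigns the stack. */
FORCE_ALIGN_ARG_POINTER int
composite_entry_point (int n_glyphs)
{
    assert (n_glyphs >= 0);
    return n_glyphs; /* placeholder for the actual compositing work */
}
```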
2013-10-17 11:14:14 -04:00
Søren Sandmann Pedersen
3dce229772 utils.c: On x86-32 unalign the stack before calling test_function
GCC when compiling with -msse2 and -mssse3 will assume that the stack
is aligned to 16 bytes even on x86-32 and accordingly issue movdqa
instructions for stack allocated variables.

But despite what GCC thinks, the standard ABI on x86-32 only requires
a 4-byte aligned stack. This is true at least on Windows, but there
also was (and maybe still is) Linux code in the wild that assumed
this. When such code calls into pixman and hits something compiled
with -msse2, we get a segfault from the unaligned movdqas.

Pixman has worked around this issue in the past with the gcc attribute
"force_align_arg_pointer" but the problem has resurfaced now in

    https://bugs.freedesktop.org/show_bug.cgi?id=68300

because pixman_composite_glyphs() is missing this attribute.

This patch makes fuzzer_test_main() call the test_function through a
trampoline, which, on x86-32, has a bit of assembly that deliberately
avoids aligning the stack to 16 bytes as GCC normally expects. The
result is that glyph-test now crashes.

V2: Mark caller-save registers as clobbered, rather than using
noinline on the trampoline.
2013-10-17 11:14:14 -04:00
Siarhei Siamashka
9e81419ed5 configure.ac: check and use -Wdeclaration-after-statement GCC option
The accidental use of declaration after statement breaks compilation
with C89 compilers such as MSVC. Assuming that MSVC is one of the
supported compilers, it makes sense to ask GCC to at least report
warnings for such problematic code.
2013-10-14 00:27:04 +03:00
Siarhei Siamashka
a863bbcce0 sse2: bilinear fast path for src_x888_8888
Running cairo-perf-trace benchmark on Intel Core2 T7300:

Before:
[  0]    image    t-firefox-canvas-swscroll    1.989    2.008   0.43%    8/8
[  1]    image        firefox-canvas-scroll    4.574    4.609   0.50%    8/8

After:
[  0]    image    t-firefox-canvas-swscroll    1.404    1.418   0.51%    8/8
[  1]    image        firefox-canvas-scroll    4.228    4.259   0.36%    8/8
2013-10-14 00:26:51 +03:00
Søren Sandmann Pedersen
8f75f638ab configure.ac: Add check for pmulhuw assembly
Clang 3.0 chokes on the following bit of assembly

    asm ("pmulhuw %1, %0\n\t"
        : "+y" (__A)
        : "y" (__B)
    );

from pixman-mmx.c with this error message:

    fatal error: error in backend: Unsupported asm: input constraint
        with a matching output constraint of incompatible type!

So add a check in configure to only enable MMX when the compiler can
deal with it.
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
09a62d4dbc scale.c: Use int instead of kernel_t for values in named_int_t
The 'value' field in the 'named_int_t' struct is used for both
pixman_repeat_t and pixman_kernel_t values, so the type should be int,
not pixman_kernel_t.

Fixes some warnings like this

scale.c:124:33: warning: implicit conversion from enumeration
      type 'pixman_repeat_t' to different enumeration type
      'pixman_kernel_t' [-Wconversion]
    { "None",                   PIXMAN_REPEAT_NONE },
    ~                           ^~~~~~~~~~~~~~~~~~

when compiled with clang.
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
9367243801 pixman-combine32.c: Make Color Burn routine follow the math more closely
For superluminescent destinations, the old code could underflow in

    uint32_t r = (ad - d) * as / s;

when (ad - d) was negative. The new code avoids this problem (and
therefore causes changes in the checksums of thread-test and
blitters-test), but it is likely still buggy due to the use of
unsigned variables and other issues in the blend mode code.
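
To illustrate the underflow in (ad - d), here is a small sketch with
hypothetical helper names. With unsigned operands the difference wraps
modulo 2^32 when d is larger than ad; with signed operands it simply
becomes negative.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned version: wraps around when d > ad. */
static uint32_t
diff_unsigned (uint32_t ad, uint32_t d)
{
    return ad - d;
}

/* Signed version: the difference is allowed to go negative. */
static int32_t
diff_signed (int32_t ad, int32_t d)
{
    return ad - d;
}
```

For ad = 100 and d = 200 the unsigned difference is 4294967196, which
then poisons the following multiplication and division, while the
signed difference is -100.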
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
105fa74fad pixman-combine32: Make Color Dodge routine follow the math more closely
Change blend_color_dodge() to follow the math in the comment more
closely.

Note, the new code here is in some sense worse than the old code
because it can now underflow the unsigned variables when the source is
superluminescent and (as - s) is therefore negative. The old code was
careful to clamp to 0.

But for superluminescent variables we really need the ability for the
blend function to become negative, and so the solution to the underflow
problem is to just use signed variables. The use of unsigned variables
is a general problem in all of the blend mode code that will have to
be solved later.

The CRC32 values in thread-test and blitters-test are updated to
account for the changes in output.
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
2527a72432 pixman-combine32: Rename a number of variables from sa/sca to as/s
There are no semantic changes, just variable renames. The motivation
for these renames is so that the names are shorter and better match
the ones used in the comments.
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
eaa4778c42 pixman-combine32: Improve documentation for blend mode operators
This commit overhauls the comments in pixman-combine32.c regarding
blend modes:

- Add a link to the PDF supplement that clarifies the specification of
  ColorBurn and ColorDodge

- Clarify how the formulas for premultiplied colors are derived from
  the ones in the PDF specifications

- Write out the derivation of the formulas in each blend routine
2013-10-12 15:04:27 -04:00
Søren Sandmann Pedersen
4bf1502fe8 pixman-combine32.c: Formatting fixes
Fix a bunch of spacing issues.

V2: More spacing issues, in the _ca combiners
2013-10-12 15:04:00 -04:00
Andrea Canciani
54be1a52f7 Fix thread-test on non-OpenMP systems
The non-reentrant versions of prng_* functions are thread-safe only in
OpenMP-enabled builds.

Fixes thread-test failing when compiled with Clang (both on Linux and
on MacOS).
2013-10-09 18:23:27 +02:00
Andrea Canciani
0af2fcaebc Add support for SSSE3 to the MSVC build system
Handle SSSE3 just like MMX and SSE2.
2013-10-09 14:23:12 +02:00
Andrea Canciani
e4d9c623d3 Fix build of check-formats on MSVC
Fixes

check-formats.obj : error LNK2019: unresolved external symbol
_strcasecmp referenced in function _format_from_string

check-formats.obj : error LNK2019: unresolved external symbol
_snprintf referenced in function _list_operators
2013-10-09 14:23:11 +02:00
Andrea Canciani
96ad6ebd8b Fix building of "other" programs on MSVC
In d1434d112c the benchmarks have been
extended to include other programs as well and the variable names have
been updated accordingly in the autotools-based build system, but not
in the MSVC one.
2013-10-09 14:23:11 +02:00
Andrea Canciani
31ac784f34 Fix build on MSVC
After a4c79d695d the MMX and SSE2 code
has some declarations after the beginning of a block, which is not
allowed by MSVC.

Fixes multiple errors like:

pixman-mmx.c(3625) : error C2275: '__m64' : illegal use of this type
as an expression

pixman-sse2.c(5708) : error C2275: '__m128i' : illegal use of this
type as an expression
2013-10-09 14:23:11 +02:00
Søren Sandmann Pedersen
c89f4c8266 fast: Swap image and iter flags in generated fast paths
The generated fast paths that were moved into the 'fast'
implementation in ec0e38cbb7 had their
image and iter flag arguments swapped; as a result, none of the fast
paths were ever called.
2013-10-04 14:11:57 -04:00
Siarhei Siamashka
7d05a7f4dc vmx: there is no need to handle unaligned destination anymore
So the redundant variables, memory reads/writes and reshuffles
can be safely removed. For example, this makes the inner loop
of the 'vmx_combine_add_u_no_mask' function much simpler.

Before:

    7a20:7d a8 48 ce lvx     v13,r8,r9
    7a24:7d 80 48 ce lvx     v12,r0,r9
    7a28:7d 28 50 ce lvx     v9,r8,r10
    7a2c:7c 20 50 ce lvx     v1,r0,r10
    7a30:39 4a 00 10 addi    r10,r10,16
    7a34:10 0d 62 eb vperm   v0,v13,v12,v11
    7a38:10 21 4a 2b vperm   v1,v1,v9,v8
    7a3c:11 2c 6a eb vperm   v9,v12,v13,v11
    7a40:10 21 4a 00 vaddubs v1,v1,v9
    7a44:11 a1 02 ab vperm   v13,v1,v0,v10
    7a48:10 00 0a ab vperm   v0,v0,v1,v10
    7a4c:7d a8 49 ce stvx    v13,r8,r9
    7a50:7c 00 49 ce stvx    v0,r0,r9
    7a54:39 29 00 10 addi    r9,r9,16
    7a58:42 00 ff c8 bdnz+   7a20 <.vmx_combine_add_u_no_mask+0x120>

After:

    76c0:7c 00 48 ce lvx     v0,r0,r9
    76c4:7d a8 48 ce lvx     v13,r8,r9
    76c8:39 29 00 10 addi    r9,r9,16
    76cc:7c 20 50 ce lvx     v1,r0,r10
    76d0:10 00 6b 2b vperm   v0,v0,v13,v12
    76d4:10 00 0a 00 vaddubs v0,v0,v1
    76d8:7c 00 51 ce stvx    v0,r0,r10
    76dc:39 4a 00 10 addi    r10,r10,16
    76e0:42 00 ff e0 bdnz+   76c0 <.vmx_combine_add_u_no_mask+0x120>
2013-10-01 23:43:44 +03:00
Siarhei Siamashka
b6c5ba06f0 vmx: align destination to fix valgrind invalid memory writes
The SIMD optimized inner loops in the VMX/Altivec code are trying
to emulate unaligned accesses to the destination buffer. For each
4 pixels (which fit into a 128-bit register) the current
implementation:
  1. first performs two aligned reads, which cover the needed data
  2. reshuffles bytes to get the needed data in a single vector register
  3. does all the necessary calculations
  4. reshuffles bytes back to their original location in two registers
  5. performs two aligned writes back to the destination buffer

Unfortunately, if the destination buffer is unaligned and
the width is a perfect multiple of 4 pixels, we may have some writes
crossing the boundaries of the destination buffer. In a multithreaded
environment this may potentially corrupt the data outside of the
destination buffer if it is concurrently read and written by some
other thread.

The valgrind report for blitters-test is full of:

==23085== Invalid write of size 8
==23085==    at 0x1004B0B4: vmx_combine_add_u (pixman-vmx.c:1089)
==23085==    by 0x100446EF: general_composite_rect (pixman-general.c:214)
==23085==    by 0x10002537: test_composite (blitters-test.c:363)
==23085==    by 0x1000369B: fuzzer_test_main._omp_fn.0 (utils.c:733)
==23085==    by 0x10004943: fuzzer_test_main (utils.c:728)
==23085==    by 0x10002C17: main (blitters-test.c:397)
==23085==  Address 0x5188218 is 0 bytes after a block of size 88 alloc'd
==23085==    at 0x4051DA0: memalign (vg_replace_malloc.c:581)
==23085==    by 0x4051E7B: posix_memalign (vg_replace_malloc.c:709)
==23085==    by 0x10004CFF: aligned_malloc (utils.c:833)
==23085==    by 0x10001DCB: create_random_image (blitters-test.c:47)
==23085==    by 0x10002263: test_composite (blitters-test.c:283)
==23085==    by 0x1000369B: fuzzer_test_main._omp_fn.0 (utils.c:733)
==23085==    by 0x10004943: fuzzer_test_main (utils.c:728)
==23085==    by 0x10002C17: main (blitters-test.c:397)

This patch addresses the problem by first aligning the destination
buffer at a 16 byte boundary in each combiner function. This trick
is borrowed from the pixman SSE2 code.

It allows the new thread-test to pass on PowerPC VMX/Altivec systems and
also resolves the "make check" failure reported for POWER7 hardware:
    http://lists.freedesktop.org/archives/pixman/2013-August/002871.html
2013-10-01 23:42:56 +03:00
Søren Sandmann Pedersen
0438435b9c test: Add new thread-test program
This test program allocates an array of 16 * 7 uint32_ts and spawns 16
threads that each use 7 of the allocated uint32_ts as a destination
image for a large number of composite operations. Each thread then
computes and returns a checksum for the image. Finally, the main
thread computes a checksum of the checksums and verifies that it
matches expectations.

The purpose of this test is to catch errors where memory outside images
is read and then written back. Such out-of-bounds accesses are broken
when multiple threads are involved, because the threads will race to
read and write the shared memory.

V2:
- Incorporate fixes from Siarhei for endianness and undefined behavior
  regarding argument evaluation
- Make the images 7 pixels wide since the bug only happens when the
  composite width is greater than 4.
- Compute a checksum of the checksums so that you don't have to
  update 16 values if something changes.

V3: Remove stray dollar sign
2013-10-01 23:33:57 +03:00
Søren Sandmann Pedersen
6582950407 Rename HAVE_PTHREAD_SETSPECIFIC to HAVE_PTHREADS
The test for pthread_setspecific() can be used as a general test for
whether pthreads are available, so rename the variable from
HAVE_PTHREAD_SETSPECIFIC to HAVE_PTHREADS and run the test even when
better support for thread local variables is available.

However, the pthread arguments are still only added to CFLAGS and
LDFLAGS when pthread_setspecific() is used for thread local variables.

V2: AC_SUBST(PTHREAD_CFLAGS)
2013-10-01 23:33:35 +03:00
Søren Sandmann Pedersen
b513b3dffe blitters-test: Remove unused variable 2013-09-29 16:47:53 -04:00
Søren Sandmann Pedersen
fa0559eb71 utils.c: Make image_endian_swap() deal with negative strides
Use a temporary variable s containing the absolute value of the stride
as the upper bound in the inner loops.

V2: Do this for the bpp == 16 case as well
2013-09-27 17:11:08 -04:00
Søren Sandmann Pedersen
ff682089ce utils.c: Make print_image actually cope with negative strides
Commit 4312f07736 claimed to have made
print_image() work with negative strides, but it didn't actually
work. When the stride was negative, the image buffer would be accessed
as if the stride were positive.

Fix the bug by not changing the stride variable and instead using a
temporary, s, that contains the absolute value of stride.
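
A minimal sketch of the pattern, using a hypothetical helper rather than
print_image() itself: the row address is computed with the signed
stride, while the number of bytes visited per row comes from a
temporary s holding its absolute value. The stride is assumed to be in
bytes here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sum every byte of an image whose rows are 'stride' bytes apart.
 * 'first_row' points at row 0; with a negative stride the following
 * rows lie at lower addresses.
 */
static uint32_t
sum_image_bytes (const uint8_t *first_row, int height, int stride)
{
    int s = stride < 0 ? -stride : stride; /* bytes per row */
    uint32_t sum = 0;
    int i, j;

    assert (first_row != NULL);
    for (i = 0; i < height; ++i)
    {
        /* The row address uses the signed stride ... */
        const uint8_t *row = first_row + (ptrdiff_t) i * stride;

        /* ... but the inner loop bound uses its absolute value. */
        for (j = 0; j < s; ++j)
            sum += row[j];
    }
    return sum;
}
```

Walking a 3 x 2 byte buffer from its first row with stride 2, or from
its last row with stride -2, visits the same bytes and gives the same
sum.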
2013-09-26 13:35:29 -04:00
Søren Sandmann Pedersen
ec0e38cbb7 Move generated affine fetchers into pixman-fast-path.c
The generated fetchers for NEAREST, BILINEAR, and
SEPARABLE_CONVOLUTION filters are fast paths and so they belong in
pixman-fast-path.c
2013-09-26 10:21:29 -04:00
Søren Sandmann Pedersen
96e163d2fd Move bits_image_fetch_bilinear_no_repeat_8888 into pixman-fast-path.c
This iterator is really a fast path, so it belongs in the fast path
implementation.
2013-09-26 10:21:29 -04:00
Søren Sandmann Pedersen
8d465c2a5d fast, ssse3: Simplify logic to fetch lines in the bilinear iterators
Instead of having logic to swap the lines around when one of them
doesn't match, store the two lines in an array and use the least
significant bit of the y coordinate as the index into that
array. Since the two lines always have different least significant
bits, they will never collide.

The effect is that lines corresponding to even y coordinates are
stored in info->lines[0] and lines corresponding to odd y coordinates
are stored in info->lines[1].
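
The indexing scheme can be sketched as follows. This is a simplified
illustration with hypothetical types; a real iterator would store
scaled pixel data in each slot rather than just the row number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache slot: which source row it currently holds. */
typedef struct
{
    int y; /* cached source row, or -1 if the slot is empty */
} line_t;

typedef struct
{
    line_t lines[2];
    int    fetches; /* how many times a row had to be fetched */
} bilinear_info_t;

static void
info_init (bilinear_info_t *info)
{
    info->lines[0].y = -1;
    info->lines[1].y = -1;
    info->fetches = 0;
}

/* Return the slot for row y, fetching the row only on a cache miss. */
static line_t *
get_line (bilinear_info_t *info, int y)
{
    line_t *line;

    assert (info != NULL);
    line = &info->lines[y & 1];
    if (line->y != y)
    {
        line->y = y; /* a real iterator would scale the row here */
        info->fetches++;
    }
    return line;
}
```

Requesting rows 4 and 5, then 5 and 6, costs three fetches: row 5 stays
in lines[1] and row 6 replaces row 4 in lines[0], with no swapping.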
2013-09-26 10:20:43 -04:00
Søren Sandmann Pedersen
aa5c45254e test: Test negative strides
Pixman supports negative strides, but up until now they haven't been
tested outside of stress-test. This commit adds testing of negative
strides to blitters-test, scaling-test, affine-test, rotate-test, and
composite-traps-test.
2013-09-19 21:37:56 -04:00
Søren Sandmann Pedersen
4312f07736 test: Share the image printing code
The affine-test, blitters-test, and scaling-test all have the ability
to print out the bytes of the destination image. Share this code by
moving it to utils.c.

At the same time make the code work correctly with negative strides.
2013-09-19 21:37:56 -04:00
Søren Sandmann Pedersen
51d7135456 {scaling,affine,composite-traps}-test: Use compute_crc32_for_image()
By using this function instead of compute_crc32() the alpha masking
code and the call to image_endian_swap() are not duplicated.
2013-09-19 21:37:56 -04:00
Søren Sandmann Pedersen
75506e6367 pixman-filter.c: Use 65536, not 65535, for fixed point conversion
Converting a double precision number to 16.16 fixed point should be
done by multiplying with 65536.0, not 65535.0.

The bug could potentially cause certain filters that would otherwise
leave the image bit-for-bit unchanged under an identity
transformation, to not do so, but the numbers are close enough that
there weren't any visual differences.
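
A small sketch of the difference, with hypothetical function names and
an int32_t standing in for the 16.16 fixed point type: 1.0 must map to
65536, which is the fixed point representation of one.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t; /* stand-in for a 16.16 fixed point type */

/* Correct conversion: scale by 2^16. */
static fixed_16_16_t
double_to_fixed (double d)
{
    assert (d >= -32768.0 && d < 32768.0);
    return (fixed_16_16_t) (d * 65536.0);
}

/* The buggy variant, kept only for comparison. */
static fixed_16_16_t
double_to_fixed_wrong (double d)
{
    return (fixed_16_16_t) (d * 65535.0);
}
```

With the wrong factor, 1.0 converts to 65535, one unit short of fixed
point one, which is why an identity transformation was not guaranteed
to be exact.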
2013-09-16 17:54:46 -04:00
Søren Sandmann Pedersen
9899a7bae8 demos/scale.ui: Allow subsample_bits to be 0
The separable convolution filter supports a subsample_bits of 0 which
corresponds to no subsampling at all, so allow this value to be used
in the scale demo.
2013-09-16 17:54:46 -04:00
Søren Sandmann Pedersen
58a79dfe6d ssse3: Add iterator for separable bilinear scaling
This new iterator uses the SSSE3 instructions pmaddubsw and pabsw to
implement a fast iterator for bilinear scaling.

There is a graph here recording the per-pixel time for various
bilinear scaling algorithms as reported by scaling-bench:

    http://people.freedesktop.org/~sandmann/ssse3.v2/ssse3.v2.png

As the graph shows, this new iterator is clearly faster than the
existing C iterator, and when used with an SSE2 combiner, it is also
faster than the existing SSE2 fast paths for upscaling, though not for
downscaling.

Another graph:

    http://people.freedesktop.org/~sandmann/ssse3.v2/movdqu.png

shows the difference between writing to iter->buffer with movdqa,
movdqu on an aligned buffer, and movdqu on a deliberately unaligned
buffer. Since the differences are very small, the patch here avoids
using movdqa because imposing alignment restrictions on iter->buffer
may interfere with other optimizations, such as writing directly to
the destination image.

The data was measured with scaling-bench on a Sandy Bridge Core
i3-2350M @ 2.3GHz and is available in this directory:

    http://people.freedesktop.org/~sandmann/ssse3.v2/

where there is also a Gnumeric spreadsheet ssse3.v2.gnumeric
containing the per-pixel values and the graph.

V2:
- Use uintptr_t instead of unsigned long in the ALIGN macro
- Use _mm_storel_epi64 instead of _mm_cvtsi128_si64 as the latter form
  is not available on x86-32.
- Use _mm_storeu_si128() instead of _mm_store_si128() to avoid
  imposing alignment requirements on iter->buffer
2013-09-16 16:50:35 -04:00
Søren Sandmann Pedersen
f1792b3221 Add empty SSSE3 implementation
This commit adds a new, empty SSSE3 implementation and the associated
build system support.

configure.ac:   detect whether the compiler understands SSSE3
                intrinsics and set up the required CFLAGS

Makefile.am:    Add libpixman-ssse3.la

pixman-x86.c:   Add X86_SSSE3 feature flag and detect it in
                detect_cpu_features().

pixman-ssse3.c: New file with an empty SSSE3 implementation

V2: Remove SSSE3_LDFLAGS since it isn't necessary unless Solaris
support is added.
2013-09-16 16:50:35 -04:00
Søren Sandmann Pedersen
f10b5449a8 general: Ensure that iter buffers are aligned to 16 bytes
At the moment iter buffers are only guaranteed to be aligned to a 4
byte boundary. SIMD implementations benefit from the buffers being
aligned to 16 bytes, so ensure this is the case.

V2:
- Use uintptr_t instead of unsigned long
- allocate 3 * SCANLINE_BUFFER_LENGTH bytes on the stack rather than just
  SCANLINE_BUFFER_LENGTH
- use sizeof (stack_scanline_buffer) instead of SCANLINE_BUFFER_LENGTH
  to determine overflow
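
A sketch of the kind of rounding involved, assuming a helper macro named
ALIGN16 (the names here are illustrative, not necessarily those used in
pixman-general.c): the address is converted to uintptr_t, rounded up to
the next multiple of 16, and converted back.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address up to the next multiple of 16. */
#define ALIGN16(addr) \
    ((uint8_t *) ((((uintptr_t) (addr)) + 15) & ~(uintptr_t) 15))

/* Nonzero when p sits on a 16 byte boundary. */
static int
is_aligned16 (const void *p)
{
    return ((uintptr_t) p & 15) == 0;
}
```

Rounding up can skip as many as 15 bytes, so the underlying storage has
to be over-allocated by at least that much for the aligned buffer to
keep its full length.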
2013-09-16 16:50:35 -04:00
Siarhei Siamashka
700db9d872 sse2: faster bilinear scaling (pack 4 pixels to write with MOVDQA)
The loops are already unrolled, so it was just a matter of packing
4 pixels into a single XMM register and doing aligned 128-bit
writes to memory via MOVDQA instructions for the SRC compositing
operator fast path. For the other fast paths, this XMM register
is also directly routed to further processing instead of doing
extra reshuffling. This replaces "8 PACKSSDW/PACKUSWB + 4 MOVD"
instructions with "3 PACKSSDW/PACKUSWB + 1 MOVDQA" per 4 pixels,
which results in a clear performance improvement.

There are also some other (less important) tweaks:

1. Convert 'pixman_fixed_t' to 'intptr_t' before using it as an
   index for addressing memory. The problem is that 'pixman_fixed_t'
   is a 32-bit data type and it has to be extended to 64-bit
   offsets, which needs extra instructions on 64-bit systems.

2. Allow recalculating the horizontal interpolation weights only
   once per 4 pixels by treating the XMM register as four pairs
   of 16-bit values. Each of these 16-bit/16-bit pairs can be
   replicated to fill the whole 128-bit register by using PSHUFD
   instructions. So we get "3 PADDW/PSRLW + 4 PSHUFD" instructions
   per 4 pixels instead of "12 PADDW/PSRLW" per 4 pixels
   (or "3 PADDW/PSRLW" per each pixel).

   Now a good question is whether replacing "9 PADDW/PSRLW" with
   "4 PSHUFD" is a favourable exchange. As it turns out, PSHUFD
   instructions are very fast on new Intel processors (including
   Atoms), but are rather slow on the first generation of Core2
   (Merom) and on the other processors from that time or older.
   A good instruction latency/throughput table, covering all the
   relevant processors, can be found at:
        http://www.agner.org/optimize/instruction_tables.pdf

   Enabling this optimization is controlled by the PSHUFD_IS_FAST
   define in "pixman-sse2.c".

3. One use of PSHUFD instruction (_mm_shuffle_epi32 intrinsic) in
   the older code has been also replaced by PUNPCKLQDQ equivalent
   (_mm_unpacklo_epi64 intrinsic) in PSHUFD_IS_FAST=0 configuration.
   The PUNPCKLQDQ instruction is usually faster on older processors,
   but has some side effects (instead of fully overwriting the
   destination register like PSHUFD does, it retains half of the
   original value, which may inhibit some compiler optimizations).

Benchmarks with "lowlevel-blt-bench -b src_8888_8888" using GCC 4.8.1 on
an x86-64 system with default optimizations. The results are in MPix/s:

====== Intel Core2 T7300 (2GHz) ======

old:                     src_8888_8888 =  L1: 128.69  L2: 125.07  M:124.86
                        over_8888_8888 =  L1:  83.19  L2:  81.73  M: 80.63
                      over_8888_n_8888 =  L1:  79.56  L2:  78.61  M: 77.85
                      over_8888_8_8888 =  L1:  77.15  L2:  75.79  M: 74.63

new (PSHUFD_IS_FAST=0):  src_8888_8888 =  L1: 168.67  L2: 163.26  M:162.44
                        over_8888_8888 =  L1: 102.91  L2: 100.43  M: 99.01
                      over_8888_n_8888 =  L1:  97.40  L2:  95.64  M: 94.24
                      over_8888_8_8888 =  L1:  98.04  L2:  95.83  M: 94.33

new (PSHUFD_IS_FAST=1):  src_8888_8888 =  L1: 154.67  L2: 149.16  M:148.48
                        over_8888_8888 =  L1:  95.97  L2:  93.90  M: 91.85
                      over_8888_n_8888 =  L1:  93.18  L2:  91.47  M: 90.15
                      over_8888_8_8888 =  L1:  95.33  L2:  93.32  M: 91.42

====== Intel Core i7 860 (2.8GHz) ======

old:                     src_8888_8888 =  L1: 323.48  L2: 318.86  M:314.81
                        over_8888_8888 =  L1: 187.38  L2: 186.74  M:182.46

new (PSHUFD_IS_FAST=0):  src_8888_8888 =  L1: 373.06  L2: 370.94  M:368.32
                        over_8888_8888 =  L1: 217.28  L2: 215.57  M:211.32

new (PSHUFD_IS_FAST=1):  src_8888_8888 =  L1: 401.98  L2: 397.65  M:395.61
                        over_8888_8888 =  L1: 218.89  L2: 217.56  M:213.48

The most interesting benchmark is "src_8888_8888" (because this code can
be reused for a generic non-separable SSE2 bilinear fetch iterator).

The results show that PSHUFD instructions are bad for Intel Core2 T7300
(Merom core) and good for Intel Core i7 860 (Nehalem core). Both of these
processors support SSSE3 instructions though, so they are not the primary
targets for SSE2 code. But without having any other more relevant hardware
to test, PSHUFD_IS_FAST=0 seems to be a reasonable default for SSE2 code
and old processors (until the runtime CPU features detection becomes
clever enough to recognize different microarchitectures).

(Rebased on top of patch that removes support for 8-bit bilinear
 filtering -ssp)
2013-09-16 16:48:44 -04:00
Siarhei Siamashka
e43cc9c902 test: safeguard the scaling-bench test against COW
The calloc call from pixman_image_create_bits may still
rely on http://en.wikipedia.org/wiki/Copy-on-write
Explicitly initializing the destination image results in
a more predictable behaviour.

V2:
 - allocate 16 bytes aligned buffer with aligned stride instead
   of delegating this to pixman_image_create_bits
 - use memset for the allocated buffer instead of pixman solid fill
 - repeat tests 3 times and select best results in order to filter
   out even more measurement noise
2013-09-07 17:20:09 -04:00
Søren Sandmann Pedersen
a4c79d695d Drop support for 8-bit precision in bilinear filtering
The default has been 7-bit for a while now, and the quality
improvement with 8-bit precision is not enough to justify keeping the
code around as a compile-time option.
2013-09-07 17:19:50 -04:00
Søren Sandmann Pedersen
80a232db68 Make the first argument to scanline fetchers have type bits_image_t *
Scanline fetchers haven't been used for images other than bits for a
long time, so by making the type reflect this fact, a bit of casting
can be saved in various places.
2013-09-07 17:12:18 -04:00
Matt Turner
8ad63f90cd iwmmxt: Disallow if gcc version is < 4.8.
Later versions of gcc-4.7.x are capable of generating iwMMXt
instructions properly, but gcc-4.8 contains better support and other
fixes, including iwMMXt in conjunction with hardfp. The existing 4.5
requirement was based on attempts to have OLPC use a patched gcc to
build pixman. Let's just require gcc-4.8.
2013-09-04 23:48:52 -07:00
Søren Sandmann Pedersen
02906e57bd fast_bilinear_cover_init: Don't install a finalizer on the error path
No memory is allocated in the error case, so a finalizer is not
necessary, and will cause problems if the data pointer is not
initialized to NULL.
2013-08-31 14:19:58 -04:00
Julien Cristau
d4898ac139 Upload to unstable 2013-08-13 12:08:22 +02:00
Julien Cristau
105c249996 Increase alpha-loop test timeout some more. 2013-08-13 12:03:40 +02:00
Julien Cristau
9b844940ba Includes big-endian matrix-test fix 2013-08-13 12:01:40 +02:00
Julien Cristau
2fc06503f6 Bump changelogs 2013-08-13 12:00:48 +02:00
Julien Cristau
a781ff50e7 pixman 0.30.2 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQEcBAABAgAGBQJSAlYRAAoJEIWlZJw4kjNuBQYIAKwOAc0rKtX5c/z5iuf90akR
 EfEKK5ICQ8iE55Jvmn3e9ny12yrRbP/S6++W2kKkaF6gEmab2/3YswN42/ZPn3gJ
 1RER7b+x/CxsJbJVNPbRBLdkfF2HH8RicJru7cQ98TjR2mSC9uKAyiC/podWQZvO
 96rcnXZZBZMMjZLCUYfhiNz71Frhjh3fZrodx9GUJ6Lbka74bvWJ3fB4PXoTtbbr
 H8OPkxJQw5OjGtqgwB8lbLQZmZLhuZYUGOF0wbSA2+2HvylxlPlpUgC1c3r8yn77
 MQsD/ex+CfswwxxMTrINkHSVllaoJafM8cjk8HFG3EPkW/ohdpDthhtZpmSsM5E=
 =09FF
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.30.2' into debian-unstable

pixman 0.30.2 release
2013-08-13 12:00:07 +02:00
Søren Sandmann Pedersen
3518a0dafa Add an iterator that can fetch bilinearly scaled images
This new iterator works in a separable way; that is, for a destination
scanline, it scales the two involved source scanlines and then caches
them so that they can be reused for the next destination scanlines.

There are two versions of the code, one that uses 64 bit arithmetic,
and one that uses 32 bit arithmetic only. The latter version is
used on 32 bit systems, where it is expected to be faster.

This scheme saves a substantial amount of arithmetic for larger
scalings; the per-pixel times for various configurations as reported
by scaling-bench are graphed here:

	http://people.freedesktop.org/~sandmann/separable.v2/v2.png

The "sse2" graph is the current default on x86, "mmx" is with sse2
disabled, "old c" is with sse2 and mmx disabled. The "new 32" and "new
64" graphs show times for the new code. As the graphs show, the 64 bit
version of the new code beats the "old c" for all scaling ratios.

The data was taken on a Sandy Bridge Core i3-2350M CPU @ 2.0 GHz
running in 64 bit mode.

The data used to generate the graph is available in this directory:

    http://people.freedesktop.org/~sandmann/separable.v2/

There is also a Gnumeric spreadsheet v2.gnumeric containing the
per-pixel values and the graph.

V2:
- Add error message in the OOM/bad matrix case
- Save some shifts by storing the cached scanlines in AGBR order
- Special cased version that uses 32 bit arithmetic when sizeof(long) <= 4
2013-08-10 11:18:23 -04:00
Søren Sandmann Pedersen
146116eff4 Add support for iter finalizers
Iterators may sometimes need to allocate auxiliary memory. In order to
be able to free this memory, optional iterator finalizers are
required.
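
A minimal sketch of the idea, not the actual pixman code: the struct
layout and every name below (iter_t, fini, iter_init_with_buffer,
iter_finish) are hypothetical. The finalizer is installed only when
the allocation succeeded, so the error path leaves nothing to free.

```c
#include <stdlib.h>

typedef struct iter iter_t;

struct iter {
    void *data;              /* auxiliary memory owned by the iterator */
    void (*fini)(iter_t *);  /* optional finalizer, NULL if none */
};

/* Finalizer: releases the auxiliary buffer. */
void free_data_fini(iter_t *iter)
{
    free(iter->data);
    iter->data = NULL;
}

/* Allocates n bytes of auxiliary memory. Returns 1 on success, 0 on
 * failure; on failure no finalizer is installed. */
int iter_init_with_buffer(iter_t *iter, size_t n)
{
    iter->data = malloc(n);
    if (!iter->data) {
        iter->fini = NULL;
        return 0;
    }
    iter->fini = free_data_fini;
    return 1;
}

/* Called by the owner of the iterator once it is done with it. */
void iter_finish(iter_t *iter)
{
    if (iter->fini)
        iter->fini(iter);
}
```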
2013-08-10 11:18:23 -04:00
Søren Sandmann Pedersen
1be9208e04 test/scaling-bench.c: New benchmark for bilinear scaling
This new benchmark scales a 320 x 240 a8r8g8b8 test image by all
ratios from 0.1, 0.2, ... up to 10.0 and reports the time it took
to do each of the scaling operations, and the time spent per
destination pixel.

The times reported for the scaling operations are given in
milliseconds, the times-per-pixel are in nanoseconds.

V2: Format output better
2013-08-10 11:18:23 -04:00
Søren Sandmann Pedersen
fedd6b192d RELEASING: Add note about changing the topic of the #cairo IRC channel 2013-08-07 10:22:25 -04:00
Søren Sandmann Pedersen
f8a0812b1c Pre-release version bump to 0.30.2 2013-08-07 10:07:35 -04:00
Siarhei Siamashka
b5167b8a54 test: fix matrix-test on big endian systems 2013-08-05 01:45:59 +03:00
Siarhei Siamashka
d87601ffc3 test: fix matrix-test on big endian systems 2013-08-05 01:42:29 +03:00
Julien Cristau
bbb3765faf Upload to unstable 2013-08-03 10:24:43 +02:00
Julien Cristau
2e13b569cb Increase timeout for the alpha-loop test.
That will hopefully let it pass on the mips buildd.
2013-08-03 10:23:41 +02:00
Andrea Canciani
a82b95a264 test: Fix build on MSVC
The MSVC compiler is very strict about variable declarations after
statements.

Move all the declarations of each block before any statement in the
same block to fix multiple instances of:

alpha-loop.c(XX) : error C2275: 'pixman_image_t' : illegal use of this
type as an expression
2013-08-01 09:08:15 -07:00
Søren Sandmann Pedersen
4c04a86c68 Version bump to 0.30.1 2013-08-01 07:19:21 -04:00
Alexander Troosh
6300452952 Require GTK+ version >= 2.16
I got this build failure on my system:

lcc: "scale.c", line 374: warning: function "gtk_scale_add_mark" declared
          implicitly [-Wimplicit-function-declaration]
      gtk_scale_add_mark (GTK_SCALE (widget), 0.0, GTK_POS_LEFT, NULL);
      ^

  CCLD   scale
scale.o: In function `app_new':
(.text+0x23e4): undefined reference to `gtk_scale_add_mark'
scale.o: In function `app_new':
(.text+0x250c): undefined reference to `gtk_scale_add_mark'
scale.o: In function `app_new':
(.text+0x2634): undefined reference to `gtk_scale_add_mark'
make[2]: *** [scale] Error 1
make[2]: Target `all' not remade because of errors.

$ pkg-config --modversion gtk+-2.0
2.12.1

demos/scale.c calls gtk_scale_add_mark(), which is only available in
GTK+ 2.16 and later. We either need to support old GTK+ (rewrite
scale.c) or simply require a newer version of GTK+, like this:
2013-07-30 08:18:35 -04:00
Matthieu Herrb
02869a1229 configure.ac: Don't use '+=' since it's not POSIX
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Matthieu Herrb <matthieu.herrb@laas.fr>
2013-07-30 08:18:25 -04:00
Markos Chandras
35da06c828 Use AC_LINK_IFELSE to check if the Loongson MMI code can link
The Loongson code is compiled with -march=loongson2f to enable the MMI
instructions, but binutils refuses to link object code compiled with
different -march settings, leading to link failures later in the
compile. This avoids that problem by checking if we can link code
compiled for Loongson.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
2013-07-30 08:18:02 -04:00
ingmar@irsoft.de
e14f5a739f Fix broken build when HAVE_CONFIG_H is undefined, e.g. on Win32.
Build fix for platforms without a generated config.h, for example Win32.
2013-07-30 08:17:49 -04:00
Julien Cristau
3f0d759608 Upload to unstable 2013-07-27 21:40:50 +02:00
Julien Cristau
3c4dac9a7c Fix matrix-test on big endian
Patch from Siarhei Siamashka.
2013-07-27 21:40:09 +02:00
Julien Cristau
3473a947da Disable arm iwmmxt fast paths. It breaks the build. 2013-07-27 14:48:50 +02:00
Julien Cristau
dc29515934 Disable silent Makefile rules. 2013-07-27 14:37:23 +02:00
Julien Cristau
2084b2d3bd Upload to unstable 2013-07-26 14:58:46 +02:00
Julien Cristau
317b3c3eea Add more test-only exported functions to symbols file 2013-07-26 14:47:35 +02:00
Julien Cristau
73ff58c119 Remove png file missing from the tarball 2013-07-26 14:36:14 +02:00
Julien Cristau
d2fbfbc23c Bump changelog and symbols for 0.30.0 2013-07-26 14:31:38 +02:00
Julien Cristau
5de927bd3e Merge branch 'upstream-merge' into debian-unstable 2013-07-26 14:26:43 +02:00
Julien Cristau
0ef6350c3d Revert "Add 00-unexport-symbol.diff"
This reverts commit 01c2431ef8.
2013-07-26 14:26:30 +02:00
Julien Cristau
07473e703e Merge remote-tracking branch 'origin/debian-experimental' into debian-unstable
Conflicts:
	debian/changelog
2013-07-26 14:26:11 +02:00
Julien Cristau
be9bb76118 Merge remote-tracking branch 'origin/upstream-experimental' into upstream-merge 2013-07-26 14:24:21 +02:00
Andrea Canciani
1e49329333 test: Fix build on MSVC
The MSVC compiler is very strict about variable declarations after
statements.

Move all the declarations of each block before any statement in the
same block to fix multiple instances of:

alpha-loop.c(XX) : error C2275: 'pixman_image_t' : illegal use of this
type as an expression
2013-06-25 16:55:24 +02:00
Alexander Troosh
279bdcda7e Require GTK+ version >= 2.16
I got this build failure on my system:

lcc: "scale.c", line 374: warning: function "gtk_scale_add_mark" declared
          implicitly [-Wimplicit-function-declaration]
      gtk_scale_add_mark (GTK_SCALE (widget), 0.0, GTK_POS_LEFT, NULL);
      ^

  CCLD   scale
scale.o: In function `app_new':
(.text+0x23e4): undefined reference to `gtk_scale_add_mark'
scale.o: In function `app_new':
(.text+0x250c): undefined reference to `gtk_scale_add_mark'
scale.o: In function `app_new':
(.text+0x2634): undefined reference to `gtk_scale_add_mark'
make[2]: *** [scale] Error 1
make[2]: Target `all' not remade because of errors.

$ pkg-config --modversion gtk+-2.0
2.12.1

demos/scale.c calls gtk_scale_add_mark(), which is only available in
GTK+ 2.16 and later. We either need to support old GTK+ (rewrite
scale.c) or simply require a newer version of GTK+, like this:
2013-06-11 12:09:49 -04:00
Matthieu Herrb
889f118946 configure.ac: Don't use '+=' since it's not POSIX
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Matthieu Herrb <matthieu.herrb@laas.fr>
2013-06-08 10:21:54 -07:00
Søren Sandmann Pedersen
2acfac5f8e Consolidate all the iter_init_bits_stride functions
The SSE2, MMX, and fast implementations all have a copy of the
function iter_init_bits_stride that computes an image buffer and
stride.

Move that function to pixman-utils.c and share it among all the
implementations.
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
533f54430a Delete the old src/dest_iter_init() functions
Now that we are using the new _pixman_implementation_iter_init(), the
old _src/_dest_iter_init() functions are no longer needed, so they can
be deleted, and the corresponding fields in pixman_implementation_t
can be removed.
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
125a4fd36f Add _pixman_implementation_iter_init() and use instead of _src/_dest_init()
A new field, 'iter_info', is added to the implementation struct, and
all the implementations store a pointer to their iterator tables in
it. A new function, _pixman_implementation_iter_init(), is then added
that searches those tables, and the new function is called in
pixman-general.c and pixman-image.c instead of the old
_pixman_implementation_src_init() and _pixman_implementation_dest_init().
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
245d0090c5 general: Store the iter initializer in a one-entry pixman_iter_info_t table
In preparation for sharing all iterator initialization code from all
the implementations, move the general implementation to use a table of
pixman_iter_info_t.

The existing src_iter_init and dest_iter_init functions are
consolidated into one general_iter_init() function that checks the
iter_flags for whether it is dealing with a source or destination
iterator.

Unlike in the other implementations, the general_iter_init() function
stores its own get_scanline() and write_back() functions in the
iterator, so it relies on the initializer being called after
get_scanline and write_back being copied from the struct to the
iterator.
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
9c15afb105 fast: Replace the fetcher_info_t table with a pixman_iter_info_t table
Similar to the SSE2 and MMX patches, this commit replaces a table of
fetcher_info_t with a table of pixman_iter_info_t, and similar to the
noop patch, both fast_src_iter_init() and fast_dest_iter_init() are
now doing exactly the same thing, so their code can be shared in a new
function called fast_iter_init_common().
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
71c2d519d0 mmx: Replace the fetcher_info_t table with a pixman_iter_info_t table
Similar to the SSE2 commit, information about the iterators is stored
in a table of pixman_iter_info_t.
2013-05-22 09:43:21 -04:00
Søren Sandmann Pedersen
78f437d61e sse2: Replace the fetcher_info_t table with a pixman_iter_info_t table
Similar to the changes to noop, put all the iterators into a table of
pixman_iter_info_t and then do a generic search of that table during
iterator initialization.
2013-05-22 09:43:20 -04:00
Søren Sandmann Pedersen
c7b0da8a96 noop: Keep information about iterators in an array of pixman_iter_info_t
Instead of having a nest of if statements, store the information about
iterators in a table of a new struct type, pixman_iter_info_t, and
then walk that table when initializing iterators.

The new struct contains a format, a set of image flags, and a set of
iter flags, plus a pixman_iter_get_scanline_t, a
pixman_iter_write_back_t, and a new function type
pixman_iter_initializer_t.

If the iterator matches an entry, it is first initialized with the
given get_scanline and write_back functions, and then the provided
iter_initializer (if present) is run. Running the iter_initializer
after setting get_scanline and write_back allows the initializer to
override those fields if it wishes.

The table contains both source and destination iterators,
distinguished based on the recently-added ITER_SRC and ITER_DEST;
similarly, wide iterators are recognized with the ITER_WIDE
flag. Having both source and destination iterators in the table means
the noop_src_iter_init() and noop_dest_iter_init() functions become
identical, so this patch factors out their code in a new function
noop_iter_init_common() that both call.

The following patches in this series will change all the
implementations to use an iterator table, and then move the table
search code to pixman-implementation.c.
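
The table lookup described above can be sketched roughly as follows.
This is a simplified illustration, not the code from the patch: the
struct fields, the flag values, FORMAT_ANY, and all function names
are assumptions made for the example; only the ITER_* flag names come
from this series.

```c
#include <stddef.h>
#include <stdint.h>

#define FORMAT_ANY 0u  /* hypothetical wildcard format */
enum { ITER_NARROW = 1, ITER_WIDE = 2, ITER_SRC = 4, ITER_DEST = 8 };

typedef struct iter iter_t;
typedef uint32_t *(*get_scanline_fn)(iter_t *iter);
typedef void (*write_back_fn)(iter_t *iter);
typedef void (*initializer_fn)(iter_t *iter);

struct iter {
    uint32_t format, image_flags, iter_flags;
    get_scanline_fn get_scanline;
    write_back_fn write_back;
};

typedef struct {
    uint32_t format;             /* FORMAT_ANY matches every format */
    uint32_t image_flags;        /* all must be set on the image */
    uint32_t iter_flags;         /* all must be set on the iterator */
    get_scanline_fn get_scanline;
    write_back_fn write_back;
    initializer_fn initializer;  /* optional, may be NULL */
} iter_info_t;

uint32_t *narrow_src_get_scanline(iter_t *iter) { (void)iter; return NULL; }
uint32_t *wide_src_get_scanline(iter_t *iter) { (void)iter; return NULL; }

/* Runs after the table fields were copied, so it may override them. */
void wide_src_initializer(iter_t *iter)
{
    iter->get_scanline = wide_src_get_scanline;
}

const iter_info_t demo_table[] = {
    { FORMAT_ANY, 0, ITER_NARROW | ITER_SRC, narrow_src_get_scanline, NULL, NULL },
    { FORMAT_ANY, 0, ITER_WIDE | ITER_SRC, NULL, NULL, wide_src_initializer },
};

/* Walks the table; returns 1 if an entry matched, 0 otherwise. */
int iter_init_from_table(iter_t *iter, const iter_info_t *table, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const iter_info_t *info = &table[i];

        if ((info->format == FORMAT_ANY || info->format == iter->format) &&
            (info->image_flags & iter->image_flags) == info->image_flags &&
            (info->iter_flags & iter->iter_flags) == info->iter_flags) {
            iter->get_scanline = info->get_scanline;
            iter->write_back = info->write_back;
            if (info->initializer)
                info->initializer(iter);
            return 1;
        }
    }
    return 0;
}
```

Because ITER_SRC and ITER_DEST are ordinary entries in iter_flags, one
lookup routine serves both source and destination iterators.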
2013-05-22 09:43:20 -04:00
Søren Sandmann Pedersen
3b96ee4e77 Always set the FAST_PATH_NO_ALPHA_MAP flag for non-BITS images
We only support alpha maps for BITS images, so it's always safe to ignore
the alpha map for non-BITS images. This makes it possible to get rid of
the check for SOLID images since it will now be subsumed by the check
for FAST_PATH_NO_ALPHA_MAP.

Opaque masks are reduced to NULL images in pixman.c, and those can
also safely be treated as not having an alpha map, so set the
FAST_PATH_NO_ALPHA_MAP bit for those as well.
2013-05-22 09:43:12 -04:00
Søren Sandmann Pedersen
52ff5f0cd9 Add ITER_WIDE iter flag
This will be useful for putting iterators into tables where they can
be looked up by iterator flags. Without this flag, wide iterators can
only be recognized by the absence of ITER_NARROW, which makes testing
for a match difficult.
2013-05-22 09:43:03 -04:00
Søren Sandmann Pedersen
e8a180797c Add ITER_SRC and ITER_DEST iter flags
These indicate whether the iterator is for a source or a destination
image. Note iterator initializers are allowed to rely on one of these
being set, so they can't be left out the way it's generally harmless
(aside from potential performance degradation) to leave out a
particular fast path flag.
2013-05-22 09:41:10 -04:00
Søren Sandmann Pedersen
2320f0520b Make use of image flag in noop iterators
Similar to c2230fe2af, simply check against SAMPLES_COVER_CLIP_NEAREST
instead of comparing all the x/y/width/height parameters.
2013-05-22 04:28:41 -04:00
Markos Chandras
d77d75cc6e Use AC_LINK_IFELSE to check if the Loongson MMI code can link
The Loongson code is compiled with -march=loongson2f to enable the MMI
instructions, but binutils refuses to link object code compiled with
different -march settings, leading to link failures later in the
compile. This avoids that problem by checking if we can link code
compiled for Loongson.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
2013-05-19 09:01:34 -07:00
Matt Turner
a74be759a1 mmx: Document implementation(s) of pix_multiply().
I look at that function and can never remember what it does or how it
manages to do it.
2013-05-15 09:51:15 -07:00
ingmar@irsoft.de
cb5d131ff4 Fix broken build when HAVE_CONFIG_H is undefined, e.g. on Win32.
Build fix for platforms without a generated config.h, for example Win32.
2013-05-11 16:09:39 -04:00
Søren Sandmann Pedersen
d70141955e Post-release version bump to 0.31.1 2013-05-08 19:40:12 -04:00
Søren Sandmann Pedersen
41daf50aae Pre-release version bump to 0.30.0 2013-05-08 19:31:22 -04:00
Søren Sandmann Pedersen
5a7179191d Post-release version bump to 0.29.5 2013-04-30 18:57:43 -04:00
Søren Sandmann Pedersen
2714b5d201 Pre-release version bump to 0.29.4 2013-04-30 18:50:04 -04:00
Søren Sandmann Pedersen
7fc2654a1f pixman/refactor: Delete this file
Essentially all of it is obsolete by now.
2013-04-30 16:25:10 -04:00
Nemanja Lukic
cb928a77c0 MIPS: DSPr2: Added rpixbuf fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
       rpixbuf =  L1:  14.63  L2:  13.55  M:  9.91 ( 79.53%)  HT:  8.47  VT:  8.32  R:  8.17  RT:  4.90 (  33Kops/s)

Optimized:
       rpixbuf =  L1:  45.69  L2:  37.30  M: 17.24 (138.31%)  HT: 15.66  VT: 14.88  R: 13.97  RT:  8.38 (  44Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
c6a6fbdcd3 MIPS: DSPr2: Added pixbuf fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        pixbuf =  L1:  18.18  L2:  16.47  M: 13.36 (107.27%)  HT: 10.16  VT: 10.07  R:  9.84  RT:  5.54 (  35Kops/s)

Optimized:
        pixbuf =  L1:  43.54  L2:  36.02  M: 17.08 (137.09%)  HT: 15.58  VT: 14.85  R: 13.87  RT:  8.38 (  44Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
f69335d529 test: add "pixbuf" and "rpixbuf" to lowlevel-blt-bench
Add the necessary support to the lowlevel-blt benchmark for benchmarking the
pixbuf and rpixbuf fast paths. The bench_composite function now checks for the
string "pixbuf" in the test name and, if it is found, uses the same bits for
the src and mask images.
2013-04-30 15:38:43 -04:00
Nemanja Lukic
3dc9e3827e test: add "src_0888_8888_rev" and "src_0888_0565_rev" to lowlevel-blt-bench 2013-04-30 15:38:43 -04:00
Nemanja Lukic
44174ce51d MIPS: DSPr2: Fix for bug in in_n_8 routine.
The rounding logic was not implemented correctly: logical shifts were used
instead of the rounding version of the 8-bit shift.
The code also used unnecessary multiplications, which can be avoided by packing
4 destination (a8) pixels into one 32-bit register, and there were unnecessary
spills to the stack. The code is rewritten to address these issues.

The bug was revealed by increasing number of the iterations in blitters-test.

Performance numbers on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
                   in_n_8 =  L1:  21.20  L2:  22.86  M: 21.42 ( 14.21%)  HT: 15.97  VT: 15.69  R: 15.47  RT:  8.00 (  48Kops/s)
Optimized (first implementation, with bug):
                   in_n_8 =  L1:  89.38  L2:  86.07  M: 65.48 ( 43.44%)  HT: 44.64  VT: 41.50  R: 40.77  RT: 16.94 (  66Kops/s)
Optimized (with bug fix, and code revisited):
                   in_n_8 =  L1: 102.33  L2:  95.65  M: 70.54 ( 46.84%)  HT: 48.35  VT: 45.06  R: 43.20  RT: 17.60 (  66Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
5858f09d26 MIPS: DSPr2: Added src_0565_8888 nearest neighbor fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
         src_0565_8888 =  L1:  20.70  L2:  19.22  M: 12.50 ( 49.79%)  HT: 10.45  VT: 10.18  R:  9.99  RT:  5.31 (  31Kops/s)

Optimized:
         src_0565_8888 =  L1:  62.98  L2:  53.44  M: 23.07 ( 91.87%)  HT: 19.85  VT: 19.15  R: 17.70  RT:  9.68 (  43Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
311d55b6d8 MIPS: DSPr2: Added over_8888_0565 nearest neighbor fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_0565 =  L1:  13.22  L2:  12.02  M:  9.77 ( 38.92%)  HT:  8.58  VT:  8.35  R:  8.38  RT:  5.78 (  35Kops/s)

Optimized:
        over_8888_0565 =  L1:  26.20  L2:  22.97  M: 15.92 ( 63.40%)  HT: 13.33  VT: 13.13  R: 12.72  RT:  7.65 (  39Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
bd487ee34c MIPS: DSPr2: Added over_8888_8888 nearest neighbor fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_8888 =  L1:  19.47  L2:  16.30  M: 11.24 ( 59.69%)  HT:  9.54  VT:  9.29  R:  9.47  RT:  6.24 (  37Kops/s)

Optimized:
        over_8888_8888 =  L1:  43.67  L2:  33.30  M: 16.32 ( 86.65%)  HT: 14.10  VT: 13.78  R: 12.96  RT:  7.85 (  39Kops/s)
2013-04-30 15:38:43 -04:00
Nemanja Lukic
66def909ad MIPS: DSPr2: Fix bug in over_n_8888_8888_ca/over_n_8888_0565_ca routines
The introduction of the new PRNG (pseudorandom number generator) revealed a
bug in two DSPr2 routines. The bug showed up as wrong results in the composite
and glyph tests, which caused make check to fail for MIPS DSPr2 optimizations.

Bug was in the calculation of the:
*dst = over (src, *dst) when ma == 0xffffffff

In this case src was not negated and shifted right by 24 bits, it was only
negated. When implementing this routine in the first place, I misplaced those
shifts, which allowed me to combine code for over operation and:
    UN8x4_MUL_UN8x4 (s, ma);
    UN8x4_MUL_UN8 (ma, srca);
    ma = ~ma;
    UN8x4_MUL_UN8x4_ADD_UN8x4 (d, ma, s);
So I decided to rewrite that piece of code from scratch. I changed logic, so
now assembly code mimics code from pixman-fast-path.c but processes two pixels
at a time. This code should be easier to debug and maintain.

The bug was revealed in commit b31a6962. Errors were detected by composite
and glyph tests.
2013-04-30 15:38:43 -04:00
Siarhei Siamashka
d768558ce1 sse2: faster bilinear interpolation (get rid of XOR instruction)
The old code was calculating horizontal weights for right pixels
in the following way (for simplicity assume 8-bit interpolation
precision):

  Start with "x = vx" and do increment "x += ux" after each pixel.
  In this case right pixel weight for interpolation can be calculated
  as "((x >> 8) ^ 0xFF) + 1", which is the same as "256 - (x >> 8)".

The new code instead:

  Starts with "x = -(vx + 1)", performs increment "x += -ux" after
  each pixel and calculates right weights as just "(x >> 8) + 1",
  eliminating the need for XOR operation in the inner loop.
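
The equivalence of the two schemes can be checked with a short
sketch. It assumes x is the 16-bit fractional position held in an
unsigned 16-bit lane, so that all arithmetic wraps modulo 2^16; the
function names are made up for this illustration.

```c
#include <stdint.h>

/* Old scheme: the lane holds x and the weight needs an XOR. */
int weight_old(uint16_t x)
{
    return ((x >> 8) ^ 0xFF) + 1;
}

/* New scheme: the lane holds -(x + 1) modulo 2^16, no XOR needed. */
int weight_new(uint16_t neg_x)
{
    return (neg_x >> 8) + 1;
}

/* Returns 1 if both schemes produce the same weights for every
 * 16-bit start value over the given number of steps of size ux. */
int weights_agree(uint16_t ux, int steps)
{
    for (uint32_t vx = 0; vx <= 0xFFFF; vx++) {
        uint16_t x = (uint16_t)vx;
        uint16_t neg_x = (uint16_t)(0u - (vx + 1u));

        for (int i = 0; i < steps; i++) {
            if (weight_old(x) != weight_new(neg_x))
                return 0;
            x = (uint16_t)(x + ux);          /* x += ux  */
            neg_x = (uint16_t)(neg_x - ux);  /* x += -ux */
        }
    }
    return 1;
}
```

The reason this works is that -(x + 1) modulo 2^16 is the bitwise
complement of x, so its upper byte is already "255 - (x >> 8)".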

So we have one instruction less on the critical path. Benchmarks
with "lowlevel-blt-bench -b src_8888_8888" using GCC 4.7.2 on
x86-64 system and default optimizations:

Intel Core i7 860 (2.8GHz):
    before: src_8888_8888 =  L1: 291.37  L2: 288.58  M:285.38
    after:  src_8888_8888 =  L1: 319.66  L2: 316.47  M:312.06

Intel Core2 T7300 (2GHz):
    before: src_8888_8888 =  L1: 121.95  L2: 118.38  M:118.52
    after:  src_8888_8888 =  L1: 128.82  L2: 125.12  M:124.88

Intel Atom N450 (1.67GHz):
    before: src_8888_8888 =  L1:  64.25  L2:  62.37  M: 61.80
    after:  src_8888_8888 =  L1:  64.23  L2:  62.37  M: 61.82

Inspired by the "sse2_bilinear_interpolation" function (single
pixel interpolation) from:
    http://lists.freedesktop.org/archives/pixman/2013-January/002575.html
2013-04-28 23:22:41 +03:00
Siarhei Siamashka
59109f3293 test: larger 0xFF/0x00 filled clusters in random images for blitters-test
Current blitters-test program had difficulties detecting a bug in
over_n_8888_8888_ca implementation for MIPS DSPr2:

    http://lists.freedesktop.org/archives/pixman/2013-March/002645.html

In order to hit the buggy code path, two consecutive mask values had
to be equal to 0xFFFFFFFF because of loop unrolling. The current
blitters-test generates random images in such a way that each byte
has 25% probability for having 0xFF value. Hence each 32-bit mask
value has ~0.4% probability for 0xFFFFFFFF. Because we are testing
many compositing operations with many pixels, encountering at least
one 0xFFFFFFFF mask value reasonably fast is not a problem. If a
bug related to 0xFFFFFFFF mask value is artificially introduced into
over_n_8888_8888_ca generic C function, it gets detected on iteration
675591 in blitters-test (out of 2000000).

However two consecutive 0xFFFFFFFF mask values are much less likely
to be generated, so the bug was missed by blitters-test.
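
To put numbers on this, here is a small sketch that assumes the bytes
are independent and each is 0xFF with the same probability (the
function name is made up for the illustration):

```c
/* Probability that n_values consecutive 32-bit values are all
 * 0xFFFFFFFF when every byte is independently 0xFF with
 * probability p_byte. */
double all_ones_run_probability(double p_byte, int n_values)
{
    double p = 1.0;

    for (int i = 0; i < 4 * n_values; i++)
        p *= p_byte;
    return p;
}
```

With p_byte = 0.25 this gives 1/256 (about 0.4%) for a single mask
value and 1/65536 (about 0.0015%) for two consecutive ones.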

This patch addresses the problem by also randomly setting the 32-bit
values in images to either 0xFFFFFFFF or 0x00000000 (also with 25%
probability). This allows for larger clusters of consecutive 0x00
or 0xFF bytes in images which may have special shortcuts for handling
them in unrolled or SIMD optimized code.
2013-04-28 22:14:47 +03:00
Stefan Weil
a99147d1ea Trivial spelling fixes in comments
They were found by codespell.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-04-27 04:08:45 -04:00
Peter Breitenlohner
9d0bb10312 Check for missing sqrtf() as, e.g., for Solaris 9
Signed-off-by: Peter Breitenlohner <peb@mppmu.mpg.de>
2013-04-08 14:33:25 -04:00
Søren Sandmann Pedersen
d8ac35af12 Improve precision of calculations in pixman-gradient-walker.c
The computations in pixman-gradient-walker.c currently take place at
very limited 8 bit precision which results in quite visible artefacts
in gradients. An example is the one produced by demos/linear-gradient
which currently looks like this:

    http://i.imgur.com/kQbX8nd.png

With the changes in this commit, the gradient looks like this:

    http://i.imgur.com/nUlyuKI.png

The images are also available here:

    http://people.freedesktop.org/~sandmann/gradients/before.png
    http://people.freedesktop.org/~sandmann/gradients/after.png

This patch computes pixels using floating point, but uses a faster
algorithm, which makes up for the loss of performance.

== Theory:

In both the new and the old algorithm, the various gradient
implementations compute a parameter x that indicates how far along the
gradient the current scanline is. The current algorithm has a cache of
the two color stops surrounding the last parameter; those are used in
a SIMD-within-register fashion in this way:

    t1 = walker->left_rb * idist + walker->right_rb * dist;

where dist and idist are the distances to the left and right color
stops respectively normalized to the distance between the left and
right stops. The normalization (which involves a division) is captured
in another cached variable "stepper". The cached values are recomputed
whenever the parameter moves in between two different stops (called
"reset" in the implementation).

Because idist and dist are computed in 8 bits only, a lot of
information is lost, which is quite visible as the image linked above
shows.

The new algorithm caches more information in the following way. When
interpolating between stops, the formula to be used is this:

     t = ((x - left) / (right - left));

     result = lc * (1 - t) + rc * t;

where

    - x is the parameter as computed by the main gradient code,
    - left is the position of the left color stop,
    - right is the position of the right color stop
    - lc is the color of the left color stop
    - rc is the color of the right color stop

That formula can also be written like this:

    result
      = lc * (1 - t) + rc * t;
      = lc + (rc - lc) * t
      = lc + (rc - lc) * ((x - left) / (right - left))
      = (rc - lc) / (right - left) * x +
      	       lc - (left * (rc - lc)) / (right - left)
      = s * x + b

where

    s = (rc - lc) / (right - left)

and

    b = lc - left * (rc - lc) / (right - left)
      = (lc * (right - left) - left * (rc - lc)) / (right - left)
      = (lc * right - rc * left) / (right - left)

To summarize, setting w = (right - left):

    s = (rc - lc) / w
    b = (lc * right - rc * left) / w

    r = s * x + b

Since s and b only depend on the two active stops, both can be cached
so that the computation only needs to do one multiplication and one
addition per pixel (followed by premultiplication of the alpha
channel). That is, seven multiplications in total, which is the same
number as the old SIMD-within-register implementation had.
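
The derivation above can be sketched for a single channel like this.
It is only an illustration of the formulas, not the code in
pixman-gradient-walker.c; the type and function names are made up,
and right != left is assumed.

```c
/* Cached per-interval coefficients for one color channel. */
typedef struct {
    float s;  /* slope:  (rc - lc) / w */
    float b;  /* offset: (lc * right - rc * left) / w */
} channel_coeffs_t;

/* Recomputed only when x moves in between two different stops. */
channel_coeffs_t cache_coeffs(float left, float right, float lc, float rc)
{
    float w_rec = 1.0f / (right - left);  /* reciprocal of w */
    channel_coeffs_t c;

    c.s = (rc - lc) * w_rec;
    c.b = (lc * right - rc * left) * w_rec;
    return c;
}

/* Per pixel: one multiplication and one addition. */
float eval_channel(channel_coeffs_t c, float x)
{
    return c.s * x + c.b;
}

/* Reference: direct interpolation between the two stops. */
float lerp_channel(float left, float right, float lc, float rc, float x)
{
    float t = (x - left) / (right - left);

    return lc * (1.0f - t) + rc * t;
}
```

For example, with stops at 0.25 and 0.75 and channel values 0 and 1,
the cached values are s = 2 and b = -0.5, and both functions return
0.5 at x = 0.5.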

== Implementation notes:

The new formula described above is implemented in single precision
floating point, and the eight divisions necessary to compute the
cached values are done by multiplication with the reciprocal of the
distance between the color stops.

The alpha values used in the cached computation are scaled by 255.0,
whereas the RGB values are kept in the [0, 1] interval. This ensures
that after premultiplication, all values will be in the [0, 255]
interval.

This scaling is done by first dividing all the channels by
257, and then later on dividing the r, g, b channels by 255. It would
be more natural to do all this scaling in only one place, but
inexplicably, that results in a (substantial) slowdown on Sandy Bridge
with GCC v 4.7.

== Performance impact (median of three runs of radial-perf-test):

   == Intel Sandy Bridge, Core i3 @ 1.2GHz

   Before: 0.014553
   After:  0.014410
   Change: 1.0% faster

   == AMD Barcelona @ 1.2 GHz

   Before: 0.021735
   After:  0.021328
   Change: 1.9% faster

I.e., slightly faster, though conceivably there could be a negative
impact on machines with a bigger difference between integer and
floating point performance.

V2:

- Use 's' and 'b' in the variable names instead of 'm' and 'd'. This
  way they match the explanation above

- Move variable declarations to the top of the function

- Remove unused stepper field

- Some formatting fixes

- Don't pointlessly include pixman-combine32.h

- Don't offset x for each pixel; go back to offsetting left_x and
  right_x at reset time. The offsets cancel out in the formula above,
  so there is no impact on the calculations.
2013-03-16 01:14:22 -04:00
Søren Sandmann Pedersen
a1c2331e0e Move the IS_ZERO() to pixman-private.h and rename to FLOAT_IS_ZERO()
Some upcoming changes to pixman-gradient-walker.c will need this
macro.
2013-03-11 22:41:55 -04:00
Søren Sandmann Pedersen
2c953e572f test: Add radial-perf-test, a microbenchmark for radial gradients
This benchmark renders one of the radial gradients used in the
swfdec-youtube cairo trace 500 times and reports the average time it
took.

V2: Update .gitignore
2013-03-11 22:41:45 -04:00
Søren Sandmann Pedersen
460faaa411 demos: Add linear-gradient demo program
This program displays a linear gradient from blue to yellow. Due to
limited precision in pixman-gradient-walker.c, it currently has some
ugly artefacts that give it a 'brushed metal' appearance.

V2: Update .gitignore
2013-03-11 22:40:05 -04:00
Behdad Esfahbod
aaae3d8eef Remove unused macro 2013-03-08 06:00:00 -05:00
Nemanja Lukic
5feda20fc3 MIPS: DSPr2: Added more fast-paths for SRC operation:
- src_0888_8888_rev
 - src_0888_0565_rev

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        src_0888_8888_rev =  L1:  51.88  L2:  42.00  M: 19.04 ( 88.50%)  HT: 15.27  VT: 14.62  R: 14.13  RT:  7.12 (  45Kops/s)
        src_0888_0565_rev =  L1:  31.96  L2:  30.90  M: 22.60 ( 75.03%)  HT: 15.32  VT: 15.11  R: 14.49  RT:  6.64 (  43Kops/s)

Optimized:
        src_0888_8888_rev =  L1: 222.73  L2: 113.70  M: 20.97 ( 97.35%)  HT: 18.31  VT: 17.14  R: 16.71  RT:  9.74 (  54Kops/s)
        src_0888_0565_rev =  L1: 100.37  L2:  74.27  M: 29.43 ( 97.63%)  HT: 22.92  VT: 21.59  R: 20.52  RT: 10.56 (  56Kops/s)
2013-02-27 14:40:51 +01:00
Nemanja Lukic
43914d68d1 MIPS: DSPr2: Added more fast-paths for OVER operation:
- over_8888_0565
 - over_n_8_8

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_0565 =  L1:  14.30  L2:  13.22  M: 10.43 ( 41.56%)  HT: 12.51  VT: 12.95  R: 11.82  RT:  7.34 (  49Kops/s)
            over_n_8_8 =  L1:  12.77  L2:  16.93  M: 15.03 ( 29.94%)  HT: 10.78  VT: 10.72  R: 10.29  RT:  4.92 (  33Kops/s)

Optimized:
        over_8888_0565 =  L1:  26.03  L2:  22.92  M: 15.68 ( 62.43%)  HT: 16.19  VT: 16.27  R: 14.93  RT:  8.60 (  52Kops/s)
            over_n_8_8 =  L1:  62.00  L2:  55.17  M: 40.29 ( 80.23%)  HT: 26.77  VT: 25.64  R: 24.13  RT: 10.01 (  47Kops/s)
2013-02-27 14:39:45 +01:00
Julien Cristau
259f681187 Upload to unstable 2013-02-18 20:17:18 +01:00
Søren Sandmann Pedersen
6dfdd8534f Fix for infinite-loop test
The infinite loop detected by "affine-test 212944861" is caused by an
overflow in this expression:

    max_x = pixman_fixed_to_int (vx + (width - 1) * unit_x) + 1;

where (width - 1) * unit_x doesn't fit in a signed int. This causes
max_x to be too small so that this:

    src_width = 0

    while (src_width < REPEAT_NORMAL_MIN_WIDTH && src_width <= max_x)
        src_width += src_image->bits.width;

results in src_width being 0. Later on when src_width is used for
repeat calculations, we get the infinite loop.

By casting unit_x to int64_t, the expression no longer overflows and
affine-test 212944861 and infinite-loop no longer loop forever.
(cherry picked from commit de60e2e0e3)
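
A small sketch of the overflow and the fix, using a stand-in type for
pixman_fixed_t (16.16 fixed point). The function names are made up.
The narrow version emulates the 32-bit wraparound with unsigned
arithmetic; converting the result back to int32_t and right-shifting
a negative value are implementation-defined, and two's complement
behavior is assumed here.

```c
#include <stdint.h>

typedef int32_t fixed_16_16_t;  /* stand-in for pixman_fixed_t */

/* 32-bit evaluation: (width - 1) * unit_x may wrap around. */
int32_t max_x_narrow(fixed_16_16_t vx, int32_t width, fixed_16_16_t unit_x)
{
    uint32_t wrapped = (uint32_t)vx + (uint32_t)(width - 1) * (uint32_t)unit_x;

    return ((int32_t)wrapped >> 16) + 1;
}

/* 64-bit evaluation: the product no longer overflows. */
int64_t max_x_wide(fixed_16_16_t vx, int32_t width, fixed_16_16_t unit_x)
{
    int64_t last = (int64_t)vx + (int64_t)(width - 1) * unit_x;

    return (last >> 16) + 1;
}
```

With width = 1001 and unit_x = 48.0 (48 * 65536), the product is
1000 * 48 * 65536, which exceeds INT32_MAX; the 64-bit version yields
48001, while the 32-bit version wraps to a much smaller value.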
2013-02-18 19:58:06 +01:00
Søren Sandmann Pedersen
2156fb51b3 gtk-utils.c: Use cairo in show_image() rather than GdkPixbuf
GdkPixbufs are not premultiplied, so when using them to display pixman
images, there are some unnecessary conversions going on: First the image
is converted to non-premultiplied, and then GdkPixbuf premultiplies
before sending the result to the X server. These conversions may cause
the displayed image to not be exactly identical to the original.

This patch just uses a cairo image surface instead, which avoids these
conversions.

Also make the comment about sRGB a little more concise.
2013-02-15 18:57:24 -05:00
Ben Avison
5e207f825b Fix to lowlevel-blt-bench
The source, mask and destination buffers are initialised to 0xCC just after
they are allocated. Between each benchmark, there are a pair of memcpys,
from the destination buffer to the source buffer and back again (there are
no explanatory comments, but presumably this is an effort to flush the
caches). However, it has an unintended consequence, which is to change the
contents of the buffers on entry to subsequent benchmarks. This means it is
not a fair test: for example, with over_n_8888 (featured in the following
patches) it reports L2 and even M tests as being faster than the L1 test,
because after the L1 test, the source buffer is filled with fully opaque
pixels, for which over_n_8888 has a shortcut.

The fix here is simply to reverse the order of the memcpys, so src and
destination are both filled with 0xCC on entry to all tests.
2013-02-13 02:24:34 -05:00
Stefan Weil
d26f922dc1 sse2: Use uintptr_t in type casts from pointer to integral value
Some recent code added new type casts from pointer to unsigned long.
These type casts result in compiler warnings for systems like
MinGW-w64 (64 bit Windows) where sizeof(unsigned long) != sizeof(void *).
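
A small sketch of the kind of cast involved, assuming an alignment check as
the use case. The helper name is_aligned_16() is hypothetical and is not
taken from the SSE2 code:

```c
#include <stdint.h>

/*
 * uintptr_t is an unsigned integer type that can hold a pointer value,
 * so this cast does not truncate on LLP64 targets such as 64-bit
 * Windows, where unsigned long is only 32 bits wide.
 */
static int
is_aligned_16 (const void *p)
{
    return ((uintptr_t) p & 15) == 0;
}
```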

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
dc80eb09e2 lookup_composite: Don't update cache in case of error
If we fail to find a composite function, don't update the fast path
cache with the dummy compositing function.

Also make the error message state that the bug is likely caused by
issues with thread local storage.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
4dced81c91 Turn on error logging at all times
While releasing 0.29.2 the distcheck run produced a number of error
messages that had to be fixed in 349015e1fc.
These were not caught before so nobody had actually run pixman with
debugging turned on. It's not the first time this has happened, see
5b0563f39e for example.

So this patch makes the return_if_fail() macros use unlikely() around
the expressions and then turns on error logging at all times. The
performance hit should be negligible since we were already evaluating the
expressions.

The place where DEBUG actually does cause a performance hit is in the
region selfcheck code, and that will still only be enabled in
development snapshots.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
f4c9492c12 pixman-compiler.h: Add unlikely() macro
When compiling with GCC this macro expands to __builtin_expect((expr), 0).
On other compilers, it just expands to (expr).
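
A sketch of the macro as described above, together with one way a checking
macro could use it. The names return_val_if_fail_sketch, error_count and
checked_div() are hypothetical illustrations, not the pixman definitions:

```c
#include <stdio.h>

#if defined(__GNUC__)
#  define unlikely(expr) (__builtin_expect ((expr), 0))
#else
#  define unlikely(expr) (expr)
#endif

/* Hypothetical counter so that callers can observe logged failures. */
static int error_count;

/* Hypothetical checking macro: log and bail out when expr is false. */
#define return_val_if_fail_sketch(expr, retval)                        \
    do                                                                 \
    {                                                                  \
        if (unlikely (!(expr)))                                        \
        {                                                              \
            error_count++;                                             \
            fprintf (stderr, "expression '%s' failed\n", #expr);       \
            return (retval);                                           \
        }                                                              \
    } while (0)

static int
checked_div (int a, int b)
{
    return_val_if_fail_sketch (b != 0, 0);
    return a / b;
}
```

The hint only tells the compiler which branch to lay out as the fall-through
path; the expression is evaluated either way.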
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
5ebb5ac380 utils.c: Increase acceptable deviation to 0.0064 in pixel_checker_t
The check-formats program reveals that the 8 bit pipeline cannot meet
the current 0.004 acceptable deviation specified in utils.c, so we
have to increase it. Some of the failing pixels were captured in
pixel-test, which with this commit now passes.

== a4r4g4b4 DISJOINT_XOR a8r8g8b8 ==

The DISJOINT_XOR operator applied to an a4r4g4b4 source pixel of
0xd0c0 and a destination pixel of 0x5300ea00 results in the exact
value:

    fa = (1 - da) / sa = (1 - 0x53 / 255.0) / (0xd / 15.0) = 0.7782
    fb = (1 - sa) / da = (1 - 0xd / 15.0) / (0x53 / 255.0) = 0.4096

    r = fa * (0xc / 15.0) + fb * (0xea / 255.0) = 0.99853

But when computing in 8 bits, we get:

    fa8 = ((255 - 0x53) * 255 + 0xdd / 2) / 0xdd = 0xc6
    fb8 = ((255 - 0xdd) * 255 + 0x53 / 2) / 0x53 = 0x68

    r8 = (fa8 * 0xcc + 127) / 255 + (fb8 * 0xea + 127) / 255 = 0xfd

and

    0xfd / 255.0 = 0.9921568627450981

for a deviation of 0.00637118610187, which we then have to consider
acceptable given the current implementation.

By switching to computing the result with

   r = (fa * s + fb * d + 127) / 255

rather than

   r = (fa * s + 127) / 255 + (fb * d + 127) / 255

the deviation would be only 0.00244961747442, so at some point it may
be worth doing either this, or switching to floating point for
operators that involve divisions.
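
The arithmetic above can be reproduced with a short sketch. The function
names are hypothetical, and the clamping of fa and fb to 1 that a DISJOINT
operator needs is omitted because both factors are below 1 for this pixel
pair:

```c
/* a / b in 8-bit fixed point, rounded: (a * 255 + b / 2) / b */
static unsigned
div_un8 (unsigned a, unsigned b)
{
    return (a * 255 + b / 2) / b;
}

/* Exact result with all inputs given as real numbers in [0, 1]. */
static double
disjoint_xor_exact (double sa, double s, double da, double d)
{
    double fa = (1.0 - da) / sa;
    double fb = (1.0 - sa) / da;

    return fa * s + fb * d;
}

/* 8-bit form where each product is rounded separately. */
static unsigned
disjoint_xor_split (unsigned sa, unsigned s, unsigned da, unsigned d)
{
    unsigned fa8 = div_un8 (255 - da, sa);
    unsigned fb8 = div_un8 (255 - sa, da);

    return (fa8 * s + 127) / 255 + (fb8 * d + 127) / 255;
}

/* 8-bit form where the sum of the products is rounded once. */
static unsigned
disjoint_xor_joint (unsigned sa, unsigned s, unsigned da, unsigned d)
{
    unsigned fa8 = div_un8 (255 - da, sa);
    unsigned fb8 = div_un8 (255 - sa, da);

    return (fa8 * s + fb8 * d + 127) / 255;
}
```

Calling the 8-bit functions with (0xdd, 0xcc, 0x53, 0xea) gives 0xfd for the
split form and 0xfe for the joint form, which matches the two deviations
quoted above when compared against the exact value.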

Note that the conversion from 4 bits to 8 bits does not cause any
error in this case because both rounding and bit replication produce
an exact result when the number of from-bits divides the number of
to-bits.

== a8r8g8b8 OVER r5g6b5 ==

When OVER compositing the a8r8g8b8 pixel 0x0f00c300 with the x14r6g6b6
pixel 0x03c0, the true floating point value of the resulting green
channel is:

   0xc3 / 255.0 + (1.0 - 0x0f / 255.0) * (0x0f / 63.0) = 0.9887955

but when compositing 8 bit values, where the 6-bit green channel is
converted to 8 bit through bit replication, the 8-bit result is:

   0xc3 + ((255 - 0x0f) * 0x3c + 127) / 255 = 251

which corresponds to a real value of 0.984314. The difference from the
true value is 0.004482 which is bigger than the acceptable deviation
of 0.004. So, if we were to compute all the CONJOINT/DISJOINT
operators in floating point, or otherwise make them more accurate, the
acceptable deviation could be set at 0.0045.

If we were doing the 6-bit conversion with rounding:

   (x / 63.0 * 255.0 + 0.5)

instead of bit replication, the deviation in this particular case
would be only 0.0005, so we may want to consider this at some
point.
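
A sketch comparing the two 6-bit to 8-bit conversions for this pixel pair.
The helper names are hypothetical and the OVER formula is the 8-bit
expression written out above:

```c
/* 6 -> 8 bits by bit replication: the top 2 bits refill the low 2 bits. */
static unsigned
expand6_replicate (unsigned v)
{
    return (v << 2) | (v >> 4);
}

/* 6 -> 8 bits by rounding to the nearest 8-bit value. */
static unsigned
expand6_round (unsigned v)
{
    return (unsigned) (v / 63.0 * 255.0 + 0.5);
}

/* 8-bit OVER for one channel: s + (1 - sa) * d, rounded. */
static unsigned
over_un8 (unsigned s, unsigned sa, unsigned d)
{
    return s + ((255 - sa) * d + 127) / 255;
}
```

For the green channel 0x0f, replication yields 0x3c and a composited value
of 251, while rounding yields 61 and a composited value of 252.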
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
f2ba7fe1d8 test: Add new pixel-test regression test
This test program contains a table of individual operator/pixel
combinations. For each pixel combination, images of various sizes are
filled with the pixels and then composited. The result is then
verified against the output of do_composite(). If the result doesn't
match, detailed error information is printed.

The initial 14 pixel combinations currently all fail.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
6781636740 a1-trap-test: Add tests for operator_name and format_name()
The check-formats.c test depends on the exact format of the strings
returned from these functions, so add a test here.

a1-trap-test isn't the ideal place, but it seems like overkill to add
a new test just for these trivial checks.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
d1434d112c test: Add new check-formats utility
Given an operator and two formats, this program will composite and
check all pixels where the red and blue channels are 0. That is, if
the two formats are a8r8g8b8 and a4r4g4b4, all source pixels matching
the mask

    0xff00ff00

are composited with the given operator against all destination pixels
matching the mask

    0xf0f0

and the result is then verified against the do_composite() function
that was moved to utils.c earlier.
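
One way to enumerate every pixel value that matches such a mask is the
classic subset-stepping trick sketched below. This is an illustration only;
the names are hypothetical and check-formats may walk the pixels
differently:

```c
#include <stdint.h>

/*
 * Given a value v whose set bits all lie inside mask, return the next
 * larger such value, wrapping around to 0 after v == mask.
 */
static uint32_t
next_masked (uint32_t v, uint32_t mask)
{
    return (v - mask) & mask;
}

/* Count how many distinct values fit inside mask. */
static unsigned long
count_masked (uint32_t mask)
{
    unsigned long n = 0;
    uint32_t v = 0;

    do
    {
        n++;
        v = next_masked (v, mask);
    } while (v != 0);

    return n;
}
```

For the masks above this visits 65536 source pixels and 256 destination
pixels.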

This program reveals that a number of operators and format
combinations are not computed to within the precision currently
accepted by pixel_checker_t. For example:

    check-formats over a8r8g8b8 r5g6b5 | grep failed | wc -l
    30

reveals that there are 30 pixel combinations where OVER produces
insufficiently precise results for the a8r8g8b8 and r5g6b5 formats.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
1820131fe6 utils.[ch]: Add pixel_checker_get_masks()
This function returns the a, r, g, and b masks corresponding to the
pixel checker's format.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
5eb61f72ea test/utils.[ch]: Add pixel_checker_convert_pixel_to_color()
This function takes a pixel in the format corresponding to the pixel
checker, and converts to a color_t.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
3ae717f71a test: Move do_composite() function from composite.c to utils.c
So that it can be used in other tests.
2013-02-13 02:18:01 -05:00
Søren Sandmann Pedersen
958bd334b3 Post-release version bump to 0.29.3 2013-01-29 21:42:02 -05:00
Søren Sandmann Pedersen
a56707e23b Pre-release version bump to 0.29.2 2013-01-29 21:14:51 -05:00
Søren Sandmann Pedersen
349015e1fc stresstest: Ensure that the rasterizer is only given alpha formats
In c2cb303d33, return_if_fail()s were added to
prevent the trapezoid rasterizers from being called with non-alpha
formats. However, stress-test actually does call the rasterizers with
non-alpha formats, but because _pixman_log_error() is disabled in
versions with an odd minor number, the errors never materialized.

Fix this by changing the argument to random format to an enum of three
values DONT_CARE, PREFER_ALPHA, or REQUIRE_ALPHA, and then in the
switch that calls the trapezoid rasterizers, pass the appropriate
value for the function in question.
2013-01-29 20:43:51 -05:00
Søren Sandmann Pedersen
afde862928 Change default GPGKEY to 3892336E, which is soren.sandmann@gmail.com
The old one belongs to the email address sandmann@daimi.au.dk, which
doesn't work anymore.

Also use gpg to get the name and address for the "(Signed by ...)"
line since that works more reliably for me than using git.
2013-01-29 15:24:22 -05:00
Ben Avison
69a7a9b6b6 Improve L1 and L2 benchmark tests for caches that don't use allocate-on-write
In particular this affects single-core ARMs (e.g. ARM11, Cortex-A8), which
are usually configured this way. For other CPUs, this should only add a
constant time, which will be cancelled out by the EXCLUDE_OVERHEAD runs.

The problems were caused by cachelines becoming permanently evicted from
the cache, because the code that was intended to pull them back in again on
each iteration assumed too long a cache line (for the L1 test) or failed to
read memory beyond the first pixel row (for the L2 test). Also, the reloading
of the source buffer was unnecessary.

These issues were identified by Siarhei in this post:
http://lists.freedesktop.org/archives/pixman/2013-January/002543.html
2013-01-29 15:23:05 -05:00
Søren Sandmann Pedersen
1fa67f499d pixman-combine-float.c: Use IS_ZERO() in clip_color() and set_sat()
The clip_color() function has some checks to avoid division by zero,
but they are done by comparing the value to 4 * FLT_EPSILON, where a
better choice is the IS_ZERO() macro that compares to +/- FLT_MIN.

In set_sat(), the check is that *max > *min before dividing by *max -
*min, but that has the potential problem that interactions between GCC
optimizations and 80 bit x87 registers could mean that (*max > *min) is
true in 80 bits, but (*max - *min) is 0 in 32 bits, so that the
division by zero is not prevented. Using IS_ZERO() here as well
prevents this.
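
A sketch of such a check, assuming the macro is defined as a comparison
against +/- FLT_MIN as described above. The exact definition in
pixman-combine-float.c may differ, and safe_div() is a hypothetical helper:

```c
#include <float.h>

/* True when f is zero or too small to be a safe divisor. */
#define IS_ZERO(f) (-FLT_MIN < (f) && (f) < FLT_MIN)

/* Divide num by den, returning fallback when den is effectively zero. */
static float
safe_div (float num, float den, float fallback)
{
    if (IS_ZERO (den))
        return fallback;

    return num / den;
}
```

Unlike a comparison against 4 * FLT_EPSILON, this still treats small but
normal values such as 1e-30f as valid divisors.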
2013-01-29 15:23:05 -05:00
Ben Avison
7e53e58664 ARMv6: Replacement add_8_8, over_8888_8888, over_8888_n_8888 and over_n_8_8888 routines
Improved by adding preloads, combining writes and using the SEL
instruction.

add_8_8

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  62.1   0.2      543.4  12.4    100.0%      +774.9%
L2  38.7   0.4      116.8  1.7     100.0%      +201.8%
M   40.0   0.1      110.1  0.5     100.0%      +175.3%
HT  30.9   0.2      43.4   0.5     100.0%      +40.4%
VT  30.6   0.3      39.2   0.5     100.0%      +28.0%
R   21.3   0.2      35.4   0.4     100.0%      +66.6%
RT  8.6    0.2      10.2   0.3     100.0%      +19.4%

over_8888_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  32.3   0.1      38.0   0.2     100.0%      +17.7%
L2  15.9   0.4      30.6   0.5     100.0%      +92.8%
M   13.3   0.0      25.6   0.0     100.0%      +92.9%
HT  10.5   0.1      15.5   0.1     100.0%      +47.1%
VT  10.4   0.1      14.6   0.1     100.0%      +40.8%
R   10.3   0.1      15.8   0.1     100.0%      +53.3%
RT  6.0    0.1      7.6    0.1     100.0%      +25.9%

over_8888_n_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  17.6   0.1      21.0   0.1     100.0%      +19.2%
L2  11.2   0.2      19.2   0.1     100.0%      +71.2%
M   10.2   0.0      19.6   0.0     100.0%      +92.6%
HT  8.4    0.0      11.9   0.1     100.0%      +41.7%
VT  8.3    0.0      11.3   0.1     100.0%      +36.4%
R   8.3    0.0      11.8   0.1     100.0%      +43.1%
RT  5.1    0.1      6.2    0.1     100.0%      +21.3%

over_n_8_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  17.5   0.1      22.8   0.8     100.0%      +30.1%
L2  14.2   0.3      21.7   0.2     100.0%      +52.6%
M   12.0   0.0      22.3   0.0     100.0%      +84.8%
HT  10.5   0.1      14.1   0.1     100.0%      +34.5%
VT  10.0   0.1      13.5   0.1     100.0%      +35.3%
R   9.4    0.0      12.9   0.2     100.0%      +37.7%
RT  5.5    0.1      6.5    0.2     100.0%      +19.2%
2013-01-29 21:48:03 +02:00
Ben Avison
f87dfd6f37 ARMv6: New conversion routines
There was no previous attempt at accelerating these specifically for
ARMv6.

src_x888_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  96.7   0.5      270.4  2.6     100.0%      +179.5%
L2  44.6   2.7      110.6  9.7     100.0%      +148.0%
M   26.9   0.1      87.6   0.5     100.0%      +226.1%
HT  19.3   0.2      37.5   0.4     100.0%      +93.7%
VT  18.6   0.1      33.7   0.4     100.0%      +81.6%
R   18.4   0.1      32.2   0.3     100.0%      +75.2%
RT  9.2    0.2      12.1   0.3     100.0%      +31.4%

src_0565_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  37.0   0.3      66.9   0.2     100.0%      +80.8%
L2  30.3   0.2      55.9   0.3     100.0%      +84.4%
M   25.9   0.0      62.3   0.2     100.0%      +140.3%
HT  15.2   0.1      33.1   0.3     100.0%      +116.9%
VT  15.1   0.1      30.7   0.3     100.0%      +103.6%
R   14.2   0.1      27.6   0.3     100.0%      +94.0%
RT  6.0    0.1      11.2   0.3     100.0%      +87.2%
2013-01-29 21:47:59 +02:00
Ben Avison
a0f59f3b28 ARMv6: New blit routines
These are usable either as various composite operations, or via the
top-level function pixman_blt() which now does some blitting for the
first time on an ARMv6 platform (previously it just returned FALSE).

src_8888_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  414.5  9.4      445.8  3.6     100.0%      +7.6%
L2  93.3   20.7     114.5  12.9    100.0%      +22.7%
M   57.0   0.2      89.2   0.5     100.0%      +56.4%
HT  28.7   0.3      39.6   0.4     100.0%      +37.9%
VT  25.5   0.2      35.3   0.4     100.0%      +38.4%
R   20.1   0.1      33.8   0.3     100.0%      +67.8%
RT  7.8    0.2      12.7   0.4     100.0%      +62.7%

src_0565_0565

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  397.4  6.1      412.5  5.2     100.0%      +3.8%
L2  143.2  10.9     141.9  6.5     68.9%       -0.9%  (insignificant)
M   90.7   0.4      133.5  0.7     100.0%      +47.1%
HT  38.6   0.3      53.7   0.7     100.0%      +39.0%
VT  33.0   0.3      47.3   0.6     100.0%      +43.3%
R   25.7   0.2      42.1   0.5     100.0%      +64.1%
RT  8.0    0.2      13.3   0.3     100.0%      +65.6%

src_8_8

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  716.5  9.8      768.2  20.4    100.0%      +7.2%
L2  246.2  12.7     260.5  8.8     100.0%      +5.8%
M   146.8  0.7      227.9  0.7     100.0%      +55.2%
HT  44.9   0.6      62.1   1.0     100.0%      +38.2%
VT  35.6   0.4      53.4   0.7     100.0%      +50.0%
R   29.7   0.3      48.2   0.6     100.0%      +62.2%
RT  8.6    0.2      12.9   0.4     100.0%      +49.3%
2013-01-29 21:47:54 +02:00
Ben Avison
3cff56c5b0 ARMv6: New fill routines
Note that this also effectively accelerates src_n_8888, src_n_0565 and
src_n_8 composite types, because of the fast paths in
pixman-fast-path.c implemented by fast_composite_solid_fill(), which
end up dispatching these platform-specific fill routines.

src_n_8888

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  157.3  1.1      574.2  8.7     100.0%      +265.0%
L2  94.2   0.5      364.8  4.2     100.0%      +287.3%
M   92.7   0.4      358.7  1.1     100.0%      +287.1%
HT  68.5   0.9      133.6  4.0     100.0%      +95.2%
VT  61.3   0.8      111.8  2.6     100.0%      +82.4%
R   61.1   0.9      108.7  2.8     100.0%      +78.1%
RT  24.6   1.0      28.6   1.6     100.0%      +16.0%

src_n_0565

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  157.4  1.0      983.1  38.5    100.0%      +524.6%
L2  93.6   0.5      696.0  14.3    100.0%      +643.4%
M   92.7   0.4      680.5  1.0     100.0%      +634.0%
HT  68.3   0.9      160.3  6.6     100.0%      +134.6%
VT  61.1   0.8      130.1  3.4     100.0%      +112.9%
R   61.0   0.8      125.4  4.1     100.0%      +105.7%
RT  24.9   1.3      29.5   1.5     100.0%      +18.2%

src_n_8

    Before          After
    Mean   StdDev   Mean   StdDev  Confidence  Change
L1  154.7  1.0      1324.4 48.5    100.0%      +756.3%
L2  92.4   0.4      1178.4 10.9    100.0%      +1175.6%
M   92.9   0.4      1275.7 2.1     100.0%      +1273.5%
HT  68.2   1.0      169.8  5.5     100.0%      +149.0%
VT  61.2   1.0      138.5  3.6     100.0%      +126.3%
R   61.3   0.9      130.1  3.8     100.0%      +112.4%
RT  25.5   1.3      29.2   1.9     100.0%      +14.6%
2013-01-29 21:47:49 +02:00
Ben Avison
2e173326aa ARMv6: Lay the groundwork for later patches in the series
Move the entire contents of pixman-arm-simd-asm.S to a new file;
ultimately this will only retain the scaled operations, so it is
named pixman-arm-simd-asm-scaled.S. Added new header file
pixman-arm-simd-asm.h, containing the macros which are the basis of
all the new ARMv6 implementations, although at this point in the
series, nothing uses them and the library should be binary-identical.
2013-01-29 21:47:42 +02:00
Søren Sandmann Pedersen
65fc1adb65 demo/scale: Add a spin button to set the number of subsample bits
For large upscalings the level of subsampling for the filter has a
quite visible effect, so make it settable in the UI so that people can
experiment with various values.
2013-01-27 23:06:28 -05:00
Siarhei Siamashka
ed39992564 Use pixman_transform_point_31_16() from pixman_transform_point()
Old functions pixman_transform_point() and pixman_transform_point_3d()
now become just wrappers for pixman_transform_point_31_16() and
pixman_transform_point_31_16_3d(). Eventually their uses should be
completely eliminated in the pixman code and replaced with their
extended range counterparts. This is needed in order to be able
to correctly handle any matrices and parameters that may come
to pixman from the code responsible for XRender implementation.
2013-01-27 20:50:38 +02:00
Siarhei Siamashka
5a78d74ccc test: Added matrix-test for testing projective transform accuracy
This test uses __float128 data type when it is available
for implementing a "perfect" reference implementation. The
output from pixman_transform_point_31_16() and
pixman_transform_point_31_16_affine() is compared with the
reference implementation to make sure that the rounding
errors may only show up in a single least significant bit.

The platforms and compilers, which do not support __float128
data type, can rely on crc32 checksum for the pseudorandom
transform results.
2013-01-27 20:50:31 +02:00
Siarhei Siamashka
09600ae7e3 configure.ac: Added detection for __float128 support
GCC supports 128-bit floating point data type on some platforms (including
but not limited to x86 and x86-64). This may be useful for tests, which
need perfectly accurate reference implementations of certain algorithms.
2013-01-27 20:50:26 +02:00
Siarhei Siamashka
c3deb8334a Add higher precision "pixman_transform_point_*" functions
The following new functions are added:

pixman_transform_point_31_16_3d() -
    Calculates the product of a matrix and a vector.

pixman_transform_point_31_16() -
    Calculates the product of a matrix and a vector.
    Then converts the resulting homogeneous vector [x, y, z] to
    cartesian [x', y', 1] variant, where x' = x / z, and y' = y / z.

pixman_transform_point_31_16_affine() -
    A faster sibling of the other two functions, which assumes affine
    transformation, where the bottom row of the matrix is [0, 0, 1] and
    the last element of the input vector is set to 1.

These functions transform a point with 31.16 fixed point coordinates from
the destination space to a point with 48.16 fixed point coordinates in
the source space.

The results are accurate and the rounding errors may only show up in
the least significant bit. No overflows are possible for the affine
transformations as long as the input data is provided in 31.16 format.
In the case of projective transformations, some output values may be not
representable using 48.16 fixed point format. In this case the results
are clamped to return maximum or minimum 48.16 values (so that the caller
can at least handle NONE and PAD repeats correctly).
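
The homogeneous-to-cartesian step can be illustrated with a floating point
sketch. It deliberately uses double instead of the 31.16 and 48.16 fixed
point formats, and the function names are hypothetical rather than part of
the pixman API:

```c
/* out = m * in for a 3x3 matrix and a column vector. */
static void
mat3_mul_vec3 (double m[3][3], const double in[3], double out[3])
{
    int i, j;

    for (i = 0; i < 3; i++)
    {
        out[i] = 0.0;
        for (j = 0; j < 3; j++)
            out[i] += m[i][j] * in[j];
    }
}

/* [x, y, z] -> [x / z, y / z, 1]; returns 0 when z is zero. */
static int
to_cartesian (double v[3])
{
    if (v[2] == 0.0)
        return 0;

    v[0] /= v[2];
    v[1] /= v[2];
    v[2] = 1.0;

    return 1;
}
```

For an affine matrix the bottom row is [0, 0, 1], so z stays 1 and the
division can be skipped, which is what makes the affine variant cheaper.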
2013-01-27 20:49:43 +02:00
Siarhei Siamashka
a47ed2c311 Faster fetch for the C variant of r5g6b5 src/dest iterator
Processing two pixels at once is used to reduce the number of
arithmetic operations.
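
For reference, a sketch of the per-pixel conversion that such a fetcher
performs, using bit replication to widen each channel. It handles one pixel
at a time, so the two-pixels-at-once trick of this patch is not reproduced
here, and the function name is hypothetical:

```c
#include <stdint.h>

/* Expand one r5g6b5 pixel to x8r8g8b8 by replicating the top bits. */
static uint32_t
expand_0565_to_0888 (uint16_t s)
{
    uint32_t p = s;
    uint32_t b = ((p << 3) & 0xf8) | ((p >> 2) & 0x07);
    uint32_t g = ((p << 5) & 0xfc00) | ((p >> 1) & 0x300);
    uint32_t r = ((p << 8) & 0xf80000) | ((p << 3) & 0x70000);

    return r | g | b;
}
```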

The speedup relative to the generic fetch_scanline_r5g6b5() from
"pixman-access.c" (pixman was compiled with gcc 4.7.2):

    MIPS 74K        480MHz  :  20.32 MPix/s ->  26.47 MPix/s
    ARM11           700MHz  :  34.95 MPix/s ->  38.22 MPix/s
    ARM Cortex-A8  1000MHz  :  87.44 MPix/s -> 100.92 MPix/s
    ARM Cortex-A9  1700MHz  : 150.95 MPix/s -> 158.13 MPix/s
    ARM Cortex-A15 1700MHz  : 148.91 MPix/s -> 155.42 MPix/s
    IBM Cell PPU   3200MHz  :  75.29 MPix/s ->  98.33 MPix/s
    Intel Core i7  2800MHz  : 257.02 MPix/s -> 376.93 MPix/s

That's the performance for C code (SIMD and assembly optimizations
are disabled via PIXMAN_DISABLE environment variable).
2013-01-27 20:48:31 +02:00
Siarhei Siamashka
e66fd5ccb6 Faster write-back for the C variant of r5g6b5 dest iterator
Unrolling loops improves performance, so just use it here.

Also GCC can't properly optimize this code for RISC processors and
allocate 0x1F001F constant in a register. Because this constant is
too large to be represented as an immediate operand in instructions,
GCC inserts some redundant arithmetic. This problem can be worked around
by explicitly using a variable for 0x1F001F constant and also initializing
it by a read from another volatile variable. In this case GCC is forced
to allocate a register for it, because it is not seen as a constant anymore.
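
A sketch of the workaround under stated assumptions: the names
mask_source and store_0565_sketch() are hypothetical, the loop is not
unrolled, and the code is an illustration rather than the patch itself:

```c
#include <stdint.h>

/* Reading through a volatile object hides the value from the optimizer. */
static volatile const uint32_t mask_source = 0x1F001F;

/* Convert n a8r8g8b8 pixels to r5g6b5 and store them in dst. */
static void
store_0565_sketch (uint16_t *dst, const uint32_t *src, int n)
{
    /* Loaded once, so the compiler keeps the mask in a register. */
    uint32_t x1F001F = mask_source;
    int i;

    for (i = 0; i < n; i++)
    {
        uint32_t s = src[i];
        uint32_t a = (s >> 3) & x1F001F;   /* r5 at bit 16, b5 at bit 0 */
        uint32_t b = s & 0xFC00;           /* g6 at bit 10 */

        a |= a >> 5;                       /* move r5 down to bit 11 */
        a |= b >> 5;                       /* move g6 down to bit 5 */
        dst[i] = (uint16_t) a;
    }
}
```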

The speedup relative to the generic store_scanline_r5g6b5() from
"pixman-access.c" (pixman was compiled with gcc 4.7.2):

    MIPS 74K        480MHz  :  33.22 MPix/s ->  43.42 MPix/s
    ARM11           700MHz  :  50.16 MPix/s ->  78.23 MPix/s
    ARM Cortex-A8  1000MHz  : 117.75 MPix/s -> 196.34 MPix/s
    ARM Cortex-A9  1700MHz  : 177.04 MPix/s -> 320.32 MPix/s
    ARM Cortex-A15 1700MHz  : 231.44 MPix/s -> 261.64 MPix/s
    IBM Cell PPU   3200MHz  : 130.25 MPix/s -> 145.61 MPix/s
    Intel Core i7  2800MHz  : 502.21 MPix/s -> 721.73 MPix/s

That's the performance for C code (SIMD and assembly optimizations
are disabled via PIXMAN_DISABLE environment variable).
2013-01-27 20:48:26 +02:00
Siarhei Siamashka
a9f6669416 Added C variants of r5g6b5 fetch/write-back iterators
Adding specialized iterators for r5g6b5 color format allows us to work
on fine tuning performance of r5g6b5 fetch/write-back operations in the
pixman general "fetch -> combine -> store" pipeline.

These iterators also make "src_x888_0565" fast path redundant, so it can
be removed.
2013-01-27 20:48:22 +02:00
Chris Wilson
794033ed43 Eliminate duplicate copies of channel flags for pixman_image_composite32()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:16 +00:00
Chris Wilson
a59f081df4 Always return a valid function from lookup_combiner()
We should always have at least a C combiner available, so we never
expect the search to fail. If it does, emit an error and return a
dummy function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:16 +00:00
Chris Wilson
520230914b Always return a valid function from lookup_composite()
We never expect to fail to find the appropriate function as the
general_composite_rect should always match. So if somehow we fall through
the search, emit a _pixman_log_error() and return a dummy function.

Note that we remove some conditionals and a level of indentation hence a
large amount of code movement. This also reveals that in a few places we
are duplicating stack variables that can be eliminated later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:15 +00:00
Chris Wilson
b283c864a3 sse2: Add fast paths for bilinear source with a solid mask
Based on the existing sse2_8888_n_8888 nearest scaling routines.

fishbowl on an i5-2500: 60.9s -> 56.9s

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:15 +00:00
Chris Wilson
d00ce40912 sse2: Add a fast path for add_n_8_8888
This path is being exercised by compositing of trapezoids for clipmasks, for
instance as used in the firefox-asteroids cairo-trace.

IVB i7-3720qm ./tests/lowlevel-blt-bench add_n_8_8888:

reference memcpy speed = 14846.7MB/s (3711.7MP/s for 32bpp fills)

before: L1: 681.10  L2: 735.14  M:701.44 ( 28.35%)  HT:283.32  VT:213.23  R:208.93  RT: 77.89 ( 793Kops/s)

after:  L1: 992.91  L2:1017.33  M:982.58 ( 39.88%)  HT:458.93  VT:332.32  R:326.13  RT:136.66 (1287Kops/s)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:15 +00:00
Chris Wilson
7ced3beec9 sse2: Add a fast path for add_n_8888
This path is being exercised by inplace compositing of trapezoids, for
instance as used in the firefox-asteroids cairo-trace.

IVB i7-3720qm ./tests/lowlevel-blt-bench add_n_8888:

reference memcpy speed = 14918.3MB/s (3729.6MP/s for 32bpp fills)

before: L1:1752.44  L2:2259.48  M:2215.73 ( 58.80%)  HT:589.49   VT:404.04   R:424.69  RT:134.68 (1182Kops/s)

after:  L1:3931.21  L2:6132.78  M:3440.17 ( 92.24%)  HT:1337.70  VT:1357.64  R:1270.27  RT:359.78 (2161Kops/s)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 14:04:15 +00:00
Jeff Muizelaar
b7f523e3bc Add a version of bilinear_interpolation for precision <=4
Having 4 or fewer bits means we can do two components at
a time in a single 32 bit register.

Here are the results for firefox-fishtank on a Pandaboard with
4.6.3 and PIXMAN_DISABLE="arm-neon"

Before:
[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image           t-firefox-fishtank    7.841    7.910   0.70%    6/6

After:
[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image           t-firefox-fishtank    6.951    6.995   1.11%    6/6
2013-01-25 13:14:37 -05:00
Ben Avison
24e83cae64 Tweaks to lowlevel-blt-bench
This adds two extra tests, src_n_8 and src_8_8, which I have been
using to benchmark my ARMv6 changes.

I'd also like to propose that it requires an exact test name as the
executable's argument, as achieved by this strstr to strcmp change.
Without this, it is impossible to only benchmark (for example)
add_8_8, add_n_8 or src_n_8, due to those also being substrings of
many other test names.
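
The difference between the two matching rules can be sketched as follows.
The helper names are hypothetical and do not come from lowlevel-blt-bench:

```c
#include <string.h>

/* Old behaviour: run a test if the argument occurs anywhere in its name. */
static int
matches_substring (const char *test_name, const char *arg)
{
    return strstr (test_name, arg) != NULL;
}

/* New behaviour: run a test only if the argument is exactly its name. */
static int
matches_exact (const char *test_name, const char *arg)
{
    return strcmp (test_name, arg) == 0;
}
```

With substring matching, asking for add_n_8 also selects add_n_8888; with
exact matching it selects only add_n_8.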
2013-01-25 11:13:07 -05:00
Søren Sandmann Pedersen
b527a0e615 test: Use operator_name() and format_name() in composite.c
With the operator_name() and format_name() functions there is no
longer any reason for composite.c to have its own table of format and
operator names.
2013-01-23 12:24:31 -05:00
Søren Sandmann Pedersen
4eb9a24aba utils.[ch]: Add new format_name() function
This function returns the name of the given format code, which is
useful for printing out debug information. The function is written as
a switch without a default value so that the compiler will warn if new
formats are added in the future. The fake formats used in the fast
path tables are also recognized.

The function is used in alpha_map.c, where it replaces an existing
format_name() function, and in blitters-test.c, affine-test.c, and
scaling-test.c.
2013-01-23 12:24:31 -05:00
Søren Sandmann Pedersen
1676b49389 test/utils.[ch]: Add new function operator_name()
This function returns the name of the given operator, which is useful
for printing out debug information. The function is done as a switch
without a default value so that the compiler will warn if new
operators are added in the future.
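
A sketch of the pattern with a hypothetical three-value enum. The real
function covers the pixman operator enum, and the strings returned here are
assumptions:

```c
/* Hypothetical operator enum standing in for the real one. */
typedef enum
{
    SKETCH_OP_CLEAR,
    SKETCH_OP_SRC,
    SKETCH_OP_OVER
} sketch_op_t;

static const char *
sketch_operator_name (sketch_op_t op)
{
    /*
     * No default label: with GCC's -Wswitch (part of -Wall), adding a
     * new enumerator without a matching case produces a warning here.
     */
    switch (op)
    {
    case SKETCH_OP_CLEAR: return "CLEAR";
    case SKETCH_OP_SRC:   return "SRC";
    case SKETCH_OP_OVER:  return "OVER";
    }

    /* Reached only for values outside the enum. */
    return "<unknown operator>";
}
```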

The function is used in affine-test.c, scaling-test.c, and
blitters-test.c.
2013-01-23 12:24:31 -05:00
Søren Sandmann Pedersen
8d85311143 README: Add guidelines on how to contribute patches
Ben Avison pointed out here:

   http://lists.freedesktop.org/archives/pixman/2013-January/002485.html

that there isn't really any documentation about how to submit patches
to pixman. This patch adds some information to the README file.

v2: Incorporate some comments from Ben Avison
v3: Change gitweb URL to cgit
2013-01-23 12:22:40 -05:00
Matt Turner
61dacffaf4 Convert INCLUDES to AM_CPPFLAGS
INCLUDES has been deprecated starting with automake 1.13. Replace all
occurrences with the recommended AM_CPPFLAGS.
2013-01-22 22:08:30 -08:00
Matt Turner
c7c28f440d Add new demos and tests to .gitignore 2013-01-22 22:08:30 -08:00
Nemanja Lukic
2c6577476e MIPS: DSPr2: Added more fast-paths:
- over_reverse_n_8888
- in_n_8_8

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_reverse_n_8888 =  L1:  19.42  L2:  19.07  M: 15.38 ( 40.80%)  HT: 13.35  VT: 13.10  R: 12.92  RT:  8.27 (  49Kops/s)
                   in_n_8_8 =  L1:  21.20  L2:  22.86  M: 21.42 ( 14.21%)  HT: 15.97  VT: 15.69  R: 15.47  RT:  8.00 (  48Kops/s)

Optimized:
        over_reverse_n_8888 =  L1:  60.09  L2:  47.87  M: 28.65 ( 76.02%)  HT: 23.58  VT: 22.51  R: 21.99  RT: 12.28 (  60Kops/s)
                   in_n_8_8 =  L1:  89.38  L2:  86.07  M: 65.48 ( 43.44%)  HT: 44.64  VT: 41.50  R: 40.77  RT: 16.94 (  66Kops/s)
2013-01-22 03:12:59 +01:00
Nemanja Lukic
a67b0e24d7 MIPS: DSPr2: Added more fast-paths for REVERSE operation:
- out_reverse_8_0565
- out_reverse_8_8888

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        out_reverse_8_0565 =  L1:  14.29  L2:  13.58  M: 12.14 ( 24.16%)  HT:  9.23  VT:  9.12  R:  8.84  RT:  4.75 (  36Kops/s)
        out_reverse_8_8888 =  L1:  27.46  L2:  23.24  M: 17.41 ( 57.73%)  HT: 12.61  VT: 12.47  R: 11.79  RT:  5.86 (  41Kops/s)

Optimized:
        out_reverse_8_0565 =  L1:  28.24  L2:  25.64  M: 20.63 ( 41.05%)  HT: 16.69  VT: 16.14  R: 15.50  RT:  8.69 (  52Kops/s)
        out_reverse_8_8888 =  L1:  52.78  L2:  41.44  M: 23.50 ( 77.94%)  HT: 18.79  VT: 18.16  R: 16.90  RT:  9.11 (  53Kops/s)
2013-01-22 03:10:31 +01:00
Maarten Lankhorst
01c2431ef8 Add 00-unexport-symbol.diff
* Add 00-unexport-symbol.diff
  - remove test-only use of _pixman_internal_only_get_implementation
  - zap the only test requiring the use of this symbol
2013-01-08 18:16:23 +01:00
Maarten Lankhorst
d6b69d4f63 update symbols file and add lintian override for hidden symbol 2013-01-08 17:10:12 +01:00
Maarten Lankhorst
0f8c56fe52 new upstream release 2013-01-08 16:12:25 +01:00
Maarten Lankhorst
818af795d4 pixman 0.28.2 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iEYEABECAAYFAlDF0BkACgkQmxfmIW/3waiEegCcCVDzXL2gGouDGCBqJVOmzUcv
 ZnMAoI50IhP5KXKKEEx2dJlfFkzKVo5N
 =J62R
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.28.2' into debian-experimental

pixman 0.28.2 release
2013-01-08 16:10:57 +01:00
Søren Sandmann Pedersen
35cc965514 pixman-filter.c: Cope with NULL returns from malloc()
v2: Don't return a pointer to uninitialized memory when the allocation
of horz and vert fails, but allocation of params doesn't.
2013-01-06 17:38:23 -05:00
Søren Sandmann Pedersen
58526cfc72 Handle solid images in the noop iterator
The noop src iterator already has code to handle solid images, but
that code never actually runs currently because it is not possible for
an image to have both a format code of PIXMAN_solid and a flag of
FAST_PATH_BITS_IMAGE.

If these two were to be set at the same time, the
fast_composite_tiled_repeat() fast path would trigger for solid images
(because it triggers for PIXMAN_any formats, which includes
PIXMAN_solid), but for solid images we can usually do better than that
fast path.

So this patch removes _pixman_solid_fill_iter_init() and instead
handles such images (along with repeating 1x1 bits images without an
alpha map) in pixman-noop.c.

When a 1x1R image is involved in the general composite path, before
this patch, it would hit this code in repeat() in pixman-inlines.h:

        while (*c >= size)
            *c -= size;
        while (*c < 0)
            *c += size;

and those loops could run for a huge number of iterations (proportional
to the composite width). For such cases, the performance improvement
is really big:

./test/lowlevel-blt-bench -n add_n_8888:

Before:

    add_n_8888 =  L1:   3.86  L2:   3.78  M:  1.40 (  0.06%)  HT:  1.43  VT:  1.41  R:  1.41  RT:  1.38 (  19Kops/s)

After:

    add_n_8888 =  L1:1236.86  L2:2468.49  M:1097.88 ( 49.04%)  HT:476.49  VT:429.05  R:417.04  RT:155.12 ( 817Kops/s)
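
A minimal sketch of why the quoted loops get expensive. The function names here are hypothetical and the remainder-based variant is only an illustration; the patch itself avoids the problem by handling such images in pixman-noop.c rather than by rewriting repeat().

```c
#include <assert.h>

/* Wrap a coordinate into [0, size) by repeated add/subtract, as in the
 * loops quoted above. The cost grows with |c| / size, so a size of 1
 * costs one iteration per pixel of distance. */
int repeat_by_loops(int c, int size)
{
    assert(size > 0);
    while (c >= size)
        c -= size;
    while (c < 0)
        c += size;
    return c;
}

/* Hypothetical constant-time equivalent using the remainder operator. */
int repeat_by_modulo(int c, int size)
{
    int r;

    assert(size > 0);
    r = c % size;               /* C99: the result has the sign of c */
    return r < 0 ? r + size : r;
}
```

Both functions return the same value for every input; only the running time differs.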
2013-01-06 17:30:12 -05:00
Marko Lindqvist
480dd38fd1 Fix build with automake-1.13
Automake-1.13 has removed long obsolete AM_CONFIG_HEADER macro (
http://lists.gnu.org/archive/html/automake/2012-12/msg00038.html )
and autoreconf errors out upon seeing it.

Attached patch replaces obsolete AM_CONFIG_HEADER with now proper
AC_CONFIG_HEADERS.
2013-01-04 01:54:10 +02:00
Siarhei Siamashka
1abde88ae6 Use more appropriate types and remove a magic constant 2013-01-04 01:27:06 +02:00
Siarhei Siamashka
c1fd5a4243 Define SIZE_MAX if it is not provided by the standard C headers
C++ compilers do not define SIZE_MAX. It is also not available
if the code is compiled by some C compilers:
    http://lists.freedesktop.org/archives/pixman/2012-August/002196.html
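
A sketch of the kind of fallback definition described above, assuming the usual idiom of casting -1 to size_t. The helper mul_overflows_size() is hypothetical and only shows a typical use of the constant.

```c
#include <stddef.h>
#include <stdint.h>

/* Fallback for toolchains whose headers do not provide SIZE_MAX. */
#ifndef SIZE_MAX
#define SIZE_MAX ((size_t)-1)
#endif

/* Hypothetical helper: returns 1 if count * elem_size would overflow
 * size_t, 0 otherwise. */
int mul_overflows_size(size_t count, size_t elem_size)
{
    return elem_size != 0 && count > SIZE_MAX / elem_size;
}
```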
2013-01-04 01:26:55 +02:00
Siarhei Siamashka
66c4292822 Rename 'xor' variable to 'filler' (because 'xor' is a C++ keyword) 2012-12-20 03:14:21 +02:00
Søren Sandmann Pedersen
4dfda2adfe float-combiner.c: Change tests for x == 0.0 to -FLT_MIN < x < FLT_MIN
pixman-float-combiner.c currently uses checks like these:

    if (x == 0.0f)
        ...
    else
        ... / x;

to prevent division by 0. In theory this is correct: a division-by-zero
exception is only supposed to happen when the floating point divisor is
exactly equal to a positive or negative zero.

However, in practice, the combination of x87 and gcc optimizations
causes issues. The x87 registers are 80 bits wide, which means the
initial test:

	if (x == 0.0f)

may be false when x is an 80 bit floating point number, but when x is
rounded to a 32 bit single precision number, it becomes equal to
0.0. In principle, gcc should compensate for this quirk of x87, and
there are some options such as -ffloat-store, -fexcess-precision=standard,
and -std=c99 that will make it do so, but these all have a performance
cost.  It is also possible to set the FPU to a mode that makes it do
all computation with single or double precision, but that would
require pixman to save the existing mode before doing anything with
floating point and restore it afterwards.

Instead, this patch side-steps the issue by replacing exact checks for
equality with zero with a new macro that checks whether the value is
between -FLT_MIN and FLT_MIN.

There is extensive reading material about this issue linked off the
infamous gcc bug 323:

    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
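
A minimal sketch of such a range check. The macro name IS_ZERO and the helper safe_div() are assumptions made for illustration; the message above only says that a new macro is introduced.

```c
#include <float.h>

/* True when f lies strictly between -FLT_MIN and FLT_MIN, i.e. when it
 * is zero or so small that it may round to zero once stored as a
 * 32-bit float. */
#define IS_ZERO(f) (-FLT_MIN < (f) && (f) < FLT_MIN)

/* Hypothetical helper: a / b, or `fallback` when b is effectively zero. */
float safe_div(float a, float b, float fallback)
{
    if (IS_ZERO(b))
        return fallback;
    return a / b;
}
```

The range test still rejects exact zeros of either sign, so it is a strict superset of the old x == 0.0f check.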
2012-12-19 13:49:32 -05:00
Siarhei Siamashka
2734071d7b ARM: make use of UQADD8 instruction even in generic C code paths
ARMv6 has UQADD8 instruction, which implements unsigned saturated
addition for 8-bit values packed in 32-bit registers. It is very useful
for UN8x4_ADD_UN8x4, UN8_rb_ADD_UN8_rb and ADD_UN8 macros (which would
otherwise need a lot of arithmetic operations to simulate this operation).
Since most of the major ARM linux distros are built for ARMv7, we are
much less dependent on runtime CPU detection and can get practical
benefits from conditional compilation here for a lot of users.

The results of cairo-perf-trace benchmark on ARM Cortex-A15 with pixman
compiled by gcc 4.7.2 and PIXMAN_DISABLE set to "arm-simd arm-neon":

Speedups
========
image    firefox-talos-gfx  (29938.22 0.12%) ->  (27814.76 0.51%) : 1.08x speedup
image    firefox-asteroids  (23241.11 0.07%) ->  (21795.19 0.07%) : 1.07x speedup
image firefox-canvas-alpha (174519.85 0.08%) -> (164788.64 0.20%) : 1.06x speedup
image              poppler   (9464.46 1.61%) ->   (8991.53 0.14%) : 1.05x speedup
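
For reference, a portable C sketch of what a per-byte unsigned saturating add computes. The function name uqadd8_c is hypothetical; this is the behaviour the generic macros have to emulate when the instruction is not available, not the code used in pixman.

```c
#include <stdint.h>

/* Add the four 8-bit lanes of x and y independently, clamping each
 * lane to 0xFF instead of letting it wrap around. */
uint32_t uqadd8_c(uint32_t x, uint32_t y)
{
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8) {
        uint32_t sum = ((x >> shift) & 0xFF) + ((y >> shift) & 0xFF);
        if (sum > 0xFF)
            sum = 0xFF;         /* saturate this lane */
        result |= sum << shift;
    }
    return result;
}
```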
2012-12-18 20:49:58 +02:00
Siarhei Siamashka
f9a41703b2 Faster conversion from a8r8g8b8 to r5g6b5 in C code
This change reduces 3 shifts, 3 ANDs and 2 ORs (total 8 arithmetic
operations) to 3 shifts, 2 ANDs and 2 ORs (total 7 arithmetic
operations).

We get garbage in the high 16 bits of the result, which might need
to be cleared when casting to uint16_t (bringing us back to a total
of 8 arithmetic operations). However, if the result of the
a8r8g8b8->r5g6b5 conversion is immediately stored to memory, no
extra instructions for clearing these garbage bits are needed.

This allows the a8r8g8b8->r5g6b5 conversion code to be compiled
into 4 instructions for ARM instead of 5 (assuming a good optimizing
compiler), which has no pipeline stalls on ARM11 as an additional
bonus.

The change in benchmark results for 'lowlevel-blt-bench src_8888_0565'
with PIXMAN_DISABLE="arm-simd arm-neon mips-dspr2 mmx sse2" and pixman
compiled by gcc-4.7.2:

    MIPS 74K        480MHz  :  40.44 MPix/s ->  40.13 MPix/s
    ARM11           700MHz  :  50.28 MPix/s ->  62.85 MPix/s
    ARM Cortex-A8  1000MHz  : 124.38 MPix/s -> 141.85 MPix/s
    ARM Cortex-A15 1700MHz  : 281.07 MPix/s -> 303.29 MPix/s
    Intel Core i7  2800MHz  : 515.92 MPix/s -> 531.16 MPix/s

The same trick was used in xomap (X server for Nokia N800/N810):
    http://repository.maemo.org/pool/diablo/free/x/xorg-server/
    xorg-server_1.3.99.0~git20070321-0osso20083801.tar.gz
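
A sketch of one way to reach the operation counts given above. The function names are illustrative and the exact expression used in the patch may differ; the second function is simply one arrangement with 3 shifts, 2 ANDs and 2 ORs that leaves garbage only in the high 16 bits.

```c
#include <stdint.h>

/* Straightforward version: 3 shifts, 3 ANDs, 2 ORs. */
uint16_t convert_8888_to_0565_ref(uint32_t s)
{
    return (uint16_t)(((s >> 8) & 0xF800) |
                      ((s >> 5) & 0x07E0) |
                      ((s >> 3) & 0x001F));
}

/* Reduced version: 3 shifts, 2 ANDs, 2 ORs. */
uint16_t convert_8888_to_0565_fast(uint32_t s)
{
    uint32_t a = (s >> 3) & 0x1F001F;   /* red in bits 16-20, blue in 0-4 */
    uint32_t b = s & 0xFC00;            /* green in bits 10-15 */

    a |= a >> 5;                        /* copy red down to bits 11-15 */
    a |= b >> 5;                        /* move green to bits 5-10 */
    return (uint16_t)a;                 /* drops the garbage in bits 16-20 */
}
```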
2012-12-18 20:45:57 +02:00
Siarhei Siamashka
3922e90c40 Change CONVERT_XXXX_TO_YYYY macros into inline functions
It is easier and safer to modify their code if the calculations
need some temporary variables. And the temporary variables will be
needed soon.
2012-12-18 20:45:47 +02:00
Siarhei Siamashka
e4519360c1 test: add "src_0565_8888" to lowlevel-blt-bench 2012-12-18 20:43:51 +02:00
Søren Sandmann Pedersen
6a6c8c51ed pixman_composite_trapezoids(): Check for NULL return from create_bits()
A check is needed that the creation of the temporary image in
pixman_composite_trapezoids() succeeds.

Fixes crash in stress-test -s 0x313c on my system.
2012-12-13 16:13:11 -05:00
Søren Sandmann Pedersen
c2cb303d33 pixman_composite_trapezoids: Return early if mask_format is not of TYPE_ALPHA
stress-test -s 0x17ee crashes because pixman_composite_trapezoids() is
given a mask_format of PIXMAN_c8, which causes it to create a
temporary image with that format but without a palette. This causes
crashes later.

The only mask formats that we actually support are those of TYPE_ALPHA,
so this patch adds a return_if_fail() to ensure this.

Similarly, although currently it won't crash if given an invalid
format, alpha-only formats have always been the only thing that made
sense for the pixman_rasterize_edges() functions, so add a
return_if_fail() ensuring that the destination format is of type
PIXMAN_TYPE_ALPHA.
2012-12-13 16:10:41 -05:00
Søren Sandmann Pedersen
1f0c02811e Add testing of trapezoids to stress-test
The entry points add_trapezoids(), rasterize_trapezoid() and
composite_trapezoid() are exercised with random trapezoids.

This uncovers crashes with stress-test seeds 0x17ee and 0x313c.
2012-12-13 15:59:18 -05:00
Søren Sandmann Pedersen
526dc06e56 demos/radial-test: Add checkerboard to display the alpha channel 2012-12-11 09:05:58 -05:00
Søren Sandmann Pedersen
6402b2aa0c demos/conical-test: Use the draw_checkerboard() utility function
Instead of having its own copy.
2012-12-11 09:05:58 -05:00
Søren Sandmann Pedersen
e382e52d67 test/utils.[ch]: Add utility function to draw a checkerboard
This is useful in demo programs to display the alpha channel.
2012-12-11 09:05:58 -05:00
Søren Sandmann Pedersen
b0a6504122 radial: When comparing t to mindr, use >= rather than >
Radial gradients are conceptually rendered as a sequence of circles
generated by linearly extrapolating from the two circles given by the
gradient specification. Any circles in that sequence that would end up
with a negative radius are not drawn, a condition that is enforced by
checking that t * dr is bigger than mindr:

     if (t * dr > mindr)

However, it is legitimate for a circle to have radius exactly 0, so
the test should use >= rather than >.

This gets rid of the dots in demos/radial-test except for when the c2
circle has radius 0 and a repeat mode of either NONE or NORMAL. Both
those dots correspond to a t value of 1.0, which is outside the
defined interval of [0.0, 1.0) and therefore subject to the repeat
algorithm. As a result, in the NONE case, a value of 1.0 turns into
transparent black. In the NORMAL case, 1.0 wraps around and becomes
0.0 which is red, unlike 0.99 which is blue.

Cc: ranma42@gmail.com
2012-12-11 09:05:38 -05:00
Søren Sandmann Pedersen
54aca22058 demos/radial-test: Add zero-radius circles to demonstrate rendering bugs
Add two new gradient columns, one where the start circle has radius
0 and one where the end circle has radius 0. All the new gradients
except for one are rendered with a bright dot in the middle. In most
but not all cases this is incorrect.

Cc: ranma42@gmail.com
2012-12-11 08:20:45 -05:00
Siarhei Siamashka
fdab3c1b6c test: Workaround unaligned MOVDQA bug (http://gcc.gnu.org/PR55614)
Just use SSE2 intrinsics to do unaligned memory accesses as
a workaround for this gcc bug related to vector extensions.
2012-12-10 20:05:15 +02:00
Siarhei Siamashka
2bc59006d7 Improve performance of combine_over_u
The generic C over_u combiner can be a lot faster with the
addition of special shortcuts for 0xFF and 0x00 alpha/mask
values. This is already implemented in C and SSE2 fast paths.

Profiling the run of cairo-perf-trace benchmarks with PIXMAN_DISABLE
environment variable set to "fast mmx sse2" on Intel Core i7:

=== before ===

37.32%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_over_u
21.37%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_no_repeat_8888
13.51%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_none_a8r8g8b8
 2.96%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] radial_compute_color
 2.74%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] fetch_scanline_a8
 2.71%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] fetch_scanline_x8r8g8b8
 2.17%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] _pixman_gradient_walker_pixel
 1.86%  cairo-perf-trac  libcairo.so.2.11200.0 [.] _cairo_tor_scan_converter_generate
 1.57%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_pad_a8r8g8b8
 0.97%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_in_reverse_u
 0.96%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_over_ca

=== after ===

28.79%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_no_repeat_8888
18.44%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_none_a8r8g8b8
15.54%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_over_u
 3.94%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] radial_compute_color
 3.69%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] fetch_scanline_a8
 3.69%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] fetch_scanline_x8r8g8b8
 2.94%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] _pixman_gradient_walker_pixel
 2.52%  cairo-perf-trac  libcairo.so.2.11200.0 [.] _cairo_tor_scan_converter_generate
 2.08%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_pad_a8r8g8b8
 1.31%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_in_reverse_u
 1.29%  cairo-perf-trac  libpixman-1.so.0.29.1 [.] combine_over_ca
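
A single-pixel sketch of the shortcuts described above, assuming premultiplied a8r8g8b8 pixels and no mask. The names mul_un8 and over_pixel are hypothetical; the real combiner works on whole scanlines and uses pixman's own packed-arithmetic macros.

```c
#include <stdint.h>

/* Multiply two 8-bit values treated as fractions of 255, with rounding. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* OVER for one pixel: dst = src + dst * (1 - src_alpha). */
uint32_t over_pixel(uint32_t src, uint32_t dst)
{
    uint8_t alpha = (uint8_t)(src >> 24);
    uint8_t inv_alpha = (uint8_t)(255 - alpha);
    uint32_t result = 0;
    int shift;

    if (alpha == 0xFF)          /* opaque source replaces the destination */
        return src;
    if (src == 0)               /* transparent source leaves it untouched */
        return dst;

    for (shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xFF;
        uint32_t d = mul_un8((uint8_t)((dst >> shift) & 0xFF), inv_alpha);
        uint32_t sum = s + d;
        if (sum > 0xFF)
            sum = 0xFF;
        result |= sum << shift;
    }
    return result;
}
```

The two early returns skip all per-channel arithmetic, which is where the saving comes from when a trace contains many fully opaque or fully transparent pixels.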
2012-12-10 20:02:08 +02:00
Søren Sandmann Pedersen
a5e5179b56 Pre-release version bump to 0.28.2 2012-12-10 06:46:36 -05:00
Benjamin Gilbert
6e270a7968 Fix thread safety on mingw-w64 and clang
After finding a working TLS storage class specifier, configure was
continuing to test other candidates.  This caused it to prefer
__declspec(thread) over __thread.  However, __declspec(thread) is
ignored with a warning by mingw-w64 [1] and silently ignored by clang [2].
The resulting binary behaved as if PIXMAN_NO_TLS was defined.

Bug introduced by a069da6c.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=57591
[2] http://lists.freedesktop.org/archives/pixman/2012-October/002320.html
2012-12-10 06:46:36 -05:00
Stefan Weil
d91f550b2a Always use xmmintrin.h for 64 bit Windows
MinGW-w64 uses the GNU compiler and does not define _MSC_VER.
Nevertheless, it provides xmmintrin.h and must be handled
here like the MS compiler. Otherwise compilation fails due to
conflicting declarations.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-12-10 06:46:36 -05:00
Joshua Root
2092aa0d92 Fix undeclared variable use and sysctlbyname error handling on ppc
Fixes bug 56889.
2012-12-10 06:46:36 -05:00
Søren Sandmann Pedersen
9029026edd Post-release version bump to 0.28.1 2012-12-10 06:46:36 -05:00
Søren Sandmann Pedersen
8ca4e14472 Add fast paths for separable convolution
Similar to the fast paths for general affine access, add some fast
paths for the separable filter for all combinations of formats
x8r8g8b8, a8r8g8b8, r5g6b5, a8 with the four repeat modes.

It is easy to see the speedup in the demos/scale program.
2012-12-08 12:38:58 -05:00
Søren Sandmann Pedersen
4f18ba30ce Add demo program for conical gradients
This new test is derived from radial-test.c and displays conical
gradients at various angles.

It also demonstrates how PIXMAN_REPEAT_NORMAL is supposed to work when
used with a gradient specification where the first stop is not at 0.0:
In this case the gradient is supposed to have a smooth transition from
the last stop back to the first stop with no sharp transitions. It
also shows that the repeat mode is not ignored for conical gradients
as one might be tempted to think.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
3a98787bdd Add demos/zone_plate.png
The zone plate image is a useful test case for image scalers because
it contains all representable frequencies, so any imperfection in
resampling filters will show up as Moire patterns.

This version is symmetric around the midpoint of the image, so since
rotating it is supposed to be a noop, it can also be used to verify
that the resampling filters don't shift the image.

V2: Run the file through OptiPNG to cut the size in half, as suggested
by Siarhei.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
97491ed26c demos: Add new demo program, "scale"
This program allows interactively scaling and rotating images
using various filters and repeat modes. It uses
pixman_filter_create_separable_convolution() to generate the filters.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
7f5bb22d17 demos/gtk-utils.[ch]: Add pixman_image_from_file()
This function uses GdkPixbuf to load various common formats such as
.png and .jpg into a pixman image.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
6915f3e24f Add new pixman_filter_create_separable_convolution() API
This new API is a helper function to create filter parameters suitable
for use with PIXMAN_FILTER_SEPARABLE_CONVOLUTION.

For each dimension, given a scale factor, reconstruction and sample
filter kernels, and a subsampling resolution, this function will
compute a convolution of the two kernels scaled appropriately, then
sample that convolution and return the resulting vectors in a form
suitable for being used as parameters to
PIXMAN_FILTER_SEPARABLE_CONVOLUTION.

The filter kernels offered are the following:

  - IMPULSE:            Dirac delta function, ie., point sampling
  - BOX:                Box filter
  - LINEAR:             Linear filter, aka. "Tent" filter
  - CUBIC:              Cubic filter, currently Mitchell-Netravali
  - GAUSSIAN:           Gaussian function, sigma=1, support=3*sigma
  - LANCZOS2:           Two-lobed Lanczos filter
  - LANCZOS3:           Three-lobed Lanczos filter
  - LANCZOS3_STRETCHED: Three-lobed Lanczos filter, stretched by 4/3.0.
                        This is the "Nice" filter from Dirty Pixels by
                        Jim Blinn.

The intended way to use this function is to extract scaling factors
from the transformation and then pass those to this function to get a
filter suitable for compositing with that transformation. The filter
kernels can be chosen according to quality and performance tradeoffs.

To get equivalent quality to GdkPixbuf for downscalings, use BOX for
both reconstruction and sampling. For upscalings, use LINEAR for
reconstruction and IMPULSE for sampling (though note that for
upscaling in both X and Y directions, simply using
PIXMAN_FILTER_BILINEAR will likely be a better choice).
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
68760d3fe1 rounding.txt: Describe how SEPARABLE_CONVOLUTION filter works
Add some notes on how to compute the convolution matrices to be used
with the SEPARABLE_CONVOLUTION filter.
2012-12-08 10:50:51 -05:00
Søren Sandmann Pedersen
6fd480b17c Add new filter PIXMAN_FILTER_SEPARABLE_CONVOLUTION
This filter is a new way to use a convolution matrix for filtering. In
contrast to the existing CONVOLUTION filter, this new variant is
different in two respects:

- It is subsampled: Instead of just one convolution matrix, this
  filter chooses between a number of matrices based on the subpixel
  sample location, allowing the convolution kernel to be sampled at a
  higher resolution.

- It is separable: Each matrix is specified as the tensor product of
  two vectors. This has the advantages that many fewer values have to
  be stored, and that the filtering can be done separately in the x
  and y dimensions (although the initial implementation doesn't
  actually do that).

The motivation for this new filter is to improve image downsampling
quality. Currently, the best pixman can do is the regular convolution
filter which is limited to coarsely sampled convolution kernels.

With this new feature, any separable filter can be used at any desired
resolution.
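
A small sketch of the "separable" property, using doubles for readability (pixman itself stores fixed-point values and selects among several such vectors by subpixel phase). Both function names are hypothetical. The first expands two vectors into the full matrix they stand for; the second applies the same filter in two passes without ever building the matrix.

```c
#include <assert.h>

/* Expand a separable filter into its full 2D matrix:
 * matrix[y * width + x] = y_weights[y] * x_weights[x]. */
void separable_to_matrix(const double *x_weights, int width,
                         const double *y_weights, int height,
                         double *matrix)
{
    int x, y;

    assert(width > 0 && height > 0);
    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            matrix[y * width + x] = y_weights[y] * x_weights[x];
}

/* Apply the filter to a width x height block of samples: filter each
 * row with x_weights, then combine the row results with y_weights. */
double apply_separable(const double *pixels,
                       const double *x_weights, int width,
                       const double *y_weights, int height)
{
    double total = 0.0;
    int x, y;

    assert(width > 0 && height > 0);
    for (y = 0; y < height; y++) {
        double row = 0.0;
        for (x = 0; x < width; x++)
            row += pixels[y * width + x] * x_weights[x];
        total += row * y_weights[y];
    }
    return total;
}
```

Only width + height weights are stored, compared with width * height for an arbitrary convolution matrix.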
2012-12-08 10:50:51 -05:00
Benjamin Gilbert
7e39861da3 Fix thread safety on mingw-w64 and clang
After finding a working TLS storage class specifier, configure was
continuing to test other candidates.  This caused it to prefer
__declspec(thread) over __thread.  However, __declspec(thread) is
ignored with a warning by mingw-w64 [1] and silently ignored by clang [2].
The resulting binary behaved as if PIXMAN_NO_TLS was defined.

Bug introduced by a069da6c.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=57591
[2] http://lists.freedesktop.org/archives/pixman/2012-October/002320.html
2012-12-08 16:41:10 +02:00
Siarhei Siamashka
ebedd9a2ad test: Get rid of the obsolete 'prng_rand_N' and 'prng_rand_u32'
They are the same as 'prng_rand_n' and 'prng_rand'
2012-12-06 17:20:38 +02:00
Siarhei Siamashka
b31a696263 test: Switch to the new PRNG instead of old LCG
Wallclock time for running pixman "make check" (compile time not included):

----------------------------+----------------+-----------------------------+
                            | old PRNG (LCG) |   new PRNG (Bob Jenkins)    |
       Processor type       +----------------+------------+----------------+
                            |    gcc 4.5     |  gcc 4.5   | gcc 4.7 (simd) |
----------------------------+----------------+------------+----------------+
quad Intel Core i7  @2.8GHz |    0m49.494s   |  0m43.722s |    0m37.560s   |
dual ARM Cortex-A15 @1.7GHz |     5m8.465s   |  4m37.375s |    3m45.819s   |
     IBM Cell PPU   @3.2GHz |    23m0.821s   | 20m38.316s |   16m37.513s   |
----------------------------+----------------+------------+----------------+

But some tests got a particularly large boost. For example benchmarking and
profiling blitters-test on Core i7:

=== before ===

$ time ./blitters-test

real    0m10.907s
user    0m55.650s
sys     0m0.000s

  70.45%  blitters-test  blitters-test       [.] create_random_image
  15.81%  blitters-test  blitters-test       [.] compute_crc32_for_image_internal
   2.26%  blitters-test  blitters-test       [.] _pixman_implementation_lookup_composite
   1.07%  blitters-test  libc-2.15.so        [.] _int_free
   0.89%  blitters-test  libc-2.15.so        [.] malloc_consolidate
   0.87%  blitters-test  libc-2.15.so        [.] _int_malloc
   0.75%  blitters-test  blitters-test       [.] combine_conjoint_general_u
   0.61%  blitters-test  blitters-test       [.] combine_disjoint_general_u
   0.40%  blitters-test  blitters-test       [.] test_composite
   0.31%  blitters-test  libc-2.15.so        [.] _int_memalign
   0.31%  blitters-test  blitters-test       [.] _pixman_bits_image_setup_accessors
   0.28%  blitters-test  libc-2.15.so        [.] malloc

=== after ===

$ time ./blitters-test

real    0m3.655s
user    0m20.550s
sys     0m0.000s

  41.77%  blitters-test.n  blitters-test.new  [.] compute_crc32_for_image_internal
  15.77%  blitters-test.n  blitters-test.new  [.] prng_randmemset_r
   6.15%  blitters-test.n  blitters-test.new  [.] _pixman_implementation_lookup_composite
   3.09%  blitters-test.n  libc-2.15.so       [.] _int_free
   2.68%  blitters-test.n  libc-2.15.so       [.] malloc_consolidate
   2.39%  blitters-test.n  libc-2.15.so       [.] _int_malloc
   2.27%  blitters-test.n  blitters-test.new  [.] create_random_image
   2.22%  blitters-test.n  blitters-test.new  [.] combine_conjoint_general_u
   1.52%  blitters-test.n  blitters-test.new  [.] combine_disjoint_general_u
   1.40%  blitters-test.n  blitters-test.new  [.] test_composite
   1.02%  blitters-test.n  blitters-test.new  [.] prng_srand_r
   1.00%  blitters-test.n  blitters-test.new  [.] _pixman_image_validate
   0.96%  blitters-test.n  blitters-test.new  [.] _pixman_bits_image_setup_accessors
   0.90%  blitters-test.n  libc-2.15.so       [.] malloc
2012-12-06 17:20:35 +02:00
Siarhei Siamashka
309e66f047 test: Search/replace 'lcg_*' -> 'prng_*'
The 'lcg' prefix is going to be misleading if we replace the
PRNG algorithm.
2012-12-06 17:20:31 +02:00
Siarhei Siamashka
d6545a2fc6 test: Added a better PRNG (pseudorandom number generator)
This adds a fast SIMD-optimized variant of a small noncryptographic
PRNG originally developed by Bob Jenkins:
    http://www.burtleburtle.net/bob/rand/smallprng.html

The generated pseudorandom data is good enough to pass "Big Crush"
tests from TestU01 (http://en.wikipedia.org/wiki/TestU01).

SIMD code uses http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html
which is a GCC specific extension. There is also a slower alternative
code path, which should work with any C compiler.

The performance of filling a buffer with random data:
   Intel Core i7  @2.8GHz (SSE2)     : ~5.9 GB/s
   ARM Cortex-A15 @1.7GHz (NEON)     : ~2.2 GB/s
   IBM Cell PPU   @3.2GHz (Altivec)  : ~1.7 GB/s
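
A scalar sketch of the generator family referred to above, written from Bob Jenkins' published description of his small noncryptographic PRNG. The type and function names are hypothetical, and this is not the SIMD variant added by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t a, b, c, d; } smallprng_t;

/* Rotate a 32-bit value left by k bits, 0 < k < 32. */
static uint32_t rotl32(uint32_t x, int k)
{
    return (x << k) | (x >> (32 - k));
}

/* Advance the state and return the next 32-bit value. */
uint32_t smallprng_next(smallprng_t *x)
{
    uint32_t e;

    assert(x != NULL);
    e = x->a - rotl32(x->b, 27);
    x->a = x->b ^ rotl32(x->c, 17);
    x->b = x->c + x->d;
    x->c = x->d + e;
    x->d = e + x->a;
    return x->d;
}

/* Seed the state and discard the first 20 outputs to mix it. */
void smallprng_seed(smallprng_t *x, uint32_t seed)
{
    int i;

    assert(x != NULL);
    x->a = 0xf1ea5eed;
    x->b = x->c = x->d = seed;
    for (i = 0; i < 20; i++)
        (void)smallprng_next(x);
}
```

The state is only four 32-bit words and each step uses a handful of adds, XORs and rotates, which is what makes the design easy to run in several SIMD lanes at once.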
2012-12-06 17:20:27 +02:00
Siarhei Siamashka
41f98a07fc test: Change is_little_endian() into inline function
Also dropped redundant volatile keyword because any object
can be accessed via char* pointer without breaking aliasing
rules. The compilers are able to optimize this function to either
constant 0 or 1.
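
A sketch of what such an inline check can look like without volatile. The variable name is an assumption; the point is that reading an object through an unsigned char pointer is always permitted, so the compiler is free to fold the whole function to a constant.

```c
/* Returns 1 on little-endian targets, 0 on big-endian ones. */
static inline int is_little_endian(void)
{
    unsigned long endian_check_var = 1;

    /* Inspect the byte at the lowest address of the object. */
    return *(unsigned char *)&endian_check_var == 1;
}
```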
2012-12-06 17:20:23 +02:00
Cyril Brulebois
97a117ef1d New upstream release. 2012-11-27 14:00:27 +01:00
Cyril Brulebois
e33dbc6c69 Merge branch 'upstream-experimental' into debian-experimental 2012-11-27 13:59:51 +01:00
Søren Sandmann Pedersen
978bab253d Add text file rounding.txt describing how rounding works
It is not entirely obvious how pixman gets from "location in the
source image" to "pixel value stored in the destination". This file
describes how the filters work, and in particular how positions are
rounded to samples.
2012-11-22 01:16:54 -05:00
Søren Sandmann Pedersen
74319e9d39 Convolution filter: round color values instead of truncating
The pixel computed by the convolution filter should be rounded off,
not truncated. As a simple example consider a convolution matrix
consisting of five times 0x3333. If all five input pixels are
0xff, then the result of truncating will be

    (5 * 0x3333 * 255) >> 16 = 254

But the real value of the computation is (5 * 0x3333 / 65536.0) * 255
= 254.9961, so the error is almost 1. If the user isn't very careful
about normalizing the convolution kernel so that it sums to one in
fixed point, such error might cause solid images to change color, or
opaque images to become translucent.

The fix is simply to round instead of truncate.
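
The arithmetic in the example above can be checked with a short sketch. The function names are hypothetical, and the code models only the case of identical taps applied to identical pixels.

```c
#include <stdint.h>

/* n_taps identical 16.16 weights applied to the same 8-bit pixel value,
 * with the fractional bits simply discarded. */
uint32_t weighted_sum_truncate(uint32_t weight, uint32_t n_taps, uint32_t pixel)
{
    return (n_taps * weight * pixel) >> 16;
}

/* Same computation, adding half of the final unit (0x8000) before the
 * shift so the result is rounded to nearest. */
uint32_t weighted_sum_round(uint32_t weight, uint32_t n_taps, uint32_t pixel)
{
    return (n_taps * weight * pixel + 0x8000) >> 16;
}
```

With weight 0x3333, five taps and pixel 255, the truncating version yields 254 and the rounding version yields 255.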
2012-11-22 01:06:29 -05:00
Søren Sandmann Pedersen
f0816ddaf4 Round fixed-point multiplication
After two fixed-point numbers are multiplied, the result is shifted
into place, but up until now pixman has simply discarded the low-order
bits instead of rounding to the closest number.

Fix that by adding 0x8000 (or 0x2 in one place) before shifting and
update the test checksums to match.
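
A sketch of the difference for 16.16 fixed-point values. The type and function names are hypothetical, and only non-negative operands are considered here, since right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

typedef int32_t fixed_16_16_t;  /* 16 integer bits, 16 fractional bits */

/* Multiply and discard the low-order 16 bits of the 64-bit product. */
fixed_16_16_t fixed_mul_truncate(fixed_16_16_t a, fixed_16_16_t b)
{
    return (fixed_16_16_t)(((int64_t)a * b) >> 16);
}

/* Multiply and round to the closest representable value by adding
 * 0x8000 (half a unit in the last place) before shifting. */
fixed_16_16_t fixed_mul_round(fixed_16_16_t a, fixed_16_16_t b)
{
    return (fixed_16_16_t)(((int64_t)a * b + 0x8000) >> 16);
}
```

For example, 0.5 (0x8000) times the smallest positive value (0x1) truncates to 0 but rounds to 0x1.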
2012-11-20 03:23:51 -05:00
Stefan Weil
44dd746bb6 test: Fix compiler warnings caused by unused code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-11-14 18:02:14 -05:00
Stefan Weil
5f96022d3b pixman: Use uintptr_t in type casts from pointer to integral value
These modifications fix lots of compiler warnings for systems where
sizeof(unsigned long) != sizeof(void *).
This is especially true for MinGW-w64 (64 bit Windows).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-11-14 18:02:14 -05:00
Stefan Weil
a96efd02d6 Always use xmmintrin.h for 64 bit Windows
MinGW-w64 uses the GNU compiler and does not define _MSC_VER.
Nevertheless, it provides xmmintrin.h and must be handled
here like the MS compiler. Otherwise compilation fails due to
conflicting declarations.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-11-14 18:02:13 -05:00
Nemanja Lukic
899e0d6052 MIPS: DSPr2: Added several nearest neighbor fast paths with a8 mask:
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench -n

Referent (before):
        over_8888_8_0565 =  L1:   9.62  L2:   8.85  M:  7.40 ( 39.27%)  HT:  5.67  VT:  5.61  R:  5.45  RT:  2.98 (  22Kops/s)
        over_0565_8_0565 =  L1:   7.90  L2:   7.49  M:  6.72 ( 26.75%)  HT:  5.24  VT:  5.20  R:  5.06  RT:  2.90 (  22Kops/s)

Optimized:
        over_8888_8_0565 =  L1:  18.51  L2:  16.82  M: 12.13 ( 64.43%)  HT: 10.06  VT:  9.88  R:  9.54  RT:  5.63 (  31Kops/s)
        over_0565_8_0565 =  L1:  14.82  L2:  13.94  M: 11.34 ( 45.20%)  HT:  9.45  VT:  9.35  R:  9.03  RT:  5.50 (  31Kops/s)
2012-11-14 18:01:18 -05:00
Nemanja Lukic
a432bdce66 MIPS: DSPr2: Added more fast-paths for OVER operation:
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_n_0565 =  L1:  14.48  L2:  21.36  M: 17.57 ( 23.30%)  HT:  6.95  VT:  6.44  R:  6.39  RT:  2.16 (  22Kops/s)
        over_n_8888 =  L1:  92.60  L2:  86.13  M: 24.41 ( 64.74%)  HT:  8.94  VT:  8.06  R:  8.00  RT:  2.53 (  25Kops/s)

Optimized:
        over_n_0565 =  L1:  27.65  L2: 189.22  M: 58.19 ( 77.12%)  HT: 52.80  VT: 49.88  R: 47.53  RT: 23.67 (  72Kops/s)
        over_n_8888 =  L1: 235.99  L2: 230.86  M: 29.09 ( 77.11%)  HT: 27.95  VT: 27.24  R: 26.58  RT: 18.10 (  67Kops/s)
2012-11-14 18:01:18 -05:00
Nemanja Lukic
e33e9d3f55 MIPS: DSPr2: Added more fast-paths for SRC operation:
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        src_n_8_8888 =  L1:  13.79  L2:  22.47  M: 17.55 ( 58.28%)  HT:  6.95  VT:  6.46  R:  6.34  RT:  2.07 (  20Kops/s)
           src_n_8_8 =  L1:  20.22  L2:  20.21  M: 18.20 ( 24.17%)  HT:  6.65  VT:  6.22  R:  6.11  RT:  2.03 (  20Kops/s)

Optimized:
        src_n_8_8888 =  L1:  58.31  L2:  53.34  M: 25.69 ( 85.29%)  HT: 22.55  VT: 21.44  R: 19.91  RT: 10.34 (  48Kops/s)
           src_n_8_8 =  L1: 102.60  L2:  89.43  M: 65.01 ( 86.32%)  HT: 37.87  VT: 37.02  R: 32.43  RT: 12.41 (  51Kops/s)
2012-11-14 18:01:18 -05:00
Søren Sandmann Pedersen
d881e1f580 Allow src and dst to be identical in pixman_f_transform_invert()
It is useful to be able to invert a matrix in place, but currently
pixman_f_transform_invert() will produce wrong results if you pass the
same matrix as both source and destination.

Fix that by inverting into a temporary matrix and then copying that to
the destination.
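
A sketch of the invert-into-a-temporary pattern on a plain 3x3 matrix of doubles. The type mat3_t and the function mat3_invert() are hypothetical stand-ins, not the pixman types.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double m[3][3]; } mat3_t;

/* Invert src into dst. Returns 1 on success, 0 if src is singular.
 * src and dst may point to the same matrix, because every output
 * element is computed into tmp before dst is written. */
int mat3_invert(mat3_t *dst, const mat3_t *src)
{
    const double (*a)[3];
    mat3_t tmp;
    double det;

    assert(dst != NULL && src != NULL);
    a = src->m;
    det = a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
        - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
        + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]);
    if (det == 0.0)
        return 0;

    /* Transposed cofactor matrix divided by the determinant. */
    tmp.m[0][0] =  (a[1][1] * a[2][2] - a[1][2] * a[2][1]) / det;
    tmp.m[0][1] = -(a[0][1] * a[2][2] - a[0][2] * a[2][1]) / det;
    tmp.m[0][2] =  (a[0][1] * a[1][2] - a[0][2] * a[1][1]) / det;
    tmp.m[1][0] = -(a[1][0] * a[2][2] - a[1][2] * a[2][0]) / det;
    tmp.m[1][1] =  (a[0][0] * a[2][2] - a[0][2] * a[2][0]) / det;
    tmp.m[1][2] = -(a[0][0] * a[1][2] - a[0][2] * a[1][0]) / det;
    tmp.m[2][0] =  (a[1][0] * a[2][1] - a[1][1] * a[2][0]) / det;
    tmp.m[2][1] = -(a[0][0] * a[2][1] - a[0][1] * a[2][0]) / det;
    tmp.m[2][2] =  (a[0][0] * a[1][1] - a[0][1] * a[1][0]) / det;

    *dst = tmp;                 /* single copy at the very end */
    return 1;
}
```

Writing directly into dst while still reading from src is what produces wrong results when the two alias; the temporary removes that dependency.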
2012-11-11 14:09:22 -05:00
Søren Sandmann Pedersen
614e7aaf14 pixman.h: Add typedefs for pixman_f_transform and pixman_f_vector 2012-11-10 01:46:17 -05:00
Joshua Root
b2e0e240fe Fix undeclared variable use and sysctlbyname error handling on ppc
Fixes bug 56889.
2012-11-09 16:13:31 -05:00
Søren Sandmann Pedersen
400436dc52 pixman_image_composite: Reduce opaque masks to NULL
When the mask is known to be opaque, we might as well reduce it to
NULL to take advantage of the various fast paths that operate on NULL
masks.
2012-11-09 16:13:31 -05:00
Søren Sandmann Pedersen
f2ada9e63f Post-release version bump to 0.29.1 2012-11-07 13:45:09 -05:00
Søren Sandmann Pedersen
8a2ff3e0ef Pre-release version bump to 0.28.0 2012-11-07 13:41:15 -05:00
Søren Sandmann Pedersen
4b91f6ca72 Post-release version bump to 0.27.5 2012-10-25 10:42:26 -04:00
Søren Sandmann Pedersen
0de3f33449 Pre-release version bump to 0.27.4 2012-10-25 10:35:27 -04:00
Nemanja Lukic
f075025845 MIPS: DSPr2: Added more fast-paths for ADD operation: - add_8888_8888_8888 - add_8_8 - add_8888_8888
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        add_8888_8888_8888 =  L1:  17.55  L2:  13.35  M:  8.13 ( 93.95%)  HT:  6.60  VT:  6.64  R:  6.45  RT:  3.47 (  26Kops/s)
        add_8_8            =  L1:  86.07  L2:  84.89  M: 62.36 ( 90.11%)  HT: 36.36  VT: 34.74  R: 29.56  RT: 11.56 (  52Kops/s)
        add_8888_8888      =  L1:  95.59  L2:  73.05  M: 17.62 (101.84%)  HT: 15.46  VT: 15.01  R: 13.94  RT:  6.71 (  42Kops/s)

Optimized:
        add_8888_8888_8888 =  L1:  41.52  L2:  33.21  M: 11.97 (138.45%)  HT: 10.47  VT: 10.19  R:  9.42  RT:  4.86 (  32Kops/s)
        add_8_8            =  L1: 135.06  L2: 104.82  M: 57.13 ( 82.58%)  HT: 34.79  VT: 36.60  R: 28.28  RT: 10.54 (  51Kops/s)
        add_8888_8888      =  L1: 176.36  L2:  67.82  M: 17.48 (101.06%)  HT: 15.16  VT: 14.62  R: 13.88  RT:  8.05 (  45Kops/s)
2012-10-25 10:04:30 -04:00
Nemanja Lukic
ca83717c63 MIPS: DSPr2: Added more fast-paths for ADD operation: - add_0565_8_0565 - add_8888_8_8888 - add_8888_n_8888
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        add_0565_8_0565 =  L1:   8.89  L2:   8.37  M:  7.35 ( 29.22%)  HT:  5.90  VT:  5.85  R:  5.67  RT:  3.31 (  26Kops/s)
        add_8888_8_8888 =  L1:  17.22  L2:  14.17  M:  9.89 ( 65.56%)  HT:  7.57  VT:  7.50  R:  7.36  RT:  4.10 (  30Kops/s)
        add_8888_n_8888 =  L1:  17.79  L2:  14.87  M: 10.35 ( 54.89%)  HT:  5.19  VT:  4.93  R:  4.92  RT:  1.90 (  19Kops/s)

Optimized:
        add_0565_8_0565 =  L1:  21.72  L2:  20.01  M: 14.96 ( 59.54%)  HT: 12.03  VT: 11.81  R: 11.26  RT:  6.33 (  37Kops/s)
        add_8888_8_8888 =  L1:  47.42  L2:  38.64  M: 15.90 (105.48%)  HT: 13.34  VT: 13.03  R: 11.84  RT:  6.63 (  38Kops/s)
        add_8888_n_8888 =  L1:  54.83  L2:  42.66  M: 17.36 ( 92.11%)  HT: 15.20  VT: 14.82  R: 13.66  RT:  7.83 (  41Kops/s)
2012-10-25 10:04:30 -04:00
Nemanja Lukic
52d20e692e MIPS: DSPr2: Added fast-paths for ADD operation: - add_n_8_8 - add_n_8_8888 - add_8_8_8
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        add_n_8_8    =  L1:  41.37  L2:  37.83  M: 30.38 ( 60.45%)  HT: 23.70  VT: 22.85  R: 21.51  RT: 10.32 (  45Kops/s)
        add_n_8_8888 =  L1:  16.01  L2:  14.46  M: 11.64 ( 46.32%)  HT:  5.50  VT:  5.18  R:  5.06  RT:  1.89 (  18Kops/s)
        add_8_8_8    =  L1:  13.26  L2:  12.47  M: 11.16 ( 29.61%)  HT:  8.09  VT:  8.04  R:  7.68  RT:  3.90 (  29Kops/s)

Optimized:
        add_n_8_8    =  L1:  96.03  L2:  79.37  M: 51.89 (103.31%)  HT: 32.59  VT: 31.29  R: 28.52  RT: 11.08 (  46Kops/s)
        add_n_8_8888 =  L1:  53.61  L2:  46.92  M: 23.78 ( 94.70%)  HT: 19.06  VT: 18.64  R: 17.30  RT:  9.15 (  43Kops/s)
        add_8_8_8    =  L1:  89.65  L2:  66.82  M: 37.10 ( 98.48%)  HT: 22.10  VT: 21.74  R: 20.12  RT:  8.12 (  41Kops/s)
2012-10-25 10:04:30 -04:00
Siarhei Siamashka
9df645dfb0 Workaround for FTBFS with gcc 4.6 (http://gcc.gnu.org/PR54965)
GCC 4.6 has problems with force_inline, so just use normal inline instead.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=55630
2012-10-25 00:39:41 +03:00
Søren Sandmann Pedersen
31e5a0a393 pixman_composite_trapezoids(): don't clip to extents for some operators
pixman_composite_trapezoids() is supposed to composite across the
entire destination, but it actually only composites across the extent
of the trapezoids. For operators such as ADD or OVER this doesn't
matter since a zero source has no effect on the destination. But for
operators such as SRC or IN, it does matter.

So for such operators where a zero source has an effect, don't clip to
the trap extents.
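
The distinction between the operators can be checked per channel. The following is a minimal sketch, not pixman's code: the enum, the function names and the float-per-channel model are all assumptions made for illustration, and only four operators are covered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of operators, for illustration only. */
typedef enum { OP_SRC, OP_IN, OP_ADD, OP_OVER } op_t;

/* Apply an operator to one premultiplied channel.
 * s, sa: source channel and source alpha
 * d, da: destination channel and destination alpha */
float apply_op(op_t op, float s, float sa, float d, float da)
{
    switch (op) {
    case OP_SRC:  return s;
    case OP_IN:   return s * da;
    case OP_ADD:  return (s + d > 1.0f) ? 1.0f : s + d;
    case OP_OVER: return s + d * (1.0f - sa);
    }
    return d;
}

/* True if compositing an all-zero source changes the destination value.
 * When this is true, pixels outside the trapezoid extents must still be
 * composited, so clipping to the extents would give the wrong result. */
bool zero_source_changes_dest(op_t op, float d, float da)
{
    assert(d >= 0.0f && d <= 1.0f);
    return apply_op(op, 0.0f, 0.0f, d, da) != d;
}
```

For a destination value of 0.5, ADD and OVER leave it untouched while SRC and IN reset it to zero, which is why the latter two cannot be clipped.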
2012-10-21 04:13:36 -04:00
Søren Sandmann Pedersen
65db2362e2 pixman_composite_trapezoids(): Factor out extents computation
The computation of the extents rectangle is moved to its own
function.
2012-10-21 04:13:36 -04:00
Søren Sandmann Pedersen
2d9cb563b4 Add new pixman_image_create_bits_no_clear() API
When pixman_image_create_bits() function is given NULL for bits, it
will allocate a new buffer and initialize it to zero. However, in some
cases, only a small region of the image is actually used; in that case
it is wasteful to touch all of the memory.

The new pixman_image_create_bits_no_clear() works exactly like
_create_bits() except that it doesn't initialize any newly allocated
memory.
2012-10-21 04:13:36 -04:00
Benny Siegert
af803be17b configure.ac: PIXMAN_LINK_WITH_ENV fix
(fixes bug #52101)

On MirBSD, the compiler produces a (harmless) warning when the compiler
is called without the standard CFLAGS:

foo.c:0: note: someone does not honour COPTS correctly, passed 0 times

However, PIXMAN_LINK_WITH_ENV considers _any_ output on stderr as an
error, even if the exit status of the compiler is 0. Furthermore, it
resets CFLAGS and LDFLAGS at the start. On MirBSD, this will lead to a
warning in each test, making all such tests fail. In particular, the
pthread_setspecific test fails, thus pixman is compiled without thread
support. This leads to compile errors later on, or at least it did when
I tried this on pkgsrc. Re-adding the saved CFLAGS, LDFLAGS and LIBS
before the test makes it work.

The second hunk inverts the order of the pthread flag checks. On BSD
systems (this is true at least on OpenBSD and MirBSD), both -lpthread
and -pthread work but the latter is "preferred", whatever this means.
2012-10-17 14:42:56 -04:00
Siarhei Siamashka
6e56098c03 Add missing force_inline to in() function used for C fast paths 2012-10-16 22:31:38 +03:00
Siarhei Siamashka
90bcafa495 MIPS: skip runtime detection for DSPr2 if -mdspr2 option is in CFLAGS
This provides a way to enable MIPS DSP ASE optimizations if running
under qemu-user (where /proc/cpuinfo contains information about the
host processor instead of the emulated one). Can be used for running
pixman test suite in qemu-user when having no access to real MIPS
hardware.
2012-10-16 18:27:45 +03:00
Søren Sandmann Pedersen
d5f2f39319 region: Remove overlap argument from pixman_op()
This is used to compute whether the regions in question overlap, but
nothing makes use of this information, so it can be removed.
2012-10-11 05:09:19 -04:00
Søren Sandmann Pedersen
cb4f325ec0 region: Formatting fix
The while part of a do/while loop was formatted as if it were a while
loop with an empty body. Probably some indent tool misinterpreted the
code at some point.
2012-10-11 04:08:48 -04:00
Søren Sandmann Pedersen
15b153d633 Only regard images as pixbufs if they have identity transformations
In order for a src/mask pair to be considered a pixbuf, they have to
have identical transformations, but we don't check for that. Since the
only fast paths we have for pixbufs require identity transformations,
it suffices to check that both source and mask are
untransformed.

This is also the reason that this bug can't be triggered by any test
code - if the source and mask had different transformations, we would
consider them a pixbuf, but then wouldn't take the fast path because
at least one of the transformations would be different from the
identity.
2012-10-07 18:00:09 -04:00
Søren Sandmann Pedersen
3d81d89c29 Remove BUILT_SOURCES
pixman-combine32.[ch] were the only built sources, so BUILT_SOURCES
can now be removed.
2012-10-04 12:44:22 -04:00
Søren Sandmann Pedersen
ec7aa11a6e Speed up pixman_expand_to_float()
GCC doesn't move the divisions out of the loop, so do it manually by
looking up the four (1.0f / mask) values in a table. Table lookups are
used under the theory that one L2 hit plus three L1 hits is preferable
to four floating point divisions.
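
The idea can be sketched as follows. This is an illustration under stated assumptions rather than the actual patch: it handles a single channel, and the function name and table layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand n unsigned normalized values that are 'bits' wide into floats
 * in [0, 1]. The division by the channel mask (2^bits - 1) is replaced
 * by one table lookup done outside the loop, followed by a multiply
 * per pixel. */
void expand_channel(const uint32_t *in, float *out, size_t n, int bits)
{
    /* reciprocal[b] = 1 / (2^b - 1); index 0 is unused. */
    static const float reciprocal[9] = {
        0.0f,
        1.0f / 1,  1.0f / 3,  1.0f / 7,   1.0f / 15,
        1.0f / 31, 1.0f / 63, 1.0f / 127, 1.0f / 255
    };
    float scale;
    size_t i;

    assert(bits >= 1 && bits <= 8);
    scale = reciprocal[bits];

    for (i = 0; i < n; i++)
        out[i] = (float)in[i] * scale;
}
```

A four-channel version would look up one multiplier per channel before entering the loop, which matches the "four values" mentioned above.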
2012-10-04 03:34:05 -04:00
Søren Sandmann Pedersen
8ccda2be30 Don't auto-generate pixman-combine32.[ch] anymore
Since pixman-combine64.[ch] are not used anymore, there is no point
generating these files from pixman-combine.[ch].template.

Also get rid of dependency on perl in configure.ac.
2012-10-04 03:33:50 -04:00
Søren Sandmann Pedersen
4afd20cc71 Remove 64 bit pipeline
The 64 bit pipeline is not used anymore, so it can now be removed.

Don't generate pixman-combine64.[ch] anymore. Don't generate the
pixman-srgb.c anymore. Delete all the 64 bit fetchers in
pixman-access.c, all the 64 bit iterator functions in
pixman-bits-image.c and all the functions that expand from 8 to 16
bits.
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
5ff0bbd972 Switch the wide pipeline over to using floating point
In pixman-bits-image.c, remove bits_image_fetch_untransformed_64() and
add bits_image_fetch_untransformed_float(); change
dest_get_scanline_wide() to produce a floating point buffer.

In the gradients, change *_get_scanline_wide() to call
pixman_expand_to_float() instead of pixman_expand().

In pixman-general.c change the wide Bpp to 16 instead of 8, and
initialize the buffers to 0 to prevent NaNs from causing trouble.

In pixman-noop.c make the wide solid iterator generate floating point
pixels.

In pixman-solid-fill.c, cache a floating point pixel, and make the
wide iterator generate floating point pixels.

Bug fix in bits_image_fetch_untransformed_repeat_normal
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
e75bacc5f9 pixman-access.c: Add floating point accessor functions
Three new function pointer fields are added to bits_image_t:

      fetch_scanline_float
      fetch_pixel_float
      store_scanline_float

similar to the existing 32 and 64 bit accessors. The fetcher_info_t
struct in pixman-access.c similarly gets a new get_scanline_float field.

For most formats, the new get_scanline_float field is set to a new
function fetch_scanline_generic_float() that first calls the 32 bit
scanline fetcher and then expands these pixels to floating point.

For the 10 bpc formats, new floating point accessors are added that
use pixman_unorm_to_float() and pixman_float_to_unorm() to convert
back and forth.

The PIXMAN_a8r8g8b8_sRGB format is handled with a 256-entry table that
maps 8 bit sRGB channels to linear single precision floating point
numbers. The sRGB->linear direction can then be done with a simple
table lookup.

The other direction is currently done with a 4096-entry table which
works fine for 16 bit integers, but not so great for floating
point. So instead this patch uses a binary search in the sRGB->linear
table. The existing 32 bit accessors for the sRGB format are also
converted to use this method.
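
A binary search of this kind might look like the sketch below. It is written against an arbitrary increasing table passed in by the caller, so the function name and signature are assumptions; in the sRGB case the table would be the 256-entry sRGB->linear table and the returned index the 8 bit sRGB value.

```c
#include <assert.h>

/* Return the index of the table entry nearest to v.
 * The table must be sorted in increasing order and hold n >= 1 entries. */
int nearest_index(const float *table, int n, float v)
{
    int lo = 0;
    int hi = n - 1;

    assert(n >= 1);

    /* Narrow [lo, hi] until the two candidates are adjacent. */
    while (hi - lo > 1) {
        int mid = lo + (hi - lo) / 2;

        if (table[mid] <= v)
            lo = mid;
        else
            hi = mid;
    }

    /* Pick whichever of the two remaining entries is closer to v. */
    return (v - table[lo] <= table[hi] - v) ? lo : hi;
}
```

With 256 entries the loop runs at most eight times, and values outside the table range resolve to the first or last index.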
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
23252393a2 pixman-utils.c, pixman-private.h: Add floating point conversion routines
A new struct argb_t containing a floating point pixel is added to
pixman-private.h and conversion routines are added to pixman-utils.c
to convert normalized integers to and from that struct.

New functions:

  - pixman_expand_to_float()
    Expands a buffer of integer pixels to a buffer of argb_t pixels

  - pixman_contract_from_float()
    Converts a buffer of argb_t pixels to a buffer of integer pixels

  - pixman_float_to_unorm()
    Converts a floating point number to an unsigned normalized integer

  - pixman_unorm_to_float()
    Converts an unsigned normalized integer to a floating point number
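
For illustration, the two scalar conversions could be written as below. This is a sketch with hypothetical names, not the code added to pixman-utils.c; in particular, rounding to nearest and mapping NaN to 0 are choices made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Map an n_bits wide unsigned normalized integer to a float in [0, 1]. */
float unorm_to_float(uint16_t u, int n_bits)
{
    uint32_t max;

    assert(n_bits >= 1 && n_bits <= 16);
    max = (1u << n_bits) - 1;

    return (float)(u & max) / (float)max;
}

/* Map a float to an n_bits wide unsigned normalized integer.
 * Input is clamped to [0, 1] and rounded to the nearest integer. */
uint16_t float_to_unorm(float f, int n_bits)
{
    uint32_t max;

    assert(n_bits >= 1 && n_bits <= 16);
    max = (1u << n_bits) - 1;

    if (!(f > 0.0f))        /* also catches NaN */
        return 0;
    if (f > 1.0f)
        f = 1.0f;

    return (uint16_t)(f * (float)max + 0.5f);
}
```

With these definitions an 8 bit value survives a round trip through floating point unchanged.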
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
4760599ff3 Add combiner test
This test runs the new floating point combiners on random input with
divide-by-zero exceptions turned on.

With the floating point combiners the only thing we guarantee is that
divide-by-zero exceptions are not generated, so change
enable_fp_exceptions() to only enable those, and rename accordingly.
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
a5b459114e Add pixman-combine-float.c
This file contains floating point implementations of combiners for all
pixman operators. These combiners operate on buffers containing single
precision floating point pixels stored in (a, r, g, b) order.

The combiners are added to the pixman_implementation_t struct, but
nothing uses them yet.

This commit incorporates a number of bug fixes contributed by Andrea
Canciani.

Some notes:

- The combiners are making sure to never divide by zero regardless of
  input, so an application could enable divide-by-zero exceptions and
  pixman wouldn't generate any.

- The operators are implemented according to the Render spec. I.e.,

    - If the input pixels are between 0 and 1, then so is the output.

    - The source and destination coefficients for the conjoint and
      disjoint operators are clamped to [0, 1].

- The PDF operators are not described in the render spec, and the
  implementation here doesn't do any clamping except in the final
  conversion from floating point to destination format.

All of the above will need to be rethought if we add support for pixel
formats that can support negative and greater-than-one pixels. It is
in fact already the case in principle that convolution filters can
produce pixels with negative values, but since these go through the
broken "wide" path that narrows everything to 32 bits, these negative
values don't currently survive to the combiners.
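
The two guarantees for conjoint and disjoint coefficients (no division by zero, result clamped to [0, 1]) can be sketched with a small helper. The name is hypothetical, and returning 1 for a zero denominator is an assumption of this sketch, not a statement about what the combiners do.

```c
#include <assert.h>

/* Compute num / den clamped to [0, 1] without ever dividing by zero. */
float clamped_ratio(float num, float den)
{
    float f;

    assert(den >= 0.0f);

    if (den == 0.0f)
        return 1.0f;    /* assumed limit value, chosen for this sketch */

    f = num / den;
    if (f < 0.0f)
        return 0.0f;
    if (f > 1.0f)
        return 1.0f;
    return f;
}
```

A disjoint-style coefficient such as (1 - sa) / da would then be computed as clamped_ratio(1.0f - sa, da).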
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
7a9c2d586b blitters-test: Prepare for floating point
Comment out some formats in blitters-test that are going to rely on
floating point in some upcoming patches.
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
600a06c81d glyph-test: Prepare for floating point
In preparation for an upcoming change of the wide pipe to use floating
point, comment out some formats in glyph-test that are going to be
using floating point and update the CRC32 value to match.
2012-10-01 12:56:09 -04:00
Søren Sandmann Pedersen
2e17b6dd4e Make pixman.h more const-correct
Add const to pointer arguments when the function doesn't change the
pointed-to data.

Also make 'white' in add_glyphs() in pixman-glyph.c static and
const.
2012-10-01 12:52:58 -04:00
Matt Turner
183afcf1d9 iwmmxt: Don't define dummy _mm_empty for >=gcc-4.8
Definition was not present in <4.8.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55451
2012-09-30 11:59:23 -07:00
Søren Sandmann Pedersen
d4b72eb6cc rotate-test: Call image_endian_swap() in make_image()
Otherwise the test fails on big-endian.

Tested-by: Matt Turner <mattst88@gmail.com>
2012-09-29 18:15:54 -04:00
Siarhei Siamashka
aff796d6ce Add scaled nearest repeat fast paths
Before this patch it was often faster to scale and repeat
in two passes because each pass used a fast path vs.
the slow path that the single pass approach takes. This
makes it so that the single pass approach has competitive
performance.
2012-09-26 00:03:10 -04:00
Matt Turner
05560828c4 sse2: mark pack_565_2x128_128 as static force_inline 2012-09-25 14:41:24 -07:00
Søren Sandmann Pedersen
de60e2e0e3 Fix for infinite-loop test
The infinite loop detected by "affine-test 212944861" is caused by an
overflow in this expression:

    max_x = pixman_fixed_to_int (vx + (width - 1) * unit_x) + 1;

where (width - 1) * unit_x doesn't fit in a signed int. This causes
max_x to be too small so that this:

    src_width = 0

    while (src_width < REPEAT_NORMAL_MIN_WIDTH && src_width <= max_x)
        src_width += src_image->bits.width;

results in src_width being 0. Later on when src_width is used for
repeat calculations, we get the infinite loop.

By casting unit_x to int64_t, the expression no longer overflows and
affine-test 212944861 and infinite-loop no longer loop forever.
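
The widened computation can be sketched in isolation. The typedef and function name below are hypothetical stand-ins for pixman_fixed_t and the surrounding code; only the 64 bit arithmetic is the point.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pixman_fixed_t (16.16 fixed point). */
typedef int32_t fixed_16_16_t;

/* Compute max_x with the multiplication carried out in 64 bits, so that
 * (width - 1) * unit_x cannot overflow a signed 32 bit int. */
int64_t max_x_wide(fixed_16_16_t vx, fixed_16_16_t unit_x, int width)
{
    int64_t last;

    assert(width >= 1);
    last = (int64_t)vx + (int64_t)(width - 1) * unit_x;

    /* Fixed to int: drop the 16 fractional bits. Right shifting a
     * negative value is implementation-defined in C; the usual
     * compilers perform an arithmetic shift. */
    return (last >> 16) + 1;
}
```

With unit_x = 32767.0 and width = 1000 the product needs about 46 bits, which is far outside the range of a 32 bit int.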
2012-09-24 18:43:31 -04:00
Søren Sandmann Pedersen
aa311a4641 test: Add infinite-loop test
This test demonstrates a bug where a certain transformation matrix can
result in an infinite loop. It was extracted as a standalone version
of "affine-test 212944861".

If given the option -nf, the test program will not call fail_after()
and therefore potentially run forever.
2012-09-24 18:29:30 -04:00
Søren Sandmann Pedersen
d5c721768c affine-test: Print out the transformation matrix when verbose
Printing out the translation and scale is a bit misleading because the
actual transformation matrix can be modified in various other ways.

Instead simply print the whole transformation matrix that is actually
used.
2012-09-24 18:27:10 -04:00
Nemanja Lukic
292fce7a23 MIPS: DSPr2: Added OVER combiner and two new fast paths: - over_8888_8888 - over_8888_8888_8888
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
          over_8888_8888 =  L1:  19.61  L2:  17.10  M: 11.16 ( 59.20%)  HT: 16.47  VT: 15.81  R: 14.82  RT:  8.90 (  50Kops/s)
     over_8888_8888_8888 =  L1:  13.56  L2:  11.22  M:  7.46 ( 79.18%)  HT:  6.24  VT:  6.20  R:  6.11  RT:  3.95 (  29Kops/s)

Optimized:
          over_8888_8888 =  L1:  46.42  L2:  36.70  M: 16.69 ( 88.57%)  HT: 17.11  VT: 16.55  R: 15.31  RT:  9.48 (  52Kops/s)
     over_8888_8888_8888 =  L1:  26.06  L2:  22.53  M: 11.49 (121.91%)  HT:  9.93  VT:  9.62  R:  9.19  RT:  5.75 (  36Kops/s)
2012-09-24 17:13:46 -04:00
Nemanja Lukic
28c9bd4866 MIPS: DSPr2: Added fast-paths for OVER operation: - over_0565_n_0565 - over_0565_8_0565
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_0565_n_0565 =  L1:   7.56  L2:   7.24  M:  6.16 ( 16.38%)  HT:  4.01  VT:  3.84  R:  3.79  RT:  1.66 (  18Kops/s)
        over_0565_8_0565 =  L1:   7.43  L2:   7.05  M:  5.98 ( 23.85%)  HT:  5.27  VT:  5.23  R:  5.09  RT:  3.14 (  28Kops/s)

Optimized:
        over_0565_n_0565 =  L1:  15.47  L2:  14.52  M: 12.30 ( 32.65%)  HT: 10.76  VT: 10.57  R: 10.27  RT:  6.63 (  46Kops/s)
        over_0565_8_0565 =  L1:  15.47  L2:  14.61  M: 11.78 ( 46.92%)  HT: 10.00  VT:  9.84  R:  9.40  RT:  5.81 (  43Kops/s)
2012-09-24 17:12:57 -04:00
Nemanja Lukic
b660eb30b4 MIPS: DSPr2: Added fast-paths for OVER operation: - over_8888_n_0565 - over_8888_8_0565
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_n_0565 =  L1:   8.95  L2:   8.33  M:  6.95 ( 27.74%)  HT:  4.27  VT:  4.07  R:  4.01  RT:  1.74 (  19Kops/s)
        over_8888_8_0565 =  L1:   8.86  L2:   8.11  M:  6.72 ( 35.71%)  HT:  5.68  VT:  5.62  R:  5.47  RT:  3.35 (  30Kops/s)

Optimized:
        over_8888_n_0565 =  L1:  18.76  L2:  17.55  M: 13.11 ( 52.19%)  HT: 11.35  VT: 11.10  R: 10.88  RT:  6.94 (  47Kops/s)
        over_8888_8_0565 =  L1:  18.14  L2:  16.79  M: 12.10 ( 64.25%)  HT: 10.24  VT:  9.98  R:  9.63  RT:  5.89 (  43Kops/s)
2012-09-24 17:12:57 -04:00
Nemanja Lukic
37e3368e20 MIPS: DSPr2: Added fast-paths for OVER operation: - over_8888_n_8888 - over_8888_8_8888
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
        over_8888_n_8888 =  L1:   9.92  L2:  11.27  M:  8.50 ( 45.23%)  HT:  4.70  VT:  4.45  R:  4.49  RT:  1.85 (  20Kops/s)
        over_8888_8_8888 =  L1:  12.54  L2:  10.86  M:  8.18 ( 54.36%)  HT:  6.53  VT:  6.45  R:  6.41  RT:  3.83 (  33Kops/s)

Optimized:
        over_8888_n_8888 =  L1:  28.02  L2:  24.92  M: 14.72 ( 78.15%)  HT: 13.03  VT: 12.65  R: 12.00  RT:  7.49 (  49Kops/s)
        over_8888_8_8888 =  L1:  26.92  L2:  23.93  M: 13.65 ( 90.58%)  HT: 11.68  VT: 11.29  R: 10.56  RT:  6.37 (  45Kops/s)
2012-09-24 17:12:56 -04:00
Søren Sandmann Pedersen
f580c4c5b2 pixman-combine.c.template: Formatting clean-ups
Various formatting fixes, and removal of some obsolete comments about
strength reduction of operators.
2012-09-22 23:41:19 -04:00
Søren Sandmann Pedersen
58f8704664 Fix bugs in pixman-image.c
In the checks for whether the transforms are rotation matrices "-1"
and "1" were used instead of the correct -pixman_fixed_1 and
pixman_fixed_1.

Fixes test suite failure for rotate-test.
2012-09-22 23:41:19 -04:00
Søren Sandmann Pedersen
550dfc5e7e Add rotate-test.c test program
This program exercises a bug in pixman-image.c where "-1" and "1" were
used instead of the correct "- pixman_fixed_1" and "pixman_fixed_1".

With the fast implementation enabled:

     % ./rotate-test
     rotate test failed! (checksum=35A01AAB, expected 03A24D51)

Without it:

     % env PIXMAN_DISABLE=fast ./rotate-test
     pixman: Disabled fast implementation
     rotate test passed (checksum=03A24D51)

V2: The first version didn't have lcg_srand (testnum) in test_transform().
2012-09-22 23:41:19 -04:00
Søren Sandmann Pedersen
2ab77c97a5 Fix bugs in component alpha combiners for separable PDF operators
In general, the component alpha version of an operator is supposed to
do this:

       - multiply source with mask in all channels
       - multiply mask with source alpha in all channels
       - compute the regular operator in all channels using the
         mask value whenever source alpha is called for

The first two steps are usually accomplished with the function
combine_mask_ca(), but for operators where source alpha is not used,
such as SRC, ADD and OUT, the simpler function
combine_mask_value_ca(), which doesn't compute the new mask values,
can be used.

However, the PDF blend modes generally *do* make use of source alpha,
so they can't use combine_mask_value_ca() as they do now. They have to
use combine_mask_ca().

This patch fixes this in combine_multiply_ca() and the CA combiners
generated by PDF_SEPARABLE_BLEND_MODE.
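
The difference between the two helpers can be sketched on floating point pixels. The real functions operate on packed integer pixels; the struct and function names here are hypothetical.

```c
#include <assert.h>

/* Hypothetical premultiplied pixel with one float per channel. */
typedef struct { float a, r, g, b; } pixel_t;

/* Step 1 only: multiply source with mask in all channels. The mask is
 * left untouched, which is enough for operators that never read source
 * alpha (SRC, ADD, OUT). */
void mask_value_ca(pixel_t *src, const pixel_t *mask)
{
    src->a *= mask->a;
    src->r *= mask->r;
    src->g *= mask->g;
    src->b *= mask->b;
}

/* Steps 1 and 2: additionally multiply the mask with the original
 * source alpha, so that the operator can use the mask channel wherever
 * source alpha is called for. */
void mask_ca(pixel_t *src, pixel_t *mask)
{
    float sa = src->a;

    assert(sa >= 0.0f && sa <= 1.0f);
    mask_value_ca(src, mask);

    mask->a *= sa;
    mask->r *= sa;
    mask->g *= sa;
    mask->b *= sa;
}
```

An operator that reads source alpha after mask_value_ca() would see a stale mask, which is the kind of error described above.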
2012-09-22 23:41:19 -04:00
Søren Sandmann Pedersen
c4b69e706e Fix bug in fast_composite_scaled_nearest()
The fast_composite_scaled_nearest() function can be called when the
format is x8b8g8r8. In that case pixels fetched in fetch_nearest()
need to have their alpha channel set to 0xff.

Fixes test suite failure in scaling-test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-22 23:40:52 -04:00
Søren Sandmann Pedersen
35be7acb66 Add PIXMAN_x8b8g8r8 and PIXMAN_a8b8g8r8 formats to scaling-test
Update the CRC values based on what the general implementation
reports. This reveals a bug in the fast implementation:

    % env PIXMAN_DISABLE="mmx sse2" ./test/scaling-test
    pixman: Disabled mmx implementation
    pixman: Disabled sse2 implementation
    scaling test failed! (checksum=AA722B06, expected 03A23E0C)

vs.

    % env PIXMAN_DISABLE="mmx sse2 fast" ./test/scaling-test
    pixman: Disabled fast implementation
    pixman: Disabled mmx implementation
    pixman: Disabled sse2 implementation
    scaling test passed (checksum=03A23E0C)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-22 23:40:52 -04:00
Søren Sandmann Pedersen
9decb9a979 implementation: Rename delegate to fallback
At this point the chain of implementations has nothing to do with the
delegation design pattern anymore, so rename the delegate pointer to
'fallback'.
2012-09-19 12:22:59 -04:00
Søren Sandmann Pedersen
b96599ccf3 _pixman_implementation_create(): Initialize implementation with memset()
All the function pointers are NULL by default now, so we can just zero
the struct. Also write the function a little more compactly.
2012-09-19 12:22:59 -04:00
Søren Sandmann Pedersen
9539a18832 Rename _pixman_lookup_composite_function() to _pixman_implementation_lookup_composite()
And move it into pixman-implementation.c which is where it belongs
logically.
2012-09-19 12:22:59 -04:00
Søren Sandmann Pedersen
ee6af72dad Move delegation of src/dest iter init into pixman-implementation.c
Instead of relying on each implementation to delegate when an iterator
can't be initialized, change the type of iterator initializers to
boolean and make pixman-implementation.c do the delegation whenever an
iterator initializer returns FALSE.
2012-09-19 12:22:58 -04:00
Søren Sandmann Pedersen
c710d0fae2 Move fill delegation into pixman-implementation.c
As in the blt commit, do the delegation in pixman-implementation.c
whenever the implementation fill returns FALSE instead of relying on
each implementation to do it by itself.

With this change there is no longer any reason for the implementations
to have one fill function that delegates and one that actually blits,
so consolidate those in the NEON, DSPr2, SSE2, and MMX
implementations.
2012-09-19 12:22:58 -04:00
Søren Sandmann Pedersen
534507ba3b Move blt delegation into pixman-implementation.c
Rather than require each individual implementation to do the
delegation for blt, just do it in pixman-implementation.c whenever the
implementation blt returns FALSE.

With this change, there is no longer any reason for the
implementations to have one blt function that delegates and one that
actually blits, so consolidate those in the NEON, DSPr2, SSE2, and MMX
implementations.
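
The centralized delegation amounts to walking the chain of implementations until one of them handles the request. A minimal sketch, with a hypothetical struct layout, a simplified blt signature and two toy blt functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct implementation implementation_t;

/* Simplified blt signature; returns false if the request is not handled. */
typedef bool (*blt_func_t)(int width, int height);

struct implementation {
    implementation_t *delegate;  /* next implementation to try, or NULL */
    blt_func_t        blt;       /* may be NULL */
};

/* Try each implementation in turn. Individual blt functions only report
 * whether they handled the request and never delegate themselves. */
bool implementation_blt(implementation_t *imp, int width, int height)
{
    assert(width >= 0 && height >= 0);

    for (; imp != NULL; imp = imp->delegate) {
        if (imp->blt != NULL && imp->blt(width, height))
            return true;
    }
    return false;
}

/* Toy blt functions used only to exercise the chain. */
bool blt_small_only(int w, int h) { return w * h <= 64; }
bool blt_anything(int w, int h)   { (void)w; (void)h; return true; }
```

A fast implementation that rejects a large blit simply returns false, and the loop moves on to the next entry in the chain.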
2012-09-19 12:22:58 -04:00
Søren Sandmann Pedersen
7ef4436abb implementation: Write lookup_combiner() in a less convoluted way.
Instead of initializing an array on the stack, just use a simple
switch to select which set of combiners to look up in.
2012-09-19 12:22:58 -04:00
Matt Turner
3124a51abb build: Remove useless DEP_CFLAGS/DEP_LIBS variables 2012-09-15 23:46:21 -07:00
Andrea Canciani
46e4faf8ef build: Improve win32 build system
Handle cross-directory dependencies using PHONY targets and clean up
some redundancies.
2012-09-15 07:49:53 +02:00
Andrea Canciani
c89efdd211 mmx: Fix x86 build on MSVC
The MSVC compiler is very strict about variable declarations after
statements.

Move all the declarations of each block before any statement in
the same block to fix multiple instances of:

pixman-mmx.c(xxxx) : error C2275: '__m64' : illegal use of this type
as an expression
2012-09-15 07:49:52 +02:00
Søren Sandmann Pedersen
1e3e569b04 test/utils.c: Use pow(), not powf() in sRGB conversion routines
These functions are operating on double precision values, so use pow()
instead of powf().
2012-08-29 15:05:49 -04:00
Søren Sandmann Pedersen
8577daba04 pixel_checker: Move sRGB conversion into get_limits()
The sRGB conversion has to be done every time the limits are being
computed. Without this fix, pixel_checker_get_min/max() will produce
the wrong results when called from somewhere other than
pixel_checker_check().
2012-08-26 18:13:47 -04:00
Søren Sandmann Pedersen
62eb6e5e05 Remove obsolete TODO file 2012-08-25 17:17:24 -04:00
Søren Sandmann Pedersen
384846b38c Remove pointless declaration of _pixman_image_get_scanline_generic_64()
This declaration used to be necessary when
_pixman_image_get_scanline_generic_64() referred to a structure that
itself referred back to _pixman_image_get_scanline_generic_64().
2012-08-19 13:45:21 -04:00
Søren Sandmann Pedersen
09cb1ae10b demos: Add srgb_trap_test.c
This demo program composites a bunch of trapezoids side by side with
and without gamma aware compositing.
2012-08-09 11:24:37 -04:00
Søren Sandmann Pedersen
04e878c231 Make show_image() cope with more formats
This makes show_image() deal with more formats than just a8r8g8b8, in
particular, a8r8g8b8_sRGB can now be handled.

Images that are passed to show_image with a format of a8r8g8b8_sRGB
are displayed without modification under the assumption that the
monitor is approximately sRGB.

Images with a format of a8r8g8b8 are also displayed without
modification since many other users of show_image() have been
generating essentially sRGB data with this format. Other formats are
also assumed to be gamma compressed; these are converted to a8r8g8b8
before being displayed.

With these changes, srgb-test.c doesn't need to do its own conversion
anymore.
2012-08-09 11:24:37 -04:00
Søren Sandmann Pedersen
8db9ec9814 Define TIMER_BEGIN and TIMER_END even when timers are not enabled
This allows code that uses these macros to build when timers are
disabled.
2012-08-09 11:23:45 -04:00
Søren Sandmann Pedersen
da5268cc19 Post-release version bump to 0.27.3 2012-08-01 15:56:13 -04:00
Søren Sandmann Pedersen
e8ddef78b6 Pre-release version bump to 0.27.2 2012-08-01 15:22:57 -04:00
Sebastian Bauer
c214ca51a0 Use angle brackets form of including config.h 2012-08-01 15:21:51 -04:00
Sebastian Bauer
98617b3796 Added HAVE_CONFIG_H check before including config.h 2012-08-01 15:21:51 -04:00
Søren Sandmann Pedersen
5b0563f39e glyph-test: Avoid setting solid images as alpha maps.
glyph-test would sometimes set a solid image as an alpha map, which is
not allowed. When this happened and the debug spew was enabled,
messages like this one would be generated:

    *** BUG ***
    In pixman_image_set_alpha_map: The expression
            !alpha_map || alpha_map->type == BITS was false
    Set a breakpoint on '_pixman_log_error' to debug

Fix this by not passing the ALLOW_SOLID flag to create_image() when
the resulting image is to be used as an alpha map.
2012-07-31 23:51:53 -04:00
Søren Sandmann Pedersen
38fe7cd7be stress-test: Avoid overflows in clip rectangles
The rectangles in the clip region set in set_general_properties()
would sometimes overflow, which would lead to messages like these:

      *** BUG ***
      In pixman_region32_union_rect: Invalid rectangle passed
      Set a breakpoint on '_pixman_log_error' to debug

when the micro version number of pixman is even.

Fix this by detecting the overflow and clamping such that the x2/y2
coordinates are less than INT32_MAX.
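
One way to detect and clamp such an overflow is to compute the far edge in 64 bits. This is a sketch under assumptions: the function name is hypothetical and the actual test code may detect the overflow differently.

```c
#include <assert.h>
#include <stdint.h>

/* Return start + size, clamped so the result stays below INT32_MAX.
 * The sum is formed in 64 bits so it cannot overflow. */
int32_t clamped_far_edge(int32_t start, int32_t size)
{
    int64_t end;

    assert(size >= 0);
    end = (int64_t)start + size;

    if (end >= INT32_MAX)
        end = INT32_MAX - 1;

    return (int32_t)end;
}
```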
2012-07-31 23:51:53 -04:00
Søren Sandmann Pedersen
24d83cbf3d Add make-srgb.pl to EXTRA_DIST
Otherwise make distcheck doesn't pass.
2012-07-31 23:51:52 -04:00
Antti S. Lankila
72ba0b9555 Add tests to validate new sRGB behavior
Composite checks random combinations of operations that now also have
sRGB sources, masks and destinations, and stress-test validates the
read/write primitives.
2012-07-30 15:44:38 -04:00
Antti S. Lankila
a161a6ba23 Add sRGB blending demo program
A simple sRGB color blender test that can be used to determine whether the sRGB
processing works as expected. It blends alpha ramps of purple and green together
such that at the midpoint of the image, a 50% blend of both is realized. At that
point, sRGB-aware processing yields a result close to #bbb, which is the
linear-light blending result, rather than the #888 obtained by blending the
encoded values directly.
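
The midpoint values can be checked by hand. The worked example below assumes that each channel blends an encoded 0 with an encoded 1 (the exact ramp colors used by the demo are not stated here) and uses the standard sRGB transfer function.

```latex
\begin{align*}
\text{encoded blend:}\quad
  & 0.5 \cdot 0 + 0.5 \cdot 1 = 0.5
    \;\Rightarrow\; 0.5 \cdot 255 \approx 128 = \texttt{0x80} \\
\text{linear-light blend:}\quad
  & L = 0.5 \cdot 0 + 0.5 \cdot 1 = 0.5 \\
  & V = 1.055\, L^{1/2.4} - 0.055
      \approx 1.055 \cdot 0.7492 - 0.055 \approx 0.735 \\
  & 0.735 \cdot 255 \approx 187.5 \approx \texttt{0xBB}
\end{align*}
```

The first result is roughly #888 and the second roughly #bbb.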

The demo also contains the sample computation for sRGB premultiplied alpha.
2012-07-30 15:40:16 -04:00
Antti S. Lankila
7460457f80 Add support for sRGB surfaces
sRGB format is defined as a new format type, PIXMAN_TYPE_ARGB_SRGB. One form of
this type is provided, PIXMAN_a8r8g8b8_sRGB. Use of an sRGB format triggers
wide processing, and the pixel fetch/store functions handle the relevant
conversion between color spaces. Pixman itself is thought to compose in the
linearized sRGB color space.

sRGB conversion is tabularized. For sRGB to linear, we are using only 256
values because the current source format uses 8 bits per component precision.
For linear to sRGB, it turns out that only 4096 brightness levels are required
to generate all of the 256 sRGB color values, and therefore only 12 bits per
component are considered during store. As a special case, a no-op
sRGB->linear->sRGB conversion is constructed to be lossless by adjusting the
sRGB->linear conversion table where necessary.
2012-07-30 15:37:26 -04:00
Antti S. Lankila
1dcca0f7ae Remove unnecessary dst initialization
The initialization work is already performed correctly in image_init().
2012-07-29 11:01:11 -04:00
Cyril Brulebois
1713a099d6 Upload to unstable. 2012-06-27 12:11:58 +02:00
Cyril Brulebois
9026e61d84 Disable loongson2f optimizations, fix FTBFS on mipsel. 2012-06-27 11:21:54 +02:00
Søren Sandmann Pedersen
56321eff65 Make pixman-mmx.c compile on x86-32 without optimization
When not optimizing, write _mm_shuffle_pi16() as a statement
expression with inline assembly. That way we avoid
__builtin_ia32_pshufw(), which is only available when compiling with
-msse, while still allowing the non-optimizing gcc to understand that
the second argument is a compile time constant.

Tested-by: Knut Petersen <knut_petersen@t-online.de>
2012-06-20 02:53:31 -04:00
Søren Sandmann Pedersen
0c81957e9b Cleanups and simplifications in x86 CPU feature detection
A new function pixman_cpuid() is added that runs the cpuid instruction
and returns the results. On GCC this function uses inline assembly; on
MSVC, the function calls the __cpuid intrinsic.

There is also a new function called have_cpuid() which detects whether
cpuid is available. On x86-64 and MSVC, it simply returns TRUE; on
32-bit x86, it checks whether the 22nd bit of eflags can be
modified. On MSVC this does have the consequence that pixman will no
longer work on CPUs without cpuid (i.e., older than 486 and some 486
models).

These two functions together make it possible to write a generic
detect_cpu_features() in plain C. This function is then used in a new
have_feature() function that checks whether a specific set of feature
bits is available.
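
The check for a set of feature bits reduces to a mask comparison. In the sketch below the flag names and values are hypothetical, and the detected set is passed in as a parameter instead of being obtained from cpuid.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature flags, one bit each. */
enum {
    FEATURE_MMX   = 1 << 0,
    FEATURE_SSE   = 1 << 1,
    FEATURE_SSE2  = 1 << 2,
    FEATURE_SSSE3 = 1 << 3
};

/* True only if every bit in 'required' is also set in 'detected'. */
bool have_all_features(unsigned detected, unsigned required)
{
    assert(required != 0);  /* asking for no features is likely a bug */
    return (detected & required) == required;
}
```

Comparing against 'required' instead of testing for a non-zero result is what makes the function work for a set of bits and not just a single one.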

Aside from the cleanups and simplifications, the main benefit from
this patch is that pixman now can do feature detection on x86-64, so
that newer instruction sets such as SSSE3 and SSE4.1 can be used. (And
apparently the assumption that x86-64 CPUs always have MMX and SSE2 is
no longer correct: Knight's Corner is x86-64, but doesn't have them).

V2: Rename the constants in the getisax() code, as pointed out by Alan
Coopersmith. Also reinstate the result variable and initialize
features to 0.

V3: Fixes for the fact that the upper 32 bits of a 64 bit register are
zeroed whenever the corresponding 32 bit register is written to.

V4: Fixes for the fact that in 32 bit mode, when gcc is not optimizing
there were not enough registers available. The new code uses the "a",
"b", "c", and "d" constraints instead, and has two separate versions
for 32 and 64 bit modes.
2012-06-20 02:51:04 -04:00
Sebastian Bauer
4d641c3803 Changed the style of two function headers
Declare functions *_inverse() and *_contains_rectangle() in the same
way as the other functions are declared. This doesn't imply any semantic
changes. It's just a unification of coding styles.
2012-07-08 18:49:24 -04:00
Nemanja Lukic
86ad09b548 MIPS: DSPr2: Added more bilinear fast paths (without mask)
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench -b

Referent (before):
  src_8888_8888 =  L1:   8.18  L2:   7.79  M:  6.32 ( 33.51%)  HT:  5.78  VT:  5.70  R:  5.61  RT:  3.79 (  29Kops/s)
  src_8888_0565 =  L1:   6.90  L2:   7.14  M:  6.47 ( 25.75%)  HT:  5.54  VT:  5.51  R:  5.46  RT:  3.53 (  28Kops/s)
  src_0565_x888 =  L1:   3.76  L2:   3.71  M:  3.37 ( 13.41%)  HT:  3.26  VT:  3.22  R:  3.20  RT:  2.58 (  23Kops/s)
  src_0565_0565 =  L1:   3.59  L2:   3.56  M:  3.47 (  9.19%)  HT:  3.19  VT:  3.18  R:  3.16  RT:  2.46 (  22Kops/s)
 over_8888_8888 =  L1:   5.99  L2:   5.66  M:  4.95 ( 26.28%)  HT:  4.40  VT:  4.38  R:  4.31  RT:  3.02 (  26Kops/s)
  add_8888_8888 =  L1:   6.84  L2:   6.39  M:  5.48 ( 29.09%)  HT:  4.80  VT:  4.79  R:  4.70  RT:  3.20 (  27Kops/s)

Optimized:
  src_8888_8888 =  L1:  18.27  L2:  16.69  M: 12.87 ( 68.25%)  HT: 11.80  VT: 11.61  R: 10.60  RT:  7.05 (  41Kops/s)
  src_8888_0565 =  L1:  15.18  L2:  14.10  M: 11.75 ( 46.71%)  HT: 10.64  VT: 10.50  R: 10.03  RT:  7.15 (  41Kops/s)
  src_0565_x888 =  L1:  10.45  L2:   9.96  M:  9.23 ( 36.72%)  HT:  8.39  VT:  8.29  R:  8.02  RT:  5.75 (  37Kops/s)
  src_0565_0565 =  L1:   9.37  L2:   8.98  M:  8.50 ( 22.53%)  HT:  7.71  VT:  7.66  R:  7.52  RT:  5.59 (  37Kops/s)
 over_8888_8888 =  L1:  12.21  L2:  11.01  M:  8.56 ( 45.36%)  HT:  7.71  VT:  7.64  R:  7.43  RT:  5.51 (  36Kops/s)
  add_8888_8888 =  L1:  17.72  L2:  15.16  M: 10.78 ( 57.13%)  HT:  9.46  VT:  9.30  R:  9.00  RT:  6.03 (  38Kops/s)
2012-07-08 21:38:14 +03:00
Nemanja Lukic
707a8be112 MIPS: DSPr2: Added several bilinear fast paths with a8 mask
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench -b

Referent (before):

  src_8888_8_8888 =  L1:   6.37  L2:   6.08  M:  5.46 ( 32.57%)  HT:  4.64  VT:  4.61  R:  4.52  RT:  2.85 (  23Kops/s)
  src_8888_8_0565 =  L1:   5.89  L2:   5.66  M:  5.11 ( 23.71%)  HT:  4.36  VT:  4.34  R:  4.26  RT:  2.71 (  22Kops/s)
  src_0565_8_x888 =  L1:   3.32  L2:   3.27  M:  3.17 ( 14.71%)  HT:  2.86  VT:  2.84  R:  2.81  RT:  2.07 (  19Kops/s)
  src_0565_8_0565 =  L1:   3.19  L2:   3.15  M:  3.05 ( 10.11%)  HT:  2.75  VT:  2.74  R:  2.71  RT:  2.00 (  18Kops/s)
 over_8888_8_8888 =  L1:   4.99  L2:   4.71  M:  4.11 ( 27.22%)  HT:  3.59  VT:  3.58  R:  3.50  RT:  2.36 (  21Kops/s)
  add_8888_8_8888 =  L1:   5.60  L2:   5.26  M:  4.52 ( 29.95%)  HT:  3.92  VT:  3.89  R:  3.80  RT:  2.49 (  21Kops/s)

Optimized:

  src_8888_8_8888 =  L1:  13.19  L2:  12.13  M:  9.75 ( 58.22%)  HT:  8.60  VT:  8.44  R:  7.90  RT:  5.06 (  33Kops/s)
  src_8888_8_0565 =  L1:  11.64  L2:  10.81  M:  9.18 ( 42.63%)  HT:  8.04  VT:  7.90  R:  7.57  RT:  5.02 (  32Kops/s)
  src_0565_8_x888 =  L1:   8.34  L2:   7.95  M:  7.29 ( 33.85%)  HT:  6.55  VT:  6.48  R:  6.25  RT:  4.35 (  30Kops/s)
  src_0565_8_0565 =  L1:   7.71  L2:   7.35  M:  6.90 ( 22.90%)  HT:  6.14  VT:  6.10  R:  5.94  RT:  4.07 (  29Kops/s)
 over_8888_8_8888 =  L1:   9.73  L2:   8.99  M:  7.15 ( 47.41%)  HT:  6.40  VT:  6.30  R:  6.11  RT:  4.28 (  30Kops/s)
  add_8888_8_8888 =  L1:  13.01  L2:  11.72  M:  8.70 ( 57.68%)  HT:  7.59  VT:  7.46  R:  7.20  RT:  4.74 (  32Kops/s)
2012-07-08 21:38:09 +03:00
Søren Sandmann Pedersen
6aac8e8570 Simplify CPU detection on PPC.
Get rid of the initialized and have_vmx static variables in
pixman-ppc.c. There is no point to them since CPU detection only
happens once per process.

On Linux, just read /proc/self/auxv instead of generating the filename
with getpid() and don't bother with the stack buffer. Instead just
read the aux entries one by one.
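
A minimal sketch of reading the aux entries one by one, for illustration
only; it is not the code in pixman-ppc.c. It assumes each entry is a pair
of native unsigned longs (type, value), that AT_HWCAP is type 16, and that
the AltiVec capability bit is 0x10000000. The function and macro names are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed values, mirroring the usual Linux definitions. */
#define AUX_TYPE_NULL   0UL          /* terminates the vector  */
#define AUX_TYPE_HWCAP  16UL         /* hardware capabilities  */
#define HWCAP_ALTIVEC   0x10000000UL /* PowerPC AltiVec/VMX    */

/* Returns 1 if the HWCAP entry in the aux vector at `path` has `bit`
 * set, and 0 otherwise (including when the file cannot be opened).
 */
static int
auxv_hwcap_has (const char *path, unsigned long bit)
{
    unsigned long entry[2];   /* entry[0] = type, entry[1] = value */
    int found = 0;
    FILE *f;

    assert (path != NULL);

    f = fopen (path, "rb");
    if (!f)
        return 0;

    /* Read one (type, value) pair at a time; no large buffer needed. */
    while (fread (entry, sizeof entry, 1, f) == 1)
    {
        if (entry[0] == AUX_TYPE_NULL)
            break;

        if (entry[0] == AUX_TYPE_HWCAP)
        {
            found = (entry[1] & bit) != 0;
            break;
        }
    }

    fclose (f);
    return found;
}
```

A caller would use auxv_hwcap_has ("/proc/self/auxv", HWCAP_ALTIVEC); on
systems without that file the function simply reports the feature as
absent.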
2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
4b78d78537 Simplifications to ARM CPU detection
Organize pixman-arm.c such that each operating system/compiler exports
a detect_cpu_features() function that returns a bitmask with the
various features that we are interested in. A new function
have_feature() then calls this function, caches the result, and returns
whether the given feature is available.

The result is that all the pixman_have_arm_<feature> functions become
redundant and can be deleted.
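
The detect-once-and-cache pattern described above could look roughly like
the sketch below. The feature names and the stand-in detector are
assumptions made for illustration; the real bits and per-OS detectors in
pixman-arm.c may differ, and this version is not thread-safe.

```c
#include <assert.h>

/* Hypothetical feature bits. */
typedef enum
{
    FEATURE_V7     = (1 << 0),
    FEATURE_NEON   = (1 << 1),
    FEATURE_IWMMXT = (1 << 2)
} cpu_features_t;

static int detect_calls;   /* counts how often detection really runs */

/* Stand-in for the per-OS detector: pretends the CPU has ARMv7 + NEON. */
static cpu_features_t
detect_cpu_features (void)
{
    detect_calls++;
    return (cpu_features_t) (FEATURE_V7 | FEATURE_NEON);
}

/* Runs detection on first use, caches the bitmask, and reports whether
 * all bits in `feature` are present.
 */
static int
have_feature (cpu_features_t feature)
{
    static int initialized;
    static cpu_features_t features;

    assert (feature != 0);

    if (!initialized)
    {
        features = detect_cpu_features ();
        initialized = 1;
    }

    return (features & feature) == feature;
}
```

With one generic query like this, a per-feature wrapper adds nothing that
have_feature (FEATURE_NEON) does not already express.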
2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
8b795a9c17 Simplify MIPS CPU detection
There is no reason to have pixman_have_<feature> functions when all
they do is call pixman_have_mips_feature().

Instead rename pixman_have_mips_feature() to have_feature() and call
it directly from _pixman_mips_get_implementations(). Also on
non-Linux, just make have_feature() return FALSE.
2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
16502dd3ae Move the remaining bits of pixman-cpu into pixman-implementation.c 2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
5813bb96ae Move MIPS specific CPU detection to its own file, pixman-mips.c 2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
4ac0a1d60f Move PowerPC specific CPU detection to its own file pixman-ppc.c 2012-07-07 01:09:23 -04:00
Søren Sandmann Pedersen
8590415f0e Move ARM specific CPU detection to a new file pixman-arm.c
Similar to the x86 commit, this moves the ARM specific CPU detection
to its own file which exports a pixman_arm_get_implementations()
function that is supposed to be a noop on non-ARM.
2012-07-07 01:09:22 -04:00
Søren Sandmann Pedersen
39ac18570a Move x86 specific CPU detection to a new file pixman-x86.c
Extract the x86 specific parts of pixman-cpu.c and put them in their
own file called pixman-x86.c which exports one function
pixman_x86_get_implementations() that creates the MMX and SSE2
implementations. This file is supposed to be compiled on all
architectures, but pixman_x86_get_implementations() should be a noop
on non-x86.
2012-07-06 23:53:19 -04:00
Søren Sandmann Pedersen
1a3b7614a9 pixman-cpu.c: Rename disabled to _pixman_disabled() and export it 2012-07-06 23:52:14 -04:00
Sebastian Bauer
d4aa82fb91 Qualify the static variables in pixman_f_transform_invert() with the const keyword.
Their contents are not overwritten.
2012-07-06 23:50:21 -04:00
Søren Sandmann Pedersen
f9c91ee2f2 Use a compile-time constant for the "K" constraint in the MMX detection.
When compiling with -O0, gcc doesn't understand that in

     signed char x = 0;

     ...

     asm ("...",
     	  : "K" (x));

x is constant. Fix this by using an immediate constant instead of a
variable.
2012-07-02 18:21:21 -04:00
Søren Sandmann Pedersen
cd7ecf548a In fast_composite_tiled_repeat() don't clone images with a palette
In fast_composite_tiled_repeat() if the source image is less than a
certain constant width, a clone is created which is then
pre-repeated. However, the source image's palette, if it has one, is
not cloned, so for indexed images, the pre-repeating would crash.

Fix this by not doing any pre-repeating for images with a palette set.
2012-07-02 18:21:21 -04:00
Søren Sandmann Pedersen
7b20ad39f7 test: Make stress-test more likely to actually composite something
stress-test currently almost never composites anything because the clip
rectangles and transformations are such that either
_pixman_compute_composite_region32() or analyze_extent() will return
FALSE.

Fix this by:

- making log_rand() return smaller numbers so that the clip rectangles
  are more likely to be within the destination image

- adding rand_x() and rand_y() functions that pick positions within an
  image and using them for positioning alpha maps and source/mask
  positions.

- making it less likely that clip regions are used in general

These changes make the test take longer, so speed it up a little by
making most images smaller and by reducing the maximum convolution
filter from 17x19 to 3x4.

With these changes, stress-test reveals a crash in iteration 0xd39
where fast_composite_tiled_repeat() creates an indexed image without a
palette.
2012-07-02 18:21:21 -04:00
Matt Turner
4cdf8e9f3a sse2: add missing ABGR entries for bilinear src_8888_8888 2012-07-01 16:35:46 -04:00
Matt Turner
ef99f9e972 loongson: optimize _mm_set_pi* functions with shuffle instructions 2012-07-01 16:34:45 -04:00
Matt Turner
9aa8e3a260 mmx: optimize bilinear function when using 7-bit precision
Loongson:
image             firefox-fishtank 1037.738 1040.218   0.19%    3/3
image             firefox-fishtank 1056.611 1057.581   0.20%    3/3

ARM/iwMMXt:
image             firefox-fishtank 1487.282 1492.640   0.17%    3/3
image             firefox-fishtank 1363.913 1364.366   0.11%    3/3
2012-07-01 16:34:21 -04:00
Matt Turner
1ad6ae6ee8 mmx: add scaled bilinear over_8888_8_8888
Loongson:
image             firefox-fishtank 1665.163 1670.370   0.17%    3/3
image             firefox-fishtank 1037.738 1040.218   0.19%    3/3

ARM/iwMMXt:
image             firefox-fishtank 2042.723 2045.308   0.10%    3/3
image             firefox-fishtank 1487.282 1492.640   0.17%    3/3
2012-07-01 16:34:14 -04:00
Matt Turner
c43de364cb mmx: add scaled bilinear over_8888_8888
Loongson:
image         firefox-planet-gnome  157.012  158.087   0.30%    6/6
image         firefox-planet-gnome  156.617  157.109   0.15%    5/6

ARM/iwMMXt:
image         firefox-planet-gnome  148.086  149.339   0.76%    6/6
image         firefox-planet-gnome  144.939  146.123   0.61%    6/6
2012-07-01 16:33:19 -04:00
Matt Turner
9209cd746b mmx: add scaled bilinear src_8888_8888
Loongson:
image         firefox-planet-gnome  170.025  170.229   0.09%    3/4
image         firefox-planet-gnome  157.012  158.087   0.30%    6/6

ARM/iwMMXt:
image         firefox-planet-gnome  164.192  164.875   0.34%    3/4
image         firefox-planet-gnome  148.086  149.339   0.76%    6/6
2012-07-01 16:33:08 -04:00
Matt Turner
51f27d7364 mmx: Use expand_alpha instead of mask/shift 2012-07-01 16:25:30 -04:00
Siarhei Siamashka
b0855f095a Change default bilinear interpolation precision to 7 bits
This improves performance for the current SSE2 code. Further
reduction to 4 bits may be considered later if it proves
to allow additional speedup.
2012-07-01 23:00:34 +03:00
Siarhei Siamashka
c430b1dba7 sse2: _mm_madd_epi16 for faster bilinear scaling with 7-bit precision
Reducing interpolation precision allows the use of PMADDWD instruction.
This makes bilinear scaling much faster (on Intel Core i7):

8-bit: image             firefox-fishtank   57.584   58.349   0.74%    3/3
7-bit: image             firefox-fishtank   51.139   51.229   0.30%    3/3

8-bit: src_8888_8888 =  L1: 228.71  L2: 226.52  M:224.82 ( 14.95%)  HT:183.22  VT:154.02  R:171.72  RT:109.36
7-bit: src_8888_8888 =  L1: 320.45  L2: 317.43  M:314.38 ( 20.77%)  HT:215.13  VT:177.35  R:204.46  RT:121.93
2012-07-01 22:40:23 +03:00
Siarhei Siamashka
ccd31896bc Bilinear interpolation precision is now configurable at compile time
Macro BILINEAR_INTERPOLATION_BITS in pixman-private.h selects
the number of fractional bits used for bilinear interpolation.

scaling-test and affine-test have checksums for 4-bit, 7-bit
and 8-bit configurations.
2012-07-01 21:45:43 +03:00
Matt Turner
ad9f1d0201 Fix distcheck due to custom iwMMXt rules 2012-06-29 14:24:30 -04:00
Siarhei Siamashka
ff5d041b88 sse2: faster bilinear scaling (use _mm_loadl_epi64)
Using _mm_loadl_epi64() to load two pixels at once (pairs of top
and bottom pixels) is faster than loading each pixel separately
and combining them with _mm_set_epi32().

=== cairo-perf-trace ===

before: image             firefox-fishtank   66.912   66.931   0.13%    3/3
after:  image             firefox-fishtank   57.584   58.349   0.74%    3/3

=== lowlevel-blt-bench ===

before: src_8888_8888 =  L1: 181.10  L2: 179.14  M:178.08 ( 11.02%)  HT:153.22  VT:133.45  R:142.24  RT: 95.32
after:  src_8888_8888 =  L1: 228.68  L2: 225.75  M:223.98 ( 14.23%)  HT:185.32  VT:155.06  R:162.73  RT:102.52

This improvement was suggested by Matt Turner on irc.
2012-06-29 03:29:32 +03:00
Siarhei Siamashka
fc162bad56 test: support nearest/bilinear scaling in lowlevel-blt-bench
Scale factor is selected to be nearly 1x, so that the MPix/s results
can be directly compared with the results of non-scaled compositing
operations.
2012-06-29 03:24:29 +03:00
Siarhei Siamashka
387e9bcddb test: Fix for strict aliasing issue in 'get_random_seed'
Gets rid of gcc warning when compiled with -fstrict-aliasing option in CFLAGS
2012-06-29 03:23:09 +03:00
Andrea Canciani
4cbeb0aedc build: Fix compilation on win32
When compiling using the win32 build system, config.h is not
available nor needed.

Fixes:

pixman-glyph.c(26) : fatal error C1083: Cannot open include file:
'config.h': No such file or directory
2012-06-20 17:13:33 +02:00
Matt Turner
21077e1b83 sse2: add src_x888_0565
Port of 2ddd1c498b to SSE2.

Uses the pmadd technique described in
http://software.intel.com/sites/landingpage/legacy/mmx/MMX_App_24-16_Bit_Conversion.pdf

Works around lack of packusdw instruction by first sign extending the
values.

fast:	src_8888_0565 =  L1: 681.40  L2: 689.20  M: 644.76 ( 25.51%)  HT:404.42  VT:288.04  R:306.07  RT:150.80 (1619Kops/s)
mmx:	src_8888_0565 =  L1:2056.03  L2:1985.44  M:1574.91 ( 61.87%)  HT:533.10  VT:376.35  R:416.10  RT:178.79 (1833Kops/s)
sse2:	src_8888_0565 =  L1:3793.42  L2:3653.44  M:1878.83 ( 73.94%)  HT:535.03  VT:407.96  R:421.46  RT:163.31 (1727Kops/s)

and for reference, using packusdw
sse4:	src_8888_0565 =  L1:4396.18  L2:4229.25  M:1904.04 ( 75.18%)  HT:559.79  VT:427.96  R:440.06  RT:165.71 (1744Kops/s)

Notice that MMX is faster in the RT case because it can operate on
8 bytes at a time instead of the current 16 bytes for SSE2.
2012-06-16 16:00:00 -04:00
Cyril Brulebois
3acc1ffc32 Upload to unstable. 2012-06-15 01:25:23 +02:00
Cyril Brulebois
1952e2a77b Document the cherry-pick, fixing FTBFS on *i386. 2012-06-15 01:20:14 +02:00
Matt Turner
1701defb49 mmx: add missing _mm_empty calls
Fixes spurious test failures on x86-32.
(cherry picked from commit da6193b1fc)
2012-06-15 01:19:04 +02:00
Cyril Brulebois
8940c5222e Upload to unstable. 2012-06-15 00:16:59 +02:00
Cyril Brulebois
0181d422ab Bump changelogs. 2012-06-15 00:15:43 +02:00
Cyril Brulebois
f53c40a739 Merge branch 'upstream-unstable' into debian-unstable 2012-06-15 00:15:23 +02:00
Matt Turner
7db07cb731 sse2: enable over_n_0565 for b5g6r5
Same as b950bb12 for MMX.
2012-06-13 19:32:21 -04:00
Matt Turner
45946c5fa1 .gitignore: add test/glyph-test 2012-06-13 19:32:21 -04:00
Søren Sandmann Pedersen
eadb442b5c test: Add missing break in stress-test.c
Found by coverity:

https://bugzilla.redhat.com/show_bug.cgi?id=756069
2012-06-13 07:30:06 -04:00
Siarhei Siamashka
492dac7593 test: fix bisecting issue in fuzzer-find-diff.pl
Before bisecting to find the exact test which has failed, we
first need to make sure that the first test is fine (the first
test is "good" and the whole range is "bad"). Otherwise
test 2 gets incorrectly flagged as problematic when
test 1 already fails right from the start.
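
The same idea expressed in C rather than Perl, as a sketch only: the
callback type, the stand-in predicate, and the function names are
hypothetical and do not mirror fuzzer-find-diff.pl.

```c
#include <assert.h>

/* Returns nonzero if running tests 1..n gives the expected result. */
typedef int (*prefix_ok_fn) (int n);

/* Stand-in predicate for demonstration: test number `first_bad` is the
 * first one that breaks the run.
 */
static int first_bad;

static int
prefix_ok_demo (int n)
{
    return n < first_bad;
}

/* Finds the first failing test in 1..total, or returns 0 if none fails. */
static int
find_first_failure (prefix_ok_fn prefix_ok, int total)
{
    int good = 1, bad = total;

    assert (prefix_ok != 0 && total >= 1);

    if (prefix_ok (total))
        return 0;           /* whole range passes */

    if (!prefix_ok (1))
        return 1;           /* without this check the loop reports test 2 */

    /* Invariant: tests 1..good pass, tests 1..bad fail. */
    while (bad - good > 1)
    {
        int mid = good + (bad - good) / 2;

        if (prefix_ok (mid))
            good = mid;
        else
            bad = mid;
    }

    return bad;
}
```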
2012-06-12 04:21:57 +03:00
Siarhei Siamashka
40a0d10eea test: OpenMP 2.5 requires signed loop iteration variables
Unsigned loop variables are only supported since version 3.0
of OpenMP specification. Changing loop variables to use int32_t
type fixes pixman build problems with path64 compiler.
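
A small sketch of a loop written the portable way, with a signed
iteration variable. The function is made up for illustration and is not
taken from the test suite; the pragma is only active when the compiler
defines _OPENMP.

```c
#include <assert.h>
#include <stdint.h>

/* Sums i * i for i in [0, n). */
static int64_t
sum_of_squares (int32_t n)
{
    int64_t sum = 0;
    int32_t i;      /* signed, so OpenMP 2.5 compilers accept the loop */

    assert (n >= 0);

#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum)
#endif
    for (i = 0; i < n; i++)
        sum += (int64_t) i * i;

    return sum;
}
```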
2012-06-12 04:21:07 +03:00
Søren Sandmann Pedersen
619a60d201 test: Make glyph test pass on big endian
The destination buffer was initialized with random uint32_t values, so
it started out different on big endian vs. little endian. Fix that by
initializing the buffer with random uint8_t values instead.
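
A sketch of why byte-wise initialization is independent of host byte
order. The small generator below is a hypothetical stand-in for the test
suite's own random number helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny linear congruential generator (stand-in, not the test suite's). */
static uint32_t lcg_state = 1;

static uint32_t
lcg_next (void)
{
    lcg_state = lcg_state * 1103515245u + 12345u;
    return (lcg_state >> 16) & 0x7fff;
}

/* Fills the buffer one byte at a time. Byte k always receives the k-th
 * random value, so the memory image is the same on big and little
 * endian hosts. Storing whole uint32_t values would instead lay the
 * same numbers out in a different byte order on each.
 */
static void
fill_random_bytes (void *buf, size_t n_bytes)
{
    uint8_t *p = buf;
    size_t i;

    assert (buf != NULL || n_bytes == 0);

    for (i = 0; i < n_bytes; i++)
        p[i] = (uint8_t) (lcg_next () & 0xff);
}
```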
2012-06-11 19:19:23 -04:00
Søren Sandmann Pedersen
f80e7ad3cb bits-image: Turn all the fetchers into iterator getters
Instead of caching these fetchers in the image structure, and then
have the iterator getter call them from there, simply change them to
be iterator getters themselves.

This avoids an extra indirect function call and lets us get rid of the
get_scanline_32/64 fields in pixman_image_t.
2012-06-11 07:15:00 -04:00
Antti S. Lankila
fd175f9d02 Faster unorm_to_unorm for wide processing.
Optimizing the unorm_to_unorm functions allows a speedup from:

src_8888_2x10 =  L1:  62.08  L2:  60.73  M: 59.61 (  4.30%)  HT: 46.81
	VT: 42.17  R: 43.18  RT: 26.01 (325Kops/s)

to:

src_8888_2x10 =  L1:  76.94  L2:  78.43  M: 75.87 (  5.59%)  HT: 56.73
	VT: 52.39  R: 53.00  RT: 29.29 (363Kops/s)

on an i7 Q720-based laptop.

The key of the patch is the observation that unorm_to_unorm's work can
more easily be done with a simple multiplication and shift, when the
function is applied repeatedly and the parameters are not compile-time
constants. For instance, converting from 0xfe to 0xfefe (expanding
from 8 bits to 16 bits) can be done by calculating

c = c * 0x101

However, sometimes the result is not a neat replication of all the
bits. For instance, going from 10 bits to 16 bits can be done by
calculating

c = c * 0x401UL >> 4

where the intermediate result is 20 bit wide repetition of the 10-bit
pattern followed by shifting off the unnecessary lowest bits.

The patch has the algorithm to calculate the factor and the shift, and
converts the code to use it.
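
One way the factor and shift could be derived is sketched below. This is
an illustration of the multiply-and-shift idea under the assumption
0 < from_bits <= to_bits <= 16, not the code from the patch; the type and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    uint32_t factor;   /* replicates the source bit pattern */
    int      shift;    /* drops the surplus low bits        */
} expand_params_t;

/* Computes the factor and shift for widening from_bits to to_bits. */
static expand_params_t
compute_expand_params (int from_bits, int to_bits)
{
    expand_params_t p = { 0, 0 };
    int width = 0;

    assert (from_bits > 0 && from_bits <= to_bits && to_bits <= 16);

    /* Set one bit per copy of the pattern until it covers to_bits. */
    while (width < to_bits)
    {
        p.factor |= (uint32_t) 1 << width;
        width += from_bits;
    }

    p.shift = width - to_bits;
    return p;
}

/* Applies precomputed parameters to one channel value. */
static uint32_t
expand_unorm (uint32_t c, expand_params_t p)
{
    return (c * p.factor) >> p.shift;
}
```

For 8 to 16 bits this yields factor 0x101 and shift 0, and for 10 to 16
bits factor 0x401 and shift 4, matching the two cases above. The
parameters only need to be computed once per format pair.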
2012-06-10 14:23:17 -04:00
Matt Turner
367b78fd5c configure.ac: add iwmmxt2 configure flag
The flag allows the user to select whether pixman-mmx.c is compiled with
-march=iwmmxt or -march=iwmmxt2.

gcc has scheduling support for the Marvell CPU in the XO 1.75 when
building with -march=iwmmxt2.
2012-06-09 16:57:16 -04:00
Matt Turner
31a6563ec5 autotools: use custom build rule to build iwMMXt code
gcc has no sane way of enabling iwmmxt code generation, like -msse for
SSE, so you have to use -march=iwmmxt{,2}. User CFLAGS are placed after
-march=iwmmxt and override the march value, so we have to use a custom
build rule to order the CFLAGS such that pixman-mmx.c will be built with
the necessary CFLAGS.
2012-06-09 16:57:16 -04:00
Søren Sandmann Pedersen
706bf8264c Speed up _pixman_image_get_solid() in common cases
Make _pixman_image_get_solid() faster by special-casing the common
cases where the image is SOLID or a repeating a8r8g8b8 image.

This optimization together with the previous one results in a small
but reproducible performance improvement on the xfce4-terminal-a1
cairo trace:

[ # ]  backend                         test   min(s) median(s) stddev. count
Before:
[  0]    image            xfce4-terminal-a1    1.221    1.239   1.21%  100/100
After:
[  0]    image            xfce4-terminal-a1    1.170    1.199   1.26%  100/100

Either optimization by itself is difficult to separate from noise.
2012-06-02 08:19:38 -04:00
Søren Sandmann Pedersen
934c9d8546 Speed up _pixman_composite_glyphs_no_mask()
Bypass much of the overhead of pixman_image_composite32() by only
computing the composite region once instead of once per glyph, and by
only looking up the composite function whenever the glyph format or
flags change.

As part of this, the pixman_compute_composite_region32() was renamed
to _pixman_compute_composite_region32() and exported in
pixman-private.h.

I couldn't find a trace that would reliably demonstrate that this is
actually an improvement by itself (since _pixman_composite_glyphs_no_mask()
is called so rarely), but together with the following optimization for
solid sources, there is a small but reliable improvement to the
xfce4-terminal-a1 cairo trace.
2012-06-02 08:19:38 -04:00
Søren Sandmann Pedersen
a162189dc0 Speed up pixman_composite_glyphs()
When adding glyphs to the mask, bypass most of the overhead of
pixman_image_composite32() by:

- Only looking up the composite function when the glyph changes either
  format or flags.

- Only using a white source when the glyph format is different from
  the mask format.

- Simply intersecting the glyph rectangle with the destination
  rectangle instead of doing the full _pixman_compute_composite_region32().

Performance results:

[ # ]  backend                         test   min(s) median(s) stddev. count
Before:
[  0]    image            firefox-talos-gfx    6.570    6.577   0.13%    8/10
After:
[  0]    image            firefox-talos-gfx    4.272    4.289   0.28%   10/10

V2: Changes to deal with white sources
2012-06-02 08:19:30 -04:00
Søren Sandmann Pedersen
d9710442b4 test: Add glyph-test
This test exercises the new glyph cache and compositing API. Much of this
test is intended to make sure that clipping and alpha map handling
survive any optimizations that may be added to the glyph compositing.

V2: Evaluating lcg_rand_n() multiple times in an argument list led
    to undefined behavior.
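
C leaves the order in which function arguments are evaluated
unspecified, so two calls to a stateful generator in one argument list
can produce different values per compiler. A sketch of the portable
pattern follows; the generator and the point type are hypothetical
stand-ins, not the glyph-test code.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t seed = 1;

/* Stateful generator: every call advances `seed`. */
static uint32_t
rand_n (uint32_t max)
{
    assert (max > 0);

    seed = seed * 1103515245u + 12345u;
    return ((seed >> 16) & 0x7fff) % max;
}

typedef struct
{
    uint32_t x, y;
} point_t;

/* Unportable:  make_point (rand_n (w), rand_n (h));
 * Portable:    one call per statement, which fixes the order.
 */
static point_t
random_point (uint32_t width, uint32_t height)
{
    point_t p;

    p.x = rand_n (width);    /* always the first draw  */
    p.y = rand_n (height);   /* always the second draw */

    return p;
}
```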
2012-06-02 07:55:11 -04:00
Søren Sandmann Pedersen
dc92374727 Add support for alpha maps to compute_crc32_for_image().
When a destination image I has an alpha map A, the following rules apply:

   - If I has an alpha channel itself, the content of that channel is
     undefined

   - If A has RGB channels, the content of those channels is
     undefined.

Hence in order to compute the CRC32 for such an image, we have to mask
off the alpha channel of the image, and the RGB channels of the alpha
map.

V2: Shifting by 32 is undefined in C
2012-06-02 07:55:11 -04:00
Søren Sandmann Pedersen
43e029d525 Move CRC32 computation from blitters-test.c into utils.c
This way it can be used in other tests.
2012-06-02 07:55:11 -04:00
Søren Sandmann Pedersen
fce31a5ef8 Add pixman_glyph_cache_t API
This new API allows entire glyph strings to be composited in one go
which reduces overhead compared to multiple calls to
pixman_image_composite32().

The pixman_glyph_cache_t is a hash table that maps two keys (a "font"
and a "glyph" key, but they are just keys; there is no distinction
between them as far as pixman is concerned) to a glyph. Glyphs in the
cache can be composited through two new entry points
pixman_composite_glyphs() and
pixman_composite_glyphs_no_mask().

A glyph cache may only be inserted into when it is "frozen", which is
achieved by calling pixman_glyph_cache_freeze(). When
pixman_glyph_cache_thaw() is later called, if the cache has become too
crowded, some glyphs (currently the least-recently-used) will
automatically be evicted. This means that a user must ensure that all
the required glyphs are present in the cache before compositing a
string. The intended way to use the cache is like this:

        pixman_glyph_t glyphs[MAX_GLYPHS];

        pixman_glyph_cache_freeze (cache);

        for (i = 0; i < n_glyphs; ++i)
        {
            const void *g;

            if (!(g = pixman_glyph_cache_lookup (cache, font_key, glyph_key)))
            {
                img = <rasterize glyph as a pixman_image_t>;

                g = pixman_glyph_cache_insert (cache, font_key, glyph_key,
                                               glyph_origin_x, glyph_origin_y,
                                               img);

                if (!g)
                {
                    /* Clean up out-of-memory condition */
                    goto oom;
                }

            }

            /* Record every glyph, whether it was a cache hit or a miss */
            glyphs[i].pos_x = glyph_x_pos;
            glyphs[i].pos_y = glyph_y_pos;
            glyphs[i].glyph = g;
        }

        pixman_composite_glyphs (op, src, dest, ..., cache, n_glyphs, glyphs);

        pixman_glyph_cache_thaw (cache);

V2:
- Move glyphs to front of the MRU list when they are used. Pointed
  out by Behdad Esfahbod.
- Composite glyphs with (white IN glyph) ADD mask in order to support
  mixed a8 and a8r8g8b8 glyphs. Also pointed out by Behdad.
- Add pixman_glyph_get_mask_format
2012-06-02 07:55:11 -04:00
Søren Sandmann Pedersen
a3ae88b71b Add doubly linked lists
This commit adds some new inline functions to maintain a doubly linked
list.

The way to use them is to embed a pixman_link_t into the structures
that should be linked, and use a pixman_list_t as the head of the
list.

The new functions are

    pixman_list_init (pixman_list_t *list);
    pixman_list_prepend (pixman_list_t *list, pixman_link_t *link);
    pixman_list_move_to_front (pixman_list_t *list, pixman_link_t *link);

There is also a new macro:

    CONTAINER_OF(type, member, data);

that can be used to get from a pointer to a member to the containing
structure.

V2: Use the C89 macro offsetof() instead of rolling our own -
suggested by Alan Coopersmith.
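
A self-contained sketch of the embedding pattern. The names drop the
pixman_ prefix on purpose: this version uses a circular list with a
sentinel, which is an assumption made for brevity and not necessarily
how the pixman functions are implemented. Only the CONTAINER_OF argument
order is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct link_t
{
    struct link_t *next, *prev;
} link_t;

typedef struct
{
    link_t sentinel;   /* sentinel.next is the front of the list */
} list_t;

/* Maps a pointer to an embedded member back to its containing struct. */
#define CONTAINER_OF(type, member, data) \
    ((type *) ((char *) (data) - offsetof (type, member)))

static void
list_init (list_t *list)
{
    list->sentinel.next = list->sentinel.prev = &list->sentinel;
}

static void
list_unlink (link_t *link)
{
    link->prev->next = link->next;
    link->next->prev = link->prev;
}

static void
list_prepend (list_t *list, link_t *link)
{
    assert (list != NULL && link != NULL);

    link->next = list->sentinel.next;
    link->prev = &list->sentinel;
    list->sentinel.next->prev = link;
    list->sentinel.next = link;
}

static void
list_move_to_front (list_t *list, link_t *link)
{
    list_unlink (link);
    list_prepend (list, link);
}

/* Example payload with an embedded link (hypothetical). */
typedef struct
{
    int    id;
    link_t mru_link;
} glyph_t;
```

Because the link lives inside the payload, no separate node allocation is
needed, and moving an entry to the front takes constant time.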
2012-06-02 07:54:48 -04:00
Søren Sandmann Pedersen
c2230fe2af Make use of image flags in mmx and sse2 iterators
Now that we have the full image flags available, the SSE2 and MMX
iterators can simply check against SAMPLES_COVER_CLIP_NEAREST (which
is computed in pixman_image_composite32()) instead of comparing all
the x/y/width/height parameters.
2012-05-30 04:42:29 -04:00
Søren Sandmann Pedersen
c1065a9cb4 Pass the full image flags to iterators
When pixman_image_composite32() is called some flags are computed that
indicate various things about the composite operation that can't be
deduced from the image flags themselves. These additional flags are
not currently available to iterators. All they can do is read the
image flags in image->common.flags.

Fix that by passing the info->{src, mask, dest}_flags on to the
iterator initialization and store the flags in the iter struct as
"image_flags". At the same time rename the *iterator* flags variable
to "iter_flags" to avoid confusion.
2012-05-30 04:34:29 -04:00
Matt Turner
da6193b1fc mmx: add missing _mm_empty calls
Fixes spurious test failures on x86-32.
2012-05-27 14:59:56 -04:00
Matt Turner
62c4bdc94f mmx: add over_reverse_n_8888
Loongson:
over_reverse_n_8888 =  L1:  16.04  L2:  15.35  M: 10.20 ( 27.96%)  HT: 10.95  VT: 10.45  R:  9.18  RT:  6.99 (  76Kops/s)
over_reverse_n_8888 =  L1:  27.40  L2:  26.67  M: 16.97 ( 45.78%)  HT: 16.66  VT: 15.38  R: 14.15  RT:  9.44 (  97Kops/s)

image                      poppler   34.106   35.500   1.48%    6/6
image                      poppler   29.598   30.835   1.70%    6/6

ARM/iwMMXt:
over_reverse_n_8888 =  L1:  15.63  L2:  14.33  M: 10.83 ( 27.55%)  HT:  9.78  VT:  9.91  R:  9.49  RT:  6.96 (  69Kops/s)
over_reverse_n_8888 =  L1:  22.79  L2:  19.40  M: 13.76 ( 34.19%)  HT: 11.66  VT: 11.86  R: 11.17  RT:  7.85 (  75Kops/s)

image                      poppler   38.040   38.606   1.10%    6/6
image                      poppler   31.686   32.278   0.80%    5/6
2012-05-26 20:32:27 -04:00
Matt Turner
17acc7a4c7 mmx: add add_0565_0565
Loongson:
add_0565_0565 =  L1:  15.37  L2:  14.91  M: 11.83 ( 16.06%)  HT: 10.53  VT: 10.15  R:  9.74  RT:  6.19 (  68Kops/s)
add_0565_0565 =  L1:  45.06  L2:  46.71  M: 27.45 ( 38.00%)  HT: 23.76  VT: 22.84  R: 18.96  RT:  9.79 ( 104Kops/s)

ARM/iwMMXt:
add_0565_0565 =  L1:  12.87  L2:  11.58  M: 10.11 ( 12.50%)  HT:  9.06  VT:  8.66  R:  7.70  RT:  5.62 (  58Kops/s)
add_0565_0565 =  L1:  31.14  L2:  28.87  M: 22.46 ( 28.60%)  HT: 18.61  VT: 17.04  R: 15.21  RT:  9.35 (  90Kops/s)
2012-05-26 20:32:27 -04:00
Matt Turner
d551dc0494 fast: add add_0565_0565 function
I'll need this code for header and tail alignment loops in MMX, so I
might as well implement a fast path here.
2012-05-26 20:32:27 -04:00
Matt Turner
f8dc0e9834 mmx: implement expand_4x565 in terms of expand_4xpacked565
Loongson:
        over_n_0565 =  L1:  38.57  L2:  38.88  M: 30.01 ( 20.97%)  HT: 23.60  VT: 23.88  R: 21.95  RT: 11.65 ( 113Kops/s)
        over_n_0565 =  L1:  56.28  L2:  55.90  M: 34.20 ( 23.82%)  HT: 25.66  VT: 26.60  R: 23.78  RT: 11.80 ( 115Kops/s)

     over_8888_0565 =  L1:  35.89  L2:  36.11  M: 21.56 ( 45.47%)  HT: 18.33  VT: 17.90  R: 16.27  RT:  9.07 (  98Kops/s)
     over_8888_0565 =  L1:  40.91  L2:  41.06  M: 23.13 ( 48.46%)  HT: 19.24  VT: 18.71  R: 16.82  RT:  9.18 (  99Kops/s)

      over_n_8_0565 =  L1:  28.92  L2:  29.12  M: 21.42 ( 30.00%)  HT: 18.37  VT: 17.75  R: 16.15  RT:  8.79 (  91Kops/s)
      over_n_8_0565 =  L1:  32.32  L2:  32.13  M: 22.44 ( 31.27%)  HT: 19.15  VT: 18.66  R: 16.62  RT:  8.86 (  92Kops/s)

over_n_8888_0565_ca =  L1:  29.33  L2:  29.22  M: 18.99 ( 66.69%)  HT: 16.69  VT: 16.22  R: 14.63  RT:  8.42 (  88Kops/s)
over_n_8888_0565_ca =  L1:  34.97  L2:  34.14  M: 20.32 ( 71.73%)  HT: 17.67  VT: 17.19  R: 15.23  RT:  8.50 (  89Kops/s)

ARM/iwMMXt:
        over_n_0565 =  L1:  29.70  L2:  30.53  M: 24.47 ( 14.84%)  HT: 22.28  VT: 21.72  R: 21.13  RT: 12.58 ( 105Kops/s)
        over_n_0565 =  L1:  41.42  L2:  40.00  M: 30.95 ( 19.13%)  HT: 27.06  VT: 27.28  R: 23.43  RT: 14.44 ( 114Kops/s)

     over_8888_0565 =  L1:  12.73  L2:  11.53  M:  9.07 ( 16.47%)  HT:  9.00  VT:  9.25  R:  8.44  RT:  7.27 (  76Kops/s)
     over_8888_0565 =  L1:  23.72  L2:  21.76  M: 15.89 ( 29.51%)  HT: 14.36  VT: 14.05  R: 12.44  RT:  8.94 (  86Kops/s)

      over_n_8_0565 =  L1:   6.80  L2:   7.15  M:  6.37 (  7.90%)  HT:  6.58  VT:  6.24  R:  6.49  RT:  5.94 (  59Kops/s)
      over_n_8_0565 =  L1:  12.06  L2:  11.02  M: 10.16 ( 13.43%)  HT:  9.57  VT:  8.49  R:  9.10  RT:  6.86 (  69Kops/s)

over_n_8888_0565_ca =  L1:   7.62  L2:   7.01  M:  6.27 ( 20.52%)  HT:  6.00  VT:  6.07  R:  5.68  RT:  5.53 (  57Kops/s)
over_n_8888_0565_ca =  L1:  13.54  L2:  11.96  M:  9.76 ( 30.66%)  HT:  9.72  VT:  8.45  R:  9.37  RT:  6.85 (  67Kops/s)
2012-05-26 20:32:27 -04:00
Matt Turner
51681a052f mmx: add and use expand_4xpacked565 function
Loongson:
add_0565_0565 =  L1:  14.39  L2:  13.98  M: 11.28 ( 15.22%)  HT: 10.11  VT:  9.74  R:  9.39  RT:  6.05 (  67Kops/s)
add_0565_0565 =  L1:  15.37  L2:  14.91  M: 11.83 ( 16.06%)  HT: 10.53  VT: 10.15  R:  9.74  RT:  6.19 (  68Kops/s)

ARM/iwMMXt:
add_0565_0565 =  L1:  11.12  L2:  10.40  M:  8.82 ( 10.65%)  HT:  7.98  VT:  7.41  R:  7.57  RT:  5.21 (  54Kops/s)
add_0565_0565 =  L1:  12.87  L2:  11.58  M: 10.11 ( 12.50%)  HT:  9.06  VT:  8.66  R:  7.70  RT:  5.62 (  58Kops/s)
2012-05-26 20:32:27 -04:00
Søren Sandmann Pedersen
6491c70e3a Post-release version bump to 0.27.1 2012-05-26 16:34:13 -04:00
Søren Sandmann Pedersen
b1a401e6c9 Pre-release version bump to 0.26.0 2012-05-26 16:17:14 -04:00
Ingmar Runge
f71e3dba97 Fix MSVC compilation
Only up to three SSE intrinsics supported in function declaration.
2012-05-25 20:10:31 -04:00
Søren Sandmann Pedersen
1e59e18d73 test: Composite with solid images instead of using pixman_image_fill_*
There are a couple of places where the test suite uses the
pixman_image_fill_* functions to initialize images. These functions
can fail, and will do so if the "fast" implementation is disabled.

So to make sure the test suite passes even using
PIXMAN_DISABLE="fast", use pixman_image_composite32() with a solid
image instead of pixman_image_fill_*.
2012-05-24 15:30:41 -04:00
Nemanja Lukic
30816e3068 MIPS: DSPr2: Added bilinear over_8888_8_8888 fast path.
Performance numbers before/after on MIPS-74kc @ 1GHz

Referent (before):

cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.3
[  0]    image             firefox-fishtank 2289.180 2290.567   0.05%    5/6

Optimized:

cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.3
[  0]    image             firefox-fishtank 1700.925 1708.314   0.22%    5/6
2012-05-23 13:50:05 -04:00
Nemanja Lukic
aea0522f6f MIPS: DSPr2: Fix bug in over_n_8888_8888_ca/over_n_8888_0565_ca routines
In the main loop (unrolled by a factor of 2), instead of negating the
mask values multiplied by srca, the value of srca itself was negated
and passed as the alpha argument to the

    UN8x4_MUL_UN8x4_ADD_UN8x4 macro.

Instead of:

    ma = ~ma;
    UN8x4_MUL_UN8x4_ADD_UN8x4 (d, ma, s);

Code was doing this:

    ma = ~srca;
    UN8x4_MUL_UN8x4_ADD_UN8x4 (d, ma, s);

The fix is to replace registers s0/s1 (containing the srca value) with
t0/t1, which contain the mask values multiplied by srca. Register usage
is also improved (fewer registers are saved on the stack for the
over_n_8888_8888_ca routine).

The bug was introduced in commit d2ee5631 and revealed by composite test.
2012-05-23 13:41:44 -04:00
Søren Sandmann Pedersen
74bf5dc2f9 demos: Add parrot.jpg to EXTRA_DIST
Pointed out by Cyril Brulebois.
2012-05-20 13:09:16 -04:00
Cyril Brulebois
ae5a109768 Upload to experimental. 2012-05-20 17:56:41 +02:00
Cyril Brulebois
a2283057a6 Remove demos/parrot.jpg before building the source package.
Let's avoid “binary file contents changed” until it's shipped in the
upstream tarball.
2012-05-20 17:56:18 +02:00
Cyril Brulebois
5cb7202a34 Bump changelogs. 2012-05-20 17:41:34 +02:00
Cyril Brulebois
4ed6f63c09 Merge branch 'upstream-experimental' into debian-experimental 2012-05-20 17:40:56 +02:00
Matt Turner
55698584be configure.ac: Fail the ARM/iwMMXt test if not compiling with -march=iwmmxt
If not compiling with -march=iwmmxt, the configure test will still pass,
thinking that the __builtin_arm_* intrinsic is a function instead of
generating a single instruction. Since no linking is done, the configure
test doesn't catch this, and we get linking errors in the build.
2012-05-15 16:41:22 -04:00
Søren Sandmann Pedersen
3682b61515 Post-release version bump to 0.25.7 2012-05-15 13:38:44 -04:00
Søren Sandmann Pedersen
1e1a00e964 Pre-release version bump to 0.25.6
Note that 0.25.4 was a botched release that doesn't have a tag and
doesn't correspond to any commit ID. It was however uploaded and
announced, so I'll just use the 0.25.6 version number.
2012-05-15 13:20:09 -04:00
Søren Sandmann Pedersen
b2c16aaadf demos/Makefile.am: Add parrot.c to EXTRA_DIST
To get 'make distcheck' to pass.
2012-05-15 13:19:19 -04:00
Matt Turner
50d3088d78 configure.ac: Rename loongson -> loongson-mmi
Make it match with the other fast paths, and the PIXMAN_DISABLE value is
already loongson-mmi.
2012-05-11 21:59:13 -04:00
Matt Turner
a0a40cb822 configure.ac: Fix loongson-mmi out-of-tree builds
When building out-of-tree, gcc wasn't able to find loongson-mmintrin.h
to compile the test program. Add -I$srcdir to CFLAGS to point gcc to it.
2012-05-11 21:49:42 -04:00
Nemanja Lukic
618a08e6aa MIPS: DSPr2: Added over_n_8_8888 and over_n_8_0565 fast paths.
Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):

lowlevel-blt-bench:
     over_n_8_8888 =  L1:  10.40  L2:   9.79  M:  8.47 ( 33.62%)  HT:  7.64  VT:  7.59  R:  7.48  RT:  5.30 (  40Kops/s)
     over_n_8_0565 =  L1:   7.40  L2:   7.23  M:  6.78 ( 17.94%)  HT:  6.23  VT:  6.17  R:  6.14  RT:  4.62 (  37Kops/s)

Optimized:

lowlevel-blt-bench:
     over_n_8_8888 =  L1:  27.25  L2:  26.24  M: 18.15 ( 72.12%)  HT: 14.52  VT: 14.31  R: 13.83  RT:  7.57 (  48Kops/s)
     over_n_8_0565 =  L1:  18.91  L2:  17.59  M: 15.06 ( 39.90%)  HT: 12.18  VT: 11.98  R: 11.83  RT:  6.80 (  46Kops/s)
2012-05-11 17:11:27 -04:00
Matt Turner
7d4beedc61 mmx: add and use pack_4x565 function
The pack_4x565 makes use of the pack_4xpacked565 function which uses pmadd.

Some of the speed up is probably attributable to removing the artificial
serialization imposed by the
	vdest = pack_565 (..., vdest, 0);
	vdest = pack_565 (..., vdest, 1);
	...
pattern.

Loongson:
        over_n_0565 =  L1:  16.44  L2:  16.42  M: 13.83 (  9.85%)  HT: 12.83  VT: 12.61  R: 12.34  RT:  8.90 (  93Kops/s)
        over_n_0565 =  L1:  42.48  L2:  42.53  M: 29.83 ( 21.20%)  HT: 23.39  VT: 23.72  R: 21.80  RT: 11.60 ( 113Kops/s)

     over_8888_0565 =  L1:  15.61  L2:  15.42  M: 12.11 ( 25.79%)  HT: 11.07  VT: 10.70  R: 10.37  RT:  7.25 (  82Kops/s)
     over_8888_0565 =  L1:  35.01  L2:  35.20  M: 21.42 ( 45.57%)  HT: 18.12  VT: 17.61  R: 16.09  RT:  9.01 (  97Kops/s)

      over_n_8_0565 =  L1:  15.17  L2:  14.94  M: 12.57 ( 17.86%)  HT: 11.96  VT: 11.52  R: 10.79  RT:  7.31 (  79Kops/s)
      over_n_8_0565 =  L1:  29.83  L2:  29.79  M: 21.85 ( 30.94%)  HT: 18.82  VT: 18.25  R: 16.15  RT:  8.72 (  91Kops/s)

over_n_8888_0565_ca =  L1:  15.25  L2:  15.02  M: 11.64 ( 41.39%)  HT: 11.08  VT: 10.72  R: 10.02  RT:  7.00 (  77Kops/s)
over_n_8888_0565_ca =  L1:  30.12  L2:  29.99  M: 19.47 ( 68.99%)  HT: 17.05  VT: 16.55  R: 14.67  RT:  8.38 (  88Kops/s)

ARM/iwMMXt:
        over_n_0565 =  L1:  19.29  L2:  19.88  M: 17.38 ( 10.54%)  HT: 15.53  VT: 16.11  R: 13.69  RT: 11.00 (  96Kops/s)
        over_n_0565 =  L1:  36.02  L2:  34.85  M: 28.04 ( 16.97%)  HT: 22.12  VT: 24.21  R: 22.36  RT: 12.22 ( 103Kops/s)

     over_8888_0565 =  L1:  18.38  L2:  16.59  M: 12.34 ( 22.29%)  HT: 11.67  VT: 11.71  R: 11.02  RT:  6.89 (  72Kops/s)
     over_8888_0565 =  L1:  24.96  L2:  22.17  M: 15.11 ( 26.81%)  HT: 14.14  VT: 13.71  R: 13.18  RT:  8.13 (  78Kops/s)

      over_n_8_0565 =  L1:  14.65  L2:  12.44  M: 11.56 ( 14.50%)  HT: 10.93  VT: 10.39  R: 10.06  RT:  7.05 (  70Kops/s)
      over_n_8_0565 =  L1:  18.37  L2:  14.98  M: 13.97 ( 16.51%)  HT: 12.67  VT: 10.35  R: 11.80  RT:  8.14 (  74Kops/s)

over_n_8888_0565_ca =  L1:  14.27  L2:  12.93  M: 10.52 ( 33.23%)  HT:  9.70  VT:  9.90  R:  9.31  RT:  6.34 (  65Kops/s)
over_n_8888_0565_ca =  L1:  19.69  L2:  17.58  M: 13.40 ( 42.35%)  HT: 11.75  VT: 11.33  R: 11.17  RT:  7.49 (  73Kops/s)
2012-05-10 16:21:07 -04:00
Matt Turner
2beabd9fed configure.ac: make -march=loongson2f come before CFLAGS
Otherwise we'd have -march=loongson2f being overridden by automake's
CFLAGS ordering which causes build failures when -march=<not loongson2f>
is specified by the user.
2012-05-10 16:15:34 -04:00
Søren Sandmann Pedersen
dadb9a318b Add Makefile.win32 and Makefile.win32.common to EXTRA_DIST
https://bugs.freedesktop.org/show_bug.cgi?id=46905
2012-05-10 15:54:32 -04:00
Matt Turner
3c57ec471e .gitignore: add demos/checkerboard and demos/quad2quad 2012-05-09 22:50:50 -04:00
Matt Turner
2d431b53d3 mmx: Use wpackhus in src_x888_0565 on iwMMXt
iwMMXt has an unsigned saturating pack instruction, while MMX/EXT
and Loongson don't.

ARM/iwMMXt:
src_8888_0565 =  L1: 110.38  L2:  82.33  M: 40.92 ( 73.22%)  HT: 35.63  VT: 32.22  R: 30.07  RT: 18.40 ( 132Kops/s)
src_8888_0565 =  L1: 117.91  L2:  83.05  M: 41.52 ( 75.58%)  HT: 37.63  VT: 35.40  R: 29.37  RT: 19.39 ( 134Kops/s)
2012-04-27 16:39:13 -04:00
Matt Turner
2ddd1c498b mmx: add src_8888_0565
Uses the pmadd technique described in
http://software.intel.com/sites/landingpage/legacy/mmx/MMX_App_24-16_Bit_Conversion.pdf

The technique uses the packssdw instruction which uses signed
saturation. This works in their example because they pack 888 to 555
leaving the high bit as zero. For packing to 565, it is unsuitable, so
we replace it with an or+shuffle.
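
A scalar sketch in C of why signed saturation gets in the way of 565 packing. It does not reproduce the pmadd sequence from the commit; pack_565 and saturate_s16 are illustrative names, and saturate_s16 only models what a signed 32-to-16-bit saturating pack does to a single lane.

```c
#include <stdint.h>

/* Pack one a8r8g8b8 pixel into r5g6b5 by keeping the top bits of each channel. */
static uint16_t pack_565(uint32_t p)
{
    return (uint16_t)(((p >> 8) & 0xf800) |   /* red:   bits 23..19 -> 15..11 */
                      ((p >> 5) & 0x07e0) |   /* green: bits 15..10 -> 10..5  */
                      ((p >> 3) & 0x001f));   /* blue:  bits  7..3  ->  4..0  */
}

/* Model of signed saturation from 32 bits down to 16 bits for one lane. */
static int16_t saturate_s16(int32_t v)
{
    if (v > 32767)
        return 32767;
    if (v < -32768)
        return -32768;
    return (int16_t)v;
}
```

A 555 result never exceeds 0x7fff, so it passes through signed saturation unchanged. A 565 result with the top red bit set is larger than 0x7fff when viewed as a 32-bit lane, so saturation would clamp it; for example 0xffff would become 0x7fff.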

Loongson:
src_8888_0565 =  L1: 106.13  L2:  83.57  M: 33.46 ( 68.90%)  HT: 30.29  VT: 27.67  R: 26.11  RT: 15.06 ( 135Kops/s)
src_8888_0565 =  L1: 122.10  L2: 117.53  M: 37.97 ( 78.58%)  HT: 33.14  VT: 30.09  R: 29.01  RT: 15.76 ( 139Kops/s)

ARM/iwMMXt:
src_8888_0565 =  L1:  67.88  L2:  56.61  M: 31.20 ( 56.74%)  HT: 29.22  VT: 27.01  R: 25.39  RT: 19.29 ( 130Kops/s)
src_8888_0565 =  L1: 110.38  L2:  82.33  M: 40.92 ( 73.22%)  HT: 35.63  VT: 32.22  R: 30.07  RT: 18.40 ( 132Kops/s)
2012-04-27 14:12:28 -04:00
Matt Turner
3e8fe65a08 mmx: add x8f8g8b8 fetcher
Loongson:
   add_x888_x888 =  L1:  29.36  L2:  27.81  M: 14.05 ( 38.74%)  HT: 12.45  VT: 11.78  R: 11.52  RT:  7.23 (  75Kops/s)
   add_x888_x888 =  L1:  36.06  L2:  34.55  M: 14.81 ( 41.03%)  HT: 14.01  VT: 13.41  R: 13.06  RT:  9.06 (  90Kops/s)

 src_x888_8_x888 =  L1:  21.92  L2:  20.15  M: 13.35 ( 41.42%)  HT: 11.70  VT: 10.95  R: 10.53  RT:  6.18 (  65Kops/s)
 src_x888_8_x888 =  L1:  25.43  L2:  23.51  M: 14.12 ( 44.00%)  HT: 13.14  VT: 12.50  R: 11.86  RT:  7.49 (  76Kops/s)

over_x888_8_0565 =  L1:  10.64  L2:  10.17  M:  7.74 ( 21.35%)  HT:  6.83  VT:  6.55  R:  6.34  RT:  4.03 (  46Kops/s)
over_x888_8_0565 =  L1:  11.41  L2:  10.97  M:  8.07 ( 22.36%)  HT:  7.42  VT:  7.18  R:  6.92  RT:  4.62 (  52Kops/s)

ARM/iwMMXt:
   add_x888_x888 =  L1:  22.10  L2:  18.93  M: 13.48 ( 32.29%)  HT: 11.32  VT: 10.64  R: 10.36  RT:  6.51 (  61Kops/s)
   add_x888_x888 =  L1:  24.26  L2:  20.83  M: 14.52 ( 35.64%)  HT: 12.66  VT: 12.98  R: 11.34  RT:  7.69 (  72Kops/s)

 src_x888_8_x888 =  L1:  19.33  L2:  17.66  M: 14.26 ( 38.43%)  HT: 11.53  VT: 10.83  R: 10.57  RT:  6.12 (  58Kops/s)
 src_x888_8_x888 =  L1:  21.23  L2:  19.60  M: 15.41 ( 42.55%)  HT: 12.66  VT: 13.30  R: 11.55  RT:  7.32 (  67Kops/s)

over_x888_8_0565 =  L1:   8.15  L2:   7.56  M:  6.50 ( 15.58%)  HT:  5.73  VT:  5.49  R:  5.50  RT:  3.53 (  38Kops/s)
over_x888_8_0565 =  L1:   8.35  L2:   7.85  M:  6.68 ( 16.40%)  HT:  6.12  VT:  5.97  R:  5.78  RT:  4.03 (  43Kops/s)
2012-04-27 13:42:36 -04:00
Matt Turner
c2b1630d96 mmx: add a8 fetcher
oprofile of xfce4-terminal-a1
210535    9.0407  libpixman-1.so.0.25.3    fetch_scanline_a8
144802    6.0054  libpixman-1.so.0.25.3    mmx_fetch_a8

Loongson:
       add_8_8_8 =  L1:  17.98  L2:  17.28  M: 14.28 ( 19.79%)  HT: 11.11  VT: 10.38  R:  9.97  RT:  5.14 (  55Kops/s)
       add_8_8_8 =  L1:  20.44  L2:  19.65  M: 15.62 ( 21.53%)  HT: 12.86  VT: 11.98  R: 11.32  RT:  6.13 (  64Kops/s)

 src_8888_8_0565 =  L1:  19.97  L2:  18.59  M: 13.42 ( 32.55%)  HT: 11.46  VT: 10.78  R: 10.33  RT:  5.87 (  61Kops/s)
 src_8888_8_0565 =  L1:  21.16  L2:  19.68  M: 13.94 ( 33.64%)  HT: 12.31  VT: 11.52  R: 11.02  RT:  6.54 (  68Kops/s)

 src_x888_8_x888 =  L1:  20.54  L2:  18.88  M: 13.07 ( 40.74%)  HT: 11.05  VT: 10.36  R: 10.02  RT:  5.68 (  60Kops/s)
 src_x888_8_x888 =  L1:  21.92  L2:  20.15  M: 13.35 ( 41.42%)  HT: 11.70  VT: 10.95  R: 10.53  RT:  6.18 (  65Kops/s)

over_x888_8_0565 =  L1:  10.32  L2:   9.85  M:  7.63 ( 21.13%)  HT:  6.56  VT:  6.30  R:  6.12  RT:  3.80 (  43Kops/s)
over_x888_8_0565 =  L1:  10.64  L2:  10.17  M:  7.74 ( 21.35%)  HT:  6.83  VT:  6.55  R:  6.34  RT:  4.03 (  46Kops/s)

ARM/iwMMXt:
       add_8_8_8 =  L1:  13.10  L2:  11.67  M: 10.74 ( 13.46%)  HT:  8.62  VT:  8.15  R:  7.94  RT:  4.39 (  44Kops/s)
       add_8_8_8 =  L1:  13.81  L2:  12.79  M: 11.63 ( 13.93%)  HT:  9.33  VT:  9.20  R:  9.04  RT:  5.43 (  52Kops/s)

 src_8888_8_0565 =  L1:  16.62  L2:  15.07  M: 12.52 ( 27.46%)  HT: 10.07  VT: 10.17  R:  9.95  RT:  5.64 (  54Kops/s)
 src_8888_8_0565 =  L1:  16.84  L2:  16.11  M: 13.22 ( 27.71%)  HT: 11.74  VT: 10.90  R: 10.80  RT:  6.66 (  62Kops/s)

 src_x888_8_x888 =  L1:  17.49  L2:  16.22  M: 13.73 ( 38.73%)  HT: 10.10  VT: 10.33  R:  9.55  RT:  5.21 (  52Kops/s)
 src_x888_8_x888 =  L1:  19.33  L2:  17.66  M: 14.26 ( 38.43%)  HT: 11.53  VT: 10.83  R: 10.57  RT:  6.12 (  58Kops/s)

over_x888_8_0565 =  L1:   7.57  L2:   7.29  M:  6.37 ( 15.97%)  HT:  5.53  VT:  5.33  R:  5.21  RT:  3.22 (  35Kops/s)
over_x888_8_0565 =  L1:   8.15  L2:   7.56  M:  6.50 ( 15.58%)  HT:  5.73  VT:  5.49  R:  5.50  RT:  3.53 (  38Kops/s)
2012-04-27 13:42:26 -04:00
Matt Turner
20bad64d9a mmx: add r5g6b5 fetcher
Loongson:
add_0565_0565 =  L1:  12.73  L2:  12.26  M: 10.05 ( 13.87%)  HT:  8.77  VT:  8.50  R:  8.25  RT:  5.28 (  58Kops/s)
add_0565_0565 =  L1:  14.04  L2:  13.63  M: 10.96 ( 15.19%)  HT:  9.73  VT:  9.43  R:  9.11  RT:  5.93 (  64Kops/s)

ARM/iwMMXt:
add_0565_0565 =  L1:  10.36  L2:  10.03  M:  9.04 ( 10.88%)  HT:  3.11  VT:  7.16  R:  7.72  RT:  5.12 (  51Kops/s)
add_0565_0565 =  L1:  10.84  L2:  10.20  M:  9.15 ( 11.46%)  HT:  7.60  VT:  7.82  R:  7.70  RT:  5.41 (  53Kops/s)
2012-04-27 13:42:16 -04:00
Matt Turner
c136e535ad mmx: Use Loongson pextrh instruction in expand565
Same story as pinsrh in the previous commit.

 text	data	bss	dec	hex filename
25336	1952	  0   27288    6a98 .libs/libpixman_loongson_mmi_la-pixman-mmx.o
25072	1952	  0   27024    6990 .libs/libpixman_loongson_mmi_la-pixman-mmx.o

-dsll: 95
+dsll: 70
-dsrl: 135
+dsrl: 105
-ldc1: 462
+ldc1: 445
-lw: 721
+lw: 700
+pextrh: 30
2012-04-27 13:42:07 -04:00
Matt Turner
facceb4a1f mmx: Use Loongson pinsrh instruction in pack_565
The pinsrh instruction is analogous to MMX EXT's pinsrw, except like
other Loongson vector instructions it cannot access the general purpose
registers. In the cases of other Loongson vector instructions, this is a
headache, but it is actually a good thing here. Since the instruction is
different from MMX, I've named the intrinsic loongson_insert_pi16.

 text	data	bss	dec	 hex filename
25976	1952	  0   27928	6d18 .libs/libpixman_loongson_mmi_la-pixman-mmx.o
25336	1952	  0   27288	6a98 .libs/libpixman_loongson_mmi_la-pixman-mmx.o

-and: 181
+and: 147
-dsll: 143
+dsll: 95
-dsrl: 87
+dsrl: 135
-ldc1: 523
+ldc1: 462
-lw: 767
+lw: 721
+pinsrh: 35
2012-04-27 13:41:47 -04:00
Matt Turner
6d29b7d755 mmx: don't pack and unpack src unnecessarily
The combine function was store8888'ing the result, and all consumers
were immediately load8888'ing it, causing lots of unnecessary pack and
unpack instructions.

It's a very straightforward conversion, except for mmx_combine_over_u
and mmx_combine_saturate_u. mmx_combine_over_u was testing the integer
result to skip pixels, so we use the is_* functions to test the __m64
data directly without loading it into an integer register.

For mmx_combine_saturate_u there's not a lot we can do, since it uses
DIV_UN8.
2012-04-27 13:35:31 -04:00
Matt Turner
ee75003425 mmx: introduce is_equal, is_opaque, and is_zero functions
To be used by the next commit.
2012-04-27 13:35:25 -04:00
Matt Turner
10c77b339f mmx: simplify srcsrcsrcsrc calculation in over_n_8_0565 2012-04-27 13:35:19 -04:00
Matt Turner
e06947d101 mmx: remove unnecessary uint64_t<->__m64 conversions
Loongson:
add_8888_8888 =  L1:  68.73  L2:  55.09  M: 25.39 ( 68.18%)  HT: 25.28 VT: 22.42  R: 20.74  RT: 13.26 ( 131Kops/s)
add_8888_8888 =  L1: 159.19  L2: 114.10  M: 30.74 ( 77.91%)  HT: 27.63 VT: 24.99  R: 24.61  RT: 14.49 ( 141Kops/s)
2012-04-27 13:35:14 -04:00
Matt Turner
c78e986085 mmx: compile on MIPS for Loongson MMI optimizations
image               image16
           evolution   32.985 ->  29.667    27.314 ->  23.870
firefox-planet-gnome  197.982 -> 180.437   220.986 -> 205.057
gnome-system-monitor   48.482 ->  49.752    52.820 ->  49.528
  gnome-terminal-vim   60.799 ->  50.528    51.655 ->  44.131
      grads-heat-map    3.167 ->   3.181     3.328 ->   3.321
                gvim   38.646 ->  32.552    38.126 ->  34.453
       midori-zoomed   44.371 ->  43.338    28.860 ->  28.865
           ocitysmap   23.065 ->  18.057    23.046 ->  18.055
             poppler   43.676 ->  36.077    43.065 ->  36.090
  swfdec-giant-steps   20.166 ->  20.365    22.354 ->  16.578
      swfdec-youtube   31.502 ->  28.118    44.052 ->  41.771
   xfce4-terminal-a1   69.517 ->  51.288    62.225 ->  53.309
2012-04-27 13:35:05 -04:00
Matt Turner
4e0c7902b2 mmx: make ldq_u take __m64* directly
Before, if __m64 is allocated in vector or floating-point registers,

	__m64 vs = ldq_u((uint64_t *)src);

would cause src to be loaded into an integer register and then
transferred to an __m64 register. By switching ldq_u's argument type to
__m64 * we give the compiler enough information to recognize that it can
load to the vector register directly.

This patch is necessary for the Loongson optimizations when __m64 is
typedef'd as double.
2012-04-27 13:34:59 -04:00
Matt Turner
2e54b76a2d mmx: add load function and use it in add_8888_8888 2012-04-27 13:34:53 -04:00
Matt Turner
084e3f2f4b mmx: add store function and use it in add_8888_8888 2012-04-27 13:34:45 -04:00
Søren Sandmann Pedersen
e24c1c849d bits_image_fetch_pixel_convolution(): Make sure channels are signed
In the computation:

    srtot += RED_8 (pixel) * f

RED_8 (pixel) is an unsigned quantity, which means the signed filter
coefficient f gets converted to an unsigned integer before the
multiplication. We get away with this because when the 32 bit unsigned
result is converted to int32_t, the correct sign is produced. But if
srtot had been an int64_t, the result would have been a very large
positive number.

Fix this by explicitly casting the channels to int.
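
The conversion described above can be reproduced in isolation. The sketch below assumes a 32-bit int and a RED_8 macro that simply extracts bits 16..23 of a uint32_t pixel. The two function names are illustrative, and the accumulator is deliberately an int64_t to show the hypothetical case from the message.

```c
#include <stdint.h>

/* Assumed definition: extract the red channel of an a8r8g8b8 pixel. */
#define RED_8(x) (((x) >> 16) & 0xff)

/* Unsigned channel: f is converted to unsigned before the multiplication. */
static int64_t accumulate_unsigned(uint32_t pixel, int32_t f)
{
    int64_t srtot = 0;
    srtot += RED_8 (pixel) * f;        /* product has type unsigned int */
    return srtot;
}

/* Channel cast to int first: the multiplication is done on signed values. */
static int64_t accumulate_signed(uint32_t pixel, int32_t f)
{
    int64_t srtot = 0;
    srtot += (int) RED_8 (pixel) * f;  /* product has type int */
    return srtot;
}
```

For a red value of 128 and f = -2, the first function returns 4294967040 (2^32 - 256) and the second returns -256.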
2012-04-20 10:17:13 -04:00
Søren Sandmann Pedersen
4d2fee1406 test/utils.c: Clip values to the [0, 255] interval
Unpremultiplying a superluminescent pixel can result in values greater
than 255.
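
A small worked example in C of how this happens. The function name and the exact rounding are assumptions for illustration; the formula actually used in test/utils.c may differ.

```c
#include <stdint.h>

/*
 * Convert one premultiplied channel c (with alpha a) to non-premultiplied
 * form, clipping the result to the [0, 255] interval.
 */
static uint8_t unpremultiply_channel(uint8_t c, uint8_t a)
{
    uint32_t v;

    if (a == 0)
        return 0;                       /* fully transparent: no color */

    v = ((uint32_t)c * 255 + a / 2) / a;

    /* A "superluminescent" channel has c > a, which gives v > 255. */
    return (uint8_t)(v > 255 ? 255 : v);
}
```

With c = 200 and a = 100 the unclipped quotient is 510, which does not fit in 8 bits; the clip turns it into 255.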
2012-04-20 10:17:13 -04:00
Matt Turner
e291764584 configure.ac: fix iwMMXt/gcc version error message 2012-04-18 18:14:13 -04:00
Matt Turner
b87cd1f605 mmx: fix _mm_shuffle_pi16 function when compiling without optimization
The last argument must be an immediate value, and when compiling without
optimization the compiler might not recognize this. So use a macro if
not optimizing.
2012-04-15 14:03:08 -04:00
Matt Turner
e927d23971 configure.ac: require >= gcc-4.5 for ARM iwMMXt
We're using a patched gcc-4.5, and having to modify configure.ac and
autoreconf between changes is annoying. And besides, 4.5, 4.6, and 4.7's
iwMMXt intrinsic support is equally broken, and we test a known broken
intrinsic in the configure test program, so the version check is rather
meaningless.
2012-04-15 14:00:17 -04:00
Matt Turner
0531170436 mmx: Use force_inline instead of __inline__ (bug 46906)
Fixes the build on MSVC.
2012-04-05 17:36:05 -04:00
Matt Turner
b950bb12dc mmx: enable over_n_0565 for b5g6r5
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-04-05 17:34:26 -04:00
Søren Sandmann Pedersen
87ecec8d72 gtk-utils.c: In pixbuf_from_argb32() use a8r8g8b8_to_rgba_np()
Instead of inlining a copy of that functionality.
2012-04-02 15:25:00 -04:00
Søren Sandmann Pedersen
d1ec1467f6 test/utils.c: Rename and export the pngify_pixels() function.
This function converts from a8r8g8b8 to non-premultiplied RGBA (the
PNG or GdkPixbuf format that has the channels in this order: R, G, B,
A in memory regardless of the computer's endianness). The function's
new name is a8r8g8b8_to_rgba_np().
2012-04-02 15:24:56 -04:00
Søren Sandmann Pedersen
b16ddf1782 gtk-utils.c: Don't include pixman-private.h
Use pixman_image_get_format() instead of image->bits.format.
2012-04-02 14:59:02 -04:00
Søren Sandmann Pedersen
b9ca23a9c7 Rename fast_composite_add_1000_1000 to _add_1_1()
The 1000_1000 name is a relic from before the refactoring.
2012-03-27 22:04:37 -04:00
Søren Sandmann Pedersen
746291a19e Add the original parrot image.
This is the Parrot image that was downscaled and cropped before being
used in the composite-test.c demo.
2012-03-27 22:04:36 -04:00
Søren Sandmann Pedersen
451b25ae90 composite-test.c: Add a parrot image
Instead of the yellow square, use a parrot as the source image. This
demonstrates the various blend modes much better.

The parrot is a cropped version of finger painting by Rubens LP:

    http://www.flickr.com/photos/dorubens/4030604504/in/set-72157622586088192/

where the background has been removed. Used here under Creative
Commons Attribution. The artist's web site:

     http://www.rubenslp.com.br/
2012-03-27 22:04:32 -04:00
Søren Sandmann Pedersen
3aa45d62e4 composite-test.c: Use similar gradient to the one in the PDF spec. 2012-03-24 16:41:47 -04:00
Søren Sandmann Pedersen
e1b8969e78 demos: Add checkerboard demo
This is a simple demo that displays a checkerboard with a projective
transformation.
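
For readers unfamiliar with the term, applying a projective transformation to a point amounts to a 3x3 matrix multiplication followed by a division by the third coordinate. The sketch below uses doubles and a hypothetical apply_projective helper; pixman itself represents transforms in fixed point, which is not modelled here.

```c
/*
 * Apply a 3x3 projective transformation, stored row-major in m[9],
 * to the point (x, y).  Returns 0 if the point maps to infinity.
 */
static int apply_projective(const double m[9], double x, double y,
                            double *out_x, double *out_y)
{
    double u = m[0] * x + m[1] * y + m[2];
    double v = m[3] * x + m[4] * y + m[5];
    double w = m[6] * x + m[7] * y + m[8];

    if (w == 0.0)
        return 0;

    *out_x = u / w;   /* homogeneous divide */
    *out_y = v / w;
    return 1;
}
```

When the bottom row is (0, 0, 1) this reduces to an affine transformation; a non-zero entry in the first two positions of the bottom row is what makes parallel lines converge.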
2012-03-24 16:29:36 -04:00
Søren Sandmann Pedersen
41863fbabb demos: Add quad2quad program
This program can compute the projective transformation that transforms
one quadrilateral into another. The code is basically maxima[1] output
translated into C.

[1] http://maxima.sourceforge.net/
2012-03-24 16:29:27 -04:00
Søren Sandmann Pedersen
cf0d0d6364 Use "=a" and "=d" constraints for rdtsc inline assembly
In 32 bit mode the "=A" constraint refers to the register pair
edx:eax, but according to GCC developers this is not the case in 64
bit mode, where it refers to "rax".

Hence, using "=A" for rdtsc is incorrect in 64 bit mode.

See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21249
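
A minimal sketch of the two-register form, assuming GCC-style inline assembly on x86. The function names are illustrative and this is not the code from the patch. The non-x86 branch returns 0 only so that the example compiles everywhere.

```c
#include <stdint.h>

/* Combine the two 32-bit halves delivered in eax (low) and edx (high). */
static uint64_t tsc_from_halves(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

static uint64_t read_tsc(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t lo, hi;

    /* "=a" binds to eax and "=d" to edx in both 32 and 64 bit mode. */
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return tsc_from_halves(lo, hi);
#else
    return 0;   /* no rdtsc on this target */
#endif
}
```

Because each half is named explicitly, the result does not depend on how the compiler interprets "=A" for the target.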
2012-03-24 16:26:07 -04:00
Jeremy Huddleston
8a8aabf05c configure.ac: Fix a copy-paste-o in TLS detection
Regression from: a069da6c66

Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-03-16 12:41:14 -07:00
Matt Turner
ee6bac11c2 Use AC_LANG_SOURCE for DSPr2 configure program
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-03-15 16:49:29 -04:00
Chun-wei Fan
21eeecffa9 Just include xmmintrin.h on MSVC as well
The xmmintrin.h as shipped with recent Visual C++ (2003+) provides
_mm_shuffle_pi16 and _mm_mulhi_pu16, so including that header
will do for using these functions, and MSVC does not like the GCC-specific
implementations of _mm_shuffle_pi16 and _mm_mulhi_pu16 that are
currently in the code.

_MM_SHUFFLE is declared in the same way in MSVC's xmmintrin.h, so don't
re-define it here to avoid a compilation warning.
2012-03-15 15:18:11 -04:00
Jeremy Huddleston
94aea2e868 Fix a false-negative in MMX check
Silence warnings that could make -Werror give a false negative
Use signed char to avoid cases where int8_t isn't declared

Reported-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-14 19:10:22 -07:00
Nemanja Lukic
d2ee5631ae MIPS: DSPr2: Added over_n_8888_8888_ca and over_n_8888_0565_ca fast paths.
Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):

lowlevel-blt-bench:
     over_n_8888_8888_ca =  L1:   8.32  L2:   7.65  M:  6.38 ( 51.08%)  HT:  5.78  VT:  5.74  R:  5.84  RT:  4.39 (  37Kops/s)
     over_n_8888_0565_ca =  L1:   7.40  L2:   6.95  M:  6.16 ( 41.06%)  HT:  5.72  VT:  5.52  R:  5.63  RT:  4.28 (  36Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.3
[  0]    image            xfce4-terminal-a1  138.223  139.070   0.33%    6/6
[ # ]  image16: pixman 0.25.3
[  0]  image16            xfce4-terminal-a1  132.763  132.939   0.06%    5/6

Optimized:

lowlevel-blt-bench:
     over_n_8888_8888_ca =  L1:  19.35  L2:  23.84  M: 13.68 (109.39%)  HT: 11.39  VT: 11.19  R: 11.27  RT:  6.90 (  47Kops/s)
     over_n_8888_0565_ca =  L1:  18.68  L2:  17.00  M: 12.56 ( 83.70%)  HT: 10.72  VT: 10.45  R: 10.43  RT:  5.79 (  43Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.3
[  0]    image            xfce4-terminal-a1  130.400  131.720   0.46%    6/6
[ # ]  image16: pixman 0.25.3
[  0]  image16            xfce4-terminal-a1  125.830  126.604   0.34%    6/6
2012-03-13 18:04:31 -04:00
Jeremy Huddleston
a069da6c66 Expand TLS support beyond __thread to __declspec(thread)
This code was pretty much copied from a similar commit that I made to
xorg-server in April.

cf: xorg/xserver: bb4d145bd25e2aee988b100ecf1105ea3b6a40b8

Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13 18:02:26 -04:00
Jeremy Huddleston
61d999b910 Disable MMX when incompatible clang is being used.
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13 18:02:26 -04:00
Jeremy Huddleston
ad4b6922f2 Silence a warning about unused pixman_have_mmx
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13 18:02:25 -04:00
Jeremy Huddleston
bb5ff26878 Revert "Disable MMX when Clang is being used."
This reverts commit 5eb4c12a79.
2012-03-13 18:02:25 -04:00
Cyril Brulebois
c6b4daedbc Upload to experimental. 2012-03-09 13:17:30 +01:00
Cyril Brulebois
b3db603f91 Add new symbols and bump shlibs accordingly. 2012-03-09 13:15:11 +01:00
Cyril Brulebois
e6c37e621b Bump changelogs. 2012-03-09 13:03:52 +01:00
Cyril Brulebois
e4e7b8fcb8 Merge branch 'debian-unstable' into debian-experimental 2012-03-09 13:03:07 +01:00
Cyril Brulebois
44abaa5132 Merge branch 'upstream-unstable' into debian-experimental 2012-03-09 13:03:04 +01:00
Søren Sandmann Pedersen
a6ad5120f7 Post-release version bump to 0.25.3 2012-03-08 10:11:20 -05:00
Søren Sandmann Pedersen
f73f798531 Pre-release version bump to 0.25.2 2012-03-08 09:33:16 -05:00
Søren Sandmann Pedersen
62df04eb25 mmx: Squash a warning by making the argument to ldl_u() const 2012-03-08 09:29:46 -05:00
Alan Coopersmith
85943733cb Just use xmmintrin.h when building with Solaris Studio compilers
Since the Solaris Studio compilers don't have a mode where MMX
instructions are available and SSE instructions are not, we can
just use the <xmmintrin.h> header directly.

Fixes build failure due to Studio not supporting the __gnu_inline__
or __artificial__ attributes.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2012-03-05 18:57:26 -08:00
Nemanja Lukic
304f57644a MIPS: DSPr2: Added mips_dspr2_blt and mips_dspr2_fill routines.
Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):

lowlevel-blt-bench:
              src_n_0565 =  L1: 238.14  L2: 233.15  M: 57.88 ( 77.23%)  HT: 53.22  VT: 49.99  R: 47.73  RT: 24.79 (  91Kops/s)
              src_n_8888 =  L1: 190.19  L2: 187.57  M: 28.94 ( 77.23%)  HT: 27.91  VT: 27.33  R: 26.64  RT: 14.68 (  77Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.1
[  0]    image         gnome-system-monitor  268.460  269.712   0.22%    6/6

Optimized:

lowlevel-blt-bench:
              src_n_0565 =  L1:1081.39  L2: 258.22  M:189.59 (252.91%)  HT: 60.23  VT: 55.01  R: 53.44  RT: 23.68 (  89Kops/s)
              src_n_8888 =  L1: 653.46  L2: 113.55  M:135.26 (360.86%)  HT: 38.99  VT: 37.38  R: 34.95  RT: 18.67 (  84Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.1
[  0]    image         gnome-system-monitor  246.565  246.706   0.04%    6/6
2012-03-04 01:09:56 -05:00
Søren Sandmann Pedersen
999e72b80b pixman-access.c: Remove some unused macros
The macros related to palette entries:

RGB15_TO_ENTRY,
RGB24_TO_ENTRY,
RGB24_TO_ENTRY_Y

are not used anywhere.
2012-03-01 23:49:51 -05:00
Søren Sandmann Pedersen
c0cb48aae0 pixman-accessors.h: Delete unused macros
The MEMCPY_WRAPPED and ACCESS macros are not used anymore.
2012-03-01 23:49:51 -05:00
Søren Sandmann Pedersen
5adf569317 Move fetching for solid bits images to pixman-noop.c
This should be a bit faster because it can reuse the scanline on each iteration.
2012-03-01 23:49:50 -05:00
Matt Turner
3c3c70fa0b lowlevel-blt-bench: add in_8_8 and in_n_8_8
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-03-01 17:42:37 -05:00
Søren Sandmann Pedersen
fcea053561 Disable implementations mentioned in the PIXMAN_DISABLE environment variable.
With this, it becomes possible to do

     PIXMAN_DISABLE="sse2 mmx" some_app

which will run some_app without SSE2 and MMX enabled. This is useful
for benchmarking, testing and narrowing down bugs.

The current list of implementations that can be disabled:

    fast
    mmx
    sse2
    arm-simd
    arm-iwmmxt
    arm-neon
    mips-dspr2
    vmx

The general and noop implementations can't be disabled because pixman
depends on those being available for correct operation.

Reviewed-by: Matt Turner <mattst88@gmail.com>
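
One way such a check could be written is sketched below. It assumes the variable holds space-separated names, as in the example above. Both name_in_list and implementation_disabled are hypothetical helpers, not pixman's actual functions, and pixman's own parsing may handle delimiters differently.

```c
#include <stdlib.h>
#include <string.h>

/* Return 1 if name appears as a whole space-separated word in list. */
static int name_in_list(const char *list, const char *name)
{
    size_t len = strlen(name);
    const char *p = list;

    if (list == NULL || len == 0)
        return 0;

    while (*p)
    {
        const char *end;

        while (*p == ' ')               /* skip separators */
            p++;

        end = p;
        while (*end && *end != ' ')     /* find the end of this word */
            end++;

        if ((size_t)(end - p) == len && memcmp(p, name, len) == 0)
            return 1;

        p = end;
    }

    return 0;
}

/* Hypothetical wrapper: is this implementation listed in PIXMAN_DISABLE? */
static int implementation_disabled(const char *name)
{
    return name_in_list(getenv("PIXMAN_DISABLE"), name);
}
```

Matching whole words matters here because some names are prefixes or suffixes of others; "neon" should not match "arm-neon".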
2012-02-28 15:46:13 -05:00
Nemanja Lukic
e7574d336b MIPS: DSPr2: Added fast-paths for SRC operation.
The following fast-path functions are implemented (routines 4, 5 and 6 use
the same fast-memcpy routine):
    1. src_x888_8888
    2. src_8888_0565
    3. src_0565_8888
    4. src_0565_0565
    5. src_8888_8888
    6. src_0888_0888

Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):

lowlevel-blt-bench:
        src_x888_8888 =  L1: 199.35  L2:  96.54  M: 18.87 (100.68%)  HT: 17.12  VT: 16.24  R: 15.43  RT:  9.33 (  61Kops/s)
        src_8888_0565 =  L1:  71.22  L2:  51.95  M: 24.19 ( 96.17%)  HT: 20.71  VT: 19.92  R: 18.15  RT:  9.92 (  63Kops/s)
        src_0565_8888 =  L1:  38.82  L2:  36.22  M: 18.60 ( 73.95%)  HT: 14.47  VT: 13.19  R: 12.97  RT:  6.61 (  49Kops/s)
        src_0565_0565 =  L1: 286.05  L2: 155.02  M: 37.68 (100.54%)  HT: 31.08  VT: 28.07  R: 26.26  RT: 11.93 (  68Kops/s)
        src_8888_8888 =  L1: 454.32  L2: 139.15  M: 19.30 (102.98%)  HT: 17.73  VT: 16.08  R: 16.62  RT: 10.45 (  64Kops/s)
        src_0888_0888 =  L1: 190.47  L2: 106.14  M: 25.26 (101.08%)  HT: 21.88  VT: 20.32  R: 18.83  RT: 10.10 (  63Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.1
[  0]    image            firefox-asteroids  421.215  421.325   0.01%    4/6
[  1]    image         firefox-planet-gnome  647.708  648.486   0.13%    6/6
[  2]    image         gnome-system-monitor  276.073  277.506   0.38%    6/6
[  3]    image           gnome-terminal-vim  263.866  265.229   0.39%    6/6
[  4]    image                      poppler  123.576  124.003   0.15%    6/6

Optimized (with these optimizations):

lowlevel-blt-bench:
        src_x888_8888 =  L1: 369.50  L2:  99.37  M: 27.19 (145.07%)  HT: 20.24  VT: 19.48  R: 19.00  RT: 10.22 (  63Kops/s)
        src_8888_0565 =  L1: 105.65  L2:  67.87  M: 25.41 (101.00%)  HT: 20.78  VT: 19.84  R: 18.52  RT:  9.81 (  63Kops/s)
        src_0565_8888 =  L1:  77.10  L2:  63.04  M: 23.37 ( 92.90%)  HT: 20.29  VT: 19.37  R: 18.14  RT: 10.02 (  63Kops/s)
        src_0565_0565 =  L1: 519.02  L2: 241.32  M: 62.35 (166.34%)  HT: 33.74  VT: 27.63  R: 26.12  RT: 11.70 (  67Kops/s)
        src_8888_8888 =  L1: 390.48  L2: 113.99  M: 30.32 (161.77%)  HT: 19.55  VT: 17.05  R: 17.13  RT: 10.19 (  63Kops/s)
        src_0888_0888 =  L1: 349.74  L2: 156.68  M: 40.68 (162.78%)  HT: 25.58  VT: 20.57  R: 20.20  RT:  9.96 (  63Kops/s)
cairo-perf-trace:
[ # ]  backend                         test   min(s) median(s) stddev. count
[ # ]    image: pixman 0.25.1
[  0]    image            firefox-asteroids  400.050  400.308   0.04%    6/6
[  1]    image         firefox-planet-gnome  628.978  629.364   0.07%    6/6
[  2]    image         gnome-system-monitor  270.247  270.313   0.03%    6/6
[  3]    image           gnome-terminal-vim  256.413  257.641   0.21%    6/6
[  4]    image                      poppler  119.540  120.023   0.21%    6/6
2012-02-25 15:06:43 -05:00
Nemanja Lukic
1364c91bd1 MIPS: DSPr2: Basic infrastructure for MIPS architecture
MIPS DSP instruction set extensions
2012-02-25 15:06:43 -05:00
Matt Turner
e43d65d49d lowlevel-blt: add over_x888_n_8888
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24 20:02:55 -05:00
Matt Turner
9f60704995 lowlevel-blt: add over_8888_8888
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24 19:58:09 -05:00
Søren Sandmann Pedersen
5eb4c12a79 Disable MMX when Clang is being used.
There are several issues with the Clang compiler and pixman-mmx.c:

- When not optimizing, it doesn't seem to recognize that an argument
  to an __always_inline__ function is compile-time constant. This
  results in this error being produced:

      fatal error: error in backend: Invalid operand for inline asm
              constraint 'K'!

- This inline assembly:

      asm ("pmulhuw %1, %0\n\t"
          : "+y" (__A)
          : "y" (__B)
      );

  results in

      fatal error: error in backend: Unsupported asm: input constraint
              with a matching output constraint of incompatible type!

So disable MMX when the compiler is Clang.
2012-02-24 16:30:41 -05:00
Matt Turner
350e231b3f mmx: make load8888 take a pointer to data instead of the data itself
Allows us to tune how we load data into the vector registers.

Signed-off-by: Matt Turner <mattst88@gmail.com>

And squashed in:

mmx: define and use load8888u function

For unaligned loads.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24 08:46:48 -05:00
Matt Turner
ab68316eda mmx: make store8888 take uint32_t *dest as argument
Allows us to tune how we store data from the vector registers.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24 08:46:28 -05:00
Matt Turner
57a245a6e0 Update .gitignore with more demos and tests
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-22 16:32:46 -05:00
Søren Sandmann Pedersen
51ae3f2d7f mmx: Delete unused function in_over_full_src_alpha()
Also a few minor formatting fixes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-02-22 14:14:30 -05:00
Søren Sandmann Pedersen
bbd1e6941b mmx: Enable over_x888_8_8888() for x86 as well
It used to be slower than the generic code (with the gcc that was
current in 2007), but that doesn't seem to be the case anymore:

over_x888_8_8888 =  L1:  22.97  L2:  22.88  M: 22.27 (  5.29%)  HT: 18.30  VT: 15.81  R: 15.54  RT: 10.35 ( 131Kops/s)
over_x888_8_8888 =  L1:  53.56  L2:  53.20  M: 50.50 ( 11.99%)  HT: 38.60  VT: 31.19  R: 29.00  RT: 17.37 ( 208Kops/s)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-02-22 14:14:08 -05:00
Matt Turner
4fc586c3df mmx: fix typo in pix_add_mul on MSVC
Typo introduced in commit a075a870.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-21 16:28:37 -05:00
Matt Turner
84221f4c16 mmx: Use _mm_shuffle_pi16
The pshufw x86 instruction is part of Extended 3DNow! and SSE1. The
equivalent ARM wshufh instruction was available from the first iwMMXt
instruction set.

This instruction is already used in the SSE2 code.

Reduces code size by ~9%.

amd64
  text    data     bss     dec     hex filename
 29925    2240       0   32165    7da5 .libs/libpixman_mmx_la-pixman-mmx.o
 27237    2240       0   29477    7325 .libs/libpixman_mmx_la-pixman-mmx.o

x86
  text    data     bss     dec     hex filename
 27677    1792       0   29469    731d .libs/libpixman_mmx_la-pixman-mmx.o
 24959    1792       0   26751    687f .libs/libpixman_mmx_la-pixman-mmx.o

arm
  text    data     bss     dec     hex filename
 30176    1792       0   31968    7ce0 .libs/libpixman_iwmmxt_la-pixman-mmx.o
 27384    1792       0   29176    71f8 .libs/libpixman_iwmmxt_la-pixman-mmx.o

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-21 12:47:49 -05:00
Matt Turner
1420834496 mmx: Use _mm_mulhi_pu16
The pmulhuw x86 instruction is part of Extended 3DNow! and SSE1. The
equivalent ARM wmuluh instruction was available from the first iwMMXt
instruction set.

This instruction is already used in the SSE2 code.

Reduces code size by ~5%.

amd64
  text    data     bss     dec     hex filename
 31325    2240       0   33565    831d .libs/libpixman_mmx_la-pixman-mmx.o
 29925    2240       0   32165    7da5 .libs/libpixman_mmx_la-pixman-mmx.o

x86
  text    data     bss     dec     hex filename
 29165    1792       0   30957    78ed .libs/libpixman_mmx_la-pixman-mmx.o
 27677    1792       0   29469    731d .libs/libpixman_mmx_la-pixman-mmx.o

arm
  text    data     bss     dec     hex filename
 31632    1792       0   33424    8290 .libs/libpixman_iwmmxt_la-pixman-mmx.o
 30176    1792       0   31968    7ce0 .libs/libpixman_iwmmxt_la-pixman-mmx.o

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-21 12:46:02 -05:00
Matt Turner
69ed71fad1 mmx: enable over_x888_8_8888 on ARM/iwMMXt
before: over_x888_8_8888 =  L1:   7.63  L2:   7.72  M:  6.44 ( 19.17%)  HT: 6.24  VT:  6.11  R:  5.87  RT:  4.61 (  51Kops/s)
after : over_x888_8_8888 =  L1:  11.88  L2:  11.11  M:  8.70 ( 26.01%)  HT: 8.15  VT:  8.07  R:  7.76  RT:  5.62 (  61Kops/s)

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-20 19:07:44 -05:00
Matt Turner
a14f0f66bb autoconf: use #error instead of error
We'd rather see the actual #error message than a syntax error in
config.log.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-20 18:36:24 -05:00
Matt Turner
fced5c82c2 Convert while (w) to if (w) when possible
Missed in commit 57fd8c37.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-18 17:41:10 -05:00
Matt Turner
e27bdcd968 Make sure to run AC_SUBST IWMMXT_CFLAGS
Allows you to compile without -flax-vector-conversions in your CFLAGS,
though -march=iwmmxt2 is still necessary since specifying some other
-march= value will override it, and disable iwmmxt.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-17 18:10:37 -05:00
Jeremy Huddleston
82a3980701 configure.ac: Add an --enable-libpng option
Now there is a way to not link against libpng even if it's available.

Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-02-16 15:22:32 -05:00
Matt Turner
46fc4eb234 Use AC_LANG_SOURCE for iwMMXt configure program
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-11 23:47:10 -05:00
Julien Cristau
b60708fb0e Upload to unstable 2012-02-09 21:16:57 +01:00
Julien Cristau
20446ebc6b Bump changelogs 2012-02-09 20:52:20 +01:00
Julien Cristau
00e59db614 pixman 0.24.4 release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (GNU/Linux)
 
 iEYEABECAAYFAk8zEZkACgkQmxfmIW/3wagcTwCgjGvmVz4suHSfs+OzQWEmBDqv
 dCYAnjcm0p9EaocqWhbUV2UfGC0NMX8A
 =wOcR
 -----END PGP SIGNATURE-----

Merge tag 'pixman-0.24.4' into debian-unstable

pixman 0.24.4 release
2012-02-09 20:48:25 +01:00
Søren Sandmann Pedersen
8bff730a98 Pre-release version bump to 0.24.4 2012-02-08 19:03:22 -05:00
Søren Sandmann Pedersen
c5c866a394 Revert "Reject trapezoids where top (botttom) is above (below) the edges"
Cairo 1.10 will sometimes generate trapezoids like this, so we can't
consider them invalid. Fixes bug 45009, reported by Michael Biebl.

This reverts commit 2437ae80e5.
2012-02-08 19:01:05 -05:00
Bobby Salazar
1ceb66750c iOS Runtime Detection Support For ARM NEON
This patch adds runtime detection support for the ARM NEON fast paths
for code compiled with the iOS SDK.
2012-02-08 19:01:03 -05:00
Søren Sandmann Pedersen
e5555d7a74 Revert "Reject trapezoids where top (botttom) is above (below) the edges"
Cairo 1.10 will sometimes generate trapezoids like this, so we can't
consider them invalid. Fixes bug 45009, reported by Michael Biebl.

This reverts commit 2437ae80e5.
2012-01-31 09:10:07 -05:00
Bobby Salazar
3557787697 iOS Runtime Detection Support For ARM NEON
This patch adds runtime detection support for the ARM NEON fast paths
for code compiled with the iOS SDK.
2012-01-31 09:10:07 -05:00
Cyril Brulebois
11ddc57db9 Upload to unstable. 2012-01-19 12:23:22 +01:00
Cyril Brulebois
cbde497236 Bump changelogs. 2012-01-19 12:21:28 +01:00
Cyril Brulebois
ed216c187b Merge branch 'upstream-unstable' into debian-unstable 2012-01-19 12:20:52 +01:00
Søren Sandmann Pedersen
7ccb0c45e5 Post-release version bump to 0.24.3 2012-01-18 16:06:05 -05:00
Søren Sandmann Pedersen
08070759c3 Pre-release version bump to 0.24.2 2012-01-18 15:49:24 -05:00
Søren Sandmann Pedersen
a9b4fa378b Fix bugs with alpha maps
The alpha channel from the alpha map must be inserted as the new alpha
channel when a scanline is fetched from an image. Previously the alpha
map would overwrite the buffer instead. This wasn't caught by the
alpha map test because it would only verify that the resulting alpha
channel was correct, and not pay attention to incorrect color
channels.
2012-01-18 15:37:36 -05:00
Alan Coopersmith
7dd2b8ee7e Make mmx code compatible with Solaris Studio 12.3 compilers
Rearranged some of the existing gcc & Intel compiler checks to allow
easier sharing of common cases among the compilers.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2012-01-18 15:37:26 -05:00
Søren Sandmann Pedersen
ee500cb2b1 Reject trapezoids where top (botttom) is above (below) the edges
When a trapezoid has a top/bottom that is above/below the left/right
edges, degenerate trapezoids become possible. For example the edge
could be very short and close to horizontal. If the bottom edge is far
below the bottom point of such a short edge, the result is that the
lower right corner of the trapezoid will be extremely far to the left.

This kind of trapezoid causes overflows in the rasterization code, so
change pixman_trapezoid_valid() to reject them.
2012-01-18 15:37:08 -05:00
Søren Sandmann Pedersen
1398a2fae4 Fix some signed overflow bugs
In the macros for the PDF blend modes, two comp1_t variables are
multiplied together and then used as if the result were a
comp4_t. When comp1_t is a uint8_t, this is fine because they are
promoted to int, and the product of two uint8_ts fits in an
int. However, when comp1_t is uint16, the product does not necessarily
fit in an int, so casts are necessary.

Fix for bug 43906, reported by Siarhei Siamashka.
2012-01-18 15:36:50 -05:00
Søren Sandmann Pedersen
419820cce6 pixman-image.c: Fix typo in pixman_image_set_transform()
A parenthesis was misplaced so that the size argument to memcmp() was
always 0. The bug is harmless except that the flags might be
unnecessarily recomputed in some cases.

A bug reporting this in Mozilla's fork was discovered here:

    https://bugzilla.mozilla.org/show_bug.cgi?id=710992
2012-01-18 15:36:34 -05:00
Colin Walters
5bd74a7c96 autogen.sh: Support GNOME Build API
http://people.gnome.org/~walters/docs/build-api.txt
2012-01-18 15:36:22 -05:00
Søren Sandmann Pedersen
dbb6148158 gradient-walker: For NONE repeats, when x < 0 or x > 1, set both colors to 0
ec7c9c2b68 introduced a bug where NONE gradients would be
misrendered, causing the area outside the gradient to be treated as a
(very) long fade to transparent. The problem was that a check for
positions outside the gradients was dropped in favor of relying on
the sentinels.

Aside from misrendering, this also caused a signed integer overflow
when the code would compute a stepper size based on MIN_INT32.

This patch fixes the issue by reinstating a check for these cases
and setting both the right and left colors to transparent black.
2012-01-18 15:36:13 -05:00
Bobby Salazar
b14fd2ad60 Android Runtime Detection Support For ARM NEON
This patch adds runtime detection support for the ARM NEON fast paths
for code compiled with the Android NDK. This is the only code change
needed to enable the ARM NEON pixman fast paths for the ever growing
Android platform (200 million+ smartphones, tablets, etc.). Just make
sure to #define USE_ARM_NEON in your makefile.
2012-01-18 15:35:41 -05:00
Naohiro Aota
3c87d862d9 Don't use non-POSIX test
test "$test_CFLAGS" == "" &&         \

may cause an error on some POSIX shells and uses a style which is not
consistent with the other tests in configure.ac

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=42588 and
https://bugs.gentoo.org/show_bug.cgi?id=387087
2012-01-18 15:35:30 -05:00
Søren Sandmann Pedersen
c19a09b314 Post-release version bump to 0.24.1 2012-01-18 15:35:09 -05:00
Søren Sandmann Pedersen
86ce180882 test: Port composite test over to use new pixel_checker_t object.
Also make some tweaks to the way the errors are printed.
2012-01-10 09:04:45 -05:00
Søren Sandmann Pedersen
f57034f678 test: Add a new "pixel_checker_t" object.
Add a new pixel_checker_t object to test/utils.[ch]. This object
should be initialized with a format and can then be used to check
whether a given "real" pixel in that format is close enough to a
"perfect" pixel given as a double precision ARGB struct.

The acceptable deviation is calculated as follows. Each channel of the
perfect pixel has 0.004 subtracted from it and is then converted to
the format. The resulting value is the minimum value that will be
accepted. Similarly, to compute the maximum value, the channel has
0.004 added to it and is then converted to the given format. Checking
a pixel is then a matter of splitting it into channels and checking
that each is within the computed bounds.

The value of 0.004 was chosen because it is the minimum one that will
make the existing composite test pass (see next commit). A problem
with this value is that it causes 0xFE to be acceptable when the
correct value is 1.0, and 0x01 to be acceptable when the correct value
is 0. It would be better if, when the result is exactly 0 or exactly
1, an a8r8g8b8 pixel were required to produce exactly 0x00 or 0xff to
preserve full black and full white. A deviation value of 0.003 would
produce this, but currently this would cause tests with operators that
involve divisions to fail.
2012-01-10 09:04:45 -05:00
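The bounds computation described in the pixel_checker_t commit above can be sketched as follows for a single 8-bit channel. This is a minimal sketch, not the code in test/utils.c: the names channel_to_u8, channel_bounds and channel_ok are hypothetical, and the float-to-integer conversion is assumed to follow the round_color() description (multiply by 1 << width, with 1.0 mapping to the maximum value).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed conversion: floor (v * 256), with 1.0 (and above) mapping to 255. */
uint8_t channel_to_u8(double v)
{
    if (v <= 0.0)
        return 0;
    if (v >= 1.0)
        return 255;
    return (uint8_t)(v * 256.0);
}

/* Hypothetical container for the accepted range of one channel. */
typedef struct { uint8_t lo, hi; } channel_bounds_t;

/* Subtract/add the deviation, then convert each limit to the format. */
channel_bounds_t channel_bounds(double perfect, double deviation)
{
    channel_bounds_t b;

    assert(deviation >= 0.0);
    b.lo = channel_to_u8(perfect - deviation);
    b.hi = channel_to_u8(perfect + deviation);
    return b;
}

/* A real channel value passes if it lies within the computed bounds. */
int channel_ok(uint8_t real, double perfect, double deviation)
{
    channel_bounds_t b = channel_bounds(perfect, deviation);

    return real >= b.lo && real <= b.hi;
}
```

Under these assumptions a deviation of 0.004 accepts 0xFE for a perfect value of 1.0 and 0x01 for a perfect value of 0, while 0.003 requires exactly 0xFF and 0x00, which matches the trade-off the commit message describes.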
Søren Sandmann Pedersen
0053a9f869 Rename color_correct() to round_color()
And do the rounding from float to int in the same way cairo does: by
multiplying with (1 << width), then subtracting one when the input was 1.0.
2012-01-10 09:04:45 -05:00
Søren Sandmann Pedersen
55a010bf31 Move the color_correct() function from composite.c to utils.c 2012-01-10 09:04:45 -05:00
Søren Sandmann Pedersen
065666f33c Get rid of delegates for combiners
Add a new function _pixman_implementation_lookup_combiner() that will
find a usable combiner given an operator and information about whether
the combiner should apply component alpha and whether it should be 64
bit.

In pixman-general.c use this function to look up a combiner up front
instead of walking the delegate chain for every scanline.
2012-01-10 09:04:37 -05:00
Søren Sandmann Pedersen
ab584ab500 test/alphamap.c: Make dst and orig_dst more independent of each other
When making the copy of the destination, do so separately for the
image and the alpha map. This ensures that the alpha channel of the
alpha map will be different from the alpha channel of the actual
image.

Previously, orig_dst would be copied onto dst along with its alpha
map, which meant that the alpha map of orig_dst would become the new
alpha channel of *both* dst and dst's alpha map. This meant that the test
didn't actually test that the alpha map's alpha channel was actually
fetched.
2012-01-10 09:04:36 -05:00
Søren Sandmann Pedersen
4613f2caac Fix bugs with alpha maps
The alpha channel from the alpha map must be inserted as the new alpha
channel when a scanline is fetched from an image. Previously the alpha
map would overwrite the buffer instead. This wasn't caught by the
alpha map test because it would only verify that the resulting alpha
channel was correct, and not pay attention to incorrect color
channels.
2012-01-10 09:04:36 -05:00
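The difference between overwriting a fetched scanline and inserting only the alpha channel can be sketched as below. This is an illustration on plain a8r8g8b8 words, not pixman's fetch code; apply_alpha_map is a hypothetical name, and the sketch ignores any premultiplication concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace only the alpha byte of each fetched pixel with the alpha byte of
 * the corresponding alpha-map pixel; the color channels are left untouched. */
void apply_alpha_map(uint32_t *buffer, const uint32_t *alpha, size_t w)
{
    assert(buffer != NULL && alpha != NULL);

    for (size_t i = 0; i < w; i++)
        buffer[i] = (alpha[i] & 0xff000000u) | (buffer[i] & 0x00ffffffu);
}
```

A test that only inspects the resulting alpha byte cannot tell this apart from a plain copy of the alpha-map pixel, which is why checking a color channel as well catches the bug.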
Søren Sandmann Pedersen
8bd63634cd test: In the alphamap test, also test that we get the right red value
There is a bug where the red channel of the alpha map of the
destination image is used instead of the red channel of the
destination image itself.
2012-01-10 09:04:36 -05:00
Alan Coopersmith
007d8b1813 Make mmx code compatible with Solaris Studio 12.3 compilers
Rearranged some of the existing gcc & Intel compiler checks to allow
easier sharing of common cases among the compilers.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2012-01-09 10:28:23 -08:00
Søren Sandmann Pedersen
3757245586 Fix rounding for DIV_UNc()
We need to compute floor (a/b * 255 + 0.5), not floor (a / b * 255),
so add b/2 to the numerator in the DIV_UNc() macro.
2012-01-09 05:40:34 -05:00
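The rounding change in the DIV_UNc() commit above can be illustrated with a small stand-alone sketch. These helpers are hypothetical and are not pixman's macro; they assume 8-bit channels and saturate at 255.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating form: floor (a / b * 255). */
uint8_t div_un8_truncating(uint8_t a, uint8_t b)
{
    uint32_t q;

    assert(b != 0);
    q = ((uint32_t)a * 255u) / b;
    return q > 255u ? 255u : (uint8_t)q;
}

/* Rounding form: adding b / 2 to the numerator gives floor (a / b * 255 + 0.5). */
uint8_t div_un8_rounding(uint8_t a, uint8_t b)
{
    uint32_t q;

    assert(b != 0);
    q = ((uint32_t)a * 255u + b / 2u) / b;
    return q > 255u ? 255u : (uint8_t)q;
}
```

For a = 2 and b = 7 the exact quotient is about 72.86, so the truncating form yields 72 and the rounding form yields 73.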
Søren Sandmann Pedersen
2437ae80e5 Reject trapezoids where top (botttom) is above (below) the edges
When a trapezoid has a top/bottom that is above/below the left/right
edges, degenerate trapezoids become possible. For example the edge
could be very short and close to horizontal. If the bottom edge is far
below the bottom point of such a short edge, the result is that the
lower right corner of the trapezoid will be extremely far to the left.

This kind of trapezoid causes overflows in the rasterization code, so
change pixman_trapezoid_valid() to reject them.
2012-01-09 05:40:34 -05:00
Søren Sandmann Pedersen
6a8192b6dd In MUL_UNc() cast to comp2_t
Otherwise, when comp1_t is 16 bits wide, we can end up with a signed
integer overflow.
2012-01-09 05:40:33 -05:00
Søren Sandmann Pedersen
33ac0a9084 Fix a bunch of signed overflow issues
In pixman-fast-path.c: (1 << 31) - 1 causes a signed overflow, so
change to (1U << n) - 1.

In pixman-image.c: The check for whether m10 == -m01 will overflow
when -m01 == INT_MIN. Instead just check whether the variables are 1
and -1.

In pixman-utils.c: When the depth of the topmost channel is 0, we can
end up shifting by 32.

In blitters-test.c: Replicating the mask would end up shifting more
than 32.

In region-contains-test.c: Computing the average of two large integers
could overflow. Instead add half the difference between them to the
first integer.

In stress-test.c: Masking the value in fake_reader() would sometimes
shift by 32. Instead just use the most significant bits instead of
the least significant.

All these issues were found by the IOC tool:

    http://embed.cs.utah.edu/ioc/
2012-01-09 05:40:33 -05:00
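Two of the fixes listed in the commit above follow well-known patterns, sketched here with hypothetical helper names (low_bits_mask and midpoint are not pixman functions): building a bit mask from an unsigned constant, and averaging two integers by adding half their difference to the first.

```c
#include <assert.h>
#include <stdint.h>

/* (1 << 31) - 1 overflows a signed int; (1U << n) - 1 does not.
 * A shift by the full width is undefined, so n == 32 is handled separately. */
uint32_t low_bits_mask(unsigned n)
{
    assert(n <= 32);
    return n == 32 ? 0xffffffffu : (1u << n) - 1u;
}

/* (lo + hi) / 2 can overflow; lo + (hi - lo) / 2 cannot. The difference is
 * taken in unsigned arithmetic so that it is also safe when lo is negative. */
int32_t midpoint(int32_t lo, int32_t hi)
{
    assert(lo <= hi);
    return lo + (int32_t)(((uint32_t)hi - (uint32_t)lo) / 2u);
}
```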
Søren Sandmann Pedersen
d788f76278 Add missing cast in _pixman_edge_multi_init()
nx and e->dy are both 32 bit quantities, so a cast is needed to make
sure their product is 64 bit before subtracting it from a 64 bit
quantity.
2012-01-09 05:40:33 -05:00
Søren Sandmann Pedersen
4f3fe9c909 Fix some signed overflow bugs
In the macros for the PDF blend modes, two comp1_t variables are
multiplied together and then used as if the result were a
comp4_t. When comp1_t is a uint8_t, this is fine because they are
promoted to int, and the product of two uint8_ts fits in an
int. However, when comp1_t is uint16, the product does not necessarily
fit in an int, so casts are necessary.

Fix for bug 43906, reported by Siarhei Siamashka.
2012-01-09 05:40:33 -05:00
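The integer-promotion problem described in the commit above can be reproduced in isolation. This is a minimal sketch with hypothetical names, assuming a platform where int is 32 bits wide; it does not reproduce the PDF blend mode macros themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Two uint16_t operands are promoted to (signed) int before multiplying, and
 * 65535 * 65535 does not fit in a 32-bit int. Casting one operand to a wider
 * unsigned type first makes the multiplication well defined. */
uint32_t mul_channels16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}

/* Small self-check: the largest possible product is 0xfffe0001. */
void mul_channels16_demo(void)
{
    assert(mul_channels16(0xffff, 0xffff) == 0xfffe0001u);
}
```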
Søren Sandmann Pedersen
3e93bba3b0 pixman-image.c: Fix typo in pixman_image_set_transform()
A parenthesis was misplaced so that the size argument to memcmp() was
always 0. The bug is harmless except that the flags might be
unnecessarily recomputed in some cases.

A bug reporting this in Mozilla's fork was discovered here:

    https://bugzilla.mozilla.org/show_bug.cgi?id=710992
2012-01-09 05:40:33 -05:00
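The kind of typo described in the commit above can be sketched as follows. The exact original expression is not shown in the log, so the shape used here is an assumption, and matrix3_t and both function names are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int m[3][3]; } matrix3_t;   /* hypothetical stand-in type */

/* Assumed buggy shape: memcmp (a, b, sizeof (matrix3_t) == 0).
 * The comparison sits inside the size argument, so the size is always 0,
 * memcmp() always returns 0, and the "unchanged" test never succeeds. */
int transform_unchanged_buggy(const matrix3_t *a, const matrix3_t *b)
{
    size_t n = (sizeof(matrix3_t) == 0);     /* evaluates to 0 */

    assert(a != NULL && b != NULL);
    return memcmp(a, b, n);
}

/* Fixed shape: compare the whole struct, then test the result against 0. */
int transform_unchanged_fixed(const matrix3_t *a, const matrix3_t *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof(matrix3_t)) == 0;
}
```

With the buggy form an identical transform is never recognized as unchanged, so the only cost is redundant work, consistent with the commit calling the bug harmless.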
Colin Walters
ae651e7e73 autogen.sh: Support GNOME Build API
http://people.gnome.org/~walters/docs/build-api.txt
2012-01-05 10:14:52 -05:00
Søren Sandmann Pedersen
89498a1178 gradient-walker: For NONE repeats, when x < 0 or x > 1, set both colors to 0
ec7c9c2b68 introduced a bug where NONE gradients would be
misrendered, causing the area outside the gradient to be treated as a
(very) long fade to transparent. The problem was that a check for
positions outside the gradients was dropped in favor of relying on
the sentinels.

Aside from misrendering, this also caused a signed integer overflow
when the code would compute a stepper size based on MIN_INT32.

This patch fixes the issue by reinstating a check for these cases
and setting both the right and left colors to transparent black.
2012-01-03 11:37:12 -05:00
Søren Sandmann Pedersen
d0091a33fc Modify gradient-test to show a bug in NONE processing
This patch modifies demos/gradient-test to display a bug in gradients
with a repeat mode of NONE. With the current gradient code, the left
side will be a solid red (actually an extremely long fade from solid
red to transparent) instead of a sharp transition from red to green.
2012-01-03 11:36:31 -05:00
Søren Sandmann Pedersen
9db9805515 region: Add pixman_region{,32}_clear() functions.
These functions simply reset the region to empty. They are equivalent
to

      pixman_region_fini (&region);
      pixman_region_init (&region);
2011-12-13 14:50:40 -05:00
Bobby Salazar
6b9d6a91ed Android Runtime Detection Support For ARM NEON
This patch adds runtime detection support for the ARM NEON fast paths
for code compiled with the Android NDK. This is the only code change
needed to enable the ARM NEON pixman fast paths for the ever growing
Android platform (200 million+ smartphones, tablets, etc.). Just make
sure to #define USE_ARM_NEON in your makefile.
2011-12-13 02:03:16 -05:00
Naohiro Aota
84450c411c Don't use non-POSIX test
test "$test_CFLAGS" == "" &&         \

may cause an error on some POSIX shells and uses a style which is not
consistent with the other tests in configure.ac

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=42588 and
https://bugs.gentoo.org/show_bug.cgi?id=387087
2011-11-24 14:23:29 +01:00
Andrea Canciani
9985febd78 test: Produce autotools-looking report in the win32 build system
Tweak the commands used to run the tests on win32 to make the output
look mostly like that produced by the autotools test system.

In addition to this, make sure that the exit status of the test target
is success (0) if and only if no failure occurred.
2011-11-09 09:43:06 +01:00
Andrea Canciani
b31da39f6f demos: Consistently use G_N_ELEMENTS()
Instead of open-coding G_N_ELEMENTS(), just use it.
2011-11-09 09:17:00 +01:00
Andrea Canciani
1662c94348 test: Reuse the ARRAY_LENGTH() macro
It is provided by utils.h, there is no need to redefine it.
2011-11-09 09:17:00 +01:00
Andrea Canciani
97b9fa090c Use the ARRAY_LENGTH() macro when possible
This patch has been generated by the following Coccinelle semantic patch:

// Use the ARRAY_LENGTH() macro when possible
//
// Replace open-coded array length computations with the
// ARRAY_LENGTH() macro

@@
type T;
T[] E;
@@
- (sizeof(E)/sizeof(T))
+ ARRAY_LENGTH (E)
2011-11-09 09:17:00 +01:00
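For readers unfamiliar with the idiom the semantic patch above targets, here is a minimal sketch. The macro body is a common definition and is assumed here; the actual definition in utils.h is not shown in this log. sum_table is a hypothetical example function.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition: number of elements in a true array (not a pointer). */
#define ARRAY_LENGTH(A) (sizeof (A) / sizeof ((A)[0]))

int sum_table(void)
{
    static const int table[] = { 3, 5, 7, 11 };
    int sum = 0;

    /* Open-coded equivalent: sizeof (table) / sizeof (int) */
    assert(ARRAY_LENGTH(table) == 4);

    for (size_t i = 0; i < ARRAY_LENGTH(table); i++)
        sum += table[i];

    return sum;
}
```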
Andrea Canciani
06760f5cb0 test: Cleanup includes
All the tests are linked to libutil, hence it makes sense to always
include utils.h and reuse what it provides (config.h inclusion, access
to private pixman APIs, ARRAY_LENGTH, ...).
2011-11-09 09:17:00 +01:00
Andrea Canciani
cbd88a9416 Remove useless checks for NULL before freeing
This patch has been generated by the following Coccinelle semantic patch:

// Remove useless checks for NULL before freeing
//
// free (NULL) is a no-op, so there is no need to avoid it

@@
expression E;
@@
+ free (E);
+ E = NULL;
- if (unlikely (E != NULL)) {
-   free(E);
(
-   E = NULL;
|
-   E = 0;
)
   ...
- }

@@
expression E;
@@
+ free (E);
- if (unlikely (E != NULL)) {
-   free (E);
- }
2011-11-09 09:17:00 +01:00
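The result of the transformation above looks roughly like this in ordinary code. buffer_t and buffer_release are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { int *data; size_t len; } buffer_t;   /* hypothetical type */

/* free (NULL) is a no-op, so no NULL check is needed before the call. */
void buffer_release(buffer_t *b)
{
    assert(b != NULL);

    free(b->data);
    b->data = NULL;
    b->len = 0;
}
```

Because the pointer is reset to NULL, calling buffer_release() twice on the same buffer is safe.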
Cyril Brulebois
70dac03d59 Upload to unstable. 2011-11-07 18:13:55 +01:00
Cyril Brulebois
9c9bf5de9c Bump changelogs. 2011-11-07 18:13:36 +01:00
Cyril Brulebois
1e5a59c905 Merge branch 'upstream-unstable' into debian-unstable 2011-11-07 18:12:45 +01:00
Søren Sandmann Pedersen
8d72d35b29 Post-release version bump to 0.25.1 2011-11-06 16:36:01 -05:00
Søren Sandmann Pedersen
973dc7d319 Pre-release version bump to 0.24.0 2011-11-06 16:10:33 -05:00
Alan Coopersmith
6bf590f385 Change MMX ldq_u to return _m64 instead of forcing all callers to cast
Sun/Oracle Studio compilers allow the pointers to be cast, but not the
non-pointer forms, causing pixman compiles to fail with many errors of:
"pixman-mmx.c", line 1411: invalid cast expression

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2011-11-04 13:41:30 -07:00
Jeff Muizelaar
5d7f5bc8ee Add definitions of INT64_MIN and INT64_MAX 2011-11-02 18:49:58 -04:00
Cyril Brulebois
afde156de5 Document what happened: pixman went to sid… 2011-11-01 16:06:39 +01:00
Cyril Brulebois
39102f8b3e Upload to experimental. 2011-11-01 12:29:25 +01:00
Cyril Brulebois
bfad5455b6 Bump changelogs. 2011-11-01 12:28:58 +01:00
Cyril Brulebois
eae1bc3667 Merge branch 'upstream-experimental' into debian-experimental 2011-11-01 12:28:26 +01:00
Søren Sandmann Pedersen
697cfe1537 Post-release version bump to 0.23.9 2011-10-29 05:51:54 -04:00
Søren Sandmann Pedersen
a0f1b56581 Pre-release version bump to 0.23.8 2011-10-29 05:33:44 -04:00
Søren Sandmann Pedersen
498138c293 Fix use of uninitialized fields reported by valgrind
In pixman-noop.c and pixman-sse2.c, we are accessing
image->bits.width/height without first making sure the image is a bits
image. The warning is harmless because we never act on this
information without checking that the image is a8r8g8b8, but valgrind
does warn about it.

In pixman-noop.c, just reorder the clauses in the if statement; in
pixman-sse2.c require images to have the FAST_PATH_BITS_IMAGE flag
set.
2011-10-25 12:00:19 -04:00
Julien Cristau
40a04cb1b6 Upload to experimental 2011-10-22 11:09:17 +02:00
Søren Sandmann Pedersen
6131707e8f Merge branch 'gradients' 2011-10-20 09:13:12 -04:00
Rico Tzschichholz
bdfdaaff5d Bump changelogs. 2011-10-19 17:44:08 +02:00
Rico Tzschichholz
bccb9afc56 Merge branch 'upstream-experimental' into debian-experimental 2011-10-19 17:24:45 +02:00
Taekyun Kim
3d4d705d2f ARM: NEON: Fix assembly typo error in src_n_8_8888
Binutils 2.21 does not complain about a missing comma between the ARM
register and the alignment specifier in vld/vst instructions, but the
omission causes a build error on binutils 2.20.
2011-10-18 21:50:18 +09:00
Taekyun Kim
19f118f41f ARM: NEON: Standard fast path src_n_8_8
Performance numbers of before/after on cortex-a8 @ 1GHz

- before
L1:  28.05  L2:  28.26  M: 26.97 (  4.48%)  HT: 19.79  VT: 19.14  R: 17.61  RT:  9.88 ( 101Kops/s)

- after
L1:1430.28  L2:1252.10  M:421.93 ( 75.48%)  HT:170.16  VT:138.03  R:145.86  RT: 35.51 ( 255Kops/s)
2011-10-18 13:16:50 +09:00
Taekyun Kim
4db9e2bc13 ARM: NEON: Standard fast path src_n_8_8888
Performance numbers of before/after on cortex-a8 @ 1GHz

- before
L1:  32.39  L2:  31.79  M: 30.84 ( 13.77%)  HT: 21.58  VT: 19.75  R: 18.83  RT: 10.46 ( 106Kops/s)

- after
L1: 516.25  L2: 372.00  M:193.49 ( 85.59%)  HT:136.93  VT:109.10  R:104.48  RT: 34.77 ( 253Kops/s)
2011-10-18 13:16:48 +09:00
Taekyun Kim
26659de6cd ARM: NEON: Instruction scheduling of bilinear over_8888_8_8888
Instructions are reordered to eliminate pipeline stalls and get
better memory access.

Performance of before/after on cortex-a8 @ 1GHz

<< 2000 x 2000 with scale factor close to 1.x >>
before : 40.53 Mpix/s
after  : 50.76 Mpix/s
2011-10-18 13:16:42 +09:00
Taekyun Kim
4481920f40 ARM: NEON: Instruction scheduling of bilinear over_8888_8888
Instructions are reordered to eliminate pipeline stalls and get
better memory access.

Performance of before/after on cortex-a8 @ 1GHz

<< 2000 x 2000 with scale factor close to 1.x >>
before : 50.43 Mpix/s
after  : 61.09 Mpix/s
2011-10-18 13:14:28 +09:00
Taekyun Kim
1cd916f3a5 ARM: NEON: Replace old bilinear scanline generator with new template
Bilinear scanline functions in pixman-arm-neon-asm-bilinear.S can
be replaced with new template just by wrapping existing macros.
2011-10-18 13:00:10 +09:00
Taekyun Kim
6682b2b359 ARM: NEON: Bilinear macro template for instruction scheduling
This macro template takes 6 code blocks.

1. process_last_pixel
2. process_two_pixels
3. process_four_pixels
4. process_pixblock_head
5. process_pixblock_tail
6. process_pixblock_tail_head

process_last_pixel does not need to update the horizontal weight; this
is done by the template. The two- and four-pixel code blocks should update
the horizontal weight themselves. The head/tail/tail_head blocks
make up the unrolled core loop. You can apply instruction scheduling
to the tail_head blocks.

You can also specify the size of the pixel block. Supported sizes are 4
and 8. If you want to use a mask, pass the BILINEAR_FLAG_USE_MASK flag
to the template; you can then use the MASK register. When using the d8~d15
registers, pass BILINEAR_FLAG_USE_ALL_NEON_REGS to make sure the
registers are properly saved on the stack and later restored.
2011-10-18 13:00:06 +09:00
Taekyun Kim
b5e4355fa4 ARM: NEON: Some cleanup of bilinear scanline functions
Use STRIDE, and do the initial horizontal weight update before
entering the interpolation loop. Add cache preload for mask and dst.
2011-10-18 13:00:02 +09:00
Søren Sandmann Pedersen
ec7c9c2b68 Simplify gradient_walker_reset()
The code that searches for the closest color stop to the given
position is duplicated across the various repeat modes. Replace the
switch with two if/else constructions, and put the search code between
them.
2011-10-15 10:50:20 -04:00
Søren Sandmann Pedersen
2d0da8ab8d Use sentinels instead of special casing first and last stops
When storing the gradient stops internally, allocate two more stops,
one before the beginning of the stop list and one after the
end. Initialize those stops based on the repeat property of the
gradient.

This allows gradient_walker_reset() to be simplified because it can
now simply pick the two closest stops to the position without special
casing the first and last stops.
2011-10-15 10:50:20 -04:00
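The sentinel idea from the commit above can be sketched as a search that needs no special cases for the first and last stops. This is an illustration only: stop_t and right_stop_index are hypothetical, positions are doubles rather than pixman's fixed-point types, and the sentinels are assumed to sit at -DBL_MAX and DBL_MAX.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

typedef struct { double pos; uint32_t color; } stop_t;   /* hypothetical */

/* stops[0] and stops[n - 1] are sentinels; stops[1 .. n - 2] are the real
 * stops, sorted by position. Returns the index of the first stop to the
 * right of x; the stop to the left is always at the returned index minus 1. */
int right_stop_index(const stop_t *stops, int n, double x)
{
    int i = 1;

    assert(n >= 2);
    assert(stops[0].pos == -DBL_MAX && stops[n - 1].pos == DBL_MAX);
    assert(x > -DBL_MAX && x < DBL_MAX);

    /* The right sentinel guarantees that the loop terminates. */
    while (stops[i].pos <= x)
        i++;

    return i;
}
```

How the sentinel colors are initialized depends on the repeat mode, as the commit message says; the search itself stays the same.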
Søren Sandmann Pedersen
84d6ca7c89 gradient walker: Correct types and fix formatting
The type of pos in gradient_walker_reset() and gradient_walker_pixel()
is pixman_fixed_48_16_t and not pixman_fixed_32_32. The types of the
positions in the walker struct are pixman_fixed_t and not int32_t, and
need_reset is a boolean, not an integer. The spread field should be
called repeat and have the type pixman_repeat_t.

Also fix some formatting issues, make gradient_walker_reset() static,
and delete the pointless PIXMAN_GRADIENT_WALKER_NEED_RESET() macro.
2011-10-15 10:50:14 -04:00
Søren Sandmann Pedersen
ace225b53d Add stable release / development snapshot to draft release notes
This will hopefully serve as a reminder to me that I should put this
information in the release notes.
2011-10-11 16:12:32 -04:00
Søren Sandmann Pedersen
bb7142d361 Post-release version bump to 0.23.7 2011-10-11 06:10:39 -04:00
Søren Sandmann Pedersen
e20ac40bd3 Pre-release version bump to 0.23.6 2011-10-11 06:00:51 -04:00
Taekyun Kim
a43946a51f Simple repeat: Extend too short source scanlines into temporary buffer
Scanlines that are too short cause repeat-handling overhead, and since
optimized pixman composite functions usually process a bunch of pixels in a
single loop iteration, it might be beneficial to pre-extend source
scanlines. The temporary buffers will usually reside in cache, so
accessing them should be quite efficient.
2011-10-10 12:18:28 +09:00
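Pre-extending a short source scanline into a temporary buffer, as described in the commit above, amounts to tiling it. A minimal sketch with a hypothetical function name follows; the real code operates on pixman images and picks the buffer size itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill tmp[0 .. tmp_w) by repeating src[0 .. src_w) end to end. */
void extend_scanline(uint32_t *tmp, size_t tmp_w,
                     const uint32_t *src, size_t src_w)
{
    assert(tmp != NULL && src != NULL && src_w > 0);

    for (size_t i = 0; i < tmp_w; i++)
        tmp[i] = src[i % src_w];
}
```

A composite function can then read a long run from tmp instead of wrapping around a 3-pixel source on almost every iteration.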
Taekyun Kim
eaff774a3f Simple repeat fast path
We can implement simple repeat by stitching existing fast path
functions. First look up the COVER_CLIP function for the given input,
then stitch horizontally using that function.
2011-10-10 12:18:25 +09:00
Taekyun Kim
a258e33fcb Move _pixman_lookup_composite_function() to pixman-utils.c 2011-10-10 12:18:23 +09:00
Søren Sandmann Pedersen
fc62785aab Add src, mask, and dest flags to the composite args struct.
These flags are useful in the various compositing routines, and the
flags stored in the image structs are missing some bits of information
that can only be computed when pixman_image_composite() is called.
2011-10-10 12:18:21 +09:00
Taekyun Kim
fa6523d13a Add new fast path flag FAST_PATH_BITS_IMAGE
This fast path flag indicates that the image is a bits image.
2011-10-10 12:18:18 +09:00
Taekyun Kim
7272e2fcd2 init/fini functions for pixman_image_t
pixman_image_t itself can be on stack or heap. So segregating
init/fini from create/unref can be useful when we want to use
pixman_image_t on stack or other memory.
2011-10-10 12:18:14 +09:00
Taekyun Kim
4dcf1b0107 sse2: Bilinear scaled over_8888_8_8888 2011-10-10 12:13:20 +09:00
Taekyun Kim
81050f2784 sse2: Bilinear scaled over_8888_8888 2011-10-10 12:13:17 +09:00
Taekyun Kim
d67c0b883d sse2: Macros for assembling bilinear interpolation code fractions
Primitive bilinear interpolation code is reusable to implement other
bilinear functions.

BILINEAR_DECLARE_VARIABLES
- Declare variables needed to interpolate src pixels.

BILINEAR_INTERPOLATE_ONE_PIXEL
- Interpolate one pixel and advance to next pixel

BILINEAR_SKIP_ONE_PIXEL
- Skip interpolation and just advance to next pixel
  This is useful for skipping zero mask
2011-10-10 12:12:47 +09:00
Matt Turner
741eb8462c Correct the minimum gcc version needed for iwmmxt
Spotted by Søren Sandmann.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-10-06 17:56:09 -04:00
Matt Turner
0a34277180 Make sure iwMMXt is only detected on ARM
iwMMXt is incorrectly detected on x86 and amd64. This happens because
the test uses standard _mm_* intrinsic functions which it compiles with
-march=iwmmxt, but when the user has set CFLAGS=-march=k8 for instance,
no error is generated from -march=iwmmxt, even though it's not a valid
flag on x86/amd64. Passing CFLAGS=-march=native does not override the
-march=iwmmxt flag though, which is why it wasn't noticed before.

So, just #error out in the test if the __arm__ preprocessor macro
isn't defined.

Fixes https://bugs.gentoo.org/show_bug.cgi?id=385179

Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-10-06 17:52:12 -04:00
Søren Sandmann Pedersen
879b7c21e4 Don't include stdint.h in scaling-helpers-test.
Fixes bug 41257.
2011-09-28 09:16:23 -04:00
Benjamin Otte
01c2dcbe69 build: replace @VAR@ with $(VAR) in makefiles 2011-09-28 01:48:02 +02:00
Benjamin Otte
100f16eae9 tests: Add PNG_CFLAGS/LIBS to tests
PNG flags were accidentally included by gdk-pixbuf. This has been fixed
recently, so we need to make sure to include it ourselves.
2011-09-28 01:48:01 +02:00
Matt Turner
d1313febbe mmx: optimize unaligned 64-bit ARM/iwmmxt loads
Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-09-27 13:13:22 -04:00
Matt Turner
7ab94c5f99 mmx: compile on ARM for iwmmxt optimizations
Check in configure for at least gcc-4.6, since gcc-4.7 (and hopefully
4.6) will be the earliest version capable of compiling the _mm_*
intrinsics on ARM/iwmmxt. Even for suitable compiler versions, I use
_mm_srli_si64, which is known to cause unpatched compilers to fail.

Select iwmmxt at runtime only after NEON, since we expect the NEON
optimizations to be more capable and faster than iwmmxt.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-09-27 13:13:15 -04:00
Matt Turner
f66887d9ea mmx: prepare pixman-mmx.c to be compiled for ARM/iwmmxt
Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-09-27 13:13:07 -04:00
Matt Turner
7c6d5d1999 mmx: fix unaligned accesses
Simply return *p in the unaligned access functions, since alignment
constraints are very relaxed on x86 and this allows us to generate
identical code as before.

Tested with the test suite, lowlevel-blit-test, and cairo-perf-trace on
ARM and Alpha with no unaligned accesses found.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-09-27 13:13:01 -04:00
Matt Turner
5d98abb14c mmx: wrap x86/MMX inline assembly in ifdef USE_X86_MMX
Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-09-27 13:12:55 -04:00
Matt Turner
02c1f1a022 mmx: rename USE_MMX to USE_X86_MMX
This will make upcoming ARM usage of pixman-mmx.c unambiguous.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-09-27 13:12:50 -04:00
Matt Turner
57fd8c37aa mmx: convert while (w) to if (w) when possible
gcc isn't able to see that w is no greater than 1, so it generates
unnecessary loop instructions with while (w).

Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-09-26 11:30:05 -04:00
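To illustrate the while (w) to if (w) change described above, here is a minimal sketch. It is not code from pixman-mmx.c: fill_pairs() is a hypothetical function that handles two pixels per iteration, so at most one pixel can remain afterwards and a plain if is enough for the tail.

```c
#include <stdint.h>

/* Hypothetical example, not taken from pixman-mmx.c. */
static void
fill_pairs (uint32_t *dst, int w, uint32_t value)
{
    /* Main loop: two pixels per iteration */
    while (w >= 2)
    {
        dst[0] = value;
        dst[1] = value;
        dst += 2;
        w -= 2;
    }

    /* Here w is 0 or 1, so "if" does the same work as "while"
     * without making the compiler emit loop instructions. */
    if (w)
        *dst = value;
}
```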
Matt Turner
38a7aae1d9 mmx: fix formats in commented code
b8r8g8 apparently stopped being supported at some point after this code
was commented out.

Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-09-26 11:29:58 -04:00
Matt Turner
b6b77488a0 lowlevel-blt: add over_x888_8_8888
Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-09-26 11:29:51 -04:00
Siarhei Siamashka
9126f36b96 BILINEAR->NEAREST filter optimization for simple rotation and translation
Simple rotation and translation are the additional cases when BILINEAR
filter can be safely reduced to NEAREST.
2011-09-21 18:55:25 -04:00
Søren Sandmann Pedersen
ad5c6bbb36 Strength-reduce BILINEAR filter to NEAREST filter for identity transforms
An image with a bilinear filter and an identity transform is
equivalent to one with a nearest filter, so there is no reason the
standard fast paths shouldn't be usable.

But because a BILINEAR filter samples a 2x2 pixel block in the source
image, FAST_PATH_SAMPLES_COVER_CLIP can't be set in the case where the
source area is the entire image, because some compositing operations
might then read pixels outside the image.

This patch fixes the problem by splitting the
FAST_PATH_SAMPLES_COVER_CLIP flag into two separate flags
FAST_PATH_SAMPLES_COVER_CLIP_NEAREST and
FAST_PATH_SAMPLES_COVER_CLIP_BILINEAR that indicate that the clip
covers the samples taking into account NEAREST/BILINEAR filters
respectively.

All the existing compositing operations that require
FAST_PATH_SAMPLES_COVER_CLIP then have their flags modified to pick
either COVER_CLIP_NEAREST or COVER_CLIP_BILINEAR depending on which
filter they depend on.

In compute_image_info() both COVER_CLIP_NEAREST and
COVER_CLIP_BILINEAR can be set depending on how much room there is
around the clip rectangle.

Finally, images with an identity transform and a bilinear filter get
FAST_PATH_NEAREST_FILTER set as well as FAST_PATH_BILINEAR_FILTER.

Performance measurements with render_bench against Xephyr:

Before:

*** ROUND 1 ***
---------------------------------------------------------------
Test: Test Xrender doing non-scaled Over blends
Time: 5.720 sec.
---------------------------------------------------------------
Test: Test Xrender (offscreen) doing non-scaled Over blends
Time: 5.149 sec.
---------------------------------------------------------------
Test: Test Imlib2 doing non-scaled Over blends
Time: 6.237 sec.

After:

*** ROUND 1 ***
---------------------------------------------------------------
Test: Test Xrender doing non-scaled Over blends
Time: 4.947 sec.
---------------------------------------------------------------
Test: Test Xrender (offscreen) doing non-scaled Over blends
Time: 4.487 sec.
---------------------------------------------------------------
Test: Test Imlib2 doing non-scaled Over blends
Time: 6.235 sec.
2011-09-21 18:55:25 -04:00
Søren Sandmann Pedersen
eb2e7ed81b test: Occasionally use a BILINEAR filter in blitters-test
To test that reductions of BILINEAR->NEAREST for identity
transformations happen correctly, occasionally use a bilinear filter
in blitters test.
2011-09-21 18:55:25 -04:00
Siarhei Siamashka
2a9f88430e test: better coverage for BILINEAR->NEAREST filter optimization
The upcoming optimization which is going to be able to replace BILINEAR filter
with NEAREST where appropriate needs to analyze the transformation matrix
and not to make any mistakes.

The changes to affine-test include:
1. Higher chance of using the same scale factor for x and y axes. This can help
   to stress some special cases (for example the case when both x and y scale
   factors are integer). The same applies to x/y translation.
2. Introduced a small chance for "corrupting" transformation matrix by flipping
   random bits. This supposedly can help to identify the cases when some of the
   fast paths or other code logic is wrongly activated due to insufficient checks.
2011-09-21 18:55:10 -04:00
Søren Sandmann Pedersen
054922e2fc Eliminate compute_sample_extents() function
In analyze_extents(), instead of calling compute_sample_extents() call
compute_transformed_extents() and inline the remaining part of
compute_sample_extents(). The upcoming bilinear->nearest optimization
will do something different with these two pieces of code.
2011-09-21 18:53:03 -04:00
Søren Sandmann Pedersen
577b6c46fd Split computation of sample area into own function
compute_sample_extents() has two parts: one that computes the
transformed extents, and one that checks whether the computed extents
fit within the 16.16 coordinate space.

Split the first part into its own function
compute_transformed_extents().
2011-09-21 18:52:18 -04:00
Søren Sandmann Pedersen
5064f18031 Remove x and y coordinates from analyze_extents() and compute_sample_extents()
These coordinates were only ever used for subtracting from the extents
box to put it into the coordinate space of the image, so we might as
well do this coordinate translation only once before entering the
functions.
2011-09-21 18:48:55 -04:00
Søren Sandmann Pedersen
dbcb4af60d Use MAKE_ACCESSORS() to generate accessors for paletted formats
Add support in convert_pixel_from_a8r8g8b8() and
convert_pixel_to_a8r8g8b8() for conversion to/from paletted formats,
then use MAKE_ACCESSORS() to generate accessors for the indexed
formats: c8, g8, g4, c4, g1
2011-09-20 06:44:05 -04:00
Søren Sandmann Pedersen
c82c2c3853 Use MAKE_ACCESSORS() to generate accessors for the a1 format.
Add FETCH_1 and STORE_1 macros and use them to add support for 1bpp
pixels to fetch_and_convert_pixel() and convert_and_store_pixel(),
then use MAKE_ACCESSORS() to generate the accessors for the a1
format. (Not the g1 format as it is indexed).
2011-09-20 06:44:05 -04:00
Søren Sandmann Pedersen
2114dd8aa1 Use MAKE_ACCESSORS() to generate accessors for 24bpp formats
Add FETCH_24 and STORE_24 macros and use them to add support for 24bpp
pixels in fetch_and_convert_pixel() and
convert_and_store_pixel(). Then use MAKE_ACCESSORS() to generate
accessors for the 24 bpp formats:

    r8g8b8
    b8g8r8
2011-09-20 06:44:05 -04:00
Søren Sandmann Pedersen
f19f5daa1b Use MAKE_ACCESSORS() to generate accessors for 4 bpp RGB formats
Use FETCH_4 and STORE_4 macros to add support for 4bpp pixels to
fetch_and_convert_pixel() and convert_and_store_pixel(), then use
MAKE_ACCESSORS() to generate accessors for 4 bpp formats, except g4 and
c4 which are indexed:

    a4
    r1g2b1
    b1g2r1
    a1r1g1b1
    a1b1g1r1
2011-09-20 06:44:04 -04:00
Søren Sandmann Pedersen
af78fe24e4 Use MAKE_ACCESSORS() to generate accessors for 8bpp RGB formats
Add support for 8 bpp formats to fetch_and_convert_pixel() and
convert_and_store_pixel(), then use MAKE_ACCESSORS() to generate the
accessors for all the 8 bpp formats, except g8 and c8, which are
indexed:

    a8
    r3g3b2
    b2g3r3
    a2r2g2b2
    a2b2g2r2
    x4a4
2011-09-20 06:44:04 -04:00
Søren Sandmann Pedersen
5e1b9f8975 Use MAKE_ACCESSORS() to generate accessors for all the 16bpp formats
Add support for 16bpp pixels to fetch_and_convert_pixel() and
convert_and_store_pixel(), then use MAKE_ACCESSORS() to generate
accessors for all the 16bpp formats:

    r5g6b5
    b5g6r5
    a1r5g5b5
    x1r5g5b5
    a1b5g5r5
    x1b5g5r5
    a4r4g4b4
    x4r4g4b4
    a4b4g4r4
    x4b4g4r4
2011-09-20 06:44:04 -04:00
Søren Sandmann Pedersen
a77597bcb8 Use MAKE_ACCESSORS() to generate all the 32 bit accessors
Add support for 32bpp formats in fetch_and_convert_pixel() and
convert_and_store_pixel(), then use MAKE_ACCESSORS() to generate
accessors for all the 32 bpp formats:

    a8r8g8b8
    x8r8g8b8
    a8b8g8r8
    x8b8g8r8
    x14r6g6b6
    b8g8r8a8
    b8g8r8x8
    r8g8b8x8
    r8g8b8a8
2011-09-20 06:44:04 -04:00
Søren Sandmann Pedersen
814af33df3 Add initial version of the MAKE_ACCESSORS() macro
This macro will eventually allow the fetchers and storers to be
generated automatically. For now, it's just a skeleton that doesn't
actually do anything.
2011-09-20 06:44:04 -04:00
Søren Sandmann Pedersen
5cae7a3fe6 Add general pixel converter
This function can convert between any <= 32 bpp formats. Nothing uses
it yet.
2011-09-20 06:44:04 -04:00
Søren Sandmann Pedersen
22f54dde6b Add a generic unorm_to_unorm() conversion utility
This function can convert between normalized numbers of different
depths. When converting to higher bit depths, it will replicate the
existing bits, when converting to lower bit depths, it will simply
truncate.

This function replaces the expand16() function in pixman-utils.c
2011-09-20 06:44:04 -04:00
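The conversion rule described above (replicate the existing bits when widening, truncate when narrowing) can be sketched as follows. This is an illustration under assumptions, not the function from pixman: the name unorm_convert() and its exact signature are hypothetical.

```c
#include <stdint.h>

/* Hypothetical sketch of a normalized-value depth conversion.
 * Both depths are assumed to be in the range 0..32. */
static uint32_t
unorm_convert (uint32_t val, int from_bits, int to_bits)
{
    uint32_t result;
    int have;

    if (from_bits == 0)
        return 0;

    /* Discard any stray bits above the source depth */
    if (from_bits < 32)
        val &= (1u << from_bits) - 1;

    /* Narrowing: simply truncate the low bits */
    if (to_bits <= from_bits)
        return val >> (from_bits - to_bits);

    /* Widening: append copies of the source bits, most significant
     * bits first, until the target depth is reached */
    result = val;
    have = from_bits;
    while (have < to_bits)
    {
        int take = to_bits - have;

        if (take > from_bits)
            take = from_bits;

        result = (result << take) | (val >> (from_bits - take));
        have += take;
    }

    return result;
}
```

With this rule the maximum value of one depth maps to the maximum value of the other, for example 0x1f at 5 bits becomes 0xff at 8 bits.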
Søren Sandmann Pedersen
d842669a46 A few tweaks to a comment in pixman-combine.c.template
Include a link to

	http://marc.info/?l=xfree-render&m=99792000027857&w=2

where Keith explains how the disjoint/conjoint operators work.
2011-09-19 09:08:33 -04:00
Jon TURNEY
3432e1a344 Fix build on cygwin after commit efdf65c0c4
libutils depends on pixman and so needs to precede it in the link order

Found by tinderbox, see [1]

[1] http://tinderbox.freedesktop.org/builds/2011-09-15-0005/logs/pixman/#build

Signed-off-by: Jon TURNEY <jon.turney at dronecode.org.uk>
2011-09-19 06:17:58 -04:00
Søren Sandmann Pedersen
f9faf4df44 test: Use smaller boxes in region_contains_test()
The boxes used in region_contains_test() sometimes overflow, causing

    *** BUG ***
    In pixman_region32_union_rect: Invalid rectangle passed
    Set a breakpoint on '_pixman_log_error' to debug

messages to be printed when pixman is compiled with DEBUG. Fix this by
dividing the x, y, w, h coordinates by 4 to prevent overflows.
2011-09-19 06:15:14 -04:00
Andrea Canciani
9623b478f7 build-win32: Add 'check' target
On win32 the tests are built but they are not run automatically by the
build system.

A minimal 'check' target (depending on the tests being built) can
simply run them and log to the console their success/failure.
2011-09-14 07:03:35 -07:00
Andrea Canciani
479d094485 test: Do not include config.h unless HAVE_CONFIG_H is defined
The win32 build system does not generate config.h and correctly runs
the compiler without defining HAVE_CONFIG_H. Nevertheless some files
include config.h without checking for its availability, breaking the
build from a clean directory:

test\utils.h(2) : fatal error C1083: Cannot open include file:
'config.h': No such file or directory
...
2011-09-14 07:03:35 -07:00
Andrea Canciani
d46a9f3ace build-win32: Add root Makefile.win32
Add Makefile.win32 to the pixman root. This makefile can recursively
run the other ones to compile the library or the test suite.
2011-09-14 07:03:35 -07:00
Andrea Canciani
a76b78c2da build-win32: Share targets and variables across win32 makefiles
The win32 build system repeatedly defines some basic variables
(notably program names and flags) and C sources compilation rules.

They can be factored out to a common Makefile, to be included in every
other Makefile.win32.
2011-09-14 07:03:35 -07:00
Andrea Canciani
efdf65c0c4 build: Reuse test sources
Makefile.am and Makefile.win32 should not duplicate content, as this
leads to breaking the build when they are not kept in sync.

This can be avoided by listing sources, headers and common build
variables/rules in a Makefile.sources file.

In order to further simplify the test makefiles, the utility functions
are now in a static library, which gets linked to all the tests and
benchmarks.
2011-09-14 07:03:34 -07:00
Andrea Canciani
a4f95d083b build: Reuse sources and pixman-combine build rules
Makefile.am and Makefile.win32 should not duplicate content, as this
leads to breaking the build when they are not kept in sync.

This can be avoided by listing sources, headers and common build
variables/rules in a Makefile.sources file.
2011-09-14 07:02:59 -07:00
Andrea Canciani
25bd96a3d0 test: Fix compilation on win32
Adding scaling-helpers-test to the testsuite on win32 makes MSVC
complain about int64_t being used as an expression:

scaling-helpers-test.c(27) : error C2275: 'int64_t' : illegal use of
this type as an expression
2011-09-14 07:02:59 -07:00
Søren Sandmann Pedersen
9882d832f6 Use pkg-config to determine the flags to use with libpng
Previously we would unconditionally link with -lpng leading to build
failures on systems without libpng.
2011-09-12 22:39:53 -04:00
Søren Sandmann Pedersen
99a53667da test: New function to save a pixman image to .png
When debugging it is often very useful to be able to save an image as
a png file. This commit adds a function "write_png()" that does that.

If libpng is not available, then the function becomes a noop.
2011-09-10 04:07:50 -04:00
Søren Sandmann Pedersen
1e1ae0bf6e Post-release version bump to 0.23.5 2011-09-09 23:59:20 -04:00
Søren Sandmann Pedersen
f901e3b58b Pre-release version bump to 0.23.4 2011-09-09 23:51:11 -04:00
Chris Wilson
f5da52b677 bits: optimise fetching width==1 repeats
Profiling ign.com, 20% of the entire render time was absorbed in this
single operation:

<< /content //COLOR_ALPHA /width 480 /height 800 >> surface context
<< /width 1 /height 677 /format //ARGB32 /source <|!!!@jGb!m5gD']#$jFHGWtZcK&2i)Up=!TuR9`G<8;ZQp[FQk;emL9ibhbEL&NTh-j63LhHo$E=mSG,0p71`cRJHcget4%<S\X+~> >> image pattern
  //EXTEND_REPEAT set-extend
  set-source
n 0 0 480 677 rectangle
fill+
pop

which is a simple composition of a single pixel wide image. Sadly this
is a workaround for lack of independent repeat-x/y handling in cairo and
pixman. Worse still is that the worst-case behaviour of the general repeat
path is for width 1 images...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-09 23:43:16 -04:00
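A rough sketch of why a width-1 repeating source can be fetched cheaply: every pixel of the output scanline is the same source pixel, so no per-pixel coordinate wrapping is needed. The function below is hypothetical and simplified (32-bit pixels, stride counted in pixels); it is not the code added to the bits-image fetcher.

```c
#include <stdint.h>

/* Hypothetical sketch: fetch one scanline from a repeating image
 * that is exactly one pixel wide. */
static void
fetch_repeat_width1 (const uint32_t *bits, int stride, int height,
                     int y, int width, uint32_t *out)
{
    uint32_t pixel;
    int i;

    /* Wrap y into [0, height); the x coordinate is always 0 */
    y %= height;
    if (y < 0)
        y += height;

    pixel = bits[y * stride];

    /* The whole scanline is a solid run of that one pixel */
    for (i = 0; i < width; i++)
        out[i] = pixel;
}
```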
Taekyun Kim
7ef44cae6b ARM: NEON better instruction scheduling of over_n_8888
New head, tail, tail/head blocks are added and instructions
are reordered to eliminate pipeline stalls

Performance numbers of before/after

- cortex a8 -
before : L1: 375.39  L2: 391.93  M:114.39 ( 40.99%)  HT: 99.37  VT: 98.20  R: 90.24  RT: 32.87 ( 240Kops/s)
after  : L1: 481.90  L2: 483.46  M:114.29 ( 40.69%)  HT:106.91  VT: 93.38  R: 90.74  RT: 29.51 ( 236Kops/s)

- cortex a9 -
before : L1: 324.50  L2: 332.79  M:155.55 ( 47.51%)  HT:111.93  VT: 93.58  R: 71.92  RT: 28.21 ( 233Kops/s)
after  : L1: 355.87  L2: 364.49  M:156.90 ( 47.59%)  HT:111.52  VT: 91.76  R: 72.16  RT: 28.22 ( 234Kops/s)
2011-09-07 11:01:50 +09:00
Taekyun Kim
6aa82b7a72 ARM: NEON better instruction scheduling of over_n_8_8888
tail/head block is expanded and reordered to eliminate stalls

Performance numbers of before/after

- cortex a8 -
before : L1: 201.35  L2: 190.48  M:101.94 ( 54.85%)  HT: 78.41  VT: 63.83  R: 58.25  RT: 21.74 ( 191Kops/s)
after  : L1: 257.65  L2: 255.49  M:102.04 ( 55.33%)  HT: 79.19  VT: 65.46  R: 59.23  RT: 21.12 ( 189Kops/s)

- cortex a9 -
before : L1: 157.35  L2: 159.81  M:133.00 ( 60.94%)  HT: 82.44  VT: 63.64  R: 51.66  RT: 19.15 ( 179Kops/s)
after  : L1: 216.83  L2: 219.40  M:135.83 ( 61.80%)  HT: 85.60  VT: 64.80  R: 52.23  RT: 19.16 ( 179Kops/s)
2011-09-07 11:01:47 +09:00
Andrea Canciani
4ffa077487 Workaround bug in llvm-gcc
llvm-gcc (shipped in Apple XCode 4.1.1 as the default compiler or in
the 2.9 release of LLVM) performs an invalid optimization which
unifies the empty_region and the bad_region structures because they
have the same content.

A bugreport has been filed against Apple Developers Tool for this
issue. This commit works around this bug by making one of the two
structures volatile, so that it cannot be merged.

Fixes region-contains-test.
2011-08-29 07:38:37 +02:00
Andrea Canciani
a1ebff0dcb win32: Build benchmarks
Add the makefile rules needed to compile lowlevel-blt-bench on win32
and fix the compilation errors.
2011-08-29 07:37:46 +02:00
Søren Sandmann Pedersen
2644d5a947 Move bilinear interpolation to pixman-inlines.h 2011-08-19 20:01:40 -04:00
Søren Sandmann Pedersen
12ad42dd32 Use repeat() function from pixman-inlines.h in pixman-bits-image.c
The repeat() functionality was duplicated between pixman-bits-image.c
and pixman-inlines.h
2011-08-19 20:01:40 -04:00
Søren Sandmann Pedersen
2f443466bb Rename pixman-fast-path.h to pixman-inlines.h
It is not really specific to pixman-fast-path.c.
2011-08-19 20:01:36 -04:00
Søren Sandmann Pedersen
e58b208958 In pixman_image_create_bits() allow images larger than 2GB
There is no reason for pixman_image_create_bits() to check that the
image size fits in int32_t. The correct check is against size_t since
that is what the argument to calloc() is.

This patch fixes this by adding a new _pixman_multiply_overflows_size()
and using it in create_bits(). Also prepend an underscore to the names
of other similar functions since they are internal to pixman.

V2: Use int, not ssize_t for the arguments in create_bits() since
width/height are still limited to 32 bits, as pointed out by Chris
Wilson.
2011-08-15 09:37:49 -04:00
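The size_t check described above can be sketched like this. The helper name mul_overflows_size() is hypothetical and the body is only one common way to write such a test; the actual _pixman_multiply_overflows_size() may be written differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: returns nonzero if a * b does not fit in size_t.
 * Dividing instead of multiplying avoids overflowing in the check itself. */
static int
mul_overflows_size (size_t a, size_t b)
{
    return b != 0 && a > SIZE_MAX / b;
}
```

A caller would run this check on height and stride before passing their product to calloc().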
Søren Sandmann Pedersen
bdfb5944ff Don't include stdint.h in lowlevel-blt-bench.c
Some systems don't have the file, and the types are already defined in
pixman.h.

https://bugs.freedesktop.org//show_bug.cgi?id=37422
2011-08-11 03:32:14 -04:00
Søren Sandmann Pedersen
e5d85ce662 Use find_box_for_y() in pixman_region_contains_point() too
The same binary search from the previous commit can be used in this
function too.

V2: Remove check from loop that is not needed anymore, pointed out by
Andrea Canciani.
2011-08-11 03:32:14 -04:00
Søren Sandmann Pedersen
04bd4bdca6 Speed up pixman_region{,32}_contains_rectangle()
When someone selects some text in Firefox under a non-composited X
server and initiates a drag, a shaped window is created with a complex
shape corresponding to the outline of the text. Then, on every mouse
movement pixman_region_contains_rectangle() is called many times on
that complicated region. And pixman_region_contains_rectangle() is
doing a linear scan through the rectangles in the region, although the
scan does exit when it finds the first box that can't possibly
intersect the passed-in rectangle.

This patch changes the loop so that it uses a binary search to skip
boxes that don't overlap the current y position.  The performance
improvement for the text dragging case is easily noticeable.

V2: Use the binary search for the "getting up to speed or skipping
remainder of band" as well.
2011-08-11 03:32:14 -04:00
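The binary search mentioned above can be sketched as a lower-bound search over the region's boxes. This is an illustration, not the find_box_for_y() from pixman-region.c: box_t and first_box_below() are hypothetical names, and the sketch assumes the boxes are stored in y-x banded order so that y2 never decreases along the array.

```c
/* Hypothetical box type; y2 is exclusive */
typedef struct { int x1, y1, x2, y2; } box_t;

/* Return the first box in [begin, end) whose y2 is greater than y,
 * or end if there is none. Boxes before it end at or above y and
 * cannot contain that scanline. */
static const box_t *
first_box_below (const box_t *begin, const box_t *end, int y)
{
    while (begin != end)
    {
        const box_t *mid = begin + (end - begin) / 2;

        if (mid->y2 > y)
            end = mid;        /* mid may be the answer; keep it in range */
        else
            begin = mid + 1;  /* mid ends above y; skip it */
    }

    return begin;
}
```

Compared with a linear scan, this skips all the bands above the rectangle in O(log n) steps, which is where the benefit for complex shaped regions comes from.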
Søren Sandmann Pedersen
795ec5af2f New test of pixman_region_contains_{rectangle,point}
This test generates random regions and checks whether random boxes and
points are contained within them. The results are combined and a CRC32
value is computed and compared to a known-correct one.
2011-08-11 03:32:14 -04:00
Søren Sandmann Pedersen
842591d9d1 Fix lcg_rand_u32() to return 32 random bits.
The lcg_rand() function only returns 15 random bits, so lcg_rand_u32()
would always have 0 in bit 31 and bit 15. Fix that by calling
lcg_rand() three times, to generate 15, 15, and 2 random bits
respectively.

V2: Use the 10/11 most significant bits from the 3 lcg results and mix
them with the low ones from the adjacent one, as suggested by Andrea
Canciani.
2011-08-11 03:32:14 -04:00
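A sketch of the first approach described above (three calls contributing 15, 15, and 2 bits). The generator below is a stand-in: sketch_rand15() is an assumed classic 15-bit LCG, not necessarily identical to lcg_rand() in the test utilities, and the V2 bit-mixing variant is not shown.

```c
#include <stdint.h>

static uint32_t sketch_seed = 1;

/* Assumed stand-in for a generator that yields only 15 random bits */
static uint32_t
sketch_rand15 (void)
{
    sketch_seed = sketch_seed * 1103515245u + 12345u;
    return (sketch_seed >> 16) & 0x7fff;
}

/* Combine three 15-bit results so that all 32 bits can vary */
static uint32_t
sketch_rand_u32 (void)
{
    uint32_t lo  = sketch_rand15 ();        /* bits  0..14 */
    uint32_t mid = sketch_rand15 ();        /* bits 15..29 */
    uint32_t hi  = sketch_rand15 () & 0x3;  /* bits 30..31 */

    return lo | (mid << 15) | (hi << 30);
}
```

Combining only two 15-bit results as (a << 16) | b would leave bits 15 and 31 permanently zero, which is the defect the commit fixes.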
Taekyun Kim
12da53f81c ARM NEON: Standard fast path out_reverse_8_8888
This fast path is frequently used by cairo to do polygon rendering.
Existing NEON code generation framework is used.
2011-08-04 23:38:45 +09:00
Andrea Canciani
b395c3c5a2 radial: Fix typos and trailing whitespace
Correct a typo reported by James Cloos and some reported by automatic
spellchecking.

Remove trailing whitespace.
2011-07-29 12:25:39 +02:00
Siarhei Siamashka
b8d6babc91 ARM: workaround binutils bug #12931 (code sections alignment)
More details in binutils bugtracker:
  http://sourceware.org/bugzilla/show_bug.cgi?id=12931

The problem was encountered in the wild by Mozilla:
  https://bugzilla.mozilla.org/show_bug.cgi?id=672787
2011-07-27 17:07:19 +03:00
Siarhei Siamashka
5754e5689d C fast path for scaled src_x888_8888 with nearest filter
The necessity is justified by a message in the pixman mailing list:
  http://lists.freedesktop.org/archives/pixman/2011-July/001330.html

NONE repeat is not supported, but could be added by tweaking
the interpretation and making use of 'fully_transparent_src'
scanline function argument.
2011-07-22 23:03:36 +03:00
Andrea Canciani
c06af10454 radial: Improve documentation and naming
Add a comment to explain why the tests guarantee that the code always
computes the greatest valid root.

Rename "det" as "discr" to make it match the mathematical name
"discriminant".

Based on a patch by Jeff Muizelaar <jmuizelaar@mozilla.com>.
2011-07-15 22:05:11 +02:00
Cyril Brulebois
69b4ffdbc9 Upload to experimental. 2011-07-05 01:37:39 +02:00
Cyril Brulebois
351ed700c3 Enable parallel building (by passing --parallel to dh $@). 2011-07-05 01:36:48 +02:00
Cyril Brulebois
af6efdfd20 Bump changelogs. 2011-07-04 22:47:03 +02:00
Cyril Brulebois
ff92434b39 Merge branch 'upstream-experimental' into debian-experimental 2011-07-04 22:45:55 +02:00
Søren Sandmann Pedersen
e814b50877 Makefile.am: Add pixman@lists.freedesktop.org to RELEASE_ANNOUNCE_LIST 2011-07-04 15:58:41 -04:00
Søren Sandmann Pedersen
ed6d2f1cec Post-release version bump to 0.23.3 2011-07-04 15:35:17 -04:00
Søren Sandmann Pedersen
6c4001a0e1 Pre-release version bump to 0.23.2 2011-07-04 08:13:19 -04:00
Taekyun Kim
eff7c8efab Bilinear REPEAT_NORMAL source line extension for too short src_width
To avoid function call and other calculation overhead, extend source
scanline into temporary buffer when source width is too small.
Temporary buffer will be repeatedly accessed, so extension cost is
very small due to cache effect.
2011-06-28 23:20:32 +09:00
Taekyun Kim
828794d328 Enable REPEAT_NORMAL bilinear fast path entries 2011-06-28 23:20:29 +09:00
Taekyun Kim
1161b3f9ed ARM: Add REPEAT_NORMAL functions to bilinear BIND macros
Now that the bilinear template supports REPEAT_NORMAL, functions for it
are added to the PIXMAN_ARM_BIND_SCALED_BILINEAR_ macros. Fast path
entries are not enabled yet.
2011-06-28 23:20:27 +09:00
Taekyun Kim
ebd2f06d96 sse2: Declare bilinear src_8888_8888 REPEAT_NORMAL composite function
Now that the bilinear template supports REPEAT_NORMAL, declare composite
functions using it. The function is only declared, not used yet.
2011-06-28 23:20:25 +09:00
Taekyun Kim
7e22b2f782 REPEAT_NORMAL support for bilinear fast path template
The basic idea is to break down normal repeat into a set of
non-repeat scanline compositions and stitch them together.

Bilinear may interpolate last and first pixels of source scanline.
In this case, we can use temporary wrap around buffer.
2011-06-28 23:20:23 +09:00
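The idea of breaking a normal repeat into non-repeating pieces can be shown with a deliberately simplified sketch: an unscaled copy instead of bilinear compositing. copy_repeat_normal() is a hypothetical function; each memcpy() stands in for one non-repeat scanline composition, and the wrap-around buffer needed for interpolating between the last and first source pixels is not modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: fill `width` destination pixels from a source
 * scanline that repeats with period src_width, starting at src_x. */
static void
copy_repeat_normal (const uint32_t *src, int src_width, int src_x,
                    uint32_t *dst, int width)
{
    /* Wrap the starting coordinate into [0, src_width) */
    src_x %= src_width;
    if (src_x < 0)
        src_x += src_width;

    while (width > 0)
    {
        /* Longest run that does not cross the right edge of the source */
        int run = src_width - src_x;

        if (run > width)
            run = width;

        memcpy (dst, src + src_x, (size_t) run * sizeof (uint32_t));

        dst += run;
        width -= run;
        src_x = 0;   /* every later run starts at the left edge */
    }
}
```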
Taekyun Kim
2f025bad43 Replace boolean arguments with flags for bilinear fast path template
By replacing boolean arguments with flags, the code can be more
readable and flags can be extended to do some more things later.

Currently following flags are defined.

FLAG_NONE
    - No flags are turned on.

FLAG_HAVE_SOLID_MASK
    - Template will generate solid mask composite functions.

FLAG_HAVE_NON_SOLID_MASK
    - Template will generate bits mask composite functions.

FLAG_HAVE_SOLID_MASK and FLAG_HAVE_NON_SOLID_MASK should be mutually
exclusive.
2011-06-28 23:20:21 +09:00
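As a small illustration of flags replacing boolean arguments, the sketch below uses the flag names listed above. The numeric values and the helper mask_flags_are_valid() are assumptions made for this example; they are not taken from the fast path template.

```c
/* Flag names come from the commit message; the values are assumed. */
enum
{
    FLAG_NONE                = 0,
    FLAG_HAVE_SOLID_MASK     = 1 << 1,
    FLAG_HAVE_NON_SOLID_MASK = 1 << 2
};

/* Hypothetical helper: the two mask flags are mutually exclusive,
 * so a combination with both bits set is rejected. */
static int
mask_flags_are_valid (unsigned int flags)
{
    unsigned int both = FLAG_HAVE_SOLID_MASK | FLAG_HAVE_NON_SOLID_MASK;

    return (flags & both) != both;
}
```

A single flags parameter also leaves room for new options later without changing every call site, which is the extensibility the commit message refers to.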
Søren Sandmann
4d4d1760e8 test: Make fuzzer-find-diff.pl executable 2011-06-25 10:17:50 -04:00
Søren Sandmann
ece8d13bf7 ARM: Fix two bugs in neon_composite_over_n_8888_0565_ca().
The first bug is that a vmull.u8 instruction would store its result in
the q1 register, clobbering the d2 register used later on. The second
is that a vraddhn instruction would overwrite d25, corrupting the q12
register used later.

Fixing the second bug caused a pipeline bubble where the d18 register
would be unavailable for a clock cycle. This is fixed by swapping the
instruction with its successor.
2011-06-25 10:17:05 -04:00
Søren Sandmann Pedersen
5715a394c4 blitters-test: Make common formats more likely to be tested.
Move the eight most common formats to the top of the list of image
formats and make create_random_image() much more likely to select one
of those eight formats.

This should help catch more bugs in SIMD optimized operations.
2011-06-25 10:17:05 -04:00
Andrea Canciani
d815a1c54a Silence autoconf warnings
Autoconf 2.86 reports:

warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body

Every code fragment must be wrapped in [AC_LANG_SOURCE([...])]
2011-06-23 10:47:43 +02:00
Søren Sandmann Pedersen
a89f8cfaf1 Replace arguments to composite functions with a pointer to a struct
This allows more information, such as flags or the composite region,
to be passed to the composite functions.
2011-06-20 02:03:23 -04:00
Søren Sandmann Pedersen
99e7d8fab5 In pixman-general.c rename image_parameters to {src, mask, dest}_image
All the fast paths generally use these names as well.
2011-06-12 16:45:57 -04:00
Søren Sandmann Pedersen
4d713e3120 Replace instances of "dst_*" with "dest_*"
The variables in question were dst_x, dst_y, dst_image. The majority
of _x and _y uses were already dest_x and dest_y, while the majority
of _image uses were dst_image.
2011-06-12 16:45:57 -04:00
Julien Cristau
9d5bef2fcf Upload to unstable 2011-06-12 17:02:08 +02:00
Julien Cristau
90f71ced40 Bump changelogs 2011-06-12 17:01:38 +02:00
Julien Cristau
045dd15b6a Merge tag 'pixman-0.22.0' into debian-unstable 2011-06-12 17:00:36 +02:00
Julien Cristau
105c2e8664 Bump Standards-Version to 3.9.2. 2011-06-12 16:59:43 +02:00
Julien Cristau
3bb65959ee Add changelog entry for multiarch 2011-06-12 16:58:06 +02:00
Julien Cristau
f7a60c64ac Don't ship debug symbols for the udeb 2011-06-12 16:57:28 +02:00
Julien Cristau
94b5f3b6a4 Merge branch 'multiarch' of git.debian.org:/git/pkg-xorg/lib/pixman into debian-unstable
Conflicts:
	debian/control
	debian/rules
2011-06-12 16:55:38 +02:00
Søren Sandmann
6aceb767aa demos: Comment out some unused variables 2011-05-31 18:07:34 -04:00
Søren Sandmann
4abe76432a sse2: Delete some unused variables 2011-05-31 18:07:26 -04:00
Søren Sandmann
5c60e1855b mmx: Delete some unused variables 2011-05-31 18:06:43 -04:00
Andrea Canciani
827e613338 Include noop in win32 builds 2011-05-29 10:02:21 +02:00
Nis Martensen
65b63728cc Fix a few typos in pixman-combine.c.template
Some equations have too much multiplication with alpha.
2011-05-24 10:01:37 -04:00
Søren Sandmann Pedersen
dd449a2a8e Move NOP src iterator into noop implementation.
The iterator for sources where neither RGB nor ALPHA is needed, really
belongs in the noop implementation.
2011-05-19 13:46:56 +00:00
Søren Sandmann Pedersen
ba480882aa Move NULL iterator into pixman-noop.c
Iterating a NULL image returns NULL for all scanlines. We may as well
do this in the noop iterator.
2011-05-19 13:46:56 +00:00
Søren Sandmann Pedersen
a4e984de19 Add a noop src iterator
When the image is a8r8g8b8 and not transformed, and the fetched
rectangle is within the image bounds, scanlines can be fetched by
simply returning a pointer instead of copying the bits.
2011-05-19 13:46:56 +00:00
Søren Sandmann Pedersen
d4fff4a959 Move noop dest fetching to noop implementation
It will at some point become useful to have CPU specific destination
iterators. However, a problem with that, is that such iterators should
not be used if we can composite directly in the destination image.

By moving the noop destination iterator to the noop implementation, we
can ensure that it will be chosen before any CPU specific iterator.
2011-05-19 13:46:50 +00:00
Søren Sandmann Pedersen
13ce88f800 Add a noop composite function for the DST operator
The DST operator doesn't actually do anything, so add a noop "fast
path" for it, instead of checking in pixman_image_composite32().

The performance tradeoff here is that we get rid of a test for DST in
the common case where the operator is not DST, in return for an extra
walk over the clip rectangles in the uncommon case where the operator
actually is DST.
2011-05-19 13:45:59 +00:00
Søren Sandmann Pedersen
8c76235f41 Add a "noop" implementation.
This new implementation is ahead of all other implementations in the
fallback chain and is supposed to contain operations that are "noops",
ie., they don't require any work. For example, it might contain a
"fast path" for the DST operator that doesn't actually do anything or
an iterator for a8r8g8b8 that just returns a pointer into the image.
2011-05-19 13:45:59 +00:00
Andrea Canciani
0f6a4d4588 test: Fix compilation on win32
MSVC complains about uint32_t being used as an expression:

composite.c(902) : error C2275: 'uint32_t' : illegal use of this type
as an expression
2011-05-17 00:29:55 +02:00
Dave Yeo
838c2b593e Check for working mmap()
OS/2 doesn't have a working mmap().
2011-05-09 12:38:44 +02:00
Søren Sandmann Pedersen
c53625a36e Post-release version bump to 0.23.1 2011-05-02 05:11:49 -04:00
Søren Sandmann Pedersen
918a544406 Pre-release version bump to 0.22.0 2011-05-02 05:06:33 -04:00
Cyril Brulebois
2296b15c9d Upload to unstable. 2011-04-29 17:53:20 +02:00
Cyril Brulebois
c48a9b8035 Mention endianness-related FTBFS fix (Closes: #622211). 2011-04-29 17:53:09 +02:00
Cyril Brulebois
fa956ebd6b Bump changelogs. 2011-04-29 17:52:36 +02:00
Cyril Brulebois
d06147d984 Merge branch 'upstream-unstable' into debian-unstable 2011-04-29 17:51:32 +02:00
Søren Sandmann Pedersen
71b2e2745b Post-release version bump to 0.21.9 2011-04-19 00:22:29 -04:00
Søren Sandmann Pedersen
89868e93bd Pre-release version bump to 0.21.8 2011-04-19 00:00:37 -04:00
Taekyun Kim
33f1652b95 ARM: Enable bilinear fast paths using scanline functions in pixman-arm-neon-asm-bilinear.S
Enable fast paths that are supported by the scanline functions in
pixman-arm-neon-asm-bilinear.S
2011-04-18 16:49:46 -04:00
Taekyun Kim
e8185f1cb4 ARM: NEON scanline functions for bilinear scaling
General fetch->combine->store based bilinear scanline functions.
Need further optimizations and eventually will be replaced with optimal
functions one by one.
General functions should be located in pixman-arm-neon-asm-bilinear.S and
optimal functions in pixman-arm-neon-asm.S

Following general bilinear scanline functions are implemented
    over_8888_8888
    add_8888_8888
    src_8888_8_8888
    src_8888_8_0565
    src_0565_8_x888
    src_0565_8_0565
    over_8888_8_8888
    add_8888_8_8888
2011-04-18 16:49:43 -04:00
Taekyun Kim
00939d3562 ARM: Common macro for scaled bilinear scanline function with A8 mask
Defining PIXMAN_ARM_BIND_SCALED_BILINEAR_SRC_A8_DST macro for declaration of
scaled bilinear scanline functions in common header.
2011-04-18 16:49:40 -04:00
Søren Sandmann Pedersen
b455496890 Offset rendering in pixman_composite_trapezoids() by (x_dst, y_dst)
Previously, this function would do coordinate calculations in such a
way that (x_dst, y_dst) would only affect the alignment of the source
image, but not of the traps, which would always be considered to be in
absolute destination coordinates. This is unlike the
pixman_image_composite() function which also registers the mask to the
destination.

This patch makes it so that traps are also offset by (x_dst, y_dst).

Also add a comment explaining how this function is supposed to
operate, and update tri-test.c and composite-trap-test.c to deal with
the new semantics.
2011-04-18 16:27:29 -04:00
Søren Sandmann Pedersen
e75e6a4ef5 ARM: Add 'neon_composite_over_n_8888_0565_ca' fast path
This improves the performance of the firefox-talos-gfx benchmark with
the image16 backend. Benchmark on an 800 MHz ARM Cortex A8:

Before:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]  image16            firefox-talos-gfx  121.773  122.218   0.15%    6/6

After:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]  image16            firefox-talos-gfx   85.247   85.563   0.22%    6/6

V2: Slightly better instruction scheduling based on comments from Taekyun Kim.
V3: Eliminate all stalls from the inner loop. Also based on comments from Taekyun Kim.
2011-04-18 16:25:36 -04:00
Gilles Espinasse
1670b95214 Fix OpenMP not supported case
PIXMAN_LINK_WITH_ENV did not fail unless -Wall -Werror is used.
So even when the compiler did not support OpenMP, USE_OPENMP was defined.
Fix that by running the second OpenMP test only when the first AC_OPENMP check finds support.

configure was tested in the following cases:
gcc without libgomp support: no openmp option, --enable-openmp and --disable-openmp
gcc with libgomp support: no openmp option, --enable-openmp and --disable-openmp

Not tested with autoconf versions that do not know about OpenMP (<2.62)

Warn when --enable-openmp is requested but no support is found

Signed-off-by: Gilles Espinasse <g.esp@free.fr>
2011-04-18 16:13:58 -04:00
Gilles Espinasse
b9e8f7fb74 Fix missing AC_MSG_RESULT value from Werror test
Use the correct variable name

Signed-off-by: Gilles Espinasse <g.esp@free.fr>
2011-04-18 16:13:58 -04:00
Siarhei Siamashka
caae4e82ff ARM: pipelined NEON implementation of bilinear scaled 'src_8888_0565'
Benchmark on ARM Cortex-A8 r1p3 @600MHz, 32-bit LPDDR @166MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=10020565, speed=33.59 MPix/s
  after:  op=1, src=20028888, dst=10020565, speed=46.25 MPix/s

Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=10020565, speed=63.86 MPix/s
  after:  op=1, src=20028888, dst=10020565, speed=84.22 MPix/s
2011-04-11 10:48:35 +03:00
Siarhei Siamashka
d080d59b80 ARM: pipelined NEON implementation of bilinear scaled 'src_8888_8888'
Performance of the inner loop when working with the data in L1 cache:
    ARM Cortex-A8: 41 cycles per 4 pixels (no stalls and partial dual issue)
    ARM Cortex-A9: 48 cycles per 4 pixels (no stalls)

It might still be possible to improve performance further on ARM Cortex-A8
with better use of dual issue.

Benchmark on ARM Cortex-A8 r1p3 @600MHz, 32-bit LPDDR @166MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=40.38 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=48.47 MPix/s

Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=79.68 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=93.11 MPix/s
2011-04-11 10:48:30 +03:00
Siarhei Siamashka
b496a8b279 ARM: support different levels of loop unrolling in bilinear scaler
Now an extra 'flag' parameter is supported in bilinear scanline scaling
function generation macro. It can be used to enable 4 or 8 pixels per
loop iteration unrolling and provide save/restore code for d8-d15
registers.
2011-04-11 10:48:24 +03:00
Siarhei Siamashka
34ca9cf03f ARM: use fewer ARM instructions in NEON bilinear scaling code
This reduces code size and also puts less pressure on the
instruction decoder.
2011-04-11 10:48:14 +03:00
Siarhei Siamashka
0f7be9f72e ARM: support for software pipelining in bilinear macros
Now it's possible to override the main loop of bilinear scaling code
with optimized pipelined implementation.
2011-04-11 10:48:10 +03:00
Siarhei Siamashka
9638af9583 ARM: use aligned memory writes in NEON bilinear scaling code 2011-04-11 10:48:05 +03:00
Siarhei Siamashka
8bba3a0e1e ARM: tweaked horizontal weights update in NEON bilinear scaling code
Moving horizontal interpolation weights update instructions from the
beginning of the loop to its end makes it possible to hide some pipeline
stalls and improves performance.
2011-04-11 10:48:01 +03:00
Cyril Brulebois
eade7b4dbd Upload to unstable. 2011-04-10 23:08:45 +02:00
Søren Sandmann Pedersen
a215322267 ARM: Tiny improvement in over_n_8888_8888_ca_process_pixblock_head
Instead of two

	mvn d24, d24
	mvn d25, d25

use just one

	mvn q12, q12

Also move another vmvn instruction into the created pipeline bubble,
as pointed out by Siarhei.
2011-04-06 23:03:19 -04:00
Søren Sandmann Pedersen
44f99735d9 Makefile.am: Put development releases in "snapshots" directory
Up until now, all pixman releases, both snapshots and stable releases, were
uploaded to the "releases" directory on www.cairographics.org, but
it's better to put development snapshots in the "snapshots" directory.

This patch changes Makefile.am to do that.
2011-04-06 23:03:10 -04:00
Steve Langasek
c6ce22e73a build for multiarch 2011-03-26 00:30:06 -07:00
Søren Sandmann Pedersen
ad3cbfb073 test: Fix infinite loop in composite
When run in PIXMAN_RANDOMIZE_TESTS mode, this test would go into an
infinite loop because the loop started at 'seed' but the stop
condition was still N_TESTS.
2011-03-22 13:43:29 -04:00
Alexandros Frantzis
b514e63cfc Add support for the r8g8b8a8 and r8g8b8x8 formats to the tests. 2011-03-22 13:43:29 -04:00
Alexandros Frantzis
f05a90e5f8 Add simple support for the r8g8b8a8 and r8g8b8x8 formats.
This format is particularly useful on big-endian architectures, where RGBA in
memory/file order corresponds to r8g8b8a8 as an uint32_t. This is important
because RGBA is in some cases the only available choice (for example as a pixel
format in OpenGL ES 2.0).
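The byte-order argument can be illustrated with a small sketch. The helper names below are hypothetical and not part of pixman; they only show how four bytes stored in R, G, B, A memory order read as one 32-bit word on a big-endian versus a little-endian machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the value a big-endian CPU sees when it loads
 * four consecutive bytes as a single uint32_t. */
static uint32_t
load_u32_big_endian (const uint8_t *bytes)
{
    assert (bytes != NULL);

    return ((uint32_t) bytes[0] << 24) | ((uint32_t) bytes[1] << 16) |
           ((uint32_t) bytes[2] << 8)  |  (uint32_t) bytes[3];
}

/* Hypothetical helper: the same load as seen by a little-endian CPU. */
static uint32_t
load_u32_little_endian (const uint8_t *bytes)
{
    assert (bytes != NULL);

    return ((uint32_t) bytes[3] << 24) | ((uint32_t) bytes[2] << 16) |
           ((uint32_t) bytes[1] << 8)  |  (uint32_t) bytes[0];
}
```

With bytes R=0x11, G=0x22, B=0x33, A=0x44, the big-endian load has R in the top byte and A in the bottom byte, which matches the r8g8b8a8 layout. The little-endian load of the same bytes has A on top and R at the bottom, which corresponds to a8b8g8r8.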
2011-03-22 13:43:29 -04:00
Søren Sandmann Pedersen
7eb0abb5e8 test: Randomize some tests if PIXMAN_RANDOMIZE_TESTS is set
This patch makes it so that composite and stress-test will start from a
random seed if the PIXMAN_RANDOMIZE_TESTS environment variable is
set. Running the test suite in this mode is useful to get more test
coverage.

Also, in stress-test.c make it so that setting the initial seed causes
threads to be turned off. This makes it much easier to see when
something fails.
2011-03-19 08:51:35 -04:00
Søren Sandmann Pedersen
6b27768d81 Simplify the prototype for iterator initializers.
All of the information previously passed to the iterator initializers
is now available in the iterator itself, so there is no need to pass
it as arguments anymore.
2011-03-18 16:23:10 -04:00
Søren Sandmann Pedersen
74d0f44b6d Fill out parts of iters in _pixman_implementation_{src,dest}_iter_init()
This makes _pixman_implementation_{src,dest}_iter_init() responsible
for filling parts of the information in the iterators. Specifically,
the information passed as arguments is stored in the iterator.

Also add a height field to pixman_iter_t.
2011-03-18 16:23:10 -04:00
Søren Sandmann Pedersen
be4eaa0e4f In delegate_{src,dest}_iter_init() call delegate directly.
There is no reason to go through
_pixman_implementation_{src,dest}_iter_init(), especially since
_pixman_implementation_src_iter_init() is doing various other checks
that only need to be done once.

Also call delegate->src_iter_init() directly in pixman-sse2.c
2011-03-18 16:23:10 -04:00
Siarhei Siamashka
70a923882c ARM: a bit faster NEON bilinear scaling for r5g6b5 source images
Instruction scheduling was improved in the code responsible for fetching r5g6b5
pixels and converting them to the intermediate x8r8g8b8 color format used in
the interpolation part of the code. A lot of NEON stalls still remain,
which can be resolved later by the use of pipelining.

Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=10020565, dst=10020565, speed=32.29 MPix/s
          op=1, src=10020565, dst=20020888, speed=36.82 MPix/s
  after:  op=1, src=10020565, dst=10020565, speed=41.35 MPix/s
          op=1, src=10020565, dst=20020888, speed=49.16 MPix/s
2011-03-12 21:30:22 +02:00
Siarhei Siamashka
fe99673719 ARM: NEON optimization for bilinear scaled 'src_0565_0565'
Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=10020565, dst=10020565, speed=3.30 MPix/s
  after:  op=1, src=10020565, dst=10020565, speed=32.29 MPix/s
2011-03-12 21:30:18 +02:00
Siarhei Siamashka
29003c3bef ARM: NEON optimization for bilinear scaled 'src_0565_x888'
Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=10020565, dst=20020888, speed=3.39 MPix/s
  after:  op=1, src=10020565, dst=20020888, speed=36.82 MPix/s
2011-03-12 21:30:13 +02:00
Siarhei Siamashka
2ee27e7d79 ARM: NEON optimization for bilinear scaled 'src_8888_0565'
Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=10020565, speed=6.56 MPix/s
  after:  op=1, src=20028888, dst=10020565, speed=61.65 MPix/s
2011-03-12 21:30:09 +02:00
Siarhei Siamashka
11a0c5badb ARM: use common macro template for bilinear scaled 'src_8888_8888'
This is a cleanup of old and now duplicated code. The performance improvement
comes mostly from the enabled use of software prefetch, but instruction
scheduling is also slightly better.

Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=53.24 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=74.36 MPix/s
2011-03-12 21:30:05 +02:00
Siarhei Siamashka
34098dba67 ARM: NEON: common macro template for bilinear scanline scalers
This makes it possible to generate bilinear scanline scaling functions targeting
various source and destination color formats. Right now a8r8g8b8/x8r8g8b8
and r5g6b5 color formats are supported. More formats can be added if needed.
2011-03-12 21:30:00 +02:00
Siarhei Siamashka
66f4ee1b3b ARM: new bilinear fast path template macro in 'pixman-arm-common.h'
It can be reused in different ARM NEON bilinear scaling fast path functions.
2011-03-12 21:29:56 +02:00
Siarhei Siamashka
5921c17639 ARM: assembly optimized nearest scaled 'src_8888_8888'
Benchmark on ARM Cortex-A8 r1p3 @500MHz, 32-bit LPDDR @166MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=44.36 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=39.79 MPix/s

Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=102.36 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=163.12 MPix/s
2011-03-12 21:26:05 +02:00
Siarhei Siamashka
f3e17872f5 ARM: common macro for nearest scaling fast paths
The code of nearest scaled 'src_0565_0565' function was generalized
and moved to a common macro, so that it can be reused for other
fast paths.
2011-03-12 21:24:40 +02:00
Siarhei Siamashka
bb3d1b67fd ARM: use prefetch in nearest scaled 'src_0565_0565'
Benchmark on ARM Cortex-A8 r1p3 @500MHz, 32-bit LPDDR @166MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=10020565, dst=10020565, speed=75.02 MPix/s
  after:  op=1, src=10020565, dst=10020565, speed=73.63 MPix/s

Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=10020565, dst=10020565, speed=176.12 MPix/s
  after:  op=1, src=10020565, dst=10020565, speed=267.50 MPix/s
2011-03-12 21:23:54 +02:00
Cyril Brulebois
3503f7956f Upload to experimental. 2011-03-09 04:08:04 +01:00
Cyril Brulebois
19f2d3d9c1 Bump Standards-Version to 3.9.1 (no changes needed). 2011-03-09 04:07:54 +01:00
Cyril Brulebois
bec6320b0e Add a quilt series placeholder file. 2011-03-09 04:04:13 +01:00
Cyril Brulebois
43375c5d66 Switch to dh. 2011-03-09 03:55:08 +01:00
Cyril Brulebois
d3975d7ff9 Update Uploaders list. Thanks, David! 2011-03-09 03:42:00 +01:00
Cyril Brulebois
b03a2e477b Remove libpixman1-dev from Conflicts, last seen in etch! 2011-03-09 03:41:05 +01:00
Cyril Brulebois
61363cc614 Wrap Build-Depends. 2011-03-09 03:40:06 +01:00
Cyril Brulebois
b98292b4d5 Bump shlibs accordingly. 2011-03-09 03:39:07 +01:00
Cyril Brulebois
1e6491fdde Update symbols file with new symbols. 2011-03-09 03:38:42 +01:00
Cyril Brulebois
1d60bb92f7 Bump changelogs. 2011-03-09 03:21:07 +01:00
Cyril Brulebois
a0ab0aecb2 Merge branch 'upstream-experimental' into debian-experimental 2011-03-09 03:20:36 +01:00
Søren Sandmann Pedersen
84e361c8e3 test: Do endian swapping of the source and destination images.
Otherwise the test fails on big endian. Fix for bug 34767, reported by
Siarhei Siamashka.
2011-03-07 14:08:00 -05:00
Søren Sandmann Pedersen
84f3c5a71a test: In image_endian_swap() use pixman_image_get_format() to get the bpp.
There is no reason to pass in the bpp as an argument; it can be gotten
directly from the image.
2011-03-07 14:07:44 -05:00
Siarhei Siamashka
17feaa9c50 ARM: NEON optimization for bilinear scaled 'src_8888_8888'
Initial NEON optimization for bilinear scaling. It can probably be
improved further.

Benchmark on ARM Cortex-A8:
 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=6.70 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=44.27 MPix/s
2011-02-28 15:47:58 +02:00
Siarhei Siamashka
350029396d SSE2 optimization for bilinear scaled 'src_8888_8888'
A naive implementation of bilinear scaling using SSE2 intrinsics,
which only handles one pixel at a time. It is approximately 2x faster than
the pixman general compositing path. Single-pass processing without an
intermediate temporary buffer contributes ~15% and loop unrolling contributes
~20% of this speedup.

Benchmark on Intel Core i7 (x86-64):
 Using cairo-perf-trace:
  before: image        firefox-planet-gnome   12.566   12.610   0.23%    6/6
  after:  image        firefox-planet-gnome   10.961   11.013   0.19%    5/6

 Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=70.48 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=165.38 MPix/s
2011-02-28 15:47:52 +02:00
Siarhei Siamashka
0df43b8ae5 test: check correctness of 'bilinear_pad_repeat_get_scanline_bounds'
Individual correctness check for the new bilinear scaling related
supplementary function. This test program uses a bit wider range
of input arguments, not covered by other tests.
2011-02-28 15:29:23 +02:00
Siarhei Siamashka
d506bf68fd Main loop template for fast single pass bilinear scaling
Can be used for implementing SIMD optimized fast path
functions which work with bilinear scaled source images.

Similar to the template for nearest scaling main loop, the
following types of mask are supported:
1. no mask
2. non-scaled a8 mask with SAMPLES_COVER_CLIP flag
3. solid mask

PAD repeat is fully supported. NONE repeat is partially
supported (right now only works if source image has alpha
channel or when alpha channel of the source image does not
have any effect on the compositing operation).
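As a rough illustration of what the inner step of such a main loop computes for each destination pixel, here is a minimal scalar sketch of bilinear interpolation between four a8r8g8b8 pixels. The function names and the choice of 8-bit fixed-point weights (0..255) are assumptions made for this example; this is not the template from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: interpolate the 8-bit channel found at bit offset
 * 'shift' in four neighbouring pixels. distx/disty are weights in 0..255. */
static uint32_t
lerp_channel (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
              uint32_t distx, uint32_t disty, int shift)
{
    /* Horizontal pass: blend left and right pixels of each row. */
    uint32_t top    = ((tl >> shift) & 0xff) * (256 - distx) +
                      ((tr >> shift) & 0xff) * distx;
    uint32_t bottom = ((bl >> shift) & 0xff) * (256 - distx) +
                      ((br >> shift) & 0xff) * distx;

    /* Vertical pass: blend the two rows and drop the 16 fraction bits. */
    return ((top * (256 - disty) + bottom * disty) >> 16) << shift;
}

/* Hypothetical helper: bilinear blend of four a8r8g8b8 pixels. */
static uint32_t
bilinear_interpolate_8888 (uint32_t tl, uint32_t tr,
                           uint32_t bl, uint32_t br,
                           uint32_t distx, uint32_t disty)
{
    assert (distx < 256 && disty < 256);

    return lerp_channel (tl, tr, bl, br, distx, disty, 24) |
           lerp_channel (tl, tr, bl, br, distx, disty, 16) |
           lerp_channel (tl, tr, bl, br, distx, disty, 8)  |
           lerp_channel (tl, tr, bl, br, distx, disty, 0);
}
```

A SIMD fast path would perform the same arithmetic on several channels or pixels at once; the scalar form is only meant to make the per-pixel work explicit.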
2011-02-28 15:29:16 +02:00
Andrea Canciani
9ebde285fa test: Silence MSVC warnings
MSVC does not notice non-returning functions (abort() / assert(0))
and warns about paths which end with them in non-void functions:

c:\cygwin\home\ranma42\code\fdo\pixman\test\fetch-test.c(114) :
warning C4715: 'reader' : not all control paths return a value
c:\cygwin\home\ranma42\code\fdo\pixman\test\stress-test.c(133) :
warning C4715: 'real_reader' : not all control paths return a value
c:\cygwin\home\ranma42\code\fdo\pixman\test\composite.c(431) :
warning C4715: 'calc_op' : not all control paths return a value

These warnings can be silenced by adding a return after the
termination call.
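A minimal sketch of the pattern, using a hypothetical enum and function rather than the actual test code:

```c
#include <assert.h>

/* Hypothetical operator enum, used only for this illustration. */
typedef enum { EXAMPLE_OP_CLEAR, EXAMPLE_OP_SRC, EXAMPLE_OP_OVER } example_op_t;

static int
example_op_reads_dest (example_op_t op)
{
    switch (op)
    {
    case EXAMPLE_OP_CLEAR:
        return 0;
    case EXAMPLE_OP_SRC:
        return 0;
    case EXAMPLE_OP_OVER:
        return 1;
    }

    assert (0);   /* unknown operator */
    return 0;     /* not reached in debug builds; gives this path a value */
}
```

The trailing return costs nothing at run time in the normal case and gives every control path a value, which is what warning C4715 asks for.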
2011-02-28 10:38:02 +01:00
Andrea Canciani
8868778ea1 Do not include unused headers
pixman-combine32.h is included without being used both in
pixman-image.c and in pixman-general.c.
2011-02-28 10:38:02 +01:00
Andrea Canciani
72f5e5f608 test: Add Makefile for Win32 2011-02-28 10:38:02 +01:00
Andrea Canciani
11305b4ecd test: Fix tests for compilation on Windows
The Microsoft C compiler cannot handle subobject initialization and
Win32 does not provide snprintf.

Work around these limitations by using normal struct initialization
and using sprintf (a manual check shows that the buffer size is
sufficient).
2011-02-28 10:38:02 +01:00
Andrea Canciani
20ed723a5a Fix compilation on Win32
Makefile.win32 contained a typo and was missing the dependency on
the built sources.
2011-02-28 10:38:01 +01:00
Søren Sandmann Pedersen
48e951000c Post-release version bump to 0.21.7 2011-02-22 16:13:32 -05:00
Søren Sandmann Pedersen
8b33321660 Pre-release version bump to 0.21.6 2011-02-22 15:43:41 -05:00
Søren Sandmann Pedersen
2cb67d2a0b Minor fix to the RELEASING file 2011-02-22 15:40:34 -05:00
Søren Sandmann Pedersen
3cdf74257b Delete pixman-x64-mmx-emulation.h from pixman/Makefile.am 2011-02-22 15:28:17 -05:00
Siarhei Siamashka
65919ad17f Ensure that tests run as the last step of a build for 'make check'
Previously 'make check' would compile and run tests first, and only
then proceed to compiling demos. This was inconvenient because one
had to scroll back through the console output to see the test
verdict. Swapping the order of the SUBDIRS variable entries in
Makefile.am resolves this.
2011-02-22 19:43:57 +02:00
Søren Sandmann Pedersen
34a7ac0474 sse2: Minor coding style cleanups.
Also make pixman_fill_sse2() static.
2011-02-18 16:03:30 -05:00
Søren Sandmann Pedersen
10f69e5ec8 sse2: Remove pixman-x64-mmx-emulation.h
Also stop including mmintrin.h
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
984be4def2 sse2: Delete obsolete or redundant comments 2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
33d9890226 sse2: Remove all the core_combine_* functions
Now that _mm_empty() is not used anymore, they are no longer different
from the sse2_combine_* functions, so they can be consolidated.
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
87cd6b8056 sse2: Don't compile pixman-sse2.c with -mmmx anymore
It's not necessary now that the file doesn't use MMX instructions.
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
e7fe5e35e9 sse2: Delete unused MMX functions and constants and all _mm_empty()s
These are not needed because the SSE2 implementation doesn't use MMX
anymore.
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
f88ae14c15 sse2: Convert all uses of MMX registers to use SSE2 registers instead.
By avoiding use of MMX registers we won't need to call emms all over
the place, which avoids various miscompilation issues.
2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
7fb75bb3e6 Coding style: core_combine_in_u_pixelsse2 -> core_combine_in_u_pixel_sse2 2011-02-18 16:03:29 -05:00
Søren Sandmann Pedersen
510c0d088a In pixman_image_set_transform() allow NULL for transform
Previously, this would crash unless the existing transform were also
NULL.
2011-02-18 06:21:38 -05:00
Søren Sandmann Pedersen
7feb710e60 Avoid marking images dirty when properties are reset
When an image property is set to the value it already has,
there is no reason to mark the image dirty and incur a recomputation
of the flags.
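The idea can be sketched with a hypothetical image struct and setter; the names and fields are assumptions for illustration, not pixman's actual data structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified image: one property plus a dirty flag. */
typedef struct
{
    int repeat;
    int dirty;   /* non-zero means derived flags must be recomputed */
} example_image_t;

static void
example_image_set_repeat (example_image_t *image, int repeat)
{
    assert (image != NULL);

    /* Nothing changes, so there is nothing to recompute. */
    if (image->repeat == repeat)
        return;

    image->repeat = repeat;
    image->dirty = 1;
}
```

Callers that reset the same property on every frame then pay for the flag recomputation only when the value actually changes.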
2011-02-18 06:21:37 -05:00
Søren Sandmann Pedersen
3598ec26ec Add new public function pixman_add_triangles()
This allows some more code to be deleted from the X server. The
implementation consists of converting to trapezoids, and is shared
with pixman_composite_triangles().
2011-02-18 06:21:37 -05:00
Søren Sandmann Pedersen
964c7e7cd2 Optimize adding opaque trapezoids onto a8 destination.
When the source is opaque and the destination is alpha only, we can
avoid the temporary mask and just add the trapezoids directly.
2011-02-18 06:21:37 -05:00
Søren Sandmann Pedersen
0bc03482f1 Add a test program, tri-test
This program tests whether the new triangle support works.
2011-02-18 06:21:31 -05:00
Søren Sandmann Pedersen
79e69aac8c Add support for triangles to pixman.
The Render X extension can draw triangles as well as trapezoids, but
the implementation has always converted them to trapezoids. This patch
moves the X server's triangle conversion code into pixman, where we
can reuse the pixman_composite_trapezoid() code.
2011-02-15 09:25:18 -05:00
Søren Sandmann Pedersen
4e6dd4928d Add a test program for pixman_composite_trapezoids().
A CRC32 based test program to check that pixman_composite_trapezoids()
actually works.
2011-02-15 09:25:18 -05:00
Søren Sandmann Pedersen
803272e38c Add pixman_composite_trapezoids().
This function is an implementation of the X server request
Trapezoids. That request is what the X backend of cairo is using all
the time; by moving it into pixman we can hopefully make it faster.
2011-02-15 09:25:18 -05:00
Søren Sandmann Pedersen
1feaf6bea7 test/Makefile.am: Move all the TEST_LDADD into a new global LDADD.
This gets rid of a bunch of replicated *_LDADD clauses
2011-02-15 09:25:17 -05:00
Søren Sandmann Pedersen
1237fd9bc8 Add @TESTPROGS_EXTRA_LDFLAGS@ to AM_LDFLAGS
Instead of explicitly adding it to each test program.
2011-02-15 09:25:17 -05:00
Søren Sandmann Pedersen
7dfe845786 Move all the GTK+ based test programs to a new subdir, "demos"
This separates the test suite from the random gtk+ using test
programs. "demos" is somewhat misleading because the programs there
are not particularly exciting (with the possible exception of
composite-test which shows off all the compositing operators).
2011-02-15 09:25:17 -05:00
Siarhei Siamashka
8e4100260b SSE2 optimization for nearest scaled over_8888_n_8888
This operation shows up a little bit in some of the html5 based
games from http://www.kesiev.com/akihabara/

=== Cairo trace of the game intro animation for 'Legend of Sadness' ===

before:
[  0]    image    firefox-legend-of-sadness   46.286   46.298   0.01%    5/6

after:
[  0]    image    firefox-legend-of-sadness   45.088   45.102   0.04%    6/6

=== Microbenchmark (scaling ~2000x~2000 -> ~2000x~2000) ===

before:
    translucent: op=3, src=8888, mask=s dst=8888, speed=131.30 MPix/s
    transparent: op=3, src=8888, mask=s dst=8888, speed=132.38 MPix/s
    opaque:      op=3, src=8888, mask=s dst=8888, speed=167.90 MPix/s
after:
    translucent: op=3, src=8888, mask=s dst=8888, speed=301.93 MPix/s
    transparent: op=3, src=8888, mask=s dst=8888, speed=770.70 MPix/s
    opaque:      op=3, src=8888, mask=s dst=8888, speed=301.80 MPix/s
2011-02-15 14:32:41 +02:00
Siarhei Siamashka
39b86b032d ARM: NEON optimization for nearest scaled over_0565_8_0565
In some cases may be used for html5 video when hardware acceleration
is not available.
2011-02-15 14:32:34 +02:00
Siarhei Siamashka
9a90c1c90f ARM: NEON optimization for nearest scaled over_8888_8_0565
In some cases may be used for html5 video when hardware acceleration
is not available.
2011-02-15 14:32:28 +02:00
Siarhei Siamashka
cd1062ded4 ARM: new macro template for using scaled fast paths with a8 mask 2011-02-15 14:32:23 +02:00
Siarhei Siamashka
b099957887 Better support for NONE repeat in nearest scaling main loop template
Scaling function now gets an extra boolean argument, which is set
to TRUE when we are fetching padding pixels for NONE repeat. This
makes it possible to decide whether to interpret alpha as 0xFF or 0x00
for such pixels when working with formats which don't have alpha
channel (for example x8r8g8b8 and r5g6b5).
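A simplified sketch of how such a flag might be used when the source is x8r8g8b8. The function name and signature are hypothetical, and the actual scaling step (stepping through the source with a fixed-point increment) is left out to keep the alpha decision visible:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scanline function for an x8r8g8b8 source. When
 * 'fully_transparent_src' is set we are producing padding pixels for
 * NONE repeat and 'src' is not read at all. */
static void
example_fetch_x888_scanline (uint32_t       *dst,
                             const uint32_t *src,
                             int             width,
                             int             fully_transparent_src)
{
    int i;

    assert (dst != NULL && width >= 0);

    for (i = 0; i < width; i++)
    {
        if (fully_transparent_src)
            dst[i] = 0;                      /* outside the image: alpha 0x00 */
        else
            dst[i] = src[i] | 0xff000000u;   /* inside the image: alpha 0xFF */
    }
}
```

Without the flag, a format that has no alpha channel gives the function no way to tell a real opaque pixel from a padding pixel.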
2011-02-15 14:32:16 +02:00
Siarhei Siamashka
14f82083a1 Support for a8 and solid mask in nearest scaling main loop template
In addition to the most common case of not having any mask at all, two
variants of scaling with mask show up in cairo traces:
1. non-scaled a8 mask with SAMPLES_COVER_CLIP flag
2. solid mask

This patch extends the nearest scaling main loop template to also
support these cases.
2011-02-15 14:32:06 +02:00
Siarhei Siamashka
e83cee5aac test: Extend scaling-test to support a8/solid mask and ADD operation
Image width also has been increased because SIMD optimizations typically
do more unrolling in the inner loops, and this needs to be tested.
2011-02-15 14:32:01 +02:00
Siarhei Siamashka
97447f440f Use const modifiers for source buffers in nearest scaling fast paths 2011-02-15 14:29:54 +02:00
Siarhei Siamashka
8d359b00c5 C fast paths for a simple 90/270 degrees rotation
Depending on CPU architecture, performance is 1.5 to 4 times slower
than a simple nonrotated copy (which would be the ideal case, perfectly
utilizing memory bandwidth), but still more than 7 times faster than
the general path.

This implementation sets a performance baseline for rotation. The use
of SIMD instructions may further improve memory bandwidth utilization.
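For reference, a naive 90 degree clockwise rotation of a 32 bpp image can be sketched as below. This is only an illustration of the operation; the function name is hypothetical and the loop makes no attempt at the cache-friendly access pattern a real fast path would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rotate a src_width x src_height image clockwise by
 * 90 degrees. 'dst' must hold src_height x src_width pixels (its row
 * length is src_height). */
static void
example_rotate_90_clockwise (uint32_t       *dst,
                             const uint32_t *src,
                             int             src_width,
                             int             src_height)
{
    int x, y;

    assert (dst != NULL && src != NULL);
    assert (src_width > 0 && src_height > 0);

    for (y = 0; y < src_height; y++)
    {
        for (x = 0; x < src_width; x++)
        {
            /* Source column x becomes destination row x; source row y
             * lands in destination column (src_height - 1 - y). */
            dst[x * src_height + (src_height - 1 - y)] =
                src[y * src_width + x];
        }
    }
}
```

Either the reads or the writes have to stride across rows here, which is a plausible reason why rotation cannot reach the bandwidth of a straight copy.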
2011-02-10 16:18:01 +02:00
Siarhei Siamashka
e0c7948c97 New flags for 90/180/270 rotation
These flags are set when the transform is a simple nonscaled 90/180/270
degrees rotation.
2011-02-10 16:17:24 +02:00
Siarhei Siamashka
3b68c295fd test: affine-test updated to stress 90/180/270 degrees rotation more 2011-02-10 16:17:18 +02:00
Søren Sandmann Pedersen
56f173f0af Add pixman-conical-gradient.c to Makefile.win32.
Pointed out by Kirill Tishin.
2011-02-10 05:21:42 -05:00
Cyril Brulebois
fc1b85f258 Upload to unstable. 2011-02-06 05:31:27 +01:00
Cyril Brulebois
84bb9a7605 Mention upstream git URL in a comment. 2011-02-06 05:30:48 +01:00
Søren Sandmann Pedersen
7fd4897730 Add SSE2 fetcher for 0565
Before:

add_0565_0565 = L1:  61.08  L2:  61.03  M: 60.57 ( 10.95%)  HT: 46.85  VT: 45.25  R: 39.99  RT: 20.41 ( 233Kops/s)

After:

add_0565_0565 = L1:  77.84  L2:  76.25  M: 75.38 ( 13.71%)  HT: 55.99  VT: 54.56  R: 45.41  RT: 21.95 ( 255Kops/s)
2011-02-03 03:25:05 -05:00
Søren Sandmann Pedersen
8414aa76c2 Improve performance of sse2_combine_over_u()
Split this function into two, one that has a mask, and one that
doesn't. This is a fairly substantial speed-up in many cases.

New output of lowlevel-blt-bench over_x888_8_0565:

over_x888_8_0565 =  L1:  63.76  L2:  62.75  M: 59.37 ( 21.55%)  HT: 45.89  VT: 43.55  R: 34.51  RT: 16.80 ( 201Kops/s)
2011-02-03 03:25:05 -05:00
Søren Sandmann Pedersen
08e855f15c Add SSE2 fetcher for a8
New output of lowlevel-blt-bench over_x888_8_0565:

over_x888_8_0565 =  L1:  57.85  L2:  56.80  M: 54.14 ( 19.50%)  HT: 42.64  VT: 40.56  R: 32.67  RT: 16.22 ( 195Kops/s)

Based in part on code by Steve Snyder from

    https://bugs.freedesktop.org/show_bug.cgi?id=21173
2011-02-03 03:25:05 -05:00
Søren Sandmann Pedersen
2b6b0cf359 Add SSE2 fetcher for x8r8g8b8
New output of lowlevel-blt-bench over_x888_8_0565:

over_x888_8_0565 =  L1:  55.68  L2:  55.11  M: 52.83 ( 19.04%)  HT: 39.62  VT: 37.70  R: 30.88  RT: 14.62 ( 174Kops/s)

The fetcher is looked up in a table, so that other fetchers can easily
be added.

See also https://bugs.freedesktop.org/show_bug.cgi?id=20709
2011-02-03 03:24:47 -05:00
Søren Sandmann Pedersen
13aed37758 Add a test for over_x888_8_0565 in lowlevel_blt_bench().
The next few commits will speed this up quite a bit.

Current output:

---
reference memcpy speed = 2217.5MB/s (554.4MP/s for 32bpp fills)
---
over_x888_8_0565 =  L1:  54.67  L2:  54.01  M: 52.33 ( 18.88%)  HT: 37.19  VT: 35.54  R: 29.40  RT: 13.63 ( 162Kops/s)
2011-01-28 14:35:17 -05:00
Søren Sandmann Pedersen
2de397c272 Move fallback decisions from implementations into pixman-cpu.c.
Instead of having each individual implementation decide which fallback
to use, move it into pixman-cpu.c, where a more global decision can be
made.

This is accomplished by adding a "fallback" argument to all the
pixman_implementation_create_*() implementations, and then in
_pixman_choose_implementation() pass in the desired fallback.
2011-01-26 17:07:35 -05:00
Søren Sandmann Pedersen
ed781df1cc Print a warning when a development snapshot is being configured.
It seems to be relatively common for people to use development
snapshots of pixman thinking they are ordinary releases. This patch
makes it such that if the current minor version is odd, configure will
print a banner explaining the version number scheme plus information
about where to report bugs.
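The check itself lives in configure, but the rule is easy to state in C. The function name below is hypothetical:

```c
#include <assert.h>

/* Hypothetical helper: an odd minor version (for example the 21 in
 * 0.21.5) marks a development snapshot; an even one marks a release. */
static int
example_is_development_snapshot (int minor_version)
{
    assert (minor_version >= 0);

    return (minor_version % 2) == 1;
}
```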
2011-01-26 17:07:35 -05:00
Rolland Dudemaine
fead9eb82a Fix "variable was set but never used" warnings
Removes useless variable declarations. This can only result in more
efficient code, as these variables were sometimes assigned, but
their values were never used.
2011-01-26 15:05:24 +02:00
Rolland Dudemaine
32e556df33 test: Use the right enum types instead of int to fix warnings
Green Hills Software MULTI compiler was producing a number
of warnings due to incorrect uses of int instead of the correct
corresponding pixman_*_t type.
2011-01-26 15:05:18 +02:00
Rolland Dudemaine
b61ec0a686 Correct the initialization of 'max_vx'
http://lists.freedesktop.org/archives/pixman/2011-January/000937.html
2011-01-25 14:55:24 +02:00
Rolland Dudemaine
e8a1b1c4e5 test: Fix for mismatched 'fence_malloc' prototype/implementation
Solves compilation problem when 'mprotect' is not available. For
example, when using Green Hills Software MULTI compiler or mingw:
http://lists.freedesktop.org/archives/pixman/2011-January/000939.html
2011-01-25 14:34:56 +02:00
Siarhei Siamashka
a8e4677ecc The code in 'bitmap_addrect' already assumes non-null 'reg->data'
So the check of 'reg->data' pointer can be safely removed.
2011-01-20 02:14:07 +02:00
Cyril Brulebois
8aeb637bb5 Upload to experimental. 2011-01-19 20:31:42 +01:00
Cyril Brulebois
461dacfb5e Update debian/copyright from upstream's COPYING. 2011-01-19 20:25:41 +01:00
Cyril Brulebois
e581626827 Bump changelogs. 2011-01-19 20:24:49 +01:00
Cyril Brulebois
f5216c99bc Merge branch 'upstream-experimental' into debian-experimental 2011-01-19 20:23:47 +01:00
Søren Sandmann Pedersen
a6a04c07c3 Post-release version bump to 0.21.5 2011-01-19 07:47:52 -05:00
Søren Sandmann Pedersen
4e56cec564 Pre-release version bump to 0.21.4 2011-01-19 07:38:24 -05:00
Søren Sandmann Pedersen
1d7195dd6c Fix dangling-pointer bug in bits_image_fetch_bilinear_no_repeat_8888().
The mask_bits variable is only declared in a limited scope, so the
pointer to it becomes invalid instantly. Somehow this didn't actually
trigger any bugs, but Brent Fulgham reported that Bounds Checker was
complaining about it.

Fix the bug by moving mask_bits to the function scope.
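The pattern can be sketched as follows. The function is a hypothetical stand-in, not the fetcher from the patch; it only shows why the substitute mask value has to live at function scope.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the pixels whose mask entry is non-zero.
 * A NULL mask means that every pixel is enabled. */
static int
example_count_unmasked (const uint32_t *mask, int width)
{
    uint32_t mask_bits = 1;   /* function scope: valid for the whole loop */
    int      mask_inc = 1;
    int      i, count = 0;

    assert (width >= 0);

    if (!mask)
    {
        /* The buggy variant would declare mask_bits inside these braces,
         * leaving 'mask' dangling as soon as the block ends. */
        mask = &mask_bits;
        mask_inc = 0;
    }

    for (i = 0; i < width; i++)
    {
        if (*mask)
            count++;
        mask += mask_inc;
    }

    return count;
}
```

A block-scoped variable often keeps working by accident because the compiler tends to reuse the same stack slot, which would explain why the bug went unnoticed until a bounds checker flagged it.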
2011-01-19 07:22:42 -05:00
Andrea Canciani
2ac4ae1ae2 Add a test for radial gradients
radial-test is a port of the radial-gradient test from the cairo test
suite. It has been modified so that some pixels have 0 in both the a
and b coefficients of the quadratic equation solved by the rasterizer,
to expose a division by zero in the original implementation.
2011-01-19 13:17:03 +01:00
Søren Sandmann Pedersen
7f4eabbeec Fix destination fetching
When fetching from destinations, we need to ignore transformations,
repeat and filtering. Currently we don't ignore them, which means all
kinds of bad things can happen.

This patch fixes this problem by directly calling the scanline fetchers
for destinations instead of going through the full
get_scanline_32/64().
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
9489c2e04a Turn on testing for destination transformation 2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
fffeda703e Skip fetching pixels when possible
Add two new iterator flags, ITER_IGNORE_ALPHA and ITER_IGNORE_RGB that
are set when the alpha and rgb values are not needed. If both are set,
then we can skip fetching entirely and just use
_pixman_iter_get_scanline_noop.
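The decision can be sketched as a simple flag test. The flag names come from the commit message, but the numeric values and the helper function are assumptions for this example:

```c
#include <assert.h>

/* Flag names as in the commit message; the bit values are assumed. */
enum
{
    ITER_IGNORE_ALPHA = 1 << 0,
    ITER_IGNORE_RGB   = 1 << 1
};

/* Hypothetical helper: fetching can be skipped only when neither the
 * alpha nor the RGB values of the fetched pixels will be used. */
static int
example_can_skip_fetch (unsigned flags)
{
    const unsigned both = ITER_IGNORE_ALPHA | ITER_IGNORE_RGB;

    /* This sketch only knows about the two flags above. */
    assert ((flags & ~both) == 0);

    return (flags & both) == both;
}
```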
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
3e635d6491 Add direct-write optimization back
Introduce a new ITER_LOCALIZED_ALPHA flag that indicates that the
alpha value computed is used only for the alpha channel of the output;
it doesn't affect the RGB channels.

Then in pixman-bits-image.c, if a destination is either a8r8g8b8 or
x8r8g8b8 with localized alpha, the iterator will return a pointer
directly into the image.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
0f1a5c4a27 Get rid of the classify methods
They are not used anymore, and the linear gradient is now doing the
optimization in a different way.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
b66cabb884 Linear: Optimize for horizontal gradients
If the gradient is horizontal, we can reuse the same scanline over and
over. Add support for this optimization to
_pixman_linear_gradient_iter_init().
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
cf14189c69 Consolidate the various get_scanline_32() into get_scanline_narrow()
The separate get_scanline_32() functions in solid, linear, radial and
conical images are no longer necessary because all access to these
images now goes through iterators.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
0a6360a7ee Allow NULL property_changed function
Initialize the field to NULL, and then delete the empty functions from
the solid, linear, radial, and conical images.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
34b5633105 Move get_scanline_32/64 to the bits part of the image struct
At this point these functions are basically a cache that the bits
image uses for its fetchers, so they can be moved to the bits image.

With the scanline getters only being initialized in the bits image,
the _pixman_image_get_scanline_generic_64 can be moved to
pixman-bits-image.c. That gets rid of the final user of
_pixman_image_get_scanline_32/64, so these can be deleted.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
d6b13f99b4 Use an iterator in pixman_image_get_solid()
This is a step towards getting rid of the
_pixman_image_get_scanline_32/64() functions.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
51a5e949f3 Virtualize iterator initialization
Make src_iter_init() and dest_iter_init() virtual methods in the
implementation struct. This allows individual implementations to plug
in their own CPU specific scanline fetchers.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
6503c6edcc Move iterator initialization to the respective image files
Instead of calling _pixman_image_get_scanline_32/64(), move the
iterator initialization into the respective image implementations and
call the scanline generators directly.
2011-01-18 12:42:26 -05:00
Søren Sandmann Pedersen
23c6e1d2c0 Eliminate the _pixman_image_store_scanline_32/64 functions
They were only called from next_line_write_narrow/wide, so they could
simply be absorbed into those functions.
2011-01-18 12:42:25 -05:00
Søren Sandmann Pedersen
b2c9eaa502 Move initialization of iterators for bits images to pixman-bits-image.c
pixman_iter_t is now defined in pixman-private.h, and iterators for
bits images are being initialized in pixman-bits-image.c
2011-01-18 12:42:25 -05:00
Søren Sandmann Pedersen
15b1645c7b Add iterators in the general implementation
We add a new structure called a pixman_iter_t that encapsulates the
information required to read scanlines from an image. It contains two
functions, get_scanline() and write_back(). The get_scanline()
function will generate pixels for the current scanline. For iterators
for source images, it will also advance to the next scanline. The
write_back() function is only called for destination images. Its
function is to write back the modified pixels to the image and then
advance to the next scanline.

When an iterator is initialized, it is passed this information:

   - The image to iterate

   - The rectangle to be iterated

   - A buffer that the iterator may (but is not required to) use. This
     buffer is guaranteed to have space for at least width pixels.

   - A flag indicating whether a8r8g8b8 or a16r16g16b16 pixels should
     be fetched

There are a number of (eventual) benefits to the iterators:

   - The initialization of the iterator can be virtualized such that
     implementations can plug in their own CPU specific get_scanline()
     and write_back() functions.

   - If an image is horizontal, it can simply plug in an appropriate
     get_scanline(). This way we can get rid of the annoying
     classify() virtual function.

   - In general, iterators can remember what they did on the last
     scanline, so for example a REPEAT_NONE image might reuse the same
     data for all the empty scanlines generated by the zero-extension.

   - More detailed information can be passed to the iterator, allowing
     more specialized fetchers to be used.

   - We can fix the bug where destination filters and transformations
     are not currently being ignored as they should be.

However, this initial implementation is not optimized at all. We lose
several existing optimizations:

   - The ability to composite directly in the destination
   - The ability to only fetch one scanline for horizontal images
   - The ability to avoid fetching the src and mask for the CLEAR
     operator

Later patches will re-introduce these optimizations.
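
A minimal sketch of the interface described above, for illustration only.
The names (sketch_iter_t, sketch_iter_init() and friends) are hypothetical
and the struct is much simpler than the real pixman_iter_t; it only assumes
a8r8g8b8 bits images and shows how get_scanline() and write_back() divide
the work between source and destination iterators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified iterator; NOT pixman's real pixman_iter_t. */
typedef struct sketch_iter sketch_iter_t;

struct sketch_iter
{
    uint32_t *bits;     /* first pixel of the image (a8r8g8b8) */
    int       stride;   /* pixels per image row */
    int       x, y;     /* current position in the image */
    int       width;    /* width of the iterated rectangle */
    uint32_t *buffer;   /* caller-provided scratch, >= width pixels */

    uint32_t *(*get_scanline) (sketch_iter_t *iter);
    void      (*write_back)   (sketch_iter_t *iter);
};

static uint32_t *
sketch_row (sketch_iter_t *iter)
{
    return iter->bits + (size_t) iter->y * (size_t) iter->stride + iter->x;
}

/* Copy the current row into the scratch buffer. */
static uint32_t *
sketch_fetch (sketch_iter_t *iter)
{
    const uint32_t *row = sketch_row (iter);
    int i;

    for (i = 0; i < iter->width; ++i)
        iter->buffer[i] = row[i];

    return iter->buffer;
}

/* Source images: generate the pixels, then advance to the next scanline. */
static uint32_t *
sketch_src_get_scanline (sketch_iter_t *iter)
{
    uint32_t *pixels = sketch_fetch (iter);

    iter->y++;
    return pixels;
}

/* Destination images: generate the pixels but stay on this scanline. */
static uint32_t *
sketch_dest_get_scanline (sketch_iter_t *iter)
{
    return sketch_fetch (iter);
}

/* Destination images: store the modified pixels, then advance. */
static void
sketch_dest_write_back (sketch_iter_t *iter)
{
    uint32_t *row = sketch_row (iter);
    int i;

    for (i = 0; i < iter->width; ++i)
        row[i] = iter->buffer[i];

    iter->y++;
}

void
sketch_iter_init (sketch_iter_t *iter, uint32_t *bits, int stride,
                  int x, int y, int width, uint32_t *buffer, int is_dest)
{
    assert (width > 0 && buffer != NULL);

    iter->bits = bits;
    iter->stride = stride;
    iter->x = x;
    iter->y = y;
    iter->width = width;
    iter->buffer = buffer;
    iter->get_scanline = is_dest ? sketch_dest_get_scanline
                                 : sketch_src_get_scanline;
    iter->write_back = is_dest ? sketch_dest_write_back : NULL;
}
```

A compositing loop would call get_scanline() on the source and destination
iterators once per row, combine the two buffers, and then call write_back()
on the destination iterator. Because the function pointers are filled in at
initialization time, a different fetcher can be plugged in without touching
that loop.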
2011-01-18 12:42:25 -05:00
Siarhei Siamashka
255d624e50 ARM: do /proc/self/auxv based cpu features detection only in linux
This method is linux specific, but earlier it was tried for any platform
that did not have _MSC_VER macro defined.
2011-01-16 23:40:38 +02:00
Siarhei Siamashka
2bbd553bd2 A new configure option --enable-static-testprogs
This option can be used for building fully static binaries of the test
programs so that they can be easily run using qemu-user. With binfmt-misc
configured, 'make check' works fine for crosscompiled pixman builds.
2011-01-16 23:40:34 +02:00
Siarhei Siamashka
55bbccf84e Make 'fast_composite_scaled_nearest_*' less suspicious
Taking address of a variable and then using it as an array looks suspicious
to static code analyzers. So change it into an array with 1 element to make
them happy. Both old and new variants of this code are correct because 'vx'
and 'unit_x' arguments are set to 0 and it means that the called scanline
function can only access a single element of 'zero' buffer.
2011-01-16 22:32:33 +02:00
Siarhei Siamashka
ae70b38d40 Bugfix for a corner case in 'pixman_transform_is_inverse'
When 'pixman_transform_multiply' fails, the result of multiplication just
could not have been identity matrix (one of the values in the resulting
matrix can't be represented as 16.16 fixed point value). So it is safe
to return FALSE.
2011-01-16 22:32:02 +02:00
Siarhei Siamashka
ab3809f4da Workaround for a preprocessor issue in old Sun Studio
Patch from Peter O'Gorman with some modifications

https://bugs.freedesktop.org//show_bug.cgi?id=32764
2011-01-16 20:48:39 +02:00
Siarhei Siamashka
f5c0a60ac8 Fix for "syntax error: empty declaration" Solaris Studio warnings 2011-01-16 20:48:13 +02:00
Siarhei Siamashka
c71e24c9fc Revert "Fix "syntax error: empty declaration" warnings."
This reverts commit b924bb1f81.

There is a better fix for these Solaris Studio warnings.
2011-01-16 20:47:56 +02:00
Andrea Canciani
29439bd772 Improve handling of tangent circles
When b is 0, avoid the division by zero and just return transparent
black.

When the solution t would have an invalid radius (negative or outside
[0,1] for non-extended gradients), return transparent black.
2011-01-12 22:04:33 +01:00
Søren Sandmann Pedersen
a484a9c49c sse2: Skip src pixels that are zero in sse2_composite_over_8888_n_8888()
This is a big speed-up in the SVG helicopter game:

   http://ie.microsoft.com/testdrive/Performance/Helicopter/Default.xhtml

when rendered by Firefox 4 since it is compositing big images
consisting almost entirely of zeros.
2010-12-20 19:37:11 -05:00
Søren Sandmann Pedersen
2610323545 Fix divide-by-zero in set_lum().
When (l - min) or (max - l) are zero, simply set all the channels to
the limit, 0 in the case of (l - min), and a in the case of (max - l).
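
The guard can be sketched as follows. This is a floating point stand-in
with a hypothetical name (sketch_clip_color()); it assumes the usual PDF
luminosity weights and is not the integer code in pixman, but it shows
where the two zero denominators are replaced by the limit value.

```c
#include <assert.h>

/* Hypothetical floating point stand-in for the clipping step of
 * set_lum(); pixman's own code works on integer channels. */
void
sketch_clip_color (double c[3], double a)
{
    double l, min, max;
    int i;

    assert (a >= 0.0);

    /* Luminosity and channel extremes of the unclipped color. */
    l = 0.3 * c[0] + 0.59 * c[1] + 0.11 * c[2];
    min = max = c[0];
    for (i = 1; i < 3; ++i)
    {
        if (c[i] < min)
            min = c[i];
        if (c[i] > max)
            max = c[i];
    }

    if (min < 0.0)
    {
        for (i = 0; i < 3; ++i)
        {
            if (l - min == 0.0)
                c[i] = 0.0;     /* use the limit instead of dividing by 0 */
            else
                c[i] = l + (c[i] - l) * l / (l - min);
        }
    }

    if (max > a)
    {
        for (i = 0; i < 3; ++i)
        {
            if (max - l == 0.0)
                c[i] = a;       /* use the limit instead of dividing by 0 */
            else
                c[i] = l + (c[i] - l) * (a - l) / (max - l);
        }
    }
}
```

The denominators can only be zero when all three channels are equal, which
is why setting every channel to the same limit is a reasonable result.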
2010-12-20 19:37:11 -05:00
Søren Sandmann Pedersen
3479050216 Add a test compositing with the various PDF operators.
The test has floating point exceptions enabled, and currently fails
with a divide-by-zero.
2010-12-20 19:37:11 -05:00
Cyril Brulebois
45a2d01077 Fix linking issues when HAVE_FEENABLEEXCEPT is set.
All objects using test/utils.c fail to link:
|   CCLD   region-test
| /usr/bin/ld: utils.o: in function enable_fp_exceptions:utils.c(.text+0x939): error: undefined reference to 'feenableexcept'

There's indeed no explicit dependency on -lm, and if HAVE_FEENABLEEXCEPT
happens to be set, test/utils.c uses feenableexcept(), which is nowhere
to be found while linking.

Fix this by adding -lm to TEST_LDADD, although two alternatives could be
thought of:
 - Only specifying -lm for objects using utils.c.
 - Introducing a conditional to add -lm only when configure detects
   have_feenableexcept=yes.

Signed-off-by: Cyril Brulebois <kibi@debian.org>
2010-12-20 09:55:07 -05:00
Jon TURNEY
303de045ff Remove stray #include <fenv.h>
Remove a stray #include <fenv.h> added in commit 2444b2265a
to fix compilation on platforms which don't have fenv.h

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2010-12-18 16:29:12 -05:00
Søren Sandmann Pedersen
f914cf4486 Add a stress-test program.
This test program tries to use as many rarely-used features as
possible, including alpha maps, accessor functions, oddly-sized
images, strange transformations, conical gradients, etc.

The hope is to provoke crashes or irregular behavior in pixman.
2010-12-17 17:03:29 -05:00
Søren Sandmann Pedersen
7d7b03c091 Make the argument to fence_malloc() an int64_t
That way we can detect if someone attempts to allocate a negative size
and abort instead of just returning NULL and segfaulting later.
2010-12-17 17:01:52 -05:00
Søren Sandmann Pedersen
d41522113e test/utils.c: Initialize palette->rgba to 0.
That way it can be used with palettes that are not statically
allocated, without causing valgrind issues.
2010-12-17 16:57:53 -05:00
Søren Sandmann Pedersen
337f0bff0d test: Move palette initialization to utils.[ch] 2010-12-17 16:57:53 -05:00
Søren Sandmann Pedersen
2444b2265a Extend gradient-crash-test
Test the gradients with various transformations, and test cases where
the gradients are specified with two identical points.
2010-12-17 16:57:38 -05:00
Søren Sandmann Pedersen
de2e51dacb Add enable_fp_exceptions() function in utils.[ch]
This function enables floating point traps if possible.
2010-12-17 16:57:18 -05:00
Søren Sandmann Pedersen
a2afcc9ba4 test: Make composite test use some existing macros instead of defining its own
Also move the ARRAY_LENGTH macro into utils.h so it can be used elsewhere.
2010-12-17 16:57:18 -05:00
Siarhei Siamashka
4d8d2fa47e COPYING: added Nokia to the list of copyright holders 2010-12-17 15:34:16 +02:00
Siarhei Siamashka
3d094997b1 Fix for potential unaligned memory accesses
The temporary scanline buffer allocated on stack was declared
as uint8_t array. As a result, the compiler was free to select
any arbitrary alignment for it (even though there is typically
no reason to use really weird alignments here and the stack is
normally at least 4 bytes aligned on most platforms). Having
improper alignment is non-portable and can impact performance
or even make the code misbehave depending on the target platform.

Using uint64_t type for this array should ensure that any possible
memory accesses done by pixman code are going to be handled correctly
(pixman-combine64.c can access this buffer via uint64_t * pointer).
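
A minimal sketch of the idea, with a hypothetical function name and buffer
size: declaring the storage as uint64_t and taking a byte view of it gives
the same memory as a uint8_t array would, but with alignment suitable for
64-bit accesses. _Alignof is C11; older compilers would need their own
equivalent.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BUFFER_BYTES 256   /* hypothetical size, for illustration */

uint64_t
sketch_use_scanline_buffer (void)
{
    /* Declared as uint64_t so the storage is aligned for the widest
     * access that will be made through it. */
    uint64_t storage[(SKETCH_BUFFER_BYTES + 7) / 8];
    uint8_t *bytes = (uint8_t *) storage;   /* byte view of the buffer */
    uint64_t *wide = (uint64_t *) bytes;    /* 64-bit view, same memory */

    assert ((uintptr_t) bytes % _Alignof (uint64_t) == 0);

    wide[0] = UINT64_C (0x0123456789abcdef);
    return wide[0];
}
```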

An alignment-related problem was reported in:
http://lists.freedesktop.org/archives/pixman/2010-November/000747.html
2010-12-07 02:10:51 +02:00
Siarhei Siamashka
985e59a82f ARM: added 'neon_src_rpixbuf_8888' fast path
With this optimization added, pixman assisted conversion from
non-premultiplied to premultiplied alpha format is now fully
NEON optimized (both with and without R/B color components
swapping in the process).
2010-12-07 02:10:35 +02:00
Siarhei Siamashka
733f68912f ARM: added 'neon_composite_in_n_8' fast path 2010-12-03 15:38:04 +02:00
Siarhei Siamashka
af7a69d90e ARM: added flags parameter to some asm fast path wrapper macros
Not all types of operations can be skipped when having transparent
solid source or transparent solid mask. Add an extra flags parameter
for providing this information to the wrappers.
2010-12-03 15:38:00 +02:00
Siarhei Siamashka
f6843e3797 ARM: added 'neon_composite_add_8888_n_8888' fast path 2010-12-03 15:37:54 +02:00
Siarhei Siamashka
b066b520df ARM: added 'neon_composite_add_n_8_8888' fast path 2010-12-03 15:37:49 +02:00
Siarhei Siamashka
1fba779036 ARM: better NEON instructions scheduling for add_8888_8888_8888
Provides a minor performance improvement by using pipelining and hiding
instructions latencies. Also do not clobber d0-d3 registers (source
image pixels) while doing calculations in order to allow the use of
the same macro for add_n_8_8888 fast path later.

Benchmark from ARM Cortex-A8 @500MHz:

== before ==

  add_8888_8888_8888 = L1:  95.94  L2:  42.27  M: 25.60 (121.09%)
                       HT:  14.54  VT:  13.13  R: 12.77  RT:  4.49 (48Kops/s)
     add_8888_8_8888 = L1: 104.51  L2:  57.81  M: 36.06 (106.62%)
                       HT:  19.24  VT:  16.45  R: 14.71  RT:  4.80 (51Kops/s)

== after ==

  add_8888_8888_8888 = L1: 106.66  L2:  47.82  M: 27.32 (129.30%)
                       HT:  15.44  VT:  13.96  R: 12.86  RT:  4.48 (48Kops/s)
     add_8888_8_8888 = L1: 107.72  L2:  61.02  M: 38.26 (113.16%)
                       HT:  19.48  VT:  16.72  R: 14.82  RT:  4.80 (51Kops/s)
2010-12-03 15:37:44 +02:00
Siarhei Siamashka
c3f48b6aa2 ARM: added 'neon_composite_add_8888_8_8888' fast path 2010-12-03 15:37:40 +02:00
Siarhei Siamashka
6d2f7f981b ARM: added 'neon_composite_over_0565_n_0565' fast path 2010-12-03 15:37:23 +02:00
Siarhei Siamashka
3990931bf6 ARM: reuse common NEON code for over_{n_8|8888_n|8888_8}_0565
Renamed supplementary macros from 'over_n_8_0565' to 'over_8888_8_0565',
because they can actually support all variants of this operation:
over_8888_8_0565/over_n_8_0565/over_8888_n_0565.

Also 'over_8888_8_0565' now uses more optimized common code instead of its
own variant, improving performance a bit. Even though this operation is
still memory bandwidth limited, scaled variants of these fast paths may
put more stress on CPU later.

Benchmarked on ARM Cortex-A8 @500MHz:

== before ==

    over_8888_8_0565 =  L1:  67.10  L2:  53.82  M: 44.70 (105.17%)
                        HT:  18.73  VT:  16.91  R: 14.25  RT:  4.80 (52Kops/s)

== after ==

    over_8888_8_0565 =  L1:  77.83  L2:  58.14  M: 44.82 (105.52%)
                        HT:  20.58  VT:  17.44  R: 15.05  RT:  4.88 (52Kops/s)
2010-12-03 15:37:19 +02:00
Siarhei Siamashka
a7c36681c0 ARM: added 'neon_composite_over_8888_n_0565' fast path 2010-12-03 15:37:15 +02:00
Siarhei Siamashka
e6814837a6 ARM: better NEON instructions scheduling for over_n_8_0565
Code rearranged to get better instructions scheduling for ARM Cortex-A8/A9.
Now it is ~30% faster for the pixel data in L1 cache and makes better use
of memory bandwidth when running at lower clock frequencies (ex. 500MHz).
Also register d24 (pixels from the mask image) is now not clobbered by
supplementary macros, which allows reusing them for the other variants
of compositing operations later.

Benchmark from ARM Cortex-A8 @500MHz:

== before ==

    over_n_8_0565 =  L1:  63.90  L2:  63.15  M: 60.97 ( 73.53%)
                     HT:  28.89  VT:  24.14  R: 21.33  RT:  6.78 (  67Kops/s)

== after ==

    over_n_8_0565 =  L1:  82.64  L2:  75.19  M: 71.52 ( 84.14%)
                     HT:  30.49  VT:  25.56  R: 22.36  RT:  6.89 (  68Kops/s)
2010-12-03 15:37:11 +02:00
Siarhei Siamashka
3be86a92cc ARM: introduced 'fetch_mask_pixblock' macro to simplify code
This macro hides the implementation details of pixels fetching
for the mask image just like 'fetch_src_pixblock' does for the
source image. This provides more possibilities for reusing the
same code blocks in different compositing functions.

This patch does not introduce any functional changes and the
resulting code in the compiled object file is exactly the same.
2010-12-03 15:37:06 +02:00
Siarhei Siamashka
98d08b37f1 ARM: added 'neon_composite_over_n_8_8' fast path 2010-12-03 15:37:01 +02:00
Siarhei Siamashka
4b5b5a2a83 C fast path for a1 fill operation
Can be used as one of the solutions to fix bug
https://bugs.freedesktop.org/show_bug.cgi?id=31604
2010-11-23 00:54:19 +02:00
Alan Coopersmith
654961efe4 Sun's copyrights belong to Oracle now
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2010-11-21 11:42:22 -08:00
Cyril Brulebois
e7ee43c39d Fix argument quoting for AC_INIT.
One gets rid of this accordingly:
| autoreconf -vfi
| autoreconf: Entering directory `.'
| autoreconf: configure.ac: not using Gettext
| autoreconf: running: aclocal --force
| configure.ac:61: warning: AC_INIT: not a literal: "pixman@lists.freedesktop.org"
| autoreconf: configure.ac: tracing
| configure.ac:61: warning: AC_INIT: not a literal: "pixman@lists.freedesktop.org"

Signed-off-by: Cyril Brulebois <kibi@debian.org>
2010-11-19 13:57:47 -05:00
Cyril Brulebois
149ed6b1f0 Upload to experimental. 2010-11-17 15:56:52 +01:00
Cyril Brulebois
865e06cab0 Update debian/copyright from upstream's COPYING. 2010-11-17 15:28:15 +01:00
Cyril Brulebois
868ed1e2a0 Update changelogs. 2010-11-17 15:27:13 +01:00
Cyril Brulebois
bed147b523 Merge branch 'upstream-experimental' into debian-experimental 2010-11-17 15:25:39 +01:00
Søren Sandmann Pedersen
c59db8af66 Post-release version bump to 0.21.3 2010-11-16 17:14:47 -05:00
Søren Sandmann Pedersen
4646c23858 Pre-release version bump 2010-11-16 16:43:26 -05:00
Søren Sandmann Pedersen
536cf4dd3b Generate {a,x}8r8g8b8, a8, 565 fetchers for nearest/affine images
There are versions for all combinations of x8r8g8b8/a8r8g8b8 and
pad/repeat/none/normal repeat modes. The bulk of each function is an
inline function that takes a format and a repeat mode as parameters.
2010-11-16 16:41:42 -05:00
Andrea Canciani
da0176e853 Improve conical gradients opacity check
Conical gradients are completely opaque if all of their stops are
opaque and the repeat mode is not 'none'.
2010-11-12 17:13:30 +01:00
Andrea Canciani
151f2554fc Fix opacity check
Radial gradients are "conical", thus they can have some non-opaque
parts even if all of their stops are completely opaque.

To guarantee that a radial gradient is actually opaque, it needs to
also have one of the two circles containing the other one. In this
case when extrapolating, the whole plane is completely covered (as
explained in the comment in pixman-radial-gradient.c).
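
The containment condition can be sketched like this. The function name is
hypothetical and the check is written in doubles for readability; it is not
necessarily how pixman expresses it. One circle strictly contains the other
exactly when the distance between the centres is smaller than the
difference of the radii.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 when one of the two circles strictly
 * contains the other, 0 otherwise. Comparing squared values avoids a
 * square root. */
int
sketch_radial_covers_plane (double x1, double y1, double r1,
                            double x2, double y2, double r2)
{
    double dx = x2 - x1;
    double dy = y2 - y1;
    double dr = r2 - r1;

    assert (r1 >= 0.0 && r2 >= 0.0);

    return dx * dx + dy * dy < dr * dr;
}
```

A gradient would only be reported as opaque when this holds in addition to
all of its stops being opaque. Internally tangent circles fail the strict
comparison, since their extrapolation does not cover the whole plane.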
2010-11-12 17:13:30 +01:00
Andrea Canciani
19ed415b74 Remove unused stop_range field 2010-11-12 17:13:30 +01:00
Siarhei Siamashka
d8fe87a626 ARM: optimization for scaled src_0565_0565 with nearest filter
The performance improvement is only in the ballpark of 5% when
compared against C code built with a reasonably good compiler
(gcc 4.5.1). But gcc 4.4 produces approximately 30% slower code
here, so assembly optimization makes sense to avoid dependency
on the compiler quality and/or optimization options.

Benchmark from ARM11:
    == before ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=34.86 MPix/s

    == after ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=36.62 MPix/s

Benchmark from ARM Cortex-A8:
    == before ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=89.55 MPix/s

    == after ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=94.91 MPix/s
2010-11-10 17:26:49 +02:00
Siarhei Siamashka
b8007d0423 ARM: NEON optimization for scaled src_0565_8888 with nearest filter
Benchmark from ARM Cortex-A8 @720MHz:
    == before ==
    op=1, src_fmt=10020565, dst_fmt=20028888, speed=8.99 MPix/s

    == after ==
    op=1, src_fmt=10020565, dst_fmt=20028888, speed=76.98 MPix/s

    == unscaled ==
    op=1, src_fmt=10020565, dst_fmt=20028888, speed=137.78 MPix/s
2010-11-10 17:26:42 +02:00
Siarhei Siamashka
2e855a2b4a ARM: NEON optimization for scaled src_8888_0565 with nearest filter
Benchmark from ARM Cortex-A8 @720MHz:
    == before ==
    op=1, src_fmt=20028888, dst_fmt=10020565, speed=42.51 MPix/s

    == after ==
    op=1, src_fmt=20028888, dst_fmt=10020565, speed=55.61 MPix/s

    == unscaled ==
    op=1, src_fmt=20028888, dst_fmt=10020565, speed=117.99 MPix/s
2010-11-10 17:26:28 +02:00
Siarhei Siamashka
4a09e472b8 ARM: NEON optimization for scaled over_8888_0565 with nearest filter
Benchmark from ARM Cortex-A8 @720MHz:
    == before ==
    op=3, src_fmt=20028888, dst_fmt=10020565, speed=10.29 MPix/s

    == after ==
    op=3, src_fmt=20028888, dst_fmt=10020565, speed=36.36 MPix/s

    == unscaled ==
    op=3, src_fmt=20028888, dst_fmt=10020565, speed=79.40 MPix/s
2010-11-10 17:26:23 +02:00
Siarhei Siamashka
67a4991f33 ARM: NEON optimization for scaled over_8888_8888 with nearest filter
Benchmark from ARM Cortex-A8 @720MHz:
    == before ==
    op=3, src_fmt=20028888, dst_fmt=20028888, speed=12.73 MPix/s

    == after ==
    op=3, src_fmt=20028888, dst_fmt=20028888, speed=28.75 MPix/s

    == unscaled ==
    op=3, src_fmt=20028888, dst_fmt=20028888, speed=53.03 MPix/s
2010-11-10 17:26:17 +02:00
Siarhei Siamashka
0b56244ac8 ARM: performance tuning of NEON nearest scaled pixel fetcher
Interleaving the use of NEON registers helps to avoid some stalls
in NEON pipeline and provides a small performance improvement.
2010-11-10 17:26:10 +02:00
Siarhei Siamashka
6e76af0d4b ARM: macro template in C code to simplify using scaled fast paths
This template can be used to instantiate scaled fast path functions
by providing main loop code and calling NEON assembly optimized
scanline processing functions from it. Another macro can be used
to simplify adding entries to fast path tables.
2010-11-10 17:25:56 +02:00
Siarhei Siamashka
88014a0e6f ARM: nearest scaling support for NEON scanline compositing functions
Now it is possible to generate scanline processing functions
for the case when the source image is scaled with NEAREST filter.

Only 16bpp and 32bpp pixel formats are supported for now. But the
others can also be added later when needed. All the existing NEON
fast path functions should be quite easy to reuse for implementing
fast paths which can work with scaled source images.
2010-11-10 17:25:39 +02:00
Siarhei Siamashka
324712e48c ARM: NEON: source image pixel fetcher can be overridden now
Added a special macro 'pixld_src' which is now responsible for fetching
pixels from the source image. Right now it just passes all its arguments
directly to 'pixld' macro, but it can be used in the future to provide
a special pixel fetcher for implementing nearest scaling.

The 'pixld_src' has a lot of arguments which define its behavior. But
for each particular fast path implementation, we already know NEON
registers allocation and how many pixels are processed in a single block.
That's why a higher level macro 'fetch_src_pixblock' is also introduced
(it's easier to use because it has no arguments) and used everywhere
in 'pixman-arm-neon-asm.S' instead of VLD instructions.

This patch does not introduce any functional changes and the resulting code
in the compiled object file is exactly the same.
2010-11-10 17:25:33 +02:00
Siarhei Siamashka
cb3f183025 ARM: fix 'vld1.8'->'vld1.32' typo in add_8888_8888 NEON fast path
This was mostly harmless and had no effect on little endian systems.
But wrong vector element size is at least inconsistent and also
can theoretically cause problems on big endian ARM systems.
2010-11-10 17:25:26 +02:00
Cyril Brulebois
85950507f1 Upload to experimental. 2010-11-06 10:01:02 +01:00
Cyril Brulebois
23b9668233 Update changelogs. 2010-11-06 09:58:54 +01:00
Cyril Brulebois
7374af53e1 Merge commit 'pixman-0.20.0' into debian-experimental 2010-11-06 09:58:20 +01:00
Siarhei Siamashka
fed4a2fde5 Do CPU features detection from 'constructor' function when compiled with gcc
There is attribute 'constructor' supported since gcc 2.7 which allows
to have a constructor function for library initialization. This eliminates
an extra branch for each composite operation and also helps to avoid
complaints from race condition detection tools like helgrind.

The other compilers may or may not support this attribute properly.
Ideally, the compilers should fail to compile the code with unknown
attribute, so the configure check should do the right job. But in
reality the problems are surely possible. Fortunately such problems
should be quite easy to find because NULL pointer dereference should
happen almost immediately if the constructor fails to run.

clang 2.7:
  supports __attribute__((constructor)) properly and pretends to be gcc

tcc 0.9.25:
  ignores __attribute__((constructor)), but does not pretend to be gcc
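
A minimal sketch of the pattern, with hypothetical names throughout. It
keys the attribute on __GNUC__ for brevity, which is an assumption made for
this example; as described above, the actual decision belongs to a
configure check. Compilers without the attribute fall back to a lazy check,
which is the extra branch the constructor removes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical global that the detection code fills in. */
static const char *sketch_implementation = NULL;

static void
sketch_detect_cpu_features (void)
{
    /* Real code would probe the CPU here. */
    sketch_implementation = "general";
}

#if defined(__GNUC__)
#define SKETCH_HAVE_CONSTRUCTOR 1

/* Runs once when the library is loaded, before any other call. */
__attribute__((constructor))
static void
sketch_constructor (void)
{
    sketch_detect_cpu_features ();
}
#endif

const char *
sketch_get_implementation (void)
{
#ifndef SKETCH_HAVE_CONSTRUCTOR
    /* Fallback: lazy initialization costs a branch on every call. */
    if (!sketch_implementation)
        sketch_detect_cpu_features ();
#endif

    /* If the constructor silently failed to run, fail loudly here. */
    assert (sketch_implementation != NULL);
    return sketch_implementation;
}
```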
2010-11-05 16:02:28 +02:00
Søren Sandmann Pedersen
99699771cd Delete the source_image_t struct.
It serves no purpose anymore now that the source_class_t field is gone.
2010-11-04 21:03:38 -04:00
Søren Sandmann Pedersen
f405b40798 [mmx] Mark some of the output variables as early-clobber.
GCC assumes that input variables in inline assembly are fully consumed
before any output variable is written. This means it may allocate the
variables in the same register unless the output variables are marked
as early-clobber.

From Jeremy Huddleston:

    I noticed a problem building pixman with clang and reported it to
    the clang developers.  They responded back with a comment about
    the inline asm in pixman-mmx.c and suggested a fix:

    """
    Incidentally, Jeremy, in the asm that reads
    __asm__ (
    "movq %7, %0\n"
    "movq %7, %1\n"
    "movq %7, %2\n"
    "movq %7, %3\n"
    "movq %7, %4\n"
    "movq %7, %5\n"
    "movq %7, %6\n"
    : "=y" (v1), "=y" (v2), "=y" (v3),
      "=y" (v4), "=y" (v5), "=y" (v6), "=y" (v7)
    : "y" (vfill));

    all the output operands except the last one should be marked as
    earlyclobber ("=&y"). This is working by accident with gcc.
    """

Cc: jeremyhu@apple.com
Reviewed-by: Matt Turner <mattst88@gmail.com>
2010-11-04 21:03:38 -04:00
Søren Sandmann Pedersen
9c19a85b00 Remove workaround for a bug in the 1.6 X server.
There used to be a bug in the X server where it would rely on
out-of-bounds accesses when it was asked to composite with a
window as the source. It would create a pixman image pointing
to some bogus position in memory, but then set a clip region
to the position where the actual bits were.

Due to a bug in old versions of pixman, where it would not clip
against the image bounds when a clip region was set, this would
actually work. So when the pixman bug was fixed, a workaround was
added to allow certain out-of-bound accesses.

However, the 1.6 X server is so old now that we can remove this
workaround. This does mean that if you update pixman to 0.22 or later,
you will need to use a 1.7 X server or later.
2010-11-04 21:03:38 -04:00
Siarhei Siamashka
56748ea9a6 Fixed broken configure check for __thread support
Somehow the patch from [1] was not applied correctly, fixing that.

1. http://lists.cairographics.org/archives/cairo/2010-September/020826.html
2010-11-02 01:36:37 +02:00
Søren Sandmann Pedersen
ecc3612995 COPYING: Stop saying that a modification is currently under discussion.
Also put the copyright text into a C comment for easier cut and paste.
2010-11-01 18:04:31 -04:00
Søren Sandmann Pedersen
c993cd9614 Version bump 0.21.1.
The previous bump to 0.20.1 was a mistake; it belongs on the 0.20 branch.
2010-10-27 17:21:06 -04:00
Cyril Brulebois
2da37f260e Upload to experimental. 2010-10-27 23:14:13 +02:00
Søren Sandmann Pedersen
d890b684f6 Post-release version bump to 0.20.1 2010-10-27 16:58:29 -04:00
Cyril Brulebois
3a4ab94548 Add myself to Uploaders. 2010-10-27 22:57:19 +02:00
Cyril Brulebois
a74572e2e1 Enable the testsuite. 2010-10-27 22:56:49 +02:00
Søren Sandmann Pedersen
c5e048d46c Pre-release version bump to 0.20.0 2010-10-27 16:51:40 -04:00
Cyril Brulebois
990c9e2447 Pass --disable-gtk to ./configure
As of pixman-0.19.2-5-g5b99710, Gtk+ is auto-detected, make sure not to
pick it accidentally, by passing --disable-gtk. (That's only for test
purposes, but would require pixman-1 itself.)
2010-10-27 22:50:51 +02:00
Scott McCreary
6a6d9758af Added check to find pthread on Haiku. 2010-10-27 16:49:02 -04:00
Cyril Brulebois
6cba95bcd3 Bump SHLIBS_VERSION from 0.18.0 to 0.19.4 for newly-added functions. 2010-10-27 22:37:19 +02:00
Cyril Brulebois
4504e5e3e2 Kill extra tabs. 2010-10-27 22:35:22 +02:00
Cyril Brulebois
ebabb2f2ce Add -c4 to the dh_makeshlibs call, to ensure the build breaks if unexpected symbol-related changes happened. 2010-10-27 22:34:42 +02:00
Cyril Brulebois
7cad3fd3f3 Update symbols file with newly-added functions. 2010-10-27 22:30:57 +02:00
Cyril Brulebois
865f80fd77 Update changelogs. 2010-10-27 22:13:24 +02:00
Cyril Brulebois
3389b86e1f Merge branch 'upstream-experimental' into debian-experimental 2010-10-27 22:11:02 +02:00
Cyril Brulebois
3193481ee6 Merge remote branch 'origin/upstream-experimental' into upstream-experimental
Use -s ours to move to upstream's pixman-0.19.6 tag.
2010-10-27 20:55:30 +02:00
Jon TURNEY
00fdb3d8e8 Plug another leak in alphamap test
Even after commit e46be417ce alphamap
test is still leaking the alphamap pixmap, leading to mmap() failures
on cygwin

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2010-10-24 15:38:14 -04:00
Søren Sandmann Pedersen
1c23142efa Post-release version bump to 0.19.7 2010-10-20 16:31:57 -04:00
Søren Sandmann Pedersen
d105134015 Pre-release version bump to 0.19.6 2010-10-20 16:25:55 -04:00
Andrea Canciani
a966cd04c1 Fix an overflow in the new radial gradient code
huge-radial in the cairo test suite pointed out an undocumented
overflow in the radial gradient code.
By casting to pixman_fixed_48_16_t before doing the operations,
the overflow can be avoided.
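
A minimal sketch of the technique, using hypothetical typedefs that mirror
pixman_fixed_t (16.16 in 32 bits) and pixman_fixed_48_16_t (64 bits). The
dot product is only an example operation, not the exact expression in the
radial gradient code.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t sketch_fixed_t;         /* 16.16, like pixman_fixed_t */
typedef int64_t sketch_fixed_48_16_t;   /* like pixman_fixed_48_16_t */

/* Dot product of two 16.16 vectors. Each operand is widened BEFORE the
 * multiplication, so the product is computed in 64 bits. The result
 * carries 32 fractional bits. */
sketch_fixed_48_16_t
sketch_dot (sketch_fixed_t x1, sketch_fixed_t y1,
            sketch_fixed_t x2, sketch_fixed_t y2)
{
    assert (sizeof (sketch_fixed_48_16_t) >= 2 * sizeof (sketch_fixed_t));

    return (sketch_fixed_48_16_t) x1 * x2 + (sketch_fixed_48_16_t) y1 * y2;
}
```

Writing x1 * x2 without the cast would multiply in 32 bits and overflow for
coordinates as small as 1000.0, long before the result is stored in the
wider type.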
2010-10-20 16:22:29 -04:00
Søren Sandmann Pedersen
70658f0a6b Remove the class field from source_image_t
The linear gradient was the only image type that relied on the class
being stored in the image struct itself. With the previous changes, it
doesn't need that anymore, so we can delete the field.
2010-10-20 16:09:44 -04:00
Andrea Canciani
741c30d9d9 Remove unused enum value
The new linear gradient code doesn't use SOURCE_IMAGE_CLASS_VERTICAL
anymore and it was not used anywhere else.
2010-10-20 21:27:08 +02:00
Andrea Canciani
9b72fd1b85 Make classification consistent with rasterization
Use the same computations to classify the gradient and to
rasterize it.
This improves the correctness of the classification by
avoiding integer division.
2010-10-20 21:27:08 +02:00
Andrea Canciani
1d4f2d71fa Improve precision of linear gradients
Integer division (without keeping the remainder) can discard a lot
of information. Doing the division maths in floating point (and
paying attention to error propagation) greatly improves the
precision of linear gradients.
2010-10-18 22:43:24 +02:00
Andrea Canciani
f6ab20ca66 Add comments about errors
Explain how errors are introduced in the computation performed for
radial gradients.
2010-10-12 14:40:36 +02:00
Andrea Canciani
1ca715ed1e Draw radial gradients with PDF semantics
Change radial gradient computations and definition to reflect the
radial gradients in PDF specifications (see section 8.7.4.5.4,
Type 3 (Radial) Shadings of the PDF Reference Manual).

Instead of having a valid interpolation parameter value for every
point of the plane, define it only for points within the area
covered by the family of circles generated by interpolating or
extrapolating the start and end circles.

Points outside this area are now transparent black (rgba 0 0 0 0).
Points within this area have the color associated with the maximum
value of the interpolation parameter in that point (if multiple
solutions exist within the range specified by the extend mode).
2010-10-12 14:40:36 +02:00
Søren Sandmann Pedersen
e46be417ce Plug leak in the alphamap test.
The images are being created with non-NULL data, so we have to free it
ourselves. This is important because the Cygwin tinderbox is running
out of memory and produces this:

    mmap failed on 20000 1507328
    mmap failed on 40000 1507328
    mmap failed on 20000 1507328
    mmap failed on 40000 1507328
    mmap failed on 40000 1507328
    mmap failed on 40000 1507328

http://tinderbox.x.org/builds/2010-10-05-0014/logs/pixman/#check
2010-10-11 12:06:20 -04:00
Søren Sandmann Pedersen
6ed7164de5 Add no-op combiners for DST and the CA versions of the HSL operators.
We already exit early for DST, but for the HSL operators with
component alpha, we crash at the moment. Fix that by adding a dummy
combine_dst() function.
2010-10-11 12:06:20 -04:00
Søren Sandmann Pedersen
233b27257b test: Add some more colors to the color table in composite.c
Specifically, add transparent black and superluminescent white with
alpha = 0.
2010-10-11 12:06:20 -04:00
Søren Sandmann Pedersen
3f7da59352 test: Parallelize composite.c with OpenMP
Each test uses the test number as the random number seed; if it
didn't, all the threads would run the same tests since they would all
start from the same seed.
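
The seeding idea can be sketched as follows. This is only an illustration
under assumptions, not the code in composite.c: lcg_next(), run_test() and
run_all() are hypothetical names, and the generator is a plain LCG chosen
for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generator step; the real test uses its own PRNG. */
static uint32_t
lcg_next (uint32_t *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fff;
}

/* Hypothetical test body: all randomness derives from test_no alone. */
static uint32_t
run_test (int test_no)
{
    uint32_t state, lo, hi;

    assert (test_no >= 0);
    state = (uint32_t) test_no;        /* seed = test number */
    lo = lcg_next (&state);
    hi = lcg_next (&state);
    return lo | (hi << 15);
}

/* Each iteration is independent, so the loop can be split across threads. */
static void
run_all (uint32_t *results, int n_tests)
{
    int i;

#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = 0; i < n_tests; i++)
        results[i] = run_test (i);
}
```

Because no state is shared between iterations, the results do not depend on
how the iterations are distributed over threads.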
2010-10-11 12:06:20 -04:00
Søren Sandmann Pedersen
a10ccc9f30 test: Change composite so that it tests randomly generated images
Previously this test would try to exhaustively test all combinations
of formats and operators, which meant that it would take hours to run.
Instead, generate images randomly and test compositing those.

Cc: chris@chris-wilson.co.uk
2010-10-11 12:06:20 -04:00
Søren Sandmann Pedersen
55e4065cbb test: Fix eval_diff() so that it provides useful error values.
Previously, this function would evaluate the error under the
assumption that the format was 565 or wider. This patch changes it to
take the actual format into account.

With that fixed, we can turn on testing for the rest of the formats.

Cc: chris@chris-wilson.co.uk
2010-10-11 12:06:20 -04:00
Søren Sandmann Pedersen
fe411cf2ac test: Fix bug in color_correct() in composite.c
This function was using the number of bits in a channel as if it were
a mask, which led to many spurious errors. With that fixed, we can
turn on testing for all formats where all channels have 5 or more
bits.
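
The class of bug described above can be illustrated like this. It is a
sketch under assumptions, not the actual color_correct() code:
quantize_buggy() and quantize_fixed() are hypothetical helpers that reduce
an 8-bit channel value to n_bits.

```c
#include <assert.h>
#include <stdint.h>

/* Wrong: n_bits is a bit count, but it is used as if it were a mask. */
static uint32_t
quantize_buggy (uint32_t value, int n_bits)
{
    assert (n_bits > 0 && n_bits <= 8);
    return (value >> (8 - n_bits)) & (uint32_t) n_bits;
}

/* Right: build the mask (1 << n_bits) - 1 from the bit count first. */
static uint32_t
quantize_fixed (uint32_t value, int n_bits)
{
    uint32_t mask;

    assert (n_bits > 0 && n_bits <= 8);
    mask = (1u << n_bits) - 1;
    return (value >> (8 - n_bits)) & mask;
}
```

For a 5-bit channel the buggy variant masks with 5 (binary 101) instead of
31 (binary 11111), so most values come out wrong.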

Cc: chris@chris-wilson.co.uk
2010-10-11 12:06:20 -04:00
Søren Sandmann Pedersen
4e89a5b7f3 Remove broken optimizations in combine_disjoint_over_u()
The first broken optimization is that it checks "a != 0x00" where it
should check "s != 0x00". The other is that it skips the computation
when alpha is 0xff. That is wrong because in the formula:

     min (1, (1 - Aa)/Ab)

the render specification states that if Ab is 0, the quotient is
defined to positive infinity. That is the case even if (1 - Aa) is 0.
2010-10-11 12:06:20 -04:00
Siarhei Siamashka
8d76c1b339 ARM: restore fallback to ARMv6 implementation from NEON in the delegate chain
After fast path cache introduction, the overhead of having this fallback is
insignificant. On the other hand, some of the ARM assembly optimizations (for
example nearest neighbor scaling) do not need NEON.
2010-10-11 01:07:07 +03:00
Siarhei Siamashka
c748650d70 Use more unrolling for scaled src_0565_0565 with nearest filter
Benchmark from Intel Core i7 860:

    == before ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=1335.29 MPix/s

    == after ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=1550.96 MPix/s

    == performance of nonscaled src_0565_0565 operation as a reference ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=2401.31 MPix/s

Benchmark from ARM Cortex-A8:

    == before ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=81.79 MPix/s

    == after ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=89.55 MPix/s

    == performance of nonscaled src_0565_0565 operation as a reference ==
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=197.44 MPix/s
2010-10-11 01:07:01 +03:00
Siarhei Siamashka
a520c15e11 ARM: added 'neon_composite_out_reverse_8_0565' fast path
== before ==

    outrev_8_0565 =  L1:  22.91  L2:  22.40  M: 18.75 ( 10.47%)
                     HT: 12.62   VT: 12.22   R: 11.32  RT:  5.30 (  58Kops/s)

== after ==

    outrev_8_0565 =  L1: 176.27  L2: 151.70  M:108.79 ( 60.81%)
                     HT: 50.43   VT: 37.16   R: 32.26  RT:  9.62 (  97Kops/s)
2010-10-04 23:08:54 +03:00
Siarhei Siamashka
d8820360f7 ARM: added 'neon_composite_add_0565_8_0565' fast path
== before ==

    add_0565_8_0565 =  L1:  14.05  L2:  14.03  M: 11.57 ( 12.94%)
                       HT:  8.31   VT:  8.10   R:  7.47  RT:  3.64 (  42Kops/s)

== after ==

    add_0565_8_0565 =  L1: 123.36  L2:  94.70  M: 74.36 ( 83.15%)
                       HT: 31.17   VT:  23.97  R: 21.06  RT:  6.42 (  70Kops/s)
2010-10-04 23:08:47 +03:00
Siarhei Siamashka
2f6c7b4f9d ARM: NEON: added forgotten cache preload for over_n_8888/over_n_0565
Prefetch provides up to 40-50% better performance when working
with large images and/or when having lots of L2 cache misses
on ARM Cortex-A8 @ 720MHz:

== before ==

    over_n_8888 =  L1: 225.83  L2: 181.02  M: 55.57 ( 41.41%)
                   HT: 38.96   VT: 36.92   R: 32.84  RT: 14.15 ( 123Kops/s)

    over_n_0565 =  L1: 153.91  L2: 149.69  M: 83.17 ( 30.95%)
                   HT: 50.41   VT: 49.15   R: 40.56  RT: 15.45 ( 131Kops/s)

== after ==

    over_n_8888 =  L1: 222.39  L2: 170.95  M: 76.86 ( 57.27%)
                   HT: 58.80   VT: 53.03   R: 45.51  RT: 14.13 ( 124Kops/s)

    over_n_0565 =  L1: 151.87  L2: 149.54  M:125.63 ( 46.80%)
                   HT: 67.85   VT: 57.54   R: 50.21  RT: 15.32 ( 130Kops/s)
2010-10-04 23:05:24 +03:00
Mika Yrjola
b924bb1f81 Fix "syntax error: empty declaration" warnings.
These minor changes should fix a large number of
macro-declaration-related "syntax error: empty declaration" warnings
which are seen while compiling the code with the Solaris Studio
compiler.
2010-10-04 11:20:01 -04:00
Søren Sandmann Pedersen
73c1fefa1b Delete simple repeat code
This was supposedly an optimization, but it has pathological cases
where it definitely isn't. For example a 1 x n image will cause it to
have terrible memory access patterns and to generate a ton of modulus
operations.

Since no one has ever measured whether it actually is an improvement,
and since it is doing the repeating at the wrong stage in the
pipeline, and since with the previous commit it can't be triggered
anymore because we now require SAMPLES_COVER_CLIP for regular fast
paths, just delete it.
2010-10-04 11:19:27 -04:00
Søren Sandmann Pedersen
a4d1c9d383 Fix bug in FAST_PATH_STD_FAST_PATH
The standard fast paths deal with two kinds of images: solids and
bits. These two image types require different flags, but
PIXMAN_STD_FAST_PATH uses the same ones for both.

This patch makes it so that solid images just get the standard flags,
while bits images must be untransformed and contain the destination clip
within the sample grid.

This means that the old FAST_PATH_COVERS_CLIP flag is now not used
anymore, so it can be deleted.
2010-10-04 11:17:53 -04:00
Dmitri Vorobiev
10e13135c3 Some clean-ups in fence_malloc() and fence_free()
This patch removes an unnecessary typecast of MAP_FAILED,
replaces an erroneous free() by the correct munmap() in the
error path for a failing mprotect(), and, finally, removes
redundant calls to mprotect() that aren't necessary, because
munmap() doesn't call for any specific memory protection.
2010-09-29 02:15:12 -04:00
Søren Sandmann Pedersen
ba693d2e88 Fix search-and-replace issue in lowlevel-blt-bench.c 2010-09-28 02:52:17 -04:00
Søren Sandmann Pedersen
77d3e5f6ff Rename all the fast paths with _8000 in their names to _8
This inconsistent naming somehow survived the refactoring from a while
back.
2010-09-28 00:07:47 -04:00
Liu Xinyun
ba69989374 Remove cache prefetch code.
The performance is decreased with cache prefetch, especially for
ATOM. So remove this code. The experiment follows.

old: 0.19.5-with-cache-prefetch
new: 0.19.5-without-cache-prefetch

CPU: Intel Atom N270@1.6GHz
OS: MeeGo (32 bits)
Speedups
========
image-rgba                    poppler-0    17125.68 (17279.58 0.92%) -> 14765.36 (15926.49 3.54%):  1.16x speedup
image-rgba                  ocitysmap-0    9008.25 (9040.41 7.50%) -> 8277.94 (8343.09 5.44%):  1.09x speedup
image-rgba          xfce4-terminal-a1-0    18020.76 (18230.68 0.97%) -> 16703.77 (16712.42 1.22%):  1.08x speedup
image-rgba         gnome-terminal-vim-0    25081.38 (25133.38 0.24%) -> 23407.47 (23652.98 0.54%):  1.07x speedup
image-rgba          firefox-talos-gfx-0    57916.97 (57973.20 0.11%) -> 54556.64 (54624.55 0.39%):  1.06x speedup
image-rgba       firefox-planet-gnome-0    102377.47 (103496.63 0.70%) -> 96816.65 (97075.54 0.15%):  1.06x speedup
image-rgba         swfdec-giant-steps-0    12376.24 (12616.84 1.02%) -> 11705.30 (11825.20 1.06%):  1.06x speedup

CPU: Intel Core(TM)2 Duo CPU T9600@2.80GHz
OS: Ubuntu 10.04 (64bits)
Speedups
========
image-rgba                  ocitysmap-0    2671.46 (2691.82 8.55%) -> 2296.20 (2307.26 5.77%):  1.16x speedup
image-rgba         swfdec-giant-steps-0    1614.55 (1615.18 1.68%) -> 1532.84 (1538.52 0.72%):  1.05x speedup

Signed-off-by: Liu Xinyun <xinyun.liu@intel.com>
Signed-off-by: Chen Miaobo <miaobo.chen@intel.com>
2010-09-27 23:44:09 -04:00
Dmitri Vorobiev
56777f3f67 Use <sys/mman.h> macros only when they are available
Not all systems are regular Unices, so let's be careful with the
mmap()-related stuff, which might be unavailable. This patch makes
sure that mmap() and friends are used only when the <sys/mman.h>
header is found.
2010-09-23 16:02:29 -04:00
Søren Sandmann Pedersen
39524a4687 Revert "add enable-cache-prefetch option"
Revert this accidentally committed patch.

This reverts commit 19ea0e16b9.
2010-09-21 14:20:43 -04:00
Søren Sandmann Pedersen
e97da21049 If MAP_ANONYMOUS is not defined, define it to MAP_ANON.
This hopefully fixes the build failure on OS X.
2010-09-21 14:12:00 -04:00
Liu Xinyun
19ea0e16b9 add enable-cache-prefetch option
OK. here is the work to clear all cache prefetch. Please review it. 3x

On Tue, Sep 21, 2010 at 11:36:30PM +0800, Soeren Sandmann wrote:
> Liu Xinyun <xinyun.liu@intel.com> writes:
>
> >    This patch is to add a new configuration option: enable-cache-prefetch,
> > which is default yes.
> >
> >    Here is a link which talks on cache issue.
> >    http://lists.freedesktop.org/archives/pixman/2010-June/000218.html
> >
> >    When disable it on Atom CPU(configured with --enable-cache-prefetch=no),
> > it will have a little performance gain. Here is the patch.
>
> I think the cache prefetch code should just be deleted outright. No
> benchmarks that I'm aware of show it to be an improvement.
>
>
> Thanks,
> Soren

>From bca2192ef524bcae4eea84d0ffed9e8c4855675f Mon Sep 17 00:00:00 2001
From: Liu Xinyun <xinyun.liu@intel.com>
Date: Wed, 22 Sep 2010 00:11:56 +0800
Subject: [PATCH] remove cache prefetch
2010-09-21 12:35:51 -04:00
Søren Sandmann Pedersen
edd1733966 Post-release version bump to 0.19.5 2010-09-21 10:18:44 -04:00
Søren Sandmann Pedersen
e5b3a6e710 Pre-release version bump to 0.19.4 2010-09-21 10:11:34 -04:00
Søren Sandmann Pedersen
0742ba4164 compute_composite_region32: Zero extents before returning FALSE.
If the extents of the composite region are broken such that x2 <= x1
or y2 <= y1, then we need to zero the extents before returning so that
the region won't be completely broken when calling
pixman_region32_fini().
2010-09-21 10:05:52 -04:00
Jonathan Morton
7cd4f2fa20 Add a lowlevel blitter benchmark
This test is a modified version of Siarhei's compositor throughput
benchmark.  It's expanded with explicit reporting of memory bandwidth
consumption for the M-test, and with an additional 8x8-random test
intended to determine peak ops/sec capability.  There are also quite a
lot more operations tested for.
2010-09-21 08:50:18 -04:00
Dmitri Vorobiev
eab3a77877 Add noinline macro
This patch adds a noinline macro, which expands to compiler-dependent
keywords that tell the compiler to never inline a function.
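
One way such a macro can be written is sketched below. The name
NOINLINE_SKETCH and the exact set of compilers covered are assumptions made
for illustration; the macro in the tree may be spelled differently.

```c
#include <assert.h>
#include <limits.h>

#if defined (__GNUC__)
#  define NOINLINE_SKETCH __attribute__ ((noinline))
#elif defined (_MSC_VER)
#  define NOINLINE_SKETCH __declspec (noinline)
#else
#  define NOINLINE_SKETCH     /* unknown compiler: expand to nothing */
#endif

/* A function that should remain a real call, e.g. inside a benchmark. */
static NOINLINE_SKETCH int
add_one (int x)
{
    assert (x < INT_MAX);
    return x + 1;
}
```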
2010-09-21 08:50:17 -04:00
Dmitri Vorobiev
cab3261c0d Add gettime() routine to test utils
Impending benchmark code will need a function to get current time
in seconds, and this patch introduces such routine. We try to use
the POSIX gettimeofday() function when available, and fall back to
clock() when not.
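
A sketch of the fallback logic, assuming a SKETCH_HAVE_GETTIMEOFDAY macro
that a configure check would normally provide (here it is guessed from
predefined platform macros). The names gettime_sketch() and elapsed_since()
are hypothetical.

```c
#include <assert.h>
#include <time.h>

#if defined (__unix__) || defined (__APPLE__)
#  include <sys/time.h>
#  define SKETCH_HAVE_GETTIMEOFDAY 1
#endif

/* Current time in seconds, as a double. */
static double
gettime_sketch (void)
{
#ifdef SKETCH_HAVE_GETTIMEOFDAY
    struct timeval tv;

    gettimeofday (&tv, NULL);
    return (double) tv.tv_sec + (double) tv.tv_usec / 1000000.0;
#else
    /* Fallback: processor time, which is coarser but always available. */
    return (double) clock () / (double) CLOCKS_PER_SEC;
#endif
}

/* Seconds elapsed since an earlier gettime_sketch () reading. */
static double
elapsed_since (double start)
{
    assert (start >= 0.0);
    return gettime_sketch () - start;
}
```

Note that the two branches measure different things (wall-clock time versus
CPU time), so benchmark numbers from the two are not directly comparable.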
2010-09-21 08:50:17 -04:00
Dmitri Vorobiev
fd3c87d460 Move aligned_malloc() to utils
The aligned_malloc() routine will be used in more than one test utility.
At least, a low-level blitter benchmark needs it. Therefore, let's make
this function a part of common test utilities code.
2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
f474783607 Enable bits_image_fetch_bilinear_affine_normal_r5g6b5 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
91521d30ab Enable bits_image_fetch_bilinear_affine_reflect_r5g6b5 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
372d7b954a Enable bits_image_fetch_bilinear_affine_none_r5g6b5 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
a826ae0e3a Enable bits_image_fetch_bilinear_affine_pad_r5g6b5 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
c5238bd180 Enable bits_image_fetch_bilinear_affine_normal_a8 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
d12daefcdb Enable bits_image_fetch_bilinear_affine_reflect_a8 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
9388be3293 Enable bits_image_fetch_bilinear_affine_none_a8 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
8e4d4e8d11 Enable bits_image_fetch_bilinear_affine_pad_a8 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
ce1f6c50b4 Enable bits_image_fetch_bilinear_affine_normal_x8r8g8b8 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
83f2ee3e95 Enable bits_image_fetch_bilinear_affine_reflect_x8r8g8b8 2010-09-21 08:50:17 -04:00
Søren Sandmann Pedersen
be37ae331c Enable bits_image_fetch_bilinear_affine_none_x8r8g8b8 2010-09-21 08:50:16 -04:00
Søren Sandmann Pedersen
5f8a9bebc0 Enable bits_image_fetch_bilinear_affine_pad_x8r8g8b8 2010-09-21 08:50:16 -04:00
Søren Sandmann Pedersen
c59584cb86 Enable bits_image_fetch_bilinear_affine_normal_a8r8g8b8 2010-09-21 08:50:16 -04:00
Søren Sandmann Pedersen
2292cff304 Enable bits_image_fetch_bilinear_affine_reflect_a8r8g8b8 2010-09-21 08:50:16 -04:00
Søren Sandmann Pedersen
8b29162693 Enable bits_image_fetch_bilinear_affine_none_a8r8g8b8 2010-09-21 08:50:16 -04:00
Søren Sandmann Pedersen
e8555874e1 Enable bits_image_fetch_bilinear_affine_pad_a8r8g8b8 2010-09-21 08:50:16 -04:00
Søren Sandmann Pedersen
f9778c15e9 Use a macro to generate some {a,x}8r8g8b8, a8, and r5g6b5 bilinear fetchers.
There are versions for all combinations of x8r8g8b8/a8r8g8b8 and
pad/reflect/none/normal repeat modes. The bulk of each scaler is an
inline function that takes a format and a repeat mode as parameters.

The new scalers are all commented out, but the next commits will
enable them one at a time to facilitate bisecting.
2010-09-21 08:50:16 -04:00
Søren Sandmann Pedersen
6d1e10a8b5 test: Add affine-test
This test tests compositing with various affine transformations. It is
almost identical to scaling-test, except that it also applies a random
rotation in addition to the random scaling and translation.
2010-09-21 08:31:09 -04:00
Søren Sandmann Pedersen
4fa33537d7 analyze_extents: Fast path for non-transformed BITS images
Profiling various cairo traces showed that we were spending a lot of
time in analyze_extents and compute_sample_extents(). This was
especially bad for glyphs where all this computation was completely
unnecessary.

This patch adds a fast path for the case of non-transformed BITS
images. The result is approximately a 6% improvement on the
firefox-talos-gfx benchmark:

Before:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image            firefox-talos-gfx   13.797   13.848   0.20%    6/6

After:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image            firefox-talos-gfx   12.946   13.018   0.39%    6/6
2010-09-21 08:31:09 -04:00
Søren Sandmann Pedersen
c97881fe3c Move some of the FAST_PATH_COVERS_CLIP computation to pixman-image.c
When an image is solid or repeating, the FAST_PATH_COVERS_CLIP flag
can be set in compute_image_info().

Also the code that turned this flag off in pixman.c was not correct;
it didn't take transformations into account. With this patch, pixman.c
doesn't set the flag by default, but instead relies on the call to
compute_samples_extents() to set it when possible.
2010-09-21 08:31:09 -04:00
Tor Lillqvist
3411f9399c Support __thread on MINGW 4.5
By the way, it seems that with gcc 4.5.0 from mingw.org, __thread, sse
and mmx work fine.

I added the below to pixman 0.18 and as far as I can see, it works.
make check reports no problems. (Earlier I had to use --disable-mmx
and --disable-sse2.) Also gtk-demo and gimp run fine.

(Also a change to get rid of the warnings about -fvisibility being ignored.)
2010-09-21 08:31:08 -04:00
Søren Sandmann Pedersen
add0fd1bac Clip composite region against the destination alpha map extents.
Otherwise we can end up writing outside the alpha map.
2010-09-21 08:31:08 -04:00
Søren Sandmann Pedersen
af2f0080fe Remove FAST_PATH_NARROW_FORMAT flag if there is a wide alpha map
If an image has an alpha map that has wide components, then we need to
use 64 bit processing for that image. We detect this situation in
pixman-image.c and remove the FAST_PATH_NARROW_FORMAT flag.

In pixman-general, the wide/narrow decision is now based on the flags
instead of on the formats.
2010-09-21 08:31:08 -04:00
Søren Sandmann Pedersen
0afc613415 Rename FAST_PATH_NO_WIDE_FORMAT to FAST_PATH_NARROW_FORMAT
This avoids a negative in the name. Also, by renaming the "wide"
variable in pixman-general.c to "narrow" and fixing up the logic
correspondingly, the code there reads a lot more straightforwardly.
2010-09-21 08:31:08 -04:00
Søren Sandmann Pedersen
ae77548f0d Update and extend the alphamap test
- Test many more combinations of formats

- Test destination alpha maps

- Test various different alpha origins

Also add a transformation to the destination, but comment it out
because it is actually broken at the moment (and pretty difficult to
fix).
2010-09-21 08:28:55 -04:00
Søren Sandmann Pedersen
dc9fe269ea Add fence_malloc() and fence_free().
These variants of malloc() and free() try to surround the allocated
memory with protected pages so that out-of-bounds accesses will cause
a segmentation fault.

If mprotect() and getpagesize() are not available, these functions are
simply equivalent to malloc() and free().
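
The guard-page technique can be sketched as below. This is a simplified
illustration for POSIX systems, not the implementation in the test
utilities: the layout (one bookkeeping page, one guard page on each side)
and the names fence_malloc_sketch(), fence_free_sketch() and fence_info_t
are assumptions. Only accesses that reach a guard page fault; an overrun
that stays inside the last payload page is not detected by this layout.

```c
#ifndef _DEFAULT_SOURCE
#  define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#  define MAP_ANONYMOUS MAP_ANON
#endif

typedef struct
{
    void  *base;      /* start of the whole mapping */
    size_t total;     /* length of the whole mapping */
} fence_info_t;

/* Layout: [info page][guard page][payload pages][guard page] */
static void *
fence_malloc_sketch (size_t len)
{
    size_t page = (size_t) sysconf (_SC_PAGESIZE);
    size_t payload = (len + page - 1) / page * page;
    size_t total = payload + 3 * page;
    uint8_t *base;
    fence_info_t *info;

    assert (len > 0);

    base = mmap (NULL, total, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    info = (fence_info_t *) base;
    info->base = base;
    info->total = total;

    if (mprotect (base + page, page, PROT_NONE) != 0 ||
        mprotect (base + 2 * page + payload, page, PROT_NONE) != 0)
    {
        munmap (base, total);     /* mmap()ed memory: munmap(), not free() */
        return NULL;
    }

    return base + 2 * page;
}

static void
fence_free_sketch (void *ptr)
{
    size_t page = (size_t) sysconf (_SC_PAGESIZE);
    fence_info_t *info;

    assert (ptr != NULL);
    info = (fence_info_t *) ((uint8_t *) ptr - 2 * page);
    munmap (info->base, info->total);
}
```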
2010-09-21 08:28:55 -04:00
Søren Sandmann Pedersen
f4dc73bad4 Do opacity computation with shifts instead of comparing with 0
Also add a COMPILE_TIME_ASSERT() macro and use it to assert that the
shift is correct.
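
A sketch of the pattern, with hypothetical names and values
(COMPILE_TIME_ASSERT_SKETCH, FLAG_IS_OPAQUE_SKETCH, OPAQUE_SHIFT_SKETCH).
It assumes the flag is a single bit, so shifting it down yields 0 or 1
without a comparison.

```c
#include <assert.h>
#include <stdint.h>

/* A negative array size makes compilation fail when x is false. */
#define COMPILE_TIME_ASSERT_SKETCH(x)                                   \
    do { typedef int compile_time_assertion[(x) ? 1 : -1]; } while (0)

#define FLAG_IS_OPAQUE_SKETCH  (1u << 13)    /* hypothetical flag value */
#define OPAQUE_SHIFT_SKETCH    13

static uint32_t
opaque_bit (uint32_t flags)
{
    uint32_t bit;

    /* Checked by the compiler: the shift must match the flag's position. */
    COMPILE_TIME_ASSERT_SKETCH (
        FLAG_IS_OPAQUE_SKETCH == (1u << OPAQUE_SHIFT_SKETCH));

    bit = (flags & FLAG_IS_OPAQUE_SKETCH) >> OPAQUE_SHIFT_SKETCH;
    assert (bit <= 1);    /* run-time check, shown only for contrast */
    return bit;
}
```

If someone later renumbers the flag without updating the shift, the build
breaks instead of the computation silently returning the wrong bit.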
2010-09-21 08:28:55 -04:00
Siarhei Siamashka
517a77a992 SSE2 optimization for scaled over_8888_8888 operation with nearest filter
This is the first demo implementation, it should be possible to
generalize it later to cover more operations with fewer lines of code.

It should also be possible to introduce the use of '__builtin_constant_p'
gcc builtin function for an efficient way of checking if 'unit_x' is known
to be zero at compile time (when processing padding pixels for NONE, or
PAD repeat).

Benchmarks from Intel Core i7 860:

== before (nearest OVER) ==
op=3, src_fmt=20028888, dst_fmt=20028888, speed=142.01 MPix/s

== after (nearest OVER) ==
op=3, src_fmt=20028888, dst_fmt=20028888, speed=314.99 MPix/s

== performance of nonscaled operation as a reference ==
op=3, src_fmt=20028888, dst_fmt=20028888, speed=652.09 MPix/s
2010-09-21 13:33:57 +03:00
Siarhei Siamashka
abc90dad57 NONE repeat support for fast scaling with nearest filter
Implemented very similarly to PAD repeat.

And gcc also seems to be able to completely eliminate the
code responsible for left and right padding pixels for OVER
operation with NONE repeat.
2010-09-21 13:33:08 +03:00
Siarhei Siamashka
45833d5b19 PAD repeat support for fast scaling with nearest filter
When processing pixels from the left and right padding, the same
scanline function is used with 'unit_x' set to 0.

It actually appears that gcc can handle this quite efficiently. When
using 'restrict' keyword, it is able to optimize the whole operation
performed on left or right padding pixels to a small unrolled loop
(the code is reduced to a simple fill implementation):

    9b30:       89 08                   mov    %ecx,(%rax)
    9b32:       89 48 04                mov    %ecx,0x4(%rax)
    9b35:       48 83 c0 08             add    $0x8,%rax
    9b39:       49 39 c0                cmp    %rax,%r8
    9b3c:       75 f2                   jne    9b30

Without 'restrict' keyword, there is one instruction more: reloading
source pixel data from memory in the beginning of each iteration. That
is slower, but also acceptable.
2010-09-21 13:32:11 +03:00
Siarhei Siamashka
3db0cc5c75 Introduce a fake PIXMAN_REPEAT_COVER constant
We need to implement a true PIXMAN_REPEAT_NONE support later (padding
the source with zero pixels). So it's better not to use PIXMAN_REPEAT_NONE
for handling FAST_PATH_SAMPLES_COVER_CLIP special case.
2010-09-21 13:30:59 +03:00
Siarhei Siamashka
e9b0740af7 Nearest scaling fast path macro split into two parts
Scanline processing is now split into a separate function. This provides
an easy way of overriding it with a platform specific implementation,
which may use SIMD optimizations. Only basic C data types are used as
the arguments for this function, so it may be implemented entirely in
assembly or be generated by some JIT engine.

Also as a result of this split, the complexity of code is reduced a
bit and now it should be easier to introduce support for the currently
missing NONE, PAD and REFLECT repeat types.
2010-09-21 13:29:55 +03:00
Siarhei Siamashka
066ce191a6 Nearest scaling fast path macros moved to 'pixman-fast-path.h'
These macros with some modifications can be reused later by
various platform specific implementations, introducing SIMD
optimizations for nearest scaling fast paths.
2010-09-21 13:28:40 +03:00
Søren Sandmann Pedersen
fb819c0e93 Add FAST_PATH_NO_ALPHA_MAP to the standard destination flags.
We can't in general take a fast path if the destination has an alpha
map.
2010-09-14 08:57:17 -04:00
Siarhei Siamashka
ba6c98fc4b test: detection of possible floating point registers corruption
Added a pair of macros which can help to detect corruption
of floating point registers after a function call. This may
happen if _mm_empty() call is forgotten in MMX/SSE2 fast
path code, or ARM NEON assembly optimized function
forgets to save/restore d8-d15 registers before use.
2010-09-13 18:12:31 +03:00
Siarhei Siamashka
e470c0dc5b ARM: added 'neon_composite_over_0565_8_0565' fast path 2010-09-13 18:10:59 +03:00
Siarhei Siamashka
a5bf7c3b1a ARM: helper macros for conversion between 8888/x888/0565 formats 2010-09-13 18:08:16 +03:00
Siarhei Siamashka
8e299702f3 ARM: common init/cleanup macro for saving/restoring NEON registers
This is a typical prologue/epilogue for many NEON fast path functions, so
it makes sense to provide common reusable macros for it in the header file.
2010-09-13 18:05:53 +03:00
Søren Sandmann Pedersen
e29d9dfcb5 Silence some warnings about uninitialized variables
Neither were real problems, but GCC was complaining about them.
2010-09-08 19:16:21 -04:00
Søren Sandmann Pedersen
27f7852b5a When pixman_compute_composite_region32() returns FALSE, don't fini the region.
The rule is that the region passed in must be initialized and that the
region returned will still be valid. I.e., the lifecycle is the
responsibility of the caller, regardless of what the function returns.

Previously, compute_composite_region32() would finalize the region and
then return FALSE, and then the caller would finalize the region
again, leading to memory corruption in some cases.
2010-09-08 19:15:01 -04:00
Søren Sandmann Pedersen
df6dbc9024 Store a2b2g2r2 pixel through the WRITE macro
Otherwise, accessor functions won't work.
2010-09-08 19:14:58 -04:00
Siarhei Siamashka
f42419a3e4 ARM: added 'neon_composite_over_8888_8_0565' fast path 2010-09-06 23:56:05 +03:00
Julien Cristau
a4f6c93016 Upload to experimental 2010-09-06 21:15:21 +02:00
Maarten Bosmans
765bde32e0 Add *.exe to .gitignore 2010-08-30 13:41:38 -04:00
Maarten Bosmans
8596408261 Use windows.h directly for mingw32 build
This patch addresses the issue discussed in
http://lists.freedesktop.org/archives/pixman/2010-April/000163.html

There were only two clashing identifiers.  The first one is IN, which
obviously causes problems in Pixman for lines like

    PIXMAN_STD_FAST_PATH (IN, solid, a8, a8, fast_composite_in_n_8_8),

Fortunately the mingw headers provide a solution: by defining
_NO_W32_PSEUDO_MODIFIERS, these stupid symbols are skipped.

The other name is UINT64, used in pixman-mmx.c. I renamed that
function to to_uint64, but maybe another name is more appropriate.
2010-08-30 13:39:48 -04:00
Søren Sandmann Pedersen
5b99710042 Be more paranoid about checking for GTK+
From time to time people run into issues where the configure script
detects GTK+ when it is either not installed, or not functional due to
a missing pixman. Most recently:

  https://bugs.freedesktop.org/show_bug.cgi?id=29736

This patch makes the configure script more paranoid by

- always using PKG_CHECK_MODULES and not PKG_CHECK_EXISTS, since it
seems PKG_CHECK_EXISTS will sometimes return true even if a dependency
of GTK+, such as pixman-1, is missing.

- explicitly checking that pixman-1 is installed before enabling GTK+.

Cc: my.somewhat.lengthy.loginname@gmail.com
2010-08-24 08:12:20 -04:00
Søren Sandmann Pedersen
5530bcab26 Merge pixman_image_composite32() and do_composite().
There is not much point having a separate function that just validates
the images. Also add a boolean return to lookup_composite_function()
so that we can return if no composite function is found.
2010-08-24 08:12:20 -04:00
Benjamin Otte
a8ea889e5e region: Fix pixman_region_translate() clipping bug
Fixes the region-translate test case by clipping region translations to
the newly defined PIXMAN_REGION_MIN/MAX and using the newly introduced
type overflow_int_t to check for the overflow.
Also uses INT16_MAX or INT32_MAX for these values instead of relying on
the size of short and int types.
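
The overflow check can be sketched like this for the 32-bit case. The type
and function names (overflow_int_sketch_t, box_sketch_t, clamp_add(),
translate_box()) are assumptions, and the sketch only clamps a single box;
the real function also has to deal with boxes that become empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t overflow_int_sketch_t;   /* wider than the coordinates */

typedef struct { int32_t x1, y1, x2, y2; } box_sketch_t;

/* Add in the wider type, then clamp back into the int32 range. */
static int32_t
clamp_add (int32_t coord, int32_t delta)
{
    overflow_int_sketch_t sum = (overflow_int_sketch_t) coord + delta;

    if (sum < INT32_MIN)
        return INT32_MIN;
    if (sum > INT32_MAX)
        return INT32_MAX;
    return (int32_t) sum;
}

static void
translate_box (box_sketch_t *box, int32_t dx, int32_t dy)
{
    assert (box != NULL);

    box->x1 = clamp_add (box->x1, dx);
    box->x2 = clamp_add (box->x2, dx);
    box->y1 = clamp_add (box->y1, dy);
    box->y2 = clamp_add (box->y2, dy);
}
```

Using the fixed-width limits INT32_MIN and INT32_MAX keeps the clamp
independent of how wide short and int happen to be on the platform.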
2010-08-24 12:17:50 +02:00
Benjamin Otte
4d8fb1bc01 region: Add a new test region-translate
This test exercises a bug in pixman_region32_translate(). The function
clips the region to int16 coordinates SHRT_MIN/SHRT_MAX.
2010-08-24 12:17:18 +02:00
Søren Sandmann Pedersen
5ff359b8a0 Post-release version bump to 0.19.3 2010-08-21 06:39:44 -04:00
Søren Sandmann Pedersen
39308ed3b0 Pre-release version bump to 0.19.2 2010-08-21 06:33:19 -04:00
Søren Sandmann Pedersen
393ccab74e Only try to compute the FAST_SAMPLES_COVER_CLIP for bits images
It doesn't make sense in other cases, and the computation would make
use of image->bits.{width,height} which led to uninitialized memory
accesses when the image wasn't of type BITS.
2010-08-21 06:29:36 -04:00
Robert Hooker
dbc6d202d7 Bump changelogs. 2010-08-16 10:19:25 -04:00
Robert Hooker
637f4b5907 Merge branch 'upstream-experimental' into debian-experimental 2010-08-16 10:10:53 -04:00
Søren Sandmann Pedersen
97336fad32 Pre-release version bump to 0.18.4 2010-08-16 06:34:53 -04:00
Søren Sandmann Pedersen
da6f33a798 Introduce new FAST_PATH_SAMPLES_OPAQUE flag
This flag is set whenever the pixels of a bits image don't have an
alpha channel. Together with FAST_PATH_SAMPLES_COVER_CLIP it implies
that the image effectively is opaque, so we can do operator reductions
such as OVER->SRC.
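
A sketch of the reduction, with hypothetical flag values and names
(SAMPLES_OPAQUE_SKETCH, SAMPLES_COVER_CLIP_SKETCH, reduce_operator()).
Only the OVER->SRC case from the message is shown.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { OP_SRC_SKETCH, OP_OVER_SKETCH } op_sketch_t;

#define SAMPLES_OPAQUE_SKETCH      (1u << 0)   /* pixels carry no alpha */
#define SAMPLES_COVER_CLIP_SKETCH  (1u << 1)   /* no sample falls outside */

static op_sketch_t
reduce_operator (op_sketch_t op, uint32_t src_flags)
{
    const uint32_t opaque =
        SAMPLES_OPAQUE_SKETCH | SAMPLES_COVER_CLIP_SKETCH;

    assert (op == OP_SRC_SKETCH || op == OP_OVER_SKETCH);

    /* An effectively opaque source hides the destination completely. */
    if (op == OP_OVER_SKETCH && (src_flags & opaque) == opaque)
        return OP_SRC_SKETCH;
    return op;
}
```

Both flags are needed: opaque pixels alone are not enough if some samples
fall outside the image and therefore read as transparent.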
2010-08-16 06:28:23 -04:00
Søren Sandmann Pedersen
32509aa4da Check for read accessors before taking the bilinear fast path
The bilinear fast path accesses pixels directly, so if the image has a
read accessor, then it can't be used.
2010-08-15 22:42:02 -04:00
Søren Sandmann Pedersen
052c5b819c If we bail out of do_composite, make sure to undo any workarounds.
The workaround for an old X bug has to be undone if we bail from
do_composite, so we can't just return.
2010-08-15 22:41:01 -04:00
Søren Sandmann Pedersen
91cb142177 When storing a g1 pixel, store the lowest bit, rather than comparing with 0. 2010-08-15 22:38:40 -04:00
Søren Sandmann Pedersen
a9a084c85c Fix memory leak in the pthreads thread local storage code
When a thread exits, we leak whatever is stored in thread local
variables, so install a destructor to free it.
2010-08-15 22:28:01 -04:00
Søren Sandmann Pedersen
4e5d6f00bf pixman_image_set_alpha_map(): Disallow alpha map cycles
If someone tries to set an alpha map that itself has an alpha map,
simply return. Also, if someone tries to add an alpha map to an image
that is being _used_ as an alpha map, simply return.

This ensures that an alpha map can never have an alpha map.
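
The two checks can be sketched as follows. The struct layout, the
alpha_count field and the function name are assumptions for illustration,
and the explicit self-reference check is an addition of this sketch rather
than something stated in the message.

```c
#include <assert.h>
#include <stddef.h>

typedef struct image_sketch image_sketch_t;
struct image_sketch
{
    image_sketch_t *alpha_map;     /* image supplying alpha, or NULL */
    int             alpha_count;   /* images using this one as alpha map */
};

static void
set_alpha_map_sketch (image_sketch_t *image, image_sketch_t *alpha_map)
{
    assert (image != NULL);

    if (alpha_map == image)
        return;                    /* an image cannot be its own map */
    if (alpha_map && alpha_map->alpha_map)
        return;                    /* the map itself has an alpha map */
    if (alpha_map && image->alpha_count > 0)
        return;                    /* image is in use as an alpha map */

    if (image->alpha_map)
        image->alpha_map->alpha_count--;
    if (alpha_map)
        alpha_map->alpha_count++;
    image->alpha_map = alpha_map;
}
```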
2010-08-15 22:08:16 -04:00
Søren Sandmann Pedersen
9fe7d32c4b Add alpha-loop test program
This tests what happens if you attempt to make an image with an alpha
map that has the image as its alpha map. This results in an infinite
loop in _pixman_image_validate(), so the test sets up a SIGALRM to
exit if it runs for more than five seconds.
2010-08-15 21:57:18 -04:00
Siarhei Siamashka
8a5d1be1da ARM: 'neon_combine_out_reverse_u' combiner
This operation was seen in mozilla browser profiling logs.
Implemented so that 'over' and 'out_reverse' operations
now reuse common parts of code.
2010-08-14 14:50:02 +00:00
Siarhei Siamashka
731e9feaa6 Code simplification (no need advancing 'vx' at the end of scanline) 2010-08-14 14:49:54 +00:00
Søren Sandmann Pedersen
41584f8fe1 Store the various bits image fetchers in a table with formats and flags.
Similarly to how the fast paths are done, put the various bits_image
fetchers in a table, so that we can quickly find the best one based on
the image's flags and format.
2010-08-08 13:57:40 -04:00
Søren Sandmann Pedersen
8e33643f44 Add some new FAST_PATH flags
The flags are:

 *  AFFINE_TRANSFORM, for affine transforms

 *  Y_UNIT_ZERO, for when the 10 entry in the transformation is zero

 *  FILTER_BILINEAR, for when the image has a bilinear filter

 *  NO_NORMAL_REPEAT, for when the repeat mode is not NORMAL

 *  HAS_TRANSFORM, for when the transform is not NULL

Also add some new FAST_PATH_REPEAT_* macros. These are just shorthands
for the image not having any of the other repeat modes. For example
REPEAT_NORMAL is (NO_NONE | NO_PAD | NO_REFLECT).
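
The "negative flag" encoding can be sketched like this. The bit values and
the _SKETCH names are assumptions; only the relationships between the
macros follow the message.

```c
#include <assert.h>
#include <stdint.h>

#define NO_NONE_REPEAT_SKETCH     (1u << 0)
#define NO_PAD_REPEAT_SKETCH      (1u << 1)
#define NO_REFLECT_REPEAT_SKETCH  (1u << 2)
#define NO_NORMAL_REPEAT_SKETCH   (1u << 3)

/* "Repeat is NORMAL" means "repeat is none of the other three modes". */
#define REPEAT_NORMAL_SKETCH                                          \
    (NO_NONE_REPEAT_SKETCH | NO_PAD_REPEAT_SKETCH | NO_REFLECT_REPEAT_SKETCH)
#define REPEAT_PAD_SKETCH                                             \
    (NO_NONE_REPEAT_SKETCH | NO_REFLECT_REPEAT_SKETCH | NO_NORMAL_REPEAT_SKETCH)

/* A fast path applies when the image has every flag it requires. */
static int
flags_match (uint32_t image_flags, uint32_t required)
{
    assert (required != 0);   /* an empty requirement would match anything */
    return (image_flags & required) == required;
}
```

With this encoding a fast path that works for every mode except NORMAL can
ask for the single NO_NORMAL_REPEAT bit, while one that needs exactly
NORMAL asks for the three-bit shorthand.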
2010-08-08 13:57:40 -04:00
Søren Sandmann Pedersen
6f62231d15 Remove "_raw_" from all the accessors.
There are no non-raw accessors anymore.
2010-08-08 13:57:40 -04:00
Søren Sandmann Pedersen
807fd3c084 Eliminate the store_scanline_{32,64} function pointers.
Now that we can't recurse on alpha maps, they are not needed anymore.
2010-08-08 13:57:40 -04:00
Søren Sandmann Pedersen
e213d5fd62 Split bits_image_fetch_transformed() into two functions.
One function deals with the common affine, no-alpha-map case. The
other deals with perspective transformations and alpha maps.
2010-08-08 13:57:40 -04:00
Søren Sandmann Pedersen
cbb2a0d792 Eliminate get_pixel_32() and get_pixel_64() from bits_image.
These functions can simply be passed as arguments to the various pixel
fetchers. We don't need to store them. Since they are known at compile
time and the pixel fetchers are force_inline, this is not a
performance issue.

Also temporarily make all pixel access go through the alpha path.
2010-08-08 13:57:40 -04:00
Søren Sandmann Pedersen
6480c92312 Eliminate recursion from alpha map code
Alpha maps with alpha maps are no longer supported. It's not a useful
feature and it could lead to infinite recursion.
2010-08-08 13:57:40 -04:00
Søren Sandmann Pedersen
1cc750ed92 Replace compute_src_extent_flags() with analyze_extents()
This commit fixes two separate problems: 1. Incorrect computation of
the FAST_PATH_SAMPLES_COVER_CLIP flag, and 2. FAST_PATH_16BIT_SAFE is
a nonsensical thing to compute.

== 1. Incorrect computation of SAMPLES_COVER_CLIP:

Previously we were using pixman_transform_bounds() to compute which
source samples would be used for a composite operation. This is
incorrect for several reasons:

(a) pixman_transform_bounds() is transforming the integer bounding box
of the destination samples, where it should be transforming the
bounding box of the samples themselves. In other words, it is too
pessimistic in some cases.

(b) pixman_transform_bounds() is not rounding the same way as we do
during sampling. For example, for a NEAREST filter we subtract
pixman_fixed_e before rounding off to the nearest sample so that a
transformed value of 1 will round to the sample at 0.5 and not to the
one at 1.5. However, pixman_transform_bounds() would simply truncate
to 1 which would imply that the first sample to be used was the one at
1.5. In other words, it is too optimistic in some cases.

(c) The result of pixman_transform_bounds() does not account for the
interpolation filter applied to the source.
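
The rounding difference in (b) can be sketched as follows. This is only
an illustration: the typedef and helper names are local stand-ins, not
pixman's API, and it assumes 16.16 fixed point with sample i centered
at i + 0.5.

```c
#include <stdint.h>

/* Local stand-ins for 16.16 fixed point (assumed for this sketch). */
typedef int32_t fixed_16_16;
#define FIXED_E ((fixed_16_16) 1)          /* smallest positive increment */
#define FIXED_1 ((fixed_16_16) 1 << 16)    /* the value 1.0 */

/* Index of the sample picked by NEAREST: subtract epsilon, then truncate.
 * Sample i is centered at i + 0.5, so a coordinate of exactly 1.0 is
 * equidistant from samples 0 and 1; the epsilon breaks the tie downward.
 * Assumes x >= FIXED_E so the shifted value is never negative. */
static int
nearest_sample_index (fixed_16_16 x)
{
    return (int) ((x - FIXED_E) >> 16);
}

/* What plain truncation of the coordinate gives instead. */
static int
truncated_index (fixed_16_16 x)
{
    return (int) (x >> 16);
}
```

For a transformed value of exactly 1.0, nearest_sample_index() selects
sample 0 (centered at 0.5) while truncated_index() selects sample 1
(centered at 1.5), which is the mismatch described above.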

== 2. FAST_PATH_16BIT_SAFE is nonsensical

The FAST_PATH_16BIT_SAFE is a flag that indicates that various
computations can be safely done within a 16.16 fixed-point
variable. It was used by certain fast paths that relied on those
computations succeeding. The problem is that many other compositing
functions were making similar assumptions but not actually requiring
the flag to be set. Notably, all the general compositing functions
simply walk the source region using 16.16 variables. If the
transformation happens to overflow, strange things will happen.

So instead of computing this flag in certain cases, it is better to
simply detect that overflows will happen and not try to composite at
all in that case. This has the advantage that most compositing
functions can be written in a natural way.

It does have the disadvantage that we are giving up on some cases that
previously worked, but those are all corner cases where the areas
involved were very close to the limits of the coordinate
system. Relying on these working reliably was always a somewhat
dubious proposition. The most important case that might have worked
previously was untransformed compositing involving images larger than
32 bits. But even in those cases, if you had REPEAT_PAD or
REPEAT_REFLECT turned on, you would hit bits_image_fetch_transformed()
which has the 16 bit limitations.

== Fixes

This patch fixes both problems by introducing a new function called
analyze_extents() that has the responsibility to reject corner cases,
and to compute flags based on the extents.

It does this through a new compute_sample_extents() function that will
compute a conservative (but tight) approximation to the bounding box
of the samples that will actually be needed. By basing the computation
on the positions of the _sample_ locations in the destination, and by
taking the interpolation filter into account, it fixes problem one.

The same function is also used with a one-pixel expanded version of
the destination extents. By checking if the transformed bounding box
will overflow 16.16 fixed point, it fixes problem two.
2010-08-08 13:57:39 -04:00
Søren Sandmann Pedersen
5b289d39cf Extend scaling-crash-test in various ways
This extends scaling-crash-test to test some more things:

- All combinations of NEAREST/BILINEAR/CONVOLUTION filters and
  NORMAL/PAD/REFLECT repeat modes.

- Tests various scale factors very close to 1/7th such that the source
  area is very close to the edge of the source image.

- The same things, only with scale factors very close to 1/32767th.

- Enables the commented-out tests for accessing memory outside the
  source buffer.

Also there is now a border around the source buffer which has a
different color than the source buffer itself so that if we sample
outside, it will show up.

Finally, the test now allows the destination buffer to not be changed
at all. This allows pixman to simply bail out in cases where the
transformation is too strange.
2010-08-08 13:57:39 -04:00
Søren Sandmann Pedersen
71ff55a3e5 Fix Altivec/OpenBSD patch
As Brad pointed out, I pushed the wrong version of this patch.
2010-08-05 19:00:56 -04:00
Brad Smith
cb50e9cc95 Add support for AltiVec detection for OpenBSD/PowerPC.
Bug 29331.
2010-08-05 12:16:40 -04:00
Søren Sandmann Pedersen
664132128e CODING_STYLE: Delete the stuff about trailing spaces
Also fix various other minor issues.
2010-08-04 09:50:30 -04:00
Søren Sandmann Pedersen
cc9221ce96 If we bail out of do_composite, make sure to undo any workarounds.
The workaround for an old X bug has to be undone if we bail from
do_composite, so we can't just return.
2010-08-04 09:12:05 -04:00
Søren Sandmann Pedersen
b243a66041 Add x14r6g6b6 format to blitters-test 2010-08-04 08:58:51 -04:00
Marek Vasut
d6a7b15424 Add support for 32bpp X14R6G6B6 format.
This format is used on PXA framebuffer with some boards. It uses only 18 bits
from the 32 bit framebuffer to interpret color.

Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
2010-08-04 08:44:24 -04:00
Siarhei Siamashka
226a6df4f9 test: 'scaling-test' updated to provide better coverage
Negative scale factors are now also tested. A small additional
translate transform helps to stress the use of fractional
coordinates better.

Also, the default number of iterations has been increased to
compensate for the increased variety of operations tested.
2010-07-27 16:07:34 +03:00
Siarhei Siamashka
af3eeaeb13 test: 'scaling-crash-test' added
This test tries to exploit some corner cases and previously known
bugs in nearest neighbor scaling fast path code, attempting to
crash pixman or cause some other nasty effect.
2010-07-27 16:07:07 +03:00
Søren Sandmann Pedersen
90483fcabb bits: Fix potential divide-by-zero in projective code
If the homogeneous coordinate is 0, just set the coordinates to 0.
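
A minimal sketch of the guard described here, assuming 16.16 fixed
point inputs. The helper name and signature are hypothetical and only
illustrate the idea; they are not taken from the patch.

```c
#include <stdint.h>

/* Projective divide of a homogeneous point (x, y, w) in 16.16 fixed point.
 * Hypothetical helper for illustration: when w is 0 the result is defined
 * to be (0, 0) instead of dividing by zero. */
static void
project_point (int32_t x, int32_t y, int32_t w,
               int32_t *out_x, int32_t *out_y)
{
    if (w == 0)
    {
        *out_x = 0;
        *out_y = 0;
        return;
    }

    /* Widen to 64 bits before scaling so the intermediate cannot overflow.
     * The final narrowing assumes the quotient fits in 32 bits. */
    *out_x = (int32_t) (((int64_t) x * 65536) / w);
    *out_y = (int32_t) (((int64_t) y * 65536) / w);
}
```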
2010-07-23 19:16:43 -04:00
Søren Sandmann Pedersen
bf125fbbb7 [sse2] Add sse2_composite_add_n_8()
This shows up when epiphany displays the "ImageTest" on
glimr.rubyforge.org/cake/canvas.html
2010-07-22 07:39:49 -04:00
Søren Sandmann Pedersen
16ae3285e6 [sse2] Add sse2_composite_in_n_8()
This shows up when epiphany displays the "ImageTest" on
glimr.rubyforge.org/cake/canvas.html
2010-07-22 07:39:49 -04:00
Søren Sandmann Pedersen
e0b430a13e [sse2] Add sse2_composite_src_x888_8888()
This operation shows up when Firefox displays
http://dougx.net/plunder/plunder.html
2010-07-22 07:39:49 -04:00
Søren Sandmann Pedersen
16bae83475 [fast] Add fast_composite_src_x888_8888()
This shows up when Firefox displays http://dougx.net/plunder/plunder.html
2010-07-22 07:39:49 -04:00
M Joonas Pihlaja
9399b1a5af Fix thinko in configure.ac's macro to test linking.
Copy-paste carnage.  Renames save_{cflags,libs,ldflags} to
save_{CFLAGS,LIBS,LDFLAGS}.
2010-07-21 23:52:23 +03:00
M Joonas Pihlaja
5537e51cd0 Avoid trailing slashes on automake install dirs.
The install-sh on a Solaris box couldn't cope with
trailing slashes.
2010-07-21 23:52:23 +03:00
M Joonas Pihlaja
1d9c6fa623 Check for specific flags by actually trying to compile and link.
Instead of relying on preprocessor version checks to see if
some compiler flags are supported, actually try to compile and
link a test program with the flags.
2010-07-21 23:52:23 +03:00
M Joonas Pihlaja
d95ae70604 Check that the OpenMP pragmas don't cause link errors.
This patch adds extra guards around our use of
OpenMP pragmas and checks that the pragmas won't
cause link errors.  This fixes the build on
Tru64 and Solaris with the native compilers and clang.
2010-07-21 23:52:23 +03:00
M Joonas Pihlaja
eb247ac377 Don't trust OpenBSD's gcc to produce working code for __thread.
The gcc on OpenBSD 4.5 to 4.7 at least produces bad code for __thread,
without as much as a warning.

See PR #6410 "Using __thread TLS variables compiles ok but segfault at runtime."

http://cvs.openbsd.org/cgi-bin/query-pr-wrapper?full=yes&numbers=6410
2010-07-21 23:52:22 +03:00
M Joonas Pihlaja
dbf35f1f27 Try harder to find suitable flags for pthreads.
The flags -D_REENTRANT -lpthread work on more systems than
does -pthread unfortunately, so give that a go too.
2010-07-21 23:52:22 +03:00
Søren Sandmann Pedersen
9897bb4eee Check for read accessors before taking the bilinear fast path
The bilinear fast path accesses pixels directly, so if the image has a
read accessor, then it can't be used.
2010-07-13 15:46:21 -04:00
Søren Sandmann Pedersen
ce3d9fca73 fast-path: Some formatting fixes
Add spaces before parentheses; fix indentation in the macro.
2010-07-12 09:46:37 -04:00
Søren Sandmann Pedersen
839326e471 In the FAST_NEAREST macro call the function 8888_8888 and not x888_x888
The x888 suggests that they have something to do with the x8r8g8b8
formats, but that's not the case; they are assuming a8r8g8b8
formats. (Although in some cases they also work for x8r8g8b8 type
formats).
2010-07-12 09:46:37 -04:00
Søren Sandmann Pedersen
e13d9f9684 Make the repeat mode explicit in the FAST_NEAREST macro.
Before, it was 0 or 1 meaning 'no repeat' and 'normal repeat'
respectively. Now we explicitly pass in either NONE or NORMAL.
2010-07-12 09:46:37 -04:00
Søren Sandmann Pedersen
2e7fb66553 When converting indexed formats to 64 bits, don't correct for channel widths
Indexed formats are mapped to a8r8g8b8 with full precision, so when
expanding we shouldn't correct for the width of the channels
2010-07-11 09:43:56 -04:00
Søren Sandmann Pedersen
2df6dac0be test: Make sure the palettes for indexed format roundtrip properly
The palettes for indexed formats must satisfy the condition that if
some index maps to a color C, then the 15 bit version of that color
must map back to the index. This ensures that the destination operator
is always a no-op, which seems like a reasonable assumption to make.
2010-07-11 09:43:56 -04:00
Søren Sandmann Pedersen
5dd59c8b7c Split the fast path caching into its own force_inline function
The do_composite() function is a lot more readable this way.
2010-07-11 09:43:56 -04:00
Søren Sandmann Pedersen
98d19d9abd Cache the implementation along with the fast paths.
When calling a fast path, we need to pass the corresponding
implementation since it might contain information necessary to run the
fast path.
2010-07-11 09:43:55 -04:00
Søren Sandmann Pedersen
f18bcf1f6e Hide the global implementation variable behind a force_inline function.
Previously the global variable was called 'imp' which was confusing
with the argument to various other functions also being called imp.
2010-07-11 09:43:55 -04:00
Søren Sandmann Pedersen
5c935473d8 Fix memory leak in the pthreads thread local storage code
When a thread exits, we leak whatever is stored in thread local
variables, so install a destructor to free it.
2010-07-10 21:05:27 -04:00
Søren Sandmann Pedersen
7114b2d63b Make the combiner macros less likely to cause name collisions.
Protect the arguments to the combiner macros with parentheses, and
postfix their temporary variables with underscores to avoid name space
collisions with the surrounding code.

Alexander Shulgin pointed out that underscore-prefixed identifiers are
reserved for the C implementation, so we use postfix underscores
instead.
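
The two conventions can be sketched with a made-up macro. The macro
names below are hypothetical and are not among the real combiner
macros; they only show the hazards being avoided.

```c
#include <stdint.h>

/* Fragile: arguments are not parenthesized, and the temporary 't' can
 * collide with a variable of the same name at the call site. */
#define AVERAGE_FRAGILE(result, a, b)                           \
    do {                                                        \
        uint32_t t = a + b;                                     \
        result = t / 2;                                         \
    } while (0)

/* Safer: every argument is parenthesized and the temporary carries a
 * trailing underscore, which stays out of the reserved namespace. */
#define AVERAGE_SAFE(result, a, b)                              \
    do {                                                        \
        uint32_t t_ = (uint32_t) (a) + (uint32_t) (b);          \
        (result) = t_ / 2;                                      \
    } while (0)

/* The caller has a variable named 't' and passes an expression with a
 * low-precedence operator; both are handled correctly by AVERAGE_SAFE. */
static uint32_t
average_of_t_and_doubled (uint32_t t, uint32_t y)
{
    uint32_t r;

    AVERAGE_SAFE (r, t, y << 1);
    return r;
}
```

With AVERAGE_FRAGILE the same call would expand to "uint32_t t = t + y << 1",
which reads the new, uninitialized 't' and also shifts the whole sum.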
2010-07-07 06:50:45 -04:00
Søren Sandmann Pedersen
a92e4a6a94 Minor tweaks to README 2010-07-06 19:15:29 -04:00
Søren Sandmann Pedersen
ca846806cb Store the conical angle in floating point radians, not fixed point degrees
This is a slight simplification.
2010-06-24 14:56:09 -04:00
Søren Sandmann Pedersen
3074d57b56 Fix conical gradients to match QConicalGradient from Qt
Under the assumption that pixman gradients are supposed to match
QConicalGradient, described here:

        http://doc.trolltech.com/4.4/qconicalgradient.html

this patch fixes two separate bugs in pixman-conical-gradient.c.

The first bug is that the output of atan2() is in the range of [-pi,
pi], which means the parameter into the gradient can be negative. This
is wrong since a QConicalGradient always interpolates around the
center from 0 to 1. The fix for that is to (a) make sure the given
angle is between 0 and 360, and (b) add or subtract 2 * M_PI if the
computed angle ends up outside [0, 2 * pi].

The other bug is that we were interpolating clockwise, whereas
QConicalGradient calls for a counter-clockwise interpolation. This is
easily fixed by subtracting the parameter from 1.

Finally, this patch encapsulates the computation in a new force-inline
function so that it can be reused in both the affine and non-affine
case.
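
The two fixes can be sketched as below. This is an illustration under
assumptions, not the code from the patch: the function name is
hypothetical, 'theta' stands for the value returned by atan2 (y, x),
and adding the start angle (rather than subtracting it) is assumed.

```c
#define TWO_PI 6.283185307179586

/* Map an angle to the gradient parameter.
 * 'theta' is assumed to be the result of atan2 (y, x), i.e. in [-pi, pi];
 * 'start_angle' is assumed to be in radians in [0, 2 * pi]. */
static double
angle_to_parameter (double theta, double start_angle)
{
    double t = theta + start_angle;

    /* First fix: bring t back into [0, 2 * pi). */
    while (t < 0)
        t += TWO_PI;
    while (t >= TWO_PI)
        t -= TWO_PI;

    /* Second fix: scale to the unit range and subtract from 1 so that
     * the interpolation runs counter-clockwise. */
    return 1.0 - t / TWO_PI;
}
```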
2010-06-20 04:45:20 -04:00
Søren Sandmann Pedersen
66365b5ef1 Make separate gray scanline storers.
For gray formats the palettes are indexed by luminance, not RGB, so we
can't use the color storers for gray too.
2010-06-18 20:33:02 -04:00
Søren Sandmann Pedersen
4e1d4847c9 When storing a g1 pixel, store the lowest bit, rather than comparing with 0. 2010-06-18 20:33:02 -04:00
Andrea Canciani
445eb6385f test: verify that gradients do not crash pixman
Test gradients under particular conditions (no stops, all the stops
at the same offset) to check that pixman does not misbehave.
2010-06-09 17:30:41 +02:00
Andrea Canciani
de03202581 support single-stop gradients
Just like conical gradients, linear and radial gradients can now
have a single stop.
2010-06-09 17:30:41 +02:00
Søren Sandmann Pedersen
32bd31d677 Eliminate mask_bits from all the scanline fetchers.
Back in the day, the mask_bits argument was used to distinguish
between masks used for component alpha (where it was 0xffffffff) and
masks for unified alpha (where it was 0xff000000). In this way, the
fetchers could check if just the alpha channel was 0 and in that case
avoid fetching the source.

However, we haven't actually used it like that for a long time; it is
currently always either 0xffffffff or 0 (if the mask is NULL). It also
doesn't seem worthwhile resurrecting it because for premultiplied
buffers, if alpha is 0, then so are the color channels
normally.

This patch eliminates the mask_bits and changes the fetchers to just
assume it is 0xffffffff if mask is non-NULL.
2010-06-09 07:17:59 -04:00
Jeff Muizelaar
78778e5963 create getter for component alpha
This patch comes from the mozilla central tree. See
http://hg.mozilla.org/mozilla-central/rev/89338a224278 for the
original changeset.

Signed-off-by: Jeff Muizelaar <jmuizelaar@mozilla.com>
Signed-off-by: Egor Starkov <egor.starkov@nokia.com>
Signed-off-by: Rami Ylimaki <ext-rami.ylimaki@nokia.com>
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@nokia.com>
2010-06-03 23:18:16 +03:00
Robert Hooker
8814afc5fd Prepare changelog for upload 2010-05-14 13:06:08 -04:00
Robert Hooker
1773c6829c Bump changelogs. 2010-05-14 12:57:58 -04:00
Robert Hooker
c42434196b Merge branch 'upstream-experimental' into debian-experimental 2010-05-14 12:31:03 -04:00
Siarhei Siamashka
cfc4e38852 test: added OpenMP support for better utilization of multiple CPU cores
Some of the tests are quite heavy CPU users and may benefit from
using multiple CPU cores, so the programs from 'test' directory
are now built with OpenMP support. OpenMP is easy to use, portable
and also takes care of making a decision about how many threads
to spawn.
2010-05-13 21:04:55 +03:00
Siarhei Siamashka
f905ebb03d test: scaling-test updated to use new fuzzer_test_main() function 2010-05-13 21:04:36 +03:00
Siarhei Siamashka
be387701a5 test: blitters-test updated to use new fuzzer_test_main() function 2010-05-13 21:04:31 +03:00
Siarhei Siamashka
9ed9abd154 test: blitters-test-bisect.rb converted to perl
This new script can be used to run continuously to compare two test
programs based on fuzzer_test_main() function from 'util.c' and
narrow down to a single problematic test from the batch which results
in different behavior.
2010-05-13 21:03:07 +03:00
Siarhei Siamashka
30c3e91c3f test: main loop from blitters-test added as a new function to utils.c
This new generalized function can be reused in both blitters-test
and scaling-test. Final checksum calculation changed in order to make
it parallelizable (it is a sum of individual 32-bit values returned
by a callback function, which is now responsible for running test-specific
code). Return values may be crc32, some other hash or even just zero on
success and non-zero on error (in this case, the expected result of the
whole test run should be 0).
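
Why a plain sum parallelizes well can be sketched as follows. The
names and signatures here are hypothetical and do not reproduce the
function added to utils.c; the OpenMP pragma is only active when the
compiler enables OpenMP.

```c
#include <stdint.h>

/* Hypothetical callback type: runs test number 'testnum' and returns a
 * 32-bit value (a CRC, another hash, or 0 on success). */
typedef uint32_t (*test_callback_t) (int testnum);

/* Sum the callback results for tests in [begin, end). Unsigned addition
 * wraps modulo 2^32 and is commutative, so the total does not depend on
 * the order in which the individual tests run. */
static uint32_t
checksum_range (int begin, int end, test_callback_t callback)
{
    uint32_t sum = 0;
    int i;

#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum)
#endif
    for (i = begin; i < end; i++)
        sum += callback (i);

    return sum;
}

/* Stand-in for a real test body, used only to exercise the loop. */
static uint32_t
example_test (int testnum)
{
    return (uint32_t) testnum * 2654435761u;
}
```

Because the partial sums over [1, 3) and [3, 4) add up to the sum over
[1, 4), a batch can be split across threads, or bisected to find a
single failing test, without changing the expected total.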
2010-05-13 21:02:27 +03:00
Søren Sandmann Pedersen
872c915dcb Post-release version bump to 0.18.3 2010-05-12 16:33:35 -04:00
Søren Sandmann Pedersen
b48d8b5201 Pre-release version bump to 0.18.2 2010-05-12 16:27:02 -04:00
Søren Sandmann Pedersen
970c183c33 Add macros for thread local storage on MinGW 32
These macros are identical to the ones that Tor Lillqvist posted here:

    http://lists.freedesktop.org/archives/pixman/2010-April/000160.html

with one exception: the variable is allocated with calloc() and not
malloc().

Cc: tml@iki.fi
2010-05-12 16:15:42 -04:00
Søren Sandmann Pedersen
61ff1a3214 Don't use __thread on MinGW.
It is apparently broken. See this:

http://mingw-users.1079350.n2.nabble.com/gcc-4-4-multi-threaded-exception-handling-thread-specifier-not-working-td3440749.html

We'll need to support thread local storage on MinGW32 some other way.

Cc: tml@iki.fi
2010-05-12 16:15:41 -04:00
Søren Sandmann Pedersen
f973be464d Don't consider indexed formats opaque.
The indexed formats have 0 bits of alpha, but can't be considered
opaque because there may be non-opaque colors in the palette.
2010-05-12 16:15:41 -04:00
Jeff Muizelaar
34fb38554f Add missing HAVE_CONFIG_H guards for config.h inclusion 2010-05-12 16:15:41 -04:00
Søren Sandmann Pedersen
38928afaa1 Update README to mention the pixman mailing list 2010-05-12 16:15:41 -04:00
Søren Sandmann Pedersen
664984206d [mmx] Fix mask creation bugs
This line:

    mask = mask | mask >> 8 | mask >> 16 | mask >> 24;

only works when mask has 0s in the lower 24 bits, so add

     mask &= 0xff000000;

before.

Reported by Todd Rinaldo on the #cairo IRC channel.
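
The effect of the extra masking step can be checked in isolation. The
function names below are made up for this sketch; only the two
expressions come from the description above.

```c
#include <stdint.h>

/* Replicate the alpha byte (bits 24-31) into all four bytes. */
static uint32_t
expand_alpha (uint32_t mask)
{
    mask &= 0xff000000;   /* clear the lower 24 bits first */
    return mask | mask >> 8 | mask >> 16 | mask >> 24;
}

/* The same expression without the masking step, for comparison:
 * leftover color bits get OR-ed into the replicated bytes. */
static uint32_t
expand_alpha_unmasked (uint32_t mask)
{
    return mask | mask >> 8 | mask >> 16 | mask >> 24;
}
```

For the input 0x80123456, expand_alpha() returns 0x80808080, while the
unmasked variant returns a value polluted by the color channels.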
2010-05-12 16:15:41 -04:00
Søren Sandmann Pedersen
d197dc5e8d Fixes for pthread thread local storage.
The tls_name_key variable is passed to tls_name_get(), and the first
time this happens it isn't initialized. tls_name_get() then passes it
on to tls_name_alloc() which passes it on to pthread_setspecific()
leading to undefined behavior.

None of this is actually necessary at all because there is only one
such variable per thread local variable, so it doesn't need to be passed
as a parameter at all.

All of this was pointed out by Tor Lillqvist on the cairo mailing
list.
2010-05-12 16:15:40 -04:00
Søren Sandmann Pedersen
9babaab404 Fix uninitialized cache when pthreads are used
The thread local cache is allocated with malloc(), but we rely on it
being initialized to zero, so allocate it with calloc() instead.
2010-05-12 16:15:40 -04:00
Siddharth Agarwal
4fe0a40e75 Visual Studio 2010 includes stdint.h
Use the builtin version instead of defining the types ourselves.
2010-05-12 16:15:40 -04:00
Søren Sandmann Pedersen
9a46eddc92 Post-release version bump to 0.18.1 2010-05-12 16:15:40 -04:00
Julien Cristau
68b6e0e095 Prepare changelog for upload 2010-05-11 14:16:18 +02:00
Søren Sandmann Pedersen
164fe215f2 Merge branch 'for-master' 2010-05-09 14:24:24 -04:00
Julien Cristau
c6afb1f264 add bug closer 2010-05-08 17:23:17 +02:00
Julien Cristau
92ac0adbbf Drop pixman-arm-don-t-use-env-vars-to-get-hwcap-platform.patch, obsolete. 2010-05-08 17:19:53 +02:00
Julien Cristau
b24ef53fa7 rules: use find .. -delete instead of rm $(find ..) 2010-05-08 17:18:00 +02:00
Julien Cristau
df082450b1 Update symbols file for new API, bump shlibs. 2010-05-08 17:17:27 +02:00
Julien Cristau
a2009cec77 Bump changelogs 2010-05-08 17:06:51 +02:00
Julien Cristau
e91730b91b Merge branch 'upstream-experimental' into debian-experimental 2010-05-08 17:05:21 +02:00
Julien Cristau
1300217b90 Merge branch 'upstream-unstable' into upstream-experimental 2010-05-08 17:04:51 +02:00
Søren Sandmann Pedersen
e1594f204d test/gtk-utils: Set the size of the window to the size of the image 2010-05-06 01:05:40 +03:00
Jeff Muizelaar
2f4f2fb485 Add support for compiling pixman without thread/tls support 2010-05-04 11:55:30 -04:00
Søren Sandmann Pedersen
5158d6740c Add macros for thread local storage on MinGW 32
These macros are identical to the ones that Tor Lillqvist posted here:

    http://lists.freedesktop.org/archives/pixman/2010-April/000160.html

with one exception: the variable is allocated with calloc() and not
malloc().

Cc: tml@iki.fi
2010-05-03 11:12:43 +03:00
Søren Sandmann Pedersen
582fa58bba Don't use __thread on MinGW.
It is apparently broken. See this:

http://mingw-users.1079350.n2.nabble.com/gcc-4-4-multi-threaded-exception-handling-thread-specifier-not-working-td3440749.html

We'll need to support thread local storage on MinGW32 some other way.

Cc: tml@iki.fi
2010-05-03 11:12:24 +03:00
Søren Sandmann Pedersen
95d4026866 Add support for 8bpp to pixman_fill_sse2() 2010-05-03 10:59:36 +03:00
Søren Sandmann Pedersen
d539e0c661 sse2: Add sse2_composite_over_reverse_n_8888
This is a small speed-up for the poppler benchmark:

Before:
[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image                      poppler    4.443    4.474   0.31%    6/6

After:
[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image                      poppler    4.224    4.248   0.42%    6/6
2010-05-03 10:59:36 +03:00
Søren Sandmann Pedersen
2d65fb033b Don't consider indexed formats opaque.
The indexed formats have 0 bits of alpha, but can't be considered
opaque because there may be non-opaque colors in the palette.
2010-05-03 10:59:36 +03:00
Søren Sandmann Pedersen
19459672ce Add an over_8888_8888_8888 sse2 fast path. 2010-05-03 10:59:36 +03:00
Søren Sandmann Pedersen
a3d29157b4 Add pixman_region{,32}_intersect_rect() 2010-05-03 10:59:36 +03:00
Søren Sandmann Pedersen
c0d0d20bd2 Rename fast_composite_src_8888_x888 to fast_composite_src_memcpy()
Then generalize it and use it for SRC copying between various
identical formats.
2010-05-03 10:59:36 +03:00
Jeff Muizelaar
1f0cba3bdc Add missing HAVE_CONFIG_H guards for config.h inclusion 2010-04-27 15:23:20 -04:00
Søren Sandmann Pedersen
526132fa65 Remove alphamap from the GTK+ part of tests/Makefile.am
It doesn't use GTK+ and it was already listed in the non-GTK+ part.
2010-04-22 12:14:23 -04:00
Søren Sandmann Pedersen
8f7cc5e438 Add pixman_image_get_format() accessor 2010-04-21 09:59:29 -04:00
Søren Sandmann Pedersen
2b1cae1ef6 Some minor updates to README 2010-04-21 09:55:35 -04:00
Søren Sandmann Pedersen
15f5868f63 Update README to mention the pixman mailing list 2010-04-18 16:24:39 -04:00
Søren Sandmann Pedersen
a652d5c154 [mmx] Fix mask creation bugs
This line:

    mask = mask | mask >> 8 | mask >> 16 | mask >> 24;

only works when mask has 0s in the lower 24 bits, so add

     mask &= 0xff000000;

before.

Reported by Todd Rinaldo on the #cairo IRC channel.
2010-04-13 22:41:48 -04:00
Søren Sandmann Pedersen
714559dccd Fixes for pthread thread local storage.
The tls_name_key variable is passed to tls_name_get(), and the first
time this happens it isn't initialized. tls_name_get() then passes it
on to tls_name_alloc() which passes it on to pthread_setspecific()
leading to undefined behavior.

None of this is actually necessary at all because there is only one
such variable per thread local variable, so it doesn't need to be passed
as a parameter at all.

All of this was pointed out by Tor Lillqvist on the cairo mailing
list.
2010-04-13 22:41:48 -04:00
Søren Sandmann Pedersen
634ba33b5b Fix uninitialized cache when pthreads are used
The thread local cache is allocated with malloc(), but we rely on it
being initialized to zero, so allocate it with calloc() instead.
2010-04-13 22:41:47 -04:00
Siddharth Agarwal
bc11545a1b Visual Studio 2010 includes stdint.h
Use the builtin version instead of defining the types ourselves.
2010-04-13 10:15:29 -04:00
Søren Sandmann Pedersen
0345c343e5 Post-release version bump to 0.19.1 2010-04-01 06:21:21 -04:00
Søren Sandmann Pedersen
e9dc568d6f Pre-release version bump to 0.18.0 2010-04-01 05:23:31 -04:00
Matthias Hopf
efd41c6287 Revert "Improve PIXREGION_NIL to return true on degenerated regions."
This reverts commit ebba149313.
Scheduled for re-discussion after stable 0.18 has been released.
2010-03-24 18:54:29 +01:00
Matthias Hopf
ebba149313 Improve PIXREGION_NIL to return true on degenerated regions.
Fixes Novell bug 568811.
2010-03-24 14:51:05 +01:00
Søren Sandmann Pedersen
c0f8d417b5 Post-release version bump to 0.17.15 2010-03-23 17:25:54 -04:00
Søren Sandmann Pedersen
b35f0b0158 Pre-release version bump to 0.17.14 2010-03-23 16:52:02 -04:00
Søren Sandmann Pedersen
27a9f0468b Merge remote branch 'ssvb/arm-fixes' 2010-03-23 11:00:04 -04:00
Siarhei Siamashka
3ef203331f ARM: SIMD optimizations moved to a separate .S file
This should be the last step in providing full armv4t compatibility
with CPU features runtime autodetection in pixman.
2010-03-22 21:56:17 +02:00
Siarhei Siamashka
0a0591c2f7 ARM: SIMD optimizations updated to use common assembly calling conventions 2010-03-22 20:17:14 +02:00
Siarhei Siamashka
c1e8d4533a ARM: Helper ARM NEON assembly binding macros moved into a separate header
This is needed for future reuse of the same macros for the other
ARM assembly optimizations (armv4t, armv6)
2010-03-22 18:51:54 +02:00
Siarhei Siamashka
5791026e45 ARM: Workaround for a NEON bug in assembler from binutils 2.18
The problem was reported as bug 25534 against pixman in
freedesktop.org bugzilla. Link to a patch for binutils:
http://sourceware.org/ml/binutils/2008-03/msg00260.html

For pixman the impact is a build failure when using
binutils 2.18. Versions 2.19 and higher are fine. Still
some distros may be using older versions of binutils and
this is causing problems.

This patch works around the problem by replacing a problematic
"vmov a, b" instruction with equivalent "vorr a, b, b". Actually
they even map to the same instruction opcode in the generated
code, so the resulting binary is identical with and without patch.
2010-03-22 16:15:18 +02:00
Siarhei Siamashka
68d8d83223 ARM: Use '.object_arch' directive in NEON assembly file
This can be used to override the architecture recorded in the EABI object
attribute section. We set a minimum arch to 'armv4'. Binutils documentation
recommends to use this directive with the code performing runtime detection
of CPU features.

Additionally NEON/VFP EABI attributes are suppressed. And the instruction
set to use is explicitly set to '.arm'.

Configure test for NEON support is also updated to include a bunch of
these new directives (if any of these is unsupported by the assembler,
it is better to fail configure test than to fail library build).

All these changes are required to fix SIGILL problem on armv4t, reported in
http://lists.freedesktop.org/archives/pixman/2010-March/000123.html
2010-03-22 12:12:03 +02:00
Jon TURNEY
69f1ec9a78 Avoid a potential division-by-zero exception in window-test
Avoid a division-by-zero exception if the first number returned by
rand() is a multiple of 500, causing us to create a zero width pixmap,
and then attempt to use get_rand(0) when generating a random stride...

Fixes https://bugs.freedesktop.org/attachment.cgi?id=34162
2010-03-17 20:25:25 -04:00
Søren Sandmann Pedersen
50713d9d0d Post-release version bump to 0.17.13 2010-03-17 15:12:06 -04:00
Søren Sandmann Pedersen
fb68d6c14d Pre-release version bump to 0.17.12 2010-03-17 13:46:44 -04:00
Søren Sandmann Pedersen
265ea1fb4d Specialize the fast_composite_scaled_nearest_* scalers to positive x units
This avoids a test in the inner loop, which improves performance
especially for tiled sources.

On x86-32, I get these results:

Before:
op=1, src_fmt=20028888, dst_fmt=20028888, speed=306.96 MPix/s (73.18 FPS)
op=1, src_fmt=20028888, dst_fmt=10020565, speed=102.67 MPix/s (24.48 FPS)
op=1, src_fmt=10020565, dst_fmt=10020565, speed=324.85 MPix/s (77.45 FPS)

After:
op=1, src_fmt=20028888, dst_fmt=20028888, speed=332.19 MPix/s (79.20 FPS)
op=1, src_fmt=20028888, dst_fmt=10020565, speed=110.41 MPix/s (26.32 FPS)
op=1, src_fmt=10020565, dst_fmt=10020565, speed=363.28 MPix/s (86.61 FPS)
2010-03-17 11:14:20 -04:00
Søren Sandmann Pedersen
9cd1051523 Add a FAST_PATH_X_UNIT_POSITIVE flag
This is the common case for a lot of transformed images. If the unit
were negative, the transformation would be a reflection which is
fairly rare.
2010-03-17 11:03:05 -04:00
Alexander Larsson
a5b51bb03c Use the right format for the OVER_8888_565 fast path 2010-03-17 11:03:05 -04:00
Alexander Larsson
3b92b711d0 Add specialized fast nearest scalers
This is a macroized version of SRC/OVER repeat normal/unneeded nearest
neighbour scaling instantiated for some common 8888 and 565 formats.

Based on work by Siarhei Siamashka
2010-03-17 11:03:05 -04:00
Alexander Larsson
5750408e48 Add FAST_PATH_SAMPLES_COVER_CLIP and FAST_PATH_16BIT_SAFE
FAST_PATH_SAMPLES_COVER_CLIP:

This is set if the source sample grid, unrepeated but transformed,
completely covers the clip destination. If this is set
you can use a simple scaler that doesn't have to care about the repeat
mode.

FAST_PATH_16BIT_SAFE:

This signifies two things:
1) The size of the src/mask fits in a 16.16 fixed point, so something like:

    max_vx = src_image->bits.width << 16;

    Is allowed and is guaranteed to not overflow max_vx

2) When stepping the source space we're guaranteed to never overflow
   a 16.16 fixed point variable, even if we step one extra step
   in the destination space. This means that a loop doing:

   x = vx >> 16;
   vx += unit_x;
   d = src_row[x];

   will never overflow vx causing x to be negative.

   And additionally, if you track vx like above and apply NORMAL repeat
   after the vx addition with something like:

   while (vx >= max_vx) vx -= max_vx;

   This will never overflow the vx even on the final increment that
   takes vx one past the end of where we will read, which makes the
   repeat loop safe.
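
Put together, the loop shape this flag is meant to protect looks
roughly like the sketch below. The function name and signature are
hypothetical; the asserts spell out the assumptions (source width
below 32768, start position inside the source, non-negative step).

```c
#include <assert.h>
#include <stdint.h>

/* Nearest-neighbour scaling of one row with NORMAL repeat, stepping
 * through the source in 16.16 fixed point. */
static void
scale_row_nearest_normal (const uint32_t *src_row, int src_width,
                          uint32_t *dst, int dst_width,
                          int32_t vx, int32_t unit_x)
{
    int32_t max_vx;
    int i;

    /* The width must fit in 16.16 so that the shift cannot overflow. */
    assert (src_width > 0 && src_width < 32768);
    max_vx = (int32_t) src_width << 16;
    assert (vx >= 0 && vx < max_vx && unit_x >= 0);

    for (i = 0; i < dst_width; i++)
    {
        int x = vx >> 16;

        /* This addition is what must not overflow, including on the
         * final iteration where the result is never read. */
        vx += unit_x;
        while (vx >= max_vx)
            vx -= max_vx;

        dst[i] = src_row[x];
    }
}
```

With a four-pixel source and unit_x equal to 0.5 in fixed point
(0x8000), each source pixel is written twice; with unit_x equal to 2.0
the loop wraps around and reads pixels 0, 2, 0, 2.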
2010-03-17 11:03:05 -04:00
Alexander Larsson
cba6fbbddc Add FAST_PATH_NO_NONE_REPEAT flag 2010-03-17 11:03:05 -04:00
Alexander Larsson
7ec023ede1 Add CONVERT_8888_TO_8888 and CONVERT_0565_TO_0565 macros
These are useful for macroization
2010-03-17 11:03:05 -04:00
Alexander Larsson
c903d03052 Add CONVERT_0565_TO_8888 macro
This lets us simplify some fast paths since we get a consistent
naming that always has 8888 and gets some value for alpha.
2010-03-17 11:03:05 -04:00
Søren Sandmann Pedersen
de27f45ddd Ensure that only the low 4 bits of 4 bit pixels are stored.
In some cases we end up trying to use the STORE_4 macro with 8 bit
values, which resulted in other pixels getting overwritten. Fix this
by always masking off the low 4 bits.

This fixes blitters-test on big-endian machines.
2010-03-17 11:02:58 -04:00
Søren Sandmann Pedersen
6532f8488a Fix contact address in configure.ac 2010-03-16 14:58:18 -04:00
Søren Sandmann Pedersen
7c9f121efe Add PIXMAN_DEFINE_THREAD_LOCAL() and PIXMAN_GET_THREAD_LOCAL() macros
These macros hide the various types of thread local support. On Linux
and Unix, they expand to just __thread. On Microsoft Visual C++, they
expand to __declspec(thread).

On OS X and other systems that don't have __thread, they expand to a
complicated concoction that uses pthread_once() and
pthread_get/set_specific() to get thread local variables.
2010-03-16 14:58:12 -04:00
Søren Sandmann Pedersen
6b9c548200 Add checks for various types of thread local storage.
OS X does not support __thread, so we have to check for it before
using it.  It does however support pthread_get/setspecific(), so if we
don't have __thread, check if those are available.
2010-03-16 12:01:51 -04:00
Alan Coopersmith
313353f1fb Add Sun cc to thread-local support checks in pixman-compiler.h
Clears '#warning: "unknown compiler"' messages when building

Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
2010-03-15 15:20:23 -07:00
Alan Coopersmith
b67f784a5d Make .s target asm flag selection more portable
The previous code worked in GNU make, but caused a syntax error in Solaris
make ( https://bugs.freedesktop.org/show_bug.cgi?id=27062 ) - this seems to
work in both, and should hopefully not cause syntax errors in any versions
of make not supporting the macro-substitution-in-macro-name feature, just
cause the macro to expand to nothing.

Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
2010-03-15 10:52:20 -07:00
Søren Sandmann Pedersen
7a5dc74785 Fix typo: WORDS_BIG_ENDIAN => WORDS_BIGENDIAN in pixman-edge.c
Pointed out by Andreas Falkenhahn on the cairo mailing list.
2010-03-15 07:40:46 -04:00
Søren Sandmann Pedersen
ff30a5cbb9 test: Add support for indexed formats to blitters-test
These formats work fine, they just need to have a palette set.
2010-03-14 12:25:17 -04:00
Søren Sandmann Pedersen
2b5f7be6c0 pixman.h: Only define stdint types when PIXMAN_DONT_DEFINE_STDINT is undefined
In SPICE, with Microsoft Visual C++, pixman.h is included after
another file that defines these types, which causes warnings and
errors.

This patch allows such code to just define PIXMAN_DONT_DEFINE_STDINT
to use its own version of those types.
2010-03-14 12:24:50 -04:00
Søren Sandmann Pedersen
f4da05c9f9 Merge branch 'operator-table' 2010-03-14 12:12:05 -04:00
Søren Sandmann Pedersen
a12d868df8 Merge branch 'fast-path-cache' 2010-03-14 12:12:00 -04:00
Søren Sandmann Pedersen
f534509d00 Change operator table to be an array of arrays of four bytes.
This makes gcc generate slightly better code for optimize_operator.
2010-03-14 12:11:48 -04:00
Søren Sandmann Pedersen
94d75ebd21 Strength reduce certain conjoint/disjoint to their normal counterparts.
This allows us to not test for them later on.
2010-03-14 12:11:47 -04:00
Søren Sandmann Pedersen
58be9c71d2 Store the operator table more compactly.
The four cases for each operator:

    none-are-opaque, src-is-opaque, dest-is-opaque, both-are-opaque

are packed into one uint32_t per operator. The relevant strength
reduced operator can then be found by packing the source-is-opaque and
dest-is-opaque into two bits and shifting that number of bytes.

Chris Wilson pointed out a bug in the original version of this commit:
dest_is_opaque and source_is_opaque were used as booleans, but their
actual values were the results of a logical AND with the
FAST_PATH_OPAQUE flag, so the shift value was wildly wrong.

The only reason it actually passed the test suite (on x86) was that
the compiler computed the shift amount in the cl register, and the low
byte of FAST_PATH_OPAQUE happens to be 0, so no shifting actually took
place, and the original operator was returned.
2010-03-14 12:11:47 -04:00
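The packing described in this commit can be sketched as below. The operator numbering, the flag value, and the function name are hypothetical; only the idea (one byte per opacity case, selected by a two-bit index) comes from the message. Note how the flag tests are normalized to 0 or 1 before being used in the shift, which is the bug Chris Wilson pointed out.

```c
#include <stdint.h>

/* Hypothetical operator numbering, for illustration only. */
enum { OP_CLEAR, OP_SRC, OP_DST, OP_OVER, OP_IN, N_OPS };

/* Hypothetical flag whose low byte is zero. */
#define FLAG_IS_OPAQUE (1u << 13)

/* One byte per case: none-are-opaque, src-is-opaque,
 * dest-is-opaque, both-are-opaque. */
#define PACK(none, src, dest, both)                             \
    ((uint32_t) (none) | ((uint32_t) (src) << 8) |              \
     ((uint32_t) (dest) << 16) | ((uint32_t) (both) << 24))

static const uint32_t operator_table[N_OPS] =
{
    [OP_CLEAR] = PACK (OP_CLEAR, OP_CLEAR, OP_CLEAR, OP_CLEAR),
    [OP_SRC]   = PACK (OP_SRC,   OP_SRC,   OP_SRC,   OP_SRC),
    [OP_DST]   = PACK (OP_DST,   OP_DST,   OP_DST,   OP_DST),
    [OP_OVER]  = PACK (OP_OVER,  OP_SRC,   OP_OVER,  OP_SRC),
    [OP_IN]    = PACK (OP_IN,    OP_IN,    OP_SRC,   OP_SRC),
};

static int
reduce_operator (int op, uint32_t src_flags, uint32_t dest_flags)
{
    /* Normalize to 0/1; using the raw AND result here would give a
     * wildly wrong shift amount. */
    unsigned src_opaque  = (src_flags  & FLAG_IS_OPAQUE) != 0;
    unsigned dest_opaque = (dest_flags & FLAG_IS_OPAQUE) != 0;
    unsigned shift = 8 * (src_opaque | (dest_opaque << 1));

    return (int) ((operator_table[op] >> shift) & 0xff);
}
```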
Søren Sandmann Pedersen
7fe35f0e6b Make the operator strength reduction constant time.
By extending the operator information table to cover all operators we
can replace the loop with a table look-up. At the same time, base the
operator optimization on the computed flags rather than the ones in
the image struct.

Finally, as an extra optimization, we no longer ignore the case where
there is a mask. Instead we consider the source opaque if both source
and mask are opaque, or if the source is opaque and the mask is
missing.
2010-03-14 12:11:47 -04:00
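The rule for treating the source as opaque in the presence of a mask can be written as a small predicate. This is a sketch under assumptions: the flag name and value, the function name, and the use of a NULL pointer to mean "no mask" are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value. */
#define SKETCH_IS_OPAQUE (1u << 0)

/* The source counts as opaque if it is opaque itself and the mask is
 * either missing (NULL) or opaque as well. */
static int
source_counts_as_opaque (uint32_t src_flags, const uint32_t *mask_flags)
{
    if (!(src_flags & SKETCH_IS_OPAQUE))
        return 0;

    return mask_flags == NULL || (*mask_flags & SKETCH_IS_OPAQUE) != 0;
}
```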
Loïc Minier
18f0de452d ARM: SIMD: Try without any CFLAGS before forcing -mcpu=
http://bugs.launchpad.net/bugs/535183
2010-03-14 13:15:34 +02:00
Egor Starkov
9335408613 Eliminate trailing comma in enum
https://bugs.freedesktop.org/show_bug.cgi?id=27050

Pixman is not compiling with c++ compiler. During compilation it gives
the following error:

/usr/include/pixman-1/pixman.h:335: error: comma at end of enumerator list

Signed-off-by: Søren Sandmann Pedersen <ssp@redhat.com>
2010-03-12 10:50:18 -05:00
Søren Sandmann Pedersen
54e39e0038 Add a fast path cache
This patch adds a cache in front of the fast path tables to reduce the
overhead of pixman_composite(). It is fixed size with move-to-front to
make sure the most popular fast paths are at the beginning of the cache.

The cache is thread local to avoid locking.
2010-03-06 11:58:02 -05:00
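A fixed-size, thread-local, move-to-front cache can be sketched as follows. This is not the code from the patch: the entry layout (an integer key and value), the cache size, and all names are assumptions chosen to keep the example short.

```c
#include <stddef.h>

#if defined(_MSC_VER)
#  define SKETCH_TLS __declspec(thread)
#elif defined(__GNUC__)
#  define SKETCH_TLS __thread
#else
#  define SKETCH_TLS  /* no thread-local support assumed */
#endif

#define SKETCH_CACHE_SIZE 8

typedef struct
{
    int key;    /* hypothetical stand-in for (op, formats, flags) */
    int value;  /* hypothetical stand-in for the fast path found */
    int valid;
} sketch_entry_t;

/* One cache per thread, so no locking is needed. */
static SKETCH_TLS sketch_entry_t sketch_cache[SKETCH_CACHE_SIZE];

/* Returns 1 on a hit and moves the hit entry to the front. */
static int
sketch_cache_lookup (int key, int *value)
{
    int i;

    for (i = 0; i < SKETCH_CACHE_SIZE && sketch_cache[i].valid; i++)
    {
        if (sketch_cache[i].key == key)
        {
            sketch_entry_t hit = sketch_cache[i];

            /* Shift the entries in front of the hit back by one. */
            for (; i > 0; i--)
                sketch_cache[i] = sketch_cache[i - 1];

            sketch_cache[0] = hit;
            *value = hit.value;
            return 1;
        }
    }

    return 0;
}

/* Inserts at the front; the last entry falls off the end. */
static void
sketch_cache_insert (int key, int value)
{
    int i;

    for (i = SKETCH_CACHE_SIZE - 1; i > 0; i--)
        sketch_cache[i] = sketch_cache[i - 1];

    sketch_cache[0].key = key;
    sketch_cache[0].value = value;
    sketch_cache[0].valid = 1;
}
```

On a miss the caller would fall back to the full table search and then call sketch_cache_insert() with the result.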
Søren Sandmann Pedersen
84b009ae9f Post-release version bump to 0.17.11 2010-03-05 20:40:41 -05:00
Søren Sandmann Pedersen
14fd287efb Pre-release version bump to 0.17.10 2010-03-05 20:06:08 -05:00
Søren Sandmann Pedersen
bd9934551f Move __force_align_arg_pointer workaround before composite32()
Since otherwise the workaround won't take effect when you call
pixman_image_composite32() directly.
2010-03-04 04:15:44 -05:00
Søren Sandmann Pedersen
14bb054d96 Merge branch 'more-flags' 2010-03-04 02:30:22 -05:00
Søren Sandmann Pedersen
9a8e404d44 test: Remove obsolete comment 2010-03-03 13:37:20 -05:00
Siarhei Siamashka
182e4c2635 ARM: added 'neon_composite_over_reverse_n_8888' fast path
This fast path function improves performance of 'poppler' cairo-perf trace.

Benchmark from ARM Cortex-A8 @720MHz

before:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image                      poppler   38.986   39.158   0.23%    6/6

after:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image                      poppler   24.981   25.136   0.28%    6/6
2010-03-03 19:43:00 +02:00
Siarhei Siamashka
072a7d31a8 ARM: added 'neon_composite_src_x888_8888' fast path
This fast path function improves performance of 'gnome-system-monitor'
cairo-perf trace.

Benchmark from ARM Cortex-A8 @720MHz

before:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image         gnome-system-monitor   68.838   68.899   0.05%    5/6

after:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image         gnome-system-monitor   53.336   53.384   0.09%    6/6
2010-03-03 19:42:34 +02:00
Siarhei Siamashka
2ed7c13922 ARM: added 'neon_composite_over_n_8888_8888_ca' fast path
This fast path function improves performance of 'firefox-talos-gfx'
cairo-perf trace.

Benchmark from ARM Cortex-A8 @720MHz

before:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image            firefox-talos-gfx  139.969  141.176   0.35%    6/6

after:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image            firefox-talos-gfx  111.810  112.196   0.23%    6/6
2010-03-03 19:42:29 +02:00
Søren Sandmann Pedersen
3db76b9004 Restructure the flags computation in compute_image_info().
Restructure the code to use switches instead of ifs. This saves a few
comparisons and makes the code slightly easier to follow. Also add some
comments.
2010-02-24 23:23:52 -05:00
Søren Sandmann Pedersen
ac44db3340 Move workaround code to pixman-image.c
It is more natural to put it where all the other flags are computed.
2010-02-24 23:20:28 -05:00
Søren Sandmann Pedersen
35af45d5e3 Turn need_workaround into another flag.
Instead of storing it as a boolean in the image struct, just use
another flag for it.
2010-02-24 23:20:28 -05:00
Søren Sandmann Pedersen
f27f17ce22 Eliminate _pixman_image_is_opaque() in favor of a new FAST_PATH_IS_OPAQUE flag
The new FAST_PATH_IS_OPAQUE flag is computed along with the others in
_pixman_image_validate().
2010-02-24 23:20:27 -05:00
Søren Sandmann Pedersen
2a6ba862ab Eliminate _pixman_image_is_solid()
Instead of calling this function in compute_image_info(), just do the
relevant checks when the extended format is computed.

Move computation of solidness to validate
2010-02-24 23:20:27 -05:00
Søren Sandmann Pedersen
45006e5e64 Move computation of extended format code to validate.
Instead of computing the extended format on every composite, just
compute it once and store it in the image.
2010-02-24 23:20:27 -05:00
Søren Sandmann Pedersen
fb0096a282 Add new FAST_PATH_SIMPLE_REPEAT flag
This flag indicates that the image is untransformed and
repeating. Such images can be composited quickly by simply repeating
the composite operation.
2010-02-24 23:20:27 -05:00
Søren Sandmann Pedersen
a7ad9c7c9d Compute the image flags at validation time instead of composite time
Instead of computing all the image flags at composite time, we compute
them once in _pixman_image_validate() and cache them in the image.
2010-02-24 23:20:27 -05:00
Søren Sandmann Pedersen
7bc4cd42c3 RELEASING: Update the release instructions. 2010-02-24 22:10:24 -05:00
Søren Sandmann Pedersen
7392a350f2 Post-release version bump 2010-02-24 22:02:13 -05:00
Søren Sandmann Pedersen
4d1c216af3 Pre-release version bump 2010-02-24 21:52:30 -05:00
Søren Sandmann Pedersen
e0f1d84107 Merge branch 'trap-fixes' 2010-02-24 21:01:29 -05:00
Søren Sandmann Pedersen
16ef3ab230 Add a1-trap-test
When a trapezoid sample point is exactly on a polygon edge, the rule
is that it is considered inside the trapezoid if the edge is a top or
left edge, but outside for bottom and right edges.

This program tests that for a1 trapezoids.
2010-02-24 21:01:24 -05:00
Søren Sandmann Pedersen
ad5cbba4c0 Hide the C++ extern "C" declarations behind macros.
That way they don't confuse the indenting algorithm in editors such as
Emacs.
2010-02-21 02:07:32 -05:00
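A sketch of the macro approach, with hypothetical SKETCH_ names rather than the ones used in pixman.h. Because the braces live inside the macros, an editor sees no open brace around the declarations and does not indent them.

```c
#ifdef __cplusplus
#  define SKETCH_BEGIN_DECLS extern "C" {
#  define SKETCH_END_DECLS }
#else
#  define SKETCH_BEGIN_DECLS
#  define SKETCH_END_DECLS
#endif

SKETCH_BEGIN_DECLS

/* Declarations stay at column zero. */
int sketch_add_one (int x);

SKETCH_END_DECLS

int
sketch_add_one (int x)
{
    return x + 1;
}
```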
Søren Sandmann Pedersen
14f201dc47 Merge branch 'eliminate-composite'
Conflicts:
	pixman/pixman-sse2.c
2010-02-20 13:09:01 -05:00
Søren Sandmann Pedersen
94f585916a Move all code to do debugging spew into pixman-private.
Rather than the region code having its own little debug system, move
all of it into pixman-private where there are already return_if_fail()
macros etc. These macros are now enabled in development snapshots and
nowhere else. Previously they were never enabled unless you modified
the code.

At the same time, remove all the asserts from the region code since we
can never turn them on anyway, and replace them with
critical_if_fail() macros that will print spew to standard error when
DEBUG is defined.

Finally, also change the debugging spew in pixman-bits-image.c to use
return_val_if_fail() instead of its own fprintf().
2010-02-20 11:57:58 -05:00
Alexander Larsson
f32d585069 Test pixman_region32_init_from_image in region-test 2010-02-19 11:25:41 +01:00
Alexander Larsson
48ef4befd8 Add pixman_region{32}_init_from_image
This creates a region from an image in PIXMAN_a1 format.
2010-02-19 11:25:41 +01:00
Alexander Larsson
5dee05fcab Move SCREEN_SHIFT_LEFT/RIGHT to pixman-private.h
This is needed for later use in other code.
2010-02-19 11:25:41 +01:00
Makoto Kato
61f4ed9c7a Compile by USE_SSE2 only without USE_MMX
Although we added MMX emulation for Microsoft Visual C++ compiler for x64,
USE_SSE2 still requires USE_MMX.  So we remove dependency of USE_MMX
for Windows x64.

Signed-off-by: Makoto Kato <m_kato@ga2.so-net.ne.jp>
2010-02-18 13:09:08 -05:00
Søren Sandmann Pedersen
6b2da683de Move NULL check out of get_image_info()
The NULL check is only necessary for masks, so there is no reason to
do it for destinations and sources.
2010-02-14 21:45:25 -05:00
Søren Sandmann Pedersen
1dd8744f40 Add a fast path for non-repeating sources in walk_region_internal().
In the common case where there is no repeating, the loop in
walk_region_internal() reduces to just walking the boxes involved
and calling the composite function.
2010-02-14 21:45:25 -05:00
Søren Sandmann Pedersen
362a9f564a Move more things out of the inner loop in do_composite().
Specifically,

- the src_ and mask_repeat computations

- the check for whether the involved images cover the composite
  region.
2010-02-14 21:45:25 -05:00
Søren Sandmann Pedersen
129d9c1871 Move region computation out of the loop in do_composite()
We only need to compute the composite region once, not on every
iteration.
2010-02-14 21:45:25 -05:00
Søren Sandmann Pedersen
4c185503d2 Move get_image_info() out of the loop in do_composite
The computation of image formats and flags is invariant to the loop,
so it can all be moved out.
2010-02-14 21:45:13 -05:00
Søren Sandmann Pedersen
81b7d7b180 Manually inline _pixman_run_fast_path()
Move all of the code into do_composite().
2010-02-14 21:15:48 -05:00
Søren Sandmann Pedersen
e914cccb24 Move compositing functionality from pixman-utils.c into pixman.c
_pixman_run_fast_path() and pixman_compute_composite_region() are both
moved to pixman-image, since at this point that's the only place they
are being called from.
2010-02-14 20:43:25 -05:00
Søren Sandmann Pedersen
0eeb197599 Move compositing to its own function, do_composite() 2010-02-14 11:12:41 -05:00
Søren Sandmann Pedersen
f831552bce Optimize for the common case wrt. the workaround.
In the common case no images need the workaround, so we check for that
first, and only if an image does need a workaround do we check which
one of the images actually needs it.
2010-02-14 11:12:41 -05:00
Søren Sandmann Pedersen
fa4df6225d Eliminate all the composite methods.
They are no longer necessary because we will just walk the fast path
tables, and the general composite path is treated as another fast
path.

This unfortunately means that sse2_composite() can no longer be
responsible for realigning the stack to 16 bytes, so we have to move
that to pixman_image_composite().
2010-02-14 11:12:38 -05:00
Søren Sandmann Pedersen
c3d7b51255 Delete unused _pixman_walk_composite_region() function 2010-02-14 11:12:20 -05:00
Søren Sandmann Pedersen
488480301c Don't call _pixman_implementation_composite() anymore.
Instead just call _pixman_run_fast_path(). Since we view
general_composite() as a fast path now, we know that it will find
*some* compositing routine.
2010-02-14 11:12:20 -05:00
Søren Sandmann Pedersen
06ae5ed597 Delete unused sources_cover() function 2010-02-14 11:12:20 -05:00
Søren Sandmann Pedersen
543a04a3bb Store a pointer to the array of fast paths in the implementation struct.
Also add an empty fast path table to the vmx implementation, so that
we can be sure the pointer is never NULL.
2010-02-14 11:12:16 -05:00
Søren Sandmann Pedersen
376f2a3f85 Make fast_composite_scaled_nearest() another fast path.
This requires another couple of flags

     FAST_PATH_SCALE_TRANSFORM
     FAST_PATH_NEAREST_FILTER
2010-02-14 11:10:39 -05:00
Søren Sandmann Pedersen
87430cfc35 Make general_composite_rect() just another fast path.
We introduce a new PIXMAN_OP_any fake operator and a PIXMAN_any fake
format that match anything. Then general_composite_rect() can be used
as another fast path.

Because general_composite_rect() does not require the sources to cover
the clip region, we add a new flag FAST_PATH_COVERS_CLIP which is part
of the set of standard flags for fast paths.

Because this flag cannot be computed until after the clip region is
available, we have to call pixman_compute_composite_region32() before
checking for fast paths. This will resolve itself when we get to the
point where _pixman_run_fast_path() is only called once per composite
operation.
2010-02-14 11:10:15 -05:00
Søren Sandmann Pedersen
d7e281e0a1 Post-release version bump 2010-02-13 18:23:34 -05:00
Søren Sandmann Pedersen
9bcadc3408 Pre-release version bump 2010-02-13 18:12:32 -05:00
Søren Sandmann Pedersen
97a1245739 Once unrolled version of fast_path_composite_nearest_scaled()
Separate out the fetching and combining code in two inline
functions. Then do two pixels per iteration.
2010-02-13 17:31:11 -05:00
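The structure described here (separate fetch and combine helpers, two pixels per iteration) can be sketched for a single scanline as below. This is a simplified illustration, not the function from the commit: it assumes 16.16 fixed-point source coordinates, only implements a plain copy as the combine step, and does no repeat handling. All names are hypothetical.

```c
#include <stdint.h>

/* Fetch the nearest source pixel for a 16.16 fixed-point x coordinate. */
static inline uint32_t
sketch_fetch_nearest (const uint32_t *src, int32_t vx)
{
    return src[vx >> 16];
}

/* Combine step; a plain SRC copy in this sketch. */
static inline void
sketch_combine_src (uint32_t *dst, uint32_t s)
{
    *dst = s;
}

static void
sketch_scale_row (uint32_t *dst, const uint32_t *src,
                  int width, int32_t vx, int32_t unit_x)
{
    /* Main loop: two pixels per iteration. */
    while (width >= 2)
    {
        uint32_t s1, s2;

        s1 = sketch_fetch_nearest (src, vx);
        vx += unit_x;
        s2 = sketch_fetch_nearest (src, vx);
        vx += unit_x;

        sketch_combine_src (dst++, s1);
        sketch_combine_src (dst++, s2);
        width -= 2;
    }

    /* Tail: at most one pixel left. */
    if (width & 1)
        sketch_combine_src (dst, sketch_fetch_nearest (src, vx));
}
```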
Søren Sandmann Pedersen
e597211075 Generalize and optimize fast_composite_src_scaled_nearest()
- Make it work for PIXMAN_OP_OVER

- Split repeat computation for x and y, and only do the x part in the
  inner loop.

- Move stride multiplication outside of inner loop
2010-02-13 16:41:53 -05:00
Søren Sandmann Pedersen
337e916473 Merge branch 'bitmasks' 2010-02-13 12:26:09 -05:00
Søren Sandmann Pedersen
bdc4a6afe0 Makefile.am: Remove 'check' from release-check
It's already included in distcheck.
2010-02-13 11:28:33 -05:00
Søren Sandmann Pedersen
edee4be052 Turn off asserts in development snapshots (bug 26314).
There is not much real benefit in having asserts turned on in
snapshots because it doesn't lead to any new bug reports, just to
people not installing development snapshots since they cause X server
crashes. So just turn them off.

While we are at it, limit the number of messages to stderr to 5
instead of 50.
2010-02-13 11:28:33 -05:00
Siarhei Siamashka
cf1f034fef ARM: Remove any use of environment variables for cpu features detection
Old code assumed that all ARMv7 processors support NEON instructions
unless overridden by environment variable ARM_TRUST_HWCAP. This causes
X server to die with SIGILL if NEON support is disabled in the kernel
configuration. Additionally, ARMv7 processors lacking NEON unit are
going to become available eventually.

The problem was reported by user bearsh at irc.freenode.net #gentoo-embedded
2010-02-13 17:44:49 +02:00
Alexander Larsson
865c37d574 Add pixman_image_get_destroy_data()
This way you can get back user data that was set using
pixman_image_set_destroy_function().
2010-02-09 15:57:18 +01:00
Alexander Larsson
cca1cef3f2 Add extern "C" guards for c++ 2010-02-09 13:22:38 +01:00
Søren Sandmann Pedersen
8e85059436 Move checks for src/mask repeat right before walking the region.
Also add a couple of additional checks to the src/mask repeat check.
2010-01-28 16:02:27 -05:00
Søren Sandmann Pedersen
eea58eab93 Compute src, mask, dest flags and base fast path decisions on them.
This sets the stage for caching the information by image instead
of computing it on each composite invocation.

This patch also computes format codes for images such as PIXMAN_solid,
so that we can no longer end up in the situation that a fast path is
selected for a 1x1 solid image, when that fast path doesn't actually
understand repeating.
2010-01-28 11:52:56 -05:00
Søren Sandmann Pedersen
6197db91a3 Add src_, mask_, and dest_flags fields to fast path arrays
Update all the fast path tables to match using a new
PIXMAN_STD_FAST_PATH macro.

For now, use 0 for the flags fields.
2010-01-28 11:52:55 -05:00
Søren Sandmann Pedersen
ff6eaac50e Move calls to source_is_fastpathable() into get_source_format() 2010-01-28 11:52:55 -05:00
Søren Sandmann Pedersen
171dc48756 Fold get_fast_path() into _pixman_run_fast_path()
Also factor out the source format code computation to its own
function.
2010-01-28 11:52:55 -05:00
Søren Sandmann Pedersen
459c7a52f6 Consolidate the source and mask sanity checks in a function 2010-01-28 11:52:55 -05:00
Søren Sandmann Pedersen
27a4fb4747 Move pixbuf checks after src_format and mask_format have been computed. 2010-01-28 11:52:55 -05:00
Søren Sandmann Pedersen
2def1a8867 Move the sanity checks for src, mask and destination into get_fast_path() 2010-01-28 11:52:55 -05:00
Søren Sandmann Pedersen
d76aab4d03 Turn some uint16_t variables to int32_t in the fast paths.
This is necessary now that we have a 32 bit version of
pixman_image_composite().
2010-01-27 07:11:11 -05:00
Søren Sandmann Pedersen
15d07d6c2a Implement get_scanline_64() correctly for solid fill images.
Previously they would be evaluated at 8 bits and then expanded.
2010-01-26 14:46:33 -05:00
Benjamin Otte
0e8550798f Make pixman_image_fill_rectangles() call pixman_image_fill_boxes()
Avoids duplication of code
2010-01-26 20:22:52 +01:00
Benjamin Otte
d0d284da0a Add pixman_image_fill_boxes() API
It's basically the 32bit version of pixman_image_fill_rectangles(), just
with a saner data type.
2010-01-26 20:22:52 +01:00
Benjamin Otte
e841c556d5 Add pixman_image_composite32()
This is equal to pixman_image_composite(), just with 32bit parameters.
pixman_image_composite() now just calls pixman_image_composite32()
2010-01-26 20:22:52 +01:00
Benjamin Otte
78b6c47078 Make region argument to pixman_region(32)_init_rects() const
No indenting of the header to keep git blame working
2010-01-26 20:22:51 +01:00
Benjamin Otte
b194bb78c8 Fix typo 2010-01-26 20:22:51 +01:00
Søren Sandmann Pedersen
c066c347ae Fix some warnings 2010-01-19 14:23:57 -05:00
Søren Sandmann Pedersen
8fce7b18f3 Post-release version bump 2010-01-17 19:34:27 -05:00
Søren Sandmann Pedersen
23e1ba3c06 Pre-release version bump 2010-01-17 18:56:11 -05:00
Søren Sandmann Pedersen
8dabd1fdd8 bits: Print an error if someone tries to create an image with bpp < depth
Something in the X server apparently does this.
2010-01-17 16:47:15 -05:00
Søren Sandmann Pedersen
2c3cbc83c4 When fetching from an alpha map, replace the alpha channel of the image
Previously it would be multiplied onto the image pixel, but the Render
specification is pretty clear that the alpha map should be used
*instead* of any alpha channel within the image.

This makes the assumption that the pixels in the image are already
premultiplied with the alpha channel from the alpha map. If we don't
make this assumption and the image has an alpha channel of its own, we
would have to first unpremultiply that pixel, and then premultiply the
alpha value onto the color channels, and then replace the alpha
channel.
2010-01-17 16:47:15 -05:00
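For a 32 bit a8r8g8b8 pixel, replacing the alpha channel (rather than multiplying by it) amounts to the following. The function name is hypothetical, and the sketch relies on the assumption stated in the commit message that the color channels are already premultiplied with the alpha map's value.

```c
#include <stdint.h>

/* Replace the alpha channel of an a8r8g8b8 pixel with the value
 * fetched from the alpha map, leaving the color channels untouched. */
static uint32_t
sketch_replace_alpha (uint32_t pixel, uint8_t map_alpha)
{
    return (pixel & 0x00ffffff) | ((uint32_t) map_alpha << 24);
}
```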
Søren Sandmann Pedersen
0df6098f3d pixman_image_validate() needs to also validate the alpha map.
This is the other half of bug 25950.
2010-01-17 16:47:15 -05:00
Søren Sandmann Pedersen
7f00dc62e4 When fetching from an alpha map, use the alpha map's fetch function.
Don't use the one from the image. This is the first half of bug 25950.
2010-01-17 16:47:15 -05:00
Søren Sandmann Pedersen
042f978b04 test: Add new alphamap test program.
This program demonstrates three bugs relating to alpha maps:

- When fetching from an alpha map into 32 bit intermediates, we use
  the fetcher from the image, and not the one from the alpha map.

- For 64 bit intermediates we call fetch_pixel_generic_lossy_32()
  which then calls fetch_pixel_raw_64, which is NULL because alpha
  images are never validated.

- The alpha map should be used *in place* of any existing alpha
  channel, but we are actually multiplying it onto the image.
2010-01-17 16:47:15 -05:00
Søren Sandmann Pedersen
05c38141b4 fetch-test: Fix spelling error (pallete -> palette) 2010-01-16 07:41:23 -05:00
Alan Coopersmith
c46a87e45a Update Sun license notices to current X.Org standard form
Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
2010-01-14 09:42:41 -08:00
Søren Sandmann Pedersen
3df6cb3431 fetch-test: Various formatting fixes 2010-01-10 09:15:24 -05:00
Pierre-Loup A. Griffais
7862f9b96e Interpret the angle of a conical gradient in degrees.
The conical gradient angle's fixed point degrees to
radians conversion code is missing a factor of pi.
2010-01-06 01:26:07 +02:00
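A sketch of the conversion with the factor of pi included, assuming the angle is a 16.16 fixed-point number of degrees. The function and constant names are hypothetical, not those used in the gradient code.

```c
#include <stdint.h>

#define SKETCH_PI 3.14159265358979323846

/* Convert a 16.16 fixed-point angle in degrees to radians. */
static double
sketch_fixed_degrees_to_radians (int32_t angle)
{
    double degrees = angle / 65536.0;

    return degrees * SKETCH_PI / 180.0;
}
```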
Søren Sandmann Pedersen
54f51c4a75 region: Enable or disable fatal errors and selfchecks based on version number
There are a couple of bugs in bugzilla where bugs in the X server
triggered asserts in the pixman region code. It is probably better to
let the X server survive this. (In fact, I thought I had disabled them
for 0.16.0, but apparently not).

The patch below uses these rules:

    - In _stable_ pixman releases, assertions and selfchecks are turned
      off. Assertions, so that the X server doesn't die. Selfchecks,
      for performance reasons.

    - In _unstable_ pixman releases, both assertions and selfchecks are
      turned on. These releases are what get added to development
      distributions such as rawhide, so we want as much self-checking
      as possible.

    - In _random git checkouts_, assertions are enabled, so that bugs
      are caught, but selfchecks are disabled so that you can use them
      for performance work without having to fiddle with turning
      selfchecks off.
2009-12-17 02:22:00 -05:00
Søren Sandmann Pedersen
91ec7fecc9 Some minor formatting fixes. 2009-12-16 23:15:04 -05:00
Søren Sandmann Pedersen
97cf4d494c arm-simd: Whitespace fixes 2009-12-16 17:54:41 -05:00
Søren Sandmann Pedersen
28778c997e mmx: Eliminate trailing whitespace. 2009-12-16 17:49:44 -05:00
Søren Sandmann Pedersen
c6c43c65f7 Add 'check' to release-check make target 2009-12-16 15:27:50 -05:00
Søren Sandmann Pedersen
b3afacf9c9 Reorder tests so that the fastest ones run first. 2009-12-16 15:27:50 -05:00
Marvin Schmidt
bbc5108bf8 Build tests and run non-GTK+ ones on make check
Setting TESTS will run the tests on `make check`

Bug 25131
2009-12-16 15:24:36 -05:00
Siarhei Siamashka
4476832070 ARM: added 'neon_combine_add_u' function 2009-12-16 20:56:13 +02:00
Siarhei Siamashka
f2c7a04c41 ARM: added 'neon_combine_over_u' function 2009-12-16 20:56:08 +02:00
Siarhei Siamashka
24cd286af6 ARM: macro template for single scanline compositing functions
Existing template already supports 2D images processing,
but pixman also needs some NEON optimized functions for
improving performance when compositing is decoupled
into "fetch -> process -> store" stages and done via
temporary scanline buffer. That's why a new simplified
template which deals only with the generation of single
scanline processing functions is handy.
2009-12-16 20:55:54 +02:00
Siarhei Siamashka
ae8d9df624 Use canonical pixman license notice for recently added ARM NEON assembly files 2009-12-16 20:39:21 +02:00
Siarhei Siamashka
ce78288d77 ARM: added 'neon_composite_src_pixbuf_8888' fast path
This is ARM NEON optimized conversion of native RGBA format used by
GTK/GDK into native 32bpp RGBA format used by cairo/pixman.
2009-12-09 15:22:09 +02:00
Siarhei Siamashka
a732d3baeb ARM: added 'neon_composite_src_0888_0565_rev' fast path
This is ARM NEON optimized conversion of native RGB format used by
GTK/GDK into r5g6b5 format.
2009-12-09 15:22:03 +02:00
Siarhei Siamashka
a1386a1ceb ARM: added 'neon_src_0888_8888_rev' fast path
This is ARM NEON optimized conversion of native RGB format used by
GTK/GDK into native 32bpp RGB format used by cairo/pixman.
2009-12-09 15:21:57 +02:00
Siarhei Siamashka
78a60047ac ARM: added 'neon_composite_over_n_8888' fast path 2009-12-09 11:29:13 +02:00
Siarhei Siamashka
96fd17488f ARM: added 'neon_composite_over_n_0565' fast path 2009-12-09 11:27:57 +02:00
Siarhei Siamashka
2d332c7a56 ARM: added 'neon_composite_src_0565_8888' fast path 2009-12-09 10:33:01 +02:00
Siarhei Siamashka
062da411d8 ARM: added 'neon_composite_add_8888_8888_8888' fast path 2009-12-09 10:26:47 +02:00
Siarhei Siamashka
3d0eedb5d9 ARM: added 'neon_composite_add_8888_8888' fast path 2009-12-09 10:25:03 +02:00
Siarhei Siamashka
86b54c6701 ARM: added 'neon_composite_over_8888_8_8888' fast path 2009-12-09 10:24:30 +02:00
Siarhei Siamashka
aec1524e77 ARM: added 'neon_composite_over_8888_8888_8888' fast path 2009-12-09 10:19:37 +02:00
Siarhei Siamashka
ba59d53d0b ARM: minor source formatting changes
Now it's a bit harder to exceed 80 characters line limit
when binding assembly functions.
2009-12-09 10:17:23 +02:00
Siarhei Siamashka
a47b5167c4 ARM: added '.arch armv7a' directive to NEON assembly file
This fix prevents build failure due to not accepting PLD instruction when
compiling for armv4 cpu with the relevant -mcpu/-march options set in CFLAGS.
2009-12-08 08:52:34 +02:00
Benjamin Otte
3fba7dc6fa Make test program not throw warnings about undefined variables 2009-12-04 15:04:24 +01:00
Benjamin Otte
10ab592d57 Fix bug that prevented pixman_fill MMX and SSE paths for 16 and 8bpp 2009-12-04 15:04:24 +01:00
Siarhei Siamashka
7c7b6f5de7 ARM: NEON optimized pixman_blt
NEON unit has fast access to L1/L2 caches and even simple
copy of memory buffers using NEON provides more than 1.5x
performance improvement on ARM Cortex-A8.
2009-11-30 22:21:08 +02:00
Siarhei Siamashka
dce6e1bd68 test: support for testing pixbuf fast path functions in blitters-test 2009-11-27 15:50:26 +02:00
Benjamin Otte
0901ef41fb Remove nonexistent function from header 2009-11-22 10:57:06 +01:00
Søren Sandmann Pedersen
c97b1e803f Post-release version bump 2009-11-20 12:02:50 +01:00
Søren Sandmann Pedersen
5a7597f818 Pre-release version bump 2009-11-20 11:55:40 +01:00
Søren Sandmann Pedersen
95a08dece3 Remove stray semicolon from blitters-test.c
Pointed out by scottmc2@gmail.com in bug 25137.
2009-11-20 11:18:58 +01:00
Siarhei Siamashka
6e2c7d54c6 C fast path function for 'over_n_1_0565'
This function is needed to improve performance of xfce4 terminal when
using bitmap fonts and running with 16bpp desktop. Some other applications
may potentially benefit too.

After applying this patch, top functions from Xorg process in
oprofile log change from

samples  %        image name               symbol name
13296    29.1528  libpixman-1.so.0.17.1    combine_over_u
6452     14.1466  libpixman-1.so.0.17.1    fetch_scanline_r5g6b5
5516     12.0944  libpixman-1.so.0.17.1    fetch_scanline_a1
2273      4.9838  libpixman-1.so.0.17.1    store_scanline_r5g6b5
1741      3.8173  libpixman-1.so.0.17.1    fast_composite_add_1000_1000
1718      3.7669  libc-2.9.so              memcpy

to

samples  %        image name               symbol name
5594     14.7033  libpixman-1.so.0.17.1    fast_composite_over_n_1_0565
4323     11.3626  libc-2.9.so              memcpy
3695      9.7119  libpixman-1.so.0.17.1    fast_composite_add_1000_1000

when scrolling text in terminal (reading man page).
2009-11-20 11:18:58 +01:00
Søren Sandmann Pedersen
282f5cf8b8 Round horizontal sampling points towards northwest.
This is a similar change as the top/bottom one, but in this case the
rounding is simpler because it's just always rounding down.

Based on a patch by M Joonas Pihlaja.
2009-11-17 01:58:01 -05:00
Søren Sandmann Pedersen
f44431986f Fix rounding of top and bottom coordinates.
The rule for trap rasterization is that coordinates are rounded
towards north-west.

The pixman_sample_ceil() function is used to compute the first
(top-most) sample row included in the trap, so when the input
coordinate is already exactly on a sample row, no rounding should take
place.

On the other hand, pixman_sample_floor() is used to compute the final
(bottom-most) sample row, so if the input is precisely on a sample
row, it needs to be rounded down to the previous row.

This commit fixes the rounding computation. The idea of the
computation is like this:

Floor operation that rounds exact matches down: First subtract
pixman_fixed_e to make sure input already on a sample row gets rounded
down. Then find out how many small steps are between the input and the
first fraction. Then add those small steps to the first fraction.

The ceil operation first adds (small_step + pixman_e), then runs a
floor. This ensures that exact matches are not rounded off.

Based on a patch by M Joonas Pihlaja.
2009-11-17 01:58:01 -05:00
Søren Sandmann Pedersen
3bea18e3ea Fix slightly skewed sampling grid for antialiased traps
The sampling grid is slightly skewed in the antialiased case. Consider
the case where we have n = 8 bits of alpha.

The small step is

     small_step = fixed_1 / 15 = 65536 / 15 = 4369

The first fraction is then

     frac_first = (small_step / 2) = 4369 / 2 = 2184

and the last fraction becomes

     frac_last
          = frac_first + (15 - 1) * small_step = 2184 + 14 * 4369 = 63350

which means the size of the last bit of the pixel is

     65536 - 63350 = 2186

which is 2 bigger than the first fraction. This is not the end of the
world, but it would be more correct to have 2185 and 2185, and we can
accomplish that simply by making the first fraction half the *big*
step instead of half the small step.

If we ever move to coordinates with 8 fractional bits, the
corresponding values become 8 and 10 out of 256, where 9 and 9 would
be better.

Similarly in the X direction.
2009-11-17 01:58:01 -05:00
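The arithmetic in this message can be checked with a short sketch. The function and type names are hypothetical; `one` is the fixed-point value of 1 (65536 for 16.16 coordinates) and `rows` is the number of sample rows per pixel (15 in the example).

```c
#include <stdint.h>

typedef struct
{
    int32_t first;    /* distance from the pixel top to the first sample */
    int32_t last_gap; /* distance from the last sample to the pixel bottom */
} sketch_grid_t;

/* Old scheme: the first fraction is half the small step. */
static sketch_grid_t
sketch_grid_half_small_step (int32_t one, int rows)
{
    int32_t small_step = one / rows;
    sketch_grid_t g;

    g.first = small_step / 2;
    g.last_gap = one - (g.first + (rows - 1) * small_step);
    return g;
}

/* New scheme: the first fraction is half the big step, where the big
 * step is what is left of the pixel after (rows - 1) small steps. */
static sketch_grid_t
sketch_grid_half_big_step (int32_t one, int rows)
{
    int32_t small_step = one / rows;
    int32_t big_step = one - (rows - 1) * small_step;
    sketch_grid_t g;

    g.first = big_step / 2;
    g.last_gap = one - (g.first + (rows - 1) * small_step);
    return g;
}
```

With one = 65536 and rows = 15 the old scheme gives 2184 and 2186 and the new one gives 2185 and 2185; with one = 256 the results are 8 and 10 versus 9 and 9, matching the numbers quoted in the message.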
Søren Sandmann Pedersen
98bb0a509f Delete the flags field from fast_path_info_t 2009-11-17 00:47:49 -05:00
Søren Sandmann Pedersen
b7fb7e6c70 Eliminate NEED_PIXBUF flag.
Instead introduce two new fake formats

	PIXMAN_pixbuf
	PIXMAN_rpixbuf

and compute whether the source and mask have them in
find_fast_path(). This led to some duplicate entries in the fast path
tables that could then be removed.
2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen
542b79c30d Compute src_format outside the fast path loop.
Inside the loop all we have to do is check that the formats match.
2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen
12108ecbe4 Eliminate the NEED_COMPONENT_ALPHA flag.
Instead introduce two new fake formats

	PIXMAN_a8r8g8b8_ca
	PIXMAN_a8b8g8r8_ca

that are used in the fast path tables for this case.
2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen
4686d1f53b Eliminate the NEED_SOLID_MASK flag
This flag was used to indicate that the mask was solid while still
allowing a specific format to be required. However, there is not
actually any need for this because the fast paths all used
_pixman_image_get_solid() which already allowed arbitrary formats.

The one thing that had to be dealt with was component alpha. In
addition to interpreting the presence of the NEED_COMPONENT_ALPHA
flag, we now also interpret the *absence* of this flag as a
requirement that the mask does *not* have component alpha.

Siarhei Siamashka pointed out that the first version of this commit
had a bug, in which a NEED_SOLID_MASK was accidentally not turned into
a PIXMAN_solid in the ARM NEON implementation.
2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen
2ef8b394d7 Use the destination buffer directly in more cases instead of fetching.
When the destination buffer is either a8r8g8b8 or x8r8g8b8, we can use
it directly instead of fetching into a temporary buffer. When the
format is x8r8g8b8, we require the operator to not make use of
destination alpha, but when it is a8r8g8b8, there are no restrictions.

This is approximately a 5% speedup on the poppler cairo benchmark:

[ # ]  backend                         test   min(s) median(s) stddev. count

Before:
[  0]    image                      poppler    6.661    6.709   0.59%    6/6

After:
[  0]    image                      poppler    6.307    6.320   0.12%    5/6
2009-11-17 00:42:21 -05:00
Søren Sandmann Pedersen
13f4e02b14 test: Move image_endian_swap() from blitters-test.c to utils.[ch] 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
24e203a8a8 test: Move random number generator from blitters/scaling-test to utils.[ch] 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
cc34554652 test: In scaling-test use the crc32 from utils.c 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
b465b8b79d test: Move CRC32 code from blitters-test to new files utils.[ch] 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
56bd913401 test: Rename utils.[ch] to gtk-utils.[ch] 2009-11-17 00:32:03 -05:00
Søren Sandmann Pedersen
7be529f3bd sse2: Add a fast path for OVER 8888 x 8 x 8888
This is a small speedup on the swfdec-youtube benchmark:

Before:
[  0]    image               swfdec-youtube    5.789    5.806   0.20%    6/6

After:
[  0]    image               swfdec-youtube    5.489    5.524   0.27%    6/6

I.e., approximately 5% faster.
2009-11-13 15:57:48 -05:00
Siarhei Siamashka
abefe68ae2 ARM: enabled 'neon_composite_add_8000_8000' fast path 2009-11-11 18:12:58 +02:00
Siarhei Siamashka
635f389ff4 ARM: enabled 'neon_composite_add_8_8_8' fast path 2009-11-11 18:12:58 +02:00
Siarhei Siamashka
7e1bfed676 ARM: enabled 'neon_composite_add_n_8_8' fast path 2009-11-11 18:12:58 +02:00
Siarhei Siamashka
deeb67b13a ARM: enabled 'neon_composite_over_8888_8888' fast path 2009-11-11 18:12:58 +02:00
Siarhei Siamashka
f449364849 ARM: enabled 'neon_composite_over_8888_0565' fast path 2009-11-11 18:12:57 +02:00
Siarhei Siamashka
2dfbf6c4a5 ARM: enabled 'neon_composite_over_8888_n_8888' fast path 2009-11-11 18:12:57 +02:00
Siarhei Siamashka
43824f98f1 ARM: enabled 'neon_composite_over_n_8_8888' fast path 2009-11-11 18:12:57 +02:00
Siarhei Siamashka
189d0d783c ARM: enabled 'neon_composite_over_n_8_0565' fast path 2009-11-11 18:12:57 +02:00
Siarhei Siamashka
cccfc87f4f ARM: enabled 'neon_composite_src_0888_0888' fast path 2009-11-11 18:12:57 +02:00
Siarhei Siamashka
e89b4f8105 ARM: enabled 'neon_composite_src_8888_0565' fast path 2009-11-11 18:12:56 +02:00
Siarhei Siamashka
2d54ed46fb ARM: enabled 'neon_composite_src_0565_0565' fast path 2009-11-11 18:12:56 +02:00
Siarhei Siamashka
5d695cb86e ARM: added 'bindings' for NEON assembly optimized functions
These functions serve as 'adaptors', converting standard internal
pixman fast path function arguments into arguments expected
by assembly functions.
2009-11-11 18:12:56 +02:00
Siarhei Siamashka
dcfade3df9 ARM: enabled new implementation for pixman_fill_neon 2009-11-11 18:12:56 +02:00
Siarhei Siamashka
bcb4bc7932 ARM: introduction of the new framework for NEON fast path optimizations
GNU assembler and its macro preprocessor is now used to generate
NEON optimized functions from a common template. This automatically
takes care of nuisances like ensuring optimal alignment, dealing with
leading/trailing pixels, doing prefetch, etc.

Implementations for a lot of compositing functions are also added,
but not enabled.
2009-11-11 18:12:56 +02:00
Siarhei Siamashka
1eff0ab487 ARM: removed old ARM NEON optimizations 2009-11-11 18:12:55 +02:00
Søren Sandmann Pedersen
b8898d77d0 Define PIXMAN_USE_INTERNAL_API in pixman-private.h
Instead of mucking around with CFLAGS in configure.ac, preventing
users from setting their own CFLAGS, just define the
PIXMAN_USE_INTERNAL_API and PIXMAN_DISABLE_DEPRECATED in
pixman-private.h
2009-11-07 14:47:22 -05:00
Søren Sandmann Pedersen
67bf739187 Include <inttypes.h> when compiled with HP's C compiler.
Fixes bug 23169.
2009-10-27 09:11:28 -04:00
Siarhei Siamashka
384fb88b90 C fast path function for 'over_n_1_8888'
This function is needed to improve performance of xfce4 terminal.
Some other applications may potentially benefit too.
2009-10-27 12:32:04 +02:00
Siarhei Siamashka
a2985da947 C fast path function for 'add_1000_1000'
This function is needed to improve performance of xfce4 terminal.
Some other applications may potentially benefit too.
2009-10-27 12:31:59 +02:00
Siarhei Siamashka
5f429e4510 blitters-test updated to also randomly generate mask_x/mask_y 2009-10-27 12:31:55 +02:00
André Tupinambá
0d5562747c Add fast path scaled, bilinear fetcher.
This adds a bilinear fetcher for the case where the image has a scaled
transformation, does not repeat, and the format {ax}8r8g8b8.

Results for the swfdec-youtube benchmark

Before:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image               swfdec-youtube    7.841    7.915   0.72%    6/6

After:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image               swfdec-youtube    6.677    6.780   0.94%    6/6

These results were measured on a faster machine than the ones in the
previous commit, so the numbers are not comparable.

Signed-off-by: Søren Sandmann Pedersen <sandmann@redhat.com>
2009-10-26 13:04:21 -04:00
André Tupinambá
88323c5abe Speed up bilinear interpolation.
Speed up bilinear interpolation by processing more than one component
at a time on 64 bit architectures, and by precomputing the dist{ixiy}
products on 32 bit architectures.

Previously bilinear interpolation for one pixel would take 24
multiplications. With this improvement it takes 12 on 64 bit, and 20
on 32 bit.
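
The 64 bit case can be sketched as follows. This is an illustration written
for this log, not the code from the commit: the function names are made up,
and it assumes a8r8g8b8 pixels with 8 bit distx/disty weights in the range
0..255. Two channels are packed into one 64 bit word far enough apart that
their weighted sums cannot overlap, which gives 4 multiplications for the
weights plus 8 for the channels.

```c
#include <assert.h>
#include <stdint.h>

/* Plain reference: every channel is interpolated on its own. */
static uint32_t
bilinear_per_channel (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
                      int distx, int disty)
{
    uint32_t w_tl = (uint32_t) ((256 - distx) * (256 - disty));
    uint32_t w_tr = (uint32_t) (distx * (256 - disty));
    uint32_t w_bl = (uint32_t) ((256 - distx) * disty);
    uint32_t w_br = (uint32_t) (distx * disty);
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8)
    {
        uint32_t c = ((tl >> shift) & 0xff) * w_tl +
                     ((tr >> shift) & 0xff) * w_tr +
                     ((bl >> shift) & 0xff) * w_bl +
                     ((br >> shift) & 0xff) * w_br;

        result |= ((c >> 16) & 0xff) << shift;
    }

    return result;
}

/* Move red to bits 32..39 and keep green at bits 8..15. */
static uint64_t
spread_red_green (uint32_t p)
{
    return (((uint64_t) p << 16) & 0x000000ff00000000ULL) |
           ((uint64_t) p & 0x0000ff00ULL);
}

/* Two channels per multiplication: 4 weight products + 8 channel products. */
static uint32_t
bilinear_two_at_a_time (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
                        int distx, int disty)
{
    uint64_t w_tl = (uint64_t) ((256 - distx) * (256 - disty));
    uint64_t w_tr = (uint64_t) (distx * (256 - disty));
    uint64_t w_bl = (uint64_t) ((256 - distx) * disty);
    uint64_t w_br = (uint64_t) (distx * disty);
    uint64_t f, r;

    /* Alpha (bits 24..31) and blue (bits 0..7) share one word. */
    f = (uint64_t) (tl & 0xff0000ff) * w_tl +
        (uint64_t) (tr & 0xff0000ff) * w_tr +
        (uint64_t) (bl & 0xff0000ff) * w_bl +
        (uint64_t) (br & 0xff0000ff) * w_br;
    r = f & 0x0000ff0000ff0000ULL;

    /* Red and green share another word. */
    f = spread_red_green (tl) * w_tl + spread_red_green (tr) * w_tr +
        spread_red_green (bl) * w_bl + spread_red_green (br) * w_br;
    r |= ((f >> 16) & 0x000000ff00000000ULL) | (f & 0xff000000ULL);

    return (uint32_t) (r >> 16);
}
```

Both functions compute the same integer sums, so they should agree for every
input; the packed version simply performs fewer multiplications.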

This is a small but consistent speedup on the swfdec-youtube
benchmark:

[ # ]  backend                         test   min(s) median(s) stddev. count
Before:
[  0]    image               swfdec-youtube   18.010   18.020   0.09%    4/5

After:
[  0]    image               swfdec-youtube   17.488   17.584   0.22%    5/6

Signed-off-by: Søren Sandmann Pedersen <sandmann@redhat.com>
2009-10-26 13:04:21 -04:00
Søren Sandmann Pedersen
f0c157f888 Extend scaling-test to also test bilinear filtering. 2009-10-26 13:04:21 -04:00
Jeremy Huddleston
eab882ef38 This is not a GNU project, so declare it foreign.
On Wed, 2009-10-21 at 13:36 +1000, Peter Hutterer wrote:
> On Tue, Oct 20, 2009 at 08:23:55PM -0700, Jeremy Huddleston wrote:
> > I noticed an INSTALL file in xlsclients and libXvMC today, and it
> > was quite annoying to work around since 'autoreconf -fvi' replaces
> > it and git wants to commit it.  Should these files even be in git?
> > Can I nuke them for the betterment of humanity and since they get
> > created by autoreconf anyways?
>
> See https://bugs.freedesktop.org/show_bug.cgi?id=24206

As an interim measure, replace AM_INIT_AUTOMAKE([dist-bzip2]) with
AM_INIT_AUTOMAKE([foreign dist-bzip2]). This will prevent the generation
of the INSTALL file. It is also part of the 24206 solution.

Signed-off-by: Jeremy Huddleston <jeremyhu@freedesktop.org>
2009-10-21 12:47:27 -07:00
Søren Sandmann Pedersen
dc46ad274a Make walk_region_internal() use 32 bit dimensions 2009-10-19 20:32:37 -04:00
Søren Sandmann Pedersen
bb3698d479 Make pixman_compute_composite_region32() use 32 bit dimensions 2009-10-19 20:31:54 -04:00
Søren Sandmann Pedersen
895c281c40 Change prototype of _pixman_walk_composite_region from int16_t to int32_t 2009-10-19 20:30:22 -04:00
Søren Sandmann Pedersen
9cd470665b Remove unused color_table and color_table_size fields 2009-10-19 20:27:36 -04:00
Søren Sandmann Pedersen
8186937637 Remove BOUNDS() macro.
It was bounding the clip region to INT16_MIN, INT16_MAX, but this was
a relic from the X server. We don't need it since we are already
restricting the clip region to the geometry of the destination.
2009-10-19 20:16:25 -04:00
Benjamin Otte
9bcfc0ac54 --enable-maintainer-mode is gone from configure, so remove it 2009-10-20 00:40:40 +02:00
Benjamin Otte
fa49ef81f7 Add default cases for all switch statements
Fixes compilation with -Wswitch-default. Compilation with -Wswitch-enum
works fine as is.
2009-10-20 00:40:40 +02:00
Benjamin Otte
5c3ef4e979 Fix compile warnings 2009-10-20 00:40:40 +02:00
Siarhei Siamashka
ad48407885 ARM: Removal of unused/broken NEON code 2009-10-20 00:21:56 +03:00
Søren Sandmann Pedersen
358f96d202 Fix double semicolon; pointed out by Travis Griggs 2009-10-08 13:01:27 -04:00
Gerdus van Zyl
93acc10617 Fix build with Visual Studio 2008
moved __m64 ms declaration in sse2_composite_over_x888_8_8888 to top
of function so it compiles with visual studio 2008
2009-09-30 06:29:43 -04:00
Andrea Canciani
f135f74ff3 Fix composite on big-endian systems.
Data narrower than 32bpp is padded to an unsigned long and on
big-endian systems this shifts the value by the padding bits.
2009-09-27 09:35:02 -04:00
Søren Sandmann Pedersen
15c14691a7 Fix fetch-test for big-endian systems.
Data narrower than 32bpp should be stored in the correct
endian. Reported by Andrea Canciani.
2009-09-26 14:10:20 -04:00
Søren Sandmann Pedersen
02d7099888 Add missing break in composite.c 2009-09-25 07:53:32 -04:00
Guillem Jover
8ce004af36 pixman: Update .gitignore
Generalize to catch all .pc files. Add more tests.

Signed-off-by: Guillem Jover <guillem@hadrons.org>
2009-09-24 21:13:54 +02:00
Søren Sandmann Pedersen
59e877cffe In the compositing test, don't try to use component alpha with solid fills.
It's not supported yet.
2009-09-24 08:10:00 -04:00
Søren Sandmann Pedersen
16adb09c8a Update CRC value in blitters-test for the new bug fixes 2009-09-24 07:54:37 -04:00
Søren Sandmann Pedersen
e156964d3e Fix bug in blitters-test with BGRA formats.
When masking out the x bits, blitters-test would make the incorrect
assumption that they were always in the topmost position. This is
not correct for formats of type PIXMAN_TYPE_BGRA.
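
A minimal sketch of the distinction, assuming that BGRA-type formats such as
b8g8r8x8 keep their unused bits at the bottom of the pixel while ARGB-type
formats such as x8r8g8b8 keep them at the top. used_bits_mask() is a
hypothetical helper, not a function from blitters-test.

```c
#include <assert.h>
#include <stdint.h>

/* Mask that keeps the meaningful bits of a pixel and clears the x bits.
 * `bpp` is the pixel size in bits, `depth` the number of used bits, and
 * `x_bits_low` is non-zero when the unused bits sit at the bottom of the
 * pixel (assumed here for BGRA-type formats). */
static uint32_t
used_bits_mask (int bpp, int depth, int x_bits_low)
{
    uint32_t used;

    assert (depth > 0 && depth <= bpp && bpp <= 32);

    used = (depth == 32) ? 0xffffffffu : ((1u << depth) - 1);

    if (x_bits_low)
        used <<= (bpp - depth);

    return used;
}
```

For a 32 bpp format with depth 24 this gives 0x00ffffff when the x bits are
on top and 0xffffff00 when they are at the bottom.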
2009-09-24 07:54:37 -04:00
Søren Sandmann Pedersen
eb72bfb97d Fix bugs in fetch_*_b2g3r3().
The red channel should only be shifted five positions, not six.
2009-09-24 07:54:35 -04:00
Søren Sandmann Pedersen
b4f6113cb9 Fix bugs in a1b2g1r1.
The first bug is that it is treating the input as if it were a1r1g1b1;
the second one is that the red channel should only be shifted two
bits, not three.
2009-09-24 07:48:46 -04:00
Søren Sandmann Pedersen
efdf15e677 Fix shift bug in fetch_scanline/pixel_a2b2g2r2()
0x30 * 0x55 is 0xff0, so the red channel should be shifted four bits,
not six.
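
The arithmetic can be checked with a small sketch. The helper below is
hypothetical; it only isolates the 2 bit field at mask 0x30 and expands it to
8 bits by multiplying with 0x55, leaving the final shift as a parameter so
that both shift amounts can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Expand the 2 bit field at mask 0x30 to 8 bits. Multiplying by 0x55
 * replicates the field four times; the product of the largest field
 * value is 0x30 * 0x55 = 0xff0, so a shift of 4 brings it down to 0xff. */
static uint32_t
expand_field_0x30 (uint32_t pixel, int shift)
{
    return ((pixel & 0x30) * 0x55) >> shift;
}
```

With a shift of four the field values 0x10, 0x20 and 0x30 expand to 0x55,
0xaa and 0xff. With a shift of six the maximum comes out as 0x3f, which is
the bug described above.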
2009-09-24 07:30:38 -04:00
Søren Sandmann Pedersen
679c2dabda Fix four bit formats.
The original Render code used to index pixels with their position in
bits in the image. When the scanline code was introduced pixels were
indexed in bytes, but the FETCH/STORE_4/8 macros still assumed bits.

This commit fixes that by making the FETCH/STORE_4 macros first
convert the index to bit position.
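
A minimal sketch of the fixed indexing, written as functions rather than
macros. fetch_4() and store_4() are illustrative names, and the nibble order
(even pixels in the low nibble) is an assumption that matches a little-endian
layout only.

```c
#include <assert.h>
#include <stdint.h>

/* Fetch the 4 bit pixel number `index` from a scanline. The pixel index
 * is first converted to a bit position, and the bit position is then
 * converted to a byte offset plus a nibble selector. */
static uint8_t
fetch_4 (const uint8_t *line, int index)
{
    int bit_offset = 4 * index;
    uint8_t byte = line[bit_offset >> 3];

    return (bit_offset & 4) ? (uint8_t) (byte >> 4) : (uint8_t) (byte & 0x0f);
}

/* Store a 4 bit value without disturbing the neighbouring pixel. */
static void
store_4 (uint8_t *line, int index, uint8_t value)
{
    int bit_offset = 4 * index;
    uint8_t *byte = &line[bit_offset >> 3];

    if (bit_offset & 4)
        *byte = (uint8_t) ((*byte & 0x0f) | ((value & 0x0f) << 4));
    else
        *byte = (uint8_t) ((*byte & 0xf0) | (value & 0x0f));
}
```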
2009-09-24 07:08:30 -04:00
Søren Sandmann Pedersen
3d1714cd1f Hide PIXMAN_OP_NONE and PIXMAN_N_OPERATORS behind PIXMAN_INTERNAL_API.
These cannot sanely be used by applications since they may change in
new versions.
2009-09-24 07:06:34 -04:00
Søren Sandmann Pedersen
0683f34c41 Add a few notes about testing to TODO 2009-09-24 07:04:29 -04:00
Søren Sandmann Pedersen
48ba7d9461 Fix alpha handling for 10 bpc formats.
These generally extracted the 2 bits of alpha, then shifted them 62
bits and replicated across 16 bits. Then they were shifted another 48
bits, making the resulting alpha channel 0.
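
For comparison, one way to widen such a pixel correctly is sketched below. It
is an illustration rather than the code from this commit: the function name is
made up, and the 64 bit result is assumed to be laid out as 16 bits each of
alpha, red, green and blue with alpha in the top bits.

```c
#include <assert.h>
#include <stdint.h>

/* Widen an a2r10g10b10 pixel to 16 bits per channel. */
static uint64_t
expand_a2r10g10b10 (uint32_t p)
{
    uint64_t a = p >> 30;
    uint64_t r = (p >> 20) & 0x3ff;
    uint64_t g = (p >> 10) & 0x3ff;
    uint64_t b = p & 0x3ff;

    /* Replicate the 2 alpha bits across 16 bits: 0x5555 is 01 repeated. */
    a *= 0x5555;

    /* Replicate the top bits of each 10 bit channel into the low 6 bits. */
    r = (r << 6) | (r >> 4);
    g = (g << 6) | (g >> 4);
    b = (b << 6) | (b >> 4);

    /* Alpha is shifted exactly once, into the top 16 bits. */
    return (a << 48) | (r << 32) | (g << 16) | b;
}
```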
2009-09-24 06:49:56 -04:00
Søren Sandmann Pedersen
c673c83e07 Return result from pixman_image_set_transform().
Previously it would always return TRUE, even when malloc() had failed.
2009-09-24 05:22:33 -04:00
Søren Sandmann Pedersen
eb16d17188 Revert "Enable component alpha on solid masks."
For consistency we will probably want to allow component alpha to be
set on all masks at some point, but this commit only enabled it for
solid images.

This reverts commit 29e22cf38e.
2009-09-15 08:55:13 -04:00
Chris Wilson
b96e37f8d0 [Makefile] Set the SIMD specific CFLAGS for inspecting asm. 2009-09-15 13:25:00 +01:00
Søren Sandmann Pedersen
273e89750b Remove optimization for 0xffffffff and 0xff in the add_n_8888_8888_ca fast path
This is an ADD operation, not an OVER. Fixes bug 23934, reported by
Siarhei Siamashka.
2009-09-14 18:52:10 -04:00
M Joonas Pihlaja
ec7c1affcc Don't prefetch from NULL in the SSE2 fast paths.
On an Athlon64 box prefetch from NULL slows down
the rgba OVER rgba fast path for predominantly solid sources
by up to 3.5x in the one-rounded-rectangle test case
when run using a tiling polygon renderer.  This patch
conditionalises the prefetches of the mask everywhere
where the mask pointer may be NULL in a fast path.
2009-09-15 00:35:14 +03:00
Søren Sandmann Pedersen
1b5269a585 Reformat test/composite.c to follow the standard coding style. 2009-09-14 07:32:54 -04:00
Chris Wilson
0431a0af6c [test] Exercise repeating patterns for composite. 2009-09-13 18:02:10 +01:00
Chris Wilson
c28e39f17a [build] Add rule to generate asm for inspection. 2009-09-13 16:38:04 +01:00
Chris Wilson
823bb1a943 [sse2] Don't emit prefetch 0 for an absent mask 2009-09-13 16:37:57 +01:00
Chris Wilson
8f2daa7ca2 [test] Add composite test from rendercheck
Iterate over all destination formats for dst, src and composite and
compare the result of all operators with a selection of colours.
2009-09-13 16:33:01 +01:00
Chris Wilson
cda0ee5165 build: Suppress verbose compile lines
Compile warnings are being lost in the sea of noise. Automake-1.11 finally
introduced AM_SILENT_RULES to suppress the echoing of the compile line for
every object. Enable this to bring sanity to the pixman build.
2009-09-13 16:32:44 +01:00
Chris Wilson
56cc06f89b Merge branch '0.16'
Conflicts:
	configure.ac
	pixman/pixman-sse2.c
2009-09-13 16:32:27 +01:00
Søren Sandmann Pedersen
8aff99e231 Fix off-by-one error in source_image_needs_out_of_bounds_workaround()
If extents->x2/y2 are equal to image->width/height, then the clip is
still inside the drawable, so no workaround is necessary.
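
The boundary condition can be sketched as below. box_t and
box_inside_image() are illustrative names; the sketch relies on boxes being
half-open, that is, x2 and y2 are one past the last covered pixel.

```c
#include <assert.h>

/* Half-open box: covers x1 <= x < x2 and y1 <= y < y2. */
typedef struct
{
    int x1, y1, x2, y2;
} box_t;

/* Returns non-zero when the box lies entirely inside a width x height
 * image. Because x2/y2 are exclusive, being equal to width/height is
 * still inside, so the comparison has to be <= rather than <. */
static int
box_inside_image (const box_t *extents, int width, int height)
{
    return extents->x1 >= 0 && extents->y1 >= 0 &&
           extents->x2 <= width && extents->y2 <= height;
}
```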
2009-09-10 21:33:24 -04:00
Gaetan Nadon
fefe2a5d24 Remove unused generated libcomp.pc #23801 2009-09-09 14:36:39 -04:00
Siarhei Siamashka
2679d93e22 Change CFLAGS order for PPC and ARM configure tests
CFLAGS are always appended to the end of gcc options when compiling
sources in autotools based projects. Configure tests should do the
same. Otherwise build fails on PPC when using CFLAGS="-O2 -mno-altivec"
for example. Similar problem affects ARM.
2009-09-06 23:03:21 +03:00
Siarhei Siamashka
91232ee40d ARM: Remove fallback to ARMv6 implementation from NEON delegate chain
This can help to fix build problems with '-mthumb' gcc option in CFLAGS.
ARMv6 optimized code can't be compiled for thumb (because of its inline
assembly) and gets automatically disabled in configure. Reference
to it from NEON optimized code resulted in linking problems.

Every ARMv6 optimized fast path function also has a better NEON
counterpart, so there is no need to fall back to ARMv6. Shorter
delegate chain should additionally result in a bit better performance.
2009-09-06 23:03:08 +03:00
M Joonas Pihlaja
29e7d6063f Default to optimised builds when using a Sun Studio compiler.
Autoconf's AC_PROG_CC sets the default CFLAGS to -O2 -g for
gcc and -g for every other compiler.  This patch defaults
CFLAGS to the equivalent -O -g when we're using Sun Studio's cc
if the user or site admin hasn't already set CFLAGS.
2009-09-03 22:50:38 +03:00
M Joonas Pihlaja
e7018685f0 Work around a Sun Studio 12 code generation bug involving _mm_set_epi32().
Calling a static function wrapper around _mm_set_epi32() when not
using optimisation causes Sun Studio 12's cc to emit a spurious
floating point load which confuses the assembler.  Using a macro wrapper
rather than a function steps around the problem.
2009-09-03 22:47:24 +03:00
M Joonas Pihlaja
04ade7b68c Work around differing _mm_prefetch() prototypes on Solaris.
Sun Studio 12 expects the address to prefetch to be
a const char pointer rather than a __m128i pointer or
void pointer.
2009-09-03 22:47:22 +03:00
Siarhei Siamashka
3e228377f9 ARM: workaround for gcc bug in vshll_n_u8 intrinsic
Some versions of gcc (cs2009q1, 4.4.1) incorrectly reject
shift operand having value >= 8, claiming that it is out of
range. So inline assembly is used as a workaround.
2009-09-03 19:49:17 +03:00
Søren Sandmann Pedersen
632125d410 Enable the x888_8_8888 sse2 fast path. 2009-09-02 19:29:03 -04:00
Makoto Kato
097342a65d Add CPU detection for VC++ x64
VC++ x64 has no inline assembler and x64 mode supports SSE2.
So, it is unnecessary to call cpuid.
2009-09-02 15:00:46 -04:00
Søren Sandmann Pedersen
64085c91b6 Change names of add_8888_8_8 fast paths to add_n_8_8
The source is solid in those.
2009-09-01 08:23:23 -04:00
Søren Sandmann Pedersen
7af985a69a Post-release version bump 2009-08-28 08:14:04 -04:00
244 changed files with 70938 additions and 27508 deletions

.editorconfig Normal file

@ -0,0 +1,11 @@
# To use this config on your editor, follow the instructions at:
# http://editorconfig.org
root = true
[*]
tab_width = 8
[meson.build,meson_options.txt]
indent_style = space
indent_size = 2

.gitignore vendored

@ -3,6 +3,7 @@ Makefile.in
.deps
.libs
.msg
*.pc
*.lo
*.la
*.a
@ -21,23 +22,35 @@ install-sh
libtool
ltmain.sh
missing
pixman-1.pc
stamp-h?
config.h
config.h.in
.*.swp
pixman/pixman-combine32.c
pixman/pixman-combine32.h
pixman/pixman-combine64.c
pixman/pixman-combine64.h
demos/*-test
demos/checkerboard
demos/clip-in
demos/linear-gradient
demos/quad2quad
demos/scale
demos/dither
pixman/pixman-srgb.c
pixman/pixman-version.h
test/clip-test
test/composite-test
test/fetch-test
test/gradient-test
test/region-test
test/*-test
test/affine-bench
test/alpha-loop
test/alphamap
test/check-formats
test/clip-in
test/composite
test/infinite-loop
test/lowlevel-blt-bench
test/radial-invalid
test/region-translate
test/scaling-bench
test/trap-crasher
*.pdb
*.dll
*.lib
*.ilk
*.obj
*.exe


@ -0,0 +1,80 @@
# Docker build stage
#
# It builds a multi-arch image for all required architectures. Each image can be
# later easily used with properly configured Docker (which uses binfmt and QEMU
# underneath).
docker:
  stage: docker
  image: quay.io/buildah/stable
  rules:
    - if: "$CI_PIPELINE_SOURCE == 'merge_request_event' && $TARGET =~ $ACTIVE_TARGET_PATTERN"
      changes:
        paths:
          - .gitlab-ci.d/01-docker.yml
          - .gitlab-ci.d/01-docker/**/*
      variables:
        DOCKER_TAG: $CI_COMMIT_REF_SLUG
        DOCKER_IMAGE_NAME: ${CI_REGISTRY_IMAGE}/pixman:${DOCKER_TAG}
    - if: "$CI_PIPELINE_SOURCE == 'schedule' && $TARGET =~ $ACTIVE_TARGET_PATTERN"
    - if: "$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH && $TARGET =~ $ACTIVE_TARGET_PATTERN"
    - if: "$CI_COMMIT_TAG && $TARGET =~ $ACTIVE_TARGET_PATTERN"
  variables:
    # Use vfs with buildah. Docker offers overlayfs as a default, but Buildah
    # cannot stack overlayfs on top of another overlayfs filesystem.
    STORAGE_DRIVER: vfs
    # Write all image metadata in the docker format, not the standard OCI
    # format. Newer versions of docker can handle the OCI format, but older
    # versions, like the one shipped with Fedora 30, cannot handle the format.
    BUILDAH_FORMAT: docker
    BUILDAH_ISOLATION: chroot
    CACHE_IMAGE: ${CI_REGISTRY_IMAGE}/cache
    CACHE_ARGS: --cache-from ${CACHE_IMAGE} --cache-to ${CACHE_IMAGE}
  before_script:
    # Login to the target registry.
    - echo "${CI_REGISTRY_PASSWORD}" |
        buildah login -u "${CI_REGISTRY_USER}" --password-stdin ${CI_REGISTRY}
    # Docker Hub login is optional, and can be used to circumvent image pull
    # quota for anonymous pulls for base images.
    - echo "${DOCKERHUB_PASSWORD}" |
        buildah login -u "${DOCKERHUB_USER}" --password-stdin docker.io ||
          echo "Failed to login to Docker Hub."
  parallel:
    matrix:
      - TARGET:
          - linux-386
          - linux-amd64
          - linux-arm-v5
          - linux-arm-v7
          - linux-arm64-v8
          - linux-mips
          - linux-mips64el
          - linux-mipsel
          - linux-ppc
          - linux-ppc64
          - linux-ppc64le
          - linux-riscv64
          - windows-686
          - windows-amd64
          - windows-arm64-v8
  script:
    # Prepare environment.
    - ${LOAD_TARGET_ENV}
    - FULL_IMAGE_NAME=${DOCKER_IMAGE_NAME}-${TARGET}
    # Build and push the image.
    - buildah bud
        --tag ${FULL_IMAGE_NAME}
        --layers ${CACHE_ARGS}
        --target ${TARGET}
        --platform=${DOCKER_PLATFORM}
        --build-arg BASE_IMAGE=${BASE_IMAGE}
        --build-arg BASE_IMAGE_TAG=${BASE_IMAGE_TAG}
        --build-arg LLVM_VERSION=${LLVM_VERSION}
        -f Dockerfile .gitlab-ci.d/01-docker/
    - buildah images
    - buildah push ${FULL_IMAGE_NAME}


@ -0,0 +1,150 @@
ARG BASE_IMAGE=docker.io/debian
ARG BASE_IMAGE_TAG=bookworm-slim
FROM ${BASE_IMAGE}:${BASE_IMAGE_TAG} AS base
LABEL org.opencontainers.image.title="Pixman build environment for platform coverage" \
org.opencontainers.image.authors="Marek Pikuła <m.pikula@partner.samsung.com>"
ARG DEBIAN_FRONTEND=noninteractive
ENV APT_UPDATE="apt-get update" \
APT_INSTALL="apt-get install -y --no-install-recommends" \
APT_CLEANUP="rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*"
ARG GCOVR_VERSION="~=7.2"
ARG MESON_VERSION="~=1.6"
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} \
# Build dependencies.
build-essential \
ninja-build \
pkg-config \
qemu-user \
# pipx dependencies.
python3-argcomplete \
python3-packaging \
python3-pip \
python3-platformdirs \
python3-userpath \
python3-venv \
# gcovr dependencies.
libxml2-dev \
libxslt-dev \
python3-dev \
&& ${APT_CLEANUP} \
# Install pipx using pip to have a more recent version of pipx, which
# supports the `--global` flag.
&& pip install pipx --break-system-packages \
# Install a recent version of meson and gcovr using pipx to have the same
# version across all variants regardless of base.
&& pipx install --global \
gcovr${GCOVR_VERSION} \
meson${MESON_VERSION} \
&& gcovr --version \
&& echo Meson version: \
&& meson --version
FROM base AS llvm-base
# LLVM 16 is the highest available in Bookworm. Preferably, we should use the
# same version for all platforms, but it's not possible at the moment.
ARG LLVM_VERSION=16
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} \
clang-${LLVM_VERSION} \
libclang-rt-${LLVM_VERSION}-dev \
lld-${LLVM_VERSION} \
llvm-${LLVM_VERSION} \
&& ${APT_CLEANUP} \
&& ln -f /usr/bin/clang-${LLVM_VERSION} /usr/bin/clang \
&& ln -f /usr/bin/lld-${LLVM_VERSION} /usr/bin/lld \
&& ln -f /usr/bin/llvm-ar-${LLVM_VERSION} /usr/bin/llvm-ar \
&& ln -f /usr/bin/llvm-strip-${LLVM_VERSION} /usr/bin/llvm-strip
FROM llvm-base AS native-base
ARG LLVM_VERSION=16
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} \
# Runtime library dependencies.
libglib2.0-dev \
libgtk-3-dev \
libpng-dev \
# Install libomp-dev if available (OpenMP support for LLVM). It's done only
# for the native images, as OpenMP support in cross-build environment is
# tricky for LLVM.
&& (${APT_INSTALL} libomp-${LLVM_VERSION}-dev \
|| echo "OpenMP not available on this platform.") \
&& ${APT_CLEANUP}
# The following targets differ in BASE_IMAGE.
FROM native-base AS linux-386
FROM native-base AS linux-amd64
FROM native-base AS linux-arm-v5
FROM native-base AS linux-arm-v7
FROM native-base AS linux-arm64-v8
FROM native-base AS linux-mips64el
FROM native-base AS linux-mipsel
FROM native-base AS linux-ppc64le
FROM native-base AS linux-riscv64
# The following targets should have a common BASE_IMAGE.
FROM llvm-base AS linux-mips
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} gcc-multilib-mips-linux-gnu \
&& ${APT_CLEANUP}
FROM llvm-base AS linux-ppc
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} gcc-multilib-powerpc-linux-gnu \
&& ${APT_CLEANUP}
FROM llvm-base AS linux-ppc64
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} gcc-multilib-powerpc64-linux-gnu \
&& ${APT_CLEANUP}
# We use a common image for Windows i686 and amd64, as it doesn't make sense to
# make them separate in terms of build time and image size. After two runs they
# should use the same cache layers, so in the end it makes the collective image
# size smaller.
FROM base AS windows-base
ARG LLVM_MINGW_RELEASE=20240619
ARG LLVM_MINGW_VARIANT=llvm-mingw-${LLVM_MINGW_RELEASE}-msvcrt-ubuntu-20.04-x86_64
RUN ${APT_UPDATE} \
&& ${APT_INSTALL} wget \
&& ${APT_CLEANUP} \
&& cd /opt \
&& wget https://github.com/mstorsjo/llvm-mingw/releases/download/${LLVM_MINGW_RELEASE}/${LLVM_MINGW_VARIANT}.tar.xz \
&& tar -xf ${LLVM_MINGW_VARIANT}.tar.xz \
&& rm -f ${LLVM_MINGW_VARIANT}.tar.xz
ENV PATH=${PATH}:/opt/${LLVM_MINGW_VARIANT}/bin
FROM windows-base AS windows-x86-base
RUN dpkg --add-architecture i386 \
&& ${APT_UPDATE} \
&& ${APT_INSTALL} \
gcc-mingw-w64-i686 \
gcc-mingw-w64-x86-64 \
mingw-w64-tools \
procps \
wine \
wine32 \
wine64 \
&& ${APT_CLEANUP} \
# Inspired by https://code.videolan.org/videolan/docker-images
&& wine wineboot --init \
&& while pgrep wineserver > /dev/null; do \
echo "waiting ..."; \
sleep 1; \
done \
&& rm -rf /tmp/wine-*
FROM windows-x86-base AS windows-686
FROM windows-x86-base AS windows-amd64
# aarch64 image requires linaro/wine-arm64 as a base.
FROM windows-base AS windows-arm64-v8
RUN wine-arm64 wineboot --init \
&& while pgrep wineserver > /dev/null; do \
echo "waiting ..."; \
sleep 1; \
done \
&& rm -rf /tmp/wine-*


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/386
BASE_IMAGE=docker.io/i386/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/amd64
BASE_IMAGE=docker.io/amd64/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/arm/v5
BASE_IMAGE=docker.io/arm32v5/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/arm/v7
BASE_IMAGE=docker.io/arm32v7/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/arm64/v8
BASE_IMAGE=docker.io/arm64v8/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/amd64
BASE_IMAGE=docker.io/amd64/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/mips64el
BASE_IMAGE=docker.io/mips64le/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/mipsel
BASE_IMAGE=docker.io/serenitycode/debian-debootstrap
BASE_IMAGE_TAG=mipsel-bookworm-slim
LLVM_VERSION=14


@ -0,0 +1 @@
linux-amd64.env


@ -0,0 +1 @@
linux-amd64.env


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/ppc64le
BASE_IMAGE=docker.io/ppc64le/debian
BASE_IMAGE_TAG=bookworm-slim
LLVM_VERSION=16


@ -0,0 +1,4 @@
DOCKER_PLATFORM=linux/riscv64
BASE_IMAGE=docker.io/riscv64/debian
BASE_IMAGE_TAG=sid-slim
LLVM_VERSION=18


@ -0,0 +1 @@
linux-amd64.env


@ -0,0 +1 @@
linux-amd64.env


@ -0,0 +1,3 @@
DOCKER_PLATFORM=linux/amd64
BASE_IMAGE=docker.io/linaro/wine-arm64
BASE_IMAGE_TAG=latest

.gitlab-ci.d/02-build.yml Normal file

@ -0,0 +1,107 @@
# Build stage
#
# This stage builds pixman with enabled coverage for all supported
# architectures.
#
# Some targets don't support atomic profile update, so to decrease the number of
# gcov errors, they need to be built without OpenMP (single threaded) by adding
# `-Dopenmp=disabled` Meson argument.
variables:
  # Used in test stage as well.
  BUILD_DIR: build-${TOOLCHAIN}
# Applicable to all build targets.
include:
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-386
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-amd64
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-arm-v5
      qemu_cpu: arm1136
      # Disable coverage, as the tests take too long to run with a single thread.
      enable_gnu_coverage: false
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-arm-v7
      qemu_cpu: max
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-arm64-v8
      qemu_cpu: max
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-mips
      toolchain: [gnu]
      qemu_cpu: 74Kf
      enable_gnu_coverage: false
  # TODO: Merge with the one above once the following issue is resolved:
  # https://gitlab.freedesktop.org/pixman/pixman/-/issues/105).
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-mips
      toolchain: [llvm]
      qemu_cpu: 74Kf
      job_name_prefix: "."
      job_name_suffix: ":failing"
      allow_failure: true
      retry: 0
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-mips64el
      qemu_cpu: Loongson-3A4000
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-mipsel
      toolchain: [gnu]
      qemu_cpu: 74Kf
      # Disable coverage, as the tests take too long to run with a single thread.
      enable_gnu_coverage: false
  # TODO: Merge with the one above once the following issue is resolved:
  # https://gitlab.freedesktop.org/pixman/pixman/-/issues/105).
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-mipsel
      toolchain: [llvm]
      qemu_cpu: 74Kf
      job_name_prefix: "."
      job_name_suffix: ":failing"
      allow_failure: true
      retry: 0
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-ppc
      qemu_cpu: g4
      enable_gnu_coverage: false
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-ppc64
      qemu_cpu: ppc64
      enable_gnu_coverage: false
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-ppc64le
      qemu_cpu: power10
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: linux-riscv64
      qemu_cpu: rv64
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: windows-686
      enable_gnu_coverage: false
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: windows-amd64
      enable_gnu_coverage: false
  - local: .gitlab-ci.d/templates/build.yml
    inputs:
      target: windows-arm64-v8
      toolchain: [llvm] # GNU toolchain doesn't seem to support Windows on ARM.
      qemu_cpu: max
      enable_gnu_coverage: false

.gitlab-ci.d/03-test.yml Normal file

@ -0,0 +1,175 @@
# Test stage
#
# This stage executes the test suite for pixman for all architectures in
# different configurations. Build and test is split, as some architectures can
# have different QEMU configuration or have multiple supported pixman backends,
# which are executed as job matrix.
#
# Mind that `PIXMAN_ENABLE` variable in matrix runs does nothing, but it looks
# better in CI to indicate what is actually being tested.
#
# Some emulated targets are really slow or cannot be run in multithreaded mode
# (mipsel, arm-v5). Thus coverage reporting is disabled for them.
variables:
  # Used in summary stage as well.
  COVERAGE_BASE_DIR: coverage
  COVERAGE_OUT: ${COVERAGE_BASE_DIR}/${CI_JOB_ID}
  TEST_NAME: "" # Allow to specify a set of tests to run with run variables.
include:
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-386
      toolchain: [gnu]
      pixman_disable:
        - "sse2 ssse3" # Testing "mmx"
        - "mmx ssse3" # Testing "sse2"
        - "mmx sse2" # Testing "ssse3"
  # TODO: Merge up after resolving
  # https://gitlab.freedesktop.org/pixman/pixman/-/issues/106
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-386
      toolchain: [llvm]
      pixman_disable:
        # Same as above.
        - "sse2 ssse3"
        - "mmx ssse3"
        - "mmx sse2"
      job_name_prefix: "."
      job_name_suffix: ":failing"
      allow_failure: true
      retry: 0
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-amd64
      pixman_disable:
        - ""
        - "fast"
        - "wholeops"
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-arm-v5
      toolchain: [gnu]
      qemu_cpu: [arm1136]
      pixman_disable: ["arm-neon"] # Test only arm-simd.
      timeout: 3h
      test_timeout_multiplier: 40
  # TODO: Merge up after resolving
  # https://gitlab.freedesktop.org/pixman/pixman/-/issues/107
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-arm-v5
      toolchain: [llvm]
      qemu_cpu: [arm1136]
      pixman_disable: ["arm-neon"] # Test only arm-simd.
      timeout: 3h
      test_timeout_multiplier: 40
      job_name_prefix: "."
      job_name_suffix: ":failing"
      allow_failure: true
      retry: 0
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-arm-v7
      qemu_cpu: [max]
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-arm64-v8
      qemu_cpu: [max]
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-mips
      toolchain: [gnu] # TODO: Add llvm once the build is fixed.
      qemu_cpu: [74Kf]
      job_name_prefix: "."
      job_name_suffix: ":failing"
      allow_failure: true # Some tests seem to fail.
      retry: 0
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-mips64el
      toolchain: [gnu]
      qemu_cpu: [Loongson-3A4000]
  # TODO: Merge up after resolving
  # https://gitlab.freedesktop.org/pixman/pixman/-/issues/108
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-mips64el
      toolchain: [llvm]
      qemu_cpu: [Loongson-3A4000]
      job_name_prefix: "."
      job_name_suffix: ":failing"
      allow_failure: true
      retry: 0
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-mipsel
      toolchain: [gnu] # TODO: Add llvm once the build is fixed.
      qemu_cpu: [74Kf]
      timeout: 2h
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-ppc
      qemu_cpu: [g4]
      job_name_prefix: "."
      job_name_suffix: ":failing"
      allow_failure: true # SIGILL for some tests
      retry: 0
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-ppc64
      qemu_cpu: [ppc64]
      job_name_prefix: "."
      job_name_suffix: ":failing"
      allow_failure: true # SIGSEGV for some tests
      retry: 0
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-ppc64le
      toolchain: [gnu]
      qemu_cpu: [power10]
  # TODO: Merge up after resolving
  # https://gitlab.freedesktop.org/pixman/pixman/-/issues/109
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-ppc64le
      toolchain: [llvm]
      qemu_cpu: [power10]
      job_name_prefix: "."
      job_name_suffix: ":failing"
      allow_failure: true
      retry: 0
  - local: .gitlab-ci.d/templates/test.yml
    inputs:
      target: linux-riscv64
      qemu_cpu:
        # Test on target without RVV (verify no autovectorization).
        - rv64,v=false
        # Test correctness for different VLENs.
- rv64,v=true,vext_spec=v1.0,vlen=128,elen=64
- rv64,v=true,vext_spec=v1.0,vlen=256,elen=64
- rv64,v=true,vext_spec=v1.0,vlen=512,elen=64
- rv64,v=true,vext_spec=v1.0,vlen=1024,elen=64
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: windows-686
pixman_disable:
# The same as for linux-386.
- "sse2 ssse3"
- "mmx ssse3"
- "mmx sse2"
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: windows-amd64
pixman_disable:
# The same as for linux-amd64.
- ""
- "fast"
- "wholeops"
- local: .gitlab-ci.d/templates/test.yml
inputs:
target: windows-arm64-v8
toolchain: [llvm]
qemu_cpu: [max]
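The `job_name_prefix` and `job_name_suffix` inputs used above feed the job-name pattern of the test template (`<prefix>test:<target><suffix>`). The following is a minimal shell sketch of that concatenation, not part of the CI files; the helper `job_name` is hypothetical and only illustrates why a "." prefix produces a hidden (disabled) GitLab job.

```shell
#!/bin/sh
# Hypothetical helper (not in the pixman repository): mirrors the template's
# job-name pattern "<prefix><stage>:<target><suffix>".
job_name() {
    prefix=$1 stage=$2 target=$3 suffix=$4
    printf '%s%s:%s%s\n' "$prefix" "$stage" "$target" "$suffix"
}

# Regular job: empty prefix and suffix.
job_name "" test linux-amd64 ""
# Known-failing LLVM variant: the leading "." makes GitLab treat it as hidden.
job_name "." test linux-386 ":failing"
```

The suffix keeps the two `linux-386` includes from defining the same job name twice, which matches the stated purpose of `job_name_suffix`.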

View File

@ -0,0 +1,47 @@
# Summary stage
#
# This stage takes coverage reports from test runs for all architectures, and
# merges them into a single report, with GitLab visualization. There is also an
# HTML report generated as a separate artifact.
summary:
extends: .target:all
stage: summary
variables:
TARGET: linux-amd64
COVERAGE_SUMMARY_DIR: ${COVERAGE_BASE_DIR}/summary
needs:
- job: test:linux-386
optional: true
- job: test:linux-amd64
optional: true
- job: test:linux-arm-v7
optional: true
- job: test:linux-arm64-v8
optional: true
- job: test:linux-mips64el
optional: true
- job: test:linux-ppc64le
optional: true
- job: test:linux-riscv64
optional: true
script:
- echo "Input coverage reports:" && ls ${COVERAGE_BASE_DIR}/*.json || (echo "No coverage reports available." && exit)
- |
args=( )
for f in ${COVERAGE_BASE_DIR}/*.json; do
args+=( "-a" "$f" )
done
- mkdir -p ${COVERAGE_SUMMARY_DIR}
- gcovr "${args[@]}"
--cobertura-pretty --cobertura ${COVERAGE_SUMMARY_DIR}/coverage.xml
--html-details ${COVERAGE_SUMMARY_DIR}/coverage.html
--txt --print-summary
coverage: '/^TOTAL.*\s+(\d+\%)$/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: ${COVERAGE_SUMMARY_DIR}/coverage.xml
paths:
- ${COVERAGE_SUMMARY_DIR}/
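The loop in the `script` section above turns every per-job JSON report into an `-a <file>` pair for gcovr. Below is a minimal POSIX shell sketch of that step. The helper name `collect_tracefile_args` and the demo directory are assumptions for illustration. Unlike the CI script, it prints the pairs instead of filling a Bash array, and it skips the glob pattern when no report exists.

```shell
#!/bin/sh
# Hypothetical helper: print one "-a" / "<file>" pair per JSON report found
# in the given directory, one argument per line.
collect_tracefile_args() {
    dir=$1
    for f in "$dir"/*.json; do
        # Without a match the glob stays literal, so skip non-existent paths.
        [ -e "$f" ] || continue
        printf '%s\n%s\n' "-a" "$f"
    done
}

# Demo with two empty placeholder reports in a temporary directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/101.json" "$demo_dir/102.json"
collect_tracefile_args "$demo_dir"
rm -rf "$demo_dir"
```

Printing one argument per line keeps the sketch portable; the CI job can use a Bash array because its image is assumed to provide Bash.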

View File

@ -0,0 +1 @@
native-gnu.meson

View File

@ -0,0 +1 @@
native-llvm.meson

View File

@ -0,0 +1 @@
native-gnu.meson

View File

@ -0,0 +1 @@
native-llvm.meson

View File

@ -0,0 +1 @@
native-gnu-noopenmp.meson

View File

@ -0,0 +1 @@
native-llvm-noopenmp.meson

View File

@ -0,0 +1 @@
native-gnu.meson

View File

@ -0,0 +1 @@
native-llvm.meson

View File

@ -0,0 +1 @@
native-gnu.meson

View File

@ -0,0 +1 @@
native-llvm.meson

View File

@ -0,0 +1,11 @@
[binaries]
c = ['mips-linux-gnu-gcc', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'mips-linux-gnu-ar'
strip = 'mips-linux-gnu-strip'
exe_wrapper = ['qemu-mips', '-L', '/usr/mips-linux-gnu/']
[host_machine]
system = 'linux'
cpu_family = 'mips32'
cpu = 'mips32'
endian = 'big'

View File

@ -0,0 +1,14 @@
[binaries]
c = ['clang', '-target', 'mips-linux-gnu', '-fPIC', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'llvm-ar'
strip = 'llvm-strip'
exe_wrapper = ['qemu-mips', '-L', '/usr/mips-linux-gnu/']
[built-in options]
c_link_args = ['-target', 'mips-linux-gnu', '-fuse-ld=lld']
[host_machine]
system = 'linux'
cpu_family = 'mips32'
cpu = 'mips32'
endian = 'big'

View File

@ -0,0 +1,8 @@
[binaries]
c = ['gcc', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'ar'
strip = 'strip'
pkg-config = 'pkg-config'
[project options]
mips-dspr2 = 'disabled'

View File

@ -0,0 +1,8 @@
[binaries]
c = ['clang', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'llvm-ar'
strip = 'llvm-strip'
pkg-config = 'pkg-config'
[project options]
mips-dspr2 = 'disabled'

View File

@ -0,0 +1 @@
native-gnu-noopenmp.meson

View File

@ -0,0 +1 @@
native-llvm-noopenmp.meson

View File

@ -0,0 +1,11 @@
[binaries]
c = 'powerpc-linux-gnu-gcc'
ar = 'powerpc-linux-gnu-ar'
strip = 'powerpc-linux-gnu-strip'
exe_wrapper = ['qemu-ppc', '-L', '/usr/powerpc-linux-gnu']
[host_machine]
system = 'linux'
cpu_family = 'ppc'
cpu = 'ppc'
endian = 'big'

View File

@ -0,0 +1,15 @@
[binaries]
c = ['clang', '-target', 'powerpc-linux-gnu']
ar = 'llvm-ar'
strip = 'llvm-strip'
exe_wrapper = ['qemu-ppc', '-L', '/usr/powerpc-linux-gnu/']
[built-in options]
# We cannot use LLD, as it doesn't support big-endian PPC.
c_link_args = ['-target', 'powerpc-linux-gnu']
[host_machine]
system = 'linux'
cpu_family = 'ppc'
cpu = 'ppc'
endian = 'big'

View File

@ -0,0 +1,11 @@
[binaries]
c = 'powerpc64-linux-gnu-gcc'
ar = 'powerpc64-linux-gnu-ar'
strip = 'powerpc64-linux-gnu-strip'
exe_wrapper = ['qemu-ppc64', '-L', '/usr/powerpc64-linux-gnu/']
[host_machine]
system = 'linux'
cpu_family = 'ppc64'
cpu = 'ppc64'
endian = 'big'

View File

@ -0,0 +1,15 @@
[binaries]
c = ['clang', '-target', 'powerpc64-linux-gnu']
ar = 'llvm-ar'
strip = 'llvm-strip'
exe_wrapper = ['qemu-ppc64', '-L', '/usr/powerpc64-linux-gnu/']
[built-in options]
# We cannot use LLD, as it doesn't support big-endian PPC.
c_link_args = ['-target', 'powerpc64-linux-gnu']
[host_machine]
system = 'linux'
cpu_family = 'ppc64'
cpu = 'ppc64'
endian = 'big'

View File

@ -0,0 +1 @@
native-gnu.meson

View File

@ -0,0 +1 @@
native-llvm.meson

View File

@ -0,0 +1 @@
native-gnu.meson

View File

@ -0,0 +1 @@
native-llvm.meson

View File

@ -0,0 +1,8 @@
[binaries]
c = ['gcc', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'ar'
strip = 'strip'
pkg-config = 'pkg-config'
[project options]
openmp = 'disabled'

View File

@ -0,0 +1,5 @@
[binaries]
c = 'gcc'
ar = 'ar'
strip = 'strip'
pkg-config = 'pkg-config'

View File

@ -0,0 +1,8 @@
[binaries]
c = ['clang', '-DCI_HAS_ALL_MIPS_CPU_FEATURES']
ar = 'llvm-ar'
strip = 'llvm-strip'
pkg-config = 'pkg-config'
[project options]
openmp = 'disabled'

View File

@ -0,0 +1,5 @@
[binaries]
c = 'clang'
ar = 'llvm-ar'
strip = 'llvm-strip'
pkg-config = 'pkg-config'

View File

@ -0,0 +1,18 @@
[binaries]
c = 'i686-w64-mingw32-gcc'
ar = 'i686-w64-mingw32-ar'
strip = 'i686-w64-mingw32-strip'
windres = 'i686-w64-mingw32-windres'
exe_wrapper = 'wine'
[built-in options]
c_link_args = ['-static-libgcc']
[host_machine]
system = 'windows'
cpu_family = 'x86'
cpu = 'i686'
endian = 'little'
[project options]
openmp = 'disabled'

View File

@ -0,0 +1,18 @@
[binaries]
c = 'i686-w64-mingw32-clang'
ar = 'i686-w64-mingw32-llvm-ar'
strip = 'i686-w64-mingw32-strip'
windres = 'i686-w64-mingw32-windres'
exe_wrapper = 'wine'
[built-in options]
c_link_args = ['-static']
[project options]
openmp = 'disabled'
[host_machine]
system = 'windows'
cpu_family = 'x86'
cpu = 'i686'
endian = 'little'

View File

@ -0,0 +1,15 @@
[binaries]
c = 'x86_64-w64-mingw32-gcc'
ar = 'x86_64-w64-mingw32-ar'
strip = 'x86_64-w64-mingw32-strip'
windres = 'x86_64-w64-mingw32-windres'
exe_wrapper = 'wine'
[built-in options]
c_link_args = ['-static-libgcc']
[host_machine]
system = 'windows'
cpu_family = 'x86_64'
cpu = 'x86_64'
endian = 'little'

View File

@ -0,0 +1,20 @@
[binaries]
c = 'x86_64-w64-mingw32-clang'
ar = 'x86_64-w64-mingw32-llvm-ar'
strip = 'x86_64-w64-mingw32-strip'
windres = 'x86_64-w64-mingw32-windres'
exe_wrapper = 'wine'
[built-in options]
# Static linking is a workaround for `libwinpthread-1` not being discovered correctly.
c_link_args = ['-static']
[project options]
# OpenMP is disabled as it is not being discovered correctly during tests.
openmp = 'disabled'
[host_machine]
system = 'windows'
cpu_family = 'x86_64'
cpu = 'x86_64'
endian = 'little'

View File

@ -0,0 +1,18 @@
[binaries]
c = 'aarch64-w64-mingw32-clang'
ar = 'aarch64-w64-mingw32-llvm-ar'
strip = 'aarch64-w64-mingw32-strip'
windres = 'aarch64-w64-mingw32-windres'
exe_wrapper = 'wine-arm64'
[built-in options]
c_link_args = ['-static']
[project options]
openmp = 'disabled'
[host_machine]
system = 'windows'
cpu_family = 'aarch64'
cpu = 'aarch64'
endian = 'little'

View File

@ -0,0 +1,65 @@
# This file contains the set of jobs run by the pixman project:
# https://gitlab.freedesktop.org/pixman/pixman/-/pipelines
stages:
- docker
- build
- test
- summary
variables:
# Make it possible to change RUNNER_TAG from GitLab variables. The default
# `kvm` tag has been tested with FDO infrastructure.
RUNNER_TAG: kvm
# Docker image global configuration.
DOCKER_TAG: latest
DOCKER_IMAGE_NAME: registry.freedesktop.org/pixman/pixman/pixman:${DOCKER_TAG}
# Execute to load a target-specific environment.
LOAD_TARGET_ENV: source .gitlab-ci.d/01-docker/target-env/${TARGET}.env
# Enable/disable specific targets for code and platform coverage targets.
ACTIVE_TARGET_PATTERN: '/linux-386|linux-amd64|linux-arm-v5|linux-arm-v7|linux-arm64-v8|linux-mips|linux-mips64el|linux-mipsel|linux-ppc|linux-ppc64|linux-ppc64le|linux-riscv64|windows-686|windows-amd64|windows-arm64-v8/i'
workflow:
rules:
# Use modified Docker image if building in MR and Docker image is affected
# by the MR.
- if: $CI_PIPELINE_SOURCE == 'merge_request_event'
changes:
paths:
- .gitlab-ci.d/01-docker.yml
- .gitlab-ci.d/01-docker/**/*
variables:
DOCKER_TAG: $CI_COMMIT_REF_SLUG
DOCKER_IMAGE_NAME: ${CI_REGISTRY_IMAGE}/pixman:${DOCKER_TAG}
# A standard set of GitLab CI triggers (i.e., MR, schedule, default branch,
# and tag).
- if: $CI_PIPELINE_SOURCE == 'merge_request_event'
- if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS
when: never
- if: $CI_PIPELINE_SOURCE == 'schedule'
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
- if: $CI_COMMIT_BRANCH
- if: $CI_COMMIT_TAG
auto_cancel:
on_new_commit: conservative
on_job_failure: all
default:
tags:
- $RUNNER_TAG
# Retry in case the runner is misconfigured for multi-arch builds or some
# random unexpected runner error occurs (it happened during testing).
retry: 1
include:
- local: "/.gitlab-ci.d/templates/targets.yml"
- local: "/.gitlab-ci.d/01-docker.yml"
- local: "/.gitlab-ci.d/02-build.yml"
- local: "/.gitlab-ci.d/03-test.yml"
- local: "/.gitlab-ci.d/04-summary.yml"

View File

@ -0,0 +1,80 @@
spec:
inputs:
target:
description:
Build target in the form of an "OS-ARCH" pair (e.g., linux-amd64). Mostly the
same as the platform string for Docker, but with a hyphen instead of a slash.
toolchain:
description:
An array of toolchains to test with. Each toolchain should have an
appropriate Meson cross file.
type: array
default: [gnu, llvm]
qemu_cpu:
description:
QEMU_CPU environmental variable used by Docker (which uses QEMU
underneath). It is not used by x86 targets, as they are executed
natively on the host.
default: ""
enable_gnu_coverage:
description:
Enable coverage build flags. It can be later used to compile a coverage
report for all the jobs. Should be enabled only for native build
environments as they have all the optional dependencies, and are the
most reliable and uniform (so disable for cross environments).
type: boolean
default: true
job_name_prefix:
description:
Additional prefix for the job name. Can be used to disable a job with a
"." prefix.
default: ""
job_name_suffix:
description:
Additional suffix for the job name. Can be used to prevent job
duplication for jobs for the same target.
default: ""
allow_failure:
description:
Set the `allow_failure` flag for jobs that are expected to fail.
Remember to set `retry` argument to 0 to prevent unnecessary retries.
type: boolean
default: false
retry:
description:
Set the `retry` flag for a job. Usually used together with
`allow_failure`.
type: number
default: 1
---
"$[[ inputs.job_name_prefix ]]build:$[[ inputs.target ]]$[[ inputs.job_name_suffix ]]":
extends: .target:all
stage: build
allow_failure: $[[ inputs.allow_failure ]]
retry: $[[ inputs.retry ]]
needs:
- job: docker
optional: true
parallel:
matrix:
- TARGET: $[[ inputs.target ]]
variables:
TARGET: $[[ inputs.target ]]
QEMU_CPU: $[[ inputs.qemu_cpu ]]
parallel:
matrix:
- TOOLCHAIN: $[[ inputs.toolchain ]]
script:
- |
if [ "$[[ inputs.enable_gnu_coverage ]]" == "true" ] && [ "${TOOLCHAIN}" == "gnu" ]; then
COV_C_ARGS=-fprofile-update=atomic
COV_MESON_BUILD_ARGS=-Db_coverage=true
fi
- meson setup ${BUILD_DIR}
--cross-file .gitlab-ci.d/meson-cross/${TARGET}-${TOOLCHAIN}.meson
-Dc_args="${COV_C_ARGS}" ${COV_MESON_BUILD_ARGS}
- meson compile -C ${BUILD_DIR}
artifacts:
paths:
- ${BUILD_DIR}/
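The build script above does two things before calling `meson setup`: it enables coverage only for the `gnu` toolchain when `enable_gnu_coverage` is true, and it derives the cross-file path from `TARGET` and `TOOLCHAIN`. A minimal sketch of both decisions follows; the helper names `coverage_meson_args` and `cross_file` are hypothetical and do not exist in the repository.

```shell
#!/bin/sh
# Hypothetical helper: prints the extra Meson option the build job adds when
# coverage is requested and the toolchain is "gnu"; prints nothing otherwise.
coverage_meson_args() {
    enable=$1 toolchain=$2
    if [ "$enable" = "true" ] && [ "$toolchain" = "gnu" ]; then
        printf '%s\n' "-Db_coverage=true"
    fi
}

# Hypothetical helper: builds the cross-file path used by `meson setup`.
cross_file() {
    printf '.gitlab-ci.d/meson-cross/%s-%s.meson\n' "$1" "$2"
}

# Print the decisions for one target and both default toolchains.
for tc in gnu llvm; do
    echo "cross file: $(cross_file linux-amd64 "$tc")"
    echo "coverage:   $(coverage_meson_args true "$tc")"
done
```

The sketch compares strings with `=`, which is the portable spelling; the `==` used in the CI script is accepted by Bash but not by every POSIX shell.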

View File

@ -0,0 +1,9 @@
# General target templates.
.target:all:
image:
name: $DOCKER_IMAGE_NAME-$TARGET
rules:
- if: "$TARGET =~ $ACTIVE_TARGET_PATTERN"
before_script:
- ${LOAD_TARGET_ENV}
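The rule above creates jobs only when `TARGET` matches `ACTIVE_TARGET_PATTERN`. As a rough local approximation, assuming the pattern behaves like an unanchored, case-insensitive extended regular expression, the check can be sketched with `grep`. The helper `is_active_target` is hypothetical, and the pattern is passed without the surrounding `/.../i` used in the YAML.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the target matches the pattern.
# -E: extended regex, -i: case-insensitive (the "/i" flag), -q: no output.
is_active_target() {
    target=$1 pattern=$2
    printf '%s\n' "$target" | grep -Eiq -- "$pattern"
}

# Shortened pattern, for illustration only.
PATTERN='linux-386|linux-amd64|windows-amd64'

if is_active_target linux-amd64 "$PATTERN"; then
    echo "linux-amd64: jobs are created"
fi
if ! is_active_target linux-riscv64 "$PATTERN"; then
    echo "linux-riscv64: jobs are skipped"
fi
```

Because the match is unanchored in this sketch, removing a target from the pattern only disables it if no remaining alternative is a substring of its name (for example, `linux-ppc` would still match `linux-ppc64le`).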

View File

@ -0,0 +1,112 @@
spec:
inputs:
target:
description:
Build target in the form of an "OS-ARCH" pair (e.g., linux-amd64). Mostly the
same as the platform string for Docker, but with a hyphen instead of a slash.
toolchain:
description:
An array of toolchains to test with. Each toolchain should have an
appropriate Meson cross file.
type: array
default: [gnu, llvm]
qemu_cpu:
description:
An array of QEMU_CPU environmental variables used as a job matrix
variable, and in turn by Docker (which uses QEMU underneath). It is not
used by x86 targets, as they are executed natively on the host.
type: array
default: [""]
pixman_disable:
description:
An array of PIXMAN_DISABLE targets used as a job matrix variable.
type: array
default: [""]
timeout:
description:
GitLab job timeout property. May need to be increased for slow
targets.
default: 1h
test_timeout_multiplier:
description:
Test timeout multiplier flag used for Meson test execution. May need to
be increased for slow targets.
type: number
default: 20
meson_testthreads:
description:
Sets MESON_TESTTHREADS environmental variable. For some platforms, the
tests should be executed one by one (without multithreading) to prevent
gcovr errors.
type: number
default: 0
gcovr_flags:
description:
Additional flags passed to gcovr tool.
default: ""
job_name_prefix:
description:
Additional prefix for the job name. Can be used to disable a job with a
"." prefix.
default: ""
job_name_suffix:
description:
Additional suffix for the job name. Can be used to prevent job
duplication for jobs for the same target.
default: ""
allow_failure:
description:
Set the `allow_failure` flag for jobs that are expected to fail.
Remember to set `retry` argument to 0 to prevent unnecessary retries.
type: boolean
default: false
retry:
description:
Set the `retry` flag for a job. Usually used together with
`allow_failure`.
type: number
default: 1
---
"$[[ inputs.job_name_prefix ]]test:$[[ inputs.target ]]$[[ inputs.job_name_suffix ]]":
extends: .target:all
stage: test
allow_failure: $[[ inputs.allow_failure ]]
retry: $[[ inputs.retry ]]
timeout: $[[ inputs.timeout ]]
needs:
- job: docker
optional: true
parallel:
matrix:
- TARGET: $[[ inputs.target ]]
- job: build:$[[ inputs.target ]]
parallel:
matrix:
- TOOLCHAIN: $[[ inputs.toolchain ]]
variables:
TARGET: $[[ inputs.target ]]
TEST_TIMEOUT_MULTIPLIER: $[[ inputs.test_timeout_multiplier ]]
GCOVR_FLAGS: $[[ inputs.gcovr_flags ]]
MESON_ARGS: -t ${TEST_TIMEOUT_MULTIPLIER} --no-rebuild -v ${TEST_NAME}
MESON_TESTTHREADS: $[[ inputs.meson_testthreads ]]
parallel:
matrix:
- TOOLCHAIN: $[[ inputs.toolchain ]]
PIXMAN_DISABLE: $[[ inputs.pixman_disable ]]
QEMU_CPU: $[[ inputs.qemu_cpu ]]
script:
- meson test -C ${BUILD_DIR} ${MESON_ARGS}
after_script:
- mkdir -p ${COVERAGE_OUT}
- gcovr ${GCOVR_FLAGS} -r ./ ${BUILD_DIR} -e ./subprojects
--json ${COVERAGE_OUT}.json
--html-details ${COVERAGE_OUT}/coverage.html
--print-summary || echo "No coverage data available."
artifacts:
paths:
- ${BUILD_DIR}/meson-logs/testlog.txt
- ${COVERAGE_BASE_DIR}/
reports:
junit:
- ${BUILD_DIR}/meson-logs/testlog.junit.xml
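The template above assembles `MESON_ARGS` from the timeout multiplier and the optional `TEST_NAME`. A minimal sketch of that composition is shown below, for reproducing a CI invocation locally. The helper `meson_test_args` is hypothetical, `builddir` is an assumed build directory, and `blitters-test` is used only as an example test name.

```shell
#!/bin/sh
# Hypothetical helper: prints the arguments the template stores in MESON_ARGS.
# $1 is the timeout multiplier, $2 an optional test name (may be empty).
meson_test_args() {
    multiplier=$1 test_name=$2
    printf '%s' "-t ${multiplier} --no-rebuild -v"
    if [ -n "$test_name" ]; then
        printf ' %s' "$test_name"
    fi
    printf '\n'
}

BUILD_DIR=builddir   # assumption: any existing Meson build directory

# Print (not run) the commands for the default and a slow emulated target.
echo "meson test -C $BUILD_DIR $(meson_test_args 20 "")"
echo "meson test -C $BUILD_DIR $(meson_test_args 40 blitters-test)"
```

The sketch only prints the command lines, so it can be run without Meson installed; dropping the `echo` would execute the tests against an already compiled build directory.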

16
.gitlab-ci.yml Normal file
View File

@ -0,0 +1,16 @@
#
# This is the GitLab CI configuration file for the mainstream pixman project:
# https://gitlab.freedesktop.org/pixman/pixman/-/pipelines
#
# !!! DO NOT ADD ANY NEW CONFIGURATION TO THIS FILE !!!
#
# Only documentation or comments are accepted.
#
# To use a different set of jobs than the mainstream project, you need to set
# the location of your custom yml file at "custom CI/CD configuration path", on
# your GitLab CI namespace:
# https://docs.gitlab.com/ee/ci/pipelines/settings.html#custom-cicd-configuration-path
#
include:
- local: '/.gitlab-ci.d/pixman-project.yml'

View File

@ -19,9 +19,6 @@ not
Specific guidelines:
Indentation
===========
@ -93,7 +90,7 @@ or like this:
* It extends over multiple lines
*/
Generally comments should say things that is clear from the code
Generally comments should say things that aren't clear from the code
itself. If too many comments say obvious things, then people will just
stop reading all comments, including the good ones.
@ -152,21 +149,6 @@ Whitespace
if (condition) foo (); else bar (); /* Yuck! */
* Do eliminate trailing whitespace (space or tab characters) on any
line. Also, avoid putting initial or final blank lines into any
file, and never use multiple blank lines instead of a single blank
line.
* Do enable the default git pre-commit hook that detect trailing
whitespace for you and help you to avoid corrupting cairo's tree
with it. Do that as follows:
chmod a+x .git/hooks/pre-commit
* You might also find the git-stripspace utility helpful which acts as
a filter to remove trailing whitespace as well as initial, final,
and duplicate blank lines.
Function Definitions
====================
@ -174,7 +156,7 @@ Function Definitions
Function definitions should take the following form:
void
my_function (argument)
my_function (int argument)
{
do_my_things ();
}
@ -214,3 +196,4 @@ popular editors:
* vim:sw=4:sts=4:ts=8:tw=78:fo=tcroq:cindent:cino=\:0,(0
* vim:isk=a-z,A-Z,48-57,_,.,-,>
*/

79
COPYING
View File

@ -1,39 +1,42 @@
The following is the 'standard copyright' agreed upon by most contributors,
and is currently the canonical license, though a modification is currently
under discussion. Copyright holders of new code should use this license
statement where possible, and append their name to this list.
The following is the MIT license, agreed upon by most contributors.
Copyright holders of new code should use this license statement where
possible. They may also add themselves to the list below.
Copyright 1987, 1988, 1989, 1998 The Open Group
Copyright 1987, 1988, 1989 Digital Equipment Corporation
Copyright 1999, 2004, 2008 Keith Packard
Copyright 2000 SuSE, Inc.
Copyright 2000 Keith Packard, member of The XFree86 Project, Inc.
Copyright 2004, 2005, 2007, 2008 Red Hat, Inc.
Copyright 2004 Nicholas Miell
Copyright 2005 Lars Knoll & Zack Rusin, Trolltech
Copyright 2005 Trolltech AS
Copyright 2007 Luca Barbato
Copyright 2008 Aaron Plattner, NVIDIA Corporation
Copyright 2008 Rodrigo Kumpera
Copyright 2008 André Tupinambá
Copyright 2008 Mozilla Corporation
Copyright 2008 Frederic Plourde
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice (including the next
paragraph) shall be included in all copies or substantial portions of the
Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
/*
* Copyright 1987, 1988, 1989, 1998 The Open Group
* Copyright 1987, 1988, 1989 Digital Equipment Corporation
* Copyright 1999, 2004, 2008 Keith Packard
* Copyright 2000 SuSE, Inc.
* Copyright 2000 Keith Packard, member of The XFree86 Project, Inc.
* Copyright 2004, 2005, 2007, 2008, 2009, 2010 Red Hat, Inc.
* Copyright 2004 Nicholas Miell
* Copyright 2005 Lars Knoll & Zack Rusin, Trolltech
* Copyright 2005 Trolltech AS
* Copyright 2007 Luca Barbato
* Copyright 2008 Aaron Plattner, NVIDIA Corporation
* Copyright 2008 Rodrigo Kumpera
* Copyright 2008 André Tupinambá
* Copyright 2008 Mozilla Corporation
* Copyright 2008 Frederic Plourde
* Copyright 2009, Oracle and/or its affiliates. All rights reserved.
* Copyright 2009, 2010 Nokia Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/

7842
ChangeLog

File diff suppressed because it is too large

View File

@ -1,132 +0,0 @@
SUBDIRS = pixman test
pkgconfigdir=$(libdir)/pkgconfig
pkgconfig_DATA=pixman-1.pc
$(pkgconfig_DATA): pixman-1.pc.in
snapshot:
distdir="$(distdir)-`date '+%Y%m%d'`"; \
test -d "$(srcdir)/.git" && distdir=$$distdir-`cd "$(srcdir)" && git rev-parse HEAD | cut -c 1-6`; \
$(MAKE) $(AM_MAKEFLAGS) distdir="$$distdir" dist
GPGKEY=6FF7C1A8
USERNAME=$$USER
RELEASE_OR_SNAPSHOT = $$(if test "x$(CAIRO_VERSION_MINOR)" = "x$$(echo "$(CAIRO_VERSION_MINOR)/2*2" | bc)" ; then echo release; else echo snapshot; fi)
RELEASE_CAIRO_HOST = $(USERNAME)@cairographics.org
RELEASE_CAIRO_DIR = /srv/cairo.freedesktop.org/www/releases
RELEASE_CAIRO_URL = http://cairographics.org/releases
RELEASE_XORG_URL = http://xorg.freedesktop.org/archive/individual/lib
RELEASE_XORG_HOST = $(USERNAME)@xorg.freedesktop.org
RELEASE_XORG_DIR = /srv/xorg.freedesktop.org/archive/individual/lib
RELEASE_ANNOUNCE_LIST = cairo-announce@cairographics.org, xorg-announce@lists.freedesktop.org
tar_gz = $(PACKAGE)-$(VERSION).tar.gz
tar_bz2 = $(PACKAGE)-$(VERSION).tar.bz2
sha1_tgz = $(tar_gz).sha1
md5_tgz = $(tar_gz).md5
sha1_tbz2 = $(tar_bz2).sha1
md5_tbz2 = $(tar_bz2).md5
gpg_file = $(sha1_tgz).asc
$(sha1_tgz): $(tar_gz)
sha1sum $^ > $@
$(md5_tgz): $(tar_gz)
md5sum $^ > $@
$(sha1_tbz2): $(tar_bz2)
sha1sum $^ > $@
$(md5_tbz2): $(tar_bz2)
md5sum $^ > $@
$(gpg_file): $(sha1_tgz)
@echo "Please enter your GPG password to sign the checksum."
gpg --armor --sign $^
HASHFILES = $(sha1_tgz) $(sha1_tbz2) $(md5_tgz) $(md5_tbz2)
release-verify-newer:
@echo -n "Checking that no $(VERSION) release already exists at $(RELEASE_XORG_HOST)..."
@ssh $(RELEASE_XORG_HOST) test ! -e $(RELEASE_XORG_DIR)/$(tar_gz) \
|| (echo "Ouch." && echo "Found: $(RELEASE_XORG_HOST):$(RELEASE_XORG_DIR)/$(tar_gz)" \
&& echo "Refusing to try to generate a new release of the same name." \
&& false)
@ssh $(RELEASE_CAIRO_HOST) test ! -e $(RELEASE_CAIRO_DIR)/$(tar_gz) \
|| (echo "Ouch." && echo "Found: $(RELEASE_CAIRO_HOST):$(RELEASE_CAIRO_DIR)/$(tar_gz)" \
&& echo "Refusing to try to generate a new release of the same name." \
&& false)
@echo "Good."
release-remove-old:
$(RM) $(tar_gz) $(tar_bz2) $(HASHFILES) $(gpg_file)
ensure-prev:
@if [[ "$(PREV)" == "" ]]; then \
echo "" && \
echo "You must set the PREV variable on the make command line to" && \
echo "the last version." && \
echo "" && \
echo "For example:" && \
echo " make PREV=0.7.3" && \
echo "" && \
false; \
fi
release-check: ensure-prev release-verify-newer release-remove-old distcheck
release-tag:
git tag -u $(GPGKEY) -m "$(PACKAGE) $(VERSION) release" $(PACKAGE)-$(VERSION)
release-upload: release-check $(tar_gz) $(tar_bz2) $(sha1_tgz) $(sha1_tbz2) $(md5_tgz) $(gpg_file)
mkdir -p releases
scp $(tar_gz) $(sha1_tgz) $(gpg_file) $(RELEASE_CAIRO_HOST):$(RELEASE_CAIRO_DIR)
scp $(tar_gz) $(tar_bz2) $(RELEASE_XORG_HOST):$(RELEASE_XORG_DIR)
ssh $(RELEASE_CAIRO_HOST) "rm -f $(RELEASE_CAIRO_DIR)/LATEST-$(PACKAGE)-[0-9]* && ln -s $(tar_gz) $(RELEASE_CAIRO_DIR)/LATEST-$(PACKAGE)-$(VERSION)"
release-publish-message: $(HASHFILES) ensure-prev
@echo "Please follow the instructions in RELEASING to push stuff out and"
@echo "send out the announcement mails. Here is the excerpt you need:"
@echo ""
@echo "Lists: $(RELEASE_ANNOUNCE_LIST)"
@echo "Subject: [ANNOUNCE] $(PACKAGE) release $(VERSION) now available"
@echo "============================== CUT HERE =============================="
@echo "A new $(PACKAGE) release $(VERSION) is now available"
@echo ""
@echo "tar.gz:"
@echo " $(RELEASE_CAIRO_URL)/$(tar_gz)"
@echo " $(RELEASE_XORG_URL)/$(tar_gz)"
@echo ""
@echo "tar.bz2:"
@echo " $(RELEASE_XORG_URL)/$(tar_bz2)"
@echo ""
@echo "Hashes:"
@echo -n " MD5: "
@cat $(md5_tgz)
@echo -n " MD5: "
@cat $(md5_tbz2)
@echo -n " SHA1: "
@cat $(sha1_tgz)
@echo -n " SHA1: "
@cat $(sha1_tbz2)
@echo ""
@echo "GPG signature:"
@echo " $(RELEASE_CAIRO_URL)/$(gpg_file)"
@echo " (signed by `git config --get user.name` <`git config --get user.email`>)"
@echo ""
@echo "Git:"
@echo " git://git.freedesktop.org/git/pixman"
@echo " tag: $(PACKAGE)-$(VERSION)"
@echo ""
@echo "Log:"
@git log --no-merges "$(PACKAGE)-$(PREV)".."$(PACKAGE)-$(VERSION)" | git shortlog | awk '{ printf "\t"; print ; }' | cut -b1-80
@echo "============================== CUT HERE =============================="
@echo ""
release-publish: release-upload release-tag release-publish-message
.PHONY: release-upload release-publish release-publish-message release-tag

134
README
View File

@ -1,26 +1,134 @@
pixman is a library that provides low-level pixel manipulation
Pixman
======
Pixman is a library that provides low-level pixel manipulation
features such as image compositing and trapezoid rasterization.
Please submit bugs & patches to the libpixman bugzilla:
Questions should be directed to the pixman mailing list:
https://bugs.freedesktop.org/enter_bug.cgi?product=pixman
https://lists.freedesktop.org/mailman/listinfo/pixman
All questions regarding this software should be directed to either the
Xorg mailing list:
You can also file bugs at
http://lists.freedesktop.org/mailman/listinfo/xorg
https://gitlab.freedesktop.org/pixman/pixman/-/issues/new
or the cairo mailing list:
or submit improvements in the form of a Merge Request via
http://lists.freedesktop.org/mailman/listinfo/cairo
https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests
The master development code repository can be found at:
For real time discussions about pixman, feel free to join the IRC
channels #cairo and #xorg-devel on the FreeNode IRC network.
git://anongit.freedesktop.org/git/pixman
http://gitweb.freedesktop.org/?p=pixman;a=summary
Contributing
------------
For more information on the git code manager, see:
In order to contribute to pixman, you will need a working knowledge of
the git version control system. For a quick getting started guide,
there is the "Everyday Git With 20 Commands Or So" guide
http://wiki.x.org/wiki/GitPage
https://www.kernel.org/pub/software/scm/git/docs/everyday.html
from the Git homepage. For more in depth git documentation, see the
resources on the Git community documentation page:
https://git-scm.com/documentation
Pixman uses the infrastructure from the freedesktop.org umbrella
project. For instructions about how to use the git service on
freedesktop.org, see:
https://www.freedesktop.org/wiki/Infrastructure/git/Developers
The Pixman master repository can be found at:
https://gitlab.freedesktop.org/pixman/pixman
Sending patches
---------------
Patches should be submitted in the form of Merge Requests via GitLab.
You will first need to create a fork of the main pixman repository at
https://gitlab.freedesktop.org/pixman/pixman
via the Fork button on the top right. Once that is done you can add your
personal repository as a remote to your local pixman development git checkout:
git remote add my-gitlab git@gitlab.freedesktop.org:YOURUSERNAME/pixman.git
git fetch my-gitlab
Make sure to have added ssh keys to your gitlab profile at
https://gitlab.freedesktop.org/profile/keys
Once that is set up, the general workflow for sending patches is to create a
new local branch with your improvements and once it's ready push it to your
personal pixman fork:
git checkout -b fix-some-bug
...
git push my-gitlab
The output of the `git push` command will include a link that allows you to
create a Merge Request against the official pixman repository.
Whenever you make changes to your branch (add new commits or fix up commits)
you push them back to your personal pixman fork:
git push -f my-gitlab
If there is an open Merge Request, GitLab will automatically pick up the
changes from your branch and pixman developers can review them anew.
In order for your patches to be accepted, please consider the
following guidelines:
- At each point in the series, pixman should compile and the test
suite should pass.
The exception here is if you are changing the test suite to
demonstrate a bug. In this case, make one commit that makes the
test suite fail due to the bug, and then another commit that fixes
the bug.
You can run the test suite with
meson test -C builddir
It will take around two minutes to run on a modern PC.
- Follow the coding style described in the CODING_STYLE file
- For bug fixes, include an update to the test suite to make sure
the bug doesn't reappear.
- For new features, add tests of the feature to the test
suite. Also, add a program demonstrating the new feature to the
demos/ directory.
- Write descriptive commit messages. Useful information to include:
- Benchmark results, before and after
- Description of the bug that was fixed
- Detailed rationale for any new API
- Alternative approaches that were rejected (and why they
don't work)
- If review comments were incorporated, a brief version
history describing what those changes were.
- For big patch series, write an introductory post with an overall
description of the patch series, including benchmarks and
motivation. Each commit message should still be descriptive and
include enough information to understand why this particular commit
was necessary.
Pixman has high standards for code quality and so almost everybody
should expect to have the first versions of their patches rejected.
If you think that the reviewers are wrong about something, or that the
guidelines above are wrong, feel free to discuss the issue. The purpose
of the guidelines and code review is to ensure high code quality; it is
not an exercise in compliance.


@@ -10,31 +10,27 @@ Here are the steps to follow to create a new pixman release:
git log master...origin (no output; note: *3* dots)
2) Increment pixman_(major|minor|micro) in configure.ac according to
the directions in that file.
2) Increment the version in meson.build.
3) Run
3) Make sure that new version works, including
make PREV=<last version> release-check
- meson test passes
and fix things until it passes. If your freedesktop username is
different from your local username, then also set the variable
USER on the commandline.
- the X server still works with the new pixman version
installed
A very useful thing to do is to run the cairo test suite
against pixman. This can be done by running the following
commands in the "test" directory of the latest cairo release:
- the cairo test suite hasn't gained any new failures compared
to last pixman version.
tar xzf cairo-X.Y.Z.tar.gz
cd cairo
CAIRO_TEST_TARGET=image make test
4) Use "git commit" to record any changes made in steps 2 and 3.
4) Use "git commit" to record the changes made in step 2 and 3.
5) Generate and publish the tar files by running
make PREV=<last version> GPGKEY=<your gpg key id> release-publish
If your freedesktop user name is different from your local one,
then also set the variable USER to your freedesktop user name.
6) Run
make release-publish-message
@@ -44,11 +40,10 @@ Here are the steps to follow to create a new pixman release:
cairo-announce@cairographics.org
and
pixman@lists.freedesktop.org
xorg-announce@lists.freedesktop.org
7) Increment pixman_micro to the next larger (odd) number in
configure.ac. Commit this change, and push all commits created
during this process using
@@ -57,5 +52,7 @@ Here are the steps to follow to create a new pixman release:
git push --tags
You must use "--tags" here; otherwise the new tag will not
be pushed out. This is because technobabble.
be pushed out.
8) Change the topic of the #cairo IRC channel on freenode to advertise
the new version.
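The `git log master...origin` check in the release steps relies on the three-dot (symmetric difference) range, which lists commits reachable from either side but not from both; empty output therefore means the two branches are identical. A minimal sketch of that check, assuming a hypothetical helper named `in_sync` and a throwaway repository in which a branch called `published` stands in for `origin`:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: succeeds when no commit is reachable from only one
# of the two refs (three-dot symmetric difference is empty).
in_sync() {
    [ -z "$(git rev-list "$1...$2")" ]
}

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example Maintainer"
git config user.email "maintainer@example.invalid"

git commit -q --allow-empty -m "Released state"
git branch published            # stand-in for the branch on origin

if in_sync HEAD published; then
    echo "in sync"
fi

# An unpushed local commit makes the symmetric difference non-empty.
git commit -q --allow-empty -m "Unpushed local change"
if ! in_sync HEAD published; then
    echo "local and published differ"
fi
```

A two-dot range such as `published..HEAD` would report only the commits missing on one side, which is why the release notes stress using three dots.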

266
TODO

@@ -1,266 +0,0 @@
- SSE 2 issues:
- Use MM_HINT_NTA instead of MM_HINT_T0
- Use of fbCompositeOver_x888x8x8888sse2()
- Update the RELEASING file
- Things to keep in mind if breaking ABI:
- There should be a guard #ifndef I_AM_EITHER_CAIRO_OR_THE_X_SERVER
- X server will require 16.16 essentially forever. Can we get
the required precision by simply adding offset_x/y to the
relevant rendering API?
- Get rid of workaround for X server bug.
- pixman_image_set_indexed() should copy its argument, and X
should be ported over to use a pixman_image as the
representation of a Picture, rather than creating one on each
operation.
- We should get rid of pixman_set_static_pointers()
- We should get rid of the various trapezoid helper functions().
(They only exist because they are theoretically available to
drivers).
- 16 bit regions should be deleted
- There should only be one trap rasterization API.
- The PIXMAN_g8/c8/etc formats should use the A channel
to indicate the actual depth. That way PIXMAN_x4c4 and PIXMAN_c8
won't collide.
- Maybe bite the bullet and make configure.ac generate a pixman-types.h
file that can be included from pixman.h to avoid the #ifdef magic
in pixman.h
- Make pixman_region_point_in() survive a NULL box, then fix up
pixman-compose.c
- Possibly look into inlining the fetch functions
- There is a bug with source clipping demonstrated by clip-test in the
test directory. If we interpret source clipping as given in
destination coordinates, which is probably the only sane choice,
then the result should have two red bars down the sides.
- Test suite
- Add a general way of dealing with architecture specific
fast-paths. The current idea is to have each operation that can
be optimized called through a function pointer that is
initially set to an initialization function that is responsible for
setting the function pointer to the appropriate fast-path.
- Go through things marked FIXME
- Add calls to prepare and finish access where necessary. grep for
ACCESS_MEM, and make sure they are correctly wrapped in prepare
and finish.
- restore READ/WRITE in the fbcompose combiners since they sometimes
store directly to destination drawables.
- It probably makes sense to move the more strange X region API
into pixman as well, but guarded with PIXMAN_XORG_COMPATIBILITY
- Reinstate the FbBits typedef? At the moment we don't
even have the FbBits type; we just use uint32_t everywhere.
Keith says in bug 2335:
The 64-bit code in fb (pixman) is probably broken; it hasn't been
used in quite some time as PCI (and AGP) is 32-bits wide, so
doing things 64-bits at a time is a net loss. To quickly fix
this, I suggest just using 32-bit datatypes by setting
IC_SHIFT to 5 for all machines.
- Consider optimizing the 8/16 bit solid fills in pixman-util.c by
storing more than one value at a time.
- Add an image cache to prevent excessive malloc/free. Note that pixman
needs to be thread safe when used from cairo.
- Moving to 24.8 coordinates. This is tricky because X is still
defined as 16.16 and will be basically forever. It's possible we
could do this by adding extra offset_x/y parameters to the
trapezoid calls. The X server could then just call the API with
(0, 0). Cairo would have to make sure that the delta *within* a
batch of trapezoids does not exceed 16 bit.
- Consider adding actual backends. Brain dump:
A backend is something that knows how to
- Create images
- Composite three images
- Rasterize trapezoids
- Do solid fills and blits
These operations are provided by a vtable that the backend will
create when it is initialized. Initial backends:
- VMX
- SSE2
- MMX
- Plain Old C
When the SIMD backends are initialized, they will be passed a
pointer to the Plain Old C backend that they can use for fallback
purposes.
Images would gain a vtable as well that would contain things like
- Read scanline
- Write scanline
(Or even read_patch/write_patch as suggested by Keith a while
back).
This could simplify the compositing code considerably.
- Review the pixman_format_code_t enum to make sure it will support
future formats. Some formats we will probably need:
ARGB/ABGR with 16/32/64 bit integer/floating channels
YUV2,
YV12
Also we may need the ability to distinguish between PICT_c8 and
PICT_x4c4. (This could be done by interpreting the A channel as
the depth for TYPE_COLOR and TYPE_GRAY formats).
A possibility may be to reserve the two top bits and make them
encode "number of places to shift the channel widths given" Since
these bits are 00 at the moment everything will continue to work,
but these additional widths will be allowed:
All even widths between 18-32
All multiples of four widths between 33 and 64
All multiples of eight between 64 and 128
This means things like r21g22b21 won't work - is that worth
worrying about? I don't think so. And of course the bpp field
can't handle a depth of over 256, so > 64 bit channels aren't
really all that useful.
We could reserve one extra bit to indicate floating point, but
we may also just add
PIXMAN_TYPE_ARGB_FLOAT
PIXMAN_TYPE_BGRA_FLOAT
PIXMAN_TYPE_A_FLOAT
image types. With five bits we can support up to 32 different
format types, which should be enough for everybody, even if we
decide to support all the various video formats here:
http://www.fourcc.org/yuv.php
It may make sense to have a PIXMAN_TYPE_YUV, and then use the
channel bits to specify the exact subtype.
Another possibility is to add
PIXMAN_TYPE_ARGB_W
PIXMAN_TYPE_ARGB_WW
where the channel widths would get 16 and 32 added to them,
respectively.
What about color spaces such as linear vs. sRGB etc.?
done:
- Use pixmanFillsse2 and pixmanBltsse2
- Be consistent about calling sse2 sse2
- Rename "SSE" to "MMX_EXTENSIONS". (Deleted mmx extensions).
- Commented-out uses of fbCompositeCopyAreasse2()
- Consider whether calling regions region16 is really such a great
idea. Vlad wants 32 bit regions for Cairo. This will break X server
ABI, but should otherwise be mostly harmless, though a
pixman_region_get_boxes16() may be useful.
- Altivec signal issue (Company has fix, there is also a patch by
dwmw2 in rawhide).
- Behdad's MMX issue - see list
- SSE2 issues:
- Crashes in Mozilla because of unaligned stack. Possible fixes
- Make use of gcc 4.2 feature to align the stack
- Write some sort of trampoline that aligns the stack
before calling SSE functions.
- Get rid of the switch-of-doom; replace it with a big table
describing the various fast paths.
- Make source clipping optional.
- done: source clipping happens through an indirection.
still needs to make the indirection settable. (And call it
from X)
- Run cairo test suite; fix bugs
- one bug in source-scale-clip
- Remove the warning suppression in the ACCESS_MEM macro and fix the
warnings that are real
- irrelevant now.
- make the wrapper functions global instead of image specific
- this won't work since pixman is linked to both fb and wfb
- Add non-mmx solid fill
- Make sure the endian-ness macros are defined correctly.
- The rectangles in a region probably shouldn't be returned const as
the X server will be changing them.
- Right now we _always_ have a clip region, which is empty by default.
Why does this work at all? It probably doesn't. The server
distinguishes two cases, one where nothing is clipped (CT_NONE), and
one where there is a clip region (CT_REGION).
- Default clip region should be the full image
- Test if pseudo color still works. It does, but it also shows that
copying a pixman_indexed_t on every composite operation is not
going to fly. So, for now set_indexed() does not copy the
indexed table.
Also just the malloc() to allocate a pixman image shows up pretty
high.
Options include
- Make all the setters not copy their arguments
- Possibly combined with going back to the stack allocated
approach that we already use for regions.
- Keep a cached pixman_image_t around for every picture. It would
have to be kept up to date every time something changes about the
picture.
- Break the X server ABI and simply have the relevant parameter
stored in the pixman image. This would have the additional benefits
that:
- We can get rid of the annoying repeat field which is duplicated
elsewhere.
- We can use pixman_color_t and pixman_gradient_stop_t
etc. instead of the types that are defined in
renderproto.h

5
a64-neon-test.S Normal file

@@ -0,0 +1,5 @@
.text
.arch armv8-a
.altmacro
prfm pldl2strm, [x0]
xtn v0.8b, v0.8h

10
arm-simd-test.S Normal file

@@ -0,0 +1,10 @@
.text
.arch armv6
.object_arch armv4
.arm
.altmacro
#ifndef __ARM_EABI__
#error EABI is required (to be sure that calling conventions are compatible)
#endif
pld [r0]
uqadd8 r0, r0, r0


@@ -1,12 +0,0 @@
#! /bin/sh
srcdir=`dirname $0`
test -z "$srcdir" && srcdir=.
ORIGDIR=`pwd`
cd $srcdir
autoreconf -v --install || exit 1
cd $ORIGDIR || exit $?
$srcdir/configure --enable-maintainer-mode "$@"


@@ -1,518 +0,0 @@
dnl Copyright 2005 Red Hat, Inc.
dnl
dnl Permission to use, copy, modify, distribute, and sell this software and its
dnl documentation for any purpose is hereby granted without fee, provided that
dnl the above copyright notice appear in all copies and that both that
dnl copyright notice and this permission notice appear in supporting
dnl documentation, and that the name of Red Hat not be used in
dnl advertising or publicity pertaining to distribution of the software without
dnl specific, written prior permission. Red Hat makes no
dnl representations about the suitability of this software for any purpose. It
dnl is provided "as is" without express or implied warranty.
dnl
dnl RED HAT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
dnl INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO
dnl EVENT SHALL RED HAT BE LIABLE FOR ANY SPECIAL, INDIRECT OR
dnl CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE,
dnl DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
dnl TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
dnl PERFORMANCE OF THIS SOFTWARE.
dnl
dnl Process this file with autoconf to create configure.
AC_PREREQ([2.57])
# Pixman versioning scheme
#
# - The version in git has an odd MICRO version number
#
# - Released versions both development and stable have an even MICRO
# version number
#
# - Released development versions have an odd MINOR number
#
# - Released stable versions have an even MINOR number
#
# - Versions that break ABI must have a new MAJOR number
#
# - If you break the ABI, then at least this must be done:
#
# - increment MAJOR
#
# - In the first development release where you break ABI, find
# all instances of "pixman-n" and change them to pixman-(n+1)
#
# This needs to be done at least in
# configure.ac
# all Makefile.am's
# pixman-n.pc.in
#
# This ensures that binary incompatible versions can be installed
# in parallel. See http://www106.pair.com/rhp/parallel.html for
# more information
#
m4_define([pixman_major], 0)
m4_define([pixman_minor], 16)
m4_define([pixman_micro], 4)
m4_define([pixman_version],[pixman_major.pixman_minor.pixman_micro])
AC_INIT(pixman, pixman_version, "sandmann@daimi.au.dk", pixman)
AM_INIT_AUTOMAKE([dist-bzip2])
AM_CONFIG_HEADER(config.h)
AC_CANONICAL_HOST
test_CFLAGS=${CFLAGS+set} # We may override autoconf default CFLAGS.
AC_PROG_CC
AC_PROG_LIBTOOL
AC_CHECK_FUNCS([getisax])
AC_C_BIGENDIAN
AC_C_INLINE
# Checks for Sun Studio compilers
AC_CHECK_DECL([__SUNPRO_C], [SUNCC="yes"], [SUNCC="no"])
AC_CHECK_DECL([__amd64], [AMD64_ABI="yes"], [AMD64_ABI="no"])
# Default CFLAGS to -O -g rather than just the -g from AC_PROG_CC
# if we're using Sun Studio and neither the user nor a config.site
# has set CFLAGS.
if test $SUNCC = yes && \
test "$test_CFLAGS" = "" && \
test "$CFLAGS" = "-g"
then
CFLAGS="-O -g"
fi
#
# We ignore pixman_major in the version here because the major version should
# always be encoded in the actual library name. Ie., the soname is:
#
# pixman-$(pixman_major).0.minor.micro
#
m4_define([lt_current], [pixman_minor])
m4_define([lt_revision], [pixman_micro])
m4_define([lt_age], [pixman_minor])
LT_VERSION_INFO="lt_current:lt_revision:lt_age"
PIXMAN_VERSION_MAJOR=pixman_major()
AC_SUBST(PIXMAN_VERSION_MAJOR)
PIXMAN_VERSION_MINOR=pixman_minor()
AC_SUBST(PIXMAN_VERSION_MINOR)
PIXMAN_VERSION_MICRO=pixman_micro()
AC_SUBST(PIXMAN_VERSION_MICRO)
AC_SUBST(LT_VERSION_INFO)
# Check for dependencies
#PKG_CHECK_MODULES(DEP, x11)
changequote(,)dnl
if test "x$GCC" = "xyes"; then
case " $CFLAGS " in
*[\ \ ]-Wall[\ \ ]*) ;;
*) CFLAGS="$CFLAGS -Wall" ;;
esac
case " $CFLAGS " in
*[\ \ ]-fno-strict-aliasing[\ \ ]*) ;;
*) CFLAGS="$CFLAGS -fno-strict-aliasing" ;;
esac
fi changequote([,])dnl
AC_PATH_PROG(PERL, perl, no)
if test "x$PERL" = xno; then
AC_MSG_ERROR([Perl is required to build pixman.])
fi
AC_SUBST(PERL)
dnl =========================================================================
dnl -fvisibility stuff
have_gcc4=no
AC_MSG_CHECKING(for -fvisibility)
AC_COMPILE_IFELSE([
#if defined(__GNUC__) && (__GNUC__ >= 4)
#else
error Need GCC 4.0 for visibility
#endif
int main () { return 0; }
], have_gcc4=yes)
if test "x$have_gcc4" = "xyes"; then
CFLAGS="$CFLAGS -fvisibility=hidden"
fi
AC_MSG_RESULT($have_gcc4)
have_sunstudio8=no
AC_MSG_CHECKING([for -xldscope (Sun compilers)])
AC_COMPILE_IFELSE([
#if defined(__SUNPRO_C) && (__SUNPRO_C >= 0x550)
#else
error Need Sun Studio 8 for visibility
#endif
int main () { return 0; }
], have_sunstudio8=yes)
if test "x$have_sunstudio8" = "xyes"; then
CFLAGS="$CFLAGS -xldscope=hidden"
fi
AC_MSG_RESULT($have_sunstudio8)
dnl ===========================================================================
dnl Check for MMX
if test "x$MMX_CFLAGS" = "x" ; then
if test "x$SUNCC" = "xyes"; then
# Sun Studio doesn't have an -xarch=mmx flag, so we have to use sse
# but if we're building 64-bit, mmx & sse support is on by default and
# -xarch=sse throws an error instead
if test "$AMD64_ABI" = "no" ; then
MMX_CFLAGS="-xarch=sse"
fi
else
MMX_CFLAGS="-mmmx -Winline"
fi
fi
have_mmx_intrinsics=no
AC_MSG_CHECKING(whether to use MMX intrinsics)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="$MMX_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([
#if defined(__GNUC__) && (__GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 4))
error "Need GCC >= 3.4 for MMX intrinsics"
#endif
#include <mmintrin.h>
int main () {
__m64 v = _mm_cvtsi32_si64 (1);
return _mm_cvtsi64_si32 (v);
}], have_mmx_intrinsics=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(mmx,
[AC_HELP_STRING([--disable-mmx],
[disable MMX fast paths])],
[enable_mmx=$enableval], [enable_mmx=auto])
if test $enable_mmx = no ; then
have_mmx_intrinsics=disabled
fi
if test $have_mmx_intrinsics = yes ; then
AC_DEFINE(USE_MMX, 1, [use MMX compiler intrinsics])
else
MMX_CFLAGS=
fi
AC_MSG_RESULT($have_mmx_intrinsics)
if test $enable_mmx = yes && test $have_mmx_intrinsics = no ; then
AC_MSG_ERROR([MMX intrinsics not detected])
fi
AM_CONDITIONAL(USE_MMX, test $have_mmx_intrinsics = yes)
dnl ===========================================================================
dnl Check for SSE2
if test "x$SSE2_CFLAGS" = "x" ; then
if test "x$SUNCC" = "xyes"; then
# SSE2 is enabled by default in the Sun Studio 64-bit environment
if test "$AMD64_ABI" = "no" ; then
SSE2_CFLAGS="-xarch=sse2"
fi
else
SSE2_CFLAGS="-mmmx -msse2 -Winline"
fi
fi
have_sse2_intrinsics=no
AC_MSG_CHECKING(whether to use SSE2 intrinsics)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="$SSE2_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([
#if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 2))
# if !defined(__amd64__) && !defined(__x86_64__)
# error "Need GCC >= 4.2 for SSE2 intrinsics on x86"
# endif
#endif
#include <mmintrin.h>
#include <xmmintrin.h>
#include <emmintrin.h>
int main () {
__m128i a, b, c;
c = _mm_xor_si128 (a, b);
return 0;
}], have_sse2_intrinsics=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(sse2,
[AC_HELP_STRING([--disable-sse2],
[disable SSE2 fast paths])],
[enable_sse2=$enableval], [enable_sse2=auto])
if test $enable_sse2 = no ; then
have_sse2_intrinsics=disabled
fi
if test $have_sse2_intrinsics = yes ; then
AC_DEFINE(USE_SSE2, 1, [use SSE2 compiler intrinsics])
fi
AC_MSG_RESULT($have_sse2_intrinsics)
if test $enable_sse2 = yes && test $have_sse2_intrinsics = no ; then
AC_MSG_ERROR([SSE2 intrinsics not detected])
fi
AM_CONDITIONAL(USE_SSE2, test $have_sse2_intrinsics = yes)
dnl ===========================================================================
dnl Other special flags needed when building code using MMX or SSE instructions
case $host_os in
solaris*)
# When building 32-bit binaries, apply a mapfile to ensure that the
# binaries aren't flagged as only able to run on MMX+SSE capable CPUs
# since they check at runtime before using those instructions.
# Not all linkers grok the mapfile format so we check for that first.
if test "$AMD64_ABI" = "no" ; then
use_hwcap_mapfile=no
AC_MSG_CHECKING(whether to use a hardware capability map file)
hwcap_save_LDFLAGS="$LDFLAGS"
HWCAP_LDFLAGS='-Wl,-M,$(srcdir)/solaris-hwcap.mapfile'
LDFLAGS="$LDFLAGS -Wl,-M,pixman/solaris-hwcap.mapfile"
AC_LINK_IFELSE([int main() { return 0; }],
use_hwcap_mapfile=yes,
HWCAP_LDFLAGS="")
LDFLAGS="$hwcap_save_LDFLAGS"
AC_MSG_RESULT($use_hwcap_mapfile)
fi
if test "x$MMX_LDFLAGS" = "x" ; then
MMX_LDFLAGS="$HWCAP_LDFLAGS"
fi
if test "x$SSE2_LDFLAGS" = "x" ; then
SSE2_LDFLAGS="$HWCAP_LDFLAGS"
fi
;;
esac
AC_SUBST(MMX_CFLAGS)
AC_SUBST(MMX_LDFLAGS)
AC_SUBST(SSE2_CFLAGS)
AC_SUBST(SSE2_LDFLAGS)
dnl ===========================================================================
dnl Check for VMX/Altivec
if test -n "`$CC -v 2>&1 | grep version | grep Apple`"; then
VMX_CFLAGS="-faltivec"
else
VMX_CFLAGS="-maltivec -mabi=altivec"
fi
have_vmx_intrinsics=no
AC_MSG_CHECKING(whether to use VMX/Altivec intrinsics)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="$VMX_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([
#if defined(__GNUC__) && (__GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 4))
error "Need GCC >= 3.4 for sane altivec support"
#endif
#include <altivec.h>
int main () {
vector unsigned int v = vec_splat_u32 (1);
v = vec_sub (v, v);
return 0;
}], have_vmx_intrinsics=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(vmx,
[AC_HELP_STRING([--disable-vmx],
[disable VMX fast paths])],
[enable_vmx=$enableval], [enable_vmx=auto])
if test $enable_vmx = no ; then
have_vmx_intrinsics=disabled
fi
if test $have_vmx_intrinsics = yes ; then
AC_DEFINE(USE_VMX, 1, [use VMX compiler intrinsics])
else
VMX_CFLAGS=
fi
AC_MSG_RESULT($have_vmx_intrinsics)
if test $enable_vmx = yes && test $have_vmx_intrinsics = no ; then
AC_MSG_ERROR([VMX intrinsics not detected])
fi
AC_SUBST(VMX_CFLAGS)
AM_CONDITIONAL(USE_VMX, test $have_vmx_intrinsics = yes)
dnl ===========================================================================
dnl Check for ARM SIMD instructions
ARM_SIMD_CFLAGS="-mcpu=arm1136j-s"
have_arm_simd=no
AC_MSG_CHECKING(whether to use ARM SIMD assembler)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="$ARM_SIMD_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([
int main () {
asm("uqadd8 r1, r1, r2");
return 0;
}], have_arm_simd=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(arm-simd,
[AC_HELP_STRING([--disable-arm-simd],
[disable ARM SIMD fast paths])],
[enable_arm_simd=$enableval], [enable_arm_simd=auto])
if test $enable_arm_simd = no ; then
have_arm_simd=disabled
fi
if test $have_arm_simd = yes ; then
AC_DEFINE(USE_ARM_SIMD, 1, [use ARM SIMD compiler intrinsics])
else
ARM_SIMD_CFLAGS=
fi
AC_MSG_RESULT($have_arm_simd)
if test $enable_arm_simd = yes && test $have_arm_simd = no ; then
AC_MSG_ERROR([ARM SIMD intrinsics not detected])
fi
AC_SUBST(ARM_SIMD_CFLAGS)
AM_CONDITIONAL(USE_ARM_SIMD, test $have_arm_simd = yes)
dnl ==========================================================================
dnl Check for ARM NEON instructions
ARM_NEON_CFLAGS="-mfpu=neon -mcpu=cortex-a8"
have_arm_neon=no
AC_MSG_CHECKING(whether to use ARM NEON)
xserver_save_CFLAGS=$CFLAGS
CFLAGS="$ARM_NEON_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([
#include <arm_neon.h>
int main () {
uint8x8_t neon_test=vmov_n_u8(0);
return 0;
}], have_arm_neon=yes)
CFLAGS=$xserver_save_CFLAGS
AC_ARG_ENABLE(arm-neon,
[AC_HELP_STRING([--disable-arm-neon],
[disable ARM NEON fast paths])],
[enable_arm_neon=$enableval], [enable_arm_neon=auto])
if test $enable_arm_neon = no ; then
have_arm_neon=disabled
fi
if test $have_arm_neon = yes ; then
AC_DEFINE(USE_ARM_NEON, 1, [use ARM NEON compiler intrinsics])
else
ARM_NEON_CFLAGS=
fi
AC_SUBST(ARM_NEON_CFLAGS)
AM_CONDITIONAL(USE_ARM_NEON, test $have_arm_neon = yes)
AC_MSG_RESULT($have_arm_neon)
if test $enable_arm_neon = yes && test $have_arm_neon = no ; then
AC_MSG_ERROR([ARM NEON intrinsics not detected])
fi
dnl =========================================================================================
dnl Check for GNU-style inline assembly support
have_gcc_inline_asm=no
AC_MSG_CHECKING(whether to use GNU-style inline assembler)
AC_COMPILE_IFELSE([
int main () {
/* Most modern architectures have a NOP instruction, so this is a fairly generic test. */
asm volatile ( "\tnop\n" : : : "cc", "memory" );
return 0;
}], have_gcc_inline_asm=yes)
AC_ARG_ENABLE(gcc-inline-asm,
[AC_HELP_STRING([--disable-gcc-inline-asm],
[disable GNU-style inline assembler])],
[enable_gcc_inline_asm=$enableval], [enable_gcc_inline_asm=auto])
if test $enable_gcc_inline_asm = no ; then
have_gcc_inline_asm=disabled
fi
if test $have_gcc_inline_asm = yes ; then
AC_DEFINE(USE_GCC_INLINE_ASM, 1, [use GNU-style inline assembler])
fi
AC_MSG_RESULT($have_gcc_inline_asm)
if test $enable_gcc_inline_asm = yes && test $have_gcc_inline_asm = no ; then
AC_MSG_ERROR([GNU-style inline assembler not detected])
fi
AM_CONDITIONAL(USE_GCC_INLINE_ASM, test $have_gcc_inline_asm = yes)
dnl ==============================================
dnl Timers
AC_ARG_ENABLE(timers,
[AC_HELP_STRING([--enable-timers],
[enable TIMER_BEGIN and TIMER_END macros [default=no]])],
[enable_timers=$enableval], [enable_timers=no])
if test $enable_timers = yes ; then
AC_DEFINE(PIXMAN_TIMERS, 1, [enable TIMER_BEGIN/TIMER_END macros])
fi
AC_SUBST(PIXMAN_TIMERS)
dnl ===================================
dnl GTK+
AC_ARG_ENABLE(gtk,
[AC_HELP_STRING([--enable-gtk],
[enable tests using GTK+ [default=auto]])],
[enable_gtk=$enableval], [enable_gtk=auto])
PKG_PROG_PKG_CONFIG
if test $enable_gtk = auto ; then
PKG_CHECK_EXISTS([gtk+-2.0], [enable_gtk=yes], [enable_gtk=no])
fi
if test $enable_gtk = yes ; then
PKG_CHECK_MODULES(GTK, [gtk+-2.0])
fi
AM_CONDITIONAL(HAVE_GTK, [test "x$enable_gtk" = xyes])
AC_SUBST(GTK_CFLAGS)
AC_SUBST(GTK_LIBS)
AC_SUBST(DEP_CFLAGS)
AC_SUBST(DEP_LIBS)
dnl =====================================
dnl posix_memalign
AC_CHECK_FUNC(posix_memalign, have_posix_memalign=yes, have_posix_memalign=no)
if test x$have_posix_memalign = xyes; then
AC_DEFINE(HAVE_POSIX_MEMALIGN, 1, [Whether we have posix_memalign()])
fi
AC_OUTPUT([pixman-1.pc
pixman-1-uninstalled.pc
Makefile
pixman/Makefile
pixman/pixman-version.h
test/Makefile])
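The versioning comments and the libtool settings in the removed configure.ac can be illustrated with a small sketch. Both function names are hypothetical, and the library suffix assumes libtool's usual Linux mapping of `current:revision:age` to `(current - age).age.revision`, which is an assumption about the target platform rather than something the file itself states.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the odd/even scheme from configure.ac.

# Classify a MAJOR.MINOR.MICRO version string.
classify_pixman_version() {
    minor=$(echo "$1" | cut -d. -f2)
    micro=$(echo "$1" | cut -d. -f3)
    if [ $((micro % 2)) -eq 1 ]; then
        echo "git snapshot"            # odd MICRO: version in git
    elif [ $((minor % 2)) -eq 1 ]; then
        echo "development release"     # even MICRO, odd MINOR
    else
        echo "stable release"          # even MICRO, even MINOR
    fi
}

# Shared-library suffix for LT_VERSION_INFO = minor:micro:minor,
# assuming libtool's Linux mapping (current - age).age.revision.
libtool_suffix() {
    minor=$(echo "$1" | cut -d. -f2)
    micro=$(echo "$1" | cut -d. -f3)
    current=$minor
    revision=$micro
    age=$minor
    echo "$((current - age)).$age.$revision"
}

classify_pixman_version 0.16.4
classify_pixman_version 0.33.6
classify_pixman_version 0.16.5
echo "libpixman-1.so.$(libtool_suffix 0.16.4)"
```

Because `current` and `age` are both set to the minor version, their difference is always zero, which is how the `.0.minor.micro` pattern in the soname comment arises.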

397
debian/changelog vendored

@@ -1,10 +1,403 @@
pixman (0.16.4-2) UNRELEASED; urgency=low
pixman (0.44.0-4) UNRELEASED; urgency=medium
* Team upload.
* debian/copyright: Convert to machine-readable format
-- Dylan Aïssi <daissi@debian.org> Thu, 31 Jul 2025 22:16:23 +0200
pixman (0.44.0-3) unstable; urgency=medium
* Replace timeout bump patch by using a multiplier option instead.
Thanks, Aurelien Jarno! (Closes: #1086999)
-- Timo Aaltonen <tjaalton@debian.org> Sat, 09 Nov 2024 11:02:55 +0200
pixman (0.44.0-2) unstable; urgency=medium
* patches: Increase test timeout 120->240s. (Closes: #1086999)
-- Timo Aaltonen <tjaalton@debian.org> Fri, 08 Nov 2024 09:58:04 +0200
pixman (0.44.0-1) unstable; urgency=medium
* New upstream release.
* patches: Refresh patch.
* control, rules: Build with meson.
* symbols: Updated.
* control: Migrate to pkgconf.
* rules: Drop obsolete dbgsym-migration.
-- Timo Aaltonen <tjaalton@debian.org> Thu, 07 Nov 2024 16:48:29 +0200
pixman (0.42.2-1) unstable; urgency=medium
* New upstream release.
* d/p/Avoid-integer-overflow-leading-to-out-of-bounds-writ.diff:
- Removed, fixed upstream.
-- Emilio Pozuelo Monfort <pochu@debian.org> Fri, 11 Nov 2022 13:42:25 +0100
pixman (0.40.0-1.1) unstable; urgency=medium
* Non-maintainer upload.
* Avoid integer overflow leading to out-of-bounds write (CVE-2022-44638)
(Closes: #1023427)
-- Salvatore Bonaccorso <carnil@debian.org> Thu, 03 Nov 2022 23:07:46 +0100
pixman (0.40.0-1) unstable; urgency=medium
* New upstream release. (Closes: #958298, #832579, #838650)
* control, rules: Migrate to debhelper-compat, bump to 13.
* symbols: Updated, bump shlibs.
-- Timo Aaltonen <tjaalton@debian.org> Thu, 03 Dec 2020 15:28:13 +0200
pixman (0.36.0-1) unstable; urgency=medium
* New upstream release.
* Update to my Debian address.
* Update Vcs-* URLs to point to salsa.debian.org.
* Use https URL in debian/copyright.
* Set source format to 1.0.
* Bump debhelper compat to 11.
* Bump standards version to 4.2.1.
-- Andreas Boll <aboll@debian.org> Wed, 12 Dec 2018 22:02:44 +0100
pixman (0.34.0-2) unstable; urgency=medium
* Declare Multi-Arch: same for libpixman-1-dev (Closes: #884166).
* Switch to dbgsym package.
* Stop passing --disable-silent-rules to configure, debhelper does it
now.
* Bump standards version to 4.1.2.
-- Andreas Boll <andreas.boll.dev@gmail.com> Sun, 17 Dec 2017 13:33:55 +0100
pixman (0.34.0-1) unstable; urgency=medium
* Team upload.
* New upstream release (no actual changes)
* Use https URL in debian/watch.
-- Julien Cristau <jcristau@debian.org> Sat, 24 Sep 2016 13:25:16 +0200
pixman (0.33.6-1) unstable; urgency=medium
* New upstream release candidate.
* Add myself to Uploaders.
-- Andreas Boll <andreas.boll.dev@gmail.com> Thu, 14 Jan 2016 13:46:28 +0100
pixman (0.33.4-1) unstable; urgency=medium
* Team upload.
* New upstream release candidate.
-- Andreas Boll <andreas.boll.dev@gmail.com> Wed, 04 Nov 2015 13:26:18 +0100
pixman (0.33.2-2) sid; urgency=medium
* Run tests with VERBOSE=1.
-- Julien Cristau <jcristau@debian.org> Sat, 12 Sep 2015 20:31:06 +0200
pixman (0.33.2-1) sid; urgency=medium
[ Andreas Boll ]
* New upstream release candidate.
* Enable vmx on ppc64el (closes: #786345).
* Update Vcs-* fields.
* Add upstream url.
* Drop XC- prefix from Package-Type field.
* Bump standards version to 3.9.6.
[ intrigeri ]
* Simplify hardening build flags handling (closes: #760100).
Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
* Enable all hardening build flags. Thanks to Simon Ruderich too.
-- Julien Cristau <jcristau@debian.org> Sat, 12 Sep 2015 13:08:02 +0200
pixman (0.32.6-3) sid; urgency=medium
[ intrigeri ]
* Enable hardening build flags with dpkg-buildflags.
-- Julien Cristau <jcristau@debian.org> Sat, 23 Aug 2014 22:16:40 -0700
pixman (0.32.6-2) sid; urgency=medium
[ Julien Cristau ]
* Disable vmx on ppc64el (closes: #745547). Thanks, Breno Leitao!
-- Cyril Brulebois <kibi@debian.org> Mon, 18 Aug 2014 22:50:39 +0200
pixman (0.32.6-1) sid; urgency=medium
* New upstream release.
* Bump debhelper compat level to 9.
* Remove Cyril from Uploaders.
-- Julien Cristau <jcristau@debian.org> Sun, 13 Jul 2014 16:31:06 +0200
pixman (0.32.4-1) sid; urgency=low
* New upstream release.
-- Julien Cristau <jcristau@debian.org> Tue, 17 Dec 2013 22:04:15 +0100
pixman (0.30.2-2) sid; urgency=low
* Cherry-pick upstream bugfixes for fixing a crash when rendering
invalid trapezoids. (LP: #1197921)
Addresses CVE-2013-6425.
-- Maarten Lankhorst <maarten.lankhorst@ubuntu.com> Mon, 18 Nov 2013 15:08:56 +0100
pixman (0.30.2-1) sid; urgency=low
* New upstream release
- includes big-endian matrix-test fix
* Increase alpha-loop test timeout some more.
-- Julien Cristau <jcristau@debian.org> Tue, 13 Aug 2013 12:08:18 +0200
pixman (0.30.0-3) sid; urgency=low
* Increase timeout for the alpha-loop test. That will hopefully let it pass
on the mips buildd.
-- Julien Cristau <jcristau@debian.org> Sat, 03 Aug 2013 10:24:29 +0200
pixman (0.30.0-2) sid; urgency=low
* Disable silent Makefile rules.
* Disable arm iwmmxt fast paths. It breaks the build.
* Fix matrix-test on big endian (patch from Siarhei Siamashka).
-- Julien Cristau <jcristau@debian.org> Sat, 27 Jul 2013 21:40:48 +0200
pixman (0.30.0-1) sid; urgency=low
[ Maarten Lankhorst, Cyril Brulebois, Julien Cristau ]
* New upstream release.
-- Julien Cristau <jcristau@debian.org> Fri, 26 Jul 2013 14:58:25 +0200
pixman (0.26.0-4) sid; urgency=high
* Fix for CVE-2013-1591 (stack-based buffer overflow), cherry-picked from
0.27.4 (closes: #700308).
-- Julien Cristau <jcristau@debian.org> Mon, 18 Feb 2013 19:58:33 +0100
pixman (0.26.0-3) unstable; urgency=low
* Pass LS_CFLAGS=" " to configure to prevent -march=loongson2f from
being passed to gcc, which would break on loongson2e (see fdo bug
#51451). This fixes the test suite failures on mipsel, and should
avoid any crashes depending on user systems.
-- Cyril Brulebois <kibi@debian.org> Wed, 27 Jun 2012 12:11:54 +0200
pixman (0.26.0-2) unstable; urgency=low
* Cherry-pick from upstream master branch to fix FTBFS on *i386:
- da6193b1fc “mmx: add missing _mm_empty calls”
-- Cyril Brulebois <kibi@debian.org> Fri, 15 Jun 2012 01:25:20 +0200
pixman (0.26.0-1) unstable; urgency=low
* New upstream release.
-- Cyril Brulebois <kibi@debian.org> Fri, 15 Jun 2012 00:16:47 +0200
pixman (0.25.6-1) experimental; urgency=low
* New upstream release candidate.
* Remove demos/parrot.jpg before building the source package to avoid
“binary file contents changed” until it's shipped in the upstream
tarball.
-- Cyril Brulebois <kibi@debian.org> Sun, 20 May 2012 17:56:35 +0200
pixman (0.25.2-1) experimental; urgency=low
* New upstream release candidate.
* Add new symbols and bump shlibs accordingly:
- pixman_region32_clear
- pixman_region_clear
-- Cyril Brulebois <kibi@debian.org> Fri, 09 Mar 2012 13:17:16 +0100
pixman (0.24.4-1) unstable; urgency=low
* New upstream release
- Revert "Reject trapezoids where top (botttom) is above (below) the
edges" (closes: #656682)
-- Julien Cristau <jcristau@debian.org> Thu, 09 Feb 2012 21:16:47 +0100
pixman (0.24.2-1) unstable; urgency=low
* New upstream release:
- Stable bug fix release from the 0.24 branch.
-- Cyril Brulebois <kibi@debian.org> Thu, 19 Jan 2012 12:22:54 +0100
pixman (0.24.0-1) unstable; urgency=low
* New upstream release.
-- Cyril Brulebois <kibi@debian.org> Mon, 07 Nov 2011 18:13:47 +0100
pixman (0.23.8-1) unstable; urgency=low
* New upstream release.
-- Cyril Brulebois <kibi@debian.org> Tue, 01 Nov 2011 12:29:16 +0100
pixman (0.23.6-1) experimental; urgency=low
[ Rico Tzschichholz ]
* New upstream release.
-- Julien Cristau <jcristau@debian.org> Sat, 22 Oct 2011 11:09:04 +0200
pixman (0.23.2-1) experimental; urgency=low
* New upstream release.
* Enable parallel building (by passing --parallel to dh $@).
-- Cyril Brulebois <kibi@debian.org> Tue, 05 Jul 2011 01:37:27 +0200
pixman (0.22.0-1) unstable; urgency=low
* Team upload.
[ Steve Langasek ]
* Build for multiarch.
[ Julien Cristau ]
* Bump Standards-Version to 3.9.2.
* New upstream release (no changes from 0.21.8 except for the version bump).
-- Julien Cristau <jcristau@debian.org> Sun, 12 Jun 2011 17:02:01 +0200
pixman (0.21.8-1) unstable; urgency=low
* New upstream release.
* As seen in the upstream announcement: “When this version of pixman is
used with the git version of the X server, trapezoid rendering will be
corrupted. This is a known bug in the X server.”
* This new release should fix the FTBFS on big endian machines, tests
were failing due to missing swapping (Closes: #622211).
-- Cyril Brulebois <kibi@debian.org> Fri, 29 Apr 2011 17:53:12 +0200
pixman (0.21.6-2) unstable; urgency=low
* Upload to unstable.
-- Cyril Brulebois <kibi@debian.org> Sun, 10 Apr 2011 23:08:36 +0200
pixman (0.21.6-1) experimental; urgency=low
* New upstream release.
* Update symbols file with new symbols.
* Bump shlibs accordingly.
* Wrap Build-Depends.
* Remove libpixman1-dev from Conflicts, last seen in etch!
* Update Uploaders list. Thanks, David!
* Switch to dh:
- Use debhelper 8.
- Use dh-autoreconf.
- Kill .la files.
- Switch dh_install from --list-missing to --fail-missing for
additional safety.
* Add a quilt series placeholder file.
* Bump Standards-Version to 3.9.1 (no changes needed).
-- Cyril Brulebois <kibi@debian.org> Wed, 09 Mar 2011 04:08:02 +0100
pixman (0.21.4-2) unstable; urgency=low
* Upload to unstable.
-- Cyril Brulebois <kibi@debian.org> Sun, 06 Feb 2011 05:31:10 +0100
pixman (0.21.4-1) experimental; urgency=low
* New upstream release.
* Update debian/copyright from upstream's COPYING.
-- Cyril Brulebois <kibi@debian.org> Wed, 19 Jan 2011 20:31:26 +0100
pixman (0.21.2-1) experimental; urgency=low
* New upstream release.
* Update debian/copyright from upstream's COPYING.
-- Cyril Brulebois <kibi@debian.org> Wed, 17 Nov 2010 15:56:46 +0100
pixman (0.20.0-1) experimental; urgency=low
* New upstream release.
-- Cyril Brulebois <kibi@debian.org> Sat, 06 Nov 2010 10:00:54 +0100
pixman (0.19.6-1) experimental; urgency=low
* New upstream release.
* Bump SHLIBS_VERSION from 0.18.0 to 0.19.4 for newly-added functions.
* Update symbols file with newly-added functions.
* Add -c4 to the dh_makeshlibs call, to ensure the build breaks if
unexpected symbol-related changes happened.
* As of pixman-0.19.2-5-g5b99710, Gtk+ is auto-detected, make sure not
to pick it accidentally, by passing --disable-gtk. (That's only for
test purposes, but would require pixman-1 itself.)
* Enable the testsuite.
* Add myself to Uploaders.
-- Cyril Brulebois <kibi@debian.org> Wed, 27 Oct 2010 23:14:00 +0200
pixman (0.18.4-1) experimental; urgency=low
[ Robert Hooker ]
* New upstream stable release.
-- Julien Cristau <jcristau@debian.org> Mon, 06 Sep 2010 21:15:07 +0200
pixman (0.18.2-1) experimental; urgency=low
* New upstream stable release. Changes since 0.18.0:
- b48d8b5... Pre-release version bump to 0.18.2
- 970c183... Add macros for thread local storage on MinGW 32
- 61ff1a3... Don't use __thread on MinGW.
- f973be4... Don't consider indexed formats opaque.
- 34fb385... Add missing HAVE_CONFIG_H guards for config.h inclusion
- 38928af... Update README to mention the pixman mailing list
- 6649842... [mmx] Fix mask creation bugs
- d197dc5... Fixes for pthread thread local storage.
- 9babaab... Fix uninitialized cache when pthreads are used
- 4fe0a40... Visual Studio 2010 includes stdint.h
- 9a46edd... Post-release version bump to 0.18.1
-- Robert Hooker <sarvatt@ubuntu.com> Fri, 14 May 2010 13:03:42 -0400
pixman (0.18.0-1) experimental; urgency=low
* Rename the build directory to not include DEB_BUILD_GNU_TYPE for no
good reason. Thanks, Colin Watson!
* Remove myself from Uploaders
* New upstream release (closes: #579014).
* Update symbols file for new API, bump shlibs.
* Drop pixman-arm-don-t-use-env-vars-to-get-hwcap-platform.patch, obsolete.
-- Julien Cristau <jcristau@debian.org> Sat, 16 Jan 2010 16:47:36 +0000
-- Julien Cristau <jcristau@debian.org> Tue, 11 May 2010 14:16:09 +0200
pixman (0.16.4-1) unstable; urgency=low

debian/compat

@ -1 +0,0 @@
5

debian/control

@ -2,11 +2,16 @@ Source: pixman
Section: devel
Priority: optional
Maintainer: Debian X Strike Force <debian-x@lists.debian.org>
Uploaders: David Nusinow <dnusinow@debian.org>
Build-Depends: debhelper (>= 5), automake, autoconf, libtool, pkg-config, quilt
Standards-Version: 3.8.3
Vcs-Git: git://git.debian.org/git/pkg-xorg/lib/pixman
Vcs-Browser: http://git.debian.org/?p=pkg-xorg/lib/pixman.git
Uploaders: Andreas Boll <aboll@debian.org>
Build-Depends:
debhelper-compat (= 13),
meson,
pkgconf,
quilt,
Standards-Version: 4.2.1
Vcs-Git: https://salsa.debian.org/xorg-team/lib/pixman.git
Vcs-Browser: https://salsa.debian.org/xorg-team/lib/pixman
Homepage: http://pixman.org/
Package: libpixman-1-0
Section: libs
@ -14,6 +19,8 @@ Architecture: any
Depends:
${shlibs:Depends},
${misc:Depends},
Pre-Depends: ${misc:Pre-Depends}
Multi-Arch: same
Description: pixel-manipulation library for X and cairo
A library for manipulating pixel regions -- a set of Y-X banded
rectangles, image compositing using the Porter/Duff model
@ -22,7 +29,7 @@ Description: pixel-manipulation library for X and cairo
Package: libpixman-1-0-udeb
Section: debian-installer
XC-Package-Type: udeb
Package-Type: udeb
Architecture: any
Depends:
${shlibs:Depends},
@ -31,24 +38,13 @@ Description: pixel-manipulation library for X and cairo
This package contains a minimal set of libraries needed for the Debian
installer. Do not install it on a normal system.
Package: libpixman-1-0-dbg
Section: debug
Priority: extra
Architecture: any
Depends:
libpixman-1-0 (= ${binary:Version}),
${misc:Depends},
Description: pixel-manipulation library for X and cairo (debugging symbols)
Debugging symbols for the Cairo/X pixel manipulation library. This is
needed to debug programs linked against libpixman0.
Package: libpixman-1-dev
Section: libdevel
Architecture: any
Depends:
libpixman-1-0 (= ${binary:Version}),
${misc:Depends},
Conflicts: libpixman1-dev
Multi-Arch: same
Description: pixel-manipulation library for X and cairo (development files)
Development libraries, header files and documentation needed by
programs that want to compile with the Cairo/X pixman library.

debian/copyright

@ -1,114 +1,48 @@
This package was downloaded from
http://xorg.freedesktop.org/releases/individual/lib/
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: pixman
Source: https://gitlab.freedesktop.org/pixman/pixman
License: Expat
Debian packaging by Julien Cristau <jcristau@debian.org>, 18 May 2007.
Files: *
Copyright: 1987-1998 The Open Group
1987-1989 Digital Equipment Corporation
1999-2008 Keith Packard
2000 SuSE, Inc.
2000 Keith Packard, member of The XFree86 Project, Inc.
2004-2010 Red Hat, Inc.
2004 Nicholas Miell
2005 Lars Knoll & Zack Rusin, Trolltech
2005 Trolltech AS
2007 Luca Barbato
2008 Aaron Plattner, NVIDIA Corporation
2008 Rodrigo Kumpera
2008 André Tupinambá
2008 Mozilla Corporation
2008 Frederic Plourde
2009, Oracle and/or its affiliates. All rights reserved.
2009-2010 Nokia Corporation
License: Expat
The following is the 'standard copyright' agreed upon by most contributors,
and is currently the canonical license, though a modification is currently
under discussion. Copyright holders of new code should use this license
statement where possible, and append their name to this list.
Files: debian/*
Copyright: 2007 Julien Cristau <jcristau@debian.org>
License: Expat
Copyright 1987, 1988, 1989, 1998 The Open Group
Copyright 1987, 1988, 1989 Digital Equipment Corporation
Copyright 1999, 2004, 2008 Keith Packard
Copyright 2000 SuSE, Inc.
Copyright 2000 Keith Packard, member of The XFree86 Project, Inc.
Copyright 2004, 2005, 2007, 2008 Red Hat, Inc.
Copyright 2004 Nicholas Miell
Copyright 2005 Lars Knoll & Zack Rusin, Trolltech
Copyright 2005 Trolltech AS
Copyright 2007 Luca Barbato
Copyright 2008 Aaron Plattner, NVIDIA Corporation
Copyright 2008 Rodrigo Kumpera
Copyright 2008 André Tupinambá
Copyright 2008 Mozilla Corporation
Copyright 2008 Frederic Plourde
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice (including the next
paragraph) shall be included in all copies or substantial portions of the
Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
Other licenses:
Copyright © 2000 Keith Packard, member of The XFree86 Project, Inc.
2005 Lars Knoll & Zack Rusin, Trolltech
Copyright © 2000 SuSE, Inc.
Copyright © 2007 Red Hat, Inc.
Copyright © 1998 Keith Packard
Permission to use, copy, modify, distribute, and sell this software and its
documentation for any purpose is hereby granted without fee, provided that
the above copyright notice appear in all copies and that both that
copyright notice and this permission notice appear in supporting
documentation, and that the name of the copyright holders not be used in
advertising or publicity pertaining to distribution of the software without
specific, written prior permission. The copyright holders make no
representations about the suitability of this software for any purpose. It
is provided "as is" without express or implied warranty.
THE COPYRIGHT HOLDERS DISCLAIM ALL WARRANTIES WITH REGARD TO THIS
SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN
AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
SOFTWARE.
Copyright 1987, 1988, 1989, 1998 The Open Group
Permission to use, copy, modify, distribute, and sell this software and its
documentation for any purpose is hereby granted without fee, provided that
the above copyright notice appear in all copies and that both that
copyright notice and this permission notice appear in supporting
documentation.
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
OPEN GROUP BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN
AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Except as contained in this notice, the name of The Open Group shall not be
used in advertising or otherwise to promote the sale, use or other dealings
in this Software without prior written authorization from The Open Group.
Copyright 1987, 1988, 1989 by
Digital Equipment Corporation, Maynard, Massachusetts.
All Rights Reserved
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Digital not be
used in advertising or publicity pertaining to distribution of the
software without specific, written prior permission.
DIGITAL DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL
DIGITAL BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR
ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
SOFTWARE.
License: Expat
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:
.
The above copyright notice and this permission notice (including the next
paragraph) shall be included in all copies or substantial portions of the
Software.
.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.


@ -1 +1 @@
usr/lib/libpixman-1.so.*
usr/lib/*/libpixman-1.so.* /usr/lib


@ -1 +1 @@
usr/lib/libpixman-1.so.*
usr/lib/*/libpixman-1.so.*


@ -0,0 +1,2 @@
libpixman-1-0: symbols-declares-dependency-on-other-package libpixman-1-0-private


@ -1,7 +1,14 @@
libpixman-1.so.0 libpixman-1-0 #MINVER#
| libpixman-1-0-private
_pixman_internal_only_get_implementation@Base 0 1
pixman_add_trapezoids@Base 0
pixman_add_traps@Base 0
pixman_add_triangles@Base 0.21.6
pixman_blt@Base 0
pixman_composite_glyphs@Base 0.27.2
pixman_composite_glyphs_no_mask@Base 0.27.2
pixman_composite_trapezoids@Base 0.21.6
pixman_composite_triangles@Base 0.21.6
pixman_compute_composite_region@Base 0
pixman_disable_out_of_bounds_workaround@Base 0.15.16
pixman_edge_init@Base 0
@ -20,15 +27,33 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_f_transform_scale@Base 0.13.2
pixman_f_transform_translate@Base 0.13.2
pixman_fill@Base 0
pixman_filter_create_separable_convolution@Base 0.30.0
pixman_format_supported_destination@Base 0.15.16
pixman_format_supported_source@Base 0.15.16
pixman_glyph_cache_create@Base 0.27.2
pixman_glyph_cache_destroy@Base 0.27.2
pixman_glyph_cache_freeze@Base 0.27.2
pixman_glyph_cache_insert@Base 0.27.2
pixman_glyph_cache_lookup@Base 0.27.2
pixman_glyph_cache_remove@Base 0.27.2
pixman_glyph_cache_thaw@Base 0.27.2
pixman_glyph_get_extents@Base 0.27.2
pixman_glyph_get_mask_format@Base 0.27.2
pixman_image_composite@Base 0.15.14
pixman_image_composite32@Base 0.18.0
pixman_image_create_bits@Base 0.15.12
pixman_image_create_bits_no_clear@Base 0.27.4
pixman_image_create_conical_gradient@Base 0
pixman_image_create_linear_gradient@Base 0
pixman_image_create_radial_gradient@Base 0
pixman_image_create_solid_fill@Base 0
pixman_image_fill_boxes@Base 0.18.0
pixman_image_fill_rectangles@Base 0.15.14
pixman_image_get_component_alpha@Base 0.19.6
pixman_image_get_data@Base 0
pixman_image_get_depth@Base 0
pixman_image_get_destroy_data@Base 0.18.0
pixman_image_get_format@Base 0.19.6
pixman_image_get_height@Base 0
pixman_image_get_stride@Base 0
pixman_image_get_width@Base 0
@ -39,7 +64,9 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_image_set_clip_region@Base 0
pixman_image_set_component_alpha@Base 0
pixman_image_set_destroy_function@Base 0.15.12
pixman_image_set_filter@Base 0
pixman_image_set_dither@Base 0.40.0
pixman_image_set_dither_offset@Base 0.40.0
pixman_image_set_filter@Base 0.30.0
pixman_image_set_has_client_clip@Base 0
pixman_image_set_indexed@Base 0
pixman_image_set_repeat@Base 0
@ -49,17 +76,21 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_line_fixed_edge_init@Base 0
pixman_rasterize_edges@Base 0
pixman_rasterize_trapezoid@Base 0
pixman_region32_clear@Base 0.25.2
pixman_region32_contains_point@Base 0.11.2
pixman_region32_contains_rectangle@Base 0.11.2
pixman_region32_copy@Base 0.11.2
pixman_region32_empty@Base 0.44.0
pixman_region32_equal@Base 0.11.2
pixman_region32_extents@Base 0.11.2
pixman_region32_fini@Base 0.11.2
pixman_region32_init@Base 0.11.2
pixman_region32_init_from_image@Base 0.18.0
pixman_region32_init_rect@Base 0.11.2
pixman_region32_init_rects@Base 0.11.2
pixman_region32_init_with_extents@Base 0.11.2
pixman_region32_intersect@Base 0.11.2
pixman_region32_intersect_rect@Base 0.19.6
pixman_region32_inverse@Base 0.11.2
pixman_region32_n_rects@Base 0.11.2
pixman_region32_not_empty@Base 0.11.2
@ -70,17 +101,21 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_region32_translate@Base 0.11.2
pixman_region32_union@Base 0.11.2
pixman_region32_union_rect@Base 0.11.2
pixman_region_clear@Base 0.25.2
pixman_region_contains_point@Base 0
pixman_region_contains_rectangle@Base 0
pixman_region_copy@Base 0
pixman_region_empty@Base 0.44.0
pixman_region_equal@Base 0
pixman_region_extents@Base 0
pixman_region_fini@Base 0
pixman_region_init@Base 0
pixman_region_init_from_image@Base 0.18.0
pixman_region_init_rect@Base 0
pixman_region_init_rects@Base 0
pixman_region_init_with_extents@Base 0
pixman_region_intersect@Base 0
pixman_region_intersect_rect@Base 0.19.6
pixman_region_inverse@Base 0
pixman_region_n_rects@Base 0
pixman_region_not_empty@Base 0
@ -107,11 +142,12 @@ libpixman-1.so.0 libpixman-1-0 #MINVER#
pixman_transform_is_scale@Base 0.13.2
pixman_transform_multiply@Base 0.13.2
pixman_transform_point@Base 0.13.2
pixman_transform_point_31_16@Base 0 1
pixman_transform_point_31_16_3d@Base 0 1
pixman_transform_point_31_16_affine@Base 0 1
pixman_transform_rotate@Base 0.13.2
pixman_transform_scale@Base 0.13.2
pixman_transform_translate@Base 0.13.2
pixman_transform_point_3d@Base 0
pixman_version@Base 0.10.0
pixman_version_string@Base 0.10.0
pixman_format_supported_destination@Base 0.15.16
pixman_format_supported_source@Base 0.15.16


@ -1,4 +1,3 @@
usr/lib/libpixman-1.so
usr/lib/libpixman-1.a
usr/lib/pkgconfig
usr/lib/*/libpixman-1.so
usr/lib/*/pkgconfig
usr/include/pixman-1


@ -1,52 +0,0 @@
From 2beb82c292725ad500db9d564ca73dd6c2bce463 Mon Sep 17 00:00:00 2001
From: Julien Cristau <jcristau@debian.org>
Date: Sun, 23 Aug 2009 12:31:21 +0200
Subject: [PATCH] pixman/arm: don't use env vars to get hwcap/platform
---
pixman/pixman-cpu.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/pixman/pixman-cpu.c b/pixman/pixman-cpu.c
index a2a7b8a..af16715 100644
--- a/pixman/pixman-cpu.c
+++ b/pixman/pixman-cpu.c
@@ -253,8 +253,10 @@ pixman_arm_read_auxv ()
if (aux.a_type == AT_HWCAP)
{
uint32_t hwcap = aux.a_un.a_val;
+#if 0
if (getenv ("ARM_FORCE_HWCAP"))
hwcap = strtoul (getenv ("ARM_FORCE_HWCAP"), NULL, 0);
+#endif
/* hardcode these values to avoid depending on specific
* versions of the hwcap header, e.g. HWCAP_NEON
*/
@@ -266,8 +268,10 @@ pixman_arm_read_auxv ()
else if (aux.a_type == AT_PLATFORM)
{
const char *plat = (const char*) aux.a_un.a_val;
+#if 0
if (getenv ("ARM_FORCE_PLATFORM"))
plat = getenv ("ARM_FORCE_PLATFORM");
+#endif
if (strncmp (plat, "v7l", 3) == 0)
{
arm_has_v7 = TRUE;
@@ -281,11 +285,13 @@ pixman_arm_read_auxv ()
}
close (fd);
+#if 0
/* if we don't have 2.6.29, we have to do this hack; set
* the env var to trust HWCAP.
*/
if (!getenv ("ARM_TRUST_HWCAP") && arm_has_v7)
arm_has_neon = TRUE;
+#endif
}
arm_tests_initialized = TRUE;
--
1.6.3.3


@ -1 +1 @@
pixman-arm-don-t-use-env-vars-to-get-hwcap-platform.patch
test-increase-timeout.diff


@ -0,0 +1,11 @@
--- a/test/alpha-loop.c
+++ b/test/alpha-loop.c
@@ -22,7 +22,7 @@ main (int argc, char **argv)
d = pixman_image_create_bits (PIXMAN_a8r8g8b8, WIDTH, HEIGHT, dest, WIDTH * 4);
s = pixman_image_create_bits (PIXMAN_a2r10g10b10, WIDTH, HEIGHT, src, WIDTH * 4);
- fail_after (5, "Infinite loop detected: 5 seconds without progress\n");
+ fail_after (50, "Infinite loop detected: 50 seconds without progress\n");
pixman_image_set_alpha_map (s, a, 0, 0);
pixman_image_set_alpha_map (a, s, 0, 0);
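
The patch above only raises the argument passed to fail_after from 5 to 50 seconds. As a rough illustration of how a test watchdog of this kind can work, here is a minimal sketch built on POSIX alarm() and SIGALRM. The names watchdog_arm and watchdog_disarm are hypothetical, and this is not pixman's own fail_after implementation, which may differ.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helpers for illustration; not part of pixman. */
static const char *watchdog_msg = "";
static size_t      watchdog_len = 0;

static void
watchdog_handler (int signo)
{
    (void) signo;

    /* Only async-signal-safe calls (write, _exit) are used here. */
    if (write (STDERR_FILENO, watchdog_msg, watchdog_len) < 0)
    {
        /* Nothing useful can be done about a failed write. */
    }
    _exit (1);
}

/* Abort the process with msg if it is still running after `seconds`. */
void
watchdog_arm (unsigned int seconds, const char *msg)
{
    struct sigaction sa;

    assert (msg != NULL);
    watchdog_msg = msg;
    watchdog_len = strlen (msg);

    memset (&sa, 0, sizeof sa);
    sa.sa_handler = watchdog_handler;
    sigemptyset (&sa.sa_mask);
    sigaction (SIGALRM, &sa, NULL);

    alarm (seconds);
}

/* Cancel a pending watchdog; returns the seconds that were left. */
unsigned int
watchdog_disarm (void)
{
    return alarm (0);
}
```

With a scheme like this, a slow build machine needs a larger timeout rather than a code change, which is the situation the changelog describes for the mips buildd.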

debian/rules

@ -1,104 +1,32 @@
#!/usr/bin/make -f
include /usr/share/quilt/quilt.make
PACKAGE = libpixman-1-0
SHLIBS_VERSION = 0.15.16
SHLIBS = 0.40.0
CFLAGS = -Wall -g
ifneq (,$(filter noopt,$(DEB_BUILD_OPTIONS)))
CFLAGS += -O0
else
CFLAGS += -O2
endif
ifneq (,$(filter parallel=%,$(DEB_BUILD_OPTIONS)))
NUMJOBS = $(patsubst parallel=%,%,$(filter parallel=%,$(DEB_BUILD_OPTIONS)))
MAKEFLAGS += -j$(NUMJOBS)
endif
export DEB_BUILD_MAINT_OPTIONS = hardening=+all
DEB_HOST_ARCH ?= $(shell dpkg-architecture -qDEB_HOST_ARCH)
DEB_HOST_GNU_TYPE ?= $(shell dpkg-architecture -qDEB_HOST_GNU_TYPE)
DEB_BUILD_GNU_TYPE ?= $(shell dpkg-architecture -qDEB_BUILD_GNU_TYPE)
ifeq ($(DEB_BUILD_GNU_TYPE), $(DEB_HOST_GNU_TYPE))
confflags += --build=$(DEB_HOST_GNU_TYPE)
else
confflags += --build=$(DEB_BUILD_GNU_TYPE) --host=$(DEB_HOST_GNU_TYPE)
endif
# Disable Gtk+ autodetection:
override_dh_auto_configure:
# also avoid loongson2f optimizations on mipsel, see 0.26.0-3
# changelog entry:
LS_CFLAGS=" " dh_auto_configure -- \
-Dgtk=disabled
autogen: autogen-stamp
autogen-stamp: $(QUILT_STAMPFN)
dh_testdir
autoreconf -vfi
touch $@
# Install in debian/tmp to retain control through dh_install:
override_dh_auto_install:
dh_auto_install --destdir=debian/tmp
config: config-stamp
config-stamp: autogen-stamp
dh_testdir
test -d build || mkdir build
cd build && \
../configure \
--prefix=/usr \
--mandir=\$${prefix}/share/man \
--infodir=\$${prefix}/share/info \
$(confflags) \
CFLAGS="$(CFLAGS)"
touch $@
# Kill *.la files, and forget no-one:
override_dh_install:
find debian/tmp -name '*.la' -delete
dh_install
# Shlibs:
override_dh_makeshlibs:
dh_makeshlibs -p$(PACKAGE) --add-udeb $(PACKAGE)-udeb -V"$(PACKAGE) (>= $(SHLIBS))" -- -c4
build: build-stamp
build-stamp: config-stamp
dh_testdir
cd build && $(MAKE)
touch $@
override_dh_auto_test:
dh_auto_test -- --verbose --timeout-multiplier 3
clean: unpatch
dh_testdir
dh_testroot
rm -f autogen-stamp config-stamp build-stamp install-stamp
rm -f config.cache config.log config.status
rm -f */config.cache */config.log */config.status
rm -f conftest* */conftest*
rm -rf autom4te.cache */autom4te.cache
rm -rf build
rm -f $$(find -name Makefile.in)
rm -f compile config.guess config.sub configure depcomp install-sh
rm -f ltmain.sh missing INSTALL aclocal.m4 config.h.in
dh_clean
install: install-stamp
install-stamp: build-stamp
dh_testdir
dh_testroot
dh_clean -k
dh_installdirs
cd build && $(MAKE) DESTDIR=$(CURDIR)/debian/tmp install
touch $@
# Install architecture-dependent files here.
binary-arch: install
dh_testdir
dh_testroot
dh_installdocs
dh_install --sourcedir=debian/tmp --list-missing
dh_installchangelogs ChangeLog
dh_link
dh_strip --dbg-package=$(PACKAGE)-dbg
dh_compress
dh_fixperms
dh_makeshlibs -p$(PACKAGE) --add-udeb $(PACKAGE)-udeb -V"$(PACKAGE) (>= $(SHLIBS_VERSION))"
dh_installdeb
dh_shlibdeps
dh_gencontrol
dh_md5sums
dh_builddeb
binary-indep: install
# Nothing to do
binary: binary-indep binary-arch
.PHONY: autogen config build clean binary-indep binary-arch binary install
%:
dh $@ --with quilt --builddirectory=build/

debian/source/format

@ -0,0 +1 @@
1.0

debian/watch

@ -1,2 +1,3 @@
#git=git://anongit.freedesktop.org/pixman
version=3
http://xorg.freedesktop.org/releases/individual/lib/ pixman-(.*)\.tar\.gz
https://xorg.freedesktop.org/releases/individual/lib/ pixman-(.*)\.tar\.gz


@ -1,7 +1,7 @@
#include <stdio.h>
#include <stdlib.h>
#include "pixman.h"
#include "utils.h"
#include "gtk-utils.h"
int
main (int argc, char **argv)
@ -14,7 +14,6 @@ main (int argc, char **argv)
uint32_t *src = malloc (WIDTH * HEIGHT * 4);
pixman_image_t *grad_img;
pixman_image_t *alpha_img;
pixman_image_t *solid_img;
pixman_image_t *dest_img;
pixman_image_t *src_img;
int i;
@ -26,24 +25,27 @@ main (int argc, char **argv)
pixman_point_fixed_t p1 = { pixman_double_to_fixed (0), 0 };
pixman_point_fixed_t p2 = { pixman_double_to_fixed (WIDTH),
pixman_int_to_fixed (0) };
#if 0
pixman_transform_t trans = {
{ { pixman_double_to_fixed (2), pixman_double_to_fixed (0.5), pixman_double_to_fixed (-100), },
{ pixman_double_to_fixed (0), pixman_double_to_fixed (3), pixman_double_to_fixed (0), },
{ pixman_double_to_fixed (0), pixman_double_to_fixed (0.000), pixman_double_to_fixed (1.0) }
}
};
pixman_transform_t id = {
#else
pixman_transform_t trans = {
{ { pixman_fixed_1, 0, 0 },
{ 0, pixman_fixed_1, 0 },
{ 0, 0, pixman_fixed_1 } }
};
#endif
#if 0
pixman_point_fixed_t c_inner;
pixman_point_fixed_t c_outer;
pixman_fixed_t r_inner;
pixman_fixed_t r_outer;
pixman_color_t red = { 0xffff, 0x0000, 0x0000, 0xffff };
#endif
for (i = 0; i < WIDTH * HEIGHT; ++i)
alpha[i] = 0x4f00004f; /* pale blue */
@ -69,6 +71,7 @@ main (int argc, char **argv)
src,
WIDTH * 4);
#if 0
c_inner.x = pixman_double_to_fixed (50.0);
c_inner.y = pixman_double_to_fixed (50.0);
c_outer.x = pixman_double_to_fixed (50.0);
@ -76,7 +79,6 @@ main (int argc, char **argv)
r_inner = 0;
r_outer = pixman_double_to_fixed (50.0);
#if 0
grad_img = pixman_image_create_conical_gradient (&c_inner, r_inner,
stops, 2);
#endif
@ -91,7 +93,7 @@ main (int argc, char **argv)
grad_img = pixman_image_create_linear_gradient (&p1, &p2,
stops, 2);
pixman_image_set_transform (grad_img, &id);
pixman_image_set_transform (grad_img, &trans);
pixman_image_set_repeat (grad_img, PIXMAN_REPEAT_PAD);
pixman_image_composite (PIXMAN_OP_OVER, grad_img, NULL, alpha_img,
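
The transform matrices in this demo are filled with pixman_double_to_fixed values and pixman_fixed_1, which are 16.16 fixed-point numbers. The sketch below illustrates that encoding with stand-alone helpers. The names fixed_16_16, FIXED_ONE, fixed_from_double, fixed_from_int and fixed_to_double are hypothetical and chosen for illustration only; they are not part of pixman's API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16.16 fixed-point type:
 * upper 16 bits hold the integer part, lower 16 bits the fraction. */
typedef int32_t fixed_16_16;

#define FIXED_ONE ((fixed_16_16) 1 << 16)

/* Convert a double to 16.16 by scaling with 2^16 (truncates toward zero). */
fixed_16_16
fixed_from_double (double d)
{
    return (fixed_16_16) (d * 65536.0);
}

/* Convert a small integer to 16.16; the range check avoids overflow. */
fixed_16_16
fixed_from_int (int i)
{
    assert (i >= -32768 && i <= 32767);
    return (fixed_16_16) i * 65536;
}

/* Convert back to double, mainly useful for debugging output. */
double
fixed_to_double (fixed_16_16 f)
{
    return f / 65536.0;
}
```

In this encoding the identity matrix has FIXED_ONE (0x10000) on its diagonal, and an entry such as 0.5 corresponds to 0x8000.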

demos/checkerboard.c

@ -0,0 +1,71 @@
#include <stdio.h>
#include <stdlib.h>
#include "pixman.h"
#include "gtk-utils.h"
int
main (int argc, char **argv)
{
#define WIDTH 400
#define HEIGHT 400
#define TILE_SIZE 25
pixman_image_t *checkerboard;
pixman_image_t *destination;
#define D2F(d) (pixman_double_to_fixed(d))
pixman_transform_t trans = { {
{ D2F (-1.96830), D2F (-1.82250), D2F (512.12250)},
{ D2F (0.00000), D2F (-7.29000), D2F (1458.00000)},
{ D2F (0.00000), D2F (-0.00911), D2F (0.59231)},
}};
int i, j;
checkerboard = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
NULL, 0);
destination = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
NULL, 0);
for (i = 0; i < HEIGHT / TILE_SIZE; ++i)
{
for (j = 0; j < WIDTH / TILE_SIZE; ++j)
{
double u = (double)(j + 1) / (WIDTH / TILE_SIZE);
double v = (double)(i + 1) / (HEIGHT / TILE_SIZE);
pixman_color_t black = { 0, 0, 0, 0xffff };
pixman_color_t white = {
v * 0xffff,
u * 0xffff,
(1 - (double)u) * 0xffff,
0xffff };
pixman_color_t *c;
pixman_image_t *fill;
if ((j & 1) != (i & 1))
c = &black;
else
c = &white;
fill = pixman_image_create_solid_fill (c);
pixman_image_composite (PIXMAN_OP_SRC, fill, NULL, checkerboard,
0, 0, 0, 0, j * TILE_SIZE, i * TILE_SIZE,
TILE_SIZE, TILE_SIZE);
}
}
pixman_image_set_transform (checkerboard, &trans);
pixman_image_set_filter (checkerboard, PIXMAN_FILTER_BEST, NULL, 0);
pixman_image_set_repeat (checkerboard, PIXMAN_REPEAT_NONE);
pixman_image_composite (PIXMAN_OP_SRC,
checkerboard, NULL, destination,
0, 0, 0, 0, 0, 0,
WIDTH, HEIGHT);
show_image (destination);
return 0;
}
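
The tile loop in checkerboard.c alternates black tiles with tiles whose colour depends on the tile's position in the grid. The following stand-alone sketch isolates that per-tile calculation so it can be checked without linking against pixman. The color16 struct and the tile_color function are hypothetical names used for illustration; color16 merely mirrors the four 16-bit channels initialised in the demo.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a colour with four 16-bit channels. */
typedef struct
{
    uint16_t red, green, blue, alpha;
} color16;

/* Colour of tile (row i, column j) in a rows x cols checkerboard,
 * following the same parity test and gradient as the demo loop. */
color16
tile_color (int i, int j, int rows, int cols)
{
    color16 c = { 0, 0, 0, 0xffff };   /* opaque black */
    double u, v;

    assert (rows > 0 && cols > 0);
    assert (i >= 0 && i < rows && j >= 0 && j < cols);

    /* Tiles whose row and column parity differ stay black. */
    if ((j & 1) != (i & 1))
        return c;

    u = (double) (j + 1) / cols;       /* horizontal position in (0, 1] */
    v = (double) (i + 1) / rows;       /* vertical position in (0, 1] */

    c.red   = (uint16_t) (v * 0xffff);
    c.green = (uint16_t) (u * 0xffff);
    c.blue  = (uint16_t) ((1 - u) * 0xffff);
    return c;
}
```

With WIDTH and HEIGHT at 400 and TILE_SIZE at 25, the demo's grid is 16 by 16 tiles, so the sketch would be called with rows and cols both set to 16.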


@ -2,7 +2,7 @@
#include <stdlib.h>
#include <string.h>
#include "pixman.h"
#include "utils.h"
#include "gtk-utils.h"
/* This test demonstrates that clipping is done totally different depending
* on whether the source is transformed or not.


@ -1,7 +1,7 @@
#include <stdio.h>
#include <stdlib.h>
#include "pixman.h"
#include "utils.h"
#include "gtk-utils.h"
#define WIDTH 200
#define HEIGHT 200
@ -31,9 +31,11 @@ main (int argc, char **argv)
{ pixman_int_to_fixed (0), { 0xffff, 0x0000, 0x0000, 0xffff } },
{ pixman_int_to_fixed (1), { 0xffff, 0xffff, 0x0000, 0xffff } }
};
#if 0
pixman_point_fixed_t p1 = { 0, 0 };
pixman_point_fixed_t p2 = { pixman_int_to_fixed (WIDTH),
pixman_int_to_fixed (HEIGHT) };
#endif
pixman_point_fixed_t c_inner;
pixman_point_fixed_t c_outer;
pixman_fixed_t r_inner;


@ -2,10 +2,11 @@
#include <stdlib.h>
#include <stdio.h>
#include "pixman.h"
#include "utils.h"
#include "gtk-utils.h"
#include "parrot.c"
#define WIDTH 60
#define HEIGHT 60
#define WIDTH 80
#define HEIGHT 80
typedef struct {
const char *name;
@ -77,6 +78,9 @@ writer (void *src, uint32_t value, int size)
case 4:
*(uint32_t *)src = value;
break;
default:
break;
}
}
@ -84,50 +88,47 @@ int
main (int argc, char **argv)
{
#define d2f pixman_double_to_fixed
GtkWidget *window, *swindow;
GtkWidget *table;
uint32_t *dest = malloc (WIDTH * HEIGHT * 4);
uint32_t *src = malloc (WIDTH * HEIGHT * 4);
pixman_image_t *src_img;
pixman_image_t *gradient, *parrot;
pixman_image_t *dest_img;
pixman_point_fixed_t p1 = { -10 << 0, 0 };
pixman_point_fixed_t p2 = { WIDTH << 16, (HEIGHT - 10) << 16 };
uint16_t full = 0xcfff;
uint16_t low = 0x5000;
uint16_t alpha = 0xffff;
pixman_point_fixed_t p1 = { -10 << 16, 10 << 16 };
pixman_point_fixed_t p2 = { (WIDTH + 10) << 16, (HEIGHT - 10) << 16 };
uint16_t alpha = 0xdddd;
pixman_gradient_stop_t stops[6] =
{
{ d2f (0.0), { full, low, low, alpha } },
{ d2f (0.25), { full, full, low, alpha } },
{ d2f (0.4), { low, full, low, alpha } },
{ d2f (0.5), { low, full, full, alpha } },
{ d2f (0.8), { low, low, full, alpha } },
{ d2f (1.0), { full, low, full, alpha } },
{ d2f (0.0), { 0xf2f2, 0x8787, 0x7d7d, alpha } },
{ d2f (0.22), { 0xf3f3, 0xeaea, 0x8383, alpha } },
{ d2f (0.42), { 0x6b6b, 0xc0c0, 0x7777, alpha } },
{ d2f (0.57), { 0x4b4b, 0xc9c9, 0xf5f5, alpha } },
{ d2f (0.75), { 0x6a6a, 0x7f7f, 0xbebe, alpha } },
{ d2f (1.0), { 0xeded, 0x8282, 0xb0b0, alpha } },
};
int i;
gtk_init (&argc, &argv);
window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
gtk_window_set_default_size (window, 800, 600);
gtk_window_set_default_size (GTK_WINDOW (window), 800, 600);
g_signal_connect (window, "delete-event",
G_CALLBACK (gtk_main_quit),
NULL);
table = gtk_table_new (G_N_ELEMENTS (operators) / 6, 6, TRUE);
src_img = pixman_image_create_linear_gradient (&p1, &p2, stops,
sizeof (stops) / sizeof (stops[0]));
gradient = pixman_image_create_linear_gradient (&p1, &p2, stops, G_N_ELEMENTS (stops));
parrot = pixman_image_create_bits (PIXMAN_a8r8g8b8, WIDTH, HEIGHT, (uint32_t *)parrot_bits, WIDTH * 4);
pixman_image_set_repeat (gradient, PIXMAN_REPEAT_PAD);
pixman_image_set_repeat (src_img, PIXMAN_REPEAT_PAD);
dest_img = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
dest,
NULL,
WIDTH * 4);
pixman_image_set_accessors (dest_img, reader, writer);
@ -137,7 +138,6 @@ main (int argc, char **argv)
GdkPixbuf *pixbuf;
GtkWidget *vbox;
GtkWidget *label;
int j, k;
vbox = gtk_vbox_new (FALSE, 0);
@ -145,14 +145,11 @@ main (int argc, char **argv)
gtk_box_pack_start (GTK_BOX (vbox), label, FALSE, FALSE, 6);
gtk_widget_show (label);
for (j = 0; j < HEIGHT; ++j)
{
for (k = 0; k < WIDTH; ++k)
dest[j * WIDTH + k] = 0x7f6f6f00;
}
pixman_image_composite (operators[i].op, src_img, NULL, dest_img,
pixman_image_composite (PIXMAN_OP_SRC, gradient, NULL, dest_img,
0, 0, 0, 0, 0, 0, WIDTH, HEIGHT);
pixbuf = pixbuf_from_argb32 (pixman_image_get_data (dest_img), TRUE,
pixman_image_composite (operators[i].op, parrot, NULL, dest_img,
0, 0, 0, 0, 0, 0, WIDTH, HEIGHT);
pixbuf = pixbuf_from_argb32 (pixman_image_get_data (dest_img),
WIDTH, HEIGHT, WIDTH * 4);
image = gtk_image_new_from_pixbuf (pixbuf);
gtk_box_pack_start (GTK_BOX (vbox), image, FALSE, FALSE, 0);
@ -165,7 +162,7 @@ main (int argc, char **argv)
g_object_unref (pixbuf);
}
pixman_image_unref (src_img);
pixman_image_unref (gradient);
free (src);
pixman_image_unref (dest_img);
free (dest);
@ -174,7 +171,7 @@ main (int argc, char **argv)
gtk_scrolled_window_set_policy (GTK_SCROLLED_WINDOW (swindow),
GTK_POLICY_AUTOMATIC,
GTK_POLICY_AUTOMATIC);
gtk_scrolled_window_add_with_viewport (GTK_SCROLLED_WINDOW (swindow), table);
gtk_widget_show (table);

100
demos/conical-test.c Normal file

@ -0,0 +1,100 @@
#include "utils.h"
#include "gtk-utils.h"
#define SIZE 128
#define GRADIENTS_PER_ROW 7
#define NUM_ROWS ((NUM_GRADIENTS + GRADIENTS_PER_ROW - 1) / GRADIENTS_PER_ROW)
#define WIDTH (SIZE * GRADIENTS_PER_ROW)
#define HEIGHT (SIZE * NUM_ROWS)
#define NUM_GRADIENTS 35
#define double_to_color(x) \
(((uint32_t) ((x)*65536)) - (((uint32_t) ((x)*65536)) >> 16))
#define PIXMAN_STOP(offset,r,g,b,a) \
{ pixman_double_to_fixed (offset), \
{ \
double_to_color (r), \
double_to_color (g), \
double_to_color (b), \
double_to_color (a) \
} \
}
static const pixman_gradient_stop_t stops[] = {
PIXMAN_STOP (0.25, 1, 0, 0, 0.7),
PIXMAN_STOP (0.5, 1, 1, 0, 0.7),
PIXMAN_STOP (0.75, 0, 1, 0, 0.7),
PIXMAN_STOP (1.0, 0, 0, 1, 0.7)
};
#define NUM_STOPS (sizeof (stops) / sizeof (stops[0]))
static pixman_image_t *
create_conical (int index)
{
pixman_point_fixed_t c;
double angle;
c.x = pixman_double_to_fixed (0);
c.y = pixman_double_to_fixed (0);
angle = (0.5 / NUM_GRADIENTS + index / (double)NUM_GRADIENTS) * 720 - 180;
return pixman_image_create_conical_gradient (
&c, pixman_double_to_fixed (angle), stops, NUM_STOPS);
}
int
main (int argc, char **argv)
{
pixman_transform_t transform;
pixman_image_t *src_img, *dest_img;
int i;
enable_divbyzero_exceptions ();
dest_img = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
NULL, 0);
draw_checkerboard (dest_img, 25, 0xffaaaaaa, 0xff888888);
pixman_transform_init_identity (&transform);
pixman_transform_translate (NULL, &transform,
pixman_double_to_fixed (0.5),
pixman_double_to_fixed (0.5));
pixman_transform_scale (NULL, &transform,
pixman_double_to_fixed (SIZE),
pixman_double_to_fixed (SIZE));
pixman_transform_translate (NULL, &transform,
pixman_double_to_fixed (0.5),
pixman_double_to_fixed (0.5));
for (i = 0; i < NUM_GRADIENTS; i++)
{
int column = i % GRADIENTS_PER_ROW;
int row = i / GRADIENTS_PER_ROW;
src_img = create_conical (i);
pixman_image_set_repeat (src_img, PIXMAN_REPEAT_NORMAL);
pixman_image_set_transform (src_img, &transform);
pixman_image_composite32 (
PIXMAN_OP_OVER, src_img, NULL,dest_img,
0, 0, 0, 0, column * SIZE, row * SIZE,
SIZE, SIZE);
pixman_image_unref (src_img);
}
show_image (dest_img);
pixman_image_unref (dest_img);
return 0;
}
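
Note on `double_to_color` in conical-test.c: the macro scales a channel value in [0, 1] by 65536 and then subtracts the bit shifted out of the top, so that 1.0 lands on 0xffff instead of overflowing to 0x10000. The following is a minimal standalone sketch of that arithmetic. The function name `channel_to_u16` is hypothetical and is not part of pixman.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a pixman API): the same arithmetic as the
 * double_to_color macro in conical-test.c, written as a function. */
static uint16_t
channel_to_u16 (double x)
{
    uint32_t scaled;

    /* The macro is only meaningful for normalized channel values. */
    assert (x >= 0.0 && x <= 1.0);

    scaled = (uint32_t) (x * 65536);              /* range 0 .. 65536 */
    return (uint16_t) (scaled - (scaled >> 16));  /* folds 65536 to 65535 */
}
```

With this helper, the alpha value 0.7 used by the gradient stops becomes 0xb333, and fully opaque (1.0) becomes 0xffff.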

View File

@ -1,7 +1,7 @@
#include <stdio.h>
#include <stdlib.h>
#include "pixman.h"
#include "utils.h"
#include "gtk-utils.h"
int
main (int argc, char **argv)

277
demos/dither.c Normal file

@ -0,0 +1,277 @@
/*
* Copyright 2012, Red Hat, Inc.
* Copyright 2012, Soren Sandmann
* Copyright 2018, Basile Clement
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
#ifdef HAVE_CONFIG_H
#include "pixman-config.h"
#endif
#include <math.h>
#include <gtk/gtk.h>
#include <stdlib.h>
#include "utils.h"
#include "gtk-utils.h"
#define WIDTH 1024
#define HEIGHT 640
typedef struct
{
GtkBuilder * builder;
pixman_image_t * original;
pixman_format_code_t format;
pixman_dither_t dither;
int width;
int height;
} app_t;
static GtkWidget *
get_widget (app_t *app, const char *name)
{
GtkWidget *widget = GTK_WIDGET (gtk_builder_get_object (app->builder, name));
if (!widget)
g_error ("Widget %s not found\n", name);
return widget;
}
typedef struct
{
char name [20];
int value;
} named_int_t;
static const named_int_t formats[] =
{
{ "a8r8g8b8", PIXMAN_a8r8g8b8 },
{ "rgb", PIXMAN_rgb_float },
{ "sRGB", PIXMAN_a8r8g8b8_sRGB },
{ "r5g6b5", PIXMAN_r5g6b5 },
{ "a4r4g4b4", PIXMAN_a4r4g4b4 },
{ "a2r2g2b2", PIXMAN_a2r2g2b2 },
{ "r3g3b2", PIXMAN_r3g3b2 },
{ "r1g2b1", PIXMAN_r1g2b1 },
{ "a1r1g1b1", PIXMAN_a1r1g1b1 },
};
static const named_int_t dithers[] =
{
{ "None", PIXMAN_DITHER_NONE },
{ "Bayer 8x8", PIXMAN_DITHER_ORDERED_BAYER_8 },
{ "Blue noise 64x64", PIXMAN_DITHER_ORDERED_BLUE_NOISE_64 },
};
static int
get_value (app_t *app, const named_int_t table[], const char *box_name)
{
GtkComboBox *box = GTK_COMBO_BOX (get_widget (app, box_name));
return table[gtk_combo_box_get_active (box)].value;
}
static void
rescale (GtkWidget *may_be_null, app_t *app)
{
app->dither = get_value (app, dithers, "dithering_combo_box");
app->format = get_value (app, formats, "target_format_combo_box");
gtk_widget_set_size_request (
get_widget (app, "drawing_area"), app->width + 0.5, app->height + 0.5);
gtk_widget_queue_draw (
get_widget (app, "drawing_area"));
}
static gboolean
on_draw (GtkWidget *widget, cairo_t *cr, gpointer user_data)
{
app_t *app = user_data;
GdkRectangle area;
cairo_surface_t *surface;
pixman_image_t *tmp, *final;
uint32_t *pixels;
gdk_cairo_get_clip_rectangle(cr, &area);
tmp = pixman_image_create_bits (
app->format, area.width, area.height, NULL, 0);
pixman_image_set_dither (tmp, app->dither);
pixman_image_composite (
PIXMAN_OP_SRC,
app->original, NULL, tmp,
area.x, area.y, 0, 0, 0, 0,
app->width - area.x,
app->height - area.y);
pixels = calloc (1, area.width * area.height * 4);
final = pixman_image_create_bits (
PIXMAN_a8r8g8b8, area.width, area.height, pixels, area.width * 4);
pixman_image_composite (
PIXMAN_OP_SRC,
tmp, NULL, final,
area.x, area.y, 0, 0, 0, 0,
app->width - area.x,
app->height - area.y);
surface = cairo_image_surface_create_for_data (
(uint8_t *)pixels, CAIRO_FORMAT_ARGB32,
area.width, area.height, area.width * 4);
cairo_set_source_surface (cr, surface, area.x, area.y);
cairo_paint (cr);
cairo_surface_destroy (surface);
free (pixels);
pixman_image_unref (final);
pixman_image_unref (tmp);
return TRUE;
}
static void
set_up_combo_box (app_t *app, const char *box_name,
int n_entries, const named_int_t table[])
{
GtkWidget *widget = get_widget (app, box_name);
GtkListStore *model;
GtkCellRenderer *cell;
int i;
model = gtk_list_store_new (1, G_TYPE_STRING);
cell = gtk_cell_renderer_text_new ();
gtk_cell_layout_pack_start (GTK_CELL_LAYOUT (widget), cell, TRUE);
gtk_cell_layout_set_attributes (GTK_CELL_LAYOUT (widget), cell,
"text", 0,
NULL);
gtk_combo_box_set_model (GTK_COMBO_BOX (widget), GTK_TREE_MODEL (model));
for (i = 0; i < n_entries; ++i)
{
const named_int_t *info = &(table[i]);
GtkTreeIter iter;
gtk_list_store_append (model, &iter);
gtk_list_store_set (model, &iter, 0, info->name, -1);
}
gtk_combo_box_set_active (GTK_COMBO_BOX (widget), 0);
g_signal_connect (widget, "changed", G_CALLBACK (rescale), app);
}
static app_t *
app_new (pixman_image_t *original)
{
GtkWidget *widget;
app_t *app = g_malloc (sizeof *app);
GError *err = NULL;
app->builder = gtk_builder_new ();
app->original = original;
if (original->type == BITS)
{
app->width = pixman_image_get_width (original);
app->height = pixman_image_get_height (original);
}
else
{
app->width = WIDTH;
app->height = HEIGHT;
}
if (!gtk_builder_add_from_file (app->builder, "dither.ui", &err))
g_error ("Could not read file dither.ui: %s", err->message);
widget = get_widget (app, "drawing_area");
g_signal_connect (widget, "draw", G_CALLBACK (on_draw), app);
set_up_combo_box (app, "target_format_combo_box",
G_N_ELEMENTS (formats), formats);
set_up_combo_box (app, "dithering_combo_box",
G_N_ELEMENTS (dithers), dithers);
app->dither = get_value (app, dithers, "dithering_combo_box");
app->format = get_value (app, formats, "target_format_combo_box");
rescale (NULL, app);
return app;
}
int
main (int argc, char **argv)
{
GtkWidget *window;
pixman_image_t *image;
app_t *app;
gtk_init (&argc, &argv);
if (argc < 2)
{
pixman_gradient_stop_t stops[] = {
/* These colors make it very obvious that dithering
* is useful even for 8-bit gradients
*/
{ 0x00000, { 0x1b1b, 0x5d5d, 0x7c7c, 0xffff } },
{ 0x10000, { 0x3838, 0x3232, 0x1010, 0xffff } },
};
pixman_point_fixed_t p1, p2;
p1.x = p1.y = 0x0000;
p2.x = WIDTH << 16;
p2.y = HEIGHT << 16;
if (!(image = pixman_image_create_linear_gradient (
&p1, &p2, stops, ARRAY_LENGTH (stops))))
{
printf ("Could not create gradient\n");
return -1;
}
}
else if (!(image = pixman_image_from_file (argv[1], PIXMAN_a8r8g8b8)))
{
printf ("Could not load image \"%s\"\n", argv[1]);
return -1;
}
app = app_new (image);
window = get_widget (app, "main");
g_signal_connect (window, "delete_event", G_CALLBACK (gtk_main_quit), NULL);
gtk_window_set_default_size (GTK_WINDOW (window), 1024, 768);
gtk_widget_show_all (window);
gtk_main ();
return 0;
}
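
Note on `get_value` in dither.c: it indexes the table directly with the result of `gtk_combo_box_get_active`, which returns -1 when no item is active. A bounds-checked lookup could look like the sketch below. It is GTK-free so it can be compiled on its own. `lookup_value`, the `fallback` parameter, and the `sample_dithers` table with its placeholder values are assumptions for illustration and are not part of the demo.

```c
#include <assert.h>
#include <stddef.h>

typedef struct
{
    char name [20];
    int  value;
} named_int_t;

/* Placeholder table; the values 10/20/30 are illustrative only and
 * are not the real pixman_dither_t constants. */
static const named_int_t sample_dithers[] =
{
    { "None",             10 },
    { "Bayer 8x8",        20 },
    { "Blue noise 64x64", 30 },
};

/* Hypothetical helper: return table[active].value, or fallback when the
 * index is out of range (for example -1 for "nothing selected"). */
static int
lookup_value (const named_int_t table[], int n_entries,
              int active, int fallback)
{
    assert (table != NULL);

    if (active < 0 || active >= n_entries)
        return fallback;

    return table[active].value;
}
```

In the demo this is mostly a defensive measure, since `set_up_combo_box` selects entry 0 before the `changed` handler is connected.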

147
demos/dither.ui Normal file

@ -0,0 +1,147 @@
<?xml version="1.0" encoding="UTF-8"?>
<interface>
<requires lib="gtk+" version="2.12"/>
<object class="GtkWindow" id="main">
<property name="can_focus">False</property>
<child>
<placeholder/>
</child>
<child>
<object class="GtkHBox" id="u">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="spacing">12</property>
<child>
<object class="GtkScrolledWindow" id="scrolledwindow1">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="shadow_type">in</property>
<child>
<object class="GtkViewport" id="viewport1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<child>
<object class="GtkDrawingArea" id="drawing_area">
<property name="visible">True</property>
<property name="can_focus">False</property>
</object>
</child>
</object>
</child>
</object>
<packing>
<property name="expand">True</property>
<property name="fill">True</property>
<property name="position">0</property>
</packing>
</child>
<child>
<object class="GtkVBox" id="box1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="spacing">12</property>
<child>
<object class="GtkVBox" id="box6">
<property name="visible">True</property>
<property name="can_focus">False</property>
<child>
<object class="GtkTable" id="grid1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="n_rows">2</property>
<property name="n_columns">2</property>
<property name="column_spacing">8</property>
<property name="row_spacing">6</property>
<child>
<object class="GtkLabel" id="label4">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="label" translatable="yes">&lt;b&gt;Target format:&lt;/b&gt;</property>
<property name="use_markup">True</property>
<property name="xalign">1</property>
</object>
</child>
<child>
<object class="GtkLabel" id="label5">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="label" translatable="yes">&lt;b&gt;Dithering:&lt;/b&gt;</property>
<property name="use_markup">True</property>
<property name="xalign">1</property>
</object>
<packing>
<property name="top_attach">1</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="target_format_combo_box">
<property name="visible">True</property>
<property name="can_focus">False</property>
</object>
<packing>
<property name="left_attach">1</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="dithering_combo_box">
<property name="visible">True</property>
<property name="can_focus">False</property>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">1</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="fill">True</property>
<property name="padding">6</property>
<property name="position">1</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="fill">True</property>
<property name="position">0</property>
</packing>
</child>
</object>
<packing>
<property name="expand">False</property>
<property name="fill">True</property>
<property name="position">1</property>
</packing>
</child>
</object>
</child>
</object>
<object class="GtkAdjustment" id="rotate_adjustment">
<property name="lower">-180</property>
<property name="upper">190</property>
<property name="step_increment">1</property>
<property name="page_increment">10</property>
<property name="page_size">10</property>
</object>
<object class="GtkAdjustment" id="scale_x_adjustment">
<property name="lower">-32</property>
<property name="upper">42</property>
<property name="step_increment">1</property>
<property name="page_increment">10</property>
<property name="page_size">10</property>
</object>
<object class="GtkAdjustment" id="scale_y_adjustment">
<property name="lower">-32</property>
<property name="upper">42</property>
<property name="step_increment">1</property>
<property name="page_increment">10</property>
<property name="page_size">10</property>
</object>
<object class="GtkAdjustment" id="subsample_adjustment">
<property name="upper">12</property>
<property name="value">4</property>
<property name="step_increment">1</property>
<property name="page_increment">1</property>
</object>
</interface>

View File

@ -1,7 +1,7 @@
#include <stdio.h>
#include <stdlib.h>
#include "pixman.h"
#include "utils.h"
#include "gtk-utils.h"
int
main (int argc, char **argv)
@ -15,38 +15,42 @@ main (int argc, char **argv)
int i;
pixman_gradient_stop_t stops[2] =
{
{ pixman_int_to_fixed (0), { 0xffff, 0xeeee, 0xeeee, 0xeeee } },
{ pixman_int_to_fixed (1), { 0xffff, 0x1111, 0x1111, 0x1111 } }
{ pixman_int_to_fixed (0), { 0x0000, 0x0000, 0xffff, 0xffff } },
{ pixman_int_to_fixed (1), { 0xffff, 0x1111, 0x1111, 0xffff } }
};
pixman_point_fixed_t p1 = { pixman_double_to_fixed (0), 0 };
pixman_point_fixed_t p2 = { pixman_double_to_fixed (WIDTH / 8.),
pixman_int_to_fixed (0) };
pixman_point_fixed_t p1 = { pixman_double_to_fixed (50), 0 };
pixman_point_fixed_t p2 = { pixman_double_to_fixed (200), 0 };
#if 0
pixman_transform_t trans = {
{ { pixman_double_to_fixed (2), pixman_double_to_fixed (0.5), pixman_double_to_fixed (-100), },
{ pixman_double_to_fixed (0), pixman_double_to_fixed (3), pixman_double_to_fixed (0), },
{ pixman_double_to_fixed (0), pixman_double_to_fixed (0.000), pixman_double_to_fixed (1.0) }
}
};
pixman_transform_t id = {
#else
pixman_transform_t trans = {
{ { pixman_fixed_1, 0, 0 },
{ 0, pixman_fixed_1, 0 },
{ 0, 0, pixman_fixed_1 } }
};
#endif
#if 0
pixman_point_fixed_t c_inner;
pixman_point_fixed_t c_outer;
pixman_fixed_t r_inner;
pixman_fixed_t r_outer;
#endif
for (i = 0; i < WIDTH * HEIGHT; ++i)
dest[i] = 0x4f00004f; /* pale blue */
dest[i] = 0xff00ff00;
dest_img = pixman_image_create_bits (PIXMAN_a8r8g8b8,
WIDTH, HEIGHT,
dest,
WIDTH * 4);
#if 0
c_inner.x = pixman_double_to_fixed (50.0);
c_inner.y = pixman_double_to_fixed (50.0);
c_outer.x = pixman_double_to_fixed (50.0);
@ -56,6 +60,7 @@ main (int argc, char **argv)
src_img = pixman_image_create_conical_gradient (&c_inner, r_inner,
stops, 2);
#endif
#if 0
src_img = pixman_image_create_conical_gradient (&c_inner, r_inner,
stops, 2);
@ -67,8 +72,8 @@ main (int argc, char **argv)
src_img = pixman_image_create_linear_gradient (&p1, &p2,
stops, 2);
pixman_image_set_transform (src_img, &id);
pixman_image_set_repeat (src_img, PIXMAN_REPEAT_PAD);
pixman_image_set_transform (src_img, &trans);
pixman_image_set_repeat (src_img, PIXMAN_REPEAT_NONE);
pixman_image_composite (PIXMAN_OP_OVER, src_img, NULL, dest_img,
0, 0, 0, 0, 0, 0, 10 * WIDTH, HEIGHT);

177
demos/gtk-utils.c Normal file

@ -0,0 +1,177 @@
#include <gtk/gtk.h>
#ifdef HAVE_CONFIG_H
#include <pixman-config.h>
#endif
#include "utils.h"
#include "gtk-utils.h"
pixman_image_t *
pixman_image_from_file (const char *filename, pixman_format_code_t format)
{
GdkPixbuf *pixbuf;
pixman_image_t *image;
int width, height;
uint32_t *data, *d;
uint8_t *gdk_data;
int n_channels;
int j, i;
int stride;
if (!(pixbuf = gdk_pixbuf_new_from_file (filename, NULL)))
return NULL;
image = NULL;
width = gdk_pixbuf_get_width (pixbuf);
height = gdk_pixbuf_get_height (pixbuf);
n_channels = gdk_pixbuf_get_n_channels (pixbuf);
gdk_data = gdk_pixbuf_get_pixels (pixbuf);
stride = gdk_pixbuf_get_rowstride (pixbuf);
if (!(data = malloc (width * height * sizeof (uint32_t))))
goto out;
d = data;
for (j = 0; j < height; ++j)
{
uint8_t *gdk_line = gdk_data;
for (i = 0; i < width; ++i)
{
int r, g, b, a;
uint32_t pixel;
r = gdk_line[0];
g = gdk_line[1];
b = gdk_line[2];
if (n_channels == 4)
a = gdk_line[3];
else
a = 0xff;
r = (r * a + 127) / 255;
g = (g * a + 127) / 255;
b = (b * a + 127) / 255;
pixel = (a << 24) | (r << 16) | (g << 8) | b;
*d++ = pixel;
gdk_line += n_channels;
}
gdk_data += stride;
}
image = pixman_image_create_bits (
format, width, height, data, width * 4);
out:
g_object_unref (pixbuf);
return image;
}
GdkPixbuf *
pixbuf_from_argb32 (uint32_t *bits,
int width,
int height,
int stride)
{
GdkPixbuf *pixbuf = gdk_pixbuf_new (GDK_COLORSPACE_RGB, TRUE,
8, width, height);
int p_stride = gdk_pixbuf_get_rowstride (pixbuf);
guint32 *p_bits = (guint32 *)gdk_pixbuf_get_pixels (pixbuf);
int i;
for (i = 0; i < height; ++i)
{
uint32_t *src_row = &bits[i * (stride / 4)];
uint32_t *dst_row = p_bits + i * (p_stride / 4);
a8r8g8b8_to_rgba_np (dst_row, src_row, width);
}
return pixbuf;
}
static gboolean
on_draw (GtkWidget *widget, cairo_t *cr, gpointer user_data)
{
pixman_image_t *pimage = user_data;
int width = pixman_image_get_width (pimage);
int height = pixman_image_get_height (pimage);
int stride = pixman_image_get_stride (pimage);
cairo_surface_t *cimage;
cairo_format_t format;
if (pixman_image_get_format (pimage) == PIXMAN_x8r8g8b8)
format = CAIRO_FORMAT_RGB24;
else
format = CAIRO_FORMAT_ARGB32;
cimage = cairo_image_surface_create_for_data (
(uint8_t *)pixman_image_get_data (pimage),
format, width, height, stride);
cairo_rectangle (cr, 0, 0, width, height);
cairo_set_source_surface (cr, cimage, 0, 0);
cairo_fill (cr);
cairo_surface_destroy (cimage);
return TRUE;
}
void
show_image (pixman_image_t *image)
{
GtkWidget *window;
int width, height;
int argc;
char **argv;
char *arg0 = g_strdup ("pixman-test-program");
pixman_format_code_t format;
pixman_image_t *copy;
argc = 1;
argv = (char **)&arg0;
gtk_init (&argc, &argv);
window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
width = pixman_image_get_width (image);
height = pixman_image_get_height (image);
gtk_window_set_default_size (GTK_WINDOW (window), width, height);
format = pixman_image_get_format (image);
/* We always display the image as if it contains sRGB data. That
* means that no conversion should take place when the image
* has the a8r8g8b8_sRGB format.
*/
switch (format)
{
case PIXMAN_a8r8g8b8_sRGB:
case PIXMAN_a8r8g8b8:
case PIXMAN_x8r8g8b8:
copy = pixman_image_ref (image);
break;
default:
copy = pixman_image_create_bits (PIXMAN_a8r8g8b8,
width, height, NULL, -1);
pixman_image_composite32 (PIXMAN_OP_SRC,
image, NULL, copy,
0, 0, 0, 0, 0, 0,
width, height);
break;
}
g_signal_connect (window, "draw", G_CALLBACK (on_draw), copy);
g_signal_connect (window, "delete_event", G_CALLBACK (gtk_main_quit), NULL);
gtk_widget_show (window);
gtk_main ();
}
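
Note on the pixel loop in `pixman_image_from_file`: GdkPixbuf stores non-premultiplied RGBA bytes, while pixman's a8r8g8b8 format expects premultiplied color, so each channel is scaled by alpha with rounding before packing. The following is a minimal sketch of that per-pixel step. The function name `premultiply_to_argb32` is hypothetical, and it uses unsigned arithmetic (an assumption that differs from the `int` variables in the file) so that the shift of alpha into the top byte stays well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of gtk-utils.c): pack one non-premultiplied
 * RGBA sample into a premultiplied a8r8g8b8 pixel, with the same
 * (c * a + 127) / 255 rounding as the loop in pixman_image_from_file. */
static uint32_t
premultiply_to_argb32 (uint8_t r8, uint8_t g8, uint8_t b8, uint8_t a8)
{
    uint32_t a = a8;
    uint32_t r = (r8 * a + 127) / 255;
    uint32_t g = (g8 * a + 127) / 255;
    uint32_t b = (b8 * a + 127) / 255;

    /* Premultiplied invariant: no color channel exceeds alpha. */
    assert (r <= a && g <= a && b <= a);

    return (a << 24) | (r << 16) | (g << 8) | b;
}
```

For example, opaque white stays 0xffffffff, while pure red at alpha 128 packs to 0x80800000.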

15
demos/gtk-utils.h Normal file

@ -0,0 +1,15 @@
#include <stdio.h>
#include <stdlib.h>
#include <glib.h>
#include <gtk/gtk.h>
#include "pixman.h"
void show_image (pixman_image_t *image);
pixman_image_t *
pixman_image_from_file (const char *filename, pixman_format_code_t format);
GdkPixbuf *pixbuf_from_argb32 (uint32_t *bits,
int width,
int height,
int stride);

50
demos/linear-gradient.c Normal file

@ -0,0 +1,50 @@
#include "utils.h"
#include "gtk-utils.h"
#define WIDTH 1024
#define HEIGHT 640
int
main (int argc, char **argv)
{
pixman_image_t *src_img, *dest_img;
pixman_gradient_stop_t stops[] = {
{ 0x00000, { 0x0000, 0x0000, 0x4444, 0xdddd } },
{ 0x10000, { 0xeeee, 0xeeee, 0x8888, 0xdddd } },
#if 0
/* These colors make it very obvious that dithering
* is useful even for 8-bit gradients
*/
{ 0x00000, { 0x6666, 0x3333, 0x3333, 0xffff } },
{ 0x10000, { 0x3333, 0x6666, 0x6666, 0xffff } },
#endif
};
pixman_point_fixed_t p1, p2;
enable_divbyzero_exceptions ();
dest_img = pixman_image_create_bits (PIXMAN_x8r8g8b8,
WIDTH, HEIGHT,
NULL, 0);
p1.x = p1.y = 0x0000;
p2.x = WIDTH << 16;
p2.y = HEIGHT << 16;
src_img = pixman_image_create_linear_gradient (&p1, &p2, stops, ARRAY_LENGTH (stops));
pixman_image_composite32 (PIXMAN_OP_OVER,
src_img,
NULL,
dest_img,
0, 0,
0, 0,
0, 0,
WIDTH, HEIGHT);
show_image (dest_img);
pixman_image_unref (dest_img);
return 0;
}
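
Note on the coordinates in linear-gradient.c: `WIDTH << 16` and the stop offsets `0x00000` / `0x10000` are 16.16 fixed-point values, i.e. the integer part sits in the upper 16 bits and 0x10000 represents 1.0. The sketch below mirrors that encoding without depending on pixman. The type `fixed_16_16_t` and the three helper names are stand-ins chosen for illustration; in real code the `pixman_int_to_fixed` and `pixman_double_to_fixed` macros from pixman.h would be used instead.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for pixman_fixed_t: signed 16.16 fixed point. */
typedef int32_t fixed_16_16_t;

/* Hypothetical helper: integer to 16.16 fixed point. */
static fixed_16_16_t
int_to_fixed (int i)
{
    /* Only 16 bits are available for the integer part. */
    assert (i >= -32768 && i <= 32767);

    /* Shift as unsigned so negative inputs do not left-shift a signed value. */
    return (fixed_16_16_t) ((uint32_t) i << 16);
}

/* Hypothetical helper: double to 16.16 fixed point (truncating). */
static fixed_16_16_t
double_to_fixed (double d)
{
    return (fixed_16_16_t) (d * 65536.0);
}

/* Hypothetical helper: 16.16 fixed point back to double. */
static double
fixed_to_double (fixed_16_16_t f)
{
    return f / 65536.0;
}
```

Under these definitions, `int_to_fixed (1024)` gives the same value as `WIDTH << 16` in the demo, and a stop offset of 0x10000 reads back as 1.0.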

66
demos/meson.build Normal file

@ -0,0 +1,66 @@
# Copyright © 2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
extra_demo_cflags = []
if cc.get_argument_syntax() == 'msvc'
extra_demo_cflags = ['-D_USE_MATH_DEFINES']
endif
demos = [
'gradient-test',
'alpha-test',
'composite-test',
'clip-test',
'trap-test',
'screen-test',
'convolution-test',
'radial-test',
'linear-gradient',
'conical-test',
'tri-test',
'checkerboard',
'srgb-test',
'srgb-trap-test',
'scale',
'dither',
]
if dep_gtk.found()
libdemo = static_library(
'demo',
['gtk-utils.c', config_h, version_h],
dependencies : [libtestutils_dep, dep_gtk, dep_glib, dep_png, dep_m, dep_openmp],
include_directories : inc_pixman,
)
if dep_gtk.found()
foreach d : demos
executable(
d,
[d + '.c', config_h, version_h],
c_args : extra_demo_cflags,
link_with : [libdemo],
dependencies : [idep_pixman, libtestutils_dep, dep_glib, dep_gtk, dep_openmp, dep_png],
)
endforeach
endif
endif

Some files were not shown because too many files have changed in this diff.