Unconditionally back up/restore locale configuration files and generate
en_US.UTF-8. Previously the test failed in environments which have some
locale other than en_US.UTF-8 in /etc/default/locale.
Also fix the assertion of /etc/locale.conf not being present after
localectl. This only applies to Debian/Ubuntu tests, not upstream ones
(see Use-Debian-specific-config-files.patch).
Some kernels in Ubuntu (e.g. linux-kvm) do not enable CONFIG_PM, which
results in stderr output when the logind test tries to grep the power
state file, causing the test to fail. The test already skips itself
if suspend isn't supported, so just pass -s to grep to suppress its
error message on stderr if the file doesn't exist.
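A minimal sketch of the effect of -s. The helper name can_suspend and
the default path /sys/power/state are assumptions for illustration, not
taken from the test itself:

```shell
# Hypothetical helper: succeed only if the power state file lists "mem".
# $1: path to the state file (assumed default: /sys/power/state)
can_suspend() {
    state_file="${1:-/sys/power/state}"
    # -q: print nothing to stdout
    # -s: print no error message if the file is missing or unreadable
    grep -qs mem "$state_file"
}

if can_suspend; then
    echo "suspend supported"
else
    echo "suspend not supported, skipping"
fi
```

With -s, a missing file still yields a non-zero exit status, so the
existing skip logic keeps working; only the stderr noise goes away.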
Currently, the 'upstream' test runs all upstream-provided tests at tests/TEST-*,
except for any that are in a hardcoded list contained in this test.
Any change to the hardcoded blacklist involves patching the debian repo, which is
not an ideal way for upstream systemd to manage a hopefully temporary blacklist of
problematic tests.
There was discussion around making the blacklist entirely provided by an upstream
env var, configured in the upstream webhooks that control the Ubuntu CI tests,
but that results in two problems: 1) Debian would no longer be able to easily
control its own blacklist, and 2) upstream would only have an all-or-none control
over the blacklist, meaning that it could only enable or disable any specific
test for all PRs; it could not enable a test for only a specific PR that was
attempting to fix the flaky test (for example).
This approach moves blacklist control into the systemd tests themselves, by changing
the 'upstream' test to look for a file in the test directory, and skipping the test
if such file is found. This way, Debian (and Ubuntu and other Debian derivatives)
can continue to manage their own blacklist, but also upstream can control
the blacklist for individual tests on a per-PR basis.
For example, if upstream has the file 'tests/TEST-01-BASIC/blacklist-ubuntu-ci'
in its repo, Ubuntu CI will skip this test for all PRs opened. Then, a PR can
be opened that both fixes the test and removes this file. The Ubuntu CI
will then run the test, but only for the PR that attempts to fix it. Once that
PR is merged, all future tests will then run the fixed test.
The specific naming of the per-test blacklist file, for tests run from the
upstream repo, is either "blacklist-ubuntu-ci" to completely blacklist the test
on all archs, or "blacklist-ubuntu-ci-$DPKGARCH" to blacklist the test only
for a specific arch, for example "blacklist-ubuntu-ci-amd64". For tests run
from Debian (or Ubuntu), which are run without the TEST_UPSTREAM param set,
the blacklist filename is 'blacklist-upstream-ci[-$DPKGARCH]'.
Note that the $DPKGARCH is specified as the value returned by
'dpkg --print-architecture', not the value returned from 'uname -m' (e.g.
to blacklist a test for intel 64-bit, 'amd64' should be used, not 'x86_64').
This naming matches the title of the ubuntu tests, such as 'bionic-amd64',
'bionic-arm64', etc.
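The naming rules above can be sketched as a small shell check. The
function name is_blacklisted and its argument order are hypothetical;
only the file names come from the description above:

```shell
# Hypothetical helper: return 0 if the test directory carries a blacklist file.
# $1: test directory, e.g. tests/TEST-01-BASIC
# $2: architecture as printed by 'dpkg --print-architecture', e.g. amd64
# $3: "ubuntu" for tests run from the upstream repo (TEST_UPSTREAM set),
#     "upstream" for tests run from Debian or Ubuntu
is_blacklisted() {
    testdir="$1"
    arch="$2"
    prefix="blacklist-$3-ci"
    # Either the all-arch file or the arch-specific file is enough.
    [ -e "$testdir/$prefix" ] || [ -e "$testdir/$prefix-$arch" ]
}
```

A caller could run is_blacklisted "$t" "$(dpkg --print-architecture)" ubuntu
for each test directory and skip the test when it succeeds.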
polkit.service is started on-demand via D-Bus activation. This means it
is not a good indicator of whether a boot was successful. Instead, check
whether NetworkManager is running, as it is started via multi-user.target.
Closes: #934992
The check for is-system-running handles verifying that systemd completes
all its jobs during startup, making this test redundant. Also, since the
testbed can (and does) start more jobs after is-system-running, this test
can provide a false negative if any of those jobs continue running after
this check's timeout.
There is currently no delay to wait for is-system-running to reach running
or degraded, so it's very possible for there to be running jobs. This adds
a delay until is-system-running is 'running' or 'degraded', and gathers
extra artifacts if the system is 'degraded'.
Additional details are in this Ubuntu bug:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1825997
Note that the bug description states:
"This patch is not required for debian, because debian's boot-smoke does
not include the wait for systemctl is-system-running", however this patch
adds that wait, because without it, the check for running jobs can easily
fail, if the system isn't fully running yet.
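A rough sketch of such a wait loop, not the actual test code. The
function name, the 300 second default and the STATE_CMD override (used
here only to make the loop testable) are assumptions:

```shell
# Hypothetical helper: poll until the system state is 'running' or 'degraded'.
# $1: timeout in seconds (assumed default: 300)
# STATE_CMD may be set to replace the real command, e.g. for testing.
wait_system_running() {
    timeout="${1:-300}"
    state=unknown
    while [ "$timeout" -gt 0 ]; do
        # is-system-running exits non-zero for every state except 'running',
        # so compare its text output rather than its exit status.
        state=$(${STATE_CMD:-systemctl is-system-running} 2>/dev/null || true)
        case "$state" in
            running|degraded)
                echo "$state"
                return 0
                ;;
        esac
        sleep 1
        timeout=$((timeout - 1))
    done
    echo "$state"
    return 1
}
```

The caller can inspect the printed state and gather extra artifacts
when it is 'degraded'.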
A previous commit changed the plaintext_name to a unique per-test name,
so there is no need to match the ask_password content on 'scsi_debug' now,
since just matching the plaintext_name should be unique.
This removes the systemctl start, as the service should already be
running (or starting, or failed). Instead, just wait for the service
to actually reach 'active' or 'failed' state. Then, only stop the
service if its state is 'active'.
Previous attempts to fix hung or flaky test runs included adding a start
before the stop, but the start-then-stop calls can lead to cancelled jobs
instead of a properly stopped service, which is again flaky.
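The wait-then-stop logic could look roughly like this in shell. This is
only an illustration of the idea; the function name, the 60 second
default and the SYSTEMCTL override are assumptions:

```shell
# Hypothetical helper: wait for a unit to settle, then stop it only if active.
# $1: unit name; $2: timeout in seconds (assumed default: 60)
# SYSTEMCTL may be set to replace the real binary, e.g. for testing.
stop_if_active() {
    unit="$1"
    timeout="${2:-60}"
    while [ "$timeout" -gt 0 ]; do
        state=$(${SYSTEMCTL:-systemctl} show --property=ActiveState "$unit")
        case "$state" in
            ActiveState=active)
                # Start job has finished, so stop will not cancel it.
                ${SYSTEMCTL:-systemctl} stop "$unit"
                return
                ;;
            ActiveState=failed)
                # Nothing to stop.
                return 0
                ;;
        esac
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}
```

No start is issued at any point, so there is no start job for the stop
to cancel.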
Setting the plaintext name to include the testname makes debugging failures
easier.
Also for convenience, create service_name field in setup method, for use
in tests and/or teardown method.
Run the 'clean' target before setup/run, to match the upstream
'all' target's behavior. The final clean must also be run as a
separate make invocation, as a failure of setup or run will stop make,
and cleanup won't be done if called in the same make line.
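A sketch of the resulting call sequence, assuming the targets are named
clean, setup and run. The function name and the MAKE override are
hypothetical:

```shell
# Hypothetical helper: run one upstream test directory with guaranteed cleanup.
# $1: test directory, e.g. test/TEST-01-BASIC
# MAKE may be set to replace the real make, e.g. for testing.
run_upstream_test() {
    dir="$1"
    rc=0
    # Clean first, as the upstream 'all' target does.
    ${MAKE:-make} -C "$dir" clean
    # If setup or run fails, make stops here...
    ${MAKE:-make} -C "$dir" setup run || rc=$?
    # ...so the final clean needs its own invocation.
    ${MAKE:-make} -C "$dir" clean
    return "$rc"
}
```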
The is-active return code isn't the correct way to verify a service is
fully stopped; instead use show --property=ActiveState to verify it
is 'inactive'.
This also could use the text output of is-active, but (per manpage)
the show command is "intended to be used whenever computer-parsable
output is required."
systemctl is-active returns non-zero even while the service is
'deactivating', but not actually stopped, which allows the testcase
to fail intermittently on slow machines, if the service hasn't
actually stopped before reaching the check to verify the service
stopped.
For example:
$ systemctl is-active systemd-timesyncd
active
$ timedatectl set-ntp false ; systemctl is-active systemd-timesyncd ; echo $?
deactivating
3
So the test code which does:
$ while systemctl is-active --quiet systemd-timesyncd; do sleep 1; done
will never actually perform that sleep.
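A loop that does wait could be sketched as follows, comparing the
ActiveState property instead of the is-active exit status. The function
name, the 30 second default and the SHOW_CMD override are assumptions:

```shell
# Hypothetical helper: wait until a unit's ActiveState is 'inactive'.
# $1: unit name; $2: timeout in seconds (assumed default: 30)
# SHOW_CMD may be set to replace the real command, e.g. for testing.
wait_inactive() {
    unit="$1"
    timeout="${2:-30}"
    while [ "$timeout" -gt 0 ]; do
        # Output has the form "ActiveState=<state>"; 'deactivating' keeps
        # the loop going, unlike the is-active exit status.
        state=$(${SHOW_CMD:-systemctl show --property=ActiveState} "$unit")
        if [ "$state" = "ActiveState=inactive" ]; then
            return 0
        fi
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}
```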
Unfortunately, these tests are failing on i386 builds, very
intermittently; this causes headaches for upstream systemd
maintainers and submitters, when the Ubuntu CI fails, because
the failure isn't related to the PRs the tests are running for,
and determining that from the Ubuntu CI log file and other artifacts
is very time-consuming.
This just blacklists the tests for now, until we can figure out why
they are failing and fix them.
TEST-30 is discussed:
https://github.com/systemd/systemd/issues/12268
TEST-34 is discussed:
https://github.com/systemd/systemd/issues/12932
The test currently looks for the first(ish) kernel log message, which may
not be present. If it's not, the test case fails.
This isn't the fault of systemd/journald; the problem is that the kernel
filled up its klog buffer before we started journald to read the messages.
This can be
caused by a too-small kernel klog buffer, or could be caused by a large
number of kernel boot-time messages. More details are in this Ubuntu bug:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1830479
While catching all the kernel log messages since boot is important, this
test case is testing systemd, not the kernel, and should not fail if
the kernel's log buffer is filling during boot.
This problem has caused upstream systemd to disable Ubuntu CI on arm64
for many months:
https://github.com/systemd/systemd/issues/11104
Closes: #929730
BugLink: https://bugs.launchpad.net/bugs/1831296
The test case was checking for a failing result of 'code=killed', but
the recent change now causes the failure to be 'code=dumped'. The
test should pass if the result is either.
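A sketch of a check that accepts both results. The function name is
hypothetical and the input is assumed to be a status line containing
the 'code=' field:

```shell
# Hypothetical helper: succeed if the status text reports termination by a
# signal, with or without a core dump.
# $1: text to inspect, e.g. a line from 'systemctl status <unit>'
failed_by_signal() {
    case "$1" in
        *code=killed*|*code=dumped*) return 0 ;;
        *) return 1 ;;
    esac
}
```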
- Skip tests which can't work in containers.
- Add missing rsyslog test dependency.
- e2scrub_reap.service fails in containers, ignore (filed as #926138).
- Relax pgrep pattern for gdm, as there's no wayland session in
containers.
Otherwise we'll catch some
Failed to resolve group 'render': Connection timed out
messages that happen in earlier boots during VM setup, before the
"render" group is created.
Fixes https://github.com/systemd/systemd/issues/11875
Use their $AUTOPKGTEST_* equivalents.
These were introduced in autopkgtest 4.0 (June 2016), and all our CI
systems use a much newer version.
Gbp-Dch: Short
When running tests for upstream PRs, this test often fails with
checking for connection timeouts
systemd-udevd[1228]: Failed to resolve group 'render': Connection timed out
Which is not the kind of timeout the test is looking for. Create the
group in the test to avoid this.
We explicitly don't create the group in systemd.postinst as we revert
the patch that introduces the group into the udev rules.
On fast ppc64el machines, the cryptsetup start job may not complete by the
time tearDown is executed. In that case, stop simply cancels the start job
without actually cleaning up the dmsetup node. This causes the subsequent
test to fail, as it no longer starts with a clean device. Thus ensure the
systemd-cryptsetup unit is started before stopping it.
Also rmmod scsi_debug module at the end, to allow re-running the test in a
loop.
So far, we tried to avoid cleaning up manually created cgroups via a
Debian specific patch. This patch was dropped though and that particular
use case was never really supported upstream.
As this would trigger an autopkgtest failure now, let's remove this test.
Follow-up for commit 9738816398.
Gbp-Dch: Short
It appears lightdm fails to start up without it, even though it's just a
Recommends:, and it does seem to work without it on amd64. But it does
not hurt much, so let's see if it helps.
https://github.com/systemd/systemd/issues/10497
Gbp-Dch: Short
This restriction has been deprecated and there are plans to remove it
altogether. The tests pass without needs-recommends, so it seems safe
to remove.
See 9fade8dcb5
- netcat-openbsd: Required by TEST-12-ISSUE-3171.
- busybox-static: Required by TEST-13-NSPAWN-SMOKE.
- plymouth: Required by TEST-15-DROPIN and TEST-22-TMPFILES.
Otherwise logs are missing on failures:
cp: -r not specified; omitting directory '/var/tmp/systemd-test.Nq2jqR/journal/59852163a37d458f9d238b65f279b6fa'
Showing the entire debug log is too hard to scan visually, and most of
the time the warnings and errors are sufficient to explain a failure.
Put the journal files into the artifacts though, in case the debug
information is necessary.
This was horribly inefficient as a separate test (from commit
6bd0dab41e), as that cost two VM resets plus accompanying boots; and
this does not change any state thus does not require this kind of
isolation.
_ninja_bin was added in https://github.com/systemd/systemd/pull/6544 in
order to make the tests work on CentOS. As we don't actually do a "ninja
install" and ninja is not available, replace the check with a dummy
value.
Without these, dracut complains about the missing libdw.so, and test
setup about missing "quotaon". These errors become fatal with
<https://github.com/systemd/systemd/pull/6475>.
Makefile.guess tries to find the build directory if $BUILD_DIR is not
set. This doesn't always exist for an autopkgtest, thus fix it to ".";
it is not actually being used anyway. Adjust the sed for using the
system-installed nspawn binary accordingly.
It has its own autopkgtest and needs some special preparation. At some
point that should be merged into root-unittests, but let's quickfix this
to unbreak upstream CI.
This installs the necessary test data along with the programs and thus
we can greatly reduce the blacklist in debian/tests/root-unittests and
also simplify debian/rules.