Age | Commit message (Collapse) | Author |
|
|
|
In #1006, we set Azure to apply networking config every
BOOT_NEW_INSTANCE because the BOOT_LEGACY option was causing problems
applying networking the second time per boot. However,
BOOT_NEW_INSTANCE is also wrong as Azure needs to apply networking
once per boot, during init-local phase.
|
|
* Update test_combined.py to allow either valid LXD subplatform
* Split jinja templated tests into separate module as they can be more
fragile
* Move checks for warnings and tracebacks into dedicated utility
function. This allows us to work around persistent and expected
tracebacks/warnings on particular clouds.
* Update test_upgrade.py to allow either valid Azure datasource.
/var/lib/waagent or a mounted device are both valid.
* Add specificity to test_ntp_servers.py
Clouds will often specify their own ntp servers in the ntp
configuration files, so make the tests manually specify their own.
* Account for additional keys on system in test_ssh_keysfiles.py
* Update tests to account for invalid cache
test_user_events.py and test_version_change.py both have tests that
assume we will have valid ds cache when rebooting.
In test_user_events.py, subsequent boots should block applying
network on boot if boot event is denied. However, if the cache is
invalid, it is valid to apply networking config that boot.
In test_version_change.py no cache found won't trigger the expected
debug log. Additionally, the pickle used for that test on an older
release triggered an unexpected issue that took a different error
path.
* Ignore bionic in hotplug tests (LP: #1942247)
On Bionic, we traceback when attempting to detect the hotplugged
device in the updated metadata. This is because Bionic is
specifically configured not to provide network metadata.
See LP: #1942247 for more details.
* Fix date used in test_final_message.
In test_final_message, we ensured the variable substitution works as
expected. For $timestamp, we compared against the current date. It's
possible for the host date to be massively different from the client
date, so obtain date on client rather than host.
* Remove module success from lp1813396 test. Module may fail
unrelatedly (in this case apt-get update is failing), but the test
should still pass.
* Skip testing events if network is disabled
* Ensure we install expected version of cloud-init
As part of test setup, we can install cloud-init from various
sources, including PROPOSED, PPAs, etc. We were never checking that
this install completes successfully, and on OCI, it wasn't
completing successfully because of apt locking issues. Code has
been updated to retry, and then fail loudly if we can't complete the
install.
* Remove ubuntu-azure-fips metapkg which mandates FIPS-flavour kernel
In test_lp1835584.py
* Update test_user_events.py to account for Azure behavior
since Azure has a separate service to clear the pickled metadata
every boot
* Change failure to warning in test_upgrade.py if initial boot errors
If there's already a pre-existing cause for warnings or tracebacks,
that shouldn't cause the new version to fail.
* Add retry to test_random_passwords_emitted_to_serial_console
It's possible we haven't retrieved the entire log when the call returns,
so retry a few times if the output isn't empty.
|
|
Using flake8 inplace of pyflakes
Renamed run-pyflakes -> run-flake8
Changed target name to flake8 in Makefile
With pyflakes we can't suppress warnings/errors in few required places.
flake8 is flexible in that regard. Hence using flake8 seems to be a
better choice here.
flake8 does the job of pep8 anyway.
So, removed pep8 target from Makefile along with tools/run-pep8 script.
Included setup.py in flake8 checks
|
|
Home directory permissions changed in hirsute. The integration test
assumed permissions from earlier releases. Test was fixed to take both
permissions into account
|
|
Fix home permissions modified by ssh module
In #956, we updated the file and directory permissions for keys not in
the user's home directory. We also unintentionally modified the
permissions within the home directory as well. These should not change,
and this commit changes that back.
LP: #1940233
|
|
Ensure jinja templates work for both instance-data.json and
instance-data-sensitive.json. Test for LP: #1931392
Also removed test_runcmd.py as it's made redundant by this change.
|
|
The issues we see on Bionic VMs don't appear anywhere else, including
when invoking kvm directly. It likely has to do with the extra
LXD agent setup happening on bionic. Given that we still have Bionic
covered on all other platforms, the risk of skipping bionic for LXD VM
tests seems low.
|
|
Alters hotplug hook to have a query mechanism checking if the
functionality is enabled. This allows us to avoid using the hotplug
socket and service when hotplug is disabled.
|
|
(SC-191) (#955)
This should enable us to remove the cloud-tests entirely.
|
|
Implement missing device_aliases feature
The device_aliases key has been documented as part of disk_setup for
years, however the feature was never implemented. This implements the
feature as documented allowing usercfg (rather than dsconfig) to create
a mapping of device names.
This is not to be confused with disk_aliases, a very similar map but
existing solely for use by datasources.
LP: #1867532
|
|
test_ssh_import_id.py occassionally fails because cloud-init finishes
before the keys have been fully imported. A retry has been added to the
test.
|
|
Adds a udev script which will invoke a hotplug hook script on all net
add events. The script will write some udev arguments to a systemd FIFO
socket (to ensure we have only instance of cloud-init running at a
time), which is then read by a new service that calls a new 'cloud-init
devel hotplug-hook' command to handle the new event.
This hotplug-hook command will:
- Fetch the pickled datsource
- Verify that the hotplug event is supported/enabled
- Update the metadata for the datasource
- Ensure the hotplugged device exists within the datasource
- Apply the config change on the datasource metadata
- Bring up the new interface (or apply global network configuration)
- Save the updated metadata back to the pickle cache
Also scattered in some unrelated typing where helpful
|
|
Python 3.6 added a new `policy` attribute to `MIMEMultipart`.
MIMEMultipart may be part of the cached object pickle of a datasource.
Upgrading from an old version of python to 3.6+ will cause the
datasource to be invalid after pickle load.
This commit uses the upgrade framework to attempt to access the mime
message and fail early (thus discarding the cache) if we cannot.
Commit 78e89b03 should fix this issue more generally.
|
|
defined in AuthorizedKeysFile (#937)
This patch aims to fix LP1911680, by analyzing the files provided
in sshd_config and merge all keys into an user-specific file. Also
introduces additional tests to cover this specific case.
The file is picked by analyzing the path given in AuthorizedKeysFile.
If it points inside the current user folder (path is /home/user/*), it
means it is an user-specific file, so we can copy all user-keys there.
If it contains a %u or %h, it means that there will be a specific
authorized_keys file for each user, so we can copy all user-keys there.
If no path points to an user-specific file, for example when only
/etc/ssh/authorized_keys is given, default to ~/.ssh/authorized_keys.
Note that if there are more than a single user-specific file, the last
one will be picked.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Co-authored-by: James Falcon <therealfalcon@gmail.com>
LP: #1911680
RHBZ:1862967
|
|
test_upgrade.py was outputting a ton of stuff that had to be manually
collected and verified. This commit adds more assertions to the test
and outputs directly to the logs rather than separate files.
|
|
summary: Clear cache when a Python version change is detected
When a distribution gets updated it is possible that the Python version
changes. Python makes no guarantee that pickle is consistent across
versions as such we need to purge the cache and start over.
Co-authored-by: James Falcon <therealfalcon@gmail.com>
|
|
Also new jenkins tox definition
|
|
|
|
Ensure no Traceback when 'chef_license' is set
|
|
In #856 we added the ability to use partprobe instead of blockdev for
reading partitions. Test that partprobe succeeds where blockdev fails.
Also add a mechanism to our integration tests to allow a callable to be
called between `lxc init` and `lxc start`
|
|
Control is currently limited to boot events, though this should
allow us to more easily incorporate HOTPLUG support. Disabling
'instance-first-boot' is not supported as we apply networking config
too early in boot to have processed userdata (along with the fact
that this would be a pretty big foot-gun).
The concept of update events on datasource has been split into
supported update events and default update events. Defaults will be
used if there is no user-defined update events, but user-defined
events won't be supplied if they aren't supported.
When applying the networking config, we now check to see if the event
is supported by the datasource as well as if it is enabled.
Configuration looks like:
updates:
network:
when: ['boot']
|
|
This allows us to use it when validating packages from -proposed (and
PPAs etc.).
|
|
In #777, we added 'vendordata2' and 'vendordata2_raw' attributes to
the DataSource class, but didn't use the upgrade framework to deal
with an unpickle after upgrade. This commit adds the necessary
upgrade code.
Additionally, added a smaller-scope upgrade test to our integration
tests that will be run on every CI run so we catch these issues
immediately in the future.
LP: #1922739
|
|
the above option allows the user to control the behavior of a distro
hostname selection if both short hostname and FQDN are supplied.
If `prefer_fqdn_over_hostname` is true the FQDN will be selected as
hostname; if false the hostname will be selected
LP: #1921004
|
|
The current method of running a background sleep until travis is
finished is causing integration test runs to pass even when they should
be failing.
Instead, update the code to emit dots itself.
|
|
When output of SSH host keys and/or SSH fingerprints are disabled for
all keys do not display headers and footers.
Prevent risk of message text being interpreted as "logger" option by
appending "--" to logger options.
Correct syslog output that was tagged with "ec2" regardless of DataSource
in use. Now use "cloud-init" tag instead.
Various "shellcheck" corrections.
Add testcase for disabled output of SSH host keys.
|
|
Prior to this commit, when a user specified configuration which would
generate random passwords for users, cloud-init would cause those
passwords to be written to the serial console by emitting them on
stderr. In the default configuration, any stdout or stderr emitted by
cloud-init is also written to `/var/log/cloud-init-output.log`. This
file is world-readable, meaning that those randomly-generated passwords
were available to be read by any user with access to the system. This
presents an obvious security issue.
This commit responds to this issue in two ways:
* We address the direct issue by moving from writing the passwords to
sys.stderr to writing them directly to /dev/console (via
util.multi_log); this means that the passwords will never end up in
cloud-init-output.log
* To avoid future issues like this, we also modify the logging code so
that any files created in a log sink subprocess will only be
owner/group readable and, if it exists, will be owned by the adm
group. This results in `/var/log/cloud-init-output.log` no longer
being world-readable, meaning that if there are other parts of the
codebase that are emitting sensitive data intended for the serial
console, that data is no longer available to all users of the system.
LP: #1918303
|
|
The apt default test wasn't ported over from cloud-tests correctly.
uri should be specified in the test, but it was not, so the test
failed on openstack (and likely other platforms) because without
a specified uri, the default uri will vary by platform. I separated
this uri test out into a separate test function.
Also add openstack specific test for apt configuration with no uri.
Other platform-specific tests should be added here over time.
|
|
The latest pycloudlib now launches official Ubuntu cloud images for
xenial, meaning that `lxc exec` no longer works against them. This
commit includes handling for tests which are affected by this change;
further details and reasoning in the included comment.
|
|
Newer verisons of /etc/sudoers prefer @includedir over
#includedir. Ensure we handle that properly and don't include an
additional #includedir when one isn't warranted.
|
|
This mounts the full directories that we install into systems over their
corresponding paths within the system under test, getting us slightly
closer to testing what a package would install.
|
|
`get_interfaces` is used to in two ways, broadly: firstly, to determine
the available interfaces when converting cloud network configuration
formats to cloud-init's network configuration formats; and, secondly, to
ensure that any interfaces which are specified in network configuration
are (a) available, and (b) named correctly. The first of these is
unaffected by this commit, as no clouds support Open vSwitch
configuration in their network configuration formats.
For the second, we check that MAC addresses of physical devices are
unique. In some OVS configurations, there are OVS-created devices which
have duplicate MAC addresses, either with each other or with physical
devices. As these interfaces are created by OVS, we can be confident
that (a) they will be available when appropriate, and (b) that OVS will
name them correctly. As such, this commit excludes any OVS-internal
interfaces from the set of interfaces returned by `get_interfaces`.
LP: #1912844
|
|
|
|
* Xenial issue
The `apt-key finger` format changed since Xenial. Sample Xenial output:
pub 4096R/991BC93C 2018-09-17
Key fingerprint = F6EC B376 2474 EDA9 D21B 7022 8719 20D1 991B
Sample Focal output:
pub rsa4096 2016-04-12 [SC]
EB4C 1BFD 4F04 2F6D DDCC EC91 7721 F63B D38B 4796
What didn't change is the format of the key fingerprint, which should be
enough to ensure that the right key is in place across all the supported
releases.
* Hirsute issue
TestApt::test_ppa_source also fails on Hirsute because of a difference
in how the PPA keys are added. On Focla this command:
add-apt-repository ppa:simplestreams-dev/trunk
install /etc/apt/trusted.gpg.d/simplestreams-dev_ubuntu_trunk.gpg, while
on Hirsute the file is names simplestreams-dev-ubuntu-trunk.gpg. The
filename is part of the `apt-key finger` output, and this the test
fails. Only checking for the presence of the key fingerprint in apt-key
also covers this case.
LP: #1916629
|
|
Changes:
* Only merge in default Azure cloud ephemeral disk configs
during DataSourceAzure._get_data() if the ephemeral disk
exists.
* DataSourceAzure.address_ephemeral_resize() (which is
invoked in DataSourceAzure.activate() should only set up
the ephemeral disk if the disk exists.
Azure VMs may or may not come with ephemeral resource disks
depending on the VM SKU. For VM SKUs that come with
ephemeral resource disks, the Azure platform guarantees that
the ephemeral resource disk is attached to the VM before
the VM is booted. For VM SKUs that do not come with
ephemeral resource disks, cloud-init currently attempts
to wait and set up a non-existent ephemeral resource
disk, which wastes boot time. It also causes disk setup
modules to fail (due to non-existent references to the
ephemeral resource disk).
udevadm settle is invoked by cloud-init very early in boot.
udevadm settle is invoked very early, before
DataSourceAzure's _get_data() and activate() methods.
Within DataSourceAzure's _get_data() and activate() methods,
the ephemeral resource disk path should exist if the
VM SKU comes with an ephemeral resource disk.
The ephemeral resource disk path should not exist if the
VM SKU does not come with an ephemeral resource disk.
LP: #1901011
|
|
Specifically:
ssh:
emit_keys_to_console: false
We also port the cc_keys_to_console cloud tests to the new integration
testing framework, and add a test for this new option.
LP: #1915460
|
|
pycloudlib has modified the way LXD executes tests
(https://github.com/canonical/pycloudlib/pull/114): it will always use
SSH to access them by default, instead of using `lxc exec`. This
behaviour is transparent for them majority of cloud-init's integration
tests, but some currently depend on using `lxc exec` to access instances
with (intentionally) broken networking: obviously these are not
accessible via SSH.
pycloudlib retains support for switching an instance to use `lxc exec`.
This commit introduces the `lxd_use_exec` mark, which tests can use to
indicate to the integration testing framework that they should be so
switched, and applies it to all applicable tests.
|
|
Kernel's newer than 4.15 present /sys/dmi/id/product_uuid as a
lowercase value. Previously UUID was uppercase.
Azure datasource reads the product_uuid directly as their platform's
instance-id. This presents a problem if a kernel is either
upgraded or downgraded across the 4.15 kernel version boundary because
the case of the UUID will change, resulting in cloud-init seeing a
"new" instance id and re-running all modules.
Re-running cc_ssh in cloud-init deletes and regenerates ssh_host keys
on a system which can cause concern on long-running instances that
somethingnefarious has happened.
Also add:
- An integration test for this for Azure Bionic Ubuntu FIPS upgrading from
a FIPS kernel with uppercase UUID to a lowercase UUID in linux-azure
- A new pytest.mark.sru_next to collect all integration tests related to our
next SRU
LP: #1835584
|
|
This allows out-of-date images to be brought up-to-date with the
archive, so that tests written against the latest cloud-init release
will pass.
|
|
Using the same MAC address results in strange test behaviour if more
than one such instance is up: traffic gets routed to an arbitrary
interface with the given MAC address. This can happen if running tests
in parallel, or on a system which retains test instances from previous
runs.
The introduction of tests/integration_tests/__init__.py means that
pylint now checks the integration tests: this commit also addresses
those failures.
|
|
|
|
Without a MAC address match clause, the test network configuration is
not applied to the primary interface in LXD VMs (which is named enp*s*
rather than eth0).
|
|
`test_seed_random_data.py` was failing on openstack as openstack
provides additional binary seed data to the end of the specified file.
The test has been changed to only read the ascii porition of
seed file.
|
|
|
|
In LXD containers, the default interface is named eth0. In VMs, it
isn't; it's renamed by systemd (likely to enp5s0, but we can't rely on
that). This means that, on VMs, the network configuration we specify
for "eth0" doesn't match an interface in the system and so is not
applied.
This modifies the test to set a MAC address in a match clause in the
network configuration and on the eth0 interface (which is the LXD name
in both containers and VMs pre-rename): this ensures that the specified
configuration applies in both cases.
|
|
|
|
pycloudlib no longer raises exceptions when cloud-init fails to start,
and the API has been updated accordingly. Changes have been made to
integration tests accordingly
|
|
Stop requiring compartment_id for OCI and project_id for GCE since they
can now be inferred in pycloudlib.
|
|
Ubuntu cloud images ship /etc/cloud/build.info which includes a line
with the build serial used to identify the image:
serial: 20210108
This is valuable information when verifying Ubuntu issues (to confirm
that testing is happening against the expected image), but is also
useful when debugging test failures: manifests of all packages in (the
base) images can be found at http://cloud-images.ubuntu.com/
|