Age | Commit message (Collapse) | Author |
|
During reprovisioning, the VM network will change. The fallback NIC
should be cleared after use so that it can be re-evaluated after
reprovisioning.
|
|
Without UDF support, DS Azure cannot mount the provisioning ISO,
which contains platform metadata necessary to support
pre-provisioning. The required metadata is made available in IMDS
starting with API version 2021-08-01. This change leverages IMDS
to obtain the required metadata to support pre-provisioning if
the provisioning ISO is not available.
|
|
Add DataSourceLXD, which knows how to talk to the dev-lxd socket to
obtain all instance metadata. API:
https://linuxcontainers.org/lxd/docs/master/dev-lxd.
This first branch is to deliver feature parity with the existing
NoCloud datasource, which is currently used to initialize LXC instances
on first boot.
Introduce a SocketConnectionPool and LXDSocketAdapter to support
performing HTTP GETs on the following routes which are surfaced by the
LXD host to all containers:
http://unix.socket/1.0/meta-data
http://unix.socket/1.0/config/user.user-data
http://unix.socket/1.0/config/user.network-config
http://unix.socket/1.0/config/user.vendor-data
These 4 routes minimally replace the static content provided in the
following nocloud-net seed files:
/var/lib/cloud/nocloud-net/{meta-data,vendor-data,user-data,network-config}
The intent of this commit is to set a foundation for LXD socket
communication that will allow us to build network hot-plug features
by eventually consuming LXD's websocket upgrade route 1.0/events to
react to network, meta-data and user-data config changes over time.
In the event that no custom network-config is provided, default to the
same network-config definition provided by LXD to the NoCloud
network-config seed file.
Supplemental features above NoCloud datasource:
surface all custom instance data config keys via cloud-init query ds
which aids in discoverability of features/tags/labels as well as
conditional #cloud-config jinja templates operations based on custom
config options.
TBD: better cloud-init query support for dot-delimited keys
|
|
When we added the install hotplug module, we forgot to update the
redhat/cloud-init.spec.in file and allow for execution on /usr/libexec.
This PR adds that functionality.
|
|
In some cases, the system-product-name is just google.
This is useful in the case of NoCloud, where we use the disk to load the datasource.
|
|
When self.failed_desired_api_version was added to DataSourceAzure, the
attribute was never added to the _unpickle method using the upgrade
framework. This commit adds the attribute.
LP: #1946644
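A minimal sketch of the failure mode and the fix, assuming a plain
__setstate__ hook. It does not reproduce cloud-init's upgrade framework;
the class name PickledSource is hypothetical, and only the attribute name
failed_desired_api_version and the method name _unpickle come from the
commit message.

```python
class PickledSource:
    """Stand-in for a datasource object that is cached with pickle."""

    def __init__(self):
        self.metadata = {}
        self.failed_desired_api_version = False

    def __setstate__(self, state):
        # pickle restores the saved instance __dict__ without calling
        # __init__, so attributes added in newer code may be missing.
        self.__dict__.update(state)
        self._unpickle()

    def _unpickle(self):
        # Caches written before the attribute existed will not contain it.
        if not hasattr(self, "failed_desired_api_version"):
            self.failed_desired_api_version = False
```

Without the hasattr check, any later read of the attribute on an object
restored from an older cache would raise AttributeError.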
|
|
There is no reason for the ISO missing this functionality.
As discussed in https://github.com/canonical/cloud-init/pull/947/files#r707338489
|
|
CloudStack DNS resolution should be done against
the DNS search domain (with the final dot, DNS
resolution does not work with e.g. Fedora 34)
LP: #1942232
|
|
Due to multiarch, the libdeployPkgPlugin.so is deployed into dir
/usr/lib/<multiarch name>/open-vm-tools, we need to add this path
into search_paths.
LP: #1944946
|
|
OpenNebula 6.1.80 (current dev. version) is introducing new IPv6 gateway
contextualization variable ETHx_IP6_GATEWAY, which mimics existing
variable ETHx_GATEWAY6. The ETHx_GATEWAY6 variable used until now will
be deprecated in a future release (ETA spring 2022).
See:
- new variable - https://github.com/OpenNebula/one/commit/e4d2cc11b9f3c6d01b53774b831f48d9d089c1cc
- deprecation tracking issue - https://github.com/OpenNebula/one/issues/5536
Also, added support for SET_HOSTNAME context variable, which is
currently widely used variable to configure guest VM hostname. See
https://docs.opennebula.io/6.0/management_and_operations/references/template.html#context-section
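A minimal sketch of how both gateway variables might be honoured while the
old one is phased out. The function name get_gateway6, the dict-based
context and the choice to prefer the new variable are assumptions, not a
description of the actual change.

```python
def get_gateway6(context, dev):
    """Return the IPv6 gateway for dev (for example "ETH0"), or None.

    context is a mapping of OpenNebula context variable names to values.
    """
    # Assumed precedence: new variable first, then the older one.
    for key in (f"{dev}_IP6_GATEWAY", f"{dev}_GATEWAY6"):
        value = context.get(key)
        if value:
            return value
    return None
```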
|
|
Add MTU, accept-ra, routes, options and a direct way to provide intact
cloud configs for networking, as opposed to relying on configurations that
may need to be changed often.
|
|
Offload Vultr's vendordata assembly to the backend, correct vendordata
storage and parsing, allow passing critical data via the useragent,
better networking configuration for additional interfaces.
|
|
tox: bump the pinned flake8 and pylint version
* pylint: fix W1406 (redundant-u-string-prefix)
The u prefix for strings is no longer necessary in Python >=3.0.
* pylint: disable W1514 (unspecified-encoding)
The new warning stems from https://www.python.org/dev/peps/pep-0597/
(Python 3.10), which says:
Developers using macOS or Linux may forget that the default encoding
is not always UTF-8. [...] Even Python experts may assume that the
default encoding is UTF-8. This creates bugs that only happen on Windows.
The warning could be fixed by always specifying encoding='utf-8',
however we should be careful to not break environments which are not
utf-8 (or explicitly state that only utf-8 is supported). Let's silence
the warning for now.
* _quick_read_instance_id: cover the case where load_yaml() returns None
Spotted by pylint:
- E1135 (unsupported-membership-test)
- E1136 (unsubscriptable-object)
LP: #1944414
|
|
Add retries to DatasourceGCE when connecting to GCE.
Sometimes, when trying to fetch the metadata,
cloud-init fails and the fallback datasource NoCloud is used, which is
not expected. Add retries to ensure loading of the data source.
|
|
In #1006, we set Azure to apply networking config every
BOOT_NEW_INSTANCE because the BOOT_LEGACY option was causing problems
applying networking the second time per boot. However,
BOOT_NEW_INSTANCE is also wrong as Azure needs to apply networking
once per boot, during init-local phase.
|
|
Add connectivity_url to Oracle's EphemeralDHCPv4
On bionic, when trying to bring up the EphemeralDHCPv4, it's possible
that we already have a route defined, which will result in an error when
trying to add the DHCP route. Use the connectivity_url to check if we
can reach the metadata service, and if so, skip the EphemeralDHCPv4.
The has_url_connectivity function has also been modified to take
a dict of kwargs to send to readurl.
LP: #1939603
|
|
|
|
In #834, we refactored the handling of events for fetching new metadata.
Previously, in Azure's __init__, the BOOT event was added to the
update_events, so it was assumed that Azure required the standard BOOT
behavior, which is to apply metadata twice every boot: once during
local-init, then again during standard init phase.
https://github.com/canonical/cloud-init/blob/21.2/cloudinit/sources/DataSourceAzure.py#L356
However, this line was effectively meaningless. After the metadata was
fetched in local-init, it was then pickled out to disk. Because
"update_events" was a class variable, the EventType.BOOT was not
persisted into the pickle. When the pickle was then unpickled in the
init phase, metadata did not get re-fetched because EventType.BOOT was
not present, so Azure is effectively only BOOT_NEW_INSTANCE.
Fetching metadata twice during boot causes an issue for
pre-provisioning on Azure, because updating metadata during
re-provisioning will cause cloud-init to poll for reprovisiondata again
in DataSourceAzure, which will return 404 indefinitely (reprovisiondata
is deleted from IMDS after the health signal is sent by cloud-init during
init-local). This leaves cloud-init stuck in 'init'.
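A minimal sketch of why a change made to a class variable in __init__ is
not persisted by pickle. The class Source and its contents are
hypothetical; only the name update_events comes from the text above.

```python
class Source:
    # Class-level attribute shared by every instance.
    update_events = {"network": set()}

    def __init__(self):
        self.metadata = {}
        # This mutates the class attribute. Nothing named
        # "update_events" is stored on the instance itself.
        self.update_events["network"].add("BOOT")


src = Source()
# By default pickle saves only the instance __dict__.
saved_state = vars(src)
```

Here saved_state contains only "metadata". A later process that unpickles
the object does not run __init__, so it sees the class-level default
without BOOT.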
|
|
Using flake8 in place of pyflakes
Renamed run-pyflakes -> run-flake8
Changed target name to flake8 in Makefile
With pyflakes we can't suppress warnings/errors in the few places where it is required.
flake8 is flexible in that regard. Hence using flake8 seems to be a
better choice here.
flake8 does the job of pep8 anyway.
So, removed pep8 target from Makefile along with tools/run-pep8 script.
Included setup.py in flake8 checks
|
|
In the nic attach path, we skip doing dhcp since we already did it
when bringing the interface up. However when polling for
reprovisiondata, it is possible for the request to timeout due to
platform issues. In those cases we still need to do dhcp and try again
since we tear down the context. We can only skip the first dhcp
attempt.
|
|
before rebinding again (#990)
Add 10 second polling loop in wait_for_link_up after performing
an unbind and re-bind of primary NIC in hv_netvsc driver.
Also reduce cloud-init logging levels to debug for these operations.
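A minimal sketch of a bounded polling loop of the kind described above.
Only the name wait_for_link_up and the 10 second bound come from the
commit message; the signature, the 0.1 second interval and the injected
is_link_up, sleep and clock callables are assumptions for illustration.

```python
import time


def wait_for_link_up(is_link_up, timeout=10.0, interval=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_link_up() until it returns True or timeout seconds pass.

    Returns True if the link came up in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if is_link_up():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Passing sleep and clock as parameters keeps the loop testable without
real delays.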
|
|
Alters hotplug hook to have a query mechanism checking if the
functionality is enabled. This allows us to avoid using the hotplug
socket and service when hotplug is disabled.
|
|
When bringing interface up by unbinding and then binding hv_netvsc
driver, it might take a short delay after binding for the link to be
up. So before trying unbind/bind again after sleep, check if the link
is up. This is a corner case when a preprovisioned VM is reused and
the NICs are hot-attached.
|
|
|
|
This patch finally introduces the Cloud-Init Datasource for VMware
GuestInfo as a part of cloud-init proper. This datasource has existed
since 2018, and rapidly became the de facto datasource for developers
working with Packer, Terraform, for projects like kube-image-builder,
and the de jure datasource for Photon OS.
The major change to the datasource from its previous incarnation is
the name. Now named DatasourceVMware, this new version of the
datasource will allow multiple transport types in addition to
GuestInfo keys.
This datasource includes several unique features developed to address
real-world situations:
* Support for reading any key (metadata, userdata, vendordata) both
from the guestinfo table when running on a VM in vSphere as well as
from an environment variable when running inside of a container,
useful for rapid dev/test.
* Allows booting with DHCP while still providing full participation
in Cloud-Init instance data and Jinja queries. The netifaces library
provides the ability to inspect the network after it is online,
and the runtime network configuration is then merged into the
existing metadata and persisted to disk.
* Advertises the local_ipv4 and local_ipv6 addresses via guestinfo
as well. This is useful as Guest Tools is not always able to
identify what would be considered the local address.
The primary author and current steward of this datasource spoke at
Cloud-Init Con 2020 where there was interest in contributing this datasource
to the Cloud-Init codebase.
The datasource currently lives in its own GitHub repository at
https://github.com/vmware/cloud-init-vmware-guestinfo. Once the datasource
is merged into Cloud-Init, the old repository will be deprecated.
|
|
Azure Linux Agent (WaLinuxAgent) waits for the ovf-env.xml file
to be written by cloud-init when cloud-init provisions the VM. This
file is written whenever cloud-init reads its contents from the
provisioning ISO.
With this change, when there is no provisioning ISO,
DataSourceAzure will generate the ovf-env.xml file based on the
metadata obtained from Azure IMDS.
|
|
Details:
1. Support setting the guest network config through guestinfo.ovfEnv using OVF
2. 'network-config' Property is optional
3. 'network-config' Property's value has to be base64 encoded
Added unittests and updated ovf-env.xml example
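A minimal sketch of handling an optional, base64-encoded 'network-config'
property. The function name decode_network_config and the dict-based
properties argument are assumptions; parsing of the OVF environment itself
is not shown.

```python
import base64
import binascii


def decode_network_config(properties):
    """Return the decoded network config text, or None if it is absent."""
    encoded = properties.get("network-config")
    if not encoded:
        # The property is optional.
        return None
    try:
        raw = base64.b64decode(encoded, validate=True)
        return raw.decode("utf-8")
    except (binascii.Error, UnicodeDecodeError) as exc:
        raise ValueError("network-config must be base64 encoded") from exc
```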
|
|
With a few exceptions, Azure VM deployments receive provisioning
metadata through the provisioning iso presented as a cdrom device
(/dev/sr0). The existing code attempts to find this device by calling
blkid to find all devices that have either type iso9660 or udf. This
can be very expensive if the VM has a lot of disks. This commit will
attempt to mount the default iso location first and only tries to use
blkid to locate the iso location if the default mounting location fails
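A minimal sketch of the "default location first, expensive scan second"
ordering described above. The function name and the injected load_from
and list_candidates callables are hypothetical; list_candidates stands in
for the blkid-based search.

```python
def find_provisioning_data(load_from, list_candidates,
                           default_device="/dev/sr0"):
    """Try the default device first, then fall back to a device scan.

    load_from(device) returns the data found on device, or None.
    list_candidates() returns device paths and may be expensive.
    """
    data = load_from(default_device)
    if data is not None:
        return data
    for device in list_candidates():
        if device == default_device:
            continue  # already tried
        data = load_from(device)
        if data is not None:
            return data
    return None
```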
|
|
Adds a udev script which will invoke a hotplug hook script on all net
add events. The script will write some udev arguments to a systemd FIFO
socket (to ensure we have only one instance of cloud-init running at a
time), which is then read by a new service that calls a new 'cloud-init
devel hotplug-hook' command to handle the new event.
This hotplug-hook command will:
- Fetch the pickled datasource
- Verify that the hotplug event is supported/enabled
- Update the metadata for the datasource
- Ensure the hotplugged device exists within the datasource
- Apply the config change on the datasource metadata
- Bring up the new interface (or apply global network configuration)
- Save the updated metadata back to the pickle cache
Also scattered in some unrelated typing where helpful
|
|
Python 3.6 added a new `policy` attribute to `MIMEMultipart`.
MIMEMultipart may be part of the cached object pickle of a datasource.
Upgrading from an old version of python to 3.6+ will cause the
datasource to be invalid after pickle load.
This commit uses the upgrade framework to attempt to access the mime
message and fail early (thus discarding the cache) if we cannot.
Commit 78e89b03 should fix this issue more generally.
|
|
Add a new switch allow_raw_data to control raw data feature, update
the documentation. Fix bugs about max_wait.
|
|
The name "DigitalOcean" doesn't have a space in it; it's a single
compound word written in Pascal case (upper camel case).
|
|
Control is currently limited to boot events, though this should
allow us to more easily incorporate HOTPLUG support. Disabling
'instance-first-boot' is not supported as we apply networking config
too early in boot to have processed userdata (along with the fact
that this would be a pretty big foot-gun).
The concept of update events on datasource has been split into
supported update events and default update events. Defaults will be
used if there is no user-defined update events, but user-defined
events won't be supplied if they aren't supported.
When applying the networking config, we now check to see if the event
is supported by the datasource as well as if it is enabled.
Configuration looks like:
  updates:
    network:
      when: ['boot']
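A minimal sketch of the supported/default/user-defined split described
above, using plain sets of event names. The function name events_to_apply
and the string event names are assumptions for illustration.

```python
def events_to_apply(supported, defaults, user_defined=None):
    """Return the set of update events that should be acted on.

    Defaults are used when the user configured nothing. User-defined
    events that the datasource does not support are dropped.
    """
    requested = defaults if user_defined is None else user_defined
    return set(requested) & set(supported)
```

Under these assumptions, an event is acted on only when it is both
requested (by default or by the user) and supported by the datasource.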
|
|
See https://bugs.launchpad.net/cloud-init/+bug/1910835
|
|
|
|
When network interfaces are hot-attached to the VM, attempting to get
network metadata might return 410 (or 500, 503 etc) because the info
is not yet available. In those cases, we retry getting the metadata
before giving up. The only case where we can move on to wait for more
nic attach events is if the call times out despite retries, which
means the interface is not likely a primary interface, and we should
try for more nic attach events.
|
|
This change allows us to retrieve the username and hostname from
IMDS instead of having to rely on the mounted OVF.
|
|
Due to Hyper-V implementations, iso ejection is more efficient if performed
from within the guest. The code will attempt to perform a best-effort ejection.
Failure during ejection will not prevent reporting ready from happening. If iso
ejection is successful, later iso ejection from the platform will be a no-op.
In the event the iso ejection from the guest fails, iso ejection will still happen at
the platform level.
|
|
In #777, we added 'vendordata2' and 'vendordata2_raw' attributes to
the DataSource class, but didn't use the upgrade framework to deal
with an unpickle after upgrade. This commit adds the necessary
upgrade code.
Additionally, added a smaller-scope upgrade test to our integration
tests that will be run on every CI run so we catch these issues
immediately in the future.
LP: #1922739
|
|
Invoking walinuxagent from within cloud-init is no longer
supported/necessary
|
|
This PR adds in support so that cloud-init can run on instances
deployed on Vultr cloud. This was originally brought up in #628.
Co-authored-by: Eric Benner <ebenner@vultr.com>
|
|
Ensure that the Azure helper's http handler sleeps a fixed duration
between retry failure attempts. The http handler will sleep a fixed
duration between failed attempts regardless of whether the attempt
failed due to (1) request timing out or (2) instant failure (no
timeout).
Due to certain platform issues, the http request to the Azure endpoint
may instantly fail without reaching the http timeout duration. Without
sleeping a fixed duration in between retry attempts, the http handler
will loop through the max retry attempts quickly. This causes the
communication between cloud-init and the Azure platform to be less
resilient due to the short total duration if there is no sleep in
between retries.
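A minimal sketch of retrying with a fixed sleep between failed attempts,
whether a failure was a timeout or an instant error. The function name,
the default values and the injected do_request and sleep callables are
assumptions for illustration.

```python
import time


def request_with_retries(do_request, max_attempts=3, retry_sleep=5.0,
                         sleep=time.sleep):
    """Call do_request() up to max_attempts times (at least one).

    Sleeps retry_sleep seconds after every failed attempt except the
    last, regardless of how quickly the attempt failed.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return do_request()
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                sleep(retry_sleep)
    raise last_error
```

With these hypothetical defaults, three instant failures still span about
10 seconds in total rather than finishing almost immediately.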
|
|
#342 (70dbccbb) introduced the ability to determine route-metrics based on
the `device-number` provided by the EC2 IMDS. Not all datasources that
subclass EC2 will have this attribute, so allow the old behavior if
`device-number` is not present.
LP: #1917875
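A minimal sketch of falling back when `device-number` is absent. The
function name route_metric, the fallback_index argument and the
(index + 1) * 100 formula are assumptions for illustration and are not
taken from the commit.

```python
def route_metric(nic_metadata, fallback_index):
    """Return a route metric for one NIC.

    Uses the IMDS-provided device-number when present, otherwise the
    caller's positional index for the NIC.
    """
    device_number = nic_metadata.get("device-number")
    if device_number is not None:
        index = int(device_number)
    else:
        index = fallback_index
    # Hypothetical scaling so that lower-numbered NICs are preferred.
    return (index + 1) * 100
```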
|
|
`get_interfaces` is used in two ways, broadly: firstly, to determine
the available interfaces when converting cloud network configuration
formats to cloud-init's network configuration formats; and, secondly, to
ensure that any interfaces which are specified in network configuration
are (a) available, and (b) named correctly. The first of these is
unaffected by this commit, as no clouds support Open vSwitch
configuration in their network configuration formats.
For the second, we check that MAC addresses of physical devices are
unique. In some OVS configurations, there are OVS-created devices which
have duplicate MAC addresses, either with each other or with physical
devices. As these interfaces are created by OVS, we can be confident
that (a) they will be available when appropriate, and (b) that OVS will
name them correctly. As such, this commit excludes any OVS-internal
interfaces from the set of interfaces returned by `get_interfaces`.
LP: #1912844
|
|
Add flexibility to IMDS api-version by having both a desired IMDS
api-version and a minimum api-version. The desired api-version will
be used first, and if that fails it will fall back to the minimum
api-version.
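A minimal sketch of the desired-then-minimum fallback. The function name
and the injected fetch callable are hypothetical, and catching every
Exception is a simplification for this sketch.

```python
def fetch_with_fallback(fetch, desired_version, minimum_version):
    """Fetch metadata with the desired api-version, else the minimum.

    fetch(version) returns metadata or raises on failure.
    Returns a (metadata, version_used) tuple.
    """
    try:
        return fetch(desired_version), desired_version
    except Exception:
        # The desired version failed; try the minimum version once.
        return fetch(minimum_version), minimum_version
```

Returning the version that succeeded lets the caller remember that the
desired version failed.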
|
|
Changes:
* Only merge in default Azure cloud ephemeral disk configs
during DataSourceAzure._get_data() if the ephemeral disk
exists.
* DataSourceAzure.address_ephemeral_resize() (which is
invoked in DataSourceAzure.activate()) should only set up
the ephemeral disk if the disk exists.
Azure VMs may or may not come with ephemeral resource disks
depending on the VM SKU. For VM SKUs that come with
ephemeral resource disks, the Azure platform guarantees that
the ephemeral resource disk is attached to the VM before
the VM is booted. For VM SKUs that do not come with
ephemeral resource disks, cloud-init currently attempts
to wait and set up a non-existent ephemeral resource
disk, which wastes boot time. It also causes disk setup
modules to fail (due to non-existent references to the
ephemeral resource disk).
udevadm settle is invoked by cloud-init very early in boot, before
DataSourceAzure's _get_data() and activate() methods.
Within DataSourceAzure's _get_data() and activate() methods,
the ephemeral resource disk path should exist if the
VM SKU comes with an ephemeral resource disk.
The ephemeral resource disk path should not exist if the
VM SKU does not come with an ephemeral resource disk.
LP: #1901011
|
|
Kernels newer than 4.15 present /sys/class/dmi/id/product_uuid as a
lowercase value. Previously the UUID was uppercase.
The Azure datasource reads the product_uuid directly as the platform's
instance-id. This presents a problem if a kernel is either
upgraded or downgraded across the 4.15 kernel version boundary because
the case of the UUID will change, resulting in cloud-init seeing a
"new" instance id and re-running all modules.
Re-running cc_ssh in cloud-init deletes and regenerates ssh_host keys
on a system, which can cause concern on long-running instances that
something nefarious has happened.
Also add:
- An integration test for this for Azure Bionic Ubuntu FIPS upgrading from
a FIPS kernel with uppercase UUID to a lowercase UUID in linux-azure
- A new pytest.mark.sru_next to collect all integration tests related to our
next SRU
LP: #1835584
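A minimal sketch of comparing instance ids without regard to case, so
that a kernel change across the 4.15 boundary does not look like a new
instance. The function name is hypothetical and the actual fix may handle
more than letter case.

```python
def is_same_instance(previous_id, current_id):
    """Return True if two instance ids differ at most in letter case."""
    return previous_id.lower() == current_id.lower()
```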
|
|
New datasource utilizing UpCloud metadata API, including relevant unit
tests and documentation.
|
|
Add support for OpenStack's dynamic vendor data, which appears under openstack/latest/vendor_data2.json.
This adds vendor_data2 to all pathways; it should be a no-op for non-OpenStack providers.
LP: #1841104
|
|
If cloud-init is enabled on the VMware platform, cloud-init waits until
its configuration file is ready; the maximum wait is currently 90
seconds by default. In our tests, this configuration file is
ready within 1 second, so reduce the default to 15 seconds for better
performance. Also update the documentation on how to change the
default value in the cloud-init configuration file.
|