Age | Commit message (Collapse) | Author |
|
Previously "cmdline" network configuration could be either
user-specified network-config=... configuration data, or
initramfs-provided configuration data. Before data sources could modify
the order in which network config sources were considered, this
conflation didn't matter (and, indeed, in the default data source
configuration it will continue to not matter).
However, it _is_ desirable for a data source to be able to specify that
its network configuration should be preferred over the
initramfs-provided network configuration but still allow explicit
network-config=... configuration passed to the kernel cmdline to
continue to override both of those sources.
(This also modifies the Oracle data source to use read_initramfs_config
directly, which is effectively what it was using
read_kernel_cmdline_config for previously.)
|
|
Currently, if a platform provides any network configuration via the
"cmdline" method (i.e. network-data=... on the kernel command line,
ip=... on the kernel command line, or iBFT config via /run/net-*.conf),
the value of the data source's network_config property is completely
ignored.
This means that on platforms that use iSCSI boot (such as Oracle Compute
Infrastructure), there is no way for the data source to configure any
network interfaces other than those that have already been configured by
the initramfs.
This change allows data sources to specify the order in which network
configuration sources are considered. Data sources that opt to use this
mechanism will be expected to consume the command line network data and
integrate it themselves.
(The generic merging of network configuration sources was considered,
but we concluded that the single use case we have presently (a) didn't
warrant the increased complexity, and (b) didn't give us a broad enough
view to be sure that our generic implementation would be sufficiently
generic. This change in no way precludes a merging strategy in future.)
|
|
cloud-init does not trigger reboots of a VM therefore adding custom
scripts to rc.local does not execute the post scripts. This patch
moves post-scripts into per-instance scripts dir and has cc_scripts
module run the post-scripts.
Also in this branch:
- Remove the sh interpreter and execute the customization script
directly.
- Update the unit test.
LP: #1833192
|
|
* cc_lxd: fix copy/paste error in debug logging
* DataSourceCloudSigma: remove unreachable code
* This unreachable code was introduced in a refactor (in 2015) which
removed the need for an exception handler, but retained the logging
from the exception handler as an unreachable fall-through.
|
|
|
|
This allows cloud-init query region to show valid region data for Azure
|
|
blkid is a Linux-only command. With this patch, cloud-init uses another
approach to find the data source on FreeBSD.
LP: #1645824
|
|
The Azure data source helper attempts to use information in the dhcp
lease to find the Wireserver endpoint (IP address). Under some unusual
circumstances, those attempts will fail. This change uses a static
address, known to be always correct in the Azure public and sovereign
clouds, when the helper fails to locate a valid dhcp lease. This
address is not guaranteed to be correct in Azure Stack environments;
it's still best to use the information from the lease whenever possible.
|
|
|
|
If the IMDS primary server is not available, falling back to the
secondary server takes about 1s. The net result is that the
expected E2E time is slightly more than 1s. This change increases
the timeout to 2s to prevent the infinite loop of timeouts.
|
|
On FreeBSD, mount_cd9660 does not accept the sync option that is enabled
by default. In addition, the sync is only useful with the `rw` mode.
However the `rw` mode was never used.
This patch removes the `rw` and `sync` parameter of `mount_cb` to
simplify the code base and resolve the FreeBSD issue.
LP: #1645824
|
|
Moving update_events from a class attribute to an instance attribute
means that it doesn't exist on DataSource objects that are unpickled,
causing tracebacks on cloud-init upgrade.
As this change is only required for cloud-init installations which don't
utilise ds-identify, we're backing it out to be reintroduced once the
upgrade path bug has been addressed.
This reverts commit f2fd6eac4407e60d0e98826ab03847dda4cde138.
|
|
NoCloud data source now accepts both 'cidata' and 'CIDATA'
as filesystem labels. This is similar to DataSourceConfigDrive's
support for 'config-2' and 'CONFIG-2'.
|
|
When the Azure datasource persists all of its metadata to the
instance directory, it deliberately sets the self.network_config
value to be the sources.UNSET value. The goal is to ensure that
each time the system boots, fresh network configuration data is
fetched from the cloud platform so that any control plane changes
will take effect. When a VM is first created, there's no pickled
instance to restore, so self._network_config is None, resulting
in self.network_config() properly building a new config. Azure
suffered from LP: #1801364 which prevented ds from being stored
in obj.pkl in the instance directory, so subsequent reboots always
regenerated their network configuration.
Commit 0dc3a77f41f4544e4cb5a41637af7693410d4cdf introduced a
new bug in which self.network_config() assumed the
self._network_config value was either None or trustable; when
the config was unpickled, that value was _unset, thus breaking
the assumption.
LP: #1823084
|
|
Create an Azure logging decorator and use additional ReportEventStack
context managers to provide additional logging details.
|
|
The Azure platform surfaces random bytes into /sys via Hyper-V.
Python 2.7 json.dump() raises an exception if asked to convert
a str with non-character content, and python 3.0 json.dump()
won't serialize a "bytes" value. As a result, c-i instance
data is often not written by Azure, making reboots slower (c-i
has to repeat work).
The random data is base64-encoded and then decoded into a string
(str or unicode depending on the version of Python in use). The
base64 string has just as many bits of entropy, so we're not
throwing away useful "information", but we can be certain
json.dump() will correctly serialize the bits.
|
|
Currently, DataSourceAzure updates self.update_events in __init__. As
update_events is a class attribute on DataSource, this updates it for
all instances of classes derived from DataSource including those for
other clouds. This means that if DataSourceAzure is even instantiated,
its behaviour is applied to whichever data source ends up being used for
boot.
To address this, update_events is moved from a class attribute to an
instance attribute (that is therefore populated at instantiation time).
This retains the defaults for all DataSource sub-class instances, but
avoids them being able to mutate the state in instances of other
DataSource sub-classes.
update_events is only ever referenced on an instance of DataSource (or a
sub-class); no code relies on it being a class attribute. (In fact,
it's only used within methods on DataSource or its sub-classes, so it
doesn't even _need_ to remain public, though I think it's appropriate
for it to be public.)
DataSourceScaleway is also updated to move update_events from a
class attribute to an instance attribute, as the class attribute would
now be masked by the DataSource instance attribute.
LP: #1819913
|
|
Our previous understanding of the upgrade issue was incomplete; it turns
out the only change we need is the one now outlined.
|
|
Some deployments of OpenStack expose link types to the guest which
cloud-init doesn't recognise. These will almost always be physical, so
we can operate more robustly if we assume that they are (whilst warning
the user that we're seeing something unexpected).
LP: #1639263
|
|
six already provides this for us, and we're already paying the cost to
determine it there; no need to do it twice.
|
|
The Azure data source is expected to expose a list of
ssh keys for the user-to-be-provisioned in the crawled
metadata. When configured to use the __builtin__ agent
this list is built by the WALinuxAgentShim. The shim
retrieves the full set of certificates and public keys
exposed to the VM from the wireserver, extracts any
ssh keys it can, and returns that list.
This fix reduces that list of ssh keys to just the
ones whose fingerprints appear in the "administrative
user" section of the ovf-env.xml file. The Azure
control plane exposes other ssh keys to the VM for
other reasons, but those should not be added to the
authorized_keys file for the provisioned user.
|
|
AWS EC2 instances' network come in 2 basic flavors: Classic and VPC
(Virtual Private Cloud). The former has an interesting behavior of having
its MAC address changed whenever the instance is stopped/restarted. This
behavior is not observed in VPC instances.
In Ubuntu 18.04 (Bionic) the network "management" changed from ENI-style
(etc/network/interfaces) to netplan, and when using netplan we observe
the following block present in /etc/netplan/50-cloud-init.yaml:
match:
macaddress: aa:bb:cc:dd:ee:ff
Jani Ollikainen noticed in Launchpad bug #1802073 that the EC2 Classic
instances were booting without network access in Bionic after stop/restart
procedure, due to their MAC address change behavior. It was narrowed down
to the netplan MAC match block, that kept the old MAC address after
stopping and restarting an instance, since the network configuration
writing happens by default only once in EC2 instances, in the first boot.
This patch changes the network configuration write to every boot in EC2
Classic instances, by checking against the "vpc-id" metadata information
provided only in the VPC instances - if we don't have this metadata value,
cloud-init will rewrite the network configuration file in every boot.
This was tested in an EC2 Classic instance and proved to fix the issue;
unit tests were also added for the new method is_classic_instance().
LP: #1802073
Reported-by: Jani Ollikainen <jani.ollikainen@ik.fi>
Suggested-by: Ryan Harper <ryan.harper@canonical.com>
Co-developed-by: Chad Smith <chad.smith@canonical.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
|
|
Fixes:
- flake8: use ==/!= to compare str, bytes, and int literals
- pycodestyle: E117 over-indented
|
|
In addition to EPOCHREALTIME there is also an EPOCHSECONDS environment
variable that OpenNebula needs to exclude as it is expected to change.
This commit supplements the other exclusion in commit
d1a2fe7307e9cf2251d1f9a666c12d71d3f522d6.
Without this fix, unittests will intermittently fail if
parse_shell_config is run across a timing boundary where the
EPOCHSECONDS changes mid-test.
LP: #1813641
|
|
This branch is needed to allow cloud-init to sbuild on Ubuntu Disco.
OpenNebula:parse_shell_config tries to do a comparison of bash
environment values, excluding expected environment variables which
are known to change.
Bash on Ubuntu Disco surfaces a new EPOCHREALTIME environment variable
which wasn't in previous bash environments, this var needs to be
ignored by parse_shell_config too.
LP: #1813383
|
|
Testing startup of large numbers of VMs (of varying distros) in Azure
shows that 3 retries results in a small percentage of failed VMs.
Increasing that by a few dramatically decreases the occurrence of
provisioning timeout errors. The initial choice of "3 retries" was
uninformed by heavy testing. Also, the alternate provisioning
mechanism for Azure (waagent) retries the Wireserver crawl without
limit. 10 retries seems a more reasonable choice.
|
|
The change here will utilize ssh keys found inside an instance's tag.
The tag value must start with 'AUTHORIZED_KEY'.
|
|
Transport functions (transport_iso9660 and transport_vmware_guestinfo)
would return a tuple of 3 values, but only the first was ever used
outside of test. The other values (device and filename) were just
ignored.
This just simplifies the transport functions to now return content
(in string format) or None indicating that the transport was not found.
|
|
This adds support for reading OVF information over the
'com.vmware.guestInfo' tranport. The current implementation requires
vmware-rpctool be installed in the system.
LP: #1807466
|
|
The tip-pylint tox target correctly reported the invalid use of
string formatting. The change here is to:
a.) Fix the error that was caught.
b.) move to pylint 2.2.2 for the default 'pylint' target.
|
|
NoCloud's 'network-config' file was originally expected to contain
network configuration without the top level 'network' key. This was
because the file was named 'network-config' so specifying 'network'
seemed redundant.
However, JuJu is currently providing a top level 'network' config when
it tries to disable networking ({"network": {"config": "disabled"}).
Other users have also been surprised/confused by the fact that
a network config in /etc/cloud/cloud.cfg.d/network.cfg differed from
what was expected in 'network-config'.
LP: #1798117
|
|
Move routes under the nic's subnet rather than use top-level
("global") route config ensuring all net renderers will provide the
configured route.
Also updated cloudinit/cmd/devel/net_convert.py:
- Add input type 'vmware-imc' for OVF customization config files
- Fix bug when output-type was netplan which invoked netplan
generate/apply and attempted to write to
/etc/netplan/50-cloud-init.yaml instead of joining with the
output directory.
LP: #1806103
|
|
Replace Azure pre-provision polling on IMDS with a blocking call
which watches for netlink link state change messages. The media
change event happens when a pre-provisioned VM has been activated
and is connected to the users virtual network and cloud-init can
then resume operation to complete image instantiation.
|
|
Check the appropriate variables based on code review. Correcting what
seems to be a copy/paste mistake for the error handling from a few lines
above.
|
|
Upon URL timeout, _poll_imds is expected to re-dhcp to get updated
IP configuration. We don't want to indefinitely retry because the
instance likely has invalid IP configuration.
LP: #1803598
|
|
There is an infrequent race when the booting instance can hit the IMDS
service before it is fully available. This results in a
requests.ConnectTimeout being raised.
Azure's retry_callback logic now retries on either 404s or Timeouts.
LP:1800223
|
|
If Azure detects an ntfs filesystem type during mount attempt, it should
still report the resource device as reformattable. There are slight
differences in error message format on RedHat and SuSE. This patch
simplifies the expected error match to work on both distributions.
LP: #1799338
|
|
In commitish 9073951 azure datasource tried to leverage stale DHCP
information obtained from EphemeralDHCPv4 context manager to report
updated provisioning status to the fabric earlier in the boot process.
Unfortunately the stale ephemeral network configuration had already been
torn down in preparation to bring up IMDS network config so the report
attempt failed on timeout.
This branch introduces obtain_lease and clean_network public methods on
EphemeralDHCPv4 to allow for setup and teardown of ephemeral network
configuration without using a context manager. Azure datasource now uses
this to persist ephemeral network configuration across multiple contexts
during provisioning to avoid multiple DHCP roundtrips.
|
|
There was a typo in the seeded filename s/azure-hotplug/hotplug-azure/.
|
|
When reusing a preprovisioned VM, report ready to Azure fabric as soon as
we get the reprovision data and the goal state so that we are not delayed
by the cloud-init stage switch, saving 2-3 seconds. Also reduce logging
when polling IMDS for reprovision data.
LP: #1799594
|
|
Azure generates network configuration from the IMDS service and removes
any preexisting hotplug network scripts which exist in Azure cloud images.
Add a datasource configuration option which allows for writing a default
network configuration which sets up dhcp on eth0 and leave the hotplug
handling to the cloud-image scripts.
To disable network-config from Azure IMDS, add the following to
/etc/cloud/cloud.cfg.d/99-azure-no-imds-network.cfg:
datasource:
Azure:
apply_network_config: False
LP: #1798424
|
|
Add a quick cloud lookup utility in order to more easily determine
the cloud on which an instance is running.
The utility parses standardized attributes from
/run/cloud-init/instance-data.json to print the canonical cloud-id
for the instance. It uses known region maps if necessary to determine
on which specific cloud the instance is running.
Examples:
aws, aws-gov, aws-china, rackspace, azure-china, lxd, openstack, unknown
|
|
Add the following instance-data.json standardized keys:
* v1._beta_keys: List any v1 keys in beta development,
e.g. ['subplatform'].
* v1.public_ssh_keys: List of any cloud-provided ssh keys for the
instance.
* v1.platform: String representing the cloud platform api supporting the
datasource. For example: 'ec2' for aws, aliyun and brightbox cloud
names.
* v1.subplatform: String with more details about the source of the
metadata consumed. For example, metadata uri, config drive device path
or seed directory.
To support the new platform and subplatform standardized instance-data,
DataSource and its subclasses grew platform and subplatform attributes.
The platform attribute defaults to the lowercase string datasource name at
self.dsname. This method is overridden in NoCloud, Ec2 and ConfigDrive
datasources.
The subplatform attribute calls a _get_subplatform method which will
return a string containing a simple slug for subplatform type such as
metadata, seed-dir or config-drive followed by a detailed uri, device or
directory path where the datasource consumed its configuration.
As part of this work, DatasourceEC2 methods _get_data and _crawl_metadata
have been refactored for a few reasons:
- crawl_metadata is now a read-only operation, persisting no attributes on
the datasource instance and returns a dictionary of consumed metadata.
- crawl_metadata now closely represents the raw stucture of the ec2
metadata consumed, so that end-users can leverage public ec2 metadata
documentation where possible.
- crawl_metadata adds a '_metadata_api_version' key to the crawled
ds.metadata to advertise what version of EC2's api was consumed by
cloud-init.
- _get_data now does all the processing of crawl_metadata and saves
datasource instance attributes userdata_raw, metadata etc.
Additional drive-bys:
* unit test rework for test_altcloud and test_azure to simplify mocks
and make use of existing util and test_helpers functions.
|
|
|
|
OpenStack ironic references Infiniband interfaces via a 6 byte 'MAC
address' formed from bytes 13-15 and 18-20 of interface's hardware
address. This address is used as the ethernet_mac_address of Infiniband
links in network_data.json in configdrives generated by OpenStack nova.
We can use this address to map links in network_data.json to their
corresponding interface names.
When generating interface configuration files, we need to use the
interface's full hardware address as the HWADDR, rather than the 6 byte
MAC address provided by network_data.json.
This change allows IB interfaces to be referenced in this dual mode - by
MAC address and hardware address, depending on the context.
Support TYPE=InfiniBand for sysconfig configuration of IB interfaces.
|
|
Cloud-init caches any cloud metadata crawled during boot in the file
/run/cloud-init/instance-data.json. Cloud-init also standardizes some of
that metadata across all clouds. The command 'cloud-init query' surfaces a
simple CLI to query or format any cached instance metadata so that scripts
or end-users do not have to write tools to crawl metadata themselves.
Since 'cloud-init query' is runnable by non-root users, redact any
sensitive data from instance-data.json and provide a root-readable
unredacted instance-data-sensitive.json. Datasources can now define a
sensitive_metadata_keys tuple which will redact any matching keys
which could contain passwords or credentials from instance-data.json.
Also add the following standardized 'v1' instance-data.json keys:
- user_data: The base64encoded user-data provided at instance launch
- vendor_data: Any vendor_data provided to the instance at launch
- underscore_delimited versions of existing hyphenated keys:
instance_id, local_hostname, availability_zone, cloud_name
|
|
Any distro that has a '_write_nework_config' method should no
longer get their _write_network called at all. So lets drop
that code and raise a RuntimeError any time we got there.
Replace the one caller of 'apply_network' (legacy openstack path)
with a call to apply_network_config after converting the ENI to
network config.
|
|
Fix a bug where setting of mac address on a bond device was
ignored when provided in OpenStack network_config.json.
LP: #1682064
|
|
On OpenStack based OVH public cloud, we got DHCP response with
fixed-address 54.36.113.86;
option subnet-mask 255.255.255.255;
option routers 54.36.112.1;
The router clearly is not on the subnet. So 'ip' would fail when
we tried to add the default route.
The solution here is to add an explicit route on that interface
to the router and then add the default route.
Also add 'bgpovs' to the list of 'physical' types for OpenStack
network configuration. That type is used on OVH public cloud.
LP: #1792415
|
|
Mark as supported for reading some newer versions of openstack metadata:
2016-06-30 : Newton one
2016-10-06 : Newton two
2017-02-22 : Ocata
2018-08-27 : Rocky
|