Age | Commit message (Collapse) | Author |
|
NoCloud's 'network-config' file was originally expected to contain
network configuration without the top level 'network' key. This was
because the file was named 'network-config' so specifying 'network'
seemed redundant.
However, JuJu is currently providing a top level 'network' config when
it tries to disable networking ({"network": {"config": "disabled"}).
Other users have also been surprised/confused by the fact that
a network config in /etc/cloud/cloud.cfg.d/network.cfg differed from
what was expected in 'network-config'.
LP: #1798117
|
|
Move routes under the nic's subnet rather than use top-level
("global") route config ensuring all net renderers will provide the
configured route.
Also updated cloudinit/cmd/devel/net_convert.py:
- Add input type 'vmware-imc' for OVF customization config files
- Fix bug when output-type was netplan which invoked netplan
generate/apply and attempted to write to
/etc/netplan/50-cloud-init.yaml instead of joining with the
output directory.
LP: #1806103
|
|
Replace Azure pre-provision polling on IMDS with a blocking call
which watches for netlink link state change messages. The media
change event happens when a pre-provisioned VM has been activated
and is connected to the users virtual network and cloud-init can
then resume operation to complete image instantiation.
|
|
Check the appropriate variables based on code review. Correcting what
seems to be a copy/paste mistake for the error handling from a few lines
above.
|
|
Upon URL timeout, _poll_imds is expected to re-dhcp to get updated
IP configuration. We don't want to indefinitely retry because the
instance likely has invalid IP configuration.
LP: #1803598
|
|
There is an infrequent race when the booting instance can hit the IMDS
service before it is fully available. This results in a
requests.ConnectTimeout being raised.
Azure's retry_callback logic now retries on either 404s or Timeouts.
LP:1800223
|
|
If Azure detects an ntfs filesystem type during mount attempt, it should
still report the resource device as reformattable. There are slight
differences in error message format on RedHat and SuSE. This patch
simplifies the expected error match to work on both distributions.
LP: #1799338
|
|
In commitish 9073951 azure datasource tried to leverage stale DHCP
information obtained from EphemeralDHCPv4 context manager to report
updated provisioning status to the fabric earlier in the boot process.
Unfortunately the stale ephemeral network configuration had already been
torn down in preparation to bring up IMDS network config so the report
attempt failed on timeout.
This branch introduces obtain_lease and clean_network public methods on
EphemeralDHCPv4 to allow for setup and teardown of ephemeral network
configuration without using a context manager. Azure datasource now uses
this to persist ephemeral network configuration across multiple contexts
during provisioning to avoid multiple DHCP roundtrips.
|
|
There was a typo in the seeded filename s/azure-hotplug/hotplug-azure/.
|
|
When reusing a preprovisioned VM, report ready to Azure fabric as soon as
we get the reprovision data and the goal state so that we are not delayed
by the cloud-init stage switch, saving 2-3 seconds. Also reduce logging
when polling IMDS for reprovision data.
LP: #1799594
|
|
Azure generates network configuration from the IMDS service and removes
any preexisting hotplug network scripts which exist in Azure cloud images.
Add a datasource configuration option which allows for writing a default
network configuration which sets up dhcp on eth0 and leave the hotplug
handling to the cloud-image scripts.
To disable network-config from Azure IMDS, add the following to
/etc/cloud/cloud.cfg.d/99-azure-no-imds-network.cfg:
datasource:
Azure:
apply_network_config: False
LP: #1798424
|
|
Add a quick cloud lookup utility in order to more easily determine
the cloud on which an instance is running.
The utility parses standardized attributes from
/run/cloud-init/instance-data.json to print the canonical cloud-id
for the instance. It uses known region maps if necessary to determine
on which specific cloud the instance is running.
Examples:
aws, aws-gov, aws-china, rackspace, azure-china, lxd, openstack, unknown
|
|
Add the following instance-data.json standardized keys:
* v1._beta_keys: List any v1 keys in beta development,
e.g. ['subplatform'].
* v1.public_ssh_keys: List of any cloud-provided ssh keys for the
instance.
* v1.platform: String representing the cloud platform api supporting the
datasource. For example: 'ec2' for aws, aliyun and brightbox cloud
names.
* v1.subplatform: String with more details about the source of the
metadata consumed. For example, metadata uri, config drive device path
or seed directory.
To support the new platform and subplatform standardized instance-data,
DataSource and its subclasses grew platform and subplatform attributes.
The platform attribute defaults to the lowercase string datasource name at
self.dsname. This method is overridden in NoCloud, Ec2 and ConfigDrive
datasources.
The subplatform attribute calls a _get_subplatform method which will
return a string containing a simple slug for subplatform type such as
metadata, seed-dir or config-drive followed by a detailed uri, device or
directory path where the datasource consumed its configuration.
As part of this work, DatasourceEC2 methods _get_data and _crawl_metadata
have been refactored for a few reasons:
- crawl_metadata is now a read-only operation, persisting no attributes on
the datasource instance and returns a dictionary of consumed metadata.
- crawl_metadata now closely represents the raw stucture of the ec2
metadata consumed, so that end-users can leverage public ec2 metadata
documentation where possible.
- crawl_metadata adds a '_metadata_api_version' key to the crawled
ds.metadata to advertise what version of EC2's api was consumed by
cloud-init.
- _get_data now does all the processing of crawl_metadata and saves
datasource instance attributes userdata_raw, metadata etc.
Additional drive-bys:
* unit test rework for test_altcloud and test_azure to simplify mocks
and make use of existing util and test_helpers functions.
|
|
|
|
OpenStack ironic references Infiniband interfaces via a 6 byte 'MAC
address' formed from bytes 13-15 and 18-20 of interface's hardware
address. This address is used as the ethernet_mac_address of Infiniband
links in network_data.json in configdrives generated by OpenStack nova.
We can use this address to map links in network_data.json to their
corresponding interface names.
When generating interface configuration files, we need to use the
interface's full hardware address as the HWADDR, rather than the 6 byte
MAC address provided by network_data.json.
This change allows IB interfaces to be referenced in this dual mode - by
MAC address and hardware address, depending on the context.
Support TYPE=InfiniBand for sysconfig configuration of IB interfaces.
|
|
Cloud-init caches any cloud metadata crawled during boot in the file
/run/cloud-init/instance-data.json. Cloud-init also standardizes some of
that metadata across all clouds. The command 'cloud-init query' surfaces a
simple CLI to query or format any cached instance metadata so that scripts
or end-users do not have to write tools to crawl metadata themselves.
Since 'cloud-init query' is runnable by non-root users, redact any
sensitive data from instance-data.json and provide a root-readable
unredacted instance-data-sensitive.json. Datasources can now define a
sensitive_metadata_keys tuple which will redact any matching keys
which could contain passwords or credentials from instance-data.json.
Also add the following standardized 'v1' instance-data.json keys:
- user_data: The base64encoded user-data provided at instance launch
- vendor_data: Any vendor_data provided to the instance at launch
- underscore_delimited versions of existing hyphenated keys:
instance_id, local_hostname, availability_zone, cloud_name
|
|
Any distro that has a '_write_nework_config' method should no
longer get their _write_network called at all. So lets drop
that code and raise a RuntimeError any time we got there.
Replace the one caller of 'apply_network' (legacy openstack path)
with a call to apply_network_config after converting the ENI to
network config.
|
|
Fix a bug where setting of mac address on a bond device was
ignored when provided in OpenStack network_config.json.
LP: #1682064
|
|
On OpenStack based OVH public cloud, we got DHCP response with
fixed-address 54.36.113.86;
option subnet-mask 255.255.255.255;
option routers 54.36.112.1;
The router clearly is not on the subnet. So 'ip' would fail when
we tried to add the default route.
The solution here is to add an explicit route on that interface
to the router and then add the default route.
Also add 'bgpovs' to the list of 'physical' types for OpenStack
network configuration. That type is used on OVH public cloud.
LP: #1792415
|
|
Mark as supported for reading some newer versions of openstack metadata:
2016-06-30 : Newton one
2016-10-06 : Newton two
2017-02-22 : Ocata
2018-08-27 : Rocky
|
|
Cloud-init was reading a list of versions from the OpenStack metadata
service (http://169.254.169.254/openstack/) and attempt to select the
newest known supported version. The problem was that the list
of versions was not being decoded, so we were comparing a list of
bytes (found versions) to a list of strings (known versions).
LP: #1792157
|
|
Allow users to provide '## template: jinja' as the first line or their
#cloud-config or custom script user-data parts. When this header exists,
the cloud-config or script will be rendered as a jinja template.
All instance metadata keys and values present in
/run/cloud-init/instance-data.json will be available as jinja variables
for the template. This means any cloud-config module or script can
reference any standardized instance data in templates and scripts.
Additionally, any standardized instance-data.json keys scoped below a
'<v#>' key will be promoted as a top-level key for ease of reference in
templates. This means that '{{ local_hostname }}' is the same as using the
latest '{{ v#.local_hostname }}'.
Since instance-data is written to /run/cloud-init/instance-data.json, make
sure it is persisted across reboots when the cached datasource opject is
reloaded.
LP: #1791781
|
|
In many cases, cloud-init uses 'util.subp' to run a subprocess.
This is not really desirable in our unit tests as it makes the tests
dependent upon existance of those utilities.
The change here is to modify the base test case class (CiTestCase) to
raise exception any time subp is called. Then, fix all callers.
For cases where subp is necessary or actually desired, we can use it
via
a.) context hander CiTestCase.allow_subp(value)
b.) class level self.allowed_subp = value
Both cases the value is a list of acceptable executable names that
will be called (essentially argv[0]).
Some cleanups in AltCloud were done as the code was being updated.
|
|
The issue is when customize a VM with static IPv4 and without gateway, it
will still extend route list and will loop a gateways list which is None.
This fix is to make sure when no gateway is here, it will not extend route
list.
LP: #1766538
|
|
This adds a Oracle specific datasource that functions with OCI.
It is a simplified version of the OpenStack metadata server
with support for vendor-data.
It does not support the OCI-C (classic) platform.
Also here is a move of BrokenMetadata to common 'sources'
as this was the third occurrence of that class.
|
|
Azure datasource now queries IMDS metadata service for network
configuration at link local address
http://169.254.169.254/metadata/instance?api-version=2017-12-01. The
azure metadata service presents a list of macs and allocated ip addresses
associated with this instance. Azure will now also regenerate network
configuration on every boot because it subscribes to EventType.BOOT
maintenance events as well as the 'first boot'
EventType.BOOT_NEW_INSTANCE.
For testing add azure-imds --kind to cloud-init devel net_convert tool
for debugging IMDS metadata.
Also refactor _get_data into 3 discrete methods:
- is_platform_viable: check quickly whether the datasource is
potentially compatible with the platform on which is is running
- crawl_metadata: walk all potential metadata candidates, returning a
structured dict of all metadata and userdata. Raise InvalidMetaData on
error.
- _get_data: call crawl_metadata and process results or error. Cache
instance data on class attributes: metadata, userdata_raw etc.
|
|
DEP_NETWORK is removed since the network_config must
run at each boot. New EventType.BOOT event is used
for that.
Network is brought up early to fetch the metadata which
is required to configure the network (ipv4 and/or v6).
Adds unittests for the following and fixes test_common for
LOCAL and NETWORK sets.
|
|
The OpenNebula data source generates an invalid netplan yaml file
if the IPv6 gateway is not defined in context.sh.
LP: #1768547
|
|
The OpenStack datasource in 18.3 changed to detect data in the
init-local stage instead of init-network and attempted to redetect
OpenStackLocal datasource on Oracle across reboots. The function
detect_openstack was added to quickly detect whether a platform is
OpenStack based on dmi product_name or chassis_asset_tag and it was
a bit too strict for Oracle in checking for 'OpenStack Nova'/'Compute'
DMI product_name.
Oracle's DMI product_name reports 'SAtandard PC (i440FX + PIIX, 1996)'
and DMI chassis_asset_tag is 'OracleCloud.com'.
detect_openstack function now adds 'OracleCloud.com' as a supported value
'OracleCloud.com' to valid chassis-asset-tags for the OpenStack
datasource.
LP: #1784685
|
|
The comment in update_metadata() that explains how a datasource should
enable network reconfig on every boot presumes that
EventType.BOOT_NEW_INSTANCE is a subset of EventType.BOOT. That's not
the case, and as such a datasource that needs to configure networking
when it is a new instance and every boot needs to include both event
types.
To make the situation above easier to debug, update_metadata() now
logs when it returns false.
To make it so that datasources do not need to test before appending to
the update_events['network'], it is changed from a list to a set.
test_update_metadata_only_acts_on_supported_update_events is updated
to allow datasources to support EventType.BOOT.
Author: Mike Gerdts <mike.gerdts@joyent.com>
|
|
Pylint 2.0.0 was recently released and complains more about
logging-not-lazy than it used to. I've fixed those warnings, here.
The changes in rh_subscription are more extensive. pylint may be
complaining incorrectly there, but the tests were not correctly un-doing
all of their mock/patching. This cleans those up and makes pylint happy.
|
|
Very basic type definitions are now defined to distinguish 'boot'
events from 'new instance (first boot)'. Event types will now be handed
to a datasource.update_metadata method which can determine whether
to refresh its metadata and re-render configuration based on that
source event.
A datasource can 'subscribe' to an event by setting up the update_events
attribute on the datasource class which describe what config scope is
updated by a list of matching events. By default datasources will have
the following update_events: {'network': [EventType.BOOT_NEW_INSTANCE]}
This setting says the datasource will re-write network configuration only
on first boot of a new instance or when the instance id changes.
New methods are now present on the datasource:
- clear_cached_attrs: Resets cached datasource attributes to values
listed in datasource.cached_attr_defaults. This is performed prior to
processing a fresh metadata process to avoid keeping old/invalid
cached data around.
- update_metadata: accepts source_event_types to determine if the
metadata should be crawled again and processed
|
|
OpenStack datasource is now discovered in init-local stage. In order to
probe whether OpenStack metadata is present, it performs a costly
sandboxed dhclient setup and metadata probe against http://169.254.169.254
for openstack data.
Cloud-init properly detects non-OpenStack on EC2, but it spends precious
time probing the metadata service also resulting in a confusing WARNING
log about 'metadata not present'. To avoid the wasted cycles, and
confusing warning, get_data will call a detect_openstack function to
quickly determine whether the platform looks like OpenStack before trying
to setup network to probe and crawl the metadata service.
LP: #1776701
|
|
A newer version of pyflakes (2.0.0) was released.
It identifed some unused variables that version 1.6.0 did not identify.
The change here merely fixes those unused variables.
|
|
- Updated datadict reference URL
- Store sdc:routes metadata in DatasourceSmartOS
- Map sdc:routes values to per-interface subnet configuration
- Added unittest
Co-authored-by: Mike Gerdts <mike.gerdts@joyent.com>
LP: #1763512
|
|
Network has not yet been configured in the init-local stage so the
openstack datasource will use dhcp-client to temporarily obtain an ipv4
address and query the metadata service at http://169.254.169.254 to get
network_data.json configuration. If present, the datasource will return
network_config version 1 config based on that network_data.json content.
Previously OpenStack datasource only setup dhcp on the fallback interface
so this represents a change in behavior to react to the full config
provided by openstack.
Also significant to OpenStack is the separation of a _crawl_data operation
from get_data(). crawl_data walks the available metadata services and
returns a dict of discovered content. get_data consumes the crawled_data,
caches it in the datasource and reacts to that data.
/run/cloud-init/instance-data.json now published network_data.json or
ec2_metadata key if that data is present on any datasource.
The main reasons for the separation of crawl from get_data:
* Enable performance metrics of cloud-init's metadata crawls on each
* Enable cloud-init modules and scripts to query and consume metadata
content which may have updated/changed after cloud-init's initial cache
during instance boot. (Think hotplug)
Also generalize common logic to base DataSource class/module:
* Move to a common UNSET variable up into base datasource module fix EC2,
ConfigDrive, OpenStack, SmartOS to use the global.
* Drop get_url_settings from Ec2, CloudStack and OpenStack and generalize
DataSource.get_url_params(). Allow subclasses to override url_max_wait,
url_timeout and url_retries params.
* Rename get_network_metadata bool to perform_dhcp_setup as it designates
whether EphemeralDHCPv4 setup is required before crawling metadata.
LP: #1749717
|
|
The Azure data source provides a method to check whether a NTFS partition
on the ephemeral disk is safe for reformatting to ext4. The method checks
to see if there are customer data files on the disk. However, mounting
the partition fails on systems that do not have the capability of
mounting NTFS. Note that in this case, it is also very unlikely that the
NTFS partition would have been used by the system (since it can't mount
it). The only case would be where an update to the system removed the
capability to mount NTFS, the likelihood of which is also very small.
This change allows the reformatting of the ephemeral disk to ext4 on
systems where mounting NTFS is not supported.
|
|
The result of a read_file_or_url on a file and on a url would differ
in behavior.
str(UrlResponse) would return UrlResponse.contents.decode('utf-8')
while
str(FileResponse) would return str(FileResponse.contents)
The difference being "b'foo'" versus "foo".
As part of the general goal of cleaning util, move read_file_or_url
into url_helper.
|
|
In playing with a SmartOS container I found that ds-identify did
not identify the container there as a container. Systemd-detect-virt
identifies it as 'container-other'.
Also here are tests for ds-identify for the SmartOS platform
identification, and some indentation fixes in ds-identify.
|
|
This change is for Azure VM Preprovisioning. A bug was found when after
azure VMs report ready the first time, during the time when VM is polling
indefinitely for the new ovf-env.xml from Instance Metadata Service
(IMDS), if a reboot happens, we send another report ready signal to the
fabric, which deletes the reprovisioning data on the node.
This marker file is used to fix this issue so that we will only send a
report ready signal to the fabric when no marker file is present. Then,
create a marker file so that when a reboot does occur, we check if a
marker file has been created and decide whether we would like to send the
repot ready signal.
LP: #1765214
|
|
Ubuntu images on IBMCloud for 16.04 have some seed data in
/var/lib/cloud/data/seed/nocloud-net. In order to have systems with
IBMCloud enabled, we modified ds-identify detection to skip that seed
if the system was on IBMCloud. That change did not consider the
fact that IBMCloud might not be in the datasource list.
There was similar logic in the ConfigDrive datasource in ds-identify
and the datasource itself.
Config drive is now updated to only check and avoid IBMCloud if IBMCloud
is enabled. The check in ds-identify for nocloud was dropped. If a
user provides a nocloud seed on IBMCloud, then that can be used.
This means that systems running Xenial will continue to get their
old datasources.
LP: #1766401
|
|
When images are deployed from template in a production environment
the artifacts of the provisioning stage (provisioningConfiguration.cfg)
that cloud-init referenced are cleaned up. However, when provisioned
in "debug" mode (internal to IBM) the artifacts are left.
This changes the 'is_ibm_provisioning' implementations in both
ds-identify and in the IBM datasource to identify the provisioning
stage more correctly. The change is to consider provisioning only
if the provisioing file existed and there was no log file or
the log file was older than this boot.
LP: #1767166
|
|
The cloud-init-local.service expects that any network device name changes
have already been completed by the kernel or udev daemon.
In some situations we've found that the renaming of interfaces from kernel
names (eth0, eth1, etc) to their persistent names (eno1, ens3, enp0s1,
etc) may happen after cloud-init-local has started where it reads values
from sysfs about what network devices are present, and which device to use
as a fallback nic.
Subsequently, cloud-init-local would write out network configuration for a
kernel device name which would no longer be present by the time that
networking services start to bring up the devices. The result is that the
instance does not get networking configured. Prior to use of
systemd-networkd, the Ubuntu 'networking.service' unit included a call to
udevadm settle which is why this race is not seen on a Xenial system.
This change adds the ability to detect if an interface has a stable name,
if if we find one without stable names and stable names have not been
disabled (net.ifnames=0 in /proc/cmdline), then cloud-init will invoke
udevadm settle.
LP: #1766287
|
|
This adds information to the IBMCloud datasource describing the
6 different scenarios that it is expected to handle.
|
|
cloud-init and mdata-get each have their own implementation of the
SmartOS metadata protocol. If cloud-init and other services that call
mdata-get are run concurrently, crosstalk on the serial port can cause
them both to become confused.
This change makes it so that cloud-init uses the same cooperative
locking scheme that's used by mdata-get, thus preventing cross-talk
between mdata-get and cloud-init.
For testing, a VM running on a SmartOS host and pyserial are required.
If the tests are run on a platform other than SmartOS, those that use a
real serial port are skipped. pyserial remains commented in
requirements.txt because most testers will not be running atop SmartOS.
LP: #1746605
|
|
There are three potential sources of the hostname, one of which is
documented SmartOS's vmadm(1M) via the hostname property. That
property's value is retrieved via the sdc:hostname key. The other
two sources for the hostname are a hostname key in customer_metadata
and the VM's uuid (sdc:uuid). Of these three, the sdc:hostname value
is not used in a meaningful way by DataSourceSmartOS.
This fix changes the fallback mechanism when hostname is not
specified in customer_metadata. The order of precedence for setting
the hostname is now 1) hostname in customer_metadata,
2) sdc:hostname, then 3) sdc:uuid.
LP: #1765085
|
|
If customer_metadata has no keys, the KEYS request returns an empty
string. Callers of the list() method expect a list to be returned and
will give a stack trace if this expectation is not met.
LP: #1763480
|
|
This enables warnings produced by pylint for unused variables (W0612),
and fixes the existing errors.
|
|
If the metadata service in the host is down while a guest that uses
DataSourceSmartOS is booting, the request from the guest falls into the
bit bucket. When the metadata service is eventually started, the guest
has no awareness of this and does not resend the request. This results in
cloud-init hanging forever with a guest reboot as the only recovery
option.
This fix updates the metadata protocol to implement the initialization
phase, just as is implemented by mdata-get and related utilities. The
initialization phase includes draining all pending data from the serial
port, writing an empty command and getting an expected error message in
reply. If the initialization phase times out, it is retried every five
seconds. Each timeout results in a warning message: "Timeout while
initializing metadata client. Is the host metadata service running?" By
default, warning messages are logged to the console, thus the reason for a
hung boot is readily apparent.
LP: #1667735
|
|
ext3 is not able to support file system sizes that are needed in Joyent's
cloud. For the default block size of 4k, the maximum filesystem size
for ext3 is 2^32 * 4096 = 16 TiB.
This changes the default file system type from ext3 to ext4.
LP: #1763511
|