Age | Commit message (Collapse) | Author |
|
Cloud-init caches any cloud metadata crawled during boot in the file
/run/cloud-init/instance-data.json. Cloud-init also standardizes some of
that metadata across all clouds. The command 'cloud-init query' surfaces a
simple CLI to query or format any cached instance metadata so that scripts
or end-users do not have to write tools to crawl metadata themselves.
Since 'cloud-init query' is runnable by non-root users, redact any
sensitive data from instance-data.json and provide a root-readable
unredacted instance-data-sensitive.json. Datasources can now define a
sensitive_metadata_keys tuple which will redact any matching keys
which could contain passwords or credentials from instance-data.json.
Also add the following standardized 'v1' instance-data.json keys:
- user_data: The base64encoded user-data provided at instance launch
- vendor_data: Any vendor_data provided to the instance at launch
- underscore_delimited versions of existing hyphenated keys:
instance_id, local_hostname, availability_zone, cloud_name
|
|
Allow users to provide '## template: jinja' as the first line or their
#cloud-config or custom script user-data parts. When this header exists,
the cloud-config or script will be rendered as a jinja template.
All instance metadata keys and values present in
/run/cloud-init/instance-data.json will be available as jinja variables
for the template. This means any cloud-config module or script can
reference any standardized instance data in templates and scripts.
Additionally, any standardized instance-data.json keys scoped below a
'<v#>' key will be promoted as a top-level key for ease of reference in
templates. This means that '{{ local_hostname }}' is the same as using the
latest '{{ v#.local_hostname }}'.
Since instance-data is written to /run/cloud-init/instance-data.json, make
sure it is persisted across reboots when the cached datasource opject is
reloaded.
LP: #1791781
|
|
This adds a Oracle specific datasource that functions with OCI.
It is a simplified version of the OpenStack metadata server
with support for vendor-data.
It does not support the OCI-C (classic) platform.
Also here is a move of BrokenMetadata to common 'sources'
as this was the third occurrence of that class.
|
|
The comment in update_metadata() that explains how a datasource should
enable network reconfig on every boot presumes that
EventType.BOOT_NEW_INSTANCE is a subset of EventType.BOOT. That's not
the case, and as such a datasource that needs to configure networking
when it is a new instance and every boot needs to include both event
types.
To make the situation above easier to debug, update_metadata() now
logs when it returns false.
To make it so that datasources do not need to test before appending to
the update_events['network'], it is changed from a list to a set.
test_update_metadata_only_acts_on_supported_update_events is updated
to allow datasources to support EventType.BOOT.
Author: Mike Gerdts <mike.gerdts@joyent.com>
|
|
Very basic type definitions are now defined to distinguish 'boot'
events from 'new instance (first boot)'. Event types will now be handed
to a datasource.update_metadata method which can determine whether
to refresh its metadata and re-render configuration based on that
source event.
A datasource can 'subscribe' to an event by setting up the update_events
attribute on the datasource class which describe what config scope is
updated by a list of matching events. By default datasources will have
the following update_events: {'network': [EventType.BOOT_NEW_INSTANCE]}
This setting says the datasource will re-write network configuration only
on first boot of a new instance or when the instance id changes.
New methods are now present on the datasource:
- clear_cached_attrs: Resets cached datasource attributes to values
listed in datasource.cached_attr_defaults. This is performed prior to
processing a fresh metadata process to avoid keeping old/invalid
cached data around.
- update_metadata: accepts source_event_types to determine if the
metadata should be crawled again and processed
|
|
Network has not yet been configured in the init-local stage so the
openstack datasource will use dhcp-client to temporarily obtain an ipv4
address and query the metadata service at http://169.254.169.254 to get
network_data.json configuration. If present, the datasource will return
network_config version 1 config based on that network_data.json content.
Previously OpenStack datasource only setup dhcp on the fallback interface
so this represents a change in behavior to react to the full config
provided by openstack.
Also significant to OpenStack is the separation of a _crawl_data operation
from get_data(). crawl_data walks the available metadata services and
returns a dict of discovered content. get_data consumes the crawled_data,
caches it in the datasource and reacts to that data.
/run/cloud-init/instance-data.json now published network_data.json or
ec2_metadata key if that data is present on any datasource.
The main reasons for the separation of crawl from get_data:
* Enable performance metrics of cloud-init's metadata crawls on each
* Enable cloud-init modules and scripts to query and consume metadata
content which may have updated/changed after cloud-init's initial cache
during instance boot. (Think hotplug)
Also generalize common logic to base DataSource class/module:
* Move to a common UNSET variable up into base datasource module fix EC2,
ConfigDrive, OpenStack, SmartOS to use the global.
* Drop get_url_settings from Ec2, CloudStack and OpenStack and generalize
DataSource.get_url_params(). Allow subclasses to override url_max_wait,
url_timeout and url_retries params.
* Rename get_network_metadata bool to perform_dhcp_setup as it designates
whether EphemeralDHCPv4 setup is required before crawling metadata.
LP: #1749717
|
|
When instance meta-data provides hostname information, run
cc_set_hostname in the init-local or init-net stage before network
comes up.
Prevent an initial DHCP request which leaks the stock cloud-image default
hostname before the meta-data provided hostname was processed.
A leaked cloud-image hostname adversely affects Dynamic DNS which
would reallocate 'ubuntu' hostname in DNS to every instance brought up by
cloud-init. These instances would only update DNS to the cloud-init
configured hostname upon DHCP lease renewal.
This branch extends the get_hostname methods in datasource, cloud and
util to limit results to metadata_only to avoid extra cost of querying
the distro for hostname information if metadata does not provide that
information.
LP: #1746455
|
|
Fix obvious typos. Replace 'for for' with a 'for'.
|
|
Each DataSource subclass must define its own get_data method. This branch
formalizes our DataSource class to require that subclasses define an
explicit dsname for sourcing cloud-config datasource configuration.
Subclasses must also override the _get_data method or a
NotImplementedError is raised.
The branch also writes /run/cloud-init/instance-data.json. This file
contains all meta-data, user-data and vendor-data and a standardized set
of metadata keys in a json blob which other utilities with root-access
could make use of. Because some meta-data or user-data is potentially
sensitive the file is only readable by root.
Generally most metadata content types should be json serializable. If
specific keys or values are not serializable, those specific values will
be base64encoded and the key path will be listed under the top-level key
'base64-encoded-keys' in instance-data.json. If json writing fails due to
other TypeErrors or UnicodeDecodeErrors, a warning log will be emitted to
/var/log/cloud-init.log and no instance-data.json will be created.
|
|
Currently the cloud-init default locale (en_US.UTF-8) is set by
the base datasource class. This patch allows a distro to overide
the fallback value with one that's available in the distro but continues
to respect an image which has preconfigured a locale.
- Distro object now has a get_locale method which will return a
preconfigure locale setting by checking the distros locale system
configuration file. If not set or not present, return the default
locale of en_US.UTF-8 which retains behavior of all previous cloud-init
releases.
- Apply locale now handles regenerating locales or system configuration
files as needed.
- Adjust apply_locale logic to skip locale-regen if the specified LANG
value is C.UTF-8,C, or POSIX; they do not require regeneration.
- Further add unittests to exercise the default paths for Ubuntu and
non-ubuntu paths to validate they get the LANG expected.
|
|
On systems with network devices with duplicate mac addresses, cloud-init
will fail to rename the devices according to the specified network
configuration. Refactor net layer to search by device driver and device
id if available. Azure systems may have duplicate mac addresses by
design.
Update Azure datasource to run at init-local time and let Azure datasource
generate a fallback networking config to handle advanced networking
configurations.
Lastly, add a 'setup' method to the datasources that is called before
userdata/vendordata is processed but after networking is up. That is
used here on Azure to interact with the 'fabric'.
|
|
This will change all instances of LOG.warn to LOG.warning as warn
is now a deprecated method. It will also make sure any logging
uses lazy logging by passing string format arguments as function
parameters.
|
|
Now tox will run pylint. The .pylintrc file sets pylint to only produce
errors, and will ignore certain classes that are known problematic (six).
|
|
When deploying on Azure and using only cloud-init, you must "bounce" the
network interface to trigger a DDNS update. This allows dhclient to
register the hostname with Azure so that DNS works correctly on their
private networks (i.e. between vm and vm).
The agent path was already doing the bounce so this creates parity
between the built-in path and the agent.
LP: #1674685
|
|
This has been a recurring ask and we had initially just made the change to
the cloud-init 2.0 codebase. As the current thinking is we'll just
continue to enhance the current codebase, its desirable to relicense to
match what we'd intended as part of the 2.0 plan here.
- put a brief description of license in LICENSE file
- put full license versions in LICENSE-GPLv3 and LICENSE-Apache2.0
- simplify the per-file header to reference LICENSE
- tox: ignore H102 (Apache License Header check)
Add license header to files that ship.
Reformat headers, make sure everything has vi: at end of file.
Non-shipping files do not need the copyright header,
but at the moment tests/ have it.
|
|
In some situations, cloud-init will erroneously append a default
domain to an already fully qualified hostname, resulting in something
like 'localhost.localdomain.localdomain'. This patch checks to see if
the value returned by util.get_hostname() contains a '.', and if it
does treats it as a fully qualified name.
Resolves: rhbz#1389048
LP: #1647910
|
|
This adds a call to 'activate_datasource'. That will be called
during init stage (or init-local in the event of a 'local' dsmode).
It is present so that the datasource can do platform specific operations
that may be necessary. It is passed the fully rendered cloud-config
and whether or not the instance is a new instance.
The Azure datasource uses this to address formatting of the ephemeral
devices. It does so by
a.) waiting for the device to come online
b.) removing the marker files for the disk_setup and mounts modules
if it finds that the ephemeral device has been reset.
LP: #1611074
|
|
Add vendor-data support to maas which will behave like the openstack
vendor-data does. Data returned from maas must be yaml loadable.
Also update the main in DataSourceMAAS to "just work" on a maas
deployed system.
LP: #1612313
|
|
On upgrade and reboot, if datasource restored from obj.pkl did not have
a dsmode attribute, then 'init --local' would fail due to stack trace.
LP: #1596690
|
|
|
|
== background ==
DataSource Mode (dsmode) is present in many datasources in cloud-init.
dsmode was originally added to cloud-init to specify when this datasource
should be 'realized'.
cloud-init has 4 stages of boot.
a.) cloud-init --local . network is guaranteed not present.
b.) cloud-init (--network). network is guaranteed present.
c.) cloud-config
d.) cloud-init final
'init_modules' [1] are run "as early as possible". And as such, are executed
in either 'a' or 'b' based on the datasource. However, executing them means
that user-data has been fully consumed. User-data and vendor-data may have
'#include http://...' which then rely on the network being present. boothooks
are an example of the things run in init_modules.
The 'dsmode' was a way for a user to indicate that init_modules
should run at 'a' (dsmode=local) or 'b' (dsmode=net) directly.
Things were further confused when a datasource could provide networking
configuration. Then, we needed to apply the networking config at 'a'
but if the user had provided boothooks that expected networking, then the
init_modules would need to be executed at 'b'. The config drive datasource
hacked its way through this and applies networking if *it* detects it is
a new instance.
== Suggested Change ==
The plan is to
1. incorporate 'dsmode' into DataSource superclass
2. make all existing datasources default to network
3. apply any networking configuration from a datasource on first boot only
apply_networking will always rename network devices when it runs.
for bug 1579130.
4. run init_modules at cloud-init (network) time frame unless datasource
is 'local'.
5. Datasources can provide a 'first_boot' method that will be called when
a new instance_id is found. This will allow the config drive's write_files
to be applied once.
Over all, this will very much simplify things. We'll no longer have
2 sources like DataSourceNoCloud and DataSourceNoCloudNet, but would just
have one source with a dsmode.
== Concerns ==
Some things have odd reliance on dsmode. For example, OpenNebula's get_hostname
uses it to determine if it should do a lookup of an ip address.
== Bugs to fix here ==
http://pad.lv/1577982 ConfigDrive: cloud-init fails to configure network from network_data.json
http://pad.lv/1579130 need to support systemd.link renaming of devices in container
http://pad.lv/1577844 Drop unnecessary blocking of all net udev rules
|
|
|
|
if the Datasource does not have an entry in config, then
set it to be a empty dictionary rather than None.
Also remove places that did this elsewhere.
|
|
Changing this interface to allow for easy change later.
The thing that this will enable is:
a.) maas datasource to look at the system config and see if it
is configured with the same consumer_key
b.) datasource config could allow setting a variable that it
would look at.
|
|
there is no data source that has a populated network_config()
so at this point this doesn't do anything.
|
|
This adds a check in cloud-init to see if the existing (cached)
datasource is still valid. It relies on support from the Datasource
to implement 'check_instance_id'. That method should quickly determine
(if possible) if the instance id found in the datasource is still valid.
This means that we can still notice new instance ids without
depending on a network datasource on every boot.
I've also implemented check_instance_id for the superclass and for
3 classes:
DataSourceAzure (check dmi data)
DataSourceOpenstack (check dmi data)
DataSourceNocloud (check the seeded data or kernel command line)
LP: #1553815
|
|
this just separates events from other things that could conceivably
be reported.
|
|
|
|
|
|
change ReportStack to ReportEventStack
change default ReportEventStack to be status.SUCCESS instead of None
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Also implement DataSource.region for EC2 and GCE data sources.
|
|
|
|
to be behind trunk.
`tox -e py27` passes full test suite. Now to work on replacing mocker.
|
|
|
|
|
|
Instead of using this log (which really isn't a failure) we should
instead of just return the looked up locations and then if there really
is an error the caller can handle the usage of the looked up locations
as they choose fit.
|
|
Fixed all complaints from running "make pep8". Also version locked
pep8 in test-requirements.txt to ensure that pep8 requirements don't
change without an explicit commit.
|
|
|
|
|
|
remove apply_filter from get_vendordata. I can't think of a good
reason to filter vendor-data per instance-id.
remove has_vendordata and consume_vendordata.
has vendordata is always "true", whether or not there is something
to operate is determined by:
if ds.vendordata_raw()
consume_vendordata is based on config entirely.
|
|
vendordata.
Vendordata is a datasource provided userdata-like blob that is parsed
similiarly to userdata, execept at the user's pleasure.
cloudinit/config/cc_scripts_vendor.py: added vendor script cloud config
cloudinit/config/cc_vendor_scripts_per_boot.py: added vendor per boot
cloud config
cloudinit/config/cc_vendor_scripts_per_instance.py: added vendor per
instance vendor cloud config
cloudinit/config/cc_vendor_scripts_per_once.py: added per once vendor
cloud config script
doc/examples/cloud-config-vendor-data.txt: documentation of vendor-data
examples
doc/vendordata.txt: documentation of vendordata for vendors
(RENAMED) tests/unittests/test_userdata.py => tests/unittests/test_userdata.py
TO: tests/unittests/test_userdata.py => tests/unittests/test_data.py:
userdata test cases are not expanded to confirm superiority over vendor
data.
bin/cloud-init: change instances of 'consume_userdata' to 'consume_data'
cloudinit/handlers/cloud_config.py: Added vendor script handling to default
cloud-config modules
cloudinit/handlers/shell_script.py: Added ability to change the path key to
support vendor provided 'vendor-scripts'. Defaults to 'script'.
cloudinit/helpers.py:
- Changed ConfigMerger to include handling of vendordata.
- Changed helpers to include paths for vendordata.
cloudinit/sources/__init__.py: Added functions for helping vendordata
- get_vendordata_raw(): returns vendordata unprocessed
- get_vendordata(): returns vendordata through userdata processor
- has_vendordata(): indicator if vendordata is present
- consume_vendordata(): datasource directive for indicating explict
user approval of vendordata consumption. Defaults to 'false'
cloudinit/stages.py: Re-jiggered for handling of vendordata
- _initial_subdirs(): added vendor script definition
- update(): added self._store_vendordata()
- [ADDED] _store_vendordata(): store vendordata
- _get_default_handlers(): modified to allow for filtering
which handlers will run against vendordata
- [ADDED] _do_handlers(): moved logic from consume_userdata
to _do_handlers(). This allows _consume_vendordata() and
_consume_userdata() to use the same code path.
- [RENAMED] consume_userdata() to _consume_userdata()
- [ADDED] _consume_vendordata() for handling vendordata
- run after userdata to get user cloud-config
- uses ConfigMerger to get the configuration from the
instance perspective about whether or not to use
vendordata
- [ADDED] consume_data() to call _consume_{user,vendor}data
cloudinit/util.py:
- [ADDED] get_nested_option_as_list() used by cc_vendor* for
getting a nested value from a dict and returned as a list
- runparts(): added 'exe_prefix' for running exe with a prefix,
used by cc_vendor*
config/cloud.cfg: Added vendor script execution as default
tests/unittests/test_runs/test_merge_run.py: changed consume_userdata() to
consume_data()
tests/unittests/test_runs/test_simple_run.py: changed consume_userdata() to
consume_data()
|
|
current form its still missing some modules though.
Supported:
-SSH-keys
-growpart
-growfs
-adduser
-powerstate
|
|
When the base DataSource class would set 'ds_cfg' for the specific
datasources' config, it would fail for the DataSources that are just named
'DataSourceFooNet' and we wanted to set configuration in 'Foo'.
For example, both DataSourceOpenNebula and DataSourceOpenNebulaNet want to
read datasource config from
sources:
OpenNebula:
foo: bar
But without this change, 'ds_cfg' would not be setup properly for
OpenNebulaNet.
|
|
|