Age | Commit message (Collapse) | Author |
|
This branch is a prerequisite for IPv6 support in AWS by allowing Ec2
datasource to query the metadata source version 2016-09-02 about whether
or not it needs to configure IPv6 on interfaces. If version 2016-09-02
is not present, fallback to the min_metadata_version of 2009-04-04. The
DataSourceEc2Local not run on FreeBSD because dhclient in doesn't
support the -sf flag allowing us to run dhclient without filesystem
side-effects.
To query AWS' metadata address @ 169.254.169.254, the instance must have
a dhcp-allocated address configured. Configuring IPv4 link-local
addresses result in timeouts from the metadata service. We introduced a
DataSourceEc2Local subclass which will perform a sandboxed dhclient
discovery which obtains an authorized IP address on eth0 and crawl
metadata about full instance network configuration.
Since ec2 IPv6 metadata is not sufficient in itself to tell us all the
ipv6 knownledge we need, it only be used as a boolean to tell us which
nics need IPv6. Cloud-init will then configure desired interfaces to
DHCPv6 versus DHCPv4.
Performance side note: Shifting the dhcp work into init-local for Ec2
actually gets us 1 second faster deployments by skipping init-network
phase of alternate datasource checks because Ec2Local is configured in
an ealier boot stage. In 3 test runs prior to this change: cloud-init
runs were 5.5 seconds, with the change we now average 4.6 seconds.
This efficiency could be even further improved if we avoiding dhcp
discovery in order to talk to the metadata service from an AWS
authorized dhcp address if there were some way to advertize the dhcp
configuration via DMI/SMBIOS or system environment variables.
Inspecting time costs of the dhclient setup/teardown in 3 live runs the
time cost for the dhcp setup round trip on AWS is:
test 1: 76 milliseconds
dhcp discovery + metadata: 0.347 seconds
metadata alone: 0.271 seconds
test 2: 88 milliseconds
dhcp discovery + metadata: 0.388 seconds
metadata alone: 0.300 seconds
test 3: 75 milliseconds
dhcp discovery + metadata: 0.366 seconds
metadata alone: 0.291 seconds
LP: #1709772
|
|
get_interfaces_by_mac and get_interfaces just looked much alike.
This makes get_interfaces_by_mac call get_interfaces.
|
|
This is not yet called, but will be called in a subsequent Ec2-related branch to manually initialize a network interface with the responses using dhcp discovery without any dhcp-script side-effects. The functionality has been tested on Ec2 ubuntu and CentOS vms to ensure that network interface initialization works in both OS-types.
Since there was poor unit test coverage for the cloudinit.net.__init__ module, this branch adds a bunch of coverage to the functions in cloudinit.net.__init. We can also now have unit tests local to the cloudinit modules. The benefits of having unittests under cloudinit module:
- Proximity of unittest to cloudinit module makes it easier for ongoing devs to know where to augment unit tests. The tests.unittest directory is organizated such that it
- Allows for 1 to 1 name mapping module -> tests/test_module.py
- Improved test and module isolation, if we find unit tests have to import from a number of modules besides the module under test, it will better prompt resturcturing of the module.
This also branch touches:
- tox.ini to run unit tests found in cloudinit as well as include all test-requirements for pylint since we now have unit tests living within cloudinit package
- setup.py to exclude any test modules under cloudinit when packaging
|
|
The network device renaming code previously required the case of
the mac address input to match that of the data read from the system.
For example, if user provided network config with mac address
in upper case, then cloud-init would not rename the device correctly
as /sys/class/net/address stores lower case values.
The fix here is to always compare lower case mac addresses.
LP: #1705147
|
|
On systems with network devices with duplicate mac addresses, cloud-init
will fail to rename the devices according to the specified network
configuration. Refactor net layer to search by device driver and device
id if available. Azure systems may have duplicate mac addresses by
design.
Update Azure datasource to run at init-local time and let Azure datasource
generate a fallback networking config to handle advanced networking
configurations.
Lastly, add a 'setup' method to the datasources that is called before
userdata/vendordata is processed but after networking is up. That is
used here on Azure to interact with the 'fabric'.
|
|
The code deciding which interface to choose as the default to request the
IP address through DHCP does not sort the interfaces correctly. On Ubuntu
Xenial images for example, the interfaces are named ens1, ens2, ens3...,
ens11, ... depending on the pci bus address. The python sorting will list
'ens11' before 'ens3' for example despite the fact that 'ens3' should be
before 'ens11'.
This patch address this issue and sort the interface names according to a
human sorting.
Signed-off-by: Marc-Aurèle Brothier <m@brothier.org>
|
|
Some interfaces (greptap0 in the bug) have a mac address of
'00:00:00:00:00:00'. That was causing a duplicate mac detection
as the 'lo' device also has that mac.
The change here is to just ignore macs other than 'lo' that have that.
LP: #1692028
|
|
Introduce is_vlan function and call that when building dictionary of
interfaces by mac address.
LP: #1682871
|
|
When cloud-init ran in the init stage (after networking had come up).
A bug could occur where cloud-init would attempt and fail to rename
network devices that had "inherited" mac addresses.
The intent of apply_network_config_names was always to rename only
the devices that were "physical" per the network config. (This would
include veth devices in a container). The bug was in creating
the dictionary of interfaces by mac address. If there were multiple
interfaces with the same mac address then renames could fail.
This situation was guaranteed to occur with bonds or vlans or other
devices that inherit their mac.
The solution is to change get_interfaces_by_mac to skip interfaces
that have an inherited mac.
Also drop the 'devs' argument to get_interfaces_by_mac. It was
non-obvious what the result should be if a device in the input
list was filtered out. ie should the following have an entry for
bond0 or not. get_interfaces_by_mac(devs=['bond0'])
LP: #1669860
|
|
Previously, the distro had hard coded which network renderer it would
use. This adds support for just picking the right renderer based
on what is available.
Now, that can be set via a priority in system_info, but should
generally work. That config looks like:
system_info:
network:
renderers: ["eni", "sysconfig"]
When no renderers are found, a specific RendererNotFoundError is raised.
stages.py is modified to catch that and log it at error level. This
path should not really be exercised, but could occur if for example an
Ubuntu system did not have ifupdown, or a rhel system did not have
sysconfig. In such a system previously we would have quietly rendered
ENI configuration but that would have been ignored. This is one step
better in that we at least log the error.
|
|
This has been a recurring ask and we had initially just made the change to
the cloud-init 2.0 codebase. As the current thinking is we'll just
continue to enhance the current codebase, its desirable to relicense to
match what we'd intended as part of the 2.0 plan here.
- put a brief description of license in LICENSE file
- put full license versions in LICENSE-GPLv3 and LICENSE-Apache2.0
- simplify the per-file header to reference LICENSE
- tox: ignore H102 (Apache License Header check)
Add license header to files that ship.
Reformat headers, make sure everything has vi: at end of file.
Non-shipping files do not need the copyright header,
but at the moment tests/ have it.
|
|
I've seen cases of unable to read from files as
well as the existing os errors so catch io error
and skip by using the smarter read_sys_net instead.
LP: #1625766
|
|
When using get_interface_mac, on a system with bond slaves, it would
return the bond_master's address. That isn't expected, and causes
problems in a caller like get_interfaces_by_mac which would then seem to
find duplicate macs on the system.
Additionally, in read_sys_net catch a errno.ENOTDIR error as ENOENT.
Opening a path as a file that has <existing_file>/anything will will raise
ENOTDIR rather than ENOENT. This handles that case in read_sys_net as a
if the file did not exist.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
'id' on a link in the openstack spec should be "Generic, generated ID".
current implementation was to use the host's name for the host
side nic. Which provided names like 'tap-adfasdffd'.
We do not want to name devices like that as its quite unexpected
and non user friendly. So here we use the system name for any
nic that is present, but then require that the nics found also
be present at the time of rendering.
The end result is that if the system boots with net.ifnames=0
then it will get 'eth0' like names. and if it boots without net.ifnames
then it will get enp0s1 like names.
|
|
this adds ability to support ENI that has:
hwadress ether 36:4c:e1:3b:14:31
or
hwaddress 36:4c:e1:3b:14:31
the former is written by openstack (at least on dreamhost).
Also, in the conversion of eni to network config support broadcast
and netmask.
|
|
|
|
The one issue i'm aware of currently is that tap devices
(ip tuntap add mode tap user root mytap1)
do not work correctly with 'is_up' which means the check
does not bring them down and the rename fails.
The LOG.debug message should be cleaned up too, as it currently
references the function rather function.__name__ for nicer message.
|
|
currently does not work in lxc
https://github.com/lxc/lxd/issues/2063
|
|
== background ==
DataSource Mode (dsmode) is present in many datasources in cloud-init.
dsmode was originally added to cloud-init to specify when this datasource
should be 'realized'.
cloud-init has 4 stages of boot.
a.) cloud-init --local . network is guaranteed not present.
b.) cloud-init (--network). network is guaranteed present.
c.) cloud-config
d.) cloud-init final
'init_modules' [1] are run "as early as possible". And as such, are executed
in either 'a' or 'b' based on the datasource. However, executing them means
that user-data has been fully consumed. User-data and vendor-data may have
'#include http://...' which then rely on the network being present. boothooks
are an example of the things run in init_modules.
The 'dsmode' was a way for a user to indicate that init_modules
should run at 'a' (dsmode=local) or 'b' (dsmode=net) directly.
Things were further confused when a datasource could provide networking
configuration. Then, we needed to apply the networking config at 'a'
but if the user had provided boothooks that expected networking, then the
init_modules would need to be executed at 'b'. The config drive datasource
hacked its way through this and applies networking if *it* detects it is
a new instance.
== Suggested Change ==
The plan is to
1. incorporate 'dsmode' into DataSource superclass
2. make all existing datasources default to network
3. apply any networking configuration from a datasource on first boot only
apply_networking will always rename network devices when it runs.
for bug 1579130.
4. run init_modules at cloud-init (network) time frame unless datasource
is 'local'.
5. Datasources can provide a 'first_boot' method that will be called when
a new instance_id is found. This will allow the config drive's write_files
to be applied once.
Over all, this will very much simplify things. We'll no longer have
2 sources like DataSourceNoCloud and DataSourceNoCloudNet, but would just
have one source with a dsmode.
== Concerns ==
Some things have odd reliance on dsmode. For example, OpenNebula's get_hostname
uses it to determine if it should do a lookup of an ip address.
== Bugs to fix here ==
http://pad.lv/1577982 ConfigDrive: cloud-init fails to configure network from network_data.json
http://pad.lv/1579130 need to support systemd.link renaming of devices in container
http://pad.lv/1577844 Drop unnecessary blocking of all net udev rules
|
|
|
|
|
|
|
|
|
|
This allows it to be used outside of cloudinit
more easily in the future.
|
|
|
|
|
|
This format allows for rendering to work in other distros
and clearly separates the API needed to do this (it also
moves the klibc parsing into its own module so that the
leftover code in net/__init__.py is smaller and only focused
on util code).
|
|
|
|
When ip= on the kernel command line defines the networking, set
those network devices to be manually controlled, instead of 'auto'.
The reason for this is that if they're marked as 'auto':
a.) a second attempt will be made to ifup them.
b.) they'll be brought down on shutdown
'b' is problematic on network root filesystem.
Also this picks up 2 changes from curtin's net module:
- Cleanup newline logic so we always have a clean '\n\n' between stanza
- Add a unittest to validate bonding network config render, specifically
when to emit auto $iface for dependent bond slaves.
LP: #1568637
|
|
|
|
when reading the initramfs configurewd devices and turning them
into network config, we change to not have 'auto' control (or allow=auto).
The reason for this is that if the device was still up:
a.) it would try to bring it up again (due to bug 1570142)
b.) it would be brought down.
'b' is problematic if there is an iscsi or network root filesystem.
Note, that ifupdown does now support 'no-auto-down' which means
that the nic should not be brought down on 'ifdown -a'.
LP: #1568637
|
|
This picks up newline cleanup and some bond fixes from curtin at rev 374.
- Cleanup newline logic so we always have a clean '\n\n' between stanza
- Add a unittest to validate bonding network config render, specifically
when to emit auto $iface for dependent bond slaves.
|
|
Just skip devices that are named veth*.
The fix here is to ignore lxd created devices, but any other veth
device that is created at this point in boot is probably not the
right interface to dhcp on.
LP: #1569064
|
|
It does not make sense to consider bridges when searching for fallback
networking. If the system is configured with a bridge, then its probably
for some purpose other than to get to a metadata service.
Considering the bridge could make cloud-init pick the wrong device on reboot.
LP: #1569974
|
|
a.) do not write systemd link files if we do not have a mac address.
the check is updated to check for value rather than just presense
(ie, 'mac_address': None)
b.) DataSourceNoCloudNet: search in the nocloud seed dir
this is important because NoCloud if dsmode is Net will look only
would pass by, expecting NoCloudNet to pick it up
but NoCloudNet would not look in /var/lib/cloud/seed/nocloud
and thus skip it.
c.) support the disabling of network configuration
via /var/lib/cloud/data/upgraded-network
This is what the package upgrader is writing.
|
|
|
|
This adds support for suppling network configuration on the
kernel command line in 2 ways:
a.) kernel command line includes 'network-config=<base64>'
value of that parameter is base64 encoded json (or yaml)
it is taken as network config yaml.
In order to save space on kernel command line, it can be
base64 encoded gzipped json also.
b.) ip= paired with files authored by klibc's ipconfig tool
When network devices are brought up in the initramfs, klibc's
ipconfig tool writes files are named /run/net-<DEVNAME>.conf.
The best documentation available on that tool is
/usr/share/doc/libklibc/README.ipconfig.gz.
Also changes util.get_cmdline() to return the command line of
pid 1 if it is in a container. That is to make it consistent with
The systemd generator, and allow passing a command line to lxd,
as lxd does not mask /proc/cmdline.
|
|
|
|
net: add render_route comment to document why we added || true to route
statements
DataSourceConfigDrive: Only convert network_json to network_config when
caller reads network_config attr. Cache the conversion.
|
|
|
|
add tests to show this functional.
|
|
|
|
|
|
|