Age | Commit message (Collapse) | Author |
|
If an OVS bridge was used as the only/primary interface, the 'init'
stage failed with a "Not all expected physical devices present" error,
leaving the system with a broken SSH setup.
LP: #1898997
|
|
* DHCP sandboxing failing on noexec mounted /var/tmp
If /var/tmp is mounted with noexec option the DHCP sandboxing will fail
with Permission Denied. This patch simply avoids this error by checking
the exec permission updating the dhcp path in negative case.
rhbz: https://bugzilla.redhat.com/show_bug.cgi?id=1857309
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
* Replacing with os.* calls
* Adding test and removing isfile() useless call.
Co-authored-by: Rick Harding <rharding@mitechie.com>
|
|
* Refactor `cloudinit.net.wait_for_physdevs` to `cloudinit.distros.networking.Networking.wait_for_physdevs`
* Split the Linux-specific `udevadm_settle` call out to a separate abstract `Networking.settle` method; implement it on `LinuxNetworking` and add a `NotImplementedError` implementation to `BSDNetworking`
* Modify `wait_for_physdevs`s one callsite to use the new location
LP: #1884626
|
|
As the first refactor PR, this also includes the initial structure for tests.
LP: #1884619
|
|
Namely, is_connected, is_wireless and is_present. None of these are
used in the cloud-init codebase, so remove the dead code (instead of
refactoring it).
|
|
This introduces a way to log the dhclient error stream, and uses it for the Azure datasource (where we have a specific requirement for this data to be logged).
|
|
This was painful, but it finishes a TODO from cloudinit/subp.py.
It moves the following from util to subp:
ProcessExecutionError
subp
which
target_path
I moved subp_blob_in_tempfile into cc_chef, which is its only caller.
That saved us from having to deal with it using write_file
and temp_utils from subp (which does not import any cloudinit things now).
It is arguable that 'target_path' could be moved to a 'path_utils' or
something, but in order to use it from subp and also from utils,
we had to get it out of utils.
|
|
Remove extra spaces after a ','
|
|
These libraries provide backports of Python 3's stdlib components to Python 2. As we only support Python 3, we can simply use the stdlib now. This pull request does the following:
* removes some unneeded compatibility code for the old spelling of `assertRaisesRegex`
* replaces invocations of the Python 2-only `assertItemsEqual` with its new name, `assertCountEqual`
* replaces all usage of `unittest2` with `unittest`
* replaces all usage of `contextlib2` with `contextlib`
* drops `unittest2` and `contextlib2` from requirements files and tox.ini
It also rewrites some `test_azure` helpers to use bare asserts. We were seeing a strange error in xenial builds of this branch which appear to be stemming from the AssertionError that pytest produces being _different_ from the standard AssertionError. This means that the modified helpers weren't behaving correctly, because they weren't catching AssertionErrors as one would expect. (I believe this is related, in some way, to https://github.com/pytest-dev/pytest/issues/645, but the only version of pytest where we're affected is so far in the past that it's not worth pursuing it any further as we have a workaround.)
|
|
LP: #1870421
|
|
This also simplifies the implementation to rely on the stdlib, instead
of our own NIH checking.
|
|
This will be required for the mirror URL sanitisation work,
|
|
These classes don't use `self.logs` anywhere in their body, so we can
remove the `with_logs = True` setting from them.
These instances were found using astpath[0], with the following
invocation:
astpath "//Name[@id='with_logs' and not(ancestor::ClassDef//Attribute[@attr='logs'])]"
[0] https://github.com/hchasestevens/astpath
|
|
* cloudinit: replace "import mock" with "from unittest import mock"
* test-requirements.txt: drop mock
Co-authored-by: Chad Smith <chad.smith@canonical.com>
|
|
RedHat dhcp client writes out rfc3442 classless-static-routes in a different format[1]
than what is found in isc-dhcp clients. This patch adds support for the RedHat format.
1. Background details on the format
https://bugzilla.redhat.com/show_bug.cgi?id=516325
https://github.com/vaijab/fedora-dhcp/blob/e83fb19c51765442d77fa60596bfdb2b3b9fbe2e/dhcp-rfc3442-classless-static-routes.patch#L252
https://github.com/heftig/NetworkManager/blob/f56c82d86122fc45304fc829b5f1e4766ed51589/src/dhcp-manager/nm-dhcp-client.c#L978
LP: #1850642
|
|
Sending a valid but empty v1 network config resulted in a
stacktrace during execution. Update the network_state
parse path to specific check if the 'config' key is None
(not present) versus being present but explicitly empty.
Also add some network_state unittests.
LP: #1852496
|
|
The change that introduced this issue was handling interfaces that are
bonded in the kernel, in a way that doesn't present as "a bond" to
userspace in the normal way. Both members of this "bond" will share a
MAC address, so we filter one of them out to avoid incorrect MAC address
collision warnings.
Unfortunately, the matching condition was too broad, so that change also
affected normal bonds and bridges. This change specifically excludes
bonds and bridges from that determination, to address that regression.
LP: #1846535
|
|
Some network devices are transformed into a bond via kernel magic
and do not have the 'bonding' sysfs attribute, but like a bond they
have a duplicate MAC of other bond members. On Azure Advanced
Networking SRIOV devices are auto bonded and will have the same MAC
as the HyperV nic. We can detect this via the 'master' sysfs attribute
in the device sysfs path and this patch adds this to the list of devices
we ignore when enumerating device lists.
LP: #1844191
|
|
Add support for detecting netfailover[1] device 3-tuple in networking
layer. In the Oracle datasource ensure that if a provided network
config, either fallback or provided config includes a netfailover master
to remove any MAC address value as this can break under 3-netdev
as the other two devices have the same MAC.
1. https://www.kernel.org/doc/html/latest/networking/net_failover.html
|
|
The function generate_fallback_config is used by Azure by default when
not consuming IMDS configuration data. This function is also used by any
datasource which does not implement it's own network config. This simple
fallback configuration sets up dhcp on the most likely NIC. It will now
emit network v2 instead of network v1.
This is a step toward moving all components talking in v2 and allows us
to avoid costly conversions between v1 and v2 for newer distributions
which rely on netplan.
|
|
On systems with many interfaces, processing udev events may take a while.
Cloud-init expects devices included in a provided network-configuration
to be present when attempting to configure them. This patch adds a step
in net configuration where it will check for devices provided in the
configuration and if not found, issue udevadm settle commands to wait
for them to appear.
Additionally, the default path for udev persistent network rules
70-persistent-net.rules may also be written to systems which include
the 75-net-generator.rules. During boot, cloud-init and the
generator may race and interleave values causing issues. OpenSUSE
will now use a newer file, 85-persistent-net-cloud-init.rules which
will take precedence over values created by 75-net-generator and
avoid collisions on the same file.
LP: #1817368
|
|
The EphemeralDHCP context manager did not parse or handle
rfc3442 classless static routes which prevented reading
datasource metadata in some clouds. This branch adds support
for extracting the field from the leases output, parsing the
format and then adding the required iproute2 ip commands to
apply (and teardown) the static routes.
LP: #1821102
|
|
Currently on 18.04, running tox -e py27 will spew errors like:
.tests/unittests/test_net.py:2649: YAMLLoadWarning: calling yaml.load()
without Loader=... is deprecated, as the default Loader is unsafe.
Please read https://msg.pyyaml.org/load for full details.
The change here just uses cloud-init's yaml, which does safeloading
by default.
|
|
In test_ds_identify, don't mutate otherwise-static test data. When
running tests in a random order, this was causing failures due to
breaking preconditions for other tests.
In tests/helpers, reset logging level in tearDown. Some of the CLI
tests set the level of the root logger in a way that isn't correctly
reset.
For test_poll_imds_re_dhcp_on_timeout and
test_dhcp_discovery_run_in_sandbox_warns_invalid_pid, mock out
time.sleep; this saves ~11 seconds (or ~40% of previous test time!).
|
|
cloud-init uses dhclient to fetch the DHCP lease so it can extract
DHCP options. dhclient creates the leasefile, then writes to it;
simply waiting for the leasefile to appear creates a race between
dhclient and cloud-init. Instead, wait for dhclient to be parented by
init. At that point, we know it has written to the leasefile, so it's
safe to copy the file and kill the process.
cloud-init creates a temporary directory in which to execute dhclient,
and deletes that directory after it has killed the process. If
cloud-init abandons waiting for dhclient to daemonize, it will still
attempt to delete the temporary directory, but will not report an
exception should that attempt fail.
LP: #1794399
|
|
We add a new Optional parameter: connectivity_url
This is used in __enter__ to verify if a connection already exists.
If it does exist, no operations are performed.
|
|
On OpenStack based OVH public cloud, we got DHCP response with
fixed-address 54.36.113.86;
option subnet-mask 255.255.255.255;
option routers 54.36.112.1;
The router clearly is not on the subnet. So 'ip' would fail when
we tried to add the default route.
The solution here is to add an explicit route on that interface
to the router and then add the default route.
Also add 'bgpovs' to the list of 'physical' types for OpenStack
network configuration. That type is used on OVH public cloud.
LP: #1792415
|
|
In many cases, cloud-init uses 'util.subp' to run a subprocess.
This is not really desirable in our unit tests as it makes the tests
dependent upon existance of those utilities.
The change here is to modify the base test case class (CiTestCase) to
raise exception any time subp is called. Then, fix all callers.
For cases where subp is necessary or actually desired, we can use it
via
a.) context hander CiTestCase.allow_subp(value)
b.) class level self.allowed_subp = value
Both cases the value is a list of acceptable executable names that
will be called (essentially argv[0]).
Some cleanups in AltCloud were done as the code was being updated.
|
|
The cloud-init-local.service expects that any network device name changes
have already been completed by the kernel or udev daemon.
In some situations we've found that the renaming of interfaces from kernel
names (eth0, eth1, etc) to their persistent names (eno1, ens3, enp0s1,
etc) may happen after cloud-init-local has started where it reads values
from sysfs about what network devices are present, and which device to use
as a fallback nic.
Subsequently, cloud-init-local would write out network configuration for a
kernel device name which would no longer be present by the time that
networking services start to bring up the devices. The result is that the
instance does not get networking configured. Prior to use of
systemd-networkd, the Ubuntu 'networking.service' unit included a call to
udevadm settle which is why this race is not seen on a Xenial system.
This change adds the ability to detect if an interface has a stable name,
if if we find one without stable names and stable names have not been
disabled (net.ifnames=0 in /proc/cmdline), then cloud-init will invoke
udevadm settle.
LP: #1766287
|
|
net.apply_network_config_names currently only accepts network-config
in version 1 format. When users include a netplan format network-config the
rename code does not find any of the 'set-name' directives and does not rename
any of the interfaces. This causes some netplan configurations to fail.
This patch adds support for parsing netplan format and extracts the needed
information (macaddress and set-name values) to allow cloud-init to issue
interface rename commands. We know raise a RuntimeError if presented with
an unknown config format.
LP: #1709715
|
|
There is a race condition where our sandboxed dhclient properly writes a
lease file but has not yet written a pid file. If the sandbox temporary
directory is torn down before the dhclient subprocess writes a pidfile
DataSourceEc2Local gets a traceback and the instance will fallback to
DataSourceEc2 in the init-network stage. This wastes boot cycles we'd
rather not spend.
Fix handling of sandboxed dhclient to wait for both pidfile and leasefile
before proceding. If either file doesn't show in 5 seconds, log a warning
and return empty lease results {}.
LP: #1735331
|
|
dhclient runs, obtains a address and then backgrounds itself.
cloud-init did not take care to kill it after it was done with it.
After it has run and created the leases, we can kill it.
LP: #1732964
|
|
Systems that used systemd-networkd's dhcp client would not be able to get
information on the Azure endpoint (placed in Option 245) or the CloudStack
server (in 'server_address').
The change here supports reading these files in /run/systemd/netif/leases.
The files declare that "This is private data. Do not parse.", but at this
point we do not have another option.
LP: #1718029
|
|
The package cloudinit was sparsely added to only the makefile's unittest
target and tox's py3 target. This branch adds cloudinit package to 'make
unittest3' and all tox environments. It tweaks one cloudinit unit test to
use mocked_object.call_count instead of mocked_object.assert_called_once
which is not defined in some python unittest versions.
|
|
/run/cloud-init/tmp is on a filesystem mounted noexec, so running
dchlient in Ec2Local during discovery breaks with 'Permission denied'.
This branch allows us to run from a different tmp dir so we have exec
rights.
LP: #1717627
|
|
This moves the base test case classes into into cloudinit/tests and
updates all the corresponding imports.
|
|
This branch is a prerequisite for IPv6 support in AWS by allowing Ec2
datasource to query the metadata source version 2016-09-02 about whether
or not it needs to configure IPv6 on interfaces. If version 2016-09-02
is not present, fallback to the min_metadata_version of 2009-04-04. The
DataSourceEc2Local not run on FreeBSD because dhclient in doesn't
support the -sf flag allowing us to run dhclient without filesystem
side-effects.
To query AWS' metadata address @ 169.254.169.254, the instance must have
a dhcp-allocated address configured. Configuring IPv4 link-local
addresses result in timeouts from the metadata service. We introduced a
DataSourceEc2Local subclass which will perform a sandboxed dhclient
discovery which obtains an authorized IP address on eth0 and crawl
metadata about full instance network configuration.
Since ec2 IPv6 metadata is not sufficient in itself to tell us all the
ipv6 knownledge we need, it only be used as a boolean to tell us which
nics need IPv6. Cloud-init will then configure desired interfaces to
DHCPv6 versus DHCPv4.
Performance side note: Shifting the dhcp work into init-local for Ec2
actually gets us 1 second faster deployments by skipping init-network
phase of alternate datasource checks because Ec2Local is configured in
an ealier boot stage. In 3 test runs prior to this change: cloud-init
runs were 5.5 seconds, with the change we now average 4.6 seconds.
This efficiency could be even further improved if we avoiding dhcp
discovery in order to talk to the metadata service from an AWS
authorized dhcp address if there were some way to advertize the dhcp
configuration via DMI/SMBIOS or system environment variables.
Inspecting time costs of the dhclient setup/teardown in 3 live runs the
time cost for the dhcp setup round trip on AWS is:
test 1: 76 milliseconds
dhcp discovery + metadata: 0.347 seconds
metadata alone: 0.271 seconds
test 2: 88 milliseconds
dhcp discovery + metadata: 0.388 seconds
metadata alone: 0.300 seconds
test 3: 75 milliseconds
dhcp discovery + metadata: 0.366 seconds
metadata alone: 0.291 seconds
LP: #1709772
|
|
This is not yet called, but will be called in a subsequent Ec2-related branch to manually initialize a network interface with the responses using dhcp discovery without any dhcp-script side-effects. The functionality has been tested on Ec2 ubuntu and CentOS vms to ensure that network interface initialization works in both OS-types.
Since there was poor unit test coverage for the cloudinit.net.__init__ module, this branch adds a bunch of coverage to the functions in cloudinit.net.__init. We can also now have unit tests local to the cloudinit modules. The benefits of having unittests under cloudinit module:
- Proximity of unittest to cloudinit module makes it easier for ongoing devs to know where to augment unit tests. The tests.unittest directory is organizated such that it
- Allows for 1 to 1 name mapping module -> tests/test_module.py
- Improved test and module isolation, if we find unit tests have to import from a number of modules besides the module under test, it will better prompt resturcturing of the module.
This also branch touches:
- tox.ini to run unit tests found in cloudinit as well as include all test-requirements for pylint since we now have unit tests living within cloudinit package
- setup.py to exclude any test modules under cloudinit when packaging
|