summaryrefslogtreecommitdiff
path: root/cloudinit/sources/helpers/azure.py
AgeCommit message (Collapse)Author
2021-03-25Azure helper: Ensure Azure http handler sleeps between retries (#842)Johnson Shi
Ensure that the Azure helper's http handler sleeps a fixed duration between retry failure attempts. The http handler will sleep a fixed duration between failed attempts regardless of whether the attempt failed due to (1) request timing out or (2) instant failure (no timeout). Due to certain platform issues, the http request to the Azure endpoint may instantly fail without reaching the http timeout duration. Without sleeping a fixed duration in between retry attempts, the http handler will loop through the max retry attempts quickly. This causes the communication between cloud-init and the Azure platform to be less resilient due to the short total duration if there is no sleep in between retries.
2020-11-19DataSourceAzure: push dmesg log to KVP (#670)Anh Vo
Pushing dmesg log to KVP to help troubleshoot VM boot issues
2020-11-18Azure helper: Increase Azure Endpoint HTTP retries (#619)Johnson Shi
Increase Azure Endpoint HTTP retries to handle occasional platform network blips. Introduce a common method http_with_retries in the azure.py helper, which will serve as the common HTTP request handler for all HTTP requests with the Azure endpoint. This method has builtin retries and reporting diagnostics logic.
2020-11-18DataSourceAzure: send failure signal on Azure datasource failure (#594)Johnson Shi
On systems where the Azure datasource is a viable platform for crawling metadata, cloud-init occasionally encounters fatal irrecoverable errors during the crawling of the Azure datasource. When this happens, cloud-init crashes, and Azure VM provisioning would fail. However, instead of failing immediately, the user will continue seeing provisioning for a long time until it times out with "OS Provisioning Timed Out" message. In these situations, cloud-init should report failure to the Azure datasource endpoint indicating provisioning failure. The user will immediately see provisioning terminate, giving them a much better failure experience instead of pointlessly waiting for OS provisioning timeout.
2020-11-04azure: enable pushing the log to KVP from the last pushed byte (#614)Moustafa Moustafa
This allows the cloud-init log to be pushed multiple times during boot, with the latest lines being pushed each time.
2020-10-15azure: clean up and refactor report_diagnostic_event (#563)Johnson Shi
This moves logging into `report_diagnostic_event`, to clean up its callsites.
2020-09-10Retrieve SSH keys from IMDS first with OVF as a fallback (#509)Thomas Stringer
* pull ssh keys from imds first and fall back to ovf if unavailable * refactor log and diagnostic messages * refactor the OpenSSLManager instantiation and certificate usage * fix unit test where exception was being silenced for generate cert * fix tests now that certificate is not always generated * add documentation for ssh key retrieval * add ability to check if http client has security enabled * refactor certificate logic to GoalState
2020-08-28Add method type hints for Azure helper (#540)Johnson Shi
This reverts commit 8d25d5e6fac39ab3319ec5d37d23196429fb0c95.
2020-08-25tox: bump the pylint version to 2.6.0 in the default run (#544)Paride Legovini
Changes: tox: bump the pylint version to 2.6.0 in the default run Fix pylint 2.6.0 W0707 warnings (raise-missing-from)
2020-08-20Pushing cloud-init log to the KVP (#529)Moustafa Moustafa
Push the cloud-init.log file (Up to 500KB at once) to the KVP before reporting ready to the Azure platform. Based on the analysis done on a large sample of cloud-init.log files, Here's the statistics collected on the log file size: P50 P90 P95 P99 P99.9 P99.99 137K 423K 537K 3.5MB 6MB 16MB This change limits the size of cloud-init.log file data that gets dumped to KVP to 500KB. So for ~95% of the cases, the whole log file will be dumped and for the remaining ~5%, we will get the last 500KB of the cloud-init.log file. To asses the performance of the 500KB limit, 250 VM were deployed with a 500KB cloud-init.log file and the time taken to compress, encode and dump the entries to KVP was measured. Here's the time in milliseconds percentiles: P50 P99 P999 75.705 232.701 1169.636 Another 250 VMs were deployed with this logic dumping their normal cloud-init.log file to KVP, the same timing was measured as above. Here's the time in milliseconds percentiles: P50 P99 P999 1.88 5.277 6.992 Added excluded_handlers to the report_event function to be able to opt-out from reporting the events of the compressed cloud-init.log file to the cloud-init.log file. The KVP break_down logic had a bug, where it will reuse the same key for all the split chunks of KVP which results in overwriting the split KVPs by the last one when consumed by Hyper-V. I added the split chunk index as a differentiator to the KVP key. The Hyper-V consumes the KVPs from the KVP file as chunks whose key is 512KB and value is 2048KB but the Azure platform expects the value to be 1024KB, thus I introduced the Azure value limit.
2020-08-13Refactor Azure report ready code (#468)Johnson Shi
This PR refactors Azure report ready code to include more robust tests and telemetry.
2020-06-19printing the error stream of the dhclient process before killing it (#369)Moustafa Moustafa
This introduces a way to log the dhclient error stream, and uses it for the Azure datasource (where we have a specific requirement for this data to be logged).
2020-06-08Move subp into its own module. (#416)Scott Moser
This was painful, but it finishes a TODO from cloudinit/subp.py. It moves the following from util to subp: ProcessExecutionError subp which target_path I moved subp_blob_in_tempfile into cc_chef, which is its only caller. That saved us from having to deal with it using write_file and temp_utils from subp (which does not import any cloudinit things now). It is arguable that 'target_path' could be moved to a 'path_utils' or something, but in order to use it from subp and also from utils, we had to get it out of utils.
2020-06-02test: fix all flake8 E121 and E123 errors (#404)Joshua Powers
This fixes issues with closing brackets not matching the opening bracket's line and continuation line under-idented for hanging indent.
2019-12-12azure: avoid re-running cloud-init when instance-id is byte-swapped (#84)AOhassan
Azure stores the instance ID with an incorrect byte ordering for the first three hyphen delimited parts. This results in invalid is_new_instance checks forcing Azure datasource to recrawl the metadata service. When persisting instance-id from the metadata service, swap the instance-id string byte order such that it is consistent with that returned by dmi information. Check whether the instance-id string is a byte-swapped match when determining correctly whether the Azure platform instance-id has actually changed.
2019-12-02url_helper: read_file_or_url should pass headers param into readurl (#66)Chad Smith
Headers param was accidentally omitted and no longer passed through to readurl due to a previous commit. To avoid this omission of params in the future, drop positional param definitions from read_file_or_url and pass all kwargs through to readurl when we are not operating on a file. In util:read_seeded, correct the case where invalid positional param file_retries was being passed into read_file_or_url. Also drop duplicated file:// prefix addition from read_seeded because read_file_or_url does that work anyway. LP: #1854084
2019-08-14Azure: Record boot timestamps, system information, and diagnostic eventsAnh Vo
Collect and record the following information through KVP:  + timestamps related to kernel initialization and systemd activation    of cloud-init services  + system information including cloud-init version, kernel version,    distro version, and python version  + diagnostic events for the most common provisioning error issues    such as empty dhcp lease, corrupted ovf-env.xml, etc. + increasing the log frequency of polling IMDS during reprovision.
2019-05-10Azure: Return static fallback address as if failed to find endpointJason Zions (MSFT)
The Azure data source helper attempts to use information in the dhcp lease to find the Wireserver endpoint (IP address). Under some unusual circumstances, those attempts will fail. This change uses a static address, known to be always correct in the Azure public and sovereign clouds, when the helper fails to locate a valid dhcp lease. This address is not guaranteed to be correct in Azure Stack environments; it's still best to use the information from the lease whenever possible.
2019-04-03DatasourceAzure: add additional logging for azure datasourceAnh Vo
Create an Azure logging decorator and use additional ReportEventStack context managers to provide additional logging details.
2019-02-22azure: Filter list of ssh keys pulled from fabricJason Zions (MSFT)
The Azure data source is expected to expose a list of ssh keys for the user-to-be-provisioned in the crawled metadata. When configured to use the __builtin__ agent this list is built by the WALinuxAgentShim. The shim retrieves the full set of certificates and public keys exposed to the VM from the wireserver, extracts any ssh keys it can, and returns that list. This fix reduces that list of ssh keys to just the ones whose fingerprints appear in the "administrative user" section of the ovf-env.xml file. The Azure control plane exposes other ssh keys to the VM for other reasons, but those should not be added to the authorized_keys file for the provisioned user.
2018-05-17read_file_or_url: move to url_helper, fix bug in its FileResponse.Scott Moser
The result of a read_file_or_url on a file and on a url would differ in behavior. str(UrlResponse) would return UrlResponse.contents.decode('utf-8') while str(FileResponse) would return str(FileResponse.contents) The difference being "b'foo'" versus "foo". As part of the general goal of cleaning util, move read_file_or_url into url_helper.
2018-01-24Azure VM Preprovisioning support.Douglas Jordan
This change will enable azure vms to report provisioning has completed twice, first to tell the fabric it has completed then a second time to enable customer settings. The datasource for the second provisioning is the Instance Metadata Service (IMDS),and the VM will poll indefinitely for the new ovf-env.xml from IMDS. This branch introduces EphemeralDHCPv4 which encapsulates common logic used by both DataSourceEc2 an DataSourceAzure for temporary DHCP interactions without side-effects. LP: #1734991
2017-12-15lint: Fix lints seen by pylint version 1.8.1.Chad Smith
This branch resolves lints seen by pylint revision 1.8.1 and updates our pinned tox pylint dependency used by our tox pylint target.
2017-10-03Azure, CloudStack: Support reading dhcp options from systemd-networkd.Dimitri John Ledkov
Systems that used systemd-networkd's dhcp client would not be able to get information on the Azure endpoint (placed in Option 245) or the CloudStack server (in 'server_address'). The change here supports reading these files in /run/systemd/netif/leases. The files declare that "This is private data. Do not parse.", but at this point we do not have another option. LP: #1718029
2017-09-07Use /run/cloud-init for tempfile operations.Scott Moser
During boot, the usage of /tmp is not safe. In systemd systems, systemd-tmpfiles-clean may run at any point and clear out a temp file while cloud-init is using it. The solution here is to use /run/cloud-init/tmp. LP: #1707222
2017-05-10FreeBSD: improvements and fixes for use on AzureHongjiang Zhang
This patch targets to make FreeBSD 10.3 or 11 work on Azure. The modifications abide by the rule of: * making as less modification as possible * delegate to the distro or datasource where possible. The main modifications are: 1. network configuration improvements, and movement into distro path. 2. Fix setting of password. Password setting through "pw" can only work through pipe. 3. Add 'root:wheel' to syslog_fix_perms field. 4. Support resizing default file system (ufs) 5. copy cloud.cfg for freebsd to /etc/cloud/cloud.cfg rather than /usr/local/etc/cloud/cloud.cfg. 6. Azure specific changes: a. When reading the azure endpoint, search in a different path and read a different option name (option-245 vs. unknown-245). so, the lease file path should be generated according to platform. b. adjust the handling of ephemeral mounts for ufs filesystem and for finding the ephemeral device. c. fix mounting of cdrom LP: #1636345
2017-04-21pylint: fix all logging warningsJoshua Powers
This will change all instances of LOG.warn to LOG.warning as warn is now a deprecated method. It will also make sure any logging uses lazy logging by passing string format arguments as function parameters.
2016-12-22LICENSE: Allow dual licensing GPL-3 or Apache 2.0Jon Grimm
This has been a recurring ask and we had initially just made the change to the cloud-init 2.0 codebase. As the current thinking is we'll just continue to enhance the current codebase, its desirable to relicense to match what we'd intended as part of the 2.0 plan here. - put a brief description of license in LICENSE file - put full license versions in LICENSE-GPLv3 and LICENSE-Apache2.0 - simplify the per-file header to reference LICENSE - tox: ignore H102 (Apache License Header check) Add license header to files that ship. Reformat headers, make sure everything has vi: at end of file. Non-shipping files do not need the copyright header, but at the moment tests/ have it.
2016-10-19Fix python2.6 things found running in centos 6.Scott Moser
This gets the tests running in centos 6. * ProcessExecutionError: remove setting of .message Nothing in cloud-init seems to use .message anywhere, so it does not seem necessary. The reason to change it is that on 2.6 it spits out: cloudinit/util.py:286: DeprecationWarning: BaseException.message * tox.ini: add a centos6 environment the tox versions listed here replicate a centos6 install with packages from EPEL. You will still need a python2.6 to run this env so we do not enable it by default.
2016-08-22azure dhclient-hook cleanupsScott Moser
This adds some function to the generator to maintain the presense of a flag file '/run/cloud-init/enabled' indicating that cloud-init is enabled. Then, only run the dhclient hooks if on Azure and cloud-init is enabled. The test for is_azure currently only checks to see that the board vendor is Microsoft, not actually that we are on azure. Running should not be harmful anywhere, other than slowing down dhclient. The value of this additional code is that then dhclient having run does not task the system with the load of cloud-init. Additionally, some changes to config are done here. * rename 'dhclient_leases' to 'dhclient_lease_file' * move that to the datasource config (datasource/Azure/dhclient_lease_file) Also, it removes the config in config/cloud.cfg that set agent_command to __builtin__. This means that by default cloud-init still needs the agent installed. The suggested follow-on improvement is to use __builtin__ if there is no walinux-agent installed.
2016-08-15Get Azure endpoint server from DHCP clientBrent Baude
It is more efficient and cross-distribution safe to use the hooks function from dhclient to obtain the Azure endpoint server (DHCP option 245). This is done by providing shell scritps that are called by the hooks infrastructure of both dhclient and NetworkManager. The hooks then invoke 'cloud-init dhclient-hook' that maintains json data with the dhclient options in /run/cloud-init/dhclient.hooks/<interface>.json . The azure helper then pulls the value from /run/cloud-init/dhclient.hooks/<interface>.json file(s). If that file does not exist or the value is not present, it will then fall back to the original method of scraping the dhcp client lease file.
2016-05-12fix last flake8 errorScott Moser
2016-05-12Fix up a ton of flake8 issuesJoshua Harlow
2016-02-16Handle escaped quotes in WALinuxAgentShim.find_endpointScott Moser
LP: #1488891
2015-10-30Use DMI data to find Azure instance IDs.Daniel Watkins
This replaces the use of SharedConfig.xml in both the walinuxagent case, and the case where we communicate with the Azure fabric ourselves.
2015-10-09Refactor WALinuxAgentShim.find_endpoint to use a helper method for IP ↵Daniel Watkins
address unpacking.
2015-10-09Handle colons in packed strings in WALinuxAgentShim.find_endpoint.Daniel Watkins
This fixes bug 1488896.
2015-10-09Handle escaped quotes in WALinuxAgentShim.find_endpoint.Daniel Watkins
This fixes bug 1488891.
2015-05-08Python 2.6 fixes.Daniel Watkins
2015-05-08Fix retrying.Daniel Watkins
2015-05-08Move our walinuxagent implementation to a single function call.Daniel Watkins
2015-05-08Stop using Python 3 only tempfile.TemporaryDirectory (but lose free cleanup).Daniel Watkins
2015-05-08Split WALinuxAgentShim code out to separate file.Daniel Watkins