author | Chad Smith <chad.smith@canonical.com> | 2017-08-21 13:46:23 -0600 |
---|---|---|
committer | Chad Smith <chad.smith@canonical.com> | 2017-08-21 13:46:23 -0600 |
commit | e74d7752f1761c3a8d3c19877de4707d00c49d08 (patch) | |
tree | 0627dd5d06d83b770b38196bceabddd06c0b1b77 /doc | |
parent | dc2bd79949492bccdc1d7df0132f98c354d51943 (diff) | |
tools: Add tooling for basic cloud-init performance analysis.

This branch adds cloudinit-analyze into cloud-init proper. It adds an
"analyze" subcommand to the cloud-init command line utility for quick
performance assessment of cloud-init stages and events.

On a cloud-init configured instance, running "cloud-init analyze blame"
will now report which cloud-init events cost the most wall time. This
allows for quick assessment of the most costly stages of cloud-init.

This functionality is pulled from Ryan Harper's analyze work.

The cloudinit-analyze main script itself has been refactored a bit for
inclusion as a subcommand of cloud-init CLI. There will be a followup
branch at some point which will optionally instrument detailed strace
profiling, but that approach needs a bit more discussion first.

This branch also adds:

* additional debugging topic to the sphinx-generated docs describing
  cloud-init analyze, dump and show as well as cloud-init single usage.
* Updates the Makefile unittests target to include cloudinit directory
  because we now have unittests within that package.

LP: #1709761

Diffstat (limited to 'doc')
-rw-r--r-- | doc/rtd/index.rst | 1 |
-rw-r--r-- | doc/rtd/topics/debugging.rst | 146 |
2 files changed, 147 insertions, 0 deletions
diff --git a/doc/rtd/index.rst b/doc/rtd/index.rst
index a691103e..de67f361 100644
--- a/doc/rtd/index.rst
+++ b/doc/rtd/index.rst
@@ -40,6 +40,7 @@ initialization of a cloud instance.
    topics/merging.rst
    topics/network-config.rst
    topics/vendordata.rst
+   topics/debugging.rst
    topics/moreinfo.rst
    topics/hacking.rst
    topics/tests.rst
diff --git a/doc/rtd/topics/debugging.rst b/doc/rtd/topics/debugging.rst
new file mode 100644
index 00000000..4e43dd57
--- /dev/null
+++ b/doc/rtd/topics/debugging.rst
@@ -0,0 +1,146 @@
+**********************
+Testing and debugging cloud-init
+**********************
+
+Overview
+========
+This topic will discuss general approaches for test and debug of cloud-init on
+deployed instances.
+
+
+Boot Time Analysis - cloud-init analyze
+======================================
+Occasionally instances don't appear as performant as we would like and
+cloud-init packages a simple facility to inspect what operations took
+cloud-init the longest during boot and setup.
+
+The script **/usr/bin/cloud-init** has an analyze sub-command **analyze**
+which parses any cloud-init.log file into formatted and sorted events. It
+allows for detailed analysis of the most costly cloud-init operations are to
+determine the long-pole in cloud-init configuration and setup. These
+subcommands default to reading /var/log/cloud-init.log.
+
+* ``analyze show`` Parse and organize cloud-init.log events by stage and
+include each sub-stage granularity with time delta reports.
+
+.. code-block:: bash
+
+    $ cloud-init analyze show -i my-cloud-init.log
+    -- Boot Record 01 --
+    The total time elapsed since completing an event is printed after the "@"
+    character.
+    The time the event takes is printed after the "+" character.
+
+    Starting stage: modules-config
+    |`->config-emit_upstart ran successfully @05.47600s +00.00100s
+    |`->config-snap_config ran successfully @05.47700s +00.00100s
+    |`->config-ssh-import-id ran successfully @05.47800s +00.00200s
+    |`->config-locale ran successfully @05.48000s +00.00100s
+    ...
+
+
+* ``analyze dump`` Parse cloud-init.log into event records and return a list of
+dictionaries that can be consumed for other reporting needs.
+
+.. code-block:: bash
+
+    $ cloud-init analyze blame -i my-cloud-init.log
+    [
+     {
+      "description": "running config modules",
+      "event_type": "start",
+      "name": "modules-config",
+      "origin": "cloudinit",
+      "timestamp": 1510807493.0
+     },...
+
+* ``analyze blame`` Parse cloud-init.log into event records and sort them based
+on highest time cost for quick assessment of areas of cloud-init that may need
+improvement.
+
+.. code-block:: bash
+
+    $ cloud-init analyze blame -i my-cloud-init.log
+    -- Boot Record 11 --
+         00.01300s (modules-final/config-scripts-per-boot)
+         00.00400s (modules-final/config-final-message)
+         00.00100s (modules-final/config-rightscale_userdata)
+         ...
+
+
+Analyze quickstart - LXC
+---------------------------
+To quickly obtain a cloud-init log try using lxc on any ubuntu system:
+
+.. code-block:: bash
+
+    $ lxc init ubuntu-daily:xenial x1
+    $ lxc start x1
+    # Take lxc's cloud-init.log and pipe it to the analyzer
+    $ lxc file pull x1/var/log/cloud-init.log - | cloud-init analyze dump -i -
+    $ lxc file pull x1/var/log/cloud-init.log - | \
+        python3 -m cloudinit.analyze dump -i -
+
+Analyze quickstart - KVM
+---------------------------
+To quickly analyze a KVM a cloud-init log:
+
+1. Download the current cloud image
+   wget https://cloud-images.ubuntu.com/daily/server/xenial/current/xenial-server-cloudimg-amd64.img
+2. Create a snapshot image to preserve the original cloud-image
+
+.. code-block:: bash
+
+    $ qemu-img create -b xenial-server-cloudimg-amd64.img -f qcow2 \
+        test-cloudinit.qcow2
+
+3. Create a seed image with metadata using `cloud-localds`
+
+.. code-block:: bash
+
+    $ cat > user-data <<EOF
+    #cloud-config
+    password: passw0rd
+    chpasswd: { expire: False }
+    EOF
+    $ cloud-localds my-seed.img user-data
+
+4. Launch your modified VM
+
+.. code-block:: bash
+
+    $ kvm -m 512 -net nic -net user -redir tcp:2222::22 \
+        -drive file=test-cloudinit.qcow2,if=virtio,format=qcow2 \
+        -drive file=my-seed.img,if=virtio,format=raw
+
+5. Analyze the boot (blame, dump, show)
+
+.. code-block:: bash
+
+    $ ssh -p 2222 ubuntu@localhost 'cat /var/log/cloud-init.log' | \
+        cloud-init analyze blame -i -
+
+
+Running single cloud config modules
+===================================
+This subcommand is not called by the init system. It can be called manually to
+load the configured datasource and run a single cloud-config module once using
+the cached userdata and metadata after the instance has booted. Each
+cloud-config module has a module FREQUENCY configured: PER_INSTANCE, PER_BOOT,
+PER_ONCE or PER_ALWAYS. When a module is run by cloud-init, it stores a
+semaphore file in
+``/var/lib/cloud/instance/sem/config_<module_name>.<frequency>`` which marks
+when the module last successfully ran. Presence of this semaphore file
+prevents a module from running again if it has already been run. To ensure that
+a module is run again, the desired frequency can be overridden on the
+commandline:
+
+.. code-block:: bash
+
+    $ sudo cloud-init single --name cc_ssh --frequency always
+    ...
+    Generating public/private ed25519 key pair
+    ...
+
+Inspect cloud-init.log for output of what operations were performed as a
+result.
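
The debugging topic added in this commit says that ``analyze dump`` returns a list of dictionaries that can be consumed for other reporting needs. As a rough illustration of what such a consumer might look like, the Python sketch below pairs "start" records with matching "finish" records by name and sorts the resulting durations, loosely mimicking the ``analyze blame`` view. This is not cloud-init's own implementation. The helper name ``blame_from_dump``, the sample records, and the assumption that completed events carry ``"event_type": "finish"`` are all illustrative; only the ``"start"`` record shape appears in the documentation above.

```python
import json
from typing import Dict, List, Tuple


def blame_from_dump(events: List[dict]) -> List[Tuple[float, str]]:
    """Pair start/finish records by name; return (seconds, name), slowest first.

    Assumes each record has "name", "event_type" and "timestamp" keys and
    that a finished event is reported with event_type "finish" (assumption).
    """
    started: Dict[str, float] = {}
    durations: List[Tuple[float, str]] = []
    for event in events:
        name = event["name"]
        if event["event_type"] == "start":
            started[name] = event["timestamp"]
        elif event["event_type"] == "finish" and name in started:
            durations.append((event["timestamp"] - started.pop(name), name))
    return sorted(durations, reverse=True)


# Hypothetical sample shaped like the dump record shown in the docs;
# the timestamps and the "finish" records are made up for illustration.
SAMPLE = json.loads("""
[
 {"name": "modules-config", "event_type": "start",
  "origin": "cloudinit", "timestamp": 1510807493.0},
 {"name": "config-locale", "event_type": "start",
  "origin": "cloudinit", "timestamp": 1510807493.25},
 {"name": "config-locale", "event_type": "finish",
  "origin": "cloudinit", "timestamp": 1510807493.75},
 {"name": "modules-config", "event_type": "finish",
  "origin": "cloudinit", "timestamp": 1510807495.0}
]
""")

if __name__ == "__main__":
    for seconds, name in blame_from_dump(SAMPLE):
        print(f"{seconds:08.5f}s ({name})")
```

In practice the list passed to the helper would come from ``json.load`` on the saved output of ``cloud-init analyze dump``, provided the real records match the assumed shape.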