From 385d1cae1023ed89c3830a148aea02807240a07d Mon Sep 17 00:00:00 2001
From: Ryan Harper
Date: Mon, 14 Aug 2017 11:40:54 -0500
Subject: doc: update capabilities with features available, link doc
 reference, cli example

---
 doc/rtd/topics/capabilities.rst | 50 ++++++++++++++++++++++++++++++++---------
 1 file changed, 40 insertions(+), 10 deletions(-)

(limited to 'doc')

diff --git a/doc/rtd/topics/capabilities.rst b/doc/rtd/topics/capabilities.rst
index 2c8770bd..b8034b07 100644
--- a/doc/rtd/topics/capabilities.rst
+++ b/doc/rtd/topics/capabilities.rst
@@ -31,19 +31,49 @@ support. This allows other applications to detect what features the installed
 cloud-init supports without having to parse its version number. If present,
 this list of features will be located at ``cloudinit.version.FEATURES``.
 
-When checking if cloud-init supports a feature, in order to not break the
-detection script on older versions of cloud-init without the features list, a
-script similar to the following should be used. Note that this will exit 0 if
-the feature is supported and 1 otherwise::
+Currently defined feature names include:
 
-    import sys
-    from cloudinit import version
-    sys.exit('<FEATURE_NAME>' not in getattr(version, 'FEATURES', []))
+ - ``NETWORK_CONFIG_V1`` support for v1 networking configuration,
+   see :ref:`network_config_v1` documentation for examples.
+ - ``NETWORK_CONFIG_V2`` support for v2 networking configuration,
+   see :ref:`network_config_v2` documentation for examples.
 
-Currently defined feature names include:
- - ``NETWORK_CONFIG_V1`` support for v1 networking configuration, see curtin
-   documentation for examples.
+
+CLI Interface:
+
+``cloud-init features`` will print out each feature supported. If cloud-init
+does not have the features subcommand, it also does not support any features
+described in this document.
+
+.. code-block:: bash
+
+  % cloud-init --help
+  usage: cloud-init [-h] [--version] [--file FILES] [--debug] [--force]
+                    {init,modules,query,single,dhclient-hook,features} ...
+
+  positional arguments:
+    {init,modules,query,single,dhclient-hook,features}
+      init                initializes cloud-init and performs initial modules
+      modules             activates modules using a given configuration key
+      query               query information stored in cloud-init
+      single              run a single module
+      dhclient-hook       run the dhclient hook to record network info
+      features            list defined features
+
+  optional arguments:
+    -h, --help            show this help message and exit
+    --version, -v         show program's version number and exit
+    --file FILES, -f FILES
+                          additional yaml configuration files to use
+    --debug, -d           show additional pre-action logging (default: False)
+    --force               force running even if no datasource is found (use
+                          at your own risk)
+
+
+  % cloud-init features
+  NETWORK_CONFIG_V1
+  NETWORK_CONFIG_V2
+
 
 .. _Cloud-init: https://launchpad.net/cloud-init
 .. vi: textwidth=78
-- 
cgit v1.2.3


From e74d7752f1761c3a8d3c19877de4707d00c49d08 Mon Sep 17 00:00:00 2001
From: Chad Smith
Date: Mon, 21 Aug 2017 13:46:23 -0600
Subject: tools: Add tooling for basic cloud-init performance analysis.

This branch adds cloudinit-analyze into cloud-init proper. It adds an
"analyze" subcommand to the cloud-init command line utility for quick
performance assessment of cloud-init stages and events. On a cloud-init
configured instance, running "cloud-init analyze blame" will now report
which cloud-init events cost the most wall time. This allows for quick
assessment of the most costly stages of cloud-init. This functionality is
pulled from Ryan Harper's analyze work.
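For example, to see which events took longest on a booted instance (an
illustrative run; record names and timings will vary by instance, following
the blame record format shown in the added doc/rtd/topics/debugging.rst):

    $ cloud-init analyze blame -i /var/log/cloud-init.log
    -- Boot Record 01 --
         00.01300s (modules-final/config-scripts-per-boot)
         00.00400s (modules-final/config-final-message)
         ...
    1 boot records analyzed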
The cloudinit-analyze main script itself has been refactored a bit for inclusion as a subcommand of cloud-init CLI. There will be a followup branch at some point which will optionally instrument detailed strace profiling, but that approach needs a bit more discussion first. This branch also adds: * additional debugging topic to the sphinx-generated docs describing cloud-init analyze, dump and show as well as cloud-init single usage. * Updates the Makefile unittests target to include cloudinit directory because we now have unittests within that package. LP: #1709761 --- Makefile | 2 +- cloudinit/analyze/__init__.py | 0 cloudinit/analyze/__main__.py | 155 ++++++++++++++++++++++++++ cloudinit/analyze/dump.py | 176 +++++++++++++++++++++++++++++ cloudinit/analyze/show.py | 207 ++++++++++++++++++++++++++++++++++ cloudinit/analyze/tests/test_dump.py | 210 +++++++++++++++++++++++++++++++++++ cloudinit/cmd/main.py | 44 +++----- doc/rtd/index.rst | 1 + doc/rtd/topics/debugging.rst | 146 ++++++++++++++++++++++++ tests/unittests/test_cli.py | 87 ++++++++++++++- 10 files changed, 995 insertions(+), 33 deletions(-) create mode 100644 cloudinit/analyze/__init__.py create mode 100644 cloudinit/analyze/__main__.py create mode 100644 cloudinit/analyze/dump.py create mode 100644 cloudinit/analyze/show.py create mode 100644 cloudinit/analyze/tests/test_dump.py create mode 100644 doc/rtd/topics/debugging.rst (limited to 'doc') diff --git a/Makefile b/Makefile index f280911f..9e7f4ee7 100644 --- a/Makefile +++ b/Makefile @@ -48,7 +48,7 @@ pyflakes3: @$(CWD)/tools/run-pyflakes3 unittest: clean_pyc - nosetests $(noseopts) tests/unittests + nosetests $(noseopts) tests/unittests cloudinit unittest3: clean_pyc nosetests3 $(noseopts) tests/unittests diff --git a/cloudinit/analyze/__init__.py b/cloudinit/analyze/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/cloudinit/analyze/__main__.py b/cloudinit/analyze/__main__.py new file mode 100644 index 00000000..71cba4f2 --- /dev/null +++ b/cloudinit/analyze/__main__.py @@ -0,0 +1,155 @@ +# Copyright (C) 2017 Canonical Ltd. +# +# This file is part of cloud-init. See LICENSE file for license information. + +import argparse +import re +import sys + +from . import dump +from . import show + + +def get_parser(parser=None): + if not parser: + parser = argparse.ArgumentParser( + prog='cloudinit-analyze', + description='Devel tool: Analyze cloud-init logs and data') + subparsers = parser.add_subparsers(title='Subcommands', dest='subcommand') + subparsers.required = True + + parser_blame = subparsers.add_parser( + 'blame', help='Print list of executed stages ordered by time to init') + parser_blame.add_argument( + '-i', '--infile', action='store', dest='infile', + default='/var/log/cloud-init.log', + help='specify where to read input.') + parser_blame.add_argument( + '-o', '--outfile', action='store', dest='outfile', default='-', + help='specify where to write output. 
') + parser_blame.set_defaults(action=('blame', analyze_blame)) + + parser_show = subparsers.add_parser( + 'show', help='Print list of in-order events during execution') + parser_show.add_argument('-f', '--format', action='store', + dest='print_format', default='%I%D @%Es +%ds', + help='specify formatting of output.') + parser_show.add_argument('-i', '--infile', action='store', + dest='infile', default='/var/log/cloud-init.log', + help='specify where to read input.') + parser_show.add_argument('-o', '--outfile', action='store', + dest='outfile', default='-', + help='specify where to write output.') + parser_show.set_defaults(action=('show', analyze_show)) + parser_dump = subparsers.add_parser( + 'dump', help='Dump cloud-init events in JSON format') + parser_dump.add_argument('-i', '--infile', action='store', + dest='infile', default='/var/log/cloud-init.log', + help='specify where to read input. ') + parser_dump.add_argument('-o', '--outfile', action='store', + dest='outfile', default='-', + help='specify where to write output. ') + parser_dump.set_defaults(action=('dump', analyze_dump)) + return parser + + +def analyze_blame(name, args): + """Report a list of records sorted by largest time delta. + + For example: + 30.210s (init-local) searching for datasource + 8.706s (init-network) reading and applying user-data + 166ms (modules-config) .... + 807us (modules-final) ... + + We generate event records parsing cloud-init logs, formatting the output + and sorting by record data ('delta') + """ + (infh, outfh) = configure_io(args) + blame_format = ' %ds (%n)' + r = re.compile('(^\s+\d+\.\d+)', re.MULTILINE) + for idx, record in enumerate(show.show_events(_get_events(infh), + blame_format)): + srecs = sorted(filter(r.match, record), reverse=True) + outfh.write('-- Boot Record %02d --\n' % (idx + 1)) + outfh.write('\n'.join(srecs) + '\n') + outfh.write('\n') + outfh.write('%d boot records analyzed\n' % (idx + 1)) + + +def analyze_show(name, args): + """Generate output records using the 'standard' format to printing events. + + Example output follows: + Starting stage: (init-local) + ... + Finished stage: (init-local) 0.105195 seconds + + Starting stage: (init-network) + ... + Finished stage: (init-network) 0.339024 seconds + + Starting stage: (modules-config) + ... + Finished stage: (modules-config) 0.NNN seconds + + Starting stage: (modules-final) + ... 
+ Finished stage: (modules-final) 0.NNN seconds + """ + (infh, outfh) = configure_io(args) + for idx, record in enumerate(show.show_events(_get_events(infh), + args.print_format)): + outfh.write('-- Boot Record %02d --\n' % (idx + 1)) + outfh.write('The total time elapsed since completing an event is' + ' printed after the "@" character.\n') + outfh.write('The time the event takes is printed after the "+" ' + 'character.\n\n') + outfh.write('\n'.join(record) + '\n') + outfh.write('%d boot records analyzed\n' % (idx + 1)) + + +def analyze_dump(name, args): + """Dump cloud-init events in json format""" + (infh, outfh) = configure_io(args) + outfh.write(dump.json_dumps(_get_events(infh)) + '\n') + + +def _get_events(infile): + rawdata = None + events, rawdata = show.load_events(infile, None) + if not events: + events, _ = dump.dump_events(rawdata=rawdata) + return events + + +def configure_io(args): + """Common parsing and setup of input/output files""" + if args.infile == '-': + infh = sys.stdin + else: + try: + infh = open(args.infile, 'r') + except (FileNotFoundError, PermissionError): + sys.stderr.write('Cannot open file %s\n' % args.infile) + sys.exit(1) + + if args.outfile == '-': + outfh = sys.stdout + else: + try: + outfh = open(args.outfile, 'w') + except PermissionError: + sys.stderr.write('Cannot open file %s\n' % args.outfile) + sys.exit(1) + + return (infh, outfh) + + +if __name__ == '__main__': + parser = get_parser() + args = parser.parse_args() + (name, action_functor) = args.action + action_functor(name, args) + +# vi: ts=4 expandtab diff --git a/cloudinit/analyze/dump.py b/cloudinit/analyze/dump.py new file mode 100644 index 00000000..ca4da496 --- /dev/null +++ b/cloudinit/analyze/dump.py @@ -0,0 +1,176 @@ +# This file is part of cloud-init. See LICENSE file for license information. + +import calendar +from datetime import datetime +import json +import sys + +from cloudinit import util + +stage_to_description = { + 'finished': 'finished running cloud-init', + 'init-local': 'starting search for local datasources', + 'init-network': 'searching for network datasources', + 'init': 'searching for network datasources', + 'modules-config': 'running config modules', + 'modules-final': 'finalizing modules', + 'modules': 'running modules for', + 'single': 'running single module ', +} + +# logger's asctime format +CLOUD_INIT_ASCTIME_FMT = "%Y-%m-%d %H:%M:%S,%f" + +# journctl -o short-precise +CLOUD_INIT_JOURNALCTL_FMT = "%b %d %H:%M:%S.%f %Y" + +# other +DEFAULT_FMT = "%b %d %H:%M:%S %Y" + + +def parse_timestamp(timestampstr): + # default syslog time does not include the current year + months = [calendar.month_abbr[m] for m in range(1, 13)] + if timestampstr.split()[0] in months: + # Aug 29 22:55:26 + FMT = DEFAULT_FMT + if '.' in timestampstr: + FMT = CLOUD_INIT_JOURNALCTL_FMT + dt = datetime.strptime(timestampstr + " " + + str(datetime.now().year), + FMT) + timestamp = dt.strftime("%s.%f") + elif "," in timestampstr: + # 2016-09-12 14:39:20,839 + dt = datetime.strptime(timestampstr, CLOUD_INIT_ASCTIME_FMT) + timestamp = dt.strftime("%s.%f") + else: + # allow date(1) to handle other formats we don't expect + timestamp = parse_timestamp_from_date(timestampstr) + + return float(timestamp) + + +def parse_timestamp_from_date(timestampstr): + out, _ = util.subp(['date', '+%s.%3N', '-d', timestampstr]) + timestamp = out.strip() + return float(timestamp) + + +def parse_ci_logline(line): + # Stage Starts: + # Cloud-init v. 0.7.7 running 'init-local' at \ + # Fri, 02 Sep 2016 19:28:07 +0000. 
Up 1.0 seconds. + # Cloud-init v. 0.7.7 running 'init' at \ + # Fri, 02 Sep 2016 19:28:08 +0000. Up 2.0 seconds. + # Cloud-init v. 0.7.7 finished at + # Aug 29 22:55:26 test1 [CLOUDINIT] handlers.py[DEBUG]: \ + # finish: modules-final: SUCCESS: running modules for final + # 2016-08-30T21:53:25.972325+00:00 y1 [CLOUDINIT] handlers.py[DEBUG]: \ + # finish: modules-final: SUCCESS: running modules for final + # + # Nov 03 06:51:06.074410 x2 cloud-init[106]: [CLOUDINIT] util.py[DEBUG]: \ + # Cloud-init v. 0.7.8 running 'init-local' at \ + # Thu, 03 Nov 2016 06:51:06 +0000. Up 1.0 seconds. + # + # 2017-05-22 18:02:01,088 - util.py[DEBUG]: Cloud-init v. 0.7.9 running \ + # 'init-local' at Mon, 22 May 2017 18:02:01 +0000. Up 2.0 seconds. + + separators = [' - ', ' [CLOUDINIT] '] + found = False + for sep in separators: + if sep in line: + found = True + break + + if not found: + return None + + (timehost, eventstr) = line.split(sep) + + # journalctl -o short-precise + if timehost.endswith(":"): + timehost = " ".join(timehost.split()[0:-1]) + + if "," in timehost: + timestampstr, extra = timehost.split(",") + timestampstr += ",%s" % extra.split()[0] + if ' ' in extra: + hostname = extra.split()[-1] + else: + hostname = timehost.split()[-1] + timestampstr = timehost.split(hostname)[0].strip() + if 'Cloud-init v.' in eventstr: + event_type = 'start' + if 'running' in eventstr: + stage_and_timestamp = eventstr.split('running')[1].lstrip() + event_name, _ = stage_and_timestamp.split(' at ') + event_name = event_name.replace("'", "").replace(":", "-") + if event_name == "init": + event_name = "init-network" + else: + # don't generate a start for the 'finished at' banner + return None + event_description = stage_to_description[event_name] + else: + (pymodloglvl, event_type, event_name) = eventstr.split()[0:3] + event_description = eventstr.split(event_name)[1].strip() + + event = { + 'name': event_name.rstrip(":"), + 'description': event_description, + 'timestamp': parse_timestamp(timestampstr), + 'origin': 'cloudinit', + 'event_type': event_type.rstrip(":"), + } + if event['event_type'] == "finish": + result = event_description.split(":")[0] + desc = event_description.split(result)[1].lstrip(':').strip() + event['result'] = result + event['description'] = desc.strip() + + return event + + +def json_dumps(data): + return json.dumps(data, indent=1, sort_keys=True, + separators=(',', ': ')) + + +def dump_events(cisource=None, rawdata=None): + events = [] + event = None + CI_EVENT_MATCHES = ['start:', 'finish:', 'Cloud-init v.'] + + if not any([cisource, rawdata]): + raise ValueError('Either cisource or rawdata parameters are required') + + if rawdata: + data = rawdata.splitlines() + else: + data = cisource.readlines() + + for line in data: + for match in CI_EVENT_MATCHES: + if match in line: + try: + event = parse_ci_logline(line) + except ValueError: + sys.stderr.write('Skipping invalid entry\n') + if event: + events.append(event) + + return events, data + + +def main(): + if len(sys.argv) > 1: + cisource = open(sys.argv[1]) + else: + cisource = sys.stdin + + return json_dumps(dump_events(cisource)) + + +if __name__ == "__main__": + print(main()) diff --git a/cloudinit/analyze/show.py b/cloudinit/analyze/show.py new file mode 100644 index 00000000..3b356bb8 --- /dev/null +++ b/cloudinit/analyze/show.py @@ -0,0 +1,207 @@ +# Copyright (C) 2016 Canonical Ltd. +# +# Author: Ryan Harper +# +# This file is part of cloud-init. See LICENSE file for license information. 
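+#
+# This module renders parsed cloud-init boot events (start/finish pairs)
+# into per-boot, human-readable records. It backs the 'cloud-init analyze
+# show' and 'cloud-init analyze blame' subcommands driven from
+# cloudinit/analyze/__main__.py.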
+ +import base64 +import datetime +import json +import os + +from cloudinit import util + +# An event: +''' +{ + "description": "executing late commands", + "event_type": "start", + "level": "INFO", + "name": "cmd-install/stage-late" + "origin": "cloudinit", + "timestamp": 1461164249.1590767, +}, + + { + "description": "executing late commands", + "event_type": "finish", + "level": "INFO", + "name": "cmd-install/stage-late", + "origin": "cloudinit", + "result": "SUCCESS", + "timestamp": 1461164249.1590767 + } + +''' +format_key = { + '%d': 'delta', + '%D': 'description', + '%E': 'elapsed', + '%e': 'event_type', + '%I': 'indent', + '%l': 'level', + '%n': 'name', + '%o': 'origin', + '%r': 'result', + '%t': 'timestamp', + '%T': 'total_time', +} + +formatting_help = " ".join(["{0}: {1}".format(k.replace('%', '%%'), v) + for k, v in format_key.items()]) + + +def format_record(msg, event): + for i, j in format_key.items(): + if i in msg: + # ensure consistent formatting of time values + if j in ['delta', 'elapsed', 'timestamp']: + msg = msg.replace(i, "{%s:08.5f}" % j) + else: + msg = msg.replace(i, "{%s}" % j) + return msg.format(**event) + + +def dump_event_files(event): + content = dict((k, v) for k, v in event.items() if k not in ['content']) + files = content['files'] + saved = [] + for f in files: + fname = f['path'] + fn_local = os.path.basename(fname) + fcontent = base64.b64decode(f['content']).decode('ascii') + util.write_file(fn_local, fcontent) + saved.append(fn_local) + + return saved + + +def event_name(event): + if event: + return event.get('name') + return None + + +def event_type(event): + if event: + return event.get('event_type') + return None + + +def event_parent(event): + if event: + return event_name(event).split("/")[0] + return None + + +def event_timestamp(event): + return float(event.get('timestamp')) + + +def event_datetime(event): + return datetime.datetime.utcfromtimestamp(event_timestamp(event)) + + +def delta_seconds(t1, t2): + return (t2 - t1).total_seconds() + + +def event_duration(start, finish): + return delta_seconds(event_datetime(start), event_datetime(finish)) + + +def event_record(start_time, start, finish): + record = finish.copy() + record.update({ + 'delta': event_duration(start, finish), + 'elapsed': delta_seconds(start_time, event_datetime(start)), + 'indent': '|' + ' ' * (event_name(start).count('/') - 1) + '`->', + }) + + return record + + +def total_time_record(total_time): + return 'Total Time: %3.5f seconds\n' % total_time + + +def generate_records(events, blame_sort=False, + print_format="(%n) %d seconds in %I%D", + dump_files=False, log_datafiles=False): + + sorted_events = sorted(events, key=lambda x: x['timestamp']) + records = [] + start_time = None + total_time = 0.0 + stage_start_time = {} + stages_seen = [] + boot_records = [] + + unprocessed = [] + for e in range(0, len(sorted_events)): + event = events[e] + try: + next_evt = events[e + 1] + except IndexError: + next_evt = None + + if event_type(event) == 'start': + if event.get('name') in stages_seen: + records.append(total_time_record(total_time)) + boot_records.append(records) + records = [] + start_time = None + total_time = 0.0 + + if start_time is None: + stages_seen = [] + start_time = event_datetime(event) + stage_start_time[event_parent(event)] = start_time + + # see if we have a pair + if event_name(event) == event_name(next_evt): + if event_type(next_evt) == 'finish': + records.append(format_record(print_format, + event_record(start_time, + event, + next_evt))) + else: + # 
This is a parent event + records.append("Starting stage: %s" % event.get('name')) + unprocessed.append(event) + stages_seen.append(event.get('name')) + continue + else: + prev_evt = unprocessed.pop() + if event_name(event) == event_name(prev_evt): + record = event_record(start_time, prev_evt, event) + records.append(format_record("Finished stage: " + "(%n) %d seconds ", + record) + "\n") + total_time += record.get('delta') + else: + # not a match, put it back + unprocessed.append(prev_evt) + + records.append(total_time_record(total_time)) + boot_records.append(records) + return boot_records + + +def show_events(events, print_format): + return generate_records(events, print_format=print_format) + + +def load_events(infile, rawdata=None): + if rawdata: + data = rawdata.read() + else: + data = infile.read() + + j = None + try: + j = json.loads(data) + except json.JSONDecodeError: + pass + + return j, data diff --git a/cloudinit/analyze/tests/test_dump.py b/cloudinit/analyze/tests/test_dump.py new file mode 100644 index 00000000..2c0885d0 --- /dev/null +++ b/cloudinit/analyze/tests/test_dump.py @@ -0,0 +1,210 @@ +# This file is part of cloud-init. See LICENSE file for license information. + +from datetime import datetime +from textwrap import dedent + +from cloudinit.analyze.dump import ( + dump_events, parse_ci_logline, parse_timestamp) +from cloudinit.util import subp, write_file +from tests.unittests.helpers import CiTestCase + + +class TestParseTimestamp(CiTestCase): + + def test_parse_timestamp_handles_cloud_init_default_format(self): + """Logs with cloud-init detailed formats will be properly parsed.""" + trusty_fmt = '%Y-%m-%d %H:%M:%S,%f' + trusty_stamp = '2016-09-12 14:39:20,839' + + parsed = parse_timestamp(trusty_stamp) + + # convert ourselves + dt = datetime.strptime(trusty_stamp, trusty_fmt) + expected = float(dt.strftime('%s.%f')) + + # use date(1) + out, _err = subp(['date', '+%s.%3N', '-d', trusty_stamp]) + timestamp = out.strip() + date_ts = float(timestamp) + + self.assertEqual(expected, parsed) + self.assertEqual(expected, date_ts) + self.assertEqual(date_ts, parsed) + + def test_parse_timestamp_handles_syslog_adding_year(self): + """Syslog timestamps lack a year. Add year and properly parse.""" + syslog_fmt = '%b %d %H:%M:%S %Y' + syslog_stamp = 'Aug 08 15:12:51' + + # convert stamp ourselves by adding the missing year value + year = datetime.now().year + dt = datetime.strptime(syslog_stamp + " " + str(year), syslog_fmt) + expected = float(dt.strftime('%s.%f')) + parsed = parse_timestamp(syslog_stamp) + + # use date(1) + out, _ = subp(['date', '+%s.%3N', '-d', syslog_stamp]) + timestamp = out.strip() + date_ts = float(timestamp) + + self.assertEqual(expected, parsed) + self.assertEqual(expected, date_ts) + self.assertEqual(date_ts, parsed) + + def test_parse_timestamp_handles_journalctl_format_adding_year(self): + """Journalctl precise timestamps lack a year. 
Add year and parse.""" + journal_fmt = '%b %d %H:%M:%S.%f %Y' + journal_stamp = 'Aug 08 17:15:50.606811' + + # convert stamp ourselves by adding the missing year value + year = datetime.now().year + dt = datetime.strptime(journal_stamp + " " + str(year), journal_fmt) + expected = float(dt.strftime('%s.%f')) + parsed = parse_timestamp(journal_stamp) + + # use date(1) + out, _ = subp(['date', '+%s.%6N', '-d', journal_stamp]) + timestamp = out.strip() + date_ts = float(timestamp) + + self.assertEqual(expected, parsed) + self.assertEqual(expected, date_ts) + self.assertEqual(date_ts, parsed) + + def test_parse_unexpected_timestamp_format_with_date_command(self): + """Dump sends unexpected timestamp formats to data for processing.""" + new_fmt = '%H:%M %m/%d %Y' + new_stamp = '17:15 08/08' + + # convert stamp ourselves by adding the missing year value + year = datetime.now().year + dt = datetime.strptime(new_stamp + " " + str(year), new_fmt) + expected = float(dt.strftime('%s.%f')) + parsed = parse_timestamp(new_stamp) + + # use date(1) + out, _ = subp(['date', '+%s.%6N', '-d', new_stamp]) + timestamp = out.strip() + date_ts = float(timestamp) + + self.assertEqual(expected, parsed) + self.assertEqual(expected, date_ts) + self.assertEqual(date_ts, parsed) + + +class TestParseCILogLine(CiTestCase): + + def test_parse_logline_returns_none_without_separators(self): + """When no separators are found, parse_ci_logline returns None.""" + expected_parse_ignores = [ + '', '-', 'adsf-asdf', '2017-05-22 18:02:01,088', 'CLOUDINIT'] + for parse_ignores in expected_parse_ignores: + self.assertIsNone(parse_ci_logline(parse_ignores)) + + def test_parse_logline_returns_event_for_cloud_init_logs(self): + """parse_ci_logline returns an event parse from cloud-init format.""" + line = ( + "2017-08-08 20:05:07,147 - util.py[DEBUG]: Cloud-init v. 0.7.9" + " running 'init-local' at Tue, 08 Aug 2017 20:05:07 +0000. Up" + " 6.26 seconds.") + dt = datetime.strptime( + '2017-08-08 20:05:07,147', '%Y-%m-%d %H:%M:%S,%f') + timestamp = float(dt.strftime('%s.%f')) + expected = { + 'description': 'starting search for local datasources', + 'event_type': 'start', + 'name': 'init-local', + 'origin': 'cloudinit', + 'timestamp': timestamp} + self.assertEqual(expected, parse_ci_logline(line)) + + def test_parse_logline_returns_event_for_journalctl_logs(self): + """parse_ci_logline returns an event parse from journalctl format.""" + line = ("Nov 03 06:51:06.074410 x2 cloud-init[106]: [CLOUDINIT]" + " util.py[DEBUG]: Cloud-init v. 0.7.8 running 'init-local' at" + " Thu, 03 Nov 2016 06:51:06 +0000. 
Up 1.0 seconds.") + year = datetime.now().year + dt = datetime.strptime( + 'Nov 03 06:51:06.074410 %d' % year, '%b %d %H:%M:%S.%f %Y') + timestamp = float(dt.strftime('%s.%f')) + expected = { + 'description': 'starting search for local datasources', + 'event_type': 'start', + 'name': 'init-local', + 'origin': 'cloudinit', + 'timestamp': timestamp} + self.assertEqual(expected, parse_ci_logline(line)) + + def test_parse_logline_returns_event_for_finish_events(self): + """parse_ci_logline returns a finish event for a parsed log line.""" + line = ('2016-08-30 21:53:25.972325+00:00 y1 [CLOUDINIT]' + ' handlers.py[DEBUG]: finish: modules-final: SUCCESS: running' + ' modules for final') + expected = { + 'description': 'running modules for final', + 'event_type': 'finish', + 'name': 'modules-final', + 'origin': 'cloudinit', + 'result': 'SUCCESS', + 'timestamp': 1472594005.972} + self.assertEqual(expected, parse_ci_logline(line)) + + +SAMPLE_LOGS = dedent("""\ +Nov 03 06:51:06.074410 x2 cloud-init[106]: [CLOUDINIT] util.py[DEBUG]:\ + Cloud-init v. 0.7.8 running 'init-local' at Thu, 03 Nov 2016\ + 06:51:06 +0000. Up 1.0 seconds. +2016-08-30 21:53:25.972325+00:00 y1 [CLOUDINIT] handlers.py[DEBUG]: finish:\ + modules-final: SUCCESS: running modules for final +""") + + +class TestDumpEvents(CiTestCase): + maxDiff = None + + def test_dump_events_with_rawdata(self): + """Rawdata is split and parsed into a tuple of events and data""" + events, data = dump_events(rawdata=SAMPLE_LOGS) + expected_data = SAMPLE_LOGS.splitlines() + year = datetime.now().year + dt1 = datetime.strptime( + 'Nov 03 06:51:06.074410 %d' % year, '%b %d %H:%M:%S.%f %Y') + timestamp1 = float(dt1.strftime('%s.%f')) + expected_events = [{ + 'description': 'starting search for local datasources', + 'event_type': 'start', + 'name': 'init-local', + 'origin': 'cloudinit', + 'timestamp': timestamp1}, { + 'description': 'running modules for final', + 'event_type': 'finish', + 'name': 'modules-final', + 'origin': 'cloudinit', + 'result': 'SUCCESS', + 'timestamp': 1472594005.972}] + self.assertEqual(expected_events, events) + self.assertEqual(expected_data, data) + + def test_dump_events_with_cisource(self): + """Cisource file is read and parsed into a tuple of events and data.""" + tmpfile = self.tmp_path('logfile') + write_file(tmpfile, SAMPLE_LOGS) + events, data = dump_events(cisource=open(tmpfile)) + year = datetime.now().year + dt1 = datetime.strptime( + 'Nov 03 06:51:06.074410 %d' % year, '%b %d %H:%M:%S.%f %Y') + timestamp1 = float(dt1.strftime('%s.%f')) + expected_events = [{ + 'description': 'starting search for local datasources', + 'event_type': 'start', + 'name': 'init-local', + 'origin': 'cloudinit', + 'timestamp': timestamp1}, { + 'description': 'running modules for final', + 'event_type': 'finish', + 'name': 'modules-final', + 'origin': 'cloudinit', + 'result': 'SUCCESS', + 'timestamp': 1472594005.972}] + self.assertEqual(expected_events, events) + self.assertEqual(SAMPLE_LOGS.splitlines(), [d.strip() for d in data]) diff --git a/cloudinit/cmd/main.py b/cloudinit/cmd/main.py index 139e03b3..9c0ac864 100644 --- a/cloudinit/cmd/main.py +++ b/cloudinit/cmd/main.py @@ -50,13 +50,6 @@ WELCOME_MSG_TPL = ("Cloud-init v. {version} running '{action}' at " # Module section template MOD_SECTION_TPL = "cloud_%s_modules" -# Things u can query on -QUERY_DATA_TYPES = [ - 'data', - 'data_raw', - 'instance_id', -] - # Frequency shortname to full name # (so users don't have to remember the full name...) 
FREQ_SHORT_NAMES = {
@@ -510,11 +503,6 @@ def main_modules(action_name, args):
     return run_module_section(mods, name, name)
 
 
-def main_query(name, _args):
-    raise NotImplementedError(("Action '%s' is not"
-                               " currently implemented") % (name))
-
-
 def main_single(name, args):
     # Cloud-init single stage is broken up into the following sub-stages
     # 1. Ensure that the init object fetches its config without errors
@@ -713,9 +701,11 @@ def main(sysv_args=None):
                         default=False)
     parser.set_defaults(reporter=None)
 
-    subparsers = parser.add_subparsers()
+    subparsers = parser.add_subparsers(title='Subcommands', dest='subcommand')
+    subparsers.required = True
 
     # Each action and its sub-options (if any)
+
     parser_init = subparsers.add_parser('init',
                                         help=('initializes cloud-init and'
                                               ' performs initial modules'))
@@ -737,17 +727,6 @@ def main(sysv_args=None):
                             choices=('init', 'config', 'final'))
     parser_mod.set_defaults(action=('modules', main_modules))
 
-    # These settings are used when you want to query information
-    # stored in the cloud-init data objects/directories/files
-    parser_query = subparsers.add_parser('query',
-                                         help=('query information stored '
-                                               'in cloud-init'))
-    parser_query.add_argument("--name", '-n', action="store",
-                              help="item name to query on",
-                              required=True,
-                              choices=QUERY_DATA_TYPES)
-    parser_query.set_defaults(action=('query', main_query))
-
     # This subcommand allows you to run a single module
     parser_single = subparsers.add_parser('single',
                                           help=('run a single module '))
@@ -781,15 +760,22 @@ def main(sysv_args=None):
                                  help=('list defined features'))
     parser_features.set_defaults(action=('features', main_features))
 
+    parser_analyze = subparsers.add_parser(
+        'analyze', help='Devel tool: Analyze cloud-init logs and data')
+    if sysv_args and sysv_args[0] == 'analyze':
+        # Only load this parser if analyze is specified to avoid file load cost
+        # FIXME put this under 'devel' subcommand (coming in next branch)
+        from cloudinit.analyze.__main__ import get_parser as analyze_parser
+        # Construct analyze subcommand parser
+        analyze_parser(parser_analyze)
+
     args = parser.parse_args(args=sysv_args)
 
-    try:
-        (name, functor) = args.action
-    except AttributeError:
-        parser.error('too few arguments')
+    # Subparsers.required = True and each subparser sets action=(name, functor)
+    (name, functor) = args.action
 
     # Setup basic logging to start (until reinitialized)
-    # iff in debug mode...
+    # iff in debug mode.
     if args.debug:
         logging.setupBasicLogging()
diff --git a/doc/rtd/index.rst b/doc/rtd/index.rst
index a691103e..de67f361 100644
--- a/doc/rtd/index.rst
+++ b/doc/rtd/index.rst
@@ -40,6 +40,7 @@ initialization of a cloud instance.
    topics/merging.rst
    topics/network-config.rst
    topics/vendordata.rst
+   topics/debugging.rst
    topics/moreinfo.rst
    topics/hacking.rst
    topics/tests.rst
diff --git a/doc/rtd/topics/debugging.rst b/doc/rtd/topics/debugging.rst
new file mode 100644
index 00000000..4e43dd57
--- /dev/null
+++ b/doc/rtd/topics/debugging.rst
@@ -0,0 +1,146 @@
+********************************
+Testing and debugging cloud-init
+********************************
+
+Overview
+========
+This topic will discuss general approaches for testing and debugging
+cloud-init on deployed instances.
+
+
+Boot Time Analysis - cloud-init analyze
+=======================================
+Occasionally instances don't appear as performant as we would like and
+cloud-init packages a simple facility to inspect what operations took
+cloud-init the longest during boot and setup.
+
+The script **/usr/bin/cloud-init** has an **analyze** sub-command which
+parses any cloud-init.log file into formatted and sorted events. It allows
+for detailed analysis of the most costly cloud-init operations, to determine
+the long-pole in cloud-init configuration and setup. These subcommands
+default to reading /var/log/cloud-init.log.
+
+* ``analyze show`` Parse and organize cloud-init.log events by stage and
+  include each sub-stage granularity with time delta reports.
+
+.. code-block:: bash
+
+    $ cloud-init analyze show -i my-cloud-init.log
+    -- Boot Record 01 --
+    The total time elapsed since completing an event is printed after the
+    "@" character.
+    The time the event takes is printed after the "+" character.
+
+    Starting stage: modules-config
+    |`->config-emit_upstart ran successfully @05.47600s +00.00100s
+    |`->config-snap_config ran successfully @05.47700s +00.00100s
+    |`->config-ssh-import-id ran successfully @05.47800s +00.00200s
+    |`->config-locale ran successfully @05.48000s +00.00100s
+    ...
+
+
+* ``analyze dump`` Parse cloud-init.log into event records and return a list
+  of dictionaries that can be consumed for other reporting needs.
+
+.. code-block:: bash
+
+    $ cloud-init analyze dump -i my-cloud-init.log
+    [
+     {
+      "description": "running config modules",
+      "event_type": "start",
+      "name": "modules-config",
+      "origin": "cloudinit",
+      "timestamp": 1510807493.0
+     },...
+
+* ``analyze blame`` Parse cloud-init.log into event records and sort them
+  based on highest time cost for quick assessment of areas of cloud-init
+  that may need improvement.
+
+.. code-block:: bash
+
+    $ cloud-init analyze blame -i my-cloud-init.log
+    -- Boot Record 11 --
+    00.01300s (modules-final/config-scripts-per-boot)
+    00.00400s (modules-final/config-final-message)
+    00.00100s (modules-final/config-rightscale_userdata)
+    ...
+
+
+Analyze quickstart - LXC
+------------------------
+To quickly obtain a cloud-init log try using lxc on any ubuntu system:
+
+.. code-block:: bash
+
+    $ lxc init ubuntu-daily:xenial x1
+    $ lxc start x1
+    # Take lxc's cloud-init.log and pipe it to the analyzer
+    $ lxc file pull x1/var/log/cloud-init.log - | cloud-init analyze dump -i -
+    $ lxc file pull x1/var/log/cloud-init.log - | \
+      python3 -m cloudinit.analyze dump -i -
+
+
+Analyze quickstart - KVM
+------------------------
+To quickly analyze a cloud-init log from a KVM instance:
+
+1. Download the current cloud image
+   wget https://cloud-images.ubuntu.com/daily/server/xenial/current/xenial-server-cloudimg-amd64.img
+2. Create a snapshot image to preserve the original cloud-image
+
+.. code-block:: bash
+
+    $ qemu-img create -b xenial-server-cloudimg-amd64.img -f qcow2 \
+      test-cloudinit.qcow2
+
+3. Create a seed image with metadata using `cloud-localds`
+
+.. code-block:: bash
+
+    $ cat > user-data <.`` which marks
+when the module last successfully ran. Presence of this semaphore file
+prevents a module from running again if it has already been run. To ensure
+that a module is run again, the desired frequency can be overridden on the
+command line:
+
+.. code-block:: bash
+
+    $ sudo cloud-init single --name cc_ssh --frequency always
+    ...
+    Generating public/private ed25519 key pair
+    ...
+
+Inspect cloud-init.log for output of what operations were performed as a
+result.
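+
+For example, a quick way to inspect that output (the module name below is
+only an illustration):
+
+.. code-block:: bash
+
+    $ grep 'config-ssh' /var/log/cloud-init.log | tail -n 3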
diff --git a/tests/unittests/test_cli.py b/tests/unittests/test_cli.py index 06f366b2..7780f164 100644 --- a/tests/unittests/test_cli.py +++ b/tests/unittests/test_cli.py @@ -31,9 +31,90 @@ class TestCLI(test_helpers.FilesystemMockingTestCase): def test_no_arguments_shows_error_message(self): exit_code = self._call_main() - self.assertIn('cloud-init: error: too few arguments', - self.stderr.getvalue()) + missing_subcommand_message = [ + 'too few arguments', # python2.7 msg + 'the following arguments are required: subcommand' # python3 msg + ] + error = self.stderr.getvalue() + matches = ([msg in error for msg in missing_subcommand_message]) + self.assertTrue( + any(matches), 'Did not find error message for missing subcommand') self.assertEqual(2, exit_code) + def test_all_subcommands_represented_in_help(self): + """All known subparsers are represented in the cloud-int help doc.""" + self._call_main() + error = self.stderr.getvalue() + expected_subcommands = ['analyze', 'init', 'modules', 'single', + 'dhclient-hook', 'features'] + for subcommand in expected_subcommands: + self.assertIn(subcommand, error) -# vi: ts=4 expandtab + @mock.patch('cloudinit.cmd.main.status_wrapper') + def test_init_subcommand_parser(self, m_status_wrapper): + """The subcommand 'init' calls status_wrapper passing init.""" + self._call_main(['cloud-init', 'init']) + (name, parseargs) = m_status_wrapper.call_args_list[0][0] + self.assertEqual('init', name) + self.assertEqual('init', parseargs.subcommand) + self.assertEqual('init', parseargs.action[0]) + self.assertEqual('main_init', parseargs.action[1].__name__) + + @mock.patch('cloudinit.cmd.main.status_wrapper') + def test_modules_subcommand_parser(self, m_status_wrapper): + """The subcommand 'modules' calls status_wrapper passing modules.""" + self._call_main(['cloud-init', 'modules']) + (name, parseargs) = m_status_wrapper.call_args_list[0][0] + self.assertEqual('modules', name) + self.assertEqual('modules', parseargs.subcommand) + self.assertEqual('modules', parseargs.action[0]) + self.assertEqual('main_modules', parseargs.action[1].__name__) + + def test_analyze_subcommand_parser(self): + """The subcommand cloud-init analyze calls the correct subparser.""" + self._call_main(['cloud-init', 'analyze']) + # These subcommands only valid for cloud-init analyze script + expected_subcommands = ['blame', 'show', 'dump'] + error = self.stderr.getvalue() + for subcommand in expected_subcommands: + self.assertIn(subcommand, error) + + @mock.patch('cloudinit.cmd.main.main_single') + def test_single_subcommand(self, m_main_single): + """The subcommand 'single' calls main_single with valid args.""" + self._call_main(['cloud-init', 'single', '--name', 'cc_ntp']) + (name, parseargs) = m_main_single.call_args_list[0][0] + self.assertEqual('single', name) + self.assertEqual('single', parseargs.subcommand) + self.assertEqual('single', parseargs.action[0]) + self.assertFalse(parseargs.debug) + self.assertFalse(parseargs.force) + self.assertIsNone(parseargs.frequency) + self.assertEqual('cc_ntp', parseargs.name) + self.assertFalse(parseargs.report) + + @mock.patch('cloudinit.cmd.main.dhclient_hook') + def test_dhclient_hook_subcommand(self, m_dhclient_hook): + """The subcommand 'dhclient-hook' calls dhclient_hook with args.""" + self._call_main(['cloud-init', 'dhclient-hook', 'net_action', 'eth0']) + (name, parseargs) = m_dhclient_hook.call_args_list[0][0] + self.assertEqual('dhclient_hook', name) + self.assertEqual('dhclient-hook', parseargs.subcommand) + 
self.assertEqual('dhclient_hook', parseargs.action[0])
+        self.assertFalse(parseargs.debug)
+        self.assertFalse(parseargs.force)
+        self.assertEqual('net_action', parseargs.net_action)
+        self.assertEqual('eth0', parseargs.net_interface)
+
+    @mock.patch('cloudinit.cmd.main.main_features')
+    def test_features_hook_subcommand(self, m_features):
+        """The subcommand 'features' calls main_features with args."""
+        self._call_main(['cloud-init', 'features'])
+        (name, parseargs) = m_features.call_args_list[0][0]
+        self.assertEqual('features', name)
+        self.assertEqual('features', parseargs.subcommand)
+        self.assertEqual('features', parseargs.action[0])
+        self.assertFalse(parseargs.debug)
+        self.assertFalse(parseargs.force)
+
+# vi: ts=4 expandtab
-- 
cgit v1.2.3


From cc9762a2d737ead386ffb9f067adc5e543224560 Mon Sep 17 00:00:00 2001
From: Chad Smith
Date: Tue, 22 Aug 2017 20:06:20 -0600
Subject: schema cli: Add schema subcommand to cloud-init cli and cc_runcmd
 schema

This branch does a few things:
 - Add 'schema' subcommand to cloud-init CLI for validating cloud-config
   files against strict module jsonschema definitions
 - Add --annotate parameter to 'cloud-init schema' to annotate existing
   cloud-config file content with validation errors
 - Add jsonschema definition to cc_runcmd
 - Add unit test coverage for cc_runcmd
 - Update CLI capabilities documentation

This branch only imports development (and analyze) subparsers when the
specific subcommand is provided on the CLI to avoid adding costly unused
file imports during cloud-init system boot.

The schema command allows a person to quickly validate a cloud-config text
file against cloud-init's known module schemas to avoid costly roundtrips
deploying instances in their cloud of choice. As of this branch, only
cc_ntp and cc_runcmd cloud-config modules define schemas. Schema validation
will ignore all undefined config keys until all modules define a strict
schema.

To perform validation of runcmd and ntp sections of a cloud-config file:

$ cat > cloud.cfg < Date: Fri, 25 Aug 2017 07:16:21 -0400
Subject: doc: Explain error behavior in user data include file format.

Update user data 'include file' format documentation to explain the
behavior that occurs when an error occurs while reading a file.
---
 doc/rtd/topics/format.rst | 1 +
 1 file changed, 1 insertion(+)

(limited to 'doc')

diff --git a/doc/rtd/topics/format.rst b/doc/rtd/topics/format.rst
index 436eb00f..e25289ad 100644
--- a/doc/rtd/topics/format.rst
+++ b/doc/rtd/topics/format.rst
@@ -85,6 +85,7 @@ This content is a ``include`` file.
 The file contains a list of urls, one per line. Each of the URLs will be read,
 and their content will be passed through this same set of rules. Ie, the
 content read from the URL can be gzipped, mime-multi-part, or plain text.
+If an error occurs reading a file, the remaining files will not be read.
 
 Begins with: ``#include`` or ``Content-Type: text/x-include-url`` when using
 a MIME archive.
-- 
cgit v1.2.3


From fa266bf8818a08e37cd32a603d076ba2db300124 Mon Sep 17 00:00:00 2001
From: Scott Moser
Date: Thu, 31 Aug 2017 20:01:57 -0600
Subject: upstart: do not package upstart jobs, drop ubuntu-init-switch
 module.

The ubuntu-init-switch module allowed the user to launch an instance that
was booted with upstart and have it switch its init system to systemd and
then reboot itself. It was only useful for the time period when Ubuntu was
transitioning to systemd but only produced images using upstart.

Also, do not run setup with --init-system=upstart.
This means that by default, debian packages built with packages/bddeb will not have upstart unit files included. No other removal is done here. --- cloudinit/config/cc_ubuntu_init_switch.py | 160 ------------------------------ config/cloud.cfg.tmpl | 3 - doc/rtd/topics/modules.rst | 1 - packages/bddeb | 3 +- packages/debian/dirs | 1 - packages/debian/rules.in | 2 +- setup.py | 2 + tests/cloud_tests/configs/modules/TODO.md | 2 - 8 files changed, 4 insertions(+), 170 deletions(-) delete mode 100644 cloudinit/config/cc_ubuntu_init_switch.py (limited to 'doc') diff --git a/cloudinit/config/cc_ubuntu_init_switch.py b/cloudinit/config/cc_ubuntu_init_switch.py deleted file mode 100644 index 5dd26901..00000000 --- a/cloudinit/config/cc_ubuntu_init_switch.py +++ /dev/null @@ -1,160 +0,0 @@ -# Copyright (C) 2014 Canonical Ltd. -# -# Author: Scott Moser -# -# This file is part of cloud-init. See LICENSE file for license information. - -""" -Ubuntu Init Switch ------------------- -**Summary:** reboot system into another init. - -This module provides a way for the user to boot with systemd even if the image -is set to boot with upstart. It should be run as one of the first -``cloud_init_modules``, and will switch the init system and then issue a -reboot. The next boot will come up in the target init system and no action -will be taken. This should be inert on non-ubuntu systems, and also -exit quickly. - -.. note:: - best effort is made, but it's possible this system will break, and probably - won't interact well with any other mechanism you've used to switch the init - system. - -**Internal name:** ``cc_ubuntu_init_switch`` - -**Module frequency:** once per instance - -**Supported distros:** ubuntu - -**Config keys**:: - - init_switch: - target: systemd (can be 'systemd' or 'upstart') - reboot: true (reboot if a change was made, or false to not reboot) -""" - -from cloudinit.distros import ubuntu -from cloudinit import log as logging -from cloudinit.settings import PER_INSTANCE -from cloudinit import util - -import os -import time - -frequency = PER_INSTANCE -REBOOT_CMD = ["/sbin/reboot", "--force"] - -DEFAULT_CONFIG = { - 'init_switch': {'target': None, 'reboot': True} -} - -SWITCH_INIT = """ -#!/bin/sh -# switch_init: [upstart | systemd] - -is_systemd() { - [ "$(dpkg-divert --listpackage /sbin/init)" = "systemd-sysv" ] -} -debug() { echo "$@" 1>&2; } -fail() { echo "$@" 1>&2; exit 1; } - -if [ "$1" = "systemd" ]; then - if is_systemd; then - debug "already systemd, nothing to do" - else - [ -f /lib/systemd/systemd ] || fail "no systemd available"; - dpkg-divert --package systemd-sysv --divert /sbin/init.diverted \\ - --rename /sbin/init - fi - [ -f /sbin/init ] || ln /lib/systemd/systemd /sbin/init -elif [ "$1" = "upstart" ]; then - if is_systemd; then - rm -f /sbin/init - dpkg-divert --package systemd-sysv --rename --remove /sbin/init - else - debug "already upstart, nothing to do." - fi -else - fail "Error. expect 'upstart' or 'systemd'" -fi -""" - -distros = ['ubuntu'] - - -def handle(name, cfg, cloud, log, args): - """Handler method activated by cloud-init.""" - - if not isinstance(cloud.distro, ubuntu.Distro): - log.debug("%s: distro is '%s', not ubuntu. returning", - name, cloud.distro.__class__) - return - - cfg = util.mergemanydict([cfg, DEFAULT_CONFIG]) - target = cfg['init_switch']['target'] - reboot = cfg['init_switch']['reboot'] - - if len(args) != 0: - target = args[0] - if len(args) > 1: - reboot = util.is_true(args[1]) - - if not target: - log.debug("%s: target=%s. 
nothing to do", name, target) - return - - if not util.which('dpkg'): - log.warn("%s: 'dpkg' not available. Assuming not ubuntu", name) - return - - supported = ('upstart', 'systemd') - if target not in supported: - log.warn("%s: target set to %s, expected one of: %s", - name, target, str(supported)) - - if os.path.exists("/run/systemd/system"): - current = "systemd" - else: - current = "upstart" - - if current == target: - log.debug("%s: current = target = %s. nothing to do", name, target) - return - - try: - util.subp(['sh', '-s', target], data=SWITCH_INIT) - except util.ProcessExecutionError as e: - log.warn("%s: Failed to switch to init '%s'. %s", name, target, e) - return - - if util.is_false(reboot): - log.info("%s: switched '%s' to '%s'. reboot=false, not rebooting.", - name, current, target) - return - - try: - log.warn("%s: switched '%s' to '%s'. rebooting.", - name, current, target) - logging.flushLoggers(log) - _fire_reboot(log, wait_attempts=4, initial_sleep=4) - except Exception as e: - util.logexc(log, "Requested reboot did not happen!") - raise - - -def _fire_reboot(log, wait_attempts=6, initial_sleep=1, backoff=2): - util.subp(REBOOT_CMD) - start = time.time() - wait_time = initial_sleep - for _i in range(0, wait_attempts): - time.sleep(wait_time) - wait_time *= backoff - elapsed = time.time() - start - log.debug("Rebooted, but still running after %s seconds", int(elapsed)) - # If we got here, not good - elapsed = time.time() - start - raise RuntimeError(("Reboot did not happen" - " after %s seconds!") % (int(elapsed))) - -# vi: ts=4 expandtab diff --git a/config/cloud.cfg.tmpl b/config/cloud.cfg.tmpl index f4b9069b..a537d65a 100644 --- a/config/cloud.cfg.tmpl +++ b/config/cloud.cfg.tmpl @@ -45,9 +45,6 @@ datasource_list: ['ConfigDrive', 'Azure', 'OpenStack', 'Ec2'] # The modules that run in the 'init' stage cloud_init_modules: - migrator -{% if variant in ["ubuntu", "unknown", "debian"] %} - - ubuntu-init-switch -{% endif %} - seed_random - bootcmd - write-files diff --git a/doc/rtd/topics/modules.rst b/doc/rtd/topics/modules.rst index c963c09a..cdb0f419 100644 --- a/doc/rtd/topics/modules.rst +++ b/doc/rtd/topics/modules.rst @@ -50,7 +50,6 @@ Modules .. automodule:: cloudinit.config.cc_ssh_authkey_fingerprints .. automodule:: cloudinit.config.cc_ssh_import_id .. automodule:: cloudinit.config.cc_timezone -.. automodule:: cloudinit.config.cc_ubuntu_init_switch .. automodule:: cloudinit.config.cc_update_etc_hosts .. automodule:: cloudinit.config.cc_update_hostname .. 
automodule:: cloudinit.config.cc_users_groups diff --git a/packages/bddeb b/packages/bddeb index 609a94fb..7c123548 100755 --- a/packages/bddeb +++ b/packages/bddeb @@ -112,8 +112,7 @@ def get_parser(): parser.add_argument("--init-system", dest="init_system", help=("build deb with INIT_SYSTEM=xxx" " (default: %(default)s"), - default=os.environ.get("INIT_SYSTEM", - "upstart,systemd")) + default=os.environ.get("INIT_SYSTEM", "systemd")) parser.add_argument("--release", dest="release", help=("build with changelog referencing RELEASE"), diff --git a/packages/debian/dirs b/packages/debian/dirs index 9a633c60..1315cf8a 100644 --- a/packages/debian/dirs +++ b/packages/debian/dirs @@ -1,6 +1,5 @@ var/lib/cloud usr/bin -etc/init usr/share/doc/cloud etc/cloud lib/udev/rules.d diff --git a/packages/debian/rules.in b/packages/debian/rules.in index 053b7649..b87a5e84 100755 --- a/packages/debian/rules.in +++ b/packages/debian/rules.in @@ -1,6 +1,6 @@ ## template:basic #!/usr/bin/make -f -INIT_SYSTEM ?= upstart,systemd +INIT_SYSTEM ?= systemd export PYBUILD_INSTALL_ARGS=--init-system=$(INIT_SYSTEM) PYVER ?= python${pyver} diff --git a/setup.py b/setup.py index 5c65c7fe..7662bd8b 100755 --- a/setup.py +++ b/setup.py @@ -191,6 +191,8 @@ class InitsysInstallData(install): datakeys = [k for k in INITSYS_ROOTS if k.partition(".")[0] == system] for k in datakeys: + if not INITSYS_FILES[k]: + continue self.distribution.data_files.append( (INITSYS_ROOTS[k], INITSYS_FILES[k])) # Force that command to reinitalize (with new file list) diff --git a/tests/cloud_tests/configs/modules/TODO.md b/tests/cloud_tests/configs/modules/TODO.md index d496da95..0b933b3b 100644 --- a/tests/cloud_tests/configs/modules/TODO.md +++ b/tests/cloud_tests/configs/modules/TODO.md @@ -89,8 +89,6 @@ Not applicable to write a test for this as it specifies when something should be ## ssh authkey fingerprints The authkey_hash key does not appear to work. In fact the default claims to be md5, however syslog only shows sha256 -## ubuntu init switch - ## update etc hosts 2016-11-17: Issues with changing /etc/hosts and lxc backend. -- cgit v1.2.3 From cf10a2ff2e2f666d9370f38297a5a105e809ea3c Mon Sep 17 00:00:00 2001 From: Ethan Apodaca Date: Wed, 13 Sep 2017 22:18:26 -0600 Subject: chef: Add option to pin chef omnibus install version Most users of chef will want to pin the version that is installed. Typically new versions of chef have to be evaluated for breakage etc. This change proposes a new optional `omnibus_version` field to the chef configuration. The changeset also adds documentation referencing the new field. LP: #1462693 --- cloudinit/config/cc_chef.py | 45 ++++++++---- cloudinit/util.py | 25 +++++++ doc/examples/cloud-config-chef.txt | 4 ++ tests/unittests/test_handler/test_handler_chef.py | 88 +++++++++++++++++++---- 4 files changed, 138 insertions(+), 24 deletions(-) (limited to 'doc') diff --git a/cloudinit/config/cc_chef.py b/cloudinit/config/cc_chef.py index c192dd32..46abedd1 100644 --- a/cloudinit/config/cc_chef.py +++ b/cloudinit/config/cc_chef.py @@ -58,6 +58,9 @@ file). 
log_level: log_location: node_name: + omnibus_url: + omnibus_url_retries: + omnibus_version: pid_file: server_url: show_time: @@ -71,7 +74,6 @@ import itertools import json import os -from cloudinit import temp_utils from cloudinit import templater from cloudinit import url_helper from cloudinit import util @@ -280,6 +282,31 @@ def run_chef(chef_cfg, log): util.subp(cmd, capture=False) +def install_chef_from_omnibus(url=None, retries=None, omnibus_version=None): + """Install an omnibus unified package from url. + + @param url: URL where blob of chef content may be downloaded. Defaults to + OMNIBUS_URL. + @param retries: Number of retries to perform when attempting to read url. + Defaults to OMNIBUS_URL_RETRIES + @param omnibus_version: Optional version string to require for omnibus + install. + """ + if url is None: + url = OMNIBUS_URL + if retries is None: + retries = OMNIBUS_URL_RETRIES + + if omnibus_version is None: + args = [] + else: + args = ['-v', omnibus_version] + content = url_helper.readurl(url=url, retries=retries).contents + return util.subp_blob_in_tempfile( + blob=content, args=args, + basename='chef-omnibus-install', capture=False) + + def install_chef(cloud, chef_cfg, log): # If chef is not installed, we install chef based on 'install_type' install_type = util.get_cfg_option_str(chef_cfg, 'install_type', @@ -298,17 +325,11 @@ def install_chef(cloud, chef_cfg, log): # This will install and run the chef-client from packages cloud.distro.install_packages(('chef',)) elif install_type == 'omnibus': - # This will install as a omnibus unified package - url = util.get_cfg_option_str(chef_cfg, "omnibus_url", OMNIBUS_URL) - retries = max(0, util.get_cfg_option_int(chef_cfg, - "omnibus_url_retries", - default=OMNIBUS_URL_RETRIES)) - content = url_helper.readurl(url=url, retries=retries).contents - with temp_utils.tempdir() as tmpd: - # Use tmpdir over tmpfile to avoid 'text file busy' on execute - tmpf = "%s/chef-omnibus-install" % tmpd - util.write_file(tmpf, content, mode=0o700) - util.subp([tmpf], capture=False) + omnibus_version = util.get_cfg_option_str(chef_cfg, "omnibus_version") + install_chef_from_omnibus( + url=util.get_cfg_option_str(chef_cfg, "omnibus_url"), + retries=util.get_cfg_option_int(chef_cfg, "omnibus_url_retries"), + omnibus_version=omnibus_version) else: log.warn("Unknown chef install type '%s'", install_type) run = False diff --git a/cloudinit/util.py b/cloudinit/util.py index ae5cda8d..7e9d94fc 100644 --- a/cloudinit/util.py +++ b/cloudinit/util.py @@ -1742,6 +1742,31 @@ def delete_dir_contents(dirname): del_file(node_fullpath) +def subp_blob_in_tempfile(blob, *args, **kwargs): + """Write blob to a tempfile, and call subp with args, kwargs. Then cleanup. + + 'basename' as a kwarg allows providing the basename for the file. + The 'args' argument to subp will be updated with the full path to the + filename as the first argument. 
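+
+    Example (an illustrative call, mirroring how cc_chef invokes this
+    helper):
+
+        subp_blob_in_tempfile(blob=script, args=['-v', '2.0'],
+                              basename='chef-omnibus-install',
+                              capture=False)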
+ """ + basename = kwargs.pop('basename', "subp_blob") + + if len(args) == 0 and 'args' not in kwargs: + args = [tuple()] + + # Use tmpdir over tmpfile to avoid 'text file busy' on execute + with temp_utils.tempdir() as tmpd: + tmpf = os.path.join(tmpd, basename) + if 'args' in kwargs: + kwargs['args'] = [tmpf] + list(kwargs['args']) + else: + args = list(args) + args[0] = [tmpf] + args[0] + + write_file(tmpf, blob, mode=0o700) + return subp(*args, **kwargs) + + def subp(args, data=None, rcs=None, env=None, capture=True, shell=False, logstring=False, decode="replace", target=None, update_env=None): diff --git a/doc/examples/cloud-config-chef.txt b/doc/examples/cloud-config-chef.txt index 9d235817..58d5fdc7 100644 --- a/doc/examples/cloud-config-chef.txt +++ b/doc/examples/cloud-config-chef.txt @@ -94,6 +94,10 @@ chef: # if install_type is 'omnibus', change the url to download omnibus_url: "https://www.chef.io/chef/install.sh" + # if install_type is 'omnibus', pass pinned version string + # to the install script + omnibus_version: "12.3.0" + # Capture all subprocess output into a logfile # Useful for troubleshooting cloud-init issues diff --git a/tests/unittests/test_handler/test_handler_chef.py b/tests/unittests/test_handler/test_handler_chef.py index e5785cfd..0136a93d 100644 --- a/tests/unittests/test_handler/test_handler_chef.py +++ b/tests/unittests/test_handler/test_handler_chef.py @@ -1,11 +1,10 @@ # This file is part of cloud-init. See LICENSE file for license information. +import httpretty import json import logging import os -import shutil import six -import tempfile from cloudinit import cloud from cloudinit.config import cc_chef @@ -14,18 +13,83 @@ from cloudinit import helpers from cloudinit.sources import DataSourceNone from cloudinit import util -from cloudinit.tests import helpers as t_help +from cloudinit.tests.helpers import ( + CiTestCase, FilesystemMockingTestCase, mock, skipIf) LOG = logging.getLogger(__name__) CLIENT_TEMPL = os.path.sep.join(["templates", "chef_client.rb.tmpl"]) -class TestChef(t_help.FilesystemMockingTestCase): +class TestInstallChefOmnibus(CiTestCase): + + def setUp(self): + self.new_root = self.tmp_dir() + + @httpretty.activate + def test_install_chef_from_omnibus_runs_chef_url_content(self): + """install_chef_from_omnibus runs downloaded OMNIBUS_URL as script.""" + chef_outfile = self.tmp_path('chef.out', self.new_root) + response = '#!/bin/bash\necho "Hi Mom" > {0}'.format(chef_outfile) + httpretty.register_uri( + httpretty.GET, cc_chef.OMNIBUS_URL, body=response, status=200) + cc_chef.install_chef_from_omnibus() + self.assertEqual('Hi Mom\n', util.load_file(chef_outfile)) + + @mock.patch('cloudinit.config.cc_chef.url_helper.readurl') + @mock.patch('cloudinit.config.cc_chef.util.subp_blob_in_tempfile') + def test_install_chef_from_omnibus_retries_url(self, m_subp_blob, m_rdurl): + """install_chef_from_omnibus retries OMNIBUS_URL upon failure.""" + + class FakeURLResponse(object): + contents = '#!/bin/bash\necho "Hi Mom" > {0}/chef.out'.format( + self.new_root) + + m_rdurl.return_value = FakeURLResponse() + + cc_chef.install_chef_from_omnibus() + expected_kwargs = {'retries': cc_chef.OMNIBUS_URL_RETRIES, + 'url': cc_chef.OMNIBUS_URL} + self.assertItemsEqual(expected_kwargs, m_rdurl.call_args_list[0][1]) + cc_chef.install_chef_from_omnibus(retries=10) + expected_kwargs = {'retries': 10, + 'url': cc_chef.OMNIBUS_URL} + self.assertItemsEqual(expected_kwargs, m_rdurl.call_args_list[1][1]) + expected_subp_kwargs = { + 'args': ['-v', '2.0'], + 
'basename': 'chef-omnibus-install', + 'blob': m_rdurl.return_value.contents, + 'capture': False + } + self.assertItemsEqual( + expected_subp_kwargs, + m_subp_blob.call_args_list[0][1]) + + @httpretty.activate + @mock.patch('cloudinit.config.cc_chef.util.subp_blob_in_tempfile') + def test_install_chef_from_omnibus_has_omnibus_version(self, m_subp_blob): + """install_chef_from_omnibus provides version arg to OMNIBUS_URL.""" + chef_outfile = self.tmp_path('chef.out', self.new_root) + response = '#!/bin/bash\necho "Hi Mom" > {0}'.format(chef_outfile) + httpretty.register_uri( + httpretty.GET, cc_chef.OMNIBUS_URL, body=response) + cc_chef.install_chef_from_omnibus(omnibus_version='2.0') + + called_kwargs = m_subp_blob.call_args_list[0][1] + expected_kwargs = { + 'args': ['-v', '2.0'], + 'basename': 'chef-omnibus-install', + 'blob': response, + 'capture': False + } + self.assertItemsEqual(expected_kwargs, called_kwargs) + + +class TestChef(FilesystemMockingTestCase): + def setUp(self): super(TestChef, self).setUp() - self.tmp = tempfile.mkdtemp() - self.addCleanup(shutil.rmtree, self.tmp) + self.tmp = self.tmp_dir() def fetch_cloud(self, distro_kind): cls = distros.fetch(distro_kind) @@ -43,8 +107,8 @@ class TestChef(t_help.FilesystemMockingTestCase): for d in cc_chef.CHEF_DIRS: self.assertFalse(os.path.isdir(d)) - @t_help.skipIf(not os.path.isfile(CLIENT_TEMPL), - CLIENT_TEMPL + " is not available") + @skipIf(not os.path.isfile(CLIENT_TEMPL), + CLIENT_TEMPL + " is not available") def test_basic_config(self): """ test basic config looks sane @@ -122,8 +186,8 @@ class TestChef(t_help.FilesystemMockingTestCase): 'c': 'd', }, json.loads(c)) - @t_help.skipIf(not os.path.isfile(CLIENT_TEMPL), - CLIENT_TEMPL + " is not available") + @skipIf(not os.path.isfile(CLIENT_TEMPL), + CLIENT_TEMPL + " is not available") def test_template_deletes(self): tpl_file = util.load_file('templates/chef_client.rb.tmpl') self.patchUtils(self.tmp) @@ -143,8 +207,8 @@ class TestChef(t_help.FilesystemMockingTestCase): self.assertNotIn('json_attribs', c) self.assertNotIn('Formatter.show_time', c) - @t_help.skipIf(not os.path.isfile(CLIENT_TEMPL), - CLIENT_TEMPL + " is not available") + @skipIf(not os.path.isfile(CLIENT_TEMPL), + CLIENT_TEMPL + " is not available") def test_validation_cert_and_validation_key(self): # test validation_cert content is written to validation_key path tpl_file = util.load_file('templates/chef_client.rb.tmpl') -- cgit v1.2.3 From 99ef5adfed0b31f87f8ea56b22113737f41bba9d Mon Sep 17 00:00:00 2001 From: Arnd Hannemann Date: Mon, 21 Aug 2017 15:47:50 +0200 Subject: doc: document GCE datasource. Add some minimal documentation for GCE datasource. --- doc/rtd/topics/datasources.rst | 1 + doc/rtd/topics/datasources/gce.rst | 20 ++++++++++++++++++++ 2 files changed, 21 insertions(+) create mode 100644 doc/rtd/topics/datasources/gce.rst (limited to 'doc') diff --git a/doc/rtd/topics/datasources.rst b/doc/rtd/topics/datasources.rst index a60f5eb7..7e2854de 100644 --- a/doc/rtd/topics/datasources.rst +++ b/doc/rtd/topics/datasources.rst @@ -94,5 +94,6 @@ Follow for more information. datasources/ovf.rst datasources/smartos.rst datasources/fallback.rst + datasources/gce.rst .. vi: textwidth=78 diff --git a/doc/rtd/topics/datasources/gce.rst b/doc/rtd/topics/datasources/gce.rst new file mode 100644 index 00000000..8406695c --- /dev/null +++ b/doc/rtd/topics/datasources/gce.rst @@ -0,0 +1,20 @@ +.. 
_datasource_gce:
+
+Google Compute Engine
+=====================
+
+The GCE datasource gets its data from the internal compute metadata server.
+Metadata can be queried at the URL
+'``http://metadata.google.internal/computeMetadata/v1/``'
+from within an instance. For more information see the `GCE metadata docs`_.
+
+Currently the default project and instance level metadata keys
+``project/attributes/sshKeys`` and ``instance/attributes/ssh-keys`` are merged
+to provide ``public-keys``.
+
+``user-data`` and ``user-data-encoding`` can be provided to cloud-init by
+setting those custom metadata keys for an *instance*.
+
+.. _GCE metadata docs: https://cloud.google.com/compute/docs/storing-retrieving-metadata#querying
+
+.. vi: textwidth=78
-- 
cgit v1.2.3