author     Scott Moser <smoser@ubuntu.com>  2012-07-06 17:19:37 -0400
committer  Scott Moser <smoser@ubuntu.com>  2012-07-06 17:19:37 -0400
commit     b2a21ed1dc682a262d55a4202c6b9606496d211f (patch)
tree       37f1c4acd3ea891c1e7a60bcd9dd8aa8c7ca1e0c /cloudinit/sources
parent     646384ccd6f1707b2712d9bcd683ae877f1903bd (diff)
parent     e7095a1b19e849c530650d2d71edf8b28d30f1d1 (diff)
Merge rework branch in [Joshua Harlow]
- unified binary that activates the various stages
  - now using argparse + subcommands to specify the various CLI options
- a stage module that clearly separates the stages of the different components (also described how they are used and in what order in the new unified binary)
- user_data is now a module that just does user data processing, while the actual activation and 'handling' of the processed user data is done via a separate set of files (and modules), with the main 'init' stage being the controller of this
- creation of boot_hook, cloud_config, shell_script, upstart_job version 2 modules (with classes that perform their functionality) instead of those having functionality that is attached to the cloudinit object (which reduces reuse and limits future functionality, and makes testing harder)
- removal of global config that defined paths and shared config; now this is handled via objects, making unit testing and global side-effects a non-issue
- creation of a 'helpers.py'
  - this contains an abstraction for the 'lock' like objects that the various module/handler running stages use to avoid re-running a given module/handler for a given frequency. this makes it separated from the actual usage of that object (thus helpful for testing and clear lines of usage and how the actual job is accomplished)
  - a common 'runner' class is the main entrypoint using these locks to run function objects passed in (along with their arguments) and their frequency
- add in a 'paths' object that provides access to the previously global and/or config based paths (thus providing a single entrypoint object/type that provides path information)
  - this also adds in the ability to change the path when constructing that path 'object', and adding in additional config that can be used to alter the root paths of 'joins' (useful for testing or possibly useful in chroots?)
  - config options now available that can alter the 'write_root' and the 'read_root' when backing code uses the paths join() function
- add a config parser subclass that will automatically add unknown sections and return default values (instead of throwing exceptions for these cases)
- a new config merging class that will be the central object that knows how to do the common configuration merging from the various configuration sources. The order is the following:
  - cli config files override environment config files, which override instance configs, which override datasource configs, which override base configuration, which overrides default configuration
- remove the passing around of the 'cloudinit' object as a 'cloud' variable and instead pass around an 'interface' object that can be given to modules and handlers as their cloud access layer, while the backing of that object can be varied (good for abstraction and testing)
- use a single set of functions to do importing of modules
- add a function which will search for a given set of module names with a given set of attributes and return those which are found
- refactor logging so that instead of using a single top level 'log', each component/module can use its own logger (if desired); this should be backwards compatible with handlers and config modules that used the passed in logger (it is still passed in)
- ensure that in all places where exceptions are caught, where applicable, util logexc() is called, so that no exceptions that may occur are dropped without first being logged (where it makes sense for this to happen)
- add a 'requires' file that lists cloud-init dependencies
  - applying it in package creation (bdeb and brpm) as well as using it in the modified setup.py to ensure dependencies are installed when using that method of packaging
- add a 'version.py' that lists the active version (in code) so that code inside cloud-init can report the version in messaging and other config files
- cleanup of subprocess usage so that all subprocess calls go through the subp() utility method, which now has an exception type that will provide detailed information on python 2.6 and 2.7
- forced all code loading, moving, chmod, writing files and other system level actions to go through a standard set of util functions; this greatly helps in debugging and determining exactly which system actions cloud-init is performing
- switched out the templating engine cheetah for tempita, since tempita has no external dependencies (minus python) while cheetah has many dependencies, which makes it more difficult to adopt cloud-init in distros that may not have those dependencies
- adjust url fetching and url trying to go through a single function that reads urls in the new 'url helper' file; this helps in tracing, debugging and knowing which urls are being called and/or posted to from within cloud-init code
- add in the sending of a 'User-Agent' header for all urls fetched that do not provide their own header mapping; derive this user-agent from the following template, 'Cloud-Init/{version}', where the version is the cloud-init version number
- using prettytable for netinfo 'debug' printing since it provides a standard and defined output that should be easier to parse than a custom format
- add a set of distro specific classes that handle distro specific actions that modules and/or handler code can use as needed. this is organized into a base abstract class with child classes that implement the shared functionality; config determines exactly which subclass to load, so it can be easily extended as needed
  - current functionality:
    - network interface config file writing
    - hostname setting/updating
    - locale/timezone setting
    - updating of /etc/hosts (with templates or generically)
    - package commands (ie installing, removing)/mirror finding
    - interface up/down activating
  - implemented a debian + ubuntu subclass
  - implemented a redhat + fedora subclass
- adjust the root 'cloud.cfg' file to now have distribution/path specific configuration values in it; these special configs are merged as the normal config is, but the system level config is not passed into modules/handlers - modules/handlers must go through the paths and distro objects instead
- have the cloudstack datasource test the url before calling into boto, to avoid the long wait for boto to finish retrying and finally fail when the gateway meta-data address is unavailable
- add a simple mock ec2 meta-data python based http server that can serve a very simple set of ec2 meta-data back to callers; useful for testing or for understanding what the ec2 meta-data service can provide in terms of data or functionality
- for ssh key and authorized key file parsing, add in classes and util functions that maintain the state of individual lines, allowing for a clearer separation of parsing and modification (useful for testing and tracing)
- add a set of 'base' init.d scripts that can be used on systems that do not have full upstart or systemd support (or support that does not match the standard fedora/ubuntu implementation)
  - currently these are being tested on RHEL 6.2
- separate the datasources into their own subdirectory (instead of being a top-level item); this matches how config 'modules' and user-data 'handlers' are also in their own subdirectories (thus helping new developers and others understand the code layout more quickly)
- add the building of rpms based off a new cli tool and template 'spec' file that will templatize and perform the necessary commands to create a source and binary package to be used with a cloud-init install on an 'rpm' supporting system
  - uses the new standard set of requires and converts those pypi requirements into a local set of package requirements (that are known to exist on RHEL systems but should also exist on fedora systems)
- adjust the bdeb builder to be a python script (instead of a shell script) and make its 'control' file a template that takes in the standard set of pypi dependencies and uses a local mapping (known to work on ubuntu) to create the package's set of dependencies (that should also work on ubuntu-like systems)
- pythonify a large set of various pieces of code
  - remove wrapping return statements with () when it has no effect
  - upper case all constants used
  - correctly 'case' class and method names (where applicable)
  - use os.path.join (and similar commands) instead of custom path creation
  - use 'is None' instead of the frowned upon '== None', which picks up a larger set of 'true' cases than is typically desired (ie for objects that have their own equality)
  - use context managers on locks, tempdir, chdir, file, selinux, umask, unmounting commands so that these actions do not have to be closed and/or cleaned up manually in finally blocks, which is typically not done and will eventually be a bug in the future
  - use the 'abc' module for abstract class bases where possible
    - applied in the datasource root class, the distro root class, and the user-data v2 root class
- when loading yaml, check that the 'root' type matches a predefined set of valid types (typically just 'dict') and throw a type error if a mismatch occurs; this seems to be a good idea to do when loading user config files
- when forking a long running task (ie resizing a filesystem), use a new util function that will fork and then call a callback, instead of having to implement all that code in a non-shared location (thus allowing it to be used by others in the future)
- when writing out filenames, go through a util function that will attempt to ensure that the given filename is 'filesystem' safe by replacing '/' with '_' and removing characters which do not match a given whitelist of allowed filename characters
- for the varying usages of the 'blkid' command, make a function in the util module that can be used as the single point of entry for interaction with that command (and its results) instead of having X separate implementations
- place the rfc 2822 time formatting and uptime repeated pieces of code in the util module as a set of functions with the names 'time_rfc2822'/'uptime'
- separate the pylint+pep8 calling from one tool into two individual tools so that they can be called independently; add makefile sections that can be used to call these independently
- remove the support for the old style config that was previously located in '/etc/ec2-init/ec2-config.cfg' - no longer supported!
- instead of using an altered config parser that added its own 'dummy' section in the 'mcollective' module, use configobj, which handles the parsing of config without sections better (and it also maintains comments instead of removing them)
- use the new defaulting config parser (that will not raise errors on sections that do not exist, or return errors when values are fetched that do not exist) in the 'puppet' module
- for config 'modules', add in the ability for a module to provide a list of distro names which it is known to work with; if, when run, the name of the distro being used does not match one of those in this list, a warning will be written out saying that this module may not work correctly on this distribution
- for all dynamically imported modules, ensure that they are fixed up before they are used by ensuring that they have certain attributes; if they do not have those attributes, they will be set to a sensible set of defaults instead
- adjust all 'config' modules and handlers to use the adjusted util functions and the new distro objects where applicable, so that those pieces of code can benefit from the unified and enhanced functionality being provided in that util module
- fix a potential bug whereby when a #includeonce was encountered it would enable checking of urls against a cache; if later a #include was encountered it would continue checking against that cache, instead of refetching (which would likely be the expected case)
- add an openstack/nova based pep8 extension utility ('hacking.py') that allows for custom checks (along with the standard pep8 checks) to occur when running 'make pep8' and its derivatives
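The merge ordering above can be illustrated with a minimal 'later overrides earlier' dict merge; this is a hypothetical sketch, not the merging class this branch introduces:

    # Hypothetical illustration only. Later entries override earlier ones,
    # so this ordering encodes:
    # default < base < datasource < instance < environment < cli
    def merge_cfgs(cfg_sources):
        merged = {}
        for cfg in cfg_sources:
            for (k, v) in cfg.items():
                if isinstance(v, dict) and isinstance(merged.get(k), dict):
                    merged[k] = merge_cfgs([merged[k], v])  # merge nested dicts
                else:
                    merged[k] = v  # later source wins
        return merged

    print(merge_cfgs([{'log': 'syslog', 'mode': 'net'}, {'mode': 'local'}]))
    # {'log': 'syslog', 'mode': 'local'}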
Diffstat (limited to 'cloudinit/sources')
-rw-r--r-- cloudinit/sources/DataSourceCloudStack.py  | 147
-rw-r--r-- cloudinit/sources/DataSourceConfigDrive.py | 226
-rw-r--r-- cloudinit/sources/DataSourceEc2.py         | 265
-rw-r--r-- cloudinit/sources/DataSourceMAAS.py        | 264
-rw-r--r-- cloudinit/sources/DataSourceNoCloud.py     | 228
-rw-r--r-- cloudinit/sources/DataSourceOVF.py         | 293
-rw-r--r-- cloudinit/sources/__init__.py              | 223
7 files changed, 1646 insertions(+), 0 deletions(-)
diff --git a/cloudinit/sources/DataSourceCloudStack.py b/cloudinit/sources/DataSourceCloudStack.py
new file mode 100644
index 00000000..751bef4f
--- /dev/null
+++ b/cloudinit/sources/DataSourceCloudStack.py
@@ -0,0 +1,147 @@
+# vi: ts=4 expandtab
+#
+# Copyright (C) 2012 Canonical Ltd.
+# Copyright (C) 2012 Cosmin Luta
+# Copyright (C) 2012 Yahoo! Inc.
+#
+# Author: Cosmin Luta <q4break@gmail.com>
+# Author: Scott Moser <scott.moser@canonical.com>
+# Author: Joshua Harlow <harlowja@yahoo-inc.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 3, as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+from socket import inet_ntoa
+from struct import pack
+
+import os
+import time
+
+import boto.utils as boto_utils
+
+from cloudinit import log as logging
+from cloudinit import sources
+from cloudinit import url_helper as uhelp
+from cloudinit import util
+
+LOG = logging.getLogger(__name__)
+
+
+class DataSourceCloudStack(sources.DataSource):
+ def __init__(self, sys_cfg, distro, paths):
+ sources.DataSource.__init__(self, sys_cfg, distro, paths)
+ self.seed_dir = os.path.join(paths.seed_dir, 'cs')
+ # Cloudstack has its metadata/userdata URLs located at
+ # http://<default-gateway-ip>/latest/
+ self.api_ver = 'latest'
+ gw_addr = self.get_default_gateway()
+ if not gw_addr:
+ raise RuntimeError("No default gateway found!")
+ self.metadata_address = "http://%s/" % (gw_addr)
+
+ def get_default_gateway(self):
+ """ Returns the default gateway ip address in the dotted format
+ """
+ lines = util.load_file("/proc/net/route").splitlines()
+ for line in lines:
+ items = line.split("\t")
+ if items[1] == "00000000":
+ # Found the default route, get the gateway
+ gw = inet_ntoa(pack("<L", int(items[2], 16)))
+ LOG.debug("Found default route, gateway is %s", gw)
+ return gw
+ return None
+
+ def __str__(self):
+ return util.obj_name(self)
+
+ def _get_url_settings(self):
+ mcfg = self.ds_cfg
+ if not mcfg:
+ mcfg = {}
+ max_wait = 120
+ try:
+ max_wait = int(mcfg.get("max_wait", max_wait))
+ except Exception:
+ util.logexc(LOG, "Failed to get max wait. using %s", max_wait)
+
+ timeout = 50
+ try:
+ timeout = int(mcfg.get("timeout", timeout))
+ except Exception:
+ util.logexc(LOG, "Failed to get timeout, using %s", timeout)
+
+ return (max_wait, timeout)
+
+ def wait_for_metadata_service(self):
+        (max_wait, timeout) = self._get_url_settings()
+        if max_wait == 0:
+            return False
+
+ urls = [self.metadata_address]
+ start_time = time.time()
+ url = uhelp.wait_for_url(urls=urls, max_wait=max_wait,
+ timeout=timeout, status_cb=LOG.warn)
+
+ if url:
+ LOG.debug("Using metadata source: '%s'", url)
+ else:
+ LOG.critical(("Giving up on waiting for the metadata from %s"
+ " after %s seconds"),
+ urls, int(time.time() - start_time))
+
+ return bool(url)
+
+ def get_data(self):
+ seed_ret = {}
+ if util.read_optional_seed(seed_ret, base=(self.seed_dir + "/")):
+ self.userdata_raw = seed_ret['user-data']
+ self.metadata = seed_ret['meta-data']
+ LOG.debug("Using seeded cloudstack data from: %s", self.seed_dir)
+ return True
+ try:
+ if not self.wait_for_metadata_service():
+ return False
+ start_time = time.time()
+ self.userdata_raw = boto_utils.get_instance_userdata(self.api_ver,
+ None, self.metadata_address)
+ self.metadata = boto_utils.get_instance_metadata(self.api_ver,
+ self.metadata_address)
+ LOG.debug("Crawl of metadata service took %s seconds",
+ int(time.time() - start_time))
+ return True
+ except Exception:
+ util.logexc(LOG, ('Failed fetching from metadata '
+ 'service %s'), self.metadata_address)
+ return False
+
+ def get_instance_id(self):
+ return self.metadata['instance-id']
+
+ def get_availability_zone(self):
+ return self.metadata['availability-zone']
+
+
+# Used to match classes to dependencies
+datasources = [
+ (DataSourceCloudStack, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK)),
+]
+
+
+# Return a list of data sources that match this set of dependencies
+def get_datasource_list(depends):
+ return sources.list_from_depends(depends, datasources)
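An aside on get_default_gateway() above: /proc/net/route stores the gateway field as little-endian hex, so pack() plus inet_ntoa() recover the dotted-quad form. A standalone sketch (the sample route line is made up):

    from socket import inet_ntoa
    from struct import pack

    def gw_from_route_line(line):
        items = line.split("\t")
        if items[1] == "00000000":  # destination 0.0.0.0 == default route
            return inet_ntoa(pack("<L", int(items[2], 16)))
        return None

    # 192.168.0.1 is stored little-endian as '0100A8C0'
    print(gw_from_route_line("eth0\t00000000\t0100A8C0\t0003"))  # 192.168.0.1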
diff --git a/cloudinit/sources/DataSourceConfigDrive.py b/cloudinit/sources/DataSourceConfigDrive.py
new file mode 100644
index 00000000..320dd1d1
--- /dev/null
+++ b/cloudinit/sources/DataSourceConfigDrive.py
@@ -0,0 +1,226 @@
+# vi: ts=4 expandtab
+#
+# Copyright (C) 2012 Canonical Ltd.
+# Copyright (C) 2012 Yahoo! Inc.
+#
+# Author: Scott Moser <scott.moser@canonical.com>
+# Author: Joshua Harlow <harlowja@yahoo-inc.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 3, as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+import json
+import os
+
+from cloudinit import log as logging
+from cloudinit import sources
+from cloudinit import util
+
+LOG = logging.getLogger(__name__)
+
+# Various defaults/constants...
+DEFAULT_IID = "iid-dsconfigdrive"
+DEFAULT_MODE = 'pass'
+CFG_DRIVE_FILES = [
+ "etc/network/interfaces",
+ "root/.ssh/authorized_keys",
+ "meta.js",
+]
+DEFAULT_METADATA = {
+ "instance-id": DEFAULT_IID,
+ "dsmode": DEFAULT_MODE,
+}
+CFG_DRIVE_DEV_ENV = 'CLOUD_INIT_CONFIG_DRIVE_DEVICE'
+
+
+class DataSourceConfigDrive(sources.DataSource):
+ def __init__(self, sys_cfg, distro, paths):
+ sources.DataSource.__init__(self, sys_cfg, distro, paths)
+ self.seed = None
+ self.cfg = {}
+ self.dsmode = 'local'
+ self.seed_dir = os.path.join(paths.seed_dir, 'config_drive')
+
+ def __str__(self):
+ mstr = "%s [%s]" % (util.obj_name(self), self.dsmode)
+ mstr += "[seed=%s]" % (self.seed)
+ return mstr
+
+ def get_data(self):
+ found = None
+ md = {}
+ ud = ""
+
+ if os.path.isdir(self.seed_dir):
+ try:
+ (md, ud) = read_config_drive_dir(self.seed_dir)
+ found = self.seed_dir
+ except NonConfigDriveDir:
+ util.logexc(LOG, "Failed reading config drive from %s",
+ self.seed_dir)
+ if not found:
+ dev = find_cfg_drive_device()
+ if dev:
+ try:
+ (md, ud) = util.mount_cb(dev, read_config_drive_dir)
+ found = dev
+ except (NonConfigDriveDir, util.MountFailedError):
+ pass
+
+ if not found:
+ return False
+
+        if 'dscfg' in md:
+ self.cfg = md['dscfg']
+
+ md = util.mergedict(md, DEFAULT_METADATA)
+
+ # Update interfaces and ifup only on the local datasource
+ # this way the DataSourceConfigDriveNet doesn't do it also.
+ if 'network-interfaces' in md and self.dsmode == "local":
+ LOG.debug("Updating network interfaces from config drive (%s)",
+ md['dsmode'])
+ self.distro.apply_network(md['network-interfaces'])
+
+ self.seed = found
+ self.metadata = md
+ self.userdata_raw = ud
+
+ if md['dsmode'] == self.dsmode:
+ return True
+
+ LOG.debug("%s: not claiming datasource, dsmode=%s", self, md['dsmode'])
+ return False
+
+ def get_public_ssh_keys(self):
+        if 'public-keys' not in self.metadata:
+ return []
+ return self.metadata['public-keys']
+
+    # The data sources' config_obj is a cloud-config formatted
+ # object that came to it from ways other than cloud-config
+ # because cloud-config content would be handled elsewhere
+ def get_config_obj(self):
+ return self.cfg
+
+
+class DataSourceConfigDriveNet(DataSourceConfigDrive):
+ def __init__(self, sys_cfg, distro, paths):
+ DataSourceConfigDrive.__init__(self, sys_cfg, distro, paths)
+ self.dsmode = 'net'
+
+
+class NonConfigDriveDir(Exception):
+ pass
+
+
+def find_cfg_drive_device():
+ """ Get the config drive device. Return a string like '/dev/vdb'
+ or None (if there is no non-root device attached). This does not
+ check the contents, only reports that if there *were* a config_drive
+ attached, it would be this device.
+ Note: per config_drive documentation, this is
+ "associated as the last available disk on the instance"
+ """
+
+ # This seems to be for debugging??
+ if CFG_DRIVE_DEV_ENV in os.environ:
+ return os.environ[CFG_DRIVE_DEV_ENV]
+
+ # We are looking for a raw block device (sda, not sda1) with a vfat
+ # filesystem on it....
+ letters = "abcdefghijklmnopqrstuvwxyz"
+ devs = util.find_devs_with("TYPE=vfat")
+
+ # Filter out anything not ending in a letter (ignore partitions)
+ devs = [f for f in devs if f[-1] in letters]
+
+ # Sort them in reverse so "last" device is first
+ devs.sort(reverse=True)
+
+ if devs:
+ return devs[0]
+
+ return None
+
+
+def read_config_drive_dir(source_dir):
+ """
+ read_config_drive_dir(source_dir):
+ read source_dir, and return a tuple with metadata dict and user-data
+ string populated. If not a valid dir, raise a NonConfigDriveDir
+ """
+
+ # TODO: fix this for other operating systems...
+ # Ie: this is where https://fedorahosted.org/netcf/ or similar should
+ # be hooked in... (or could be)
+ found = {}
+ for af in CFG_DRIVE_FILES:
+ fn = os.path.join(source_dir, af)
+ if os.path.isfile(fn):
+ found[af] = fn
+
+ if len(found) == 0:
+ raise NonConfigDriveDir("%s: %s" % (source_dir, "no files found"))
+
+ md = {}
+ ud = ""
+ keydata = ""
+ if "etc/network/interfaces" in found:
+ fn = found["etc/network/interfaces"]
+ md['network-interfaces'] = util.load_file(fn)
+
+ if "root/.ssh/authorized_keys" in found:
+ fn = found["root/.ssh/authorized_keys"]
+ keydata = util.load_file(fn)
+
+ meta_js = {}
+ if "meta.js" in found:
+ fn = found['meta.js']
+ content = util.load_file(fn)
+ try:
+            # Just check if it's really json...
+ meta_js = json.loads(content)
+ if not isinstance(meta_js, (dict)):
+ raise TypeError("Dict expected for meta.js root node")
+ except (ValueError, TypeError) as e:
+ raise NonConfigDriveDir("%s: %s, %s" %
+ (source_dir, "invalid json in meta.js", e))
+ md['meta_js'] = content
+
+ # Key data override??
+ keydata = meta_js.get('public-keys', keydata)
+ if keydata:
+ lines = keydata.splitlines()
+ md['public-keys'] = [l for l in lines
+ if len(l) and not l.startswith("#")]
+
+ for copy in ('dsmode', 'instance-id', 'dscfg'):
+ if copy in meta_js:
+ md[copy] = meta_js[copy]
+
+ if 'user-data' in meta_js:
+ ud = meta_js['user-data']
+
+ return (md, ud)
+
+
+# Used to match classes to dependencies
+datasources = [
+ (DataSourceConfigDrive, (sources.DEP_FILESYSTEM, )),
+ (DataSourceConfigDriveNet, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK)),
+]
+
+
+# Return a list of data sources that match this set of dependencies
+def get_datasource_list(depends):
+ return sources.list_from_depends(depends, datasources)
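An aside on the meta.js handling in read_config_drive_dir() above: the root node must be a JSON dict, and a 'public-keys' entry in it overrides keys read from authorized_keys. A runnable sketch with made-up content:

    import json

    # Made-up meta.js content for illustration
    content = '{"instance-id": "iid-abc123", "public-keys": "ssh-rsa AAAA... demo"}'
    meta_js = json.loads(content)
    if not isinstance(meta_js, dict):
        raise TypeError("Dict expected for meta.js root node")

    keydata = meta_js.get('public-keys', "")  # overrides authorized_keys data
    print([l for l in keydata.splitlines()
           if len(l) and not l.startswith("#")])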
diff --git a/cloudinit/sources/DataSourceEc2.py b/cloudinit/sources/DataSourceEc2.py
new file mode 100644
index 00000000..cb460de1
--- /dev/null
+++ b/cloudinit/sources/DataSourceEc2.py
@@ -0,0 +1,265 @@
+# vi: ts=4 expandtab
+#
+# Copyright (C) 2009-2010 Canonical Ltd.
+# Copyright (C) 2012 Hewlett-Packard Development Company, L.P.
+# Copyright (C) 2012 Yahoo! Inc.
+#
+# Author: Scott Moser <scott.moser@canonical.com>
+# Author: Juerg Hafliger <juerg.haefliger@hp.com>
+# Author: Joshua Harlow <harlowja@yahoo-inc.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 3, as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+import os
+import time
+
+import boto.utils as boto_utils
+
+from cloudinit import log as logging
+from cloudinit import sources
+from cloudinit import url_helper as uhelp
+from cloudinit import util
+
+LOG = logging.getLogger(__name__)
+
+DEF_MD_URL = "http://169.254.169.254"
+
+# Which version we are requesting of the ec2 metadata apis
+DEF_MD_VERSION = '2009-04-04'
+
+# Default metadata urls that will be used if none are provided
+# They will be checked for 'resolveability' and some of the
+# following may be discarded if they do not resolve
+DEF_MD_URLS = [DEF_MD_URL, "http://instance-data:8773"]
+
+
+class DataSourceEc2(sources.DataSource):
+ def __init__(self, sys_cfg, distro, paths):
+ sources.DataSource.__init__(self, sys_cfg, distro, paths)
+ self.metadata_address = DEF_MD_URL
+ self.seed_dir = os.path.join(paths.seed_dir, "ec2")
+ self.api_ver = DEF_MD_VERSION
+
+ def __str__(self):
+ return util.obj_name(self)
+
+ def get_data(self):
+ seed_ret = {}
+ if util.read_optional_seed(seed_ret, base=(self.seed_dir + "/")):
+ self.userdata_raw = seed_ret['user-data']
+ self.metadata = seed_ret['meta-data']
+ LOG.debug("Using seeded ec2 data from %s", self.seed_dir)
+ return True
+
+ try:
+ if not self.wait_for_metadata_service():
+ return False
+ start_time = time.time()
+ self.userdata_raw = boto_utils.get_instance_userdata(self.api_ver,
+ None, self.metadata_address)
+ self.metadata = boto_utils.get_instance_metadata(self.api_ver,
+ self.metadata_address)
+ LOG.debug("Crawl of metadata service took %s seconds",
+ int(time.time() - start_time))
+ return True
+ except Exception:
+ util.logexc(LOG, "Failed reading from metadata address %s",
+ self.metadata_address)
+ return False
+
+ def get_instance_id(self):
+ return self.metadata['instance-id']
+
+ def get_availability_zone(self):
+ return self.metadata['placement']['availability-zone']
+
+ def get_local_mirror(self):
+ return self.get_mirror_from_availability_zone()
+
+ def get_mirror_from_availability_zone(self, availability_zone=None):
+        # Availability zone is like 'us-west-1b' or 'eu-west-1a'
+ if availability_zone is None:
+ availability_zone = self.get_availability_zone()
+
+ if self.is_vpc():
+ return None
+
+ # Use the distro to get the mirror
+ if not availability_zone:
+ return None
+
+ mirror_tpl = self.distro.get_option('availability_zone_template')
+ if not mirror_tpl:
+ return None
+
+ tpl_params = {
+ 'zone': availability_zone.strip(),
+ }
+ mirror_url = mirror_tpl % (tpl_params)
+
+        (max_wait, timeout) = self._get_url_settings()
+        if max_wait == 0:
+            return None
+ worked = uhelp.wait_for_url([mirror_url], max_wait=max_wait,
+ timeout=timeout, status_cb=LOG.warn)
+ if not worked:
+ return None
+
+ return mirror_url
+
+ def _get_url_settings(self):
+ mcfg = self.ds_cfg
+ if not mcfg:
+ mcfg = {}
+ max_wait = 120
+ try:
+ max_wait = int(mcfg.get("max_wait", max_wait))
+ except Exception:
+ util.logexc(LOG, "Failed to get max wait. using %s", max_wait)
+
+ timeout = 50
+ try:
+ timeout = int(mcfg.get("timeout", timeout))
+ except Exception:
+ util.logexc(LOG, "Failed to get timeout, using %s", timeout)
+
+ return (max_wait, timeout)
+
+ def wait_for_metadata_service(self):
+ mcfg = self.ds_cfg
+ if not mcfg:
+ mcfg = {}
+
+        (max_wait, timeout) = self._get_url_settings()
+        if max_wait == 0:
+            return False
+
+        # Remove addresses from the list that won't resolve.
+ mdurls = mcfg.get("metadata_urls", DEF_MD_URLS)
+ filtered = [x for x in mdurls if util.is_resolvable_url(x)]
+
+ if set(filtered) != set(mdurls):
+ LOG.debug("Removed the following from metadata urls: %s",
+ list((set(mdurls) - set(filtered))))
+
+ if len(filtered):
+ mdurls = filtered
+ else:
+ LOG.warn("Empty metadata url list! using default list")
+ mdurls = DEF_MD_URLS
+
+ urls = []
+ url2base = {}
+ for url in mdurls:
+ cur = "%s/%s/meta-data/instance-id" % (url, self.api_ver)
+ urls.append(cur)
+ url2base[cur] = url
+
+ start_time = time.time()
+ url = uhelp.wait_for_url(urls=urls, max_wait=max_wait,
+ timeout=timeout, status_cb=LOG.warn)
+
+ if url:
+ LOG.debug("Using metadata source: '%s'", url2base[url])
+ else:
+ LOG.critical("Giving up on md from %s after %s seconds",
+ urls, int(time.time() - start_time))
+
+ self.metadata_address = url2base.get(url)
+ return bool(url)
+
+ def _remap_device(self, short_name):
+ # LP: #611137
+ # the metadata service may believe that devices are named 'sda'
+ # when the kernel named them 'vda' or 'xvda'
+ # we want to return the correct value for what will actually
+ # exist in this instance
+ mappings = {"sd": ("vd", "xvd")}
+ for (nfrom, tlist) in mappings.iteritems():
+ if not short_name.startswith(nfrom):
+ continue
+ for nto in tlist:
+ cand = "/dev/%s%s" % (nto, short_name[len(nfrom):])
+ if os.path.exists(cand):
+ return cand
+ return None
+
+ def device_name_to_device(self, name):
+ # Consult metadata service, that has
+ # ephemeral0: sdb
+ # and return 'sdb' for input 'ephemeral0'
+ if 'block-device-mapping' not in self.metadata:
+ return None
+
+ # Example:
+ # 'block-device-mapping':
+ # {'ami': '/dev/sda1',
+ # 'ephemeral0': '/dev/sdb',
+ # 'root': '/dev/sda1'}
+ found = None
+ bdm_items = self.metadata['block-device-mapping'].iteritems()
+ for (entname, device) in bdm_items:
+ if entname == name:
+ found = device
+ break
+ # LP: #513842 mapping in Euca has 'ephemeral' not 'ephemeral0'
+ if entname == "ephemeral" and name == "ephemeral0":
+ found = device
+
+ if found is None:
+ LOG.debug("Unable to convert %s to a device", name)
+ return None
+
+ ofound = found
+ if not found.startswith("/"):
+ found = "/dev/%s" % found
+
+ if os.path.exists(found):
+ return found
+
+ remapped = self._remap_device(os.path.basename(found))
+ if remapped:
+ LOG.debug("Remapped device name %s => %s", (found, remapped))
+ return remapped
+
+ # On t1.micro, ephemeral0 will appear in block-device-mapping from
+ # metadata, but it will not exist on disk (and never will)
+ # at this point, we've verified that the path did not exist
+ # in the special case of 'ephemeral0' return None to avoid bogus
+ # fstab entry (LP: #744019)
+ if name == "ephemeral0":
+ return None
+ return ofound
+
+ def is_vpc(self):
+ # See: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/615545
+ # Detect that the machine was launched in a VPC.
+ # But I did notice that when in a VPC, meta-data
+ # does not have public-ipv4 and public-hostname
+ # listed as a possibility.
+ ph = "public-hostname"
+ p4 = "public-ipv4"
+ if ((ph not in self.metadata or self.metadata[ph] == "") and
+ (p4 not in self.metadata or self.metadata[p4] == "")):
+ return True
+ return False
+
+
+# Used to match classes to dependencies
+datasources = [
+ (DataSourceEc2, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK)),
+]
+
+
+# Return a list of data sources that match this set of dependencies
+def get_datasource_list(depends):
+ return sources.list_from_depends(depends, datasources)
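An aside on _remap_device() above (LP: #611137): the metadata service may report 'sd' device names where the kernel used 'vd' or 'xvd'. The same lookup restated with an injected exists() so the sketch runs anywhere:

    import os

    def remap_device(short_name, exists=os.path.exists):
        mappings = {"sd": ("vd", "xvd")}
        for (nfrom, tlist) in mappings.items():
            if not short_name.startswith(nfrom):
                continue
            for nto in tlist:
                cand = "/dev/%s%s" % (nto, short_name[len(nfrom):])
                if exists(cand):
                    return cand
        return None

    # With a fake exists() for demonstration:
    print(remap_device("sda1", exists=lambda p: p == "/dev/xvda1"))  # /dev/xvda1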
diff --git a/cloudinit/sources/DataSourceMAAS.py b/cloudinit/sources/DataSourceMAAS.py
new file mode 100644
index 00000000..f16d5c21
--- /dev/null
+++ b/cloudinit/sources/DataSourceMAAS.py
@@ -0,0 +1,264 @@
+# vi: ts=4 expandtab
+#
+# Copyright (C) 2012 Canonical Ltd.
+# Copyright (C) 2012 Yahoo! Inc.
+#
+# Author: Scott Moser <scott.moser@canonical.com>
+# Author: Joshua Harlow <harlowja@yahoo-inc.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 3, as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+import errno
+import oauth.oauth as oauth
+import os
+import time
+import urllib2
+
+from cloudinit import log as logging
+from cloudinit import sources
+from cloudinit import url_helper as uhelp
+from cloudinit import util
+
+LOG = logging.getLogger(__name__)
+MD_VERSION = "2012-03-01"
+
+
+class DataSourceMAAS(sources.DataSource):
+ """
+ DataSourceMAAS reads instance information from MAAS.
+ Given a config metadata_url, and oauth tokens, it expects to find
+ files under the root named:
+ instance-id
+ user-data
+ hostname
+ """
+ def __init__(self, sys_cfg, distro, paths):
+ sources.DataSource.__init__(self, sys_cfg, distro, paths)
+ self.base_url = None
+ self.seed_dir = os.path.join(paths.seed_dir, 'maas')
+
+ def __str__(self):
+ return "%s [%s]" % (util.obj_name(self), self.base_url)
+
+ def get_data(self):
+ mcfg = self.ds_cfg
+
+ try:
+ (userdata, metadata) = read_maas_seed_dir(self.seed_dir)
+ self.userdata_raw = userdata
+ self.metadata = metadata
+ self.base_url = self.seed_dir
+ return True
+ except MAASSeedDirNone:
+ pass
+ except MAASSeedDirMalformed as exc:
+ LOG.warn("%s was malformed: %s" % (self.seed_dir, exc))
+ raise
+
+ # If there is no metadata_url, then we're not configured
+ url = mcfg.get('metadata_url', None)
+ if not url:
+ return False
+
+ try:
+ if not self.wait_for_metadata_service(url):
+ return False
+
+ self.base_url = url
+
+ (userdata, metadata) = read_maas_seed_url(self.base_url,
+ self.md_headers)
+ self.userdata_raw = userdata
+ self.metadata = metadata
+ return True
+ except Exception:
+ util.logexc(LOG, "Failed fetching metadata from url %s", url)
+ return False
+
+ def md_headers(self, url):
+ mcfg = self.ds_cfg
+
+ # If we are missing token_key, token_secret or consumer_key
+ # then just do non-authed requests
+ for required in ('token_key', 'token_secret', 'consumer_key'):
+ if required not in mcfg:
+ return {}
+
+ consumer_secret = mcfg.get('consumer_secret', "")
+ return oauth_headers(url=url,
+ consumer_key=mcfg['consumer_key'],
+ token_key=mcfg['token_key'],
+ token_secret=mcfg['token_secret'],
+ consumer_secret=consumer_secret)
+
+ def wait_for_metadata_service(self, url):
+ mcfg = self.ds_cfg
+
+ max_wait = 120
+ try:
+ max_wait = int(mcfg.get("max_wait", max_wait))
+ except Exception:
+ util.logexc(LOG, "Failed to get max wait. using %s", max_wait)
+
+ if max_wait == 0:
+ return False
+
+ timeout = 50
+ try:
+            if "timeout" in mcfg:
+                timeout = int(mcfg.get("timeout", timeout))
+        except Exception:
+            LOG.warn("Failed to get timeout, using %s", timeout)
+
+ starttime = time.time()
+ check_url = "%s/%s/meta-data/instance-id" % (url, MD_VERSION)
+ urls = [check_url]
+ url = uhelp.wait_for_url(urls=urls, max_wait=max_wait,
+ timeout=timeout, status_cb=LOG.warn,
+ headers_cb=self.md_headers)
+
+ if url:
+ LOG.debug("Using metadata source: '%s'", url)
+ else:
+ LOG.critical("Giving up on md from %s after %i seconds",
+ urls, int(time.time() - starttime))
+
+ return bool(url)
+
+
+def read_maas_seed_dir(seed_d):
+ """
+ Return user-data and metadata for a maas seed dir in seed_d.
+    Expected files in seed_d are the following:
+ * instance-id
+ * local-hostname
+ * user-data
+ """
+ if not os.path.isdir(seed_d):
+ raise MAASSeedDirNone("%s: not a directory")
+
+ files = ('local-hostname', 'instance-id', 'user-data', 'public-keys')
+ md = {}
+ for fname in files:
+ try:
+ md[fname] = util.load_file(os.path.join(seed_d, fname))
+ except IOError as e:
+ if e.errno != errno.ENOENT:
+ raise
+
+ return check_seed_contents(md, seed_d)
+
+
+def read_maas_seed_url(seed_url, header_cb=None, timeout=None,
+ version=MD_VERSION):
+ """
+ Read the maas datasource at seed_url.
+ header_cb is a method that should return a headers dictionary that will
+ be given to urllib2.Request()
+
+    Expected format of seed_url is the following files:
+ * <seed_url>/<version>/meta-data/instance-id
+ * <seed_url>/<version>/meta-data/local-hostname
+ * <seed_url>/<version>/user-data
+ """
+ base_url = "%s/%s" % (seed_url, version)
+ file_order = [
+ 'local-hostname',
+ 'instance-id',
+ 'public-keys',
+ 'user-data',
+ ]
+ files = {
+ 'local-hostname': "%s/%s" % (base_url, 'meta-data/local-hostname'),
+ 'instance-id': "%s/%s" % (base_url, 'meta-data/instance-id'),
+ 'public-keys': "%s/%s" % (base_url, 'meta-data/public-keys'),
+ 'user-data': "%s/%s" % (base_url, 'user-data'),
+ }
+ md = {}
+ for name in file_order:
+ url = files.get(name)
+ if header_cb:
+ headers = header_cb(url)
+ else:
+ headers = {}
+ try:
+ resp = uhelp.readurl(url, headers=headers, timeout=timeout)
+ if resp.ok():
+ md[name] = str(resp)
+ else:
+ LOG.warn(("Fetching from %s resulted in"
+ " an invalid http code %s"), url, resp.code)
+ except urllib2.HTTPError as e:
+ if e.code != 404:
+ raise
+ return check_seed_contents(md, seed_url)
+
+
+def check_seed_contents(content, seed):
+ """Validate if content is Is the content a dict that is valid as a
+ return for a datasource.
+ Either return a (userdata, metadata) tuple or
+ Raise MAASSeedDirMalformed or MAASSeedDirNone
+ """
+ md_required = ('instance-id', 'local-hostname')
+ if len(content) == 0:
+ raise MAASSeedDirNone("%s: no data files found" % seed)
+
+ found = list(content.keys())
+ missing = [k for k in md_required if k not in found]
+ if len(missing):
+ raise MAASSeedDirMalformed("%s: missing files %s" % (seed, missing))
+
+ userdata = content.get('user-data', "")
+ md = {}
+ for (key, val) in content.iteritems():
+ if key == 'user-data':
+ continue
+ md[key] = val
+
+ return (userdata, md)
+
+
+def oauth_headers(url, consumer_key, token_key, token_secret, consumer_secret):
+ consumer = oauth.OAuthConsumer(consumer_key, consumer_secret)
+ token = oauth.OAuthToken(token_key, token_secret)
+ params = {
+ 'oauth_version': "1.0",
+ 'oauth_nonce': oauth.generate_nonce(),
+ 'oauth_timestamp': int(time.time()),
+ 'oauth_token': token.key,
+ 'oauth_consumer_key': consumer.key,
+ }
+ req = oauth.OAuthRequest(http_url=url, parameters=params)
+ req.sign_request(oauth.OAuthSignatureMethod_PLAINTEXT(),
+ consumer, token)
+ return req.to_header()
+
+
+class MAASSeedDirNone(Exception):
+ pass
+
+
+class MAASSeedDirMalformed(Exception):
+ pass
+
+
+# Used to match classes to dependencies
+datasources = [
+ (DataSourceMAAS, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK)),
+]
+
+
+# Return a list of data sources that match this set of dependencies
+def get_datasource_list(depends):
+ return sources.list_from_depends(depends, datasources)
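For reference, a sketch of the URL layout read_maas_seed_url() above expects, built the same way as in the code; the host in seed_url is a placeholder:

    MD_VERSION = "2012-03-01"
    seed_url = "http://maas.example.invalid/metadata"  # placeholder host
    base_url = "%s/%s" % (seed_url, MD_VERSION)
    for path in ('meta-data/local-hostname', 'meta-data/instance-id',
                 'meta-data/public-keys', 'user-data'):
        print("%s/%s" % (base_url, path))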
diff --git a/cloudinit/sources/DataSourceNoCloud.py b/cloudinit/sources/DataSourceNoCloud.py
new file mode 100644
index 00000000..bed500a2
--- /dev/null
+++ b/cloudinit/sources/DataSourceNoCloud.py
@@ -0,0 +1,228 @@
+# vi: ts=4 expandtab
+#
+# Copyright (C) 2009-2010 Canonical Ltd.
+# Copyright (C) 2012 Hewlett-Packard Development Company, L.P.
+# Copyright (C) 2012 Yahoo! Inc.
+#
+# Author: Scott Moser <scott.moser@canonical.com>
+# Author: Juerg Hafliger <juerg.haefliger@hp.com>
+# Author: Joshua Harlow <harlowja@yahoo-inc.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 3, as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+import errno
+import os
+
+from cloudinit import log as logging
+from cloudinit import sources
+from cloudinit import util
+
+LOG = logging.getLogger(__name__)
+
+
+class DataSourceNoCloud(sources.DataSource):
+ def __init__(self, sys_cfg, distro, paths):
+ sources.DataSource.__init__(self, sys_cfg, distro, paths)
+ self.dsmode = 'local'
+ self.seed = None
+ self.cmdline_id = "ds=nocloud"
+ self.seed_dir = os.path.join(paths.seed_dir, 'nocloud')
+ self.supported_seed_starts = ("/", "file://")
+
+ def __str__(self):
+ mstr = "%s [seed=%s][dsmode=%s]" % (util.obj_name(self),
+ self.seed, self.dsmode)
+ return mstr
+
+ def get_data(self):
+ defaults = {
+ "instance-id": "nocloud",
+ "dsmode": self.dsmode,
+ }
+
+ found = []
+ md = {}
+ ud = ""
+
+ try:
+ # Parse the kernel command line, getting data passed in
+ if parse_cmdline_data(self.cmdline_id, md):
+ found.append("cmdline")
+        except Exception:
+ util.logexc(LOG, "Unable to parse command line data")
+ return False
+
+ # Check to see if the seed dir has data.
+ seedret = {}
+ if util.read_optional_seed(seedret, base=self.seed_dir + "/"):
+ md = util.mergedict(md, seedret['meta-data'])
+ ud = seedret['user-data']
+ found.append(self.seed_dir)
+ LOG.debug("Using seeded cache data from %s", self.seed_dir)
+
+ # If the datasource config had a 'seedfrom' entry, then that takes
+ # precedence over a 'seedfrom' that was found in a filesystem
+ # but not over external media
+ if 'seedfrom' in self.ds_cfg and self.ds_cfg['seedfrom']:
+ found.append("ds_config")
+ md["seedfrom"] = self.ds_cfg['seedfrom']
+
+ fslist = util.find_devs_with("TYPE=vfat")
+ fslist.extend(util.find_devs_with("TYPE=iso9660"))
+
+ label_list = util.find_devs_with("LABEL=cidata")
+ devlist = list(set(fslist) & set(label_list))
+ devlist.sort(reverse=True)
+
+ for dev in devlist:
+ try:
+ LOG.debug("Attempting to use data from %s", dev)
+
+ (newmd, newud) = util.mount_cb(dev, util.read_seeded)
+ md = util.mergedict(newmd, md)
+ ud = newud
+
+ # For seed from a device, the default mode is 'net'.
+                # That is more likely to be what is desired.
+ # If they want dsmode of local, then they must
+ # specify that.
+ if 'dsmode' not in md:
+ md['dsmode'] = "net"
+
+ LOG.debug("Using data from %s", dev)
+ found.append(dev)
+ break
+ except OSError as e:
+ if e.errno != errno.ENOENT:
+ raise
+ except util.MountFailedError:
+ util.logexc(LOG, ("Failed to mount %s"
+ " when looking for data"), dev)
+
+ # There was no indication on kernel cmdline or data
+ # in the seeddir suggesting this handler should be used.
+ if len(found) == 0:
+ return False
+
+ seeded_interfaces = None
+
+ # The special argument "seedfrom" indicates we should
+ # attempt to seed the userdata / metadata from its value
+        # its primary value is in allowing the user to type less
+ # on the command line, ie: ds=nocloud;s=http://bit.ly/abcdefg
+ if "seedfrom" in md:
+ seedfrom = md["seedfrom"]
+ seedfound = False
+ for proto in self.supported_seed_starts:
+ if seedfrom.startswith(proto):
+ seedfound = proto
+ break
+ if not seedfound:
+ LOG.debug("Seed from %s not supported by %s", seedfrom, self)
+ return False
+
+ if 'network-interfaces' in md:
+ seeded_interfaces = self.dsmode
+
+ # This could throw errors, but the user told us to do it
+ # so if errors are raised, let them raise
+ (md_seed, ud) = util.read_seeded(seedfrom, timeout=None)
+ LOG.debug("Using seeded cache data from %s", seedfrom)
+
+ # Values in the command line override those from the seed
+ md = util.mergedict(md, md_seed)
+ found.append(seedfrom)
+
+ # Now that we have exhausted any other places merge in the defaults
+ md = util.mergedict(md, defaults)
+
+ # Update the network-interfaces if metadata had 'network-interfaces'
+ # entry and this is the local datasource, or 'seedfrom' was used
+ # and the source of the seed was self.dsmode
+    # ('local' for NoCloud, 'net' for NoCloudNet)
+ if ('network-interfaces' in md and
+ (self.dsmode in ("local", seeded_interfaces))):
+ LOG.debug("Updating network interfaces from %s", self)
+ self.distro.apply_network(md['network-interfaces'])
+
+ if md['dsmode'] == self.dsmode:
+ self.seed = ",".join(found)
+ self.metadata = md
+ self.userdata_raw = ud
+ return True
+
+ LOG.debug("%s: not claiming datasource, dsmode=%s", self, md['dsmode'])
+ return False
+
+
+# Returns true or false indicating if cmdline indicated
+# that this module should be used
+# Example cmdline:
+# root=LABEL=uec-rootfs ro ds=nocloud
+def parse_cmdline_data(ds_id, fill, cmdline=None):
+ if cmdline is None:
+ cmdline = util.get_cmdline()
+ cmdline = " %s " % cmdline
+
+ if not (" %s " % ds_id in cmdline or " %s;" % ds_id in cmdline):
+ return False
+
+ argline = ""
+ # cmdline can contain:
+ # ds=nocloud[;key=val;key=val]
+ for tok in cmdline.split():
+ if tok.startswith(ds_id):
+ argline = tok.split("=", 1)
+
+ # argline array is now 'nocloud' followed optionally by
+ # a ';' and then key=value pairs also terminated with ';'
+ tmp = argline[1].split(";")
+ if len(tmp) > 1:
+ kvpairs = tmp[1:]
+ else:
+ kvpairs = ()
+
+ # short2long mapping to save cmdline typing
+ s2l = {"h": "local-hostname", "i": "instance-id", "s": "seedfrom"}
+ for item in kvpairs:
+ try:
+ (k, v) = item.split("=", 1)
+        except ValueError:
+ k = item
+ v = None
+ if k in s2l:
+ k = s2l[k]
+ fill[k] = v
+
+ return True
+
+
+class DataSourceNoCloudNet(DataSourceNoCloud):
+ def __init__(self, sys_cfg, distro, paths):
+ DataSourceNoCloud.__init__(self, sys_cfg, distro, paths)
+ self.cmdline_id = "ds=nocloud-net"
+ self.supported_seed_starts = ("http://", "https://", "ftp://")
+ self.seed_dir = os.path.join(paths.seed_dir, 'nocloud-net')
+ self.dsmode = "net"
+
+
+# Used to match classes to dependencies
+datasources = [
+ (DataSourceNoCloud, (sources.DEP_FILESYSTEM, )),
+ (DataSourceNoCloudNet, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK)),
+]
+
+
+# Return a list of data sources that match this set of dependencies
+def get_datasource_list(depends):
+ return sources.list_from_depends(depends, datasources)
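A worked example of the key/value expansion in parse_cmdline_data() above, restated inline so it runs standalone; the seedfrom URL is a placeholder:

    # 'h', 'i' and 's' expand per the s2l map above
    s2l = {"h": "local-hostname", "i": "instance-id", "s": "seedfrom"}
    tok = "ds=nocloud;h=myhost;i=iid-42;s=http://example.invalid/seed/"
    fill = {}
    for item in tok.split("=", 1)[1].split(";")[1:]:
        (k, _, v) = item.partition("=")
        fill[s2l.get(k, k)] = v or None
    print(fill)
    # {'local-hostname': 'myhost', 'instance-id': 'iid-42',
    #  'seedfrom': 'http://example.invalid/seed/'}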
diff --git a/cloudinit/sources/DataSourceOVF.py b/cloudinit/sources/DataSourceOVF.py
new file mode 100644
index 00000000..7728b36f
--- /dev/null
+++ b/cloudinit/sources/DataSourceOVF.py
@@ -0,0 +1,293 @@
+# vi: ts=4 expandtab
+#
+# Copyright (C) 2011 Canonical Ltd.
+# Copyright (C) 2012 Hewlett-Packard Development Company, L.P.
+# Copyright (C) 2012 Yahoo! Inc.
+#
+# Author: Scott Moser <scott.moser@canonical.com>
+# Author: Juerg Hafliger <juerg.haefliger@hp.com>
+# Author: Joshua Harlow <harlowja@yahoo-inc.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 3, as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+from xml.dom import minidom
+
+import base64
+import os
+import re
+
+from cloudinit import log as logging
+from cloudinit import sources
+from cloudinit import util
+
+LOG = logging.getLogger(__name__)
+
+
+class DataSourceOVF(sources.DataSource):
+ def __init__(self, sys_cfg, distro, paths):
+ sources.DataSource.__init__(self, sys_cfg, distro, paths)
+ self.seed = None
+ self.seed_dir = os.path.join(paths.seed_dir, 'ovf')
+ self.environment = None
+ self.cfg = {}
+ self.supported_seed_starts = ("/", "file://")
+
+ def __str__(self):
+ return "%s [seed=%s]" % (util.obj_name(self), self.seed)
+
+ def get_data(self):
+ found = []
+ md = {}
+ ud = ""
+
+ defaults = {
+ "instance-id": "iid-dsovf",
+ }
+
+ (seedfile, contents) = get_ovf_env(self.paths.seed_dir)
+ if seedfile:
+ # Found a seed dir
+ seed = os.path.join(self.paths.seed_dir, seedfile)
+ (md, ud, cfg) = read_ovf_environment(contents)
+ self.environment = contents
+ found.append(seed)
+ else:
+ np = {'iso': transport_iso9660,
+ 'vmware-guestd': transport_vmware_guestd, }
+ name = None
+ for (name, transfunc) in np.iteritems():
+ (contents, _dev, _fname) = transfunc()
+ if contents:
+ break
+ if contents:
+ (md, ud, cfg) = read_ovf_environment(contents)
+ self.environment = contents
+ found.append(name)
+
+        # No OVF transports were found
+ if len(found) == 0:
+ return False
+
+ if 'seedfrom' in md and md['seedfrom']:
+ seedfrom = md['seedfrom']
+ seedfound = False
+ for proto in self.supported_seed_starts:
+ if seedfrom.startswith(proto):
+ seedfound = proto
+ break
+ if not seedfound:
+ LOG.debug("Seed from %s not supported by %s",
+ seedfrom, self)
+ return False
+
+ (md_seed, ud) = util.read_seeded(seedfrom, timeout=None)
+ LOG.debug("Using seeded cache data from %s", seedfrom)
+
+ md = util.mergedict(md, md_seed)
+ found.append(seedfrom)
+
+ # Now that we have exhausted any other places merge in the defaults
+ md = util.mergedict(md, defaults)
+
+ self.seed = ",".join(found)
+ self.metadata = md
+ self.userdata_raw = ud
+ self.cfg = cfg
+ return True
+
+ def get_public_ssh_keys(self):
+        if 'public-keys' not in self.metadata:
+ return []
+ pks = self.metadata['public-keys']
+ if isinstance(pks, (list)):
+ return pks
+ else:
+ return [pks]
+
+ # The data sources' config_obj is a cloud-config formatted
+ # object that came to it from ways other than cloud-config
+ # because cloud-config content would be handled elsewhere
+ def get_config_obj(self):
+ return self.cfg
+
+
+class DataSourceOVFNet(DataSourceOVF):
+ def __init__(self, sys_cfg, distro, paths):
+ DataSourceOVF.__init__(self, sys_cfg, distro, paths)
+ self.seed_dir = os.path.join(paths.seed_dir, 'ovf-net')
+ self.supported_seed_starts = ("http://", "https://", "ftp://")
+
+
+# This will return a dict with some content
+# meta-data, user-data, some config
+def read_ovf_environment(contents):
+ props = get_properties(contents)
+ md = {}
+ cfg = {}
+ ud = ""
+ cfg_props = ['password']
+ md_props = ['seedfrom', 'local-hostname', 'public-keys', 'instance-id']
+ for (prop, val) in props.iteritems():
+ if prop == 'hostname':
+ prop = "local-hostname"
+ if prop in md_props:
+ md[prop] = val
+ elif prop in cfg_props:
+ cfg[prop] = val
+ elif prop == "user-data":
+ try:
+ ud = base64.decodestring(val)
+            except Exception:
+ ud = val
+ return (md, ud, cfg)
+
+
+# Returns a tuple of the filename (in 'dirname') and the contents of the file;
+# on "not found", returns 'None' for the filename and False for the contents
+def get_ovf_env(dirname):
+ env_names = ("ovf-env.xml", "ovf_env.xml", "OVF_ENV.XML", "OVF-ENV.XML")
+ for fname in env_names:
+ full_fn = os.path.join(dirname, fname)
+ if os.path.isfile(full_fn):
+ try:
+ contents = util.load_file(full_fn)
+ return (fname, contents)
+            except Exception:
+ util.logexc(LOG, "Failed loading ovf file %s", full_fn)
+ return (None, False)
+
+
+# Transport functions take no input and return
+# a 3 tuple of content, path, filename
+def transport_iso9660(require_iso=True):
+
+ # default_regex matches values in
+ # /lib/udev/rules.d/60-cdrom_id.rules
+ # KERNEL!="sr[0-9]*|hd[a-z]|xvd*", GOTO="cdrom_end"
+ envname = "CLOUD_INIT_CDROM_DEV_REGEX"
+ default_regex = "^(sr[0-9]+|hd[a-z]|xvd.*)"
+
+ devname_regex = os.environ.get(envname, default_regex)
+ cdmatch = re.compile(devname_regex)
+
+ # Go through mounts to see if it was already mounted
+ mounts = util.mounts()
+ for (dev, info) in mounts.iteritems():
+ fstype = info['fstype']
+ if fstype != "iso9660" and require_iso:
+ continue
+ if cdmatch.match(dev[5:]) is None: # take off '/dev/'
+ continue
+ mp = info['mountpoint']
+ (fname, contents) = get_ovf_env(mp)
+ if contents is not False:
+ return (contents, dev, fname)
+
+ devs = os.listdir("/dev/")
+ devs.sort()
+ for dev in devs:
+ fullp = os.path.join("/dev/", dev)
+
+ if (fullp in mounts or
+ not cdmatch.match(dev) or os.path.isdir(fullp)):
+ continue
+
+ try:
+ # See if we can read anything at all...??
+ with open(fullp, 'rb') as fp:
+ fp.read(512)
+        except IOError:
+ continue
+
+ try:
+ (fname, contents) = util.mount_cb(fullp,
+ get_ovf_env, mtype="iso9660")
+ except util.MountFailedError:
+ util.logexc(LOG, "Failed mounting %s", fullp)
+ continue
+
+ if contents is not False:
+ return (contents, fullp, fname)
+
+ return (False, None, None)
+
+
+def transport_vmware_guestd():
+ # http://blogs.vmware.com/vapp/2009/07/ \
+ # selfconfiguration-and-the-ovf-environment.html
+ # try:
+ # cmd = ['vmware-guestd', '--cmd', 'info-get guestinfo.ovfEnv']
+ # (out, err) = subp(cmd)
+ # return(out, 'guestinfo.ovfEnv', 'vmware-guestd')
+ # except:
+ # # would need to error check here and see why this failed
+ # # to know if log/error should be raised
+ # return(False, None, None)
+ return (False, None, None)
+
+
+def find_child(node, filter_func):
+ ret = []
+ if not node.hasChildNodes():
+ return ret
+ for child in node.childNodes:
+ if filter_func(child):
+ ret.append(child)
+ return ret
+
+
+def get_properties(contents):
+
+ dom = minidom.parseString(contents)
+ if dom.documentElement.localName != "Environment":
+ raise XmlError("No Environment Node")
+
+ if not dom.documentElement.hasChildNodes():
+ raise XmlError("No Child Nodes")
+
+ envNsURI = "http://schemas.dmtf.org/ovf/environment/1"
+
+ # could also check here that elem.namespaceURI ==
+ # "http://schemas.dmtf.org/ovf/environment/1"
+ propSections = find_child(dom.documentElement,
+ lambda n: n.localName == "PropertySection")
+
+ if len(propSections) == 0:
+ raise XmlError("No 'PropertySection's")
+
+ props = {}
+ propElems = find_child(propSections[0],
+ (lambda n: n.localName == "Property"))
+
+ for elem in propElems:
+ key = elem.attributes.getNamedItemNS(envNsURI, "key").value
+ val = elem.attributes.getNamedItemNS(envNsURI, "value").value
+ props[key] = val
+
+ return props
+
+
+class XmlError(Exception):
+ pass
+
+
+# Used to match classes to dependencies
+datasources = (
+ (DataSourceOVF, (sources.DEP_FILESYSTEM, )),
+ (DataSourceOVFNet, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK)),
+)
+
+
+# Return a list of data sources that match this set of dependencies
+def get_datasource_list(depends):
+ return sources.list_from_depends(depends, datasources)
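A made-up minimal ovf-env.xml body of the shape get_properties() above consumes; the key/value attributes live in the OVF environment namespace:

    from xml.dom import minidom

    ENV_NS = "http://schemas.dmtf.org/ovf/environment/1"
    # Illustrative document only; real ovf-env.xml files carry more sections
    contents = """<?xml version="1.0"?>
    <Environment xmlns="%s" xmlns:oe="%s">
      <PropertySection>
        <Property oe:key="hostname" oe:value="myhost"/>
        <Property oe:key="instance-id" oe:value="iid-ovf-demo"/>
      </PropertySection>
    </Environment>""" % (ENV_NS, ENV_NS)

    dom = minidom.parseString(contents)
    for elem in dom.getElementsByTagName("Property"):
        key = elem.attributes.getNamedItemNS(ENV_NS, "key").value
        val = elem.attributes.getNamedItemNS(ENV_NS, "value").value
        print("%s = %s" % (key, val))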
diff --git a/cloudinit/sources/__init__.py b/cloudinit/sources/__init__.py
new file mode 100644
index 00000000..b25724a5
--- /dev/null
+++ b/cloudinit/sources/__init__.py
@@ -0,0 +1,223 @@
+# vi: ts=4 expandtab
+#
+# Copyright (C) 2012 Canonical Ltd.
+# Copyright (C) 2012 Hewlett-Packard Development Company, L.P.
+# Copyright (C) 2012 Yahoo! Inc.
+#
+# Author: Scott Moser <scott.moser@canonical.com>
+# Author: Juerg Haefliger <juerg.haefliger@hp.com>
+# Author: Joshua Harlow <harlowja@yahoo-inc.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 3, as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+import abc
+
+from cloudinit import importer
+from cloudinit import log as logging
+from cloudinit import user_data as ud
+from cloudinit import util
+
+DEP_FILESYSTEM = "FILESYSTEM"
+DEP_NETWORK = "NETWORK"
+DS_PREFIX = 'DataSource'
+
+LOG = logging.getLogger(__name__)
+
+
+class DataSourceNotFoundException(Exception):
+ pass
+
+
+class DataSource(object):
+
+ __metaclass__ = abc.ABCMeta
+
+ def __init__(self, sys_cfg, distro, paths, ud_proc=None):
+ self.sys_cfg = sys_cfg
+ self.distro = distro
+ self.paths = paths
+ self.userdata = None
+ self.metadata = None
+ self.userdata_raw = None
+ name = util.obj_name(self)
+ if name.startswith(DS_PREFIX):
+ name = name[len(DS_PREFIX):]
+ self.ds_cfg = util.get_cfg_by_path(self.sys_cfg,
+ ("datasource", name), {})
+ if not ud_proc:
+ self.ud_proc = ud.UserDataProcessor(self.paths)
+ else:
+ self.ud_proc = ud_proc
+
+ def get_userdata(self):
+ if self.userdata is None:
+ raw_data = self.get_userdata_raw()
+ self.userdata = self.ud_proc.process(raw_data)
+ return self.userdata
+
+ def get_userdata_raw(self):
+ return self.userdata_raw
+
+    # the data sources' config_obj is a cloud-config formatted
+ # object that came to it from ways other than cloud-config
+ # because cloud-config content would be handled elsewhere
+ def get_config_obj(self):
+ return {}
+
+ def get_public_ssh_keys(self):
+ keys = []
+
+ if not self.metadata or 'public-keys' not in self.metadata:
+ return keys
+
+ if isinstance(self.metadata['public-keys'], (basestring, str)):
+ return str(self.metadata['public-keys']).splitlines()
+
+ if isinstance(self.metadata['public-keys'], (list, set)):
+ return list(self.metadata['public-keys'])
+
+ if isinstance(self.metadata['public-keys'], (dict)):
+ for (_keyname, klist) in self.metadata['public-keys'].iteritems():
+ # lp:506332 uec metadata service responds with
+ # data that makes boto populate a string for 'klist' rather
+ # than a list.
+ if isinstance(klist, (str, basestring)):
+ klist = [klist]
+ if isinstance(klist, (list, set)):
+ for pkey in klist:
+ # There is an empty string at
+ # the end of the keylist, trim it
+ if pkey:
+ keys.append(pkey)
+
+ return keys
+
+ def device_name_to_device(self, _name):
+ # translate a 'name' to a device
+ # the primary function at this point is on ec2
+ # to consult metadata service, that has
+ # ephemeral0: sdb
+ # and return 'sdb' for input 'ephemeral0'
+ return None
+
+ def get_locale(self):
+ return 'en_US.UTF-8'
+
+ def get_local_mirror(self):
+ # ??
+ return None
+
+ def get_instance_id(self):
+ if not self.metadata or 'instance-id' not in self.metadata:
+            # Return a magic 'not really an instance id' string
+ return "iid-datasource"
+ return str(self.metadata['instance-id'])
+
+ def get_hostname(self, fqdn=False):
+ defdomain = "localdomain"
+ defhost = "localhost"
+ domain = defdomain
+
+        if not self.metadata or 'local-hostname' not in self.metadata:
+ # this is somewhat questionable really.
+ # the cloud datasource was asked for a hostname
+ # and didn't have one. raising error might be more appropriate
+ # but instead, basically look up the existing hostname
+ toks = []
+ hostname = util.get_hostname()
+ fqdn = util.get_fqdn_from_hosts(hostname)
+ if fqdn and fqdn.find(".") > 0:
+ toks = str(fqdn).split(".")
+ elif hostname:
+ toks = [hostname, defdomain]
+ else:
+ toks = [defhost, defdomain]
+ else:
+ # if there is an ipv4 address in 'local-hostname', then
+ # make up a hostname (LP: #475354) in format ip-xx.xx.xx.xx
+ lhost = self.metadata['local-hostname']
+ if util.is_ipv4(lhost):
+ toks = "ip-%s" % lhost.replace(".", "-")
+ else:
+ toks = lhost.split(".")
+
+ if len(toks) > 1:
+ hostname = toks[0]
+ domain = '.'.join(toks[1:])
+ else:
+ hostname = toks[0]
+
+ if fqdn:
+ return "%s.%s" % (hostname, domain)
+ else:
+ return hostname
+
+
+def find_source(sys_cfg, distro, paths, ds_deps, cfg_list, pkg_list):
+ ds_list = list_sources(cfg_list, ds_deps, pkg_list)
+ ds_names = [util.obj_name(f) for f in ds_list]
+ LOG.debug("Searching for data source in: %s", ds_names)
+
+ for cls in ds_list:
+ try:
+ LOG.debug("Seeing if we can get any data from %s", cls)
+ s = cls(sys_cfg, distro, paths)
+ if s.get_data():
+ return (s, util.obj_name(cls))
+ except Exception:
+ util.logexc(LOG, "Getting data from %s failed", cls)
+
+ msg = ("Did not find any data source,"
+ " searched classes: (%s)") % (", ".join(ds_names))
+ raise DataSourceNotFoundException(msg)
+
+
+# Return a list of classes that have the same depends as 'depends'
+# iterate through cfg_list, loading "DataSource*" modules
+# and calling their "get_datasource_list".
+# Return an ordered list of classes that match (if any)
+def list_sources(cfg_list, depends, pkg_list):
+ src_list = []
+ LOG.debug(("Looking for for data source in: %s,"
+ " via packages %s that matches dependencies %s"),
+ cfg_list, pkg_list, depends)
+ for ds_name in cfg_list:
+ if not ds_name.startswith(DS_PREFIX):
+ ds_name = '%s%s' % (DS_PREFIX, ds_name)
+ m_locs = importer.find_module(ds_name,
+ pkg_list,
+ ['get_datasource_list'])
+ for m_loc in m_locs:
+ mod = importer.import_module(m_loc)
+ lister = getattr(mod, "get_datasource_list")
+ matches = lister(depends)
+ if matches:
+ src_list.extend(matches)
+ break
+ return src_list
+
+
+# 'depends' is a list of dependencies (DEP_FILESYSTEM)
+# ds_list is a list of 2 item tuples
+# ds_list = [
+#   ( class, ( depends-that-this-class-needs ) )
+# ]
+# It returns a list of 'class' that matched these deps exactly
+# It is mainly a helper function for DataSourceCollections
+def list_from_depends(depends, ds_list):
+ ret_list = []
+ depset = set(depends)
+ for (cls, deps) in ds_list:
+ if depset == set(deps):
+ ret_list.append(cls)
+ return ret_list
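A worked example of the exact-match behavior of list_from_depends() above, with stand-in class names:

    DEP_FILESYSTEM = "FILESYSTEM"
    DEP_NETWORK = "NETWORK"
    ds_list = [
        ("DataSourceNoCloud", (DEP_FILESYSTEM,)),
        ("DataSourceNoCloudNet", (DEP_FILESYSTEM, DEP_NETWORK)),
    ]

    def list_from_depends(depends, ds_list):
        # only classes whose dependency set matches exactly are returned
        return [cls for (cls, deps) in ds_list if set(depends) == set(deps)]

    print(list_from_depends((DEP_FILESYSTEM,), ds_list))
    # ['DataSourceNoCloud'] - a filesystem-only query excludes the net variant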