.. _datasources:

***********
Datasources
***********

What is a datasource?
=====================

Datasources are sources of configuration data for cloud-init that typically
come from the user (userdata) or from the stack that created the
configuration drive (metadata). Typical userdata includes files, YAML, and
shell scripts, while typical metadata includes the server name, instance id,
display name, and other cloud-specific details. Because there are multiple
ways to provide this data (each cloud solution seems to prefer its own), an
internal datasource abstract class provides a single way to access the data
that each cloud system supplies; each cloud is supported through a subclass.
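
The subclassing pattern described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not cloud-init's actual class
hierarchy: ``BaseSource``, ``ConfigDriveLikeSource``, and
``MetadataServiceLikeSource`` are hypothetical names, and the in-memory
dictionaries stand in for a real config drive or metadata service.

.. sourcecode:: python

    from abc import ABC, abstractmethod


    class BaseSource(ABC):
        """Hypothetical base class: one access path for every cloud."""

        @abstractmethod
        def load_metadata(self):
            """Return the platform's metadata as a dict."""

        def get_instance_id(self):
            # Shared logic lives in the base class; subclasses only
            # need to know how to fetch their platform's data.
            return self.load_metadata().get("instance-id", "iid-datasource")


    class ConfigDriveLikeSource(BaseSource):
        """Hypothetical source that reads data from an attached drive."""

        def __init__(self, drive_contents):
            self.drive_contents = drive_contents  # stand-in for files on disk

        def load_metadata(self):
            return {"instance-id": self.drive_contents["uuid"]}


    class MetadataServiceLikeSource(BaseSource):
        """Hypothetical source that queries a metadata service."""

        def __init__(self, fetch):
            self.fetch = fetch  # stand-in for an HTTP client call

        def load_metadata(self):
            return {"instance-id": self.fetch("instance-id")}


    sources = [
        ConfigDriveLikeSource({"uuid": "3e39d278-0644-4728-9479-678f9212d8f0"}),
        MetadataServiceLikeSource({"instance-id": "i-0006e030"}.get),
    ]
    instance_ids = [source.get_instance_id() for source in sources]

Callers iterate over ``sources`` without knowing which platform each one
represents, which is the point of routing every cloud through one base class.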


.. _instance_metadata:

instance-data
-------------
For reference, cloud-init stores all the metadata, vendordata, and userdata
provided by a cloud in a JSON blob at ``/run/cloud-init/instance-data.json``.
While the JSON contains datasource-specific keys and names, cloud-init
maintains a minimal set of standardized keys that remain stable on any
cloud. Standardized instance-data keys are present under the "v1" key.
All datasource metadata that cloud-init consumes is present under the
"ds" key.

Below is an instance-data.json example from an OpenStack instance:

.. sourcecode:: json

  {
   "base64-encoded-keys": [
    "ds/meta-data/random_seed",
    "ds/user-data"
   ],
   "ds": {
    "ec2_metadata": {
     "ami-id": "ami-0000032f",
     "ami-launch-index": "0",
     "ami-manifest-path": "FIXME",
     "block-device-mapping": {
      "ami": "vda",
      "ephemeral0": "/dev/vdb",
      "root": "/dev/vda"
     },
     "hostname": "xenial-test.novalocal",
     "instance-action": "none",
     "instance-id": "i-0006e030",
     "instance-type": "m1.small",
     "local-hostname": "xenial-test.novalocal",
     "local-ipv4": "10.5.0.6",
     "placement": {
      "availability-zone": "None"
     },
     "public-hostname": "xenial-test.novalocal",
     "public-ipv4": "10.245.162.145",
     "reservation-id": "r-fxm623oa",
     "security-groups": "default"
    },
    "meta-data": {
     "availability_zone": null,
     "devices": [],
     "hostname": "xenial-test.novalocal",
     "instance-id": "3e39d278-0644-4728-9479-678f9212d8f0",
     "launch_index": 0,
     "local-hostname": "xenial-test.novalocal",
     "name": "xenial-test",
     "project_id": "e0eb2d2538814...",
     "random_seed": "A6yPN...",
     "uuid": "3e39d278-0644-4728-9479-678f92..."
    },
    "network_json": {
     "links": [
      {
       "ethernet_mac_address": "fa:16:3e:7d:74:9b",
       "id": "tap9ca524d5-6e",
       "mtu": 8958,
       "type": "ovs",
       "vif_id": "9ca524d5-6e5a-4809-936a-6901..."
      }
     ],
     "networks": [
      {
       "id": "network0",
       "link": "tap9ca524d5-6e",
       "network_id": "c6adfc18-9753-42eb-b3ea-18b57e6b837f",
       "type": "ipv4_dhcp"
      }
     ],
     "services": [
      {
       "address": "10.10.160.2",
       "type": "dns"
      }
     ]
    },
    "user-data": "I2Nsb3VkLWNvbmZpZ...",
    "vendor-data": null
   },
   "v1": {
    "availability-zone": null,
    "cloud-name": "openstack",
    "instance-id": "3e39d278-0644-4728-9479-678f9212d8f0",
    "local-hostname": "xenial-test",
    "region": null
   }
  }
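
A script might consume this file roughly as follows. This is a sketch under
stated assumptions: it parses a trimmed, inline copy of the example rather
than the real file, the ``user-data`` value is an illustrative stand-in
(the base64 encoding of ``#cloud-config``) and not the truncated value shown
above, and it assumes, as the key name suggests, that every path listed in
``base64-encoded-keys`` holds a base64-encoded string. The ``lookup`` helper
is hypothetical.

.. sourcecode:: python

    import base64
    import json

    # A trimmed copy of the example; on a real instance the same
    # structure would be read from /run/cloud-init/instance-data.json.
    EXAMPLE = """
    {
     "base64-encoded-keys": ["ds/user-data"],
     "ds": {
      "meta-data": {"name": "xenial-test"},
      "user-data": "I2Nsb3VkLWNvbmZpZw=="
     },
     "v1": {
      "cloud-name": "openstack",
      "instance-id": "3e39d278-0644-4728-9479-678f9212d8f0",
      "local-hostname": "xenial-test",
      "region": null
     }
    }
    """


    def lookup(data, path):
        """Follow a slash-separated path such as 'ds/user-data'."""
        node = data
        for part in path.split("/"):
            node = node[part]
        return node


    instance_data = json.loads(EXAMPLE)

    # Standardized keys live under "v1" and look the same on every cloud.
    cloud_name = instance_data["v1"]["cloud-name"]

    # Paths listed in "base64-encoded-keys" need decoding before use.
    decoded = {
        path: base64.b64decode(lookup(instance_data, path)).decode("utf-8")
        for path in instance_data["base64-encoded-keys"]
    }

Reading only from ``v1`` keeps a script portable across clouds, while anything
under ``ds`` ties it to one datasource's key names.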

 
As of cloud-init v. 18.4, any values present in
``/run/cloud-init/instance-data.json`` can be used in cloud-init user data
scripts or cloud config data. This allows consumers to use cloud-init's
vendor-neutral, standardized metadata keys as well as datasource-specific
content for any scripts or cloud-config modules they are using.

To use instance-data.json values in scripts and ``#cloud-config`` files, the
user-data must contain the header ``## template: jinja`` as its first line.
Cloud-init will source all variables defined in
``/run/cloud-init/instance-data.json`` and allow scripts or cloud-config files
to reference those paths. Below are two examples:

* Cloud config calling home with the ec2 public hostname and
  availability-zone::

    ## template: jinja
    #cloud-config
    runcmd:
        - echo 'EC2 public hostname allocated to instance: {{ ds.meta_data.public_hostname }}' > /tmp/instance_metadata
        - echo 'EC2 availability zone: {{ v1.availability_zone }}' >> /tmp/instance_metadata
        - curl -X POST -d '{"hostname": "{{ ds.meta_data.public_hostname }}", "availability-zone": "{{ v1.availability_zone }}"}' https://example.com

* Custom user script performing different operations based on region::

    ## template: jinja
    #!/bin/bash
    {% if v1.region == 'us-east-2' -%}
    echo 'Installing custom proxies for {{ v1.region }}'
    sudo apt-get install my-xtra-fast-stack
    {%- endif %}
    ...

.. note::
  Trying to reference jinja variables that don't exist in
  instance-data.json will result in warnings in ``/var/log/cloud-init.log``
  and the following string in your rendered user-data:
  ``CI_MISSING_JINJA_VAR/<your_varname>``.
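
The substitution behaviour can be illustrated with a standard-library
stand-in. This sketch is not cloud-init's renderer, which uses Jinja. It only
handles plain ``{{ dotted.path }}`` references, it assumes that hyphenated
instance-data keys are reachable under underscore names (as the template
examples on this page suggest), and the exact form of the name after
``CI_MISSING_JINJA_VAR/`` is a guess. ``INSTANCE_DATA``,
``with_underscore_aliases``, and ``render`` are hypothetical names.

.. sourcecode:: python

    import re

    # Hypothetical, trimmed instance data for illustration.
    INSTANCE_DATA = {
        "ds": {"meta-data": {"public-hostname": "xenial-test.novalocal"}},
        "v1": {"availability-zone": None, "region": "us-east-2"},
    }

    MISSING_PREFIX = "CI_MISSING_JINJA_VAR/"


    def with_underscore_aliases(node):
        """Recursively rename 'some-key' to 'some_key' so dotted paths work."""
        if isinstance(node, dict):
            return {
                key.replace("-", "_"): with_underscore_aliases(value)
                for key, value in node.items()
            }
        return node


    def render(template, instance_data):
        """Replace each {{ dotted.path }} with its value or a placeholder."""
        variables = with_underscore_aliases(instance_data)

        def substitute(match):
            path = match.group(1)
            node = variables
            for part in path.split("."):
                if not isinstance(node, dict) or part not in node:
                    # Unknown variable: leave a visible marker in the output.
                    return MISSING_PREFIX + path
                node = node[part]
            return str(node)

        return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", substitute, template)


    rendered = render(
        "host={{ ds.meta_data.public_hostname }} region={{v1.region}} "
        "zone={{ v1.no_such_key }}",
        INSTANCE_DATA,
    )

Searching rendered user-data for the ``CI_MISSING_JINJA_VAR/`` prefix is a
quick way to spot references that did not resolve.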
  
.. note::
  To save time designing your user-data for a specific cloud's
  instance-data.json, use the 'render' cloud-init command on an
  instance booted on your favorite cloud. See :ref:`cli_devel` for more
  information.


Datasource API
--------------
The current interface that a datasource object must provide is the following:

.. sourcecode:: python

    # returns a mime multipart message that contains
    # all the various fully-expanded components that
    # were found from processing the raw userdata string
    # - when filtering, only the mime messages targeting
    #   this instance id will be returned (or messages with
    #   no instance id)
    def get_userdata(self, apply_filter=False)

    # returns the raw userdata string (or None)
    def get_userdata_raw(self)

    # returns an integer (or None) which can be used to identify
    # this instance in a group of instances which are typically
    # created from a single command, thus allowing programmatic
    # filtering on this launch index (or other selective actions)
    @property
    def launch_index(self)

    # the datasource's config_obj is a cloud-config formatted
    # object that came to it by means other than cloud-config,
    # because cloud-config content is handled elsewhere
    def get_config_obj(self)

    # returns a list of public ssh keys
    def get_public_ssh_keys(self)

    # translates a device 'short' name into the actual physical device
    # fully qualified name (or None if said physical device is not attached
    # or does not exist)
    def device_name_to_device(self, name)

    # gets the locale string this instance should apply, which is
    # typically used to adjust the instance's locale settings files
    def get_locale(self)

    @property
    def availability_zone(self)

    # gets the instance id that was assigned to this instance by the
    # cloud provider; when said instance id does not exist in the backing
    # metadata, this returns 'iid-datasource'
    def get_instance_id(self)

    # gets the fully qualified domain name that this host should be using
    # when configuring network or hostname related settings, typically
    # assigned either by the cloud provider or the user creating the vm
    def get_hostname(self, fqdn=False)

    def get_package_mirror_info(self)
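
As a rough illustration of part of the interface above, the sketch below
implements a subset of the listed methods on top of plain dictionaries. It is
not a real cloud-init datasource: ``InMemoryDataSource`` is a hypothetical
class, the metadata key names (``launch-index``, ``public-keys``, and so on)
are assumptions chosen for the example, and returning only the first hostname
label when ``fqdn`` is false is likewise an assumption.

.. sourcecode:: python

    class InMemoryDataSource:
        """Hypothetical datasource backed by plain dictionaries."""

        def __init__(self, metadata=None, userdata_raw=None):
            self.metadata = metadata or {}
            self.userdata_raw = userdata_raw

        def get_userdata_raw(self):
            return self.userdata_raw

        @property
        def launch_index(self):
            return self.metadata.get("launch-index")

        @property
        def availability_zone(self):
            return self.metadata.get("availability-zone")

        def get_public_ssh_keys(self):
            return list(self.metadata.get("public-keys", []))

        def get_instance_id(self):
            # Fall back to the documented default when no id is present.
            return self.metadata.get("instance-id", "iid-datasource")

        def get_hostname(self, fqdn=False):
            hostname = self.metadata.get("local-hostname", "localhost")
            # Assumption: without fqdn=True, return only the first label.
            return hostname if fqdn else hostname.split(".")[0]


    source = InMemoryDataSource(
        metadata={"local-hostname": "xenial-test.novalocal", "launch-index": 0},
        userdata_raw="#cloud-config\n",
    )

Because the metadata here carries no ``instance-id``,
``source.get_instance_id()`` falls back to ``'iid-datasource'``, matching the
default described in the interface comments.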


Datasource Documentation
========================
The following is a list of the implemented datasources.
Follow the links for more information.

.. toctree::
   :maxdepth: 2

   datasources/aliyun.rst
   datasources/altcloud.rst
   datasources/azure.rst
   datasources/cloudsigma.rst
   datasources/cloudstack.rst
   datasources/configdrive.rst
   datasources/digitalocean.rst
   datasources/ec2.rst
   datasources/maas.rst
   datasources/nocloud.rst
   datasources/opennebula.rst
   datasources/openstack.rst
   datasources/oracle.rst
   datasources/ovf.rst
   datasources/smartos.rst
   datasources/fallback.rst
   datasources/gce.rst

.. vi: textwidth=78