Diffstat (limited to 'doc/opportunism-spec.txt')
-rw-r--r-- | doc/opportunism-spec.txt | 1254 |
1 files changed, 1254 insertions, 0 deletions
diff --git a/doc/opportunism-spec.txt b/doc/opportunism-spec.txt
new file mode 100644
index 000000000..fbe319a57
--- /dev/null
+++ b/doc/opportunism-spec.txt
@@ -0,0 +1,1254 @@
+
+
+                  Opportunistic Encryption
+
+                       Henry Spencer
+                    D. Hugh Redelmeier
+                    henry@spsystems.net
+                      hugh@mimosa.com
+                  Linux FreeS/WAN Project
+
+
+     Opportunistic encryption permits secure (encrypted,
+     authenticated) communication via IPsec without
+     connection-by-connection prearrangement, either
+     explicitly between hosts (when the hosts are capable
+     of it) or transparently via packet-intercepting
+     security gateways.  It uses DNS records (authenticated
+     with DNSSEC) to provide the necessary information for
+     gateway discovery and gateway authentication, and
+     constrains negotiation enough to guarantee success.
+
+     Substantive changes since draft 3: write off inverse
+     queries as a lost cause; use Invalid-SPI rather than
+     Delete as notification of unknown SA; minor wording
+     improvements and clarifications.  This document takes
+     over from the older ``Implementing Opportunistic
+     Encryption'' document.
+
+
+1. Introduction
+
+A major goal of the FreeS/WAN project is opportunistic
+encryption: a (security) gateway intercepts an outgoing
+packet aimed at a remote host, and quickly attempts to
+negotiate an IPsec tunnel to that host's security gateway.
+If the attempt succeeds, traffic can then be secure,
+transparently (without changes to the host software).  If
+the attempt fails, the packet (or a retry thereof) passes
+through in clear or is dropped, depending on local policy.
+Prearranged tunnels bypass the packet interception etc., so
+static VPNs can coexist with opportunistic encryption.
+
+This generalizes trivially to the end-to-end case: host and
+security gateway simply are one and the same.  Some
+optimizations are possible in that case, but the basic
+scheme need not change.
+
+The objectives for security systems need to be explicitly
+stated.
Opportunistic encryption is meant to achieve secure +communication, without prearrangement of the individual con- +nection (although some prearrangement on a per-host basis is + + + +Draft 4 3 May 2001 1 + + + + + + Opportunistic Encryption + + +required), between any two hosts which implement the proto- +col (and, if they act as security gateways, between hosts +behind them). Here ``secure'' means strong encryption and +authentication of packets, with authentication of partici- +pants--to prevent man-in-the-middle and impersonation +attacks--dependent on several factors. The biggest factor +is the authentication of DNS records, via DNSSEC or equiva- +lent means. A lesser factor is which exact variant of the +setup procedure (see section 2.2) is used, because there is +a tradeoff between strong authentication of the other end +and ability to negotiate opportunistic encryption with hosts +which have limited or no control of their reverse-map DNS +records: without reverse-map information, we can verify that +the host has the right to use a particular FQDN (Fully Qual- +ified Domain Name), but not whether that FQDN is authorized +to use that IP address. Local policy must decide whether +authentication or connectivity has higher priority. + +Apart from careful attention to detail in various areas, +there are three crucial design problems for opportunistic +encryption. It needs a way to quickly identify the remote +host's security gateway. It needs a way to quickly obtain +an authentication key for the security gateway. And the +numerous options which can be specified with IKE must be +constrained sufficiently that two independent implementa- +tions are guaranteed to reach agreement, without any +explicit prearrangement or preliminary negotiation. The +first two problems are solved using DNS, with DNSSEC ensur- +ing that the data obtained is reliable; the third is solved +by specifying a minimum standard which must be supported. 
+ +A note on philosophy: we have deliberately avoided providing +six different ways to do each job, in favor of specifying +one good one. Choices are provided only when they appear to +be necessary, or at least important. + +A note on terminology: to avoid constant circumlocutions, an +ISAKMP/IKE SA, possibly recreated occasionally by rekeying, +will be referred to as a ``keying channel'', and a set of +IPsec SAs providing bidirectional communication between two +IPsec hosts, possibly recreated occasionally by rekeying, +will be referred to as a ``tunnel'' (it could conceivably +use transport mode in the host-to-host case, but we advocate +using tunnel mode even there). The word ``connection'' is +here used in a more generic sense. The word ``lifetime'' +will be avoided in favor of ``rekeying interval'', since +many of the connections will have useful lives far shorter +than any reasonable rekeying interval, and hence the two +concepts must be separated. + +A note on document structure: Discussions of why things were +done a particular way, or not done a particular way, are +broken out in paragraphs headed ``Rationale:'' (to preserve +the flow of the text, many such paragraphs are deferred to + + + +Draft 4 3 May 2001 2 + + + + + + Opportunistic Encryption + + +the ends of sections). Paragraphs headed ``Ahem:'' are dis- +cussions of where the problem is being made significantly +harder by problems elsewhere, and how that might be cor- +rected. Some meta-comments are enclosed in []. + +Rationale: The motive is to get the Internet encrypted. +That requires encryption without connection-by-connection +prearrangement: a system must be able to reliably negotiate +an encrypted, authenticated connection with a total +stranger. While end-to-end encryption is preferable, doing +opportunistic encryption in security gateways gives enormous +leverage for quick deployment of this technology, in a world +where end-host software is often primitive, rigid, and out- +dated. 
+ +Rationale: Speed is of the essence in tunnel setup: a con- +nection-establishment delay longer than about 10 seconds +begins to cause problems for users and applications. Thus +the emphasis on rapidity in gateway discovery and key fetch- +ing. + +Ahem: Host-to-host opportunistic encryption would be utterly +trivial if a fast public-key encryption/signature algorithm +was available. You would do a reverse lookup on the desti- +nation address to obtain a public key for that address, and +simply encrypt all packets going to it with that key, sign- +ing them with your own private key. Alas, this is impracti- +cal with current CPU speeds and current algorithms (although +as noted later, it might be of some use for limited pur- +poses). Nevertheless, it is a useful model. + +2. Connection Setup + +For purposes of discussion, the network is taken to look +like this: + + Source----Initiator----...----Responder----Destination + +The intercepted packet comes from the Source, bound for the +Destination, and is intercepted at the Initiator. The Ini- +tiator communicates over the insecure Internet to the +Responder. The Source and the Initiator might be the same +host, or the Source might be an end-user host and the Ini- +tiator a security gateway (SG). Likewise for the Responder +and the Destination. + +Given an intercepted packet, whose useful information (for +our purposes) is essentially only the Destination's IP +address, the Initiator must quickly determine the Responder +(the Destination's SG) and fetch everything needed to +authenticate it. The Responder must do likewise for the +Initiator. Both must eventually also confirm that the other +is authorized to act on behalf of the client host behind it +(if any). + + + +Draft 4 3 May 2001 3 + + + + + + Opportunistic Encryption + + +An important subtlety here is that if the alternative to an +IPsec tunnel is plaintext transmission, negative results +must be obtained quickly. 
That is, the decision that no +tunnel can be established must also be made rapidly. + +2.1. Packet Interception + +Interception of outgoing packets is relatively straightfor- +ward in principle. It is preferable to put the intercepted +packet on hold rather than dropping it, since higher-level +retries are not necessarily well-timed. There is a problem +of hosts and applications retrying during negotiations. ARP +implementations, which face the same problem, use the +approach of keeping the most recent packet for an as-yet- +unresolved address, and throwing away older ones. (Incre- +menting of request numbers etc. means that replies to older +ones may no longer be accepted.) + +Is it worth intercepting incoming packets, from the outside +world, and attempting tunnel setup based on them? No, +unless and until a way can be devised to initiate oppor- +tunistic encryption to a non-opportunistic responder, +because if the other end has not initiated tunnel setup +itself, it will not be prepared to do so at our request. + +Rationale: Note, however, that most incoming packets will +promptly be followed by an outgoing packet in response! +Conceivably it might be useful to start early stages of +negotiation, at least as far as looking up information, in +response to an incoming packet. + +Rationale: If a plaintext incoming packet indicates that the +other end is not prepared to do opportunistic encryption, it +might seem that this fact should be noted, to avoid consum- +ing resources and delaying traffic in an attempt at oppor- +tunistic setup which is doomed to fail. However, this would +be a major security hole, since the plaintext packet is not +authenticated; see section 2.5. + +2.2. Algorithm + +For clarity, the following defers most discussion of error +handling to the end. + +Step 1. Initiator does a DNS reverse lookup on the Destina- + tion address, asking not for the usual PTR records, + but for TXT records. 
Meanwhile, Initiator also + sends a ping to the Destination, to cause any other + dynamic setup actions to start happening. (Ping + replies are disregarded; the host might not be + reachable with plaintext pings.) + +Step 2A. If at least one suitable TXT record (see section + 2.3) comes back, each contains a potential + + + +Draft 4 3 May 2001 4 + + + + + + Opportunistic Encryption + + + Responder's IP address and that Responder's public + key (or where to find it). Initiator picks one TXT + record, based on priority (see 2.3), thus picking a + Responder. If there was no public key in the TXT + record, the Initiator also starts a DNS lookup (as + specified by the TXT record) to get KEY records. + +Step 2B. If no suitable TXT record is available, and policy + permits, Initiator designates the Destination + itself as the Responder (see section 2.4). If pol- + icy does not permit, or the Destination is unre- + sponsive to the negotiation, then opportunistic + encryption is not possible, and Initiator gives up + (see section 2.5). + +Step 3. If there already is a keying channel to the Respon- + der's IP address, the Initiator uses the existing + keying channel; skip to step 10. Otherwise, the + Initiator starts an IKE Phase 1 negotiation (see + section 2.7 for details) with the Responder. The + address family of the Responder's IP address dic- + tates whether the keying channel and the outside of + the tunnel should be IPv4 or IPv6. + +Step 4. Responder gets the first IKE message, and responds. + It also starts a DNS reverse lookup on the Initia- + tor's IP address, for KEY records, on speculation. + +Step 5. Initiator gets Responder's reply, and sends first + message of IKE's D-H exchange (see 2.4). + +Step 6. Responder gets Initiator's D-H message, and + responds with a matching one. + +Step 7. Initiator gets Responder's D-H message; encryption + is now established, authentication remains to be + done. 
Initiator sends IKE authentication message, + with an FQDN identity if a reverse lookup on its + address will not yield a suitable KEY record. + (Note, an FQDN need not actually correspond to a + host--e.g., the DNS data for it need not include an + A record.) + +Step 8. Responder gets Initiator's authentication message. + If there is no identity included, Responder waits + for step 4's speculative DNS lookup to finish; it + should yield a suitable KEY record (see 2.3). If + there is an FQDN identity, responder discards any + data obtained from step 4's DNS lookup; does a for- + ward lookup on the FQDN, for a KEY record; waits + for that lookup to return; it should yield a suit- + able KEY record. Either way, Responder uses the + KEY data to verify the message's hash. Responder + replies with an authentication message, with an + + + +Draft 4 3 May 2001 5 + + + + + + Opportunistic Encryption + + + FQDN identity if a reverse lookup on its address + will not yield a suitable KEY record. + +Step 9A. (If step 2A was used.) The Initiator gets the + Responder's authentication message. Step 2A has + provided a key (from the TXT record or via DNS + lookup). Verify message's hash. Encrypted and + authenticated keying channel established, man-in- + middle attack precluded. + +Step 9B. (If step 2B was used.) The Initiator gets the + Responder's authentication message, which must con- + tain an FQDN identity (if the Responder can't put a + TXT in his reverse map he presumably can't do a KEY + either). Do forward lookup on the FQDN, get suit- + able KEY record, verify hash. Encrypted keying + channel established, man-in-middle attack pre- + cluded, but authentication weak (see 2.4). + +Step 10. Initiator initiates IKE Phase 2 negotiation (see + 2.7) to establish tunnel, specifying Source and + Destination identities as IP addresses (see 2.6). + The address family of those addresses also deter- + mines whether the inside of the tunnel should be + IPv4 or IPv6. + +Step 11. 
Responder gets first Phase 2 message. Now the + Responder finally knows what's going on! Unless + the specified Source is identical to the Initiator, + Responder initiates DNS reverse lookup on Source IP + address, for TXT records; waits for result; gets + suitable TXT record(s) (see 2.3), which should con- + tain either the Initiator's IP address or an FQDN + identity identical to that supplied by the Initia- + tor in step 7. This verifies that the Initiator is + authorized to act as SG for the Source. Responder + replies with second Phase 2 message, selecting + acceptable details (see 2.7), and establishes tun- + nel. + +Step 12. Initiator gets second Phase 2 message, establishes + tunnel (if he didn't already), and releases the + intercepted packet into it, finally. + +Step 13. Communication proceeds. See section 3 for what + happens later. + +As additional information becomes available, notably in +steps 1, 2, 4, 8, 9, 11, and 12, there is always a possibil- +ity that local policy (e.g., access limitations) might pre- +vent further progress. Whenever possible, at least attempt +to inform the other end of this. + + + + + +Draft 4 3 May 2001 6 + + + + + + Opportunistic Encryption + + +At any time, there is a possibility of the negotiation fail- +ing due to unexpected responses, e.g. the Responder not +responding at all or rejecting all Initiator's proposals. +If multiple SGs were found as possible Responders, the Ini- +tiator should try at least one more before giving up. The +number tried should be influenced by what the alternative +is: if the traffic will otherwise be discarded, trying the +full list is probably appropriate, while if the alternative +is plaintext transmission, it might be based on how long the +tries are taking. The Initiator should try as many as it +reasonably can, ideally all of them. + +There is a sticky problem with timeouts. 
If the Responder +is down or otherwise inaccessible, in the worst case we +won't hear about this except by not getting responses. Some +other, more pathological or even evil, failure cases can +have the same result. The problem is that in the case where +plaintext is permitted, we want to decide whether a tunnel +is possible quickly. There is no good solution to this, +alas; we just have to take the time and do it right. (Pass- +ing plaintext meanwhile looks attractive at first glance... +but exposing the first few seconds of a connection is often +almost as bad as exposing the whole thing. Worse, if the +user checks the status of the connection, after that brief +window it looks secure!) + +The flip side of waiting for a timeout is that all other +forms of feedback, e.g. ``host not reachable'', arguably +should be ignored, because in the absence of authenticated +ICMP, you cannot trust them! + +Rationale: An alternative, sometimes suggested, to the use +of explicit DNS records for SG discovery is to directly +attempt IKE negotiation with the destination host, and +assume that any relevant SG will be on the packet path, will +intercept the IKE packets, and will impersonate the destina- +tion host for the IKE negotiation. This is superficially +attractive but is a very bad idea. It assumes that routing +is stable throughout negotiation, that the SG is on the +plaintext-packets path, and that the destination host is +routable (yes, it is possible to have (private) DNS data for +an unroutable host). Playing extra games in the plaintext- +packet path hurts performance and can be expected to be +unpopular. Various difficulties ensue when there are multi- +ple SGs along the path (there is already bad experience with +this, in RSVP), and the presence of even one can make it +impossible to do IKE direct to the host when that is what's +wanted. 
Worst of all, such impersonation breaks the IP net- +work model badly, making problems difficult to diagnose and +impossible to work around (and there is already bad experi- +ence with this, in areas like web caching). + +Rationale: (Step 1.) Dynamic setup actions might include +establishment of demand-dialed links. These might be + + + +Draft 4 3 May 2001 7 + + + + + + Opportunistic Encryption + + +present anywhere along the path, so one cannot rely on out- +of-band communication at the Initiator to trigger them. +Hence the ping. + +Rationale: (Step 2.) In many cases, the IP address on the +intercepted packet will be the result of a name lookup just +done. Inverse queries, an obscure DNS feature from the dis- +tant past, in theory can be used to ask a DNS server to +reverse that lookup, giving the name that produced the +address. This is not the same as a reverse lookup, and the +difference can matter a great deal in cases where a host +does not control its reverse map (e.g., when the host's IP +address is dynamically assigned). Unfortunately, inverse +queries were never widely implemented and are now considered +obsolete. Phooey. + +Ahem: Support for a small subset of this admittedly-obscure +feature would be useful. Unfortunately, it seems unlikely. + +Rationale: (Step 3.) Using only IP addresses to decide +whether there is already a relevant keying channel avoids +some difficult problems. In particular, it might seem that +this should be based on identities, but those are not known +until very late in IKE Phase 1 negotiations. + +Rationale: (Step 4.) The DNS lookup is done on speculation +because the data will probably be useful and the lookup can +be done in parallel with IKE activity, potentially speeding +things up. + +Rationale: (Steps 7 and 8.) 
If an SG does not control its +reverse map, there is no way it can prove its right to use +an IP address, but it can nevertheless supply both an iden- +tity (as an FQDN) and proof of its right to use that iden- +tity. This is somewhat better than nothing, and may be +quite useful if the SG is representing a client host which +can prove its right to its IP address. (For example, a +fixed-address subnet might live behind an SG with a dynami- +cally-assigned address; such an SG has to be the Initiator, +not the Responder, so the subnet's TXT records can contain +FQDN identities, but with that restriction, this works.) It +might sound like this would permit some man-in-the-middle +attacks in important cases like Road Warrior, but the RW can +still do full authentication of the home base, so a man in +the middle cannot successfully impersonate home base, and +the D-H exchange doesn't work unless the man in the middle +impersonates both ends. + +Rationale: (Steps 7 and 8.) Another situation where proof +of the right to use an identity can be very useful is when +access is deliberately limited. While opportunistic encryp- +tion is intended as a general-purpose connection mechanism +between strangers, it may well be convenient for prearranged +connections to use the same mechanism. + + + +Draft 4 3 May 2001 8 + + + + + + Opportunistic Encryption + + +Rationale: (Steps 7 and 8.) FQDNs as identities are avoided +where possible, since they can involve synchronous DNS +lookups. + +Rationale: (Step 11.) Note that only here, in Phase 2, does +the Responder actually learn who the Source and Destination +hosts are. This unfortunately demands a synchronous DNS +lookup to verify that the Initiator is authorized to repre- +sent the Source, unless they are one and the same. This and +the initial TXT lookup are the only synchronous DNS lookups +absolutely required by the algorithm, and they appear to be +unavoidable. 
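The step 11 check can be sketched in a few lines of Python. This is only an illustration, not part of the specification: it assumes the TXT strings have already been fetched (and, ideally, DNSSEC-validated) by some other code, it uses the record form defined in section 2.3.1, and the names parse_gateway_txt and initiator_may_represent are hypothetical. Lower-casing the FQDN and dropping a trailing dot before comparison is also an assumption; the text only says the identities must be identical.

```python
import ipaddress
import re

# Record form from section 2.3.1:  X-IPsec-Gateway(nnn)=iii kkk
# (kkk, the public key, is optional).
_TXT_RE = re.compile(r"^X-IPsec-Gateway\(([0-9]+)\)=(\S+)(?:\s+(\S.*))?$")


def parse_gateway_txt(txt):
    """Return (preference, gateway_id, key_or_None); None if the record is ignored."""
    m = _TXT_RE.match(txt.strip())
    if m is None:
        return None                       # not an SG record at all
    pref = int(m.group(1))
    if pref > 0xFFFF:                     # nnn must fit in 16 bits
        return None
    ident = m.group(2)
    if ident.startswith("@"):
        fqdn = ident[1:].rstrip(".").lower()   # assumed normalization
        if not fqdn:
            return None
        ident = "@" + fqdn
    else:
        try:
            ident = str(ipaddress.ip_address(ident))  # IPv4 or IPv6, canonical text
        except ValueError:
            return None
    return pref, ident, m.group(3)


def initiator_may_represent(source_txt_records, initiator_ip, initiator_fqdn=None):
    """True if some TXT record in the Source's reverse map names the Initiator,
    either by IP address or by the FQDN identity it supplied in step 7."""
    wanted = {str(ipaddress.ip_address(initiator_ip))}
    if initiator_fqdn:
        wanted.add("@" + initiator_fqdn.rstrip(".").lower())
    for txt in source_txt_records:
        parsed = parse_gateway_txt(txt)
        if parsed is not None and parsed[1] in wanted:
            return True
    return False
```

A Responder would skip this check entirely when the Source address equals the Initiator address, since no lookup is needed in that case.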
+ +Rationale: While it might seem unlikely that a refusal to +cooperate from one SG could be remedied by trying another-- +presumably they all use the same policies--it's conceivable +that one might be misconfigured. Preferably they should all +be tried, but it may be necessary to set some limits on this +if alternatives exist. + +2.3. DNS Records + +Gateway discovery and key lookup are based on TXT and KEY +DNS records. The TXT record specifies IP address or other +identity of a host's SG, and possibly supplies its public +key as well, while the KEY record supplies public keys not +found in TXT records. + +2.3.1. TXT + +Opportunistic-encryption SG discovery uses TXT records with +the content: + + X-IPsec-Gateway(nnn)=iii kkk + +following RFC 1464 attribute/value notation. Records which +do not contain an ``='', or which do not have exactly the +specified form to the left of it, are ignored. (Near misses +perhaps should be reported.) + +The nnn is an unsigned integer which will fit in 16 bits, +specifying an MX-style preference (lower number = stronger +preference) to control the order in which multiple SGs are +tried. If there are ties, pick one, randomly enough that +the choice will probably be different each time. The pref- +erence field is not optional; use ``0'' if there is no mean- +ingful preference ordering. + +The iii part identifies the SG. Normally this is a dotted- +decimal IPv4 address or a colon-hex IPv6 address. The sole +exception is if the SG has no fixed address (see 2.4) but +the host(s) behind it do, in which case iii is of the form +``@fqdn'', where fqdn is the FQDN that the SG will use to +identify itself (in step 7 of section 2.2); such a record + + + +Draft 4 3 May 2001 9 + + + + + + Opportunistic Encryption + + +cannot be used for SG discovery by an Initiator, but can be +used for SG verification (step 11 of 2.2) by a Responder. + +The kkk part is optional. 
If it is present, it is an RSA- +MD5 public key in base-64 notation, as in the text form of +an RFC 2535 KEY record. If it is not present, this speci- +fies that the public key can be found in a KEY record +located based on the SG's identification: if iii is an IP +address, do a reverse lookup on that address, else do a for- +ward lookup on the FQDN. + +Rationale: While it is unusual for a reverse lookup to go +for records other than PTR records (or possibly CNAME +records, for RFC 2317 classless delegation), there's no rea- +son why it can't. The TXT record is a temporary stand-in +for (we hope, someday) a new DNS record for SG identifica- +tion and keying. Keeping the setup process fast requires +minimizing the number of DNS lookups, hence the desire to +put all the information in one place. + +Rationale: The use of RFC 1464 notation avoids collisions +with other uses of TXT records. The ``X-'' in the attribute +name indicates that this format is tentative and experimen- +tal; this design will probably need modification after ini- +tial experiments. The format is chosen with an eye on even- +tual binary encoding. Note, in particular, that the TXT +record normally contains the address of the SG, not (repeat, +not) its name. Name-to-address conversion is the job of +whatever generates the TXT record, which is expected to be a +program, not a human--this is conceptually a binary record, +temporarily using a text encoding. The ``@fqdn'' form of +the SG identity is for specialized uses and is never mapped +to an address. + +Ahem: A DNS TXT record contains one or more character +strings, but RFC 1035 does not describe exactly how a multi- +string TXT record is interpreted. This is relevant because +a string can be at most 255 characters, and public keys can +exceed this. 
Empirically, the standard pattern is that each +string which is both less than 255 characters and not the +final string of the record should have a blank appended to +it, and the strings of the record should then be concate- +nated. (This observation is based on how BIND 8 transforms +a TXT record from text to DNS binary.) + +2.3.2. KEY + +An opportunistic-encryption KEY record is an Authentication- +permitted, Entity (host), non-Signatory, IPsec, RSA/MD5 +record (that is, its first four bytes are 0x42000401), as +per RFCs 2535 and 2537. KEY records with other flags, pro- +tocol, or algorithm values are ignored. + + + + + +Draft 4 3 May 2001 10 + + + + + + Opportunistic Encryption + + +Rationale: Unfortunately, the public key has to be associ- +ated with the SG, not the client host behind it. The +Responder does not know which client it is supposed to be +representing, or which client the Initiator is representing, +until far too late. + +Ahem: Per-client keys would reduce vulnerability to key com- +promise, and simplify key changes, but they would require +changes to IKE Phase 1, to separately identify the SG and +its initial client(s). (At present, the client identities +are not known to the Responder until IKE Phase 2.) While +the current IKE standard does not actually specify (!) who +is being identified by identity payloads, the overwhelming +consensus is that they identify the SG, and as seen earlier, +this has important uses. + +2.3.3. Summary + +For reference, the minimum set of DNS records needed to make +this all work is either: + +1. TXT in Destination reverse map, identifying Responder + and providing public key. + +2. KEY in Initiator reverse map, providing public key. + +3. TXT in Source reverse map, verifying relationship to + Initiator. + +or: + +1. TXT in Destination reverse map, identifying Responder. + +2. KEY in Responder reverse map, providing public key. + +3. KEY in Initiator reverse map, providing public key. + +4. 
TXT in Source reverse map, verifying relationship to + Initiator. + +Slight complications ensue for dynamic addresses, lack of +control over reverse maps, etc. + +2.3.4. Implementation + +In the long run, we need either a tree of trust or a web of +trust, so we can trust our DNS data. The obvious approach +for DNS is a tree of trust, but there are various practical +problems with running all of this through the root servers, +and a web of trust is arguably more robust anyway. This is +logically independent of opportunistic encryption, and a +separate design proposal will be prepared. + + + + + +Draft 4 3 May 2001 11 + + + + + + Opportunistic Encryption + + +Interim stages of implementation of this will require a bit +of thought. Notably, we need some way of dealing with the +lack of fully signed DNSSEC records right away. Without +user interaction, probably the best we can do is to remember +the results of old fetches, compare them to the results of +new fetches, and complain and disbelieve all of it if +there's a mismatch. This does mean that somebody who gets +fake data into our very first fetch will fool us, at least +for a while, but that seems an acceptable tradeoff. (Obvi- +ously there needs to be a way to manually flush the remem- +bered results for a specific host, to permit deliberate +changes.) + +2.4. Responders Without Credentials + +In cases where the Destination simply does not control its +DNS reverse-map entries, there is no verifiable way to +determine a suitable SG. This does not make communication +utterly impossible, though. + +Simply attempting negotiation directly with the host is a +last resort. (An aggressive implementation might wish to +attempt it in parallel, rather than waiting until other +options are known to be unavailable.) In particular, in +many cases involving dynamic addresses, it will work. 
It +has the disadvantage of delaying the discovery that oppor- +tunistic encryption is entirely impossible, but the case +seems common enough to justify the overhead. + +However, there are policy issues here either way, because it +is possible to impersonate such a host. The host can supply +an FQDN identity and verify its right to use that identity, +but except by prearrangement, there is no way to verify that +the FQDN is the right one for that IP address. (The data +from forward lookups may be controlled by people who do not +own the address, so it cannot be trusted.) The encryption +is still solid, though, so in many cases this may be useful. + +2.5. Failure of Opportunism + +When there is no way to do opportunistic encryption, a pol- +icy issue arises: whether to put in a bypass (which allows +plaintext traffic through) or a block (which discards it, +perhaps with notification back to the sender). The choice +is very much a matter of local policy, and may depend on +details such as the higher-level protocol being used. For +example, an SG might well permit plaintext HTTP but forbid +plaintext Telnet, in which case both a block and a bypass +would be set up if opportunistic encryption failed. + +A bypass/block must, in practice, be treated much like an +IPsec tunnel. It should persist for a while, so that high- +overhead processing doesn't have to be done for every +packet, but should go away eventually to return resources. + + + +Draft 4 3 May 2001 12 + + + + + + Opportunistic Encryption + + +It may be simplest to treat it as a degenerate tunnel. It +should have a relatively long lifetime (say 6h) to keep the +frequency of negotiation attempts down, except in the case +where the other SG simply did not respond to IKE packets, +where the lifetime should be short (say 10min) because the +other SG is presumably down and might come back up again. 
+(Cases where the other SG responded to IKE with unauthenti- +cated error reports like ``port unreachable'' are border- +line, and might deserve to be treated as an intermediate +case: while such reports cannot be trusted unreservedly, in +the absence of any other response, they do give some reason +to suspect that the other SG is unable or unwilling to par- +ticipate in opportunistic encryption.) + +As noted in section 2.1, one might think that arrival of a +plaintext incoming packet should cause a bypass/block to be +set up for its source host: such a packet is almost always +followed by an outgoing reply packet; the incoming packet is +clear evidence that opportunistic encryption is not avail- +able at the other end; attempting it will waste resources +and delay traffic to no good purpose. Unfortunately, this +means that anyone out on the Internet who can forge a source +address can prevent encrypted communication! Since their +source addresses are not authenticated, plaintext packets +cannot be taken as evidence of anything, except perhaps that +communication from that host is likely to occur soon. + +There needs to be a way for local administrators to remove a +bypass/block ahead of its normal expiry time, to force a +retry after a problem at the other end is known to have been +fixed. + +2.6. Subnet Opportunism + +In principle, when the Source or Destination host belongs to +a subnet and the corresponding SG is willing to provide tun- +nels to the whole subnet, this should be done. There is no +extra overhead, and considerable potential for avoiding +later overhead if similar communication occurs with other +members of the subnet. Unfortunately, at the moment, oppor- +tunistic tunnels can only have degenerate subnets (single +hosts) at their ends. (This does, at least, set up the key- +ing channel, so that negotiations for tunnels to other hosts +in the same subnets will be considerably faster.) 
+ +The crucial problem is step 11 of section 2.2: the Responder +must verify that the Initiator is authorized to represent +the Source, and this is impossible for a subnet because +there is no way to do a reverse lookup on it. Information +in DNS records for a name or a single address cannot be +trusted, because they may be controlled by people who do not +control the whole subnet. + + + + + +Draft 4 3 May 2001 13 + + + + + + Opportunistic Encryption + + +Ahem: Except in the special case of a subnet masked on a +byte boundary (in which case RFC 1035's convention of an +incomplete in-addr.arpa name could be used), subnet lookup +would need extensions to the reverse-map name space, perhaps +along the lines of that commonly done for RFC 2317 delega- +tion. IPv6 already has suitable name syntax, as in RFC +2874, but has no specific provisions for subnet entries in +its reverse maps. Fixing all this is not conceptually +difficult, but is logically independent of opportunistic +encryption, and will be proposed separately. + +A less-troublesome problem is that the Initiator, in step 10 +of 2.2, must know exactly what subnet is present on the +Responder's end so it can propose a tunnel to it. This +information could be included in the TXT record of the Des- +tination (it would have to be verified with a subnet lookup, +but that could be done in parallel with other operations). +The Initiator presumably can be configured to know what sub- +net(s) are present on its end. + +2.7. Option Settings + +IPsec and IKE have far too many useless options, and a few +useful ones. IKE negotiation is quite simplistic, and can- +not handle even simple discrepancies between the two SGs. +So it is necessary to be quite specific about what should be +done and what should be proposed, to guarantee interoper- +ability without prearrangement or other negotiation proto- +cols. + +Rationale: The prohibition of other negotiations is simply +because there is no time. 
The setup algorithm (section 2.2) +is lengthy already. + +[Open question: should opportunistic IKE use a different +port than normal IKE?] + +Somewhat arbitrarily and tentatively, opportunistic SGs must +support Main Mode, Oakley group 5 for D-H, 3DES encryption +and MD5 authentication for both ISAKMP and IPsec SAs, +RSA/MD5 digital-signature authentication with keys between +2048 and 8192 bits, and ESP doing both encryption and +authentication. They must do key PFS in Quick Mode, but not +identity PFS. They may support IPComp, preferably using +Deflate, but must not insist on it. They may support AES as +an alternative to 3DES, but must not insist on it. + +Rationale: Identity PFS essentially requires establishing a +complete new keying channel for each new tunnel, but key PFS +just does a new Diffie-Hellman exchange for each rekeying, +which is relatively cheap. + +Keying channels must remain in existence at least as long as +any tunnel created with them remains (they are not costly, + + + +Draft 4 3 May 2001 14 + + + + + + Opportunistic Encryption + + +and keeping the management path up and available simplifies +various issues). See section 3.1 for related issues. Given +the use of key PFS, frequent rekeying does not seem critical +here. In the absence of strong reason to do otherwise, the +Initiator should propose rekeying at 8hr-or-1MB. The +Responder must accept any proposal which specifies a rekey- +ing time between 1hr and 24hr inclusive and a rekeying vol- +ume between 100KB and 10MB inclusive. + +Given the short expected useful life of most tunnels (see +section 3.1), very few of them will survive long enough to +be rekeyed. In the absence of strong reason to do other- +wise, the Initiator should propose rekeying at 1hr-or-100MB. +The Responder must accept any proposal which specifies a +rekeying time between 10min and 8hr inclusive and a rekeying +volume between 1MB and 1000MB inclusive. 
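
The Responder's acceptance rule in the paragraph immediately above amounts to a pair of inclusive range checks. The following is a minimal sketch only, using that paragraph's figures (10min to 8hr, 1MB to 1000MB); the function and constant names are hypothetical, and 1MB is assumed here to mean 10^6 bytes, which the text does not specify.

```python
# Illustrative bounds copied from the paragraph above; all names are hypothetical.
MIN_REKEY_SECONDS = 10 * 60        # 10min
MAX_REKEY_SECONDS = 8 * 60 * 60    # 8hr
MIN_REKEY_BYTES = 1 * 10**6        # 1MB (assumption: decimal megabytes)
MAX_REKEY_BYTES = 1000 * 10**6     # 1000MB

# The Initiator's suggested default proposal: 1hr-or-100MB.
DEFAULT_PROPOSAL = (60 * 60, 100 * 10**6)


def responder_accepts(rekey_seconds: int, rekey_bytes: int) -> bool:
    """Return True if both proposed limits fall within the inclusive ranges."""
    return (MIN_REKEY_SECONDS <= rekey_seconds <= MAX_REKEY_SECONDS
            and MIN_REKEY_BYTES <= rekey_bytes <= MAX_REKEY_BYTES)
```

Keeping the bounds as named constants makes them easy to revise, which matters because the intervals are explicitly described as arbitrary and subject to re-assessment.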
+ +It is highly desirable to add some random jitter to the +times of actual rekeying attempts, to break up ``convoys'' +of rekeying events; this and certain other aspects of robust +rekeying practice will be the subject of a separate design +proposal. + +Rationale: The numbers used here for rekeying intervals are +chosen quite arbitrarily and should be re-assessed after +some implementation experience is gathered. + +3. Renewal and Teardown + +3.1. Aging + +When to tear tunnels down is a bit problematic, but if we're +setting up a potentially unbounded number of them, we have +to tear them down somehow sometime. + +Set a short initial tentative lifespan, say 1min, since most +net flows in fact last only a few seconds. When that +expires, look to see if the tunnel is still in use (defini- +tion: has had traffic, in either direction, in the last half +of the tentative lifespan). If so, assign it a somewhat +longer tentative lifespan, say 20min, after which, look +again. If not, close it down. (This tentative lifespan is +independent of rekeying; it is just the time when the tun- +nel's future is next considered. This should happen reason- +ably frequently, unlike rekeying, which is costly and +shouldn't be too frequent.) Multi-step backoff algorithms +are not worth the trouble; looking every 20min doesn't seem +onerous. + +If the security gateway and the client host are one and the +same, tunnel teardown decisions might wish to pay attention +to TCP connection status, as reported by the local TCP +layer. A still-open TCP connection is almost a guarantee +that more traffic is coming, while the demise of the only + + + +Draft 4 3 May 2001 15 + + + + + + Opportunistic Encryption + + +TCP connection through a tunnel is a strong hint that none +is. If the SG and the client host are separate machines, +though, tracking TCP connection status requires packet +snooping, which is complicated and probably not worthwhile. 
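
The aging rule of section 3.1 (a short tentative lifespan, then a check for traffic in the last half of that lifespan, then either a longer lifespan or teardown) can be sketched as follows. This is an illustration under stated assumptions, not FreeS/WAN code: the `Tunnel` class, the `review` function, and the use of caller-supplied timestamps in seconds are all hypothetical.

```python
from dataclasses import dataclass

INITIAL_LIFESPAN = 60.0       # seconds; "say 1min"
LATER_LIFESPAN = 20 * 60.0    # seconds; "say 20min"


@dataclass
class Tunnel:
    created: float                       # time the tunnel came up
    last_traffic: float                  # most recent packet, either direction
    lifespan: float = INITIAL_LIFESPAN   # current tentative lifespan
    review_at: float = 0.0               # when the tunnel's future is next considered

    def __post_init__(self):
        self.review_at = self.created + self.lifespan


def review(tunnel: Tunnel, now: float) -> bool:
    """Apply the aging rule; return True to keep the tunnel, False to close it."""
    if now < tunnel.review_at:
        return True  # the tentative lifespan has not expired yet
    # "In use" means traffic during the last half of the lifespan that just ended.
    window_start = tunnel.review_at - tunnel.lifespan / 2
    if tunnel.last_traffic >= window_start:
        tunnel.lifespan = LATER_LIFESPAN
        tunnel.review_at = now + LATER_LIFESPAN
        return True
    return False
```

The review is deliberately independent of rekeying, as the text requires: it only decides when the tunnel's future is next considered.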
+ +IKE keying channels likewise are torn down when it appears +the need has passed. They always linger longer than the +last tunnel they administer, in case they are needed again; +the cost of retaining them is low. Other than that, unless +the number of keying channels on the SG gets large, the SG +should simply retain all of them until rekeying time, since +rekeying is the only costly event. When about to rekey a +keying channel which has no current tunnels, note when the +last actual keying-channel traffic occurred, and close the +keying channel down if it wasn't in the last, say, 30min. +When rekeying a keying channel (or perhaps shortly before +rekeying is expected), Initiator and Responder should re- +fetch the public keys used for SG authentication, against +the possibility that they have changed or disappeared. + +See section 2.7 for discussion of rekeying intervals. + +Given the low user impact of tearing down and rebuilding a +connection (a tunnel or a keying channel), rekeying attempts +should not be too persistent: one can always just rebuild +when needed, so heroic efforts to preserve an existing con- +nection are unnecessary. Say, try every 10s for a minute +and every minute for 5min, and then give up and declare the +connection (and all other connections to that IKE peer) +dead. + +Rationale: In future, more sophisticated, versions of this +protocol, examining the initial packet might permit a more +intelligent guess at the tunnel's useful life. HTTP connec- +tions in particular are notoriously bursty and repetitive. + +Rationale: Note that rekeying a keying connection basically +consists of building a new keying connection from scratch, +using IKE Phase 1, and abandoning the old one. + +3.2. Teardown and Cleanup + +Teardown should always be coordinated with the other end. +This means interpreting and sending Delete notifications. 
+ +On receiving a Delete for the outbound SAs of a tunnel (or +some subset of them), tear down the inbound ones too, and +notify the other end with a Delete. Tunnels need to be con- +sidered as bidirectional entities, even though the low-level +protocols don't think of them that way. + +When the deletion is initiated locally, rather than as a +response to a received Delete, send a Delete for (all) the +inbound SAs of a tunnel. If no responding Delete is + + + +Draft 4 3 May 2001 16 + + + + + + Opportunistic Encryption + + +received for the outbound SAs, try re-sending the original +Delete. Three tries spaced 10s apart seems a reasonable +level of effort. (Indefinite persistence is not necessary; +whether the other end isn't cooperating because it doesn't +feel like it, or because it is down/disconnected/etc., the +problem will eventually be cleared up by other means.) + +After rekeying, transmission should switch to using the new +SAs (ISAKMP or IPsec) immediately, and the old leftover SAs +should be cleared out promptly (and Deletes sent) rather +than waiting for them to expire. This reduces clutter and +minimizes confusion. + +Since there is only one keying channel per remote IP +address, the question of whether a Delete notification has +appeared on a ``suitable'' keying channel does not arise. + +Rationale: The pairing of Delete notifications effectively +constitutes an acknowledged Delete, which is highly desir- +able. + +3.3. Outages and Reboots + +Tunnels sometimes go down because the other end crashes, or +disconnects, or has a network link break, and there is no +notice of this in the general case. (Even in the event of a +crash and successful reboot, other SGs don't hear about it +unless the rebooted SG has specific reason to talk to them +immediately.) Over-quick response to temporary network out- +ages is undesirable... 
but note that a tunnel can be torn +down and then re-established without any user-visible effect +except a pause in traffic, whereas if one end does reboot, +the other end can't get packets to it at all (except via +IKE) until the situation is noticed. So a bias toward quick +response is appropriate, even at the cost of occasional +false alarms. + +Heartbeat mechanisms are somewhat unsatisfactory for this. +Unless they are very frequent, which causes other problems, +they do not detect the problem promptly. + +Ahem: What is really wanted is authenticated ICMP. This +might be a case where public-key encryption/authentication +of network packets is the right thing to do, despite the +expense. + +In the absence of that, a two-part approach seems warranted. + +First, when an SG receives an IPsec packet that is addressed +to it, and otherwise appears healthy, but specifies an +unknown SA and is from a host that the receiver currently +has no keying channel to, the receiver must attempt to +inform the sender via an IKE Initial-Contact notification +(necessarily sent in plaintext, since there is no suitable + + + +Draft 4 3 May 2001 17 + + + + + + Opportunistic Encryption + + +keying channel). This must be severely rate-limited on both +ends; one notification per SG pair per minute seems ample. + +Second, there is an obvious difficulty with this: the Ini- +tial-Contact notification is unauthenticated and cannot be +trusted. So it must be taken as a hint only: there must be +a way to confirm it. + +What is needed here is something that's desirable for debug- +ging and testing anyway: an IKE-level ping mechanism. Ping- +ing direct at the IP level instead will not tell us about a +crash/reboot event. Sending pings through tunnels has vari- +ous complications (they should stop at the far mouth of the +tunnel instead of going on to a subnet; they should not +count against idle timers; etc.). What is needed is a con- +tinuity check on a keying channel. 
(This could also be used +as a heartbeat, should that seem useful.) + +IKE Ping delivery need not be reliable, since the whole +point of a ping is simply to provoke an acknowledgement. +They should preferably be authenticated, but it is not clear +that this is absolutely necessary, although if they are not +they need encryption plus a timestamp or a nonce, to foil +replay mischief. How they are implemented is a secondary +issue, and a separate design proposal will be prepared. + +Ahem: Some existing implementations are already using (pri- +vate) notify value 30000 (``LIKE_HELLO'') as ping and (pri- +vate) notify value 30002 (``SHUT_UP'') as ping reply. + +If an IKE Ping gets no response, try some (say 8) IP pings, +spaced a few seconds apart, to check IP connectivity; if one +comes back, try another IKE Ping; if that gets no response, +the other end probably has rebooted, or otherwise been re- +initialized, and its tunnels and keying channel(s) should be +torn down. + +In a similar vein, given limited rekeying persistence, a +short network outage could take some tunnels down without +disrupting others. On receiving a packet for an unknown SA +from a host that a keying channel is currently open to, send +that host an Invalid-SPI notification for that SA. The other +host can then tear down the half-torn-down tunnel, and nego- +tiate a new tunnel for the traffic it presumably still wants +to send. + +Finally, it would be helpful if SGs made some attempt to +deal intelligently with crashes and reboots. A deliberate +shutdown should include an attempt to notify all other SGs +currently connected by keying channels, using Deletes, that +communication is about to fail. (Again, these will be taken +as teardowns; attempts by the other SGs to negotiate new +tunnels as replacements should be ignored at this point.) 
+And when possible, SGs should attempt to preserve + + + +Draft 4 3 May 2001 18 + + + + + + Opportunistic Encryption + + +information about currently-connected SGs in non-volatile +storage, so that after a crash, an Initial-Contact can be +sent to previous partners to indicate loss of all previ- +ously-established connections. + +4. Conclusions + +This design appears to achieve the objective of setting up +encryption with strangers. The authentication aspects also +seem adequately addressed if the destination controls its +reverse-map DNS entries and the DNS data itself can be reli- +ably authenticated as having originated from the legitimate +administrators of that subnet/FQDN. The authentication sit- +uation is less satisfactory when DNS is less helpful, but it +is difficult to see what else could be done about it. + +5. References + +[TBW] + +6. Appendix: Separate Design Proposals TBW + +o How can we build a web of trust with DNSSEC? (See sec- + tion 2.3.4.) + +o How can we extend DNS reverse lookups to permit reverse + lookup on a subnet? (Both address and mask must appear + in the name to be looked up.) (See section 2.6.) + +o How can rekeying be done as robustly as possible? (At + least partly, this is just documenting current FreeS/WAN + practice.) (See section 2.7.) + +o How should IKE Pings be implemented? (See section 3.3.) + + + + + + + + + + + + + + + + + + + + + + + +Draft 4 3 May 2001 19 + + |