author     Rene Mayrhofer <rene@mayrhofer.eu.org>    2006-05-22 05:12:18 +0000
committer  Rene Mayrhofer <rene@mayrhofer.eu.org>    2006-05-22 05:12:18 +0000
commit     aa0f5b38aec14428b4b80e06f90ff781f8bca5f1 (patch)
tree       95f3d0c8cb0d59d88900dbbd72110d7ab6e15b2a /doc/oppimpl.txt
parent     7c383bc22113b23718be89fe18eeb251942d7356 (diff)
download   vyos-strongswan-aa0f5b38aec14428b4b80e06f90ff781f8bca5f1.tar.gz
           vyos-strongswan-aa0f5b38aec14428b4b80e06f90ff781f8bca5f1.zip
Import initial strongswan 2.7.0 version into SVN.
Diffstat (limited to 'doc/oppimpl.txt')
-rw-r--r--   doc/oppimpl.txt   514
1 file changed, 514 insertions, 0 deletions
diff --git a/doc/oppimpl.txt b/doc/oppimpl.txt
new file mode 100644
index 000000000..fe4527d4e
--- /dev/null
+++ b/doc/oppimpl.txt
@@ -0,0 +1,514 @@

Implementing Opportunistic Encryption

Henry Spencer & D. Hugh Redelmeier

Version 4+, 15 Dec 2000



Updates

Major changes since last version: "Negotiation Issues" section discussing
some interoperability matters, plus some wording cleanup. Some issues
arising from discussions at OLS are not yet resolved, so there will almost
certainly be another version soon.

xxx incoming could be opportunistic or RW. xxx any way of saving unaware
implementations??? xxx compression needs mention.



Introduction

A major long-term goal of the FreeS/WAN project is opportunistic
encryption: a security gateway intercepts an outgoing packet aimed at a
new remote host, and quickly attempts to negotiate an IPsec tunnel to that
host's security gateway, so that traffic can be encrypted and
authenticated without changes to the host software. (This generalizes
trivially to the end-to-end case where host and security gateway are one
and the same.) If the attempt fails, the packet (or a retry thereof)
passes through in clear or is dropped, depending on local policy.
Prearranged tunnels bypass all this, so static VPNs can coexist with
opportunistic encryption.

xxx here Although significant intelligence about all this is necessary at
the initiator end, it's highly desirable for little or no special
machinery to be needed at the responder end. In particular, if none were
needed, then a security gateway which knows nothing about opportunistic
encryption could nevertheless participate in some opportunistic
connections.

IPSEC gives us the low-level mechanisms, and the key-exchange machinery,
but there are some vague spots (to put it mildly) at higher levels.

One constraint which deserves comment is that the process of tunnel setup
should be quick. Moreover, the decision that no tunnel can be created
should also be quick, since that will be a common case, at least in the
beginning. People will be reluctant to use opportunistic encryption if it
causes gross startup delays on every connection, even connections which
see no benefit from it. Win or lose, the process must be rapid.

There's nothing much we can do to speed up the key exchange itself. (The
one thing which conceivably might be done is to use Aggressive Mode,
which involves fewer round trips, but it has limitations and possible
security problems, and we're reluctant to touch it.) What we can do is
make the other parts of the setup process as quick as possible. This
desire will come back to haunt us below. :-)

A further note is that we must consider the processing at the responder
end as well as the initiator end.

Several pieces of new machinery are needed to make this work. Here's a
brief list, with details considered below.

+ Outgoing Packet Interception. KLIPS needs to intercept packets which
would likely benefit from tunnel setup, and bring them to Pluto's
attention. There needs to be enough memory in the process that the same
tunnel doesn't get proposed too often (win or lose).

+ Smart Connection Management. Not only do we need to establish tunnels
on request; once a tunnel is set up, it needs to be torn down eventually
if it's not in use. It's also highly desirable to detect that it has
stopped working, and do something useful. Status changes should be
coordinated between the two security gateways unless one has crashed, and
even then, they should get back into sync eventually.

+ Security Gateway Discovery. Given a packet destination, we must decide
who to attempt to negotiate a tunnel with. This must be done quickly, win
or lose, and reliably even in the presence of diverse network setups.

+ Authentication Without Prearrangement. We need to be sure we're really
talking to the intended security gateway, without being able to
prearrange any shared information. He needs the same assurance about us.

+ More Flexible Policy. In particular, the responding Pluto needs a way
to figure out whether the connection it is being asked to make is okay.
This isn't as simple as just searching our existing conn database -- we
probably have to specify *classes* of legitimate connections.

Conveniently, we have a three-letter acronym for each of these. :-)

Note on philosophy: we have deliberately avoided providing six different
ways to do each step, in favor of specifying one good one. Choices are
provided only when they appear to be necessary. (Or when we are not yet
quite sure how best to do something...)



OPI, SCM

Smart Connection Management would be quite useful even by itself,
requiring manual triggering. (Right now, we do the manual triggering, but
not the other parts of SCM.) Outgoing Packet Interception fits together
with SCM quite well, and improves its usefulness further. Going through a
connection's life cycle from the start...

OPI itself is relatively straightforward, aside from the nagging question
of whether the intercepted packet is put on hold and then released, or
dropped. Putting it on hold is preferable; the alternative is to rely on
the application or the transport layer re-trying. The downside of packet
hold is extra resources; the downside of packet dropping is that IPSEC
knows *when* the packet can finally go out, and the higher layers don't.
Either way, life gets a little tricky because a quickly-retrying
application may try more than once before we know for sure whether a
tunnel can be set up, and something has to detect and filter out the
duplications. Some ARP implementations use the approach of keeping one
packet for an as-yet-unresolved address, and throwing away any more that
appear; that seems a reasonable choice.
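
[Illustrative sketch, not part of the original document: one way such an
ARP-style hold table could behave, written in Python purely for clarity.
All names are hypothetical; start_negotiation() and send() stand in for
the real KLIPS/Pluto interfaces.]

    # Hypothetical sketch: hold one packet per destination while
    # negotiation runs; drop duplicates from quickly-retrying apps.
    class PendingTable:
        def __init__(self):
            self.held = {}                 # destination address -> one packet

        def intercept(self, dst, packet, start_negotiation):
            if dst in self.held:
                return "duplicate dropped" # negotiation already under way
            self.held[dst] = packet
            start_negotiation(dst)         # hand the decision to Pluto
            return "held"

        def verdict(self, dst, tunnel_up, clear_ok, send):
            packet = self.held.pop(dst, None)
            if packet is None:
                return
            if tunnel_up or clear_ok:
                send(packet)               # released, encrypted or in clear
            # otherwise the packet is dropped, per local policy
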
(Is it worth intercepting *incoming* packets, from the outside world, and
attempting tunnel setup based on them? Perhaps... if, and only if, we
organize AWP so that non-opportunistic SGs can do it somehow. Otherwise,
if the other end has not initiated tunnel setup itself, it will not be
prepared to do so at our request.)

Once a tunnel is up, packets going into it naturally are not intercepted
by OPI. However, we need to do something about the flip side of this too:
after deciding that we *cannot* set up a tunnel, either because we don't
have enough information or because the other security gateway is
uncooperative, we have to remember that for a while, so we don't keep
knocking on the same locked door. One plausible way of doing that is to
set up a bypass "tunnel" -- the equivalent of our current %passthrough
connection -- and have it managed like a real SCM tunnel (finite lifespan
etc.). This sounds a bit heavyweight, but in practice, the alternatives
all end up doing something very similar when examined closely. Note that
we need an extra variant of this, a block rather than a bypass, to cover
the case where local policy dictates that packets *not* be passed
through; we still have to remember the fact that we can't set up a real
tunnel.
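
[Illustrative sketch, not part of the original document: remembering
failures as managed pseudo-tunnels. The "pass"/"block" labels and class
names are hypothetical.]

    # Hypothetical sketch: a failed attempt becomes a finite-lifespan
    # "pass" (bypass) or "block" entry, so the same locked door is not
    # knocked on again until the entry expires.
    import time

    class PseudoTunnel:
        def __init__(self, kind, lifespan):
            self.kind = kind                       # "pass" or "block"
            self.expires = time.time() + lifespan

    class FailureCache:
        def __init__(self):
            self.entries = {}

        def remember(self, dst, clear_allowed, lifespan):
            kind = "pass" if clear_allowed else "block"
            self.entries[dst] = PseudoTunnel(kind, lifespan)

        def check(self, dst):
            entry = self.entries.get(dst)
            if entry is not None and entry.expires > time.time():
                return entry.kind        # still remembered: don't retry yet
            self.entries.pop(dst, None)  # expired: fresh attempt allowed
            return None
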
When to tear tunnels down is a bit problematic, but if we're setting up a
potentially unbounded number of them, we have to tear them down *somehow*
*sometime*. It seems fairly obvious that we set a tentative lifespan,
probably fairly short (say 1min), and when it expires, we look to see if
the tunnel is still in use (say, has had traffic in the last half of the
lifespan). If so, we assign it a somewhat longer lifespan (say 10min),
after which we look again. If not, we close it down. (This lifespan is
independent of key lifetime; it is just the time when the tunnel's future
is next considered. This should happen reasonably frequently, unlike
rekeying, which is costly and shouldn't be too frequent.) Multi-step
backoff algorithms probably are not worth the trouble; looking every
10min doesn't seem onerous.

For the tunnel-expiry decision, we need to know how long it has been since
the last traffic went through. A more detailed history of the traffic
does not seem very useful; a simple idle timer (or last-traffic timestamp)
is both necessary and sufficient. And KLIPS already has this.

As noted, default initial lifespan should be short. However, Pluto should
keep a history of recently-closed tunnels, to detect cases where a tunnel
is being repeatedly re-established and should be given a longer lifespan.
(Not only is tunnel setup costly, but it adds user-visible delay, so
keeping a tunnel alive is preferable if we have reason to suspect more
traffic soon.) Any tunnel re-established within 10min of dying should have
10min added to its initial lifespan. (Just leaving all tunnels open longer
is unappealing -- adaptive lifetimes which are sensitive to the behavior
of a particular tunnel are wanted. Tunnels are relatively cheap entities
for us, but that is not necessarily true of all implementations, and there
may also be administrative problems in sorting through large accumulations
of idle tunnels.)
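
[Illustrative sketch, not part of the original document: the lifespan
rules just described, with the 1min/10min numbers from the text.]

    # Hypothetical sketch: 1min tentative lifespan, 10min extensions
    # while traffic flows, and a 10min bonus for tunnels re-established
    # within 10min of dying.
    INITIAL = 60        # seconds
    EXTENSION = 600
    RECENT = 600

    def initial_lifespan(dst, recently_closed, now):
        closed_at = recently_closed.get(dst)
        if closed_at is not None and now - closed_at < RECENT:
            return INITIAL + EXTENSION  # repeatedly wanted: keep it longer
        return INITIAL

    def at_lifespan_end(idle_seconds, lifespan):
        # Traffic in the last half of the lifespan earns another 10min;
        # otherwise the tunnel should be closed down (None).
        return EXTENSION if idle_seconds < lifespan / 2 else None
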
It might be desirable to have detailed information about the initial
packet when determining lifespans. HTTP connections in particular are
notoriously bursty and repetitive.

Arguably it would be nice to monitor TCP connection status. A still-open
TCP connection is almost a guarantee that more traffic is coming, while
the closing of the only TCP connection through a tunnel is a good hint
that none is. But the monitoring is complex, and it doesn't seem worth
the trouble.

IKE connections likewise should be torn down when it appears the need has
passed. They should linger longer than the last tunnel they administer,
just in case they are needed again; the cost of retaining them is low. An
SG with only a modest number of them open might want to simply retain each
until rekeying time, with more aggressive management cutting in only when
the number gets large. (They should be torn down eventually, if only to
minimize the length of a status report, but rekeying is the only expensive
event for them.)

It's worth remembering that tunnels sometimes go down because the other
end crashes, or disconnects, or has a network link break, and we don't get
any notice of this in the general case. (Even in the event of a crash and
successful reboot, we won't hear about it unless the other end has
specific reason to talk IKE to us immediately.) Of course, we have to
guard against being too quick to respond to temporary network outages,
but it's not quite the same issue for us as for TCP, because we can tear
down and then re-establish a tunnel without any user-visible effect except
a pause in traffic. And if the other end does go down and come back up,
we and it can't communicate *at all* (except via IKE) until we tear down
our tunnel.

So... we need some kind of heartbeat mechanism. Currently there is none
in IKE, but there is discussion of changing that, and this seems like the
best approach. Doing a heartbeat at the IP level will not tell us about a
crash/reboot event, and sending heartbeat packets through tunnels has
various complications (they should stop at the far mouth of the tunnel
instead of going on to a subnet; they should not count against idle
timers; etc.). Heartbeat exchanges obviously should be done only when
there are tunnels established *and* there has been no recent incoming
traffic through them. It seems reasonable to do them at lifespan ends,
subject to appropriate rate limiting when more than one tunnel goes to the
same other SG. When all traffic between the two ends is supposed to go
via the tunnel, it might be reasonable to do a heartbeat -- subject to a
rate limiter to avoid DOS attacks -- if the kernel sees a non-tunnel
non-IKE packet from the other end.

If a heartbeat gets no response, try a few (say 3) pings to check IP
connectivity; if one comes back, try another heartbeat; if it gets no
response, the other end has rebooted, or otherwise been re-initialized,
and its tunnels should be torn down. If there's no response to the pings,
note the fact and try the sequence again at the next lifespan end; if
there's nothing then either, declare the tunnels dead.
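
[Illustrative sketch, not part of the original document: the probe
sequence as a function. send_heartbeat() and ping() are hypothetical
stand-ins for mechanisms IKE does not yet offer.]

    # Hypothetical sketch: heartbeat, then up to three pings, then a
    # second heartbeat; two failed rounds in a row mean the tunnels die.
    def probe_peer(send_heartbeat, ping, max_pings=3):
        if send_heartbeat():
            return "alive"
        if any(ping() for _ in range(max_pings)):
            # IP connectivity is fine; if IKE still won't answer, the
            # peer rebooted and its tunnels should be torn down.
            return "alive" if send_heartbeat() else "rebooted"
        return "unreachable"   # note it; retry at the next lifespan end

    def lifespan_end_check(state, probe_result):
        if probe_result == "rebooted":
            return "tear down tunnels"
        if probe_result == "unreachable":
            if state.get("failed_last_time"):
                return "declare tunnels dead"  # nothing, twice running
            state["failed_last_time"] = True
        else:
            state["failed_last_time"] = False
        return "keep"
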
Finally... except in cases where we've decided that the other end is dead
or has rebooted, tunnel teardown should always be coordinated with the
other end. This means interpreting and sending Delete notifications, and
also Initial-Contacts. Receiving a Delete for the other party's tunnel
SAs should lead us to tear down our end too -- SAs (SA bundles, really)
need to be considered as paired bidirectional entities, even though the
low-level protocols don't think of them that way.



SGD, AWP

Given a packet destination, how do we decide who to (attempt to) negotiate
a tunnel with? And as a related issue, how do the negotiating parties
authenticate each other? DNSSEC obviously provides the tools for the
latter, but how exactly do we use them?

Having intercepted a packet, what we know is basically the IP addresses of
source and destination (plus, in principle, some information about the
desired communication, like protocol and port). We might be able to map
the source address to more information about the source, depending on how
well we control our local networks, but we know nothing further about the
destination.

The obvious first thing to do is a DNS reverse lookup on the destination
address; that's about all we can do with available data. Ideally, we'd
like to get all necessary information with this one DNS lookup, because
DNS lookups are time-consuming -- all the more so if they involve a DNSSEC
signature-checking treewalk by the name server -- and we've got to hurry.
While it is unusual for a reverse lookup to yield records other than PTR
records (or possibly CNAME records, for RFC 2317 classless delegation),
there's no reason why it can't.

(For purposes like logging, a reverse lookup is usually followed by a
forward lookup, to verify that the reverse lookup wasn't lying about the
host name. For our purposes, this is not vital, since we use stronger
authentication methods anyway.)

While we want to get as much data as possible (ideally all of it) from one
lookup, it is useful to first consider how the necessary information would
be obtained if DNS lookups were instantaneous. Two pieces of information
are absolutely vital at this point: the IP address of the other end's
security gateway, and the SG's public key*.

(* Actually, knowledge of the key can be postponed slightly -- it's not
needed until the second exchange of the negotiations, while we can't even
start negotiations without knowing the IP address. The SG is not
necessarily on the plain-IP route to the destination, especially when
multiple SGs are present.)

Given instantaneous DNS lookups, we would:

+ Start with a reverse lookup to turn the address into a name.

+ Look for something like RFC-2782 SRV records using the name, to find out
who provides this particular service. If none comes back, we can abandon
the whole process.

+ Select one SRV record, which gives us the name of a target host (plus
possibly one or more addresses, if the name server has supplied address
records as Additional Data for the SRV records -- this is recommended
behavior but is not required).

+ Use the target name to look up a suitable KEY record, and also address
record(s) if they are still needed.

This gives us the desired address(es) and key. However, it requires three
lookups, and we don't even find out whether there's any point in trying
until after the second.

With real DNS lookups, which are far from instantaneous, some optimization
is needed. At the very least, typical cases should need fewer lookups.

So when we do the reverse lookup on the IP address, instead of asking for
PTR, we ask for TXT. If we get none, we abandon opportunistic
negotiation, and set up a bypass/block with a relatively long life (say
6hr) because it's not worth trying again soon. (Note, there needs to be a
way to manually force an early retry -- say, by just clearing out all
memory of a particular address -- to cover cases where a configuration
error is discovered and fixed.)

xxx need to discuss multi-string TXTs

In the results, we look for at least one TXT record with content
"X-IPsec-Server(nnn)=a.b.c.d kkk", following RFC 1464 attribute/value
notation. (The "X-" indicates that this is tentative and experimental;
this design will probably need modification after initial experiments.)
Again, if there is no such record, we abandon opportunistic negotiation.

"nnn" and the parentheses surrounding it are optional. If present, it
specifies a priority (low number, high priority), as for MX records, to
control the order in which multiple servers are tried. If there are no
priorities, or there are ties, pick one randomly.

"a.b.c.d" is the dotted-decimal IP address of the SG. (Suitable
extensions for IPv6, when the time comes, are straightforward.)

"kkk" is either an RSA-MD5 public key in base-64 notation, as in the text
form of an RFC 2535 KEY record, or "@hhh". In the latter case, hhh is a
DNS name, under which one Host/Authentication/IPSEC/RSA-MD5 KEY record is
present, giving the server's authentication key. (The delay of the extra
lookup is undesirable, but practical issues of key management may make it
advisable not to duplicate the key itself in DNS entries for many
clients.)
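
[Illustrative sketch, not part of the original document: the lookup and
parsing step in Python, assuming the modern dnspython package
(dns.resolver, dns.reversename), which did not exist in this form when
this was written. It also shows one way of handling the multi-string TXT
question flagged above: the strings are simply concatenated.]

    # Hypothetical sketch: reverse-map TXT lookup, extraction of
    # X-IPsec-Server records, low-number-first priority, random ties.
    import random
    import re
    import dns.resolver
    import dns.reversename

    RECORD = re.compile(
        r'X-IPsec-Server(?:\((\d+)\))?=(\d+\.\d+\.\d+\.\d+)\s+(\S+)')

    def discover_gateways(addr):
        name = dns.reversename.from_address(addr)  # d.c.b.a.in-addr.arpa.
        try:
            answers = dns.resolver.resolve(name, 'TXT')
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            return []                              # abandon opportunism
        found = []
        for rdata in answers:
            text = b''.join(rdata.strings).decode()  # join multi-strings
            match = RECORD.search(text)
            if match:
                priority = int(match.group(1) or 0)
                found.append((priority, match.group(2), match.group(3)))
        random.shuffle(found)               # break priority ties randomly,
        found.sort(key=lambda rec: rec[0])  # then low number first
        return found                        # [(nnn, "a.b.c.d", kkk), ...]
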
It unfortunately does appear that the authentication key has to be
associated with the server, not the client behind it. At the time when
the responder has to authenticate our SG, it does not know which of its
clients we are interested in (i.e., which key to use), and there is no
good way to tell it. (There are some bad ways; this decision may merit
re-examination after experimental use.)

The responder authenticates our SG by doing a reverse lookup on its IP
address to get a Host/Authentication/IPSEC/RSA-MD5 KEY record. He can
attempt this in parallel with the early parts of the negotiation (since he
knows our SG IP address from the first negotiation packet), at the risk of
having to abandon the attempt and do a different lookup if we use
something different as our ID (see below). Unfortunately, he doesn't yet
know what client we will claim to represent, so he'll need to do another
lookup as part of phase 2 negotiation (unless the client *is* our SG), to
confirm that the client has a TXT X-IPsec-Server record pointing to our
SG. (Checking that the record specifies the same key is not important,
since the responder already has a trustworthy key for our SG.)
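
[Illustrative sketch, not part of the original document: the responder's
phase 2 check, again assuming dnspython. As noted above, the key in the
record need not be re-checked, so only the server address matters.]

    # Hypothetical sketch: does the claimed client's reverse map carry
    # an X-IPsec-Server record pointing at the SG we are talking to?
    import dns.resolver
    import dns.reversename

    def client_vouches_for_sg(client_addr, sg_addr):
        name = dns.reversename.from_address(client_addr)
        try:
            answers = dns.resolver.resolve(name, 'TXT')
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            return False
        for rdata in answers:
            text = b''.join(rdata.strings).decode()
            if 'X-IPsec-Server' in text and '=%s ' % sg_addr in text:
                return True
        return False
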
Also unfortunately, opportunistic tunnels can only have degenerate subnets
(/32 subnets, containing one host) at their ends. It's superficially
attractive to negotiate broader connections... but without prearrangement,
you don't know whether you can trust the other end's claim to have a
specific subnet behind it. Fixing this would require a way to do a
reverse lookup on the *subnet* (you cannot trust information in DNS
records for a name or a single address, which may be controlled by people
who do not control the whole subnet) with both the address and the mask
included in the name. Except in the special case of a subnet masked on a
byte boundary (in which case RFC 1035's convention of an incomplete
in-addr.arpa name could be used), this would need extensions to the
reverse-map name space, which is awkward, especially in the presence of
RFC 2317 delegation. (IPv6 delegation is more flexible, and it might be
easier there.)

There is a question of what ID should be used in later steps of
negotiation. However, the desire not to put more DNS lookups in the
critical path suggests avoiding the extra complication of varied IDs,
except in the Road Warrior case (where an extra lookup is inevitable).
Also, figuring out what such IDs *mean* gets messy. To keep things
simple, except in the RW case, all IDs should be IP addresses identical
to those used in the packet headers.

For Road Warrior, the RW must be the initiator, since the home-base SG has
no idea what address the RW will appear at. Moreover, in general the RW
does not control the DNS entries for his address. This inherently denies
the home base any authentication of the RW's IP address; the most it can
do is to verify an identity he provides, and perhaps decide whether it
wishes to talk to someone with that identity, but this does not verify his
right to use that IP address -- nothing can, really.

(That may sound like it would permit some man-in-the-middle attacks, but
the RW can still do full authentication of the home base, so a man in the
middle cannot successfully impersonate home base. Furthermore, a man in
the middle must impersonate both sides for the DH exchange to work. So
either way, the IKE negotiation falls apart.)

A Road Warrior provides an FQDN ID, used for a forward lookup to obtain a
Host/Authentication/IPSEC/RSA-MD5 KEY record. (Note, an FQDN need not
actually correspond to a host -- e.g., the DNS data for it need not
include an A record.) This suffices, since the RW is the initiator and
the responder knows his address from his first packet.

Certain situations where a host has a more-or-less permanent IP address,
but does not control its DNS entries, must be treated essentially like
Road Warrior. It is unfortunate that DNS's old inverse-query feature
cannot be used (nonrecursively) to ask the initiator's local DNS server
whether it has a name for the address, because the address will almost
always have been obtained from a DNS name lookup, and it might be a lookup
of a name whose DNS entries the host *does* control. (Real examples of
this exist: the host has a preferred name whose host-controlled entry
includes an A record, but a reverse lookup on the address sends you to an
ISP-controlled name whose entry has an A record but not much else.) Alas,
inverse query is long obsolete and is not widely implemented now.

There are some questions in failure cases. If we cannot acquire the info
needed to set up a tunnel, this is the no-tunnel-possible case. If we
reach an SG but negotiation fails, this too is the no-tunnel-possible
case, with a relatively long bypass/block lifespan (say 1hr) since
fruitless negotiations are expensive. (In the multiple-SG case, it seems
unlikely to be worthwhile to try other SGs just in case one of them might
have a configuration permitting successful negotiation.)

Finally, there is a sticky problem with timeouts. If the other SG is down
or otherwise inaccessible, in the worst case we won't hear about this
except by not getting responses. Some other, more pathological or even
evil, failure cases can have the same result. The problem is that in the
case where a bypass is permitted, we want to decide quickly whether a
tunnel is possible. It gets even worse if there are multiple SGs, in
which case conceivably we might want to try them all (since some SGs
being up when others are down is much more likely than SGs differing in
policy).

The patience setting needs to be configurable policy, with a reasonable
default (to be determined by experiment). If it expires, we simply have
to declare the attempt a failure, and set up a bypass/block. (Setting up
a tentative bypass/block, and replacing it with a real tunnel if remaining
attempts do produce one, looks attractive at first glance... but exposing
the first few seconds of a connection is often almost as bad as exposing
the whole thing!) Such a bypass/block should have a short lifespan, say
10min, because the SG(s) might be only temporarily unavailable.

The flip side of IKE waiting for a timeout is that all other forms of
feedback, e.g. "host not reachable", should be *ignored*, because you
cannot trust them! This may need kernel changes.
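
[Illustrative sketch, not part of the original document: the patience
policy as a loop over discovered SGs. The 15s default is our assumption;
the text leaves the actual default to experiment.]

    # Hypothetical sketch: try each SG in priority order, ignore
    # untrustworthy ICMP-style feedback, and install a short-lived
    # bypass/block when patience runs out.
    import time

    PATIENCE = 15               # seconds; configurable policy
    SG_DOWN_LIFESPAN = 600      # 10min: SG(s) may be back soon
    FAILED_LIFESPAN = 3600      # 1hr: fruitless negotiation is expensive

    def attempt(gateways, negotiate, install_bypass_or_block):
        deadline = time.time() + PATIENCE
        for sg in gateways:     # some up, some down is the likely case
            remaining = deadline - time.time()
            if remaining <= 0:
                install_bypass_or_block(lifespan=SG_DOWN_LIFESPAN)
                return "patience exhausted"
            # negotiate() must ignore "host unreachable" and the like;
            # only a genuine reply or the timeout counts.
            if negotiate(sg, timeout=remaining):
                return "tunnel up"
        install_bypass_or_block(lifespan=FAILED_LIFESPAN)
        return "no tunnel possible"
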
Can AWP be done by non-opportunistic SGs? Probably not; existing SG
implementations generally aren't prepared to do anything suitable, except
perhaps via the messy business of certificates. There is one borderline
exception: some implementations rely on LDAP for at least some of their
information fetching, and it might be possible to substitute a custom LDAP
server which does the right things for them. Feasibility of this depends
on details, which we don't know well enough.

[This could do with a full example, a complete packet-by-packet
walkthrough including all DNS and IKE traffic.]



MFP

Our current conn database simply isn't flexible enough to cover all this
properly. In particular, the responding Pluto needs a way to figure out
whether the connection it is being asked to make is legitimate.

This is more subtle than it sounds, given the problem noted earlier, that
there's no clear way to authenticate claims to represent a non-degenerate
subnet. Our database has to be able to say "a connection to any host in
this subnet is okay" or "a connection to any subnet within this subnet is
okay", rather than "a connection to exactly this subnet is okay". (There
is some analogy to the Road Warrior case here, which may be relevant.)
This will require at least a re-interpretation of ipsec.conf.

Interim stages of implementation of this will require a bit of thought.
Notably, we need some way of dealing with the lack of fully signed DNSSEC
records. Without user interaction, probably the best we can do is to
remember the results of old fetches, compare them to the results of new
fetches, and complain and disbelieve all of it if there's a mismatch.
This does mean that somebody who gets fake data into our very first fetch
will fool us, at least for a while, but that seems an acceptable tradeoff.
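
[Illustrative sketch, not part of the original document: the looser
matching described above, using Python's ipaddress module. The sample
subnets are placeholders.]

    # Hypothetical sketch: a policy entry authorizes any host or
    # sub-subnet nested within a stated subnet, not just that subnet.
    from ipaddress import ip_network

    PERMITTED = [ip_network('10.1.0.0/16'), ip_network('192.0.2.0/24')]

    def connection_ok(proposed_client):
        proposed = ip_network(proposed_client)
        return any(proposed.subnet_of(allowed) for allowed in PERMITTED)

    # connection_ok('10.1.2.3/32') -> True  (host within 10.1.0.0/16)
    # connection_ok('10.0.0.0/8')  -> False (broader than any entry)
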

Negotiation Issues

There are various options which are nominally open to negotiation as part
of setup, but which have to be nailed down at least well enough that
opportunistic SGs can reliably interoperate. Somewhat arbitrarily and
tentatively, opportunistic SGs must support Main Mode, Oakley group 5 for
D-H, 3DES encryption and MD5 authentication for both ISAKMP and IPsec SAs,
RSA digital-signature authentication with keys between 2048 and 8192 bits,
and ESP doing both encryption and authentication. They must do key PFS
in Quick Mode, but not identity PFS.



What we need from DNS

Fortunately, we don't need any new record types or suchlike to make this
all work. We do, however, need attention to a couple of areas in DNS
implementation.

First, size limits. Although the information we directly need from a
lookup is not enormous -- the only potentially-big item is the KEY record,
and there should be only one of those -- there is still a problem with
DNSSEC authentication signatures. With a 2048-bit key and assorted
supporting information, we will fill most of a 512-byte DNS UDP packet...
and if the data is to have DNSSEC authentication, at least one quite large
SIG record will come too. Plus maybe a TSIG signature on the whole
response, to authenticate it to our resolver. So: DNSSEC-capable name
servers must fix the 512-byte UDP limit. We're told there are provisions
for this; implementation of them is mandatory.

Second, interface. It is unclear how the resolver interface will let us
ask for DNSSEC authentication. We would prefer to ask for "authentication
where possible", and get back the data with each item flagged by whether
authentication was available (and successful!) or not available. Having
to ask separately for authenticated and non-authenticated data would
probably be acceptable, *provided* both will be cached on the first
request, so the two requests incur only one set of (non-local) network
traffic. Either way, we want to see the name server and resolver do this
for us; that makes sense in any case, since it's important that
verification be done somewhere where it can be cached, the more centrally
the better.

Finally, a wistful note: the ability to do a limited form of inverse
queries (an almost forgotten feature), to ask the local name server which
hostname it recently mapped to a particular address, would be quite
helpful. Note, this is *NOT* the same as a reverse lookup, and crude
fakes like putting a dotted-decimal address in brackets do not suffice.