From aa0f5b38aec14428b4b80e06f90ff781f8bca5f1 Mon Sep 17 00:00:00 2001 From: Rene Mayrhofer Date: Mon, 22 May 2006 05:12:18 +0000 Subject: Import initial strongswan 2.7.0 version into SVN. --- doc/src/trouble.html | 840 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 840 insertions(+) create mode 100644 doc/src/trouble.html (limited to 'doc/src/trouble.html') diff --git a/doc/src/trouble.html b/doc/src/trouble.html new file mode 100644 index 000000000..604264c01 --- /dev/null +++ b/doc/src/trouble.html @@ -0,0 +1,840 @@ + + + FreeS/WAN troubleshooting + + + + + + +

Linux FreeS/WAN Troubleshooting Guide

+ +

Overview

+ +

+This document covers several general places where you might have a problem:

+
    +
  1. During install.
  2. +
  3. During the negotiation process.
  4. +
  5. Using an established connection.
  6. +
+

This document also contains notes which +expand on points made in these sections, and tips for +problem +reporting. If the other end of your connection is not FreeS/WAN, +you'll also want to read our +interoperation document.

+

1. During Install

+

1.1 RPM install gotchas

+

With the RPM method:

+ +

1.2 Problems installing from source

+

When installing from source, you may find these problems:

+ +

1.3 Install checks

+

ipsec verify checks a number +of FreeS/WAN essentials. Here are some hints on what to do when your +system doesn't check out:

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Problem | Status | Action
ipsec not on-path | | Add /usr/local/sbin to your PATH.
Missing KLIPS support | critical | See this FAQ.
No RSA private key | | Follow these instructions to create an RSA key pair for your host. RSA keys are required for opportunistic encryption, and are our preferred method to authenticate pre-configured connections.
pluto not running | critical | service ipsec start
No port 500 hole | critical | Open port 500 for IKE negotiation.
Port 500 check N/A | | Check that port 500 is open for IKE negotiation.
Failed DNS checks | | Opportunistic encryption requires information from DNS. To set this up, see our instructions.
No public IP address | | Check that the interface which you want to protect with IPSec is up and running.
+ + +
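The "ipsec not on-path" fix in the table above can be scripted. This is a minimal sketch, assuming a POSIX shell; the helper name on_path is hypothetical and not part of FreeS/WAN.

```shell
# on_path DIR: succeed if DIR is one of the entries in $PATH.
# (Hypothetical helper for illustration only.)
on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Append /usr/local/sbin only when it is missing.
on_path /usr/local/sbin || PATH="$PATH:/usr/local/sbin"
export PATH
```

To make the change permanent, the same two lines would go in your shell's startup file.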

1.4 Troubleshooting OE

+

OE should work with no local configuration, if you have posted +DNS TXT records according to the instructions in our +quickstart guide. +If you encounter trouble, try these hints. +We welcome additional hints via the +users' mailing list.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Symptom | Problem | Action
+You're running FreeS/WAN 2.01 (or later), +and initiating a connection to FreeS/WAN +2.00 (or earlier). +In your logs, you see a message like: +
no RSA public key known for '192.0.2.13';
+DNS search for KEY failed (no KEY record
+for 13.2.0.192.in-addr.arpa.)
+The older FreeS/WAN logs no error. +
+ +A protocol level incompatibility between 2.01 (or later) and +2.00 (or earlier) causes this error. It occurs when a FreeS/WAN 2.01 +(or later) box for which no KEY record is posted attempts to initiate an OE +connection to older FreeS/WAN versions (2.00 and earlier). +Note that older versions can initiate to newer versions without this error. +If you control the peer host, upgrade its FreeS/WAN to 2.01 (or later), and +post new style TXT records for it. If not, but if you know its sysadmin, +perhaps a quick note is in order. If neither option is possible, you can +ease the transition by posting an old style KEY record (created with a +command like "ipsec showhostkey --key") to the reverse map for +the FreeS/WAN 2.01 (or later) box.
OE host is very slow to contact other hosts. | Slow DNS service while running OE. | It's a good idea to run a caching DNS server on your OE host, +as outlined in this +mailing list message. If your DNS servers are elsewhere, +put their IPs +in the clear policy group, and +re-read groups with
ipsec auto --rereadgroups
+
+
Can't Opportunistically initiate for
+192.0.2.2 to 192.0.2.3: no TXT record
+for 13.2.0.192.in-addr.arpa.
+
Peer is not set up for OE.

None. Plenty of hosts on the Internet +do not run OE. If, however, you have set OE up on that peer, this may +indicate that you need to wait up to 48 hours +for its DNS records to propagate.

ipsec verify does not find DNS records: +
...
+Looking for TXT in forward map:
+                xy.example.com...[FAILED]
+Looking for TXT in reverse map...[FAILED]
+...
+ +You also experience authentication failure:
+
Possible authentication failure:
+no acceptable response to our
+first encrypted message
+
DNS records are not posted or have not propagated. | Did you post the DNS records necessary for OE? If not, +do so using the instructions in our +quickstart guide. +If so, wait up to 48 hours for the DNS records to propagate.
ipsec verify does not find DNS records, and you experience +authentication failure. | For iOE, your ID +does not match location of +forward DNS record. | In config setup, change +myid= to match the forward DNS where you posted the record. +Restart FreeS/WAN. + For reference, see our +iOE instructions.
ipsec verify finds DNS records, yet there is +still authentication failure. ( ? ) | DNS records are malformed. | Re-create the records and send new copies to your DNS administrator.
ipsec verify finds DNS records, yet there is +still authentication failure. ( ? ) | DNS records show different keys for a gateway vs. its subnet hosts. | All TXT records for boxes protected by an OE gateway must contain the +gateway's public key. Re-create and re-post any incorrect records using +these instructions.
OE gateway loses connectivity to its subnet. The gateway's +routing table shows routes to the subnet through IPsec interfaces. | The subnet is part of the private or block +policy group on the gateway. | Remove the subnet from the group, and reread +groups with
ipsec auto --rereadgroups
OE does not work to hosts on the local LAN. | This is a known issue. | See this list of known issues +with OE. +
FreeS/WAN does not seem to be executing your default policy. In your +logs, you see a message like: +
/etc/ipsec.d/policies/iprivate-or-clear"
+line 14: subnet "0.0.0.0/0",
+source 192.0.2.13/32,
+already "private-or-clear"
+
Fullnet in a policy group file defines +your default policy. Fullnet should normally be present in only one policy +group file. The fine print: you can have two default policies defined so long +as they protect different local endpoints (e.g. the FreeS/WAN gateway and a +subnet). +Find all policies which contain fullnet with:
+
grep -F 0.0.0.0/0 /etc/ipsec.d/policies/*
+then remove the unwanted occurrence(s). +
+ + +
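The fullnet search in the last row above can be wrapped in a small check. This is a sketch only: the function names are hypothetical, and the directory defaults to the /etc/ipsec.d/policies path used in the grep command above.

```shell
# fullnet_files [DIR]: list policy group files that mention fullnet.
# DIR defaults to the policy directory named in the guide.
fullnet_files() {
    grep -l -F '0.0.0.0/0' "${1:-/etc/ipsec.d/policies}"/* 2>/dev/null
}

# check_fullnet [DIR]: report how many files contain fullnet and
# fail when there is more than one, since that usually means two
# default policies are competing.
check_fullnet() {
    count=$(fullnet_files "$1" | wc -l | tr -d ' ')
    if [ "$count" -gt 1 ]; then
        echo "fullnet appears in $count policy group files:"
        fullnet_files "$1"
        return 1
    fi
    echo "fullnet appears in $count policy group file(s)"
}
```

A count of two is not always wrong: as noted above, two default policies are acceptable when they protect different local endpoints, so treat the failure as a prompt to look, not as proof of a mistake.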

2. During Negotiation

+

When you fail to bring up a tunnel, you'll need to find out:

+ +

before you can +diagnose your problem.

+

2.1 Determine Connection State

+

Finding current state

+

You can see connection states (STATE_MAIN_I1 and so on) when you +bring up a connection on the command line. If you have missed this, +or brought up your connection automatically, use: +

+
ipsec auto --status
+

The most relevant state is the last one reached.

+

What's this supposed to look like?

+

Negotiations should proceed through various states, in the processes of:

+
    +
  1. IKE negotiations (aka Phase 1, Main Mode, STATE_MAIN_*)
  2. +
  3. IPSEC negotiations (aka Phase 2, Quick Mode, STATE_QUICK_*)
  4. +
+

These are done and a connection is established when you see messages like:

+
    000 #21: "myconn" STATE_MAIN_I4 (ISAKMP SA established)...
+    000 #2: "myconn" STATE_QUICK_I2 (sent QI2, IPsec SA established)...

+The key phrases to look for are "ISAKMP SA established" and "IPSec +SA established", with the relevant connection name. Often, this happens +at STATE_MAIN_I4 and STATE_QUICK_I2, respectively.

+

ipsec auto --status will tell you what states have +been achieved, rather than the current state. Since +determining the current state is rather more difficult to do, current +state information is not available from Linux FreeS/WAN. If you are +actively bringing a connection up, the status report's last states +for that connection likely reflect its current state. Beware, though, +of the case where a connection was correctly brought up but is now +downed: Linux FreeS/WAN will not notice this until it attempts to +rekey. Meanwhile, the last known state indicates that the connection +has been established.
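The check for the two key phrases can be automated. The following is a rough sketch, not part of FreeS/WAN: the function name sa_established is hypothetical, and it assumes status lines shaped like the two sample lines shown earlier (connection name in double quotes, followed by the state text).

```shell
# sa_established CONN: read "ipsec auto --status" output on stdin
# and report whether both phases were reached for connection CONN.
# Matches the quoted connection name and the two key phrases.
sa_established() {
    awk -v conn="\"$1\"" '
        index($0, conn) && /ISAKMP SA established/      { p1 = 1 }
        index($0, conn) && /IP[Ss]ec SA established/    { p2 = 1 }
        END {
            if (p1 && p2) { print "both SAs established"; exit 0 }
            if (!p1)      { print "phase 1 not established"; exit 1 }
            print "phase 2 not established"; exit 1
        }'
}

# Typical use:
#   ipsec auto --status | sa_established myconn
```

The caveat above still applies: this reports states that have been achieved, so a connection that was established and later went down may still be reported as up until rekeying is attempted.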

+

If your connection is stuck at STATE_MAIN_I1, skip straight to +here. + +

2.2 Finding error text

+

Solving most errors will require you to find verbose error text, +either on the command line or in the logs.

+

Verbose start for more information

+

+Note that you can get more detail from ipsec auto using +the --verbose flag:

+
    ipsec auto --verbose --up west-east

+More complete information can be gleaned from the log +files.

+ +

Debug levels count

+

The amount of description you'll get here depends on ipsec.conf debug +settings, klipsdebug= and plutodebug=. +When troubleshooting, set at least one of these to all, and +when done, reset it to none so your logs don't fill up. +Note that you must have enabled the klipsdebug +compile-time option for the +klipsdebug configuration switch to work.

+

For negotiation problems plutodebug is most relevant. +klipsdebug applies mainly to attempts to use an +already-established connection. See also this +description of the division of duties within Linux FreeS/WAN.

+

After raising your debug levels, restart Linux FreeS/WAN to ensure +that ipsec.conf is reread, then recreate the error to generate +verbose logs. +

+

ipsec barf for lots of debugging information

+

+ipsec barf (8) +collects a bunch of useful debugging information, including these logs. +Use the command

+
+    ipsec barf > barf.west
+
+

to generate one.

+

Find the error

+

Search out the failure point in your logs. + Are there a handful of lines which succinctly describe how +things are going wrong or contrary to your expectation? Sometimes the +failure point is not immediately obvious: Linux FreeS/WAN's errors +are usually not marked "Error". Have a look in the +FAQ +for what some common failures look like.

+

Tip: problems snowball. +Focus your efforts on the first problem, which is likely to be the +cause of later errors.

+

Play both sides

+

Also find error text on the peer IPSec box. +This gives you two perspectives on the same failure.

+

At times you will require information which only one side has. +The peer can merely indicate the presence of an error, and its +approximate point in the negotiations. If one side keeps retrying, +it may be because there is a show stopper on the other side. +Have a look at the other side and figure out what it doesn't like.

+

If the other end is not Linux FreeS/WAN, the principle is the +same: replicate the error with its most verbose logging on, and +capture the output to a file.

+

2.3 Interpreting a Negotiation Error

+

Connection stuck at STATE_MAIN_I1

+

This error commonly happens because IKE (port 500) packets, needed +to negotiate an IPSec connection, cannot travel freely between your IPSec +gateways. See our firewall document +for details.

+

Other errors

+

Other errors require a bit more digging. Use the following resources:

+ +

If you have failed to solve your problem with the help of these +resources, send a detailed problem report to the users list, +following these guidelines.

+

3. Using a Connection

+

3.1 Orienting yourself

+

How do I know if it works?

+

Test your connection by sending packets through it. The simplest way +to do this is with ping, but the ping needs to test the correct +tunnel. See this example scenario if +you don't understand this.

+

If your ping returns, test any other connections you've brought +up. If they all check out, great. You may wish to test +with large packets for MTU problems.

+

ipsec barf is useful again

+

If your ping fails to return, generate an ipsec barf debugging +report on each IPSec gateway. On a non-Linux FreeS/WAN +implementation, gather equivalent information. Use this, and the tips +in the next sections, to troubleshoot. Are you sure that both +endpoints are capable of hearing and responding to ping?

+

3.2 Those pesky configuration errors

+

IPSec may be dropping your ping packets since they do not belong in the +tunnels you have constructed:

+ +

In either case, you will often see a message like:

+
klipsdebug... no eroute
+

which we discuss in this +FAQ.

+

Note:

+ +

3.3 Check Routing and Firewalling

+

If you've confirmed your configuration assumptions, the problem is +almost certainly with routing or firewalling. Isolate the problem +using interface statistics, firewall statistics, or a packet sniffer.

+

Background:

+ +

View Interface and Firewall +Statistics

+

Interface reports and firewall statistics can help you track down +lost packets at a glance. Check any firewall statistics you may be keeping +on your IPSec gateways, for dropped packets.

+ +

Tip: You can take a snapshot of the packets processed +by your firewall with:

+ +
    iptables -L -n -v
+ +

You can get creative with "diff" to find out what happens to a +particular packet during transmission.
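One way to use diff here is to snapshot the counters, send the test traffic, snapshot again, and compare. This is a minimal sketch under stated assumptions: the function name snapshot_diff is hypothetical, and the address in the usage comment is only a placeholder from the documentation range.

```shell
# snapshot_diff SNAPSHOT_CMD ACTION_CMD
# Run SNAPSHOT_CMD, then ACTION_CMD, then SNAPSHOT_CMD again, and
# print the diff of the two snapshots. Returns 0 when nothing
# changed and 1 when the snapshots differ (like diff itself).
snapshot_diff() {
    snapshot_cmd=$1
    action_cmd=$2
    before=$(mktemp) || return 2
    after=$(mktemp) || { rm -f "$before"; return 2; }

    sh -c "$snapshot_cmd" > "$before"
    # A failed ping is still worth diffing, so ignore its status.
    sh -c "$action_cmd" > /dev/null 2>&1 || true
    sh -c "$snapshot_cmd" > "$after"

    if diff "$before" "$after"; then rc=0; else rc=1; fi
    rm -f "$before" "$after"
    return "$rc"
}

# Typical use (as root; 192.0.2.3 is a placeholder target):
#   snapshot_diff 'iptables -L -n -v' 'ping -c 5 192.0.2.3'
```

Rules whose packet counters moved during the ping show up as changed lines, which narrows down which rule handled (or dropped) the test packets. Other traffic on a busy gateway will also change counters, so this works best on a quiet system.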

+ +

Both cat /proc/net/dev and ifconfig display +interface statistics, and both are included in ipsec barf. Use +either to check if any interface has dropped packets. If you find +that one has, test whether this is related to your ping. While you +ping continuously, print that interface's statistics several times. +Does its drop count increase in proportion to the ping? If so, check +why the packets are dropped there.
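Printing an interface's drop counters repeatedly is easier with a small helper. The sketch below is illustrative only: the function name iface_drops is hypothetical, and it assumes the usual /proc/net/dev column layout in which the receive drop count is the fourth counter after the interface name and the transmit drop count is the twelfth. Check the header lines of the file on your own kernel before relying on it.

```shell
# iface_drops IFACE [FILE]
# Print "rx_drop tx_drop" for IFACE, reading FILE (default
# /proc/net/dev). Fails if the interface is not listed.
iface_drops() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }      # "eth0:123" -> "eth0 123"
        $1 == ifc { print $5, $13; found = 1 }
        END { exit !found }
    ' "${2:-/proc/net/dev}"
}

# Typical use while a continuous ping runs in another terminal:
#   iface_drops ipsec0; sleep 5; iface_drops ipsec0
```

If the two readings differ by roughly the number of pings sent in between, the drops are probably related to your test traffic.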

+ +

To do this, look at the firewall rules that apply to that interface. If the +interface is an IPSec interface, more information may be available in +the log. Grep for the word "drop" in a log which was +created with klipsdebug=all as the error happened.

+

See also this discussion on interpreting +ifconfig statistics.

+

3.4 When in doubt, sniff it out

+

If you have checked configuration assumptions, routing, and +firewall rules, and your interface statistics yield no clue, it +remains for you to investigate the mystery of the lost packet by the +most thorough method: with a packet sniffer (providing, of course, +that this is legal where you are working). +

In order to detect packets on the ipsec virtual interfaces, +you will need an up-to-date sniffer (tcpdump, ethereal, ksnuffle) on +your IPSec gateway machines. You may also find it useful to sniff the ping +endpoints.

+

Anticipate your packets' path

+

Ping, and examine each interface along the projected path, checking for your +ping's arrival. If it doesn't get to the next stop, you have narrowed +down where to look for it. In this way, you can isolate a problem area, +and narrow your troubleshooting focus.

+

Within a machine running Linux FreeS/WAN, this +packet flow diagram will help you +anticipate a packet's path. +

Note that:

+ +

Once you isolate where the packet is lost, take a closer look at +firewall rules, routing and configuration assumptions as they affect +that specific area. If the packet is lost on an IPSec gateway, comb +through klipsdebug output for anomalies. +

+

If the packet goes through both gateways successfully and reaches +the ping target, but does not return, suspect routing. Check that the +ping target routes packets back to the IPSec gateway.

+

3.5 Check your logs

+

Here, too, log information can be useful. Start with the +guidelines above.

+

For connection use problems, set klipsdebug=all. Note +that you must have enabled the klipsdebug +compile-time option to do this. +Restart Linux FreeS/WAN so that it rereads ipsec.conf, +then recreate the error condition. When searching through +klipsdebug data, look especially for the keywords +"drop" (as in dropped packets) and "error".
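The keyword search described above can be done with grep. This is a small sketch, with a hypothetical function name and no assumption about where your system writes its logs: pass the log file path explicitly.

```shell
# klips_suspects LOGFILE [MAX]
# Print the first MAX (default 20) lines of LOGFILE that mention
# "drop" or "error", case-insensitively, prefixed with their line
# numbers so they can be found again in context.
klips_suspects() {
    grep -n -i -E 'drop|error' "$1" | head -n "${2:-20}"
}

# Typical use (the path is only an example; use your own log file):
#   klips_suspects /var/log/messages 5
```

Showing only the first few matches fits the earlier tip that problems snowball: the earliest hit is usually the one worth reading first.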

+

Often the problem with connection use is not software error, but +rather that the software is behaving contrary to expectation. +

+

Interpreting log text

+

To interpret the Linux FreeS/WAN log text you've found, use the +same resources as indicated for troubleshooting +connection negotiation: +the FAQ, our +background document, and the +list archives. +Looking in the KLIPS code is only for the very brave.

+

If you are still stuck, send a detailed +problem report to the users' list.

+

3.6 More testing for the truly thorough

+

Large Packets

+

If each of your connections passed the ping test, you may wish to +test by pinging with large packets (2000 bytes or larger). If it does +not return, suspect MTU issues, and see this discussion.

+

Stress Tests

+

In most users' view, a simple ping test, and perhaps a +large-packet ping test suffice to indicate a working IPSec +connection.

+

Some people might like to do additional stress tests prior to +production use. They may be interested in this testing +protocol we use at interoperation conferences, aka "bakeoffs". +We also have a testing directory that ships with the +release.

+

4. Problem Reporting

+

4.1 How to ask for help

+

Ask for troubleshooting help on the users' mailing list, +users@lists.freeswan.org. +While sometimes an initial query with a quick description of your +intent and error will twig someone's memory of a similar problem, +it's often necessary to send a second mail with a complete problem +report. +

+ + +

When reporting problems to the mailing list(s), please include: +

+ + +

Here are some good general guidelines on bug reporting: +How To Ask Questions +The Smart Way and How to Report +Bugs Effectively.

+ + +

4.2 Where to ask

+

To report a problem, send mail about it to the users' list. If you +are certain that you have found a bug, report it to the bugs list. If +you encounter a problem while doing your own coding on the Linux +FreeS/WAN codebase and think it is of interest to the design team, +notify the design list. When in doubt, default to the users' list. +More information about the mailing lists is found here.

+

For a number of reasons -- including export-control regulations +affecting almost any private discussion of +encryption software -- we prefer that problem reports and discussions +go to the lists, not directly to the team. Beware that the list goes +worldwide; US citizens, read this important information about your +export laws. If you're using this +software, you really should be on the lists. To get onto them, visit +lists.freeswan.org.

+

If you do send private mail to our coders or want a private reply +from them, please make sure that the return address on your mail +(From or Reply-To header) is a valid one. They have more important +things to do than to unravel addresses that have been mangled in an +attempt to confuse spammers. +

+

5. Additional Notes on Troubleshooting

+

The following sections supplement the Guide: information +available on your system; testing between +security gateways; ifconfig reports for +KLIPS debugging; using GDB on Pluto.

+

5.1 Information available on your +system

+

Logs used

+

Linux FreeS/WAN logs to:

+ +

Check both places to get full information. If you find nothing, +check your syslogd.conf(5) to see where your +/etc/syslog.conf or equivalent is directing authpriv +messages.

+

man pages provided

+
+
ipsec.conf(5) +
+ Manual page for IPSEC configuration file. +
+ ipsec(8) +
+ Primary man page for ipsec utilities. +
+

+Other man pages are on this list and in

+ +

Status information

+
+
ipsec auto --status +
+ Command to get status report from running system. Displays Pluto's + state. Includes the list of connections which are currently "added" + to Pluto's internal database; lists state objects reflecting ISAKMP + and IPsec SAs being negotiated or installed. +
+ ipsec look +
+ Brief status info. +
+ ipsec barf +
+ Copious debugging info. +
+

+5.2 Testing between security gateways

+

Sometimes you need to test a subnet-subnet tunnel. This is a +tunnel between two security gateways, which protects traffic on +behalf of the subnets behind these gateways. On this network:

+
     Sunset==========West------------------East=========Sunrise
+                     IPSec gateway         IPSec gateway
+           local net       untrusted net       local net

+you might name this tunnel sunset-sunrise. You can test this tunnel +by having a machine behind one gateway ping a machine behind the +other gateway, but this is not always convenient or even possible.

+

Simply pinging one gateway from the other is not useful. Such a +ping does not normally go through the tunnel. The tunnel +handles traffic between the two protected subnets, not between the +gateways. Depending on the routing in place, a ping might

+ +

Neither event tells you anything about the tunnel. +You can explicitly create an eroute to force such packets through the +tunnel, or you can create additional tunnels as described in our +configuration document, but +those may be unnecessary complications in your situation.

+

The trick is to explicitly test between both gateways' +private-side IP addresses. Since the private-side interfaces +are on the protected subnets, the resulting packets do go via the +tunnel. Use either ping -I or traceroute -i, both of which allow you +to specify a source interface. (Note: unsupported on older Linuxes). +The same principles apply for a road warrior (or other) case where +only one end of your tunnel is a subnet.

+

5.3 ifconfig reports for KLIPS debugging

+

When diagnosing problems using ifconfig statistics, you may wonder +what type of activity increments a particular counter for an ipsecN +device. Here's an index, posted by KLIPS developer Richard Guy +Briggs:

+
Here is a catalogue of the types of errors that can occur for which
+statistics are kept when transmitting and receiving packets via klips.
+I notice that they are not necessarily logged in the right counter.
+. . .
+
+Sources of ifconfig statistics for ipsec devices
+
+rx-errors:
+- packet handed to ipsec_rcv that is not an ipsec packet.
+- ipsec packet with payload length not modulo 4.
+- ipsec packet with bad authenticator length.
+- incoming packet with no SA.
+- replayed packet.
+- incoming authentication failed.
+- got esp packet with length not modulo 8.
+
+tx_dropped:
+- cannot process ip_options.
+- packet ttl expired.
+- packet with no eroute.
+- eroute with no SA.
+- cannot allocate sk_buff.
+- cannot allocate kernel memory.
+- sk_buff internal error.
+
+
+The standard counters are:
+
+struct enet_statistics
+{
+        int        rx_packets;                /* total packets received */
+        int        tx_packets;                /* total packets transmitted */
+        int        rx_errors;                /* bad packets received */
+        int        tx_errors;                /* packet transmit problems */
+        int        rx_dropped;                /* no space in linux buffers */
+        int        tx_dropped;                /* no space available in linux */
+        int        multicast;                /* multicast packets received */
+        int        collisions;
+
+        /* detailed rx_errors: */
+        int        rx_length_errors;
+        int        rx_over_errors;                /* receiver ring buff overflow */
+        int        rx_crc_errors;                /* recved pkt with crc error */
+        int        rx_frame_errors;        /* recv'd frame alignment error */
+        int        rx_fifo_errors;                /* recv'r fifo overrun */
+        int        rx_missed_errors;        /* receiver missed packet */
+
+        /* detailed tx_errors */
+        int        tx_aborted_errors;
+        int        tx_carrier_errors;
+        int        tx_fifo_errors;
+        int        tx_heartbeat_errors;
+        int        tx_window_errors;
+};
+
+of which I think only the first 6 are useful.

+5.4 Using GDB on Pluto

+

You may need to use the GNU debugger, gdb(1), on Pluto. This +should be necessary only in unusual cases, for example if you +encounter a problem which the Pluto developer cannot readily +reproduce or if you are modifying Pluto. +

+

Here are the Pluto developer's suggestions for doing this: +

+
Can you get a core dump and use gdb to find out what Pluto was doing
+when it died?
+
+To get a core dump, you will have to set dumpdir to point to a
+suitable directory (see ipsec.conf(5)).
+
+To get gdb to tell you interesting stuff:
+        $ script
+        $ cd dump-directory-you-chose
+        $ gdb /usr/local/lib/ipsec/pluto core
+        (gdb) where
+        (gdb) quit
+        $ exit
+
+The resulting output will have been captured by the script command in
+a file called "typescript".  Send it to the list.
+
+Do not delete the core file.  I may need to ask you to print out some
+more relevant stuff.

+Note that the dumpdir parameter takes effect only when the +IPsec subsystem is restarted -- reboot or ipsec setup restart.

+



+
