<HTML>
<HEAD>
	<TITLE>FreeS/WAN troubleshooting</TITLE>
      <meta name="keywords" content="Linux, IPSEC, VPN, security, FreeSWAN, troubleshooting, debugging">
<!--
     Written by Claudia Schmeing for the Linux FreeS/WAN project
     Freely distributable under the GNU General Public License

     More information at www.freeswan.org
     Feedback to users@lists.freeswan.org

CVS information:
RCS ID:          $Id: trouble.html,v 1.1 2004/03/15 20:35:24 as Exp $
Last changed:    $Date: 2004/03/15 20:35:24 $
Revision number: $Revision: 1.1 $

CVS revision numbers do not correspond to FreeS/WAN release numbers.
-->
 
</HEAD>
<BODY>

<H1><A NAME="trouble"></A>Linux FreeS/WAN Troubleshooting Guide</H1>

<H2><A NAME="overview"></A>Overview</H2>

<P>
This document covers the three stages at which you might have a problem:</P>
<OL>
	<LI><A HREF="#install">During install</A>.</LI>
	<LI><A HREF="#negotiation">During the negotiation process</A>.</LI>
	<LI><A HREF="#use">Using an established connection</A>.</LI>
</OL>
<P>This document also contains <A HREF="#notes">notes</A> which
expand on points made in these sections, and tips for 
<A HREF="#prob.report">problem
reporting</A>. If the other end of your connection is not FreeS/WAN,
you'll also want to read our 
<A HREF="interop.html#interop.problem">interoperation</A> document.</P>
<H2><A NAME="install"></A>1. During Install</H2>
<H3>1.1 RPM install gotchas</H3>
<P>With the RPM method:</P>
<UL>
<LI>Be sure you have installed both the userland tools and the kernel
  components. One will not work without the other. For example, when using 
  FreeS/WAN-produced RPMs for our 2.04 release, you need both:
<PRE>    freeswan-userland-2.04_2.4.20_20.9-0.i386.rpm
    freeswan-module-2.04_2.4.20_20.9-0.i386.rpm
</PRE>
</LI>
</UL>
<H3>1.2 Problems installing from source</H3>
<P>When installing from source, you may find these problems:</P>
<UL>
	<LI>Missing library. See <A HREF="faq.html#gmp.h_missing">this</A>
	FAQ.</LI>
	<LI>Missing utilities required for compile. See this 
        <A HREF="install.html#tool.lib">checklist</A>.</LI>
	<LI>Kernel version incompatibility. See <A HREF="faq.html#k.versions">this</A>
	FAQ.</LI>
	<LI>Another compile problem. Find information in the out.* files,
	e.g. out.kpatch and out.kbuild, created at compile time in the top-level
	Linux FreeS/WAN directory. Error messages generated by KLIPS during
	the boot sequence are accessible with the <VAR>dmesg</VAR> command. 
	<BR>
	Check the list archives and the List in Brief to see if this is a
	known issue. If it is not, report it to the bugs list as described
	in our <A HREF="#prob.report">problem reporting</A> section. In some
	cases, you may be asked to provide debugging information using gdb;
	details <A HREF="#gdb">below</A>.</LI>
	<LI>If your kernel compiles but you fail to install your new
	FreeS/WAN-enabled kernel, review the sections on <A HREF="install.html#newk">installing
	the patched kernel</A>, and <A HREF="install.html#testinstall">testing</A>
	to see if install succeeded.</LI>
</UL>
<H3><A NAME="install.check"></A>1.3 Install checks</H3>
<P><VAR>ipsec verify</VAR> checks a number 
of FreeS/WAN essentials. Here are some hints on what to do when your
system doesn't check out:</P>
<P>
<TABLE border=1>
<TR>
<TD><STRONG>Problem</STRONG></TD>
<TD><STRONG>Status</STRONG></TD>
<TD><STRONG>Action</STRONG></TD>
</TR>
<TR>
<TD><VAR>ipsec</VAR> not on-path</TD>
<TD>&nbsp;</TD>
<TD><P>Add <VAR>/usr/local/sbin</VAR> to your PATH.</P></TD>
</TR>
<TR>
<TD>Missing KLIPS support</TD>
<TD><FONT COLOR="#FF0000">critical</FONT></TD>
<TD>See <A HREF="faq.html#noKLIPS">this FAQ.</A></TD>
</TR>
<TR>
<TD>No RSA private key</TD>
<TD>&nbsp;</TD>
<TD>
<P>Follow <A HREF="install.html#genrsakey">these 
instructions</A> to create an RSA key pair for your host. RSA keys are:</P>
<UL>
<LI>required for opportunistic encryption, and</LI>
<LI>our preferred method to authenticate pre-configured connections.</LI>
</UL>
</TD>
</TR>
<TR>
<TD><VAR>pluto</VAR> not running</TD>
<TD><FONT COLOR="#FF0000">critical</FONT></TD>
<TD><PRE>service ipsec start</PRE></TD>
</TR>
<TR>
<TD>No port 500 hole</TD>
<TD><FONT COLOR="#FF0000">critical</FONT></TD>
<TD>Open port 500 for IKE negotiation.</TD>
</TR>
<TR>
<TD>Port 500 check N/A</TD>
<TD>&nbsp;</TD>
<TD>Check that port 500 is open for IKE negotiation.</TD>
</TR>
<TR>
<TD>Failed DNS checks</TD>
<TD>&nbsp;</TD>
<TD>Opportunistic encryption requires information from DNS.
To set this up, see <A HREF="quickstart.html#opp.setup">our instructions</A>.
</TD>
</TR>
<TR>
<TD>No public IP address</TD>
<TD>&nbsp;</TD>
<TD>Check that the interface which you want to protect with IPSec is up and 
running.</TD>
</TR>
</TABLE>


<H3><A NAME="oe.trouble"></A>1.4 Troubleshooting OE</H3>
<P>OE should work with no local configuration, if you have posted
DNS TXT records according to the instructions in our 
<A HREF="quickstart.html">quickstart guide</A>.
If you encounter trouble, try these hints. 
We welcome additional hints via the
<A HREF="mail.html">users' mailing list</A>.</P>
                                                                                
<TABLE border=1>
<TR>
<TD><STRONG>Symptom</STRONG></TD>
<TD><STRONG>Problem</STRONG></TD>
<TD><STRONG>Action</STRONG></TD>
</TR>
<TR>
<TD>
You're running FreeS/WAN 2.01 (or later),
and initiating a connection to FreeS/WAN
2.00 (or earlier).
In your logs, you see a message like:
<pre>no RSA public key known for '192.0.2.13';
DNS search for KEY failed (no KEY record
for 13.2.0.192.in-addr.arpa.)</pre>
The older FreeS/WAN logs no error.
</TD>
<TD>
<A NAME="oe.trouble.flagday"></A>
A protocol level incompatibility between 2.01 (or later) and
2.00 (or earlier) causes this error. It occurs when a FreeS/WAN 2.01
(or later) box for which no KEY record is posted attempts to initiate an OE
connection to older FreeS/WAN versions (2.00 and earlier).
Note that older versions can initiate to newer versions without this error.
</TD>
<TD>If you control the peer host, upgrade its FreeS/WAN to 2.01 (or later), and
post new style TXT records for it. If not, but if you know its sysadmin,
perhaps a quick note is in order. If neither option is possible, you can
ease the transition by posting an old style KEY record (created with a
command like "ipsec&nbsp;showhostkey&nbsp;--key") to the reverse map for
the FreeS/WAN 2.01 (or later) box.</TD>
</TR>
<TR>
<TD>OE host is very slow to contact other hosts.</TD>
<TD>Slow DNS service while running OE.</TD>
<TD>It's a good idea to run a caching DNS server on your OE host,
as outlined in <A HREF="http://lists.freeswan.org/pipermail/design/2003-January/004205.html">this
mailing list message</A>. If your DNS servers are elsewhere,
put their IPs
in the <VAR>clear</VAR> policy group, and
re-read groups with <PRE>ipsec auto --rereadgroups</PRE>
</TD>
</TR>
<TR>
<TD>
<PRE>Can't Opportunistically initiate for
192.0.2.2 to 192.0.2.13: no TXT record
for 13.2.0.192.in-addr.arpa.</PRE>
</TD>
<TD>Peer is not set up for OE.</TD>
<TD><P>None. Plenty of hosts on the Internet
do not run OE. If, however, you have set OE up on that peer, this may
indicate that you need to wait up to 48 hours
for its DNS records to propagate.</P></TD>
</TR>
<TR>
<TD><VAR>ipsec verify</VAR> does not find DNS records:
<PRE>...
Looking for TXT in forward map:
                xy.example.com...[FAILED]
Looking for TXT in reverse map...[FAILED]
...</PRE>
                                                                                
You also experience authentication failure:<BR>
<PRE>Possible authentication failure:
no acceptable response to our
first encrypted message</PRE>
</TD>
                                                                                
<TD>DNS records are not posted or have not propagated.</TD>
<TD>Did you post the DNS records necessary for OE? If not,
do so using the instructions in our
<A HREF="quickstart.html#quickstart">quickstart guide</A>.
If so, wait up to 48 hours for the DNS records to propagate.</TD>
</TR>
<TR>
<TD><VAR>ipsec verify</VAR> does not find DNS records, and you experience
authentication failure.</TD>
<TD>For iOE, your ID
does not match location of
forward DNS record.</TD>
<TD>In <VAR>config setup</VAR>, change
<VAR>myid=</VAR> to match the forward DNS where you posted the record.
Restart FreeS/WAN.
 For reference, see our
<A HREF="quickstart.html#opp.client">iOE instructions</A>.</TD>
</TR>
<TR>
<TD><VAR>ipsec verify</VAR> finds DNS records, yet there is
still authentication failure. ( ? )</TD>
<TD>DNS records are malformed.</TD>
<TD>Re-create the records and send new copies to your DNS administrator.</TD>
</TR>
<TR>
<TD><VAR>ipsec verify</VAR> finds DNS records, yet there is
still authentication failure. ( ? )</TD>
<TD>DNS records show different keys for a gateway vs. its subnet hosts.</TD>
<TD>All TXT records for boxes protected by an OE gateway must contain the
gateway's public key. Re-create and re-post any incorrect records using
<A HREF="quickstart.html#opp.incoming">these instructions</A>.</TD>
</TR>
<TR>
<TD>OE gateway loses connectivity to its subnet. The gateway's
routing table shows routes to the subnet through IPsec interfaces.</TD>
<TD>The subnet is part of the <VAR>private</VAR> or <VAR>block</VAR>
policy group on the gateway.</TD>
<TD>Remove the subnet from the group, and reread
groups with <PRE>ipsec auto --rereadgroups</PRE></TD>
</TR>
<TR>
<TD>OE does not work to hosts on the local LAN.</TD>
<TD>This is a known issue.</TD>
<TD>See <A HREF="opportunism.known-issues">this list</A> of known issues
with OE.
</TD>
</TR>
                                                                                
<TR>
<TD>FreeS/WAN does not seem to be executing your default policy. In your
logs, you see a message like:
<PRE>/etc/ipsec.d/policies/iprivate-or-clear"
line 14: subnet "0.0.0.0/0",
source 192.0.2.13/32,
already "private-or-clear"</PRE>
</TD>
<TD><A HREF="glossary.html#fullnet">Fullnet</A> in a policy group file defines
your default policy. Fullnet should normally be present in only one policy
group file. The fine print: you can have two default policies defined so long
as they protect different local endpoints (e.g. the FreeS/WAN gateway and a
subnet).</TD>
<TD>
Find all policies which contain fullnet with:<br>
<PRE>grep -F 0.0.0.0/0 /etc/ipsec.d/policies/*</PRE>
then remove the unwanted occurrence(s).
</TD>
</TR>
                                                                                
</TABLE>
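<P>As a sketch of the <VAR>clear</VAR> policy group hint in the table above:
assuming your policy group files live in <VAR>/etc/ipsec.d/policies/</VAR>
(the directory used in the grep example) and that your DNS servers sit at the
placeholder addresses shown, the <VAR>clear</VAR> file might gain entries like
these, one address or CIDR block per line:</P>
<PRE># /etc/ipsec.d/policies/clear
# DNS servers to be reached in the clear (placeholder addresses)
192.0.2.53/32
192.0.2.54/32</PRE>
<P>The change takes effect once you re-read the groups with
<VAR>ipsec auto --rereadgroups</VAR>.</P>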


<H2><A NAME="negotiation"></A>2. During Negotiation</H2>
<P>When you fail to bring up a tunnel, you'll need to find out:</P>
<UL>
<LI><A HREF="#state">what your connection state is,</A> and often</LI>
<LI><A HREF="#find.pluto.error">an error message</A>.</LI>
</UL>
<P>before you can 
<A HREF="#interpret.pluto.error">diagnose your problem</A>.</P>
<H3><A NAME="state"></A>2.1 Determine Connection State</H3>
<H4>Finding current state</H4>
<P>You can see connection states (STATE_MAIN_I1 and so on) when you
bring up a connection on the command line. If you have missed this,
or brought up your connection automatically, use:
</P>
<PRE>ipsec auto --status</PRE>
<P>The most relevant state is the last one reached.</P>
<H4><VAR>What's this supposed to look like?</VAR></H4>
<P>Negotiations should proceed through various states, in the course of:</P>
<OL>
<LI>IKE negotiations (aka Phase 1, Main Mode, STATE_MAIN_*)</LI>
<LI>IPSEC negotiations (aka Phase 2, Quick Mode, STATE_QUICK_*)</LI>
</OL>
<P>These are done and a connection is established when you see messages like:</P>
<PRE>    000 #21: &quot;myconn&quot; STATE_MAIN_I4 (ISAKMP SA established)...
    000 #2: &quot;myconn&quot; STATE_QUICK_I2 (sent QI2, IPsec SA established)...</PRE><P>
The key phrases to look for are &quot;ISAKMP SA established&quot; and &quot;IPsec
SA established&quot;, with the relevant connection name. Often, this happens 
at STATE_MAIN_I4 and STATE_QUICK_I2, respectively.</P>
<P><VAR>ipsec auto --status</VAR> will tell you what states <STRONG>have
been achieved</STRONG>, rather than the current state. Since
determining the current state is rather more difficult to do, current
state information is not available from Linux FreeS/WAN. If you are
actively bringing a connection up, the status report's last states
for that connection likely reflect its current state. Beware, though,
of the case where a connection was correctly brought up but is now
downed: Linux FreeS/WAN will not notice this until it attempts to
rekey. Meanwhile, the last known state indicates that the connection
has been established.</P>
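<P>If the status report is long, it may help to filter it for a single
connection. A minimal sketch, assuming your connection is named
<VAR>myconn</VAR> as in the sample output above:</P>
<PRE>    ipsec auto --status | grep myconn</PRE>
<P>The last state lines printed for that connection are the ones to read.</P>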
<P>If your connection is stuck at STATE_MAIN_I1, skip straight to 
<A HREF="#ikepath">the notes on that state</A>.</P>

<H3><A NAME="find.pluto.error"></A>2.2 Finding error text</H3>
<P>Solving most errors will require you to find verbose error text,
either on the command line or in the logs.</P>
<H4>Verbose start for more information</H4>
<P>
Note that you can get more detail from <VAR>ipsec auto</VAR> using
the --verbose flag:</P>
<PRE STYLE="margin-bottom: 0.2in">    ipsec auto --verbose --up west-east</PRE><P>
More complete information can be gleaned from the <A HREF="#logusage">log
files</A>.</P>

<H4>Debug levels count</H4>
<P>The amount of description you'll get here depends on ipsec.conf debug
settings, <VAR>klipsdebug</VAR>= and <VAR>plutodebug</VAR>=. 
When troubleshooting, set at least one of these to <VAR>all</VAR>, and 
when done, reset it to <VAR>none</VAR> so your logs don't fill up.
Note that you must have enabled the <VAR>klipsdebug</VAR> 
<A HREF="install.html#allbut">compile-time option</A> for the 
<VAR>klipsdebug</VAR> configuration switch to work.</P>
<P>For negotiation problems <VAR>plutodebug</VAR> is most relevant.
<VAR>klipsdebug</VAR> applies mainly to attempts to use an
already-established connection. See also <A HREF="ipsec.html#parts">this</A>
description of the division of duties within Linux FreeS/WAN.</P>
<P>After raising your debug levels, restart Linux FreeS/WAN to ensure
that ipsec.conf is reread, then recreate the error to generate
verbose logs. 
</P>
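<P>For example, while chasing a negotiation problem, the debug lines in the
<VAR>config setup</VAR> section of ipsec.conf might look like the sketch
below. Only the two debug settings matter here; whatever else your
<VAR>config setup</VAR> section contains stays as it is.</P>
<PRE>config setup
	# verbose Pluto logging while troubleshooting negotiation
	plutodebug=all
	# KLIPS debugging stays off unless you are debugging connection use
	klipsdebug=none</PRE>
<P>Once you have captured the error, set <VAR>plutodebug</VAR> back to
<VAR>none</VAR> and restart again.</P>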
<H4><VAR>ipsec barf</VAR> for lots of debugging information</H4>
<P>
<A HREF="manpage.d/ipsec_barf.8.html"><VAR>ipsec barf (8)</VAR></A>
collects a bunch of useful debugging information, including these logs.
Use the command</P>
<PRE>
    ipsec barf &gt; barf.west
</PRE>
<P>to generate one.</P>
<H4>Find the error</H4>
<P>Search out the failure point in your logs.
 Are there a handful of lines which succinctly describe how
things are going wrong or contrary to your expectation? Sometimes the
failure point is not immediately obvious: Linux FreeS/WAN's errors
are usually not marked &quot;Error&quot;. Have a look in the 
<A HREF="faq.html">FAQ</A>
for what some common failures look like.</P>
<P>Tip: problems snowball.
Focus your efforts on the first problem, which is likely to be the
cause of later errors.</P>
<H4>Play both sides</H4>
<P>Also find error text on the peer IPSec box.
This gives you two perspectives on the same failure.</P>
<P>At times you will require information which only one side has.
The peer can merely indicate the presence of an error, and its 
approximate point in the negotiations. If one side keeps retrying, 
it may be because there is a show stopper on the other side. 
Have a look at the other side and figure out what it doesn't like.</P>
<P>If the other end is not Linux FreeS/WAN, the principle is the
same: replicate the error with its most verbose logging on, and
capture the output to a file.</P>
<H3><A NAME="interpret.pluto.error"></A>2.3 Interpreting a Negotiation Error</H3>
<H4><A NAME="ikepath"></A>Connection stuck at STATE_MAIN_I1</H4>
<P>This error commonly happens because IKE (port 500) packets, needed 
to negotiate an IPSec connection, cannot travel freely between your IPSec 
gateways. See <A HREF="firewall.html#packets">our firewall document</A> 
for details.</P>
<H4>Other errors</H4>
<P>Other errors require a bit more digging. Use the following resources:</P>
<UL>
	<LI><A HREF="faq.html">the FAQ</A>. Since this document is
	constantly updated, the snapshot's FAQ may have a new entry relevant
	to your problem.</LI>
	<LI>our <A HREF="background.html">background document</A>.
	It covers special considerations which, while not central to Linux FreeS/WAN,
	are often tripped over, including problems with 
      <a href="background.html#MTU.trouble">packet fragmentation</a>
      and considerations for
	testing opportunism.</LI>
	<LI>the <A HREF="mail.html#lists">list archives</A>. Each of the
	searchable archives works differently, so it's worth checking each.
	Use a search term which is generic, but identifies your error, for
	example &quot;No connection is known for&quot;.
	<BR>
        Often, you will find that your question has been answered in the
	past. Finding an archived answer is quicker than asking the list.
	You may, however, find similar questions without answers. If you do,
	send their URLs to the list with your trouble report. The additional
	examples may help the list tech support person find your answer.</LI>
	<LI>Look into the code where the error is being generated. The
	pluto code is nicely documented with comments and meaningful
	variable names.</LI>
</UL>
<P>If you have failed to solve your problem with the help of these
resources, send a detailed problem report to the users list,
following these <A HREF="#prob.report">guidelines</A>.</P>
<H2><A NAME="use"></A>3. Using a Connection</H2>
<H3>3.1 Orienting yourself</H3>
<H4><VAR>How do I know if it works?</VAR></H4>
<P>Test your connection by sending packets through it. The simplest way
to do this is with ping, but the ping needs to <STRONG>test the correct 
tunnel.</STRONG> See <A HREF="#testgates">this example scenario</A> if
you are unsure what this means.</P>
<P>If your ping returns, test any other connections you've brought
up. If all check out, great. You may wish to <A HREF="#bigpacket">test
with large packets</A> for MTU problems.</P>
<H4><VAR>ipsec barf</VAR> is useful again</H4>
<P>If your ping fails to return, generate an ipsec barf debugging
report on each IPSec gateway. On a non-Linux FreeS/WAN
implementation, gather equivalent information. Use this, and the tips
in the next sections, to troubleshoot. Are you sure that both
endpoints are capable of hearing and responding to ping?</P>
<H3>3.2 Those pesky configuration errors</H3>
<P>IPSec may be dropping your ping packets since they do not belong in the
tunnels you have constructed:</P>
<UL>
<LI>Your ping may not test the tunnel you intend to test. For details, see our  
<A HREF="faq.html#cantping">&quot;I can't ping&quot;</A> FAQ.
</LI>
<LI>
Alternately, you may have a configuration error.
For example, you may have configured one of the four possible tunnels between 
two gateways, but not the one required to secure the important
traffic you're now testing. In this case, add and start the tunnel,
and try again.
</LI>
</UL>
<P>In either case, you will often see a message like:</P>
<PRE>klipsdebug... no eroute</PRE>
<P>which we discuss in <A HREF="faq.html#no_eroute">this
FAQ</A>.</P>
<P>Note:</P>
<UL>
<LI><A HREF="glossary.html#NAT.gloss">Network Address Translation (NAT)</A> 
and <A HREF="glossary.html#masq">IP masquerade</A> may have an effect on 
which tunnels you need to configure.</LI>
<LI>When testing a tunnel that protects a multi-node subnet, try several 
subnet nodes as ping targets, in case one node is routing incorrectly.</LI>
</UL>
<H3><A NAME="route.firewall"></A>3.3 Check Routing and Firewalling</H3>
<P>If you've confirmed your configuration assumptions, the problem is
almost certainly with routing or firewalling. Isolate the problem
using interface statistics, firewall statistics, or a packet sniffer.</P>
<H4>Background:</H4>
<UL>
	<LI>Linux FreeS/WAN supplies all the special routing it needs;
	you need only route packets out through your IPSec gateway. Verify
	that on the <VAR>subnetted</VAR> machines you are using for your 
        ping-test, your routing is as expected. I have seen a tunnel 
        &quot;fail&quot; because the subnet machine was sending packets 
        out an alternate gateway (not our IPSec gateway) on their return path.</LI>
	<LI>Linux FreeS/WAN requires particular <A HREF="firewall.html"> 
        firewalling considerations</A>.
	Check the firewall rules on your IPSec gateways and ensure that they
	allow IPSec traffic through. Be sure that no other machine - for
	example a router between the gateways - is blocking your IPSec
	packets.</LI>
</UL>
<H4><A NAME="ifconfig"></A>View Interface and Firewall
Statistics</H4>
<P>Interface reports and firewall statistics can help you track down
lost packets at a glance. Check any firewall statistics you may be keeping 
on your IPSec gateways, for dropped packets.</P>

<P><STRONG>Tip</STRONG>: You can take a snapshot of the packets processed 
by your firewall with:</P>

<PRE>    iptables -L -n -v</PRE>

<P>You can get creative with "diff" to find out what happens to a
particular packet during transmission.</P>
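<P>A minimal sketch of that idea, assuming a placeholder ping target of
192.0.2.13 and scratch file names of your own choosing: snapshot the
counters, send a few pings, snapshot again, and compare. Rules whose packet
counts changed between the two snapshots are candidates for the ones your
ping traversed, though other traffic on a busy gateway will move the counters
too.</P>
<PRE>    iptables -L -n -v &gt; fw.before
    ping -c 5 192.0.2.13
    iptables -L -n -v &gt; fw.after
    diff fw.before fw.after</PRE>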

<P>Both <VAR>cat /proc/net/dev</VAR> and <VAR>ifconfig</VAR> display
interface statistics, and both are included in <VAR>ipsec barf</VAR>. Use
either to check if any interface has dropped packets. If you find
that one has, test whether this is related to your ping. While you
ping continuously, print that interface's statistics several times.
Does its drop count increase in proportion to the ping? If so, check
why the packets are dropped there.</P>

<P>To do this, look at the firewall rules that apply to that interface. If the
interface is an IPSec interface, more information may be available in
the log. Grep for the word &quot;drop&quot; in a log which was
created with <VAR>klipsdebug=all</VAR> as the error happened.</P>
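<P>For instance, assuming your KLIPS messages land in
<VAR>/var/log/messages</VAR> (substitute whichever log file your system
uses), a case-insensitive search might look like:</P>
<PRE>    grep -i drop /var/log/messages</PRE>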
<P>See also this <A HREF="#ifconfig1">discussion</A> on interpreting 
<VAR>ifconfig</VAR> statistics.</P>
<H3><A NAME="sniff"></A>3.4 When in doubt, sniff it out</H3>
<P>If you have checked configuration assumptions, routing, and
firewall rules, and your interface statistics yield no clue, it
remains for you to investigate the mystery of the lost packet by the
most thorough method: with a packet sniffer (providing, of course, 
that this is legal where you are working).</P>
<P>In order to detect packets on the ipsec virtual interfaces,
you will need an up-to-date sniffer (tcpdump, ethereal, ksnuffle) on
your IPSec gateway machines. You may also find it useful to sniff the ping
endpoints.</P>
<H4>Anticipate your packets' path</H4>
<P>Ping, and examine each interface along the projected path, checking for your 
ping's arrival. If it doesn't get to the next stop, you have narrowed 
down where to look for it. In this way, you can isolate a problem area, 
and narrow your troubleshooting focus.</P>
<P>Within a machine running Linux FreeS/WAN, this 
<A HREF="firewall.html#packets">packet flow diagram</A> will help you 
anticipate a packet's path.</P>
<P>Note that:</P>
<UL>
<LI>
from the perspective of the tunneled packet, the entire tunnel is one hop. 
That's explained in <A HREF="faq.html#no_trace">this</A> FAQ.
</LI>
<LI>
 an encapsulated IPSec packet will look different, when
sniffed, from the plaintext packet which generated it. You
can see plaintext packets entering an IPSec interface and the
resulting ciphertext packets as they emerge from the corresponding
physical interface. 
</LI>
</UL>
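<P>As a sketch with tcpdump, assuming <VAR>ipsec0</VAR> is your IPSec
virtual interface and <VAR>eth0</VAR> the physical interface beneath it
(both names are assumptions; check yours with <VAR>ifconfig</VAR>), you
might run each of these in its own terminal while the ping is going:</P>
<PRE>    # plaintext pings entering the tunnel
    tcpdump -n -i ipsec0 icmp

    # encrypted ESP packets (IP protocol 50) leaving the physical interface
    tcpdump -n -i eth0 ip proto 50</PRE>
<P>Seeing pings on the first but nothing on the second would suggest the
packet is lost inside the gateway. A tunnel that uses AH rather than ESP
would show up as IP protocol 51 instead.</P>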
<P>Once you isolate where the packet is lost, take a closer look at
firewall rules, routing and configuration assumptions as they affect
that specific area. If the packet is lost on an IPSec gateway, comb
through <VAR>klipsdebug</VAR> output for anomalies. 
</P>
<P>If the packet goes through both gateways successfully and reaches
the ping target, but does not return, suspect routing. Check that the
ping target routes packets back to the IPSec gateway.</P>
<H3><A NAME="find.use.error"></A>3.5 Check your logs</H3>
<P>Here, too, log information can be useful. Start with the 
<A HREF="#find.pluto.error">guidelines above</A>.</P>
<P>For connection use problems, set <VAR>klipsdebug=all</VAR>. Note
that you must have enabled the <VAR>klipsdebug</VAR> 
<A HREF="install.html#allbut">compile-time option</A> to do this. 
Restart Linux FreeS/WAN so that it rereads <VAR>ipsec.conf</VAR>, 
then recreate the error condition. When searching through 
<VAR>klipsdebug</VAR> data, look especially for the keywords
&quot;drop&quot; (as in dropped packets) and &quot;error&quot;.</P>
<P>Often the problem with connection use is not software error, but
rather that the software is behaving contrary to expectation. 
</P>
<H4><A NAME="interpret.use.error"></A>Interpreting log text</H4>
<P>To interpret the Linux FreeS/WAN log text you've found, use the
same resources as indicated for troubleshooting 
connection negotiation: 
<A HREF="faq.html">the FAQ</A>, our
<A HREF="background.html">background document</A>, and the 
<A HREF="mail.html#lists">list archives</A>.
Looking in the KLIPS code is only for the very brave.</P>
<P>If you are still stuck, send a <A HREF="#prob.report">detailed
problem report</A> to the users' list.</P>
<H3><A NAME="bigpacket"></A>3.6 More testing for the truly thorough</H3>
<H4>Large Packets</H4>
<P>If each of your connections passed the ping test, you may wish to
test by pinging with large packets (2000 bytes or larger). If a large ping does
not return, suspect MTU issues, and see this <A HREF="background.html#MTU.trouble">discussion</A>.</P>
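<P>A sketch of such a test, assuming a ping that accepts <VAR>-s</VAR> to
set the payload size in bytes (as the common Linux ping does) and a
placeholder target address:</P>
<PRE>    ping -s 2000 192.0.2.13</PRE>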
<H4>Stress Tests</H4>
<P>In most users' view, a simple ping test, and perhaps a
large-packet ping test suffice to indicate a working IPSec
connection.</P>
<P>Some people might like to do additional stress tests prior to
production use. They may be interested in this <A HREF="http://www.sandelman.ottawa.on.ca/linux-ipsec/html/2000/12/msg00224.html">testing
protocol</A> we use at interoperation conferences, aka &quot;bakeoffs&quot;.
We also have a <VAR>testing</VAR> directory that ships with the
release.</P>
<H2><A NAME="prob.report"></A>4. Problem Reporting</H2>
<H3>4.1 How to ask for help</H3>
<P>Ask for troubleshooting help on the users' mailing list,
<A HREF="mailto:users@lists.freeswan.org">users@lists.freeswan.org</A>.
While sometimes an initial query with a quick description of your
intent and error will jog someone's memory of a similar problem,
it's often necessary to send a second mail with a complete problem
report. 
</P>


<P>When reporting problems to the mailing list(s), please include: 
</P>
<UL>
	<LI>a brief description of the problem</LI>
	<LI>if it's a compile problem, the actual output from make,
	showing the problem. Try to edit it down to only the relevant part,
	but when in doubt, be as complete as you can. If it's a kernel
	compile problem, any relevant out.* files</LI>
	<LI>if it's a run-time problem, pointers to where we can find the
	complete output from &quot;ipsec barf&quot; from BOTH ENDS (not just
	one of them). Remember that it's common outside the US and Canada to
	pay for download volume, so if you can't post barfs on the web and
	send the URL to the mailing list, at least compress them with tar or
	gzip.<BR>
	If you can, try to simplify the case that is causing the problem.
	In particular, if you clear your logs, start FreeS/WAN with no other
	connections running, cause the problem to happen, and then do <VAR>ipsec
	barf</VAR> on both ends immediately, that gives the smallest and
	least cluttered output.</LI>
	<LI>any other error messages, complaints, etc. that you saw.
	Please send the complete text of the messages, not just a summary.</LI>
	<LI>what your network setup is. Include subnets, gateway
	addresses, etc. A schematic diagram is a
	good format for this information.</LI>
	<LI>exactly what you were trying to do with Linux FreeS/WAN, and
	exactly what went wrong</LI>
	<LI>a fix, if you have one. But remember, you are sending mail to
	people all over the world; US residents and US citizens in
	particular, please read doc/exportlaws.html before sending code --
	even small bug fixes -- to the list or to us.</LI>
	<LI>When in doubt about whether to include some seemingly-trivial
	item of information, include it. It is rare for problem reports to
	have too much information, and common for them to have too little.</LI>
</UL>
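<P>As a sketch of the run-time report step, reusing the
<VAR>barf.west</VAR> file name from the earlier example (any name will do):
generate the report, then compress it before posting or mailing it.</P>
<PRE>    ipsec barf &gt; barf.west
    gzip barf.west</PRE>
<P>This leaves a compressed <VAR>barf.west.gz</VAR>. Repeat on the other
gateway so that the list sees both ends.</P>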

<P>Here are some good general guidelines on bug reporting:
<a href="http://tuxedo.org/~esr/faqs/smart-questions.html">How To Ask Questions
The Smart Way</a> and <a
href="http://www.chiark.greenend.org.uk/~sgtatham/bugs.html">How to Report
Bugs Effectively</a>.</p>


<H3>4.2 Where to ask</H3>
<P>To report a problem, send mail about it to the users' list. If you
are certain that you have found a bug, report it to the bugs list. If
you encounter a problem while doing your own coding on the Linux
FreeS/WAN codebase and think it is of interest to the design team,
notify the design list. When in doubt, default to the users' list.
More information about the mailing lists is found <A HREF="mail.html#lists">here</A>.</P>
<P>For a number of reasons -- including export-control regulations
affecting almost any <STRONG>private</STRONG> discussion of
encryption software -- we prefer that problem reports and discussions
go to the lists, not directly to the team. Beware that the list goes
worldwide; US citizens, read this important information about your
<A HREF="politics.html#exlaw">export laws</A>. If you're using this
software, you really should be on the lists. To get onto them, visit
<A HREF="http://lists.freeswan.org/">lists.freeswan.org</A>.</P>
<P>If you do send private mail to our coders or want a private reply
from them, please make sure that the return address on your mail
(From or Reply-To header) is a valid one. They have more important
things to do than to unravel addresses that have been mangled in an
attempt to confuse spammers. 
</P>
<H2><A NAME="notes"></A>5. Additional Notes on Troubleshooting</H2>
<P>The following sections supplement the Guide: <A HREF="#system.info">information
available on your system</A>; <A HREF="#testgates">testing between
security gateways</A>; <A HREF="#ifconfig1">ifconfig reports for
KLIPS debugging</A>; <A HREF="#gdb">using GDB on Pluto</A>.</P>
<H3><A NAME="system.info"></A>5.1 Information available on your
system</H3>
<H4><A NAME="logusage"></A>Logs used</H4>
<P>Linux FreeS/WAN logs to:</P>
<UL>
	<LI>/var/log/secure (or, on Debian, /var/log/auth.log)</LI>
	<LI>/var/log/messages</LI>
</UL>
<P>Check both places to get full information. If you find nothing,
check /etc/syslog.conf or its equivalent (see <VAR>syslog.conf(5)</VAR>)
to see where <VAR>authpriv</VAR> messages are being directed.</P>
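<P>One quick way to do that check is sketched below. It assumes a
classic syslogd whose rules live in /etc/syslog.conf. The function name
<VAR>show_authpriv</VAR> is hypothetical, and you can pass a different
file if your system keeps its rules elsewhere.</P>
<PRE># show_authpriv [FILE]
# Hypothetical helper: list the non-comment syslog rules that mention
# authpriv. FILE defaults to /etc/syslog.conf.
show_authpriv() {
    grep -v '^#' "${1:-/etc/syslog.conf}" | grep authpriv
}

# Example:
#   show_authpriv</PRE>
<P>Each line printed names a log file (or other destination) that
receives, or explicitly excludes, <VAR>authpriv</VAR> messages.</P>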
<H4><A NAME="pages"></A>man pages provided</H4>
<DL>
	<DT><A HREF="manpage.d/ipsec.conf.5.html">ipsec.conf(5)</A> 
	</DT><DD>
	Manual page for IPSEC configuration file. 
	</DD><DT>
	<A HREF="manpage.d/ipsec.8.html">ipsec(8)</A> 
	</DT><DD STYLE="margin-bottom: 0.2in">
	Primary man page for ipsec utilities. 
	</DD></DL>
<P>
Other man pages are on <A HREF="manpages.html">this list</A> and in</P>
<UL>
	<LI>/usr/local/man/man3</LI>
	<LI>/usr/local/man/man5</LI>
	<LI>/usr/local/man/man8/ipsec_*</LI>
</UL>
<H4><A NAME="statusinfo"></A>Status information</H4>
<DL>
	<DT>ipsec auto --status 
	</DT><DD>
	Command to get status report from running system. Displays Pluto's
	state. Includes the list of connections which are currently &quot;added&quot;
	to Pluto's internal database; lists state objects reflecting ISAKMP
	and IPsec SAs being negotiated or installed. 
	</DD><DT>
	ipsec look 
	</DT><DD>
	Brief status info. 
	</DD><DT>
	ipsec barf 
	</DT><DD STYLE="margin-bottom: 0.2in">
	Copious debugging info. 
	</DD></DL>
<H3>
<A NAME="testgates"></A>5.2 Testing between security gateways</H3>
<P>Sometimes you need to test a subnet-subnet tunnel. This is a
tunnel between two security gateways, which protects traffic on
behalf of the subnets behind these gateways. On this network:</P>
<PRE>     Sunset==========West------------------East=========Sunrise
                     IPSec gateway         IPSec gateway
           local net       untrusted net       local net</PRE><P>
you might name this tunnel sunset-sunrise. You can test this tunnel
by having a machine behind one gateway ping a machine behind the
other gateway, but this is not always convenient or even possible.</P>
<P>Simply pinging one gateway from the other is not useful. Such a
ping does not normally go through the tunnel. <STRONG>The tunnel
handles traffic between the two protected subnets, not between the
gateways</STRONG>. Depending on the routing in place, a ping might</P>
<UL>
	<LI>either succeed by finding an
	unencrypted route</LI>
	<LI>or fail by finding no route. Packets without an IPSEC eroute
	are discarded.</LI>
</UL>
<P><STRONG>Neither event tells you anything about the tunnel</STRONG>.
You can explicitly create an eroute to force such packets through the
tunnel, or you can create additional tunnels as described in our
<A HREF="config.html#multitunnel">configuration document</A>, but
those may be unnecessary complications in your situation.</P>
<P>The trick is to explicitly test between <STRONG>both gateways'
private-side IP addresses</STRONG>. Since the private-side interfaces
are on the protected subnets, the resulting packets do go via the
tunnel. Use either ping -I or traceroute -i, both of which allow you
to specify a source interface. (Note: these options are unsupported on older Linuxes.)
The same principles apply for a road warrior (or other) case where
only one end of your tunnel is a subnet.</P>
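<P>As a minimal sketch of the ping variant, suppose West's private-side
address is 192.168.1.1 and East's is 192.168.2.1. These addresses and the
function name <VAR>ping_via_tunnel</VAR> are hypothetical; substitute your
own. The function sends four pings whose source is the local private-side
address, so the packets fall within the protected subnets.</P>
<PRE># ping_via_tunnel LOCAL_PRIVATE_IP REMOTE_PRIVATE_IP
# Hypothetical helper: ping the remote gateway's private-side address,
# using the local gateway's private-side address as the source.
ping_via_tunnel() {
    ping -c 4 -I "$1" "$2"
}

# Example, run on West (addresses are made up):
#   ping_via_tunnel 192.168.1.1 192.168.2.1</PRE>
<P>If the replies come back, repeat the test from East with the two
addresses swapped to check the other direction.</P>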
<H3><A NAME="ifconfig1"></A>5.3 ifconfig reports for KLIPS debugging</H3>
<P>When diagnosing problems using ifconfig statistics, you may wonder
what type of activity increments a particular counter for an ipsecN
device. Here's an index, posted by KLIPS developer Richard Guy
Briggs:</P>
<PRE>Here is a catalogue of the types of errors that can occur for which
statistics are kept when transmitting and receiving packets via klips.
I notice that they are not necessarily logged in the right counter.
. . .

Sources of ifconfig statistics for ipsec devices

rx-errors:
- packet handed to ipsec_rcv that is not an ipsec packet.
- ipsec packet with payload length not modulo 4.
- ipsec packet with bad authenticator length.
- incoming packet with no SA.
- replayed packet.
- incoming authentication failed.
- got esp packet with length not modulo 8.

tx_dropped:
- cannot process ip_options.
- packet ttl expired.
- packet with no eroute.
- eroute with no SA.
- cannot allocate sk_buff.
- cannot allocate kernel memory.
- sk_buff internal error.


The standard counters are:

struct enet_statistics
{
        int        rx_packets;                /* total packets received */
        int        tx_packets;                /* total packets transmitted */
        int        rx_errors;                /* bad packets received */
        int        tx_errors;                /* packet transmit problems */
        int        rx_dropped;                /* no space in linux buffers */
        int        tx_dropped;                /* no space available in linux */
        int        multicast;                /* multicast packets received */
        int        collisions;

        /* detailed rx_errors: */
        int        rx_length_errors;
        int        rx_over_errors;                /* receiver ring buff overflow */
        int        rx_crc_errors;                /* recved pkt with crc error */
        int        rx_frame_errors;        /* recv'd frame alignment error */
        int        rx_fifo_errors;                /* recv'r fifo overrun */
        int        rx_missed_errors;        /* receiver missed packet */

        /* detailed tx_errors */
        int        tx_aborted_errors;
        int        tx_carrier_errors;
        int        tx_fifo_errors;
        int        tx_heartbeat_errors;
        int        tx_window_errors;
};

of which I think only the first 6 are useful.</PRE><H3>
<A NAME="gdb"></A>5.4 Using GDB on Pluto</H3>
<P>You may need to use the GNU debugger, gdb(1), on Pluto. This
should be necessary only in unusual cases, for example if you
encounter a problem which the Pluto developer cannot readily
reproduce or if you are modifying Pluto. 
</P>
<P>Here are the Pluto developer's suggestions for doing this: 
</P>
<PRE>Can you get a core dump and use gdb to find out what Pluto was doing
when it died?

To get a core dump, you will have to set dumpdir to point to a
suitable directory (see <A HREF="manpage.d/ipsec.conf.5.html">ipsec.conf(5)</A>).

To get gdb to tell you interesting stuff:
        $ script
        $ cd dump-directory-you-chose
        $ gdb /usr/local/lib/ipsec/pluto core
        (gdb) where
        (gdb) quit
        $ exit

The resulting output will have been captured by the script command in
a file called &quot;typescript&quot;.  Send it to the list.

Do not delete the core file.  I may need to ask you to print out some
more relevant stuff.</PRE><P>
Note that the <VAR>dumpdir</VAR> parameter takes effect only when the
IPsec subsystem is restarted, either by a reboot or by <VAR>ipsec setup restart</VAR>.</P>
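<P>For illustration, a <VAR>config setup</VAR> section in ipsec.conf
that enables core dumps might look something like the fragment below.
The directory shown is a made-up example; use any directory that exists
and has room for a core file, and keep whatever other settings your
<VAR>config setup</VAR> section already contains.</P>
<PRE>config setup
        # Hypothetical directory for Pluto core dumps; pick your own.
        dumpdir=/var/tmp/plutocore</PRE>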
</BODY>
</HTML>