linux-ppp.vger.kernel.org archive mirror
* Spontaneous LCP ConfReq after connection made
From: James Cameron @ 2005-08-12  7:33 UTC
  To: linux-ppp

G'day,

pppd tears down a connection and starts a new one, without the modem
hanging up, in response to an LCP ConfReq that arrives out of the blue
from the peer.

I'm thinking it's a peer bug, but I'm not sure, and I can't do anything
about the peer, as it's a consumer service by an apparently
disinterested service provider who refuses to resolve problems if
"unsupported" software is used.

Can anyone suggest anything?

Is there somewhere in pppd source where I can easily disable this
response, in case I can continue the existing connection?


The debug and dump output is here;
http://quozl.linux.org.au/mm-5100/2005-08-12/rtt-fail.log
(the SIGHUP after the restart is generated by my ip-down script)

The pppdump output is here;
http://quozl.linux.org.au/mm-5100/2005-08-12/rtt.pppdump

The service I'm using is a CDMA 1x RTT modem attached by USB;
http://quozl.linux.org.au/mm-5100/

pppd is from Debian GNU/Linux sarge, 2.4.3-20050321
Linux kernel 2.6.11.7 on i686.

Problem is reproducible; most easily if there is a PPTP session active
on another interface and packets begin to be sent over the new interface
as a result of a route change, but also during normal operation without
any PPTP session active.  Sometimes it will operate for minutes or
hours, sometimes only seconds.  The trouble always begins with an LCP
ConfReq out of the blue, and generally in apparent response to some
packet sent by my end.

-- 
James Cameron
http://ftp.hp.com.au/sigs/jc/


* Re: Spontaneous LCP ConfReq after connection made
From: James Carlson @ 2005-08-12 14:17 UTC
  To: linux-ppp

James Cameron writes:
> pppd tears down a connection and starts a new one, without the modem
> hanging up, in response to an LCP ConfReq that arrives out of the blue
> from the peer.

That's exactly what the standards require pppd to do.

When an LCP Configure-Request arrives, the LCP state machine leaves
Opened state.  This causes all the other state machines to get
lower-layer-down, and they all come down as well.
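
To make the transition concrete, here is a minimal Python sketch of the
Opened-state rows of the RFC 1661 state table for a received
Configure-Request.  It is an illustration only, not pppd's code; the names
`State`, `on_configure_request`, and `lower_layer_events` are made up for
this example, and the protocol names passed in are just sample labels.

```python
from enum import Enum


class State(Enum):
    # Only the states this sketch needs; numbers follow RFC 1661.
    REQ_SENT = 6
    ACK_SENT = 8
    OPENED = 9


def on_configure_request(state, acceptable):
    """Return (actions, next_state) for a Configure-Request seen in Opened.

    Action abbreviations are the RFC 1661 ones:
    tld = this-layer-down, scr = send Configure-Request,
    sca = send Configure-Ack, scn = send Configure-Nak/Reject.
    """
    if state is not State.OPENED:
        raise ValueError("this sketch only covers the Opened state")
    if acceptable:
        return ["tld", "scr", "sca"], State.ACK_SENT
    return ["tld", "scr", "scn"], State.REQ_SENT


def lower_layer_events(actions, upper_protocols):
    """tld is what delivers a Down event to every protocol above LCP."""
    if "tld" not in actions:
        return []
    return [(name, "Down") for name in upper_protocols]


actions, new_state = on_configure_request(State.OPENED, acceptable=True)
print(new_state.name, actions)
print(lower_layer_events(actions, ["IPCP", "CCP"]))
```

Whether or not the request is acceptable, LCP leaves Opened and the
protocols above it are taken down, which is the teardown described in the
original report.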

> I'm thinking it's a peer bug, but I'm not sure, and I can't do anything
> about the peer, as it's a consumer service by an apparently
> disinterested service provider who refuses to resolve problems if
> "unsupported" software is used.
> 
> Can anyone suggest anything?

I'd strongly suggest finding a different service provider.  Giving
your money to a "service" provider that won't even accept a valid bug
report doesn't strike me as a good plan.

> Is there somewhere in pppd source where I can easily disable this
> response, in case I can continue the existing connection?

You'd probably have to modify the link_down() routine so that it
doesn't do upper_layers_down.

Note that by doing so, you're essentially making pppd non-compliant
with the standards.  I'm also not really sure it would fix the
problem.  How do you really know that this is an out-of-the-blue
packet, and *not* the peer attempting to renegotiate the link for some
reason?  If it's the latter, which I suspect is the case, then just
ignoring the out-of-the-blue packet will cause the link to go dead.

In other words, you can do it, but I think it'd be wrong and
self-defeating to do so.

> The debug and dump output is here;
> http://quozl.linux.org.au/mm-5100/2005-08-12/rtt-fail.log
> (the SIGHUP after the restart is generated by my ip-down script)

In reading this log, it looks to me like the peer just wanted to
renegotiate the link.

How do you know that's not what it wanted to do?  Does this behavior
happen with other PPP implementations?  If not, then what's different
with them?

Assuming that it's a peer bug (and the peer doesn't actually *want* to
renegotiate the link), you might be able to find out what causes the
peer to behave this way and avoid that, rather than breaking pppd's
state machine.

Sometimes, bug-ridden peers (which this seems to be) are tickled by
seemingly innocuous behaviors.  Other implementations that don't have
trouble with this problematic peer might ask for options you don't ask
for or exhibit timing behaviors that you don't.  Getting to the bottom
of such issues is essentially debugging that remote peer from afar.
It's not easy at all.

One other bit of concern is that the peer says "mru 1500."  That's the
default, and there's just never any reason to ask for the well-known
default.  That means the peer is probably an idiot.

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>


* Re: Spontaneous LCP ConfReq after connection made
From: James Cameron @ 2005-08-12 23:32 UTC
  To: linux-ppp

James Carlson wrote:
> I'd strongly suggest finding a different service provider.  Giving
> your money to a "service" provider that won't even accept a valid bug
> report doesn't strike me as a good plan.

Unfortunately, there is no other service provider; all transmission
towers within radio range are owned by the one provider, and no
comparable service is available.  DSL is not available, and copper pairs
can only do about 24kbit/sec.  It's a 24-month contract, I'm up to the
ninth month, and the contract terms include a "supported software"
clause.  It was working reasonably well before now.

> In reading this log, it looks to me like the peer just wanted to
> renegotiate the link.
> 
> How do you know that's not what it wanted to do?  Does this behavior
> happen with other PPP implementations?  If not, then what's different
> with them?

Apart from the IP address change, there doesn't seem to be any reason
for the renegotiation.  These renegotiations happen in the middle of the
cell phone call; the cell call is not terminated.  But the provider
charges based on their PPP records, not their cell call records.  As a
result of the renegotiation, a new call record is placed in the service
provider's billing system, costing me USD 0.385 per renegotiation.
I've had it renegotiate 30 times in a minute, before I added an if-down
script to SIGHUP it.  The service provider has reversed these charges,
admitting a fault condition exists, but they haven't been able to
progress the fault since Feb '05, so I thought to seek additional help.

You raise an excellent point with regard to other PPP implementations.
Perhaps there is something negotiated or implied that isn't being done
by Linux PPP.  (Nor should it do so, of course).

I've not tried the PPP implementation that has been provided by the
modem manufacturer, because it is Windows specific, I don't run Windows
here, and as the modem is USB connected instead of serial port connected
I don't know a way to catch the stream for analysis.  But thanks for the
idea, I'll see if I can get the manufacturer to provide information.
(So far they have been silent since I told them I am using Linux).

> Getting to the bottom of such issues is essentially debugging that
> remote peer from afar.  It's not easy at all.

Agreed.  So in addition to whinging I could try to find what innocuous
behaviour Linux PPP is doing that apparently causes the peer to
renegotiate.  Have you any suggestions for options to try blindly in
case they have an effect?

I've tried "asyncmap ffffffff" and "escape
00,01,02,03,04,05,06,07,08,09,0a,0b,0c,0d,0e,0f,
10,11,12,13,14,15,16,17,18,19,1a,1b,1c,1d,1e,1f,
80,81,82,83,84,85,86,87,88,89,8a,8b,8c,8d,8e,8f,
90,91,92,93,94,95,96,97,98,99,9a,9b,9c,9d,9e,9f"
so far, with no apparent change.

Have you any suggestions for things to look for in the data sent to the
peer that seems to trigger the event?

-- 
James Cameron
http://ftp.hp.com.au/sigs/jc/


* Re: Spontaneous LCP ConfReq after connection made
From: Gilles Espinasse @ 2005-08-13  5:27 UTC
  To: linux-ppp


----- Original Message ----- 
From: "James Cameron" <james.cameron@hp.com>
To: <linux-ppp@vger.kernel.org>
Sent: Saturday, August 13, 2005 1:32 AM
Subject: Re: Spontaneous LCP ConfReq after connection made

> James Carlson wrote:
> > I'd strongly suggest finding a different service provider.  Giving
> > your money to a "service" provider that won't even accept a valid bug
> > report doesn't strike me as a good plan.
>
[ snip ]

> I've not tried the PPP implementation that has been provided by the
> modem manufacturer, because it is Windows specific, I don't run Windows
> here, and as the modem is USB connected instead of serial port connected
> I don't know a way to catch the stream for analysis.  But thanks for the
> idea, I'll see if I can get the manufacturer to provide information.
> (So far they have been silent since I told them I am using Linux).

There is an open source tool for sniffing USB traffic on Windows:
http://sourceforge.net/projects/usbsnoop

and a slightly modified version is available at
http://eciadsl.flashtux.org/download.php?lang=en
(search for "sniffer")

Gilles



* Re: Spontaneous LCP ConfReq after connection made
From: James Carlson @ 2005-08-13 18:22 UTC
  To: linux-ppp

James Cameron writes:
> Unfortunately, there is no other service provider; all transmission
> towers within radio range are owned by the one provider, and no
> comparable service is available.  DSL is not available, and copper pairs
> can only do about 24kbit/sec.  It's a 24-month contract, I'm up to the
> ninth month, and the contract terms include a "supported software"
> clause.  It was working reasonably well before now.

Ugh.  Sounds like an expensive experience.

If it's not for a mobile station (assuming so, since you mentioned
DSL), the one means you didn't suggest was cable.  I take it that's
not an option.

(There's also satellite, but I certainly wouldn't recommend that due
to the latency issues.  Though I suppose that any functioning
connection is better than none.)

> Apart from the IP address change, there doesn't seem to be any reason
> for the renegotiation.  These renegotiations happen in the middle of the
> cell phone call; the cell call is not terminated.  But the provider

There are a few things that could be causing it, even if there doesn't
appear to be a cause.

In general, PPP will want to renegotiate if the lower layer appears to
have gone down and back up.  If the RF layer is telling their system
that your telephone has disconnected and come back, it may well
trigger a renegotiation on that glitch.  Another possibility (since
you mention cellular service) is that something about your location or
the phone itself is causing it to hand off frequently between sites.
(That's not _supposed_ to happen, but it's at least theoretically
possible.)

And since it gets them extra cash to have this bug and most people
won't notice, they probably aren't too highly inclined to fix it.

Here's another possibility: I once saw a bug in a particular embedded
implementation where they apparently accidentally left a timer running
after setting LCP to Opened state that caused renegotiation.  It was
triggered when the 'wrong' side sent the first LCP message -- if I
recall correctly, they were expecting that the answerer (the "server")
would send the first packet, and somehow had managed to code their
implementation to depend on that.  (Pretty hard to do, I think, when
the RFC has the state machine right in it ...)

Obviously, of course, LCP is completely symmetric, and there's just no
reason for such a bug (and it's hard for me to imagine how you could
possibly get it that wrong), but bugs are funny that way.

Using the "silent" option might help, if the problem is similar.

> charges based on their PPP records, not their cell call records.  As a
> result of the renegotiation, a new call record is placed in the service
> provider's billing system, costing me USD 0.385 per renegotiation.
> I've had it renegotiate 30 times in a minute, before I added an if-down
> script to SIGHUP it.  The service provider has reversed these charges,

There's a fine line, I think, between deliberately foolish billing
practices and just outright fraud.

> I've not tried the PPP implementation that has been provided by the
> modem manufacturer, because it is Windows specific, I don't run Windows
> here, and as the modem is USB connected instead of serial port connected
> I don't know a way to catch the stream for analysis.  But thanks for the
> idea, I'll see if I can get the manufacturer to provide information.
> (So far they have been silent since I told them I am using Linux).

Is there any chance you could borrow someone else's machine for a day
to troubleshoot the problem?  One of the unfortunately good things
about Windows is that it's often pretty easily available.

Even if it's USB, there's a PPP-level "advanced" debug log buried in
the Windows client, and that might provide enough clues to go on.
Plus, in the good case, it fails just as miserably as Linux does,
proving to them that they have a problem and owe you a fix, and that
it's not your "unsupported" software that's at fault.

> > Getting to the bottom of such issues is essentially debugging that
> > remote peer from afar.  It's not easy at all.
> 
> Agreed.  So in addition to whinging I could try to find what innocuous
> behaviour Linux PPP is doing that apparently causes the peer to
> renegotiate.  Have you any suggestions for options to try blindly in
> case they have an effect?

As above, "silent" is one possibility.

Another would be to start refusing some of the options, such as
"nopcomp" and "noaccomp".

> I've tried "asyncmap ffffffff" and "escape

Instead of ffffffff, I'd use "default-asyncmap".  I doubt it'll help,
though.
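
One way to try these options methodically is to change a single option per
connection attempt and note which attempts still renegotiate.  The Python
sketch below only builds the argument lists; it does not launch pppd.  The
option names (`silent`, `nopcomp`, `noaccomp`, `default-asyncmap`, `debug`)
are existing pppd options, while the device path `/dev/ttyUSB0`, the
baseline list, and the function name are assumptions made for illustration.

```python
BASELINE = ["debug"]  # assumed baseline; substitute your real options

# One candidate change per trial, taken from the suggestions above.
TRIALS = {
    "silent": ["silent"],
    "no-compression": ["nopcomp", "noaccomp"],
    "default-asyncmap": ["default-asyncmap"],
}


def pppd_args(device, trial):
    """Return an argv list for one trial; nothing is executed here."""
    return ["pppd", device, *BASELINE, *TRIALS[trial]]


for name in TRIALS:
    print(name, "->", " ".join(pppd_args("/dev/ttyUSB0", name)))
```

Keeping each trial to one change makes it easier to tell which option, if
any, alters the peer's behaviour.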

> Have you any suggestions for things to look for in the data sent to the
> peer that seems to trigger the event?

Well, if varying the options doesn't help, and borrowing a Windows
machine doesn't get to it, and you still really want to debug this,
then the next step would probably be to start varying the timing.
Fortunately, pppd's negotiation mechanisms are all in user space, so a
few well-placed "sleep(1);" hacks might either avoid the problem or
open the hole wide enough so that you can see it happen every time.

Do they support anything besides Windows?  If they support either
standalone routers or perhaps Macs you might be able to get something
cheap on ebay to use as a gateway or NAT.  A solution like that (if
it's supported and actually works) might be cheaper than trying to fix
whatever's wrong here ... don't forget that your time is worth
something, too.  ;-}

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>


* Re: Spontaneous LCP ConfReq after connection made
From: James Cameron @ 2005-08-15  6:57 UTC
  To: linux-ppp

On Sat, Aug 13, 2005 at 02:22:10PM -0400, James Carlson wrote:
> If it's not for a mobile station (assuming so, since you mentioned
> DSL), the one means you didn't suggest was cable.  I take it that's
> not an option.

Yes.  I'm six hours' drive from the nearest city with cable.  I'm using
satellite too, but it has a data limit (500 MB per month) whereas the CDMA
service has none, only a time limit (50 hours per month).

> Here's another possibility: I once saw a bug in a particular embedded
> implementation where they apparently accidentally left a timer running
> after setting LCP to Opened state that caused renegotiation.

I've excluded this based on the apparent random timing, but I will give
"silent" a try.

One thing I've noticed while testing over the past few days: the
renegotiation is very likely if a GRE packet for PPTP is sent over the
PPP link.

The peer provides a NAT service, and supports the use of PPTP.  PPTP
over NAT requires the peer to implement stateful inspection and
connection tracking.

Normally a PPTP tunnel runs inside an OpenVPN tunnel over the satellite
service.  When the CDMA service comes up, my if-up scripts change the
route to the PPTP tunnel server so that packets for the active tunnel go
via the link in question.  Breaks the tunnel, of course.

The presumably buggy peer receives a GRE packet out of the blue that it
cannot relate to any active connection.  It might be crashing and
restarting.
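
To check whether a packet sent just before the ConfReq is GRE, it is enough
to look at the protocol field of the IPv4 header (byte 9), where GRE is
protocol 47.  A minimal Python sketch follows; it assumes the input is a
bare IPv4 packet with any PPP framing already stripped, and the function
names are invented for this example.

```python
import struct

IPPROTO_GRE = 47  # IANA protocol number for GRE, which PPTP uses for data


def ip_protocol(packet: bytes) -> int:
    """Return the protocol field of an IPv4 packet."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    return packet[9]


def is_gre(packet: bytes) -> bool:
    return ip_protocol(packet) == IPPROTO_GRE


def sample_header(proto: int) -> bytes:
    """Build a 20-byte IPv4 header for testing (checksum left at zero)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0, 64, proto, 0,
        bytes([10, 0, 0, 1]), bytes([192, 0, 2, 1]),
    )


print(is_gre(sample_header(47)), is_gre(sample_header(6)))
```

The sample addresses are placeholders; in practice the bytes would come
from a capture of the traffic sent over the PPP link.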

It occurs to me that this might be causing problems for other users.
The service provider did say that other users had reported a similar
problem.  If I can reproduce it reliably, I'll let the service provider
know.

Thanks for the suggestion to capture the Windows client negotiation;
first I'll see if I can isolate the problem to orphan PPTP frames.

-- 
James Cameron
http://ftp.hp.com.au/sigs/jc/


* Re: Spontaneous LCP ConfReq after connection made
From: James Cameron @ 2005-08-31  5:33 UTC
  To: linux-ppp

Providing a touch more closure to this thread; since blocking certain
packets from being sent through the link, the symptom no longer occurs,
so I think the peer is indeed broken.

The ip-up script now does this;

# eth1 satellite service is 192.168.x.x
iptables --insert OUTPUT 1 --source 192.168.0.0/255.255.0.0 \
    --destination 0.0.0.0/0.0.0.0 --jump DROP \
    --out-interface ${PPP_IFACE}

# eth0 internal network is 10.0.x.x
iptables --insert OUTPUT 1 --source 10.0.0.0/255.255.0.0 \
    --destination 0.0.0.0/0.0.0.0 --jump DROP \
    --out-interface ${PPP_IFACE}

# block any PPTP VPN traffic
iptables --insert OUTPUT 1 --protocol GRE --jump DROP \
    --out-interface ${PPP_IFACE}
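
The combined effect of the three rules can be modelled in a few lines of
Python, which may help when checking which locally generated packets are
now kept off the PPP interface.  This is only a model of the rules as
written above, not of iptables itself; `would_drop` is a hypothetical
helper, and it assumes the packet is already leaving via the PPP interface.

```python
import ipaddress

# Source networks blocked by the first two rules (netmask form as above).
BLOCKED_SOURCES = [
    ipaddress.ip_network("192.168.0.0/255.255.0.0"),
    ipaddress.ip_network("10.0.0.0/255.255.0.0"),
]
IPPROTO_GRE = 47  # third rule: drop GRE, the PPTP data channel


def would_drop(source: str, protocol: int) -> bool:
    """True if an outgoing packet on the PPP interface matches a DROP rule."""
    addr = ipaddress.ip_address(source)
    if any(addr in net for net in BLOCKED_SOURCES):
        return True
    return protocol == IPPROTO_GRE


print(would_drop("192.168.1.5", 6))
print(would_drop("203.0.113.9", 47))
print(would_drop("203.0.113.9", 6))
```

Note that the second rule's mask covers only 10.0.x.x, so a source such as
10.1.0.1 is not matched by it.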

-- 
James Cameron
http://ftp.hp.com.au/sigs/jc/
