All of lore.kernel.org
 help / color / mirror / Atom feed
* pppd hangup only when traffic sent over connection
@ 2015-01-23 11:56 Matt Dooner
  2015-01-23 12:54 ` James Carlson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Matt Dooner @ 2015-01-23 11:56 UTC (permalink / raw)
  To: linux-ppp

Hello,

I'm experiencing an issue where pppd hangs up, but only when traffic
is sent over the connection. The connection is stable if it is idle,
but soon after any traffic is sent over the connection pppd hangs up.
For example, 15-30 pings or a few lines into a telnet session kills
the connection. I suspect this is the result of a lack of support for
a Link Quality Report (which is Confreq'd by the target).

As you can see in the debug about below if-down is started after the
"rcvd [LCP ConfReq id=0xf0 <mru 400> <quality lqr 00 00 01 2c> <magic
0xc0a80ea5> <pcomp> <accomp>]". This ConfReq occured after 17
successful pings over the link, and the link went down after the
ConfReq. Is this expected behaviour?

Thanks in advance,
Matt

using channel 14
Using interface ppp0
Connect: ppp0 <--> /dev/ttyO1
sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x9ad4ff1a> <pcomp> <accomp>]
rcvd [LCP ConfReq id=0xd5 <mru 400> <quality lqr 00 00 01 2c> <magic
0xc0a80ea5> <pcomp> <accomp>]
sent [LCP ConfRej id=0xd5 <quality lqr 00 00 01 2c>]
rcvd [LCP ConfRej id=0x1 <asyncmap 0x0>]
sent [LCP ConfReq id=0x2 <magic 0x9ad4ff1a> <pcomp> <accomp>]
rcvd [LCP ConfReq id=0xd6 <mru 400> <magic 0xc0a80ea5> <pcomp> <accomp>]
sent [LCP ConfAck id=0xd6 <mru 400> <magic 0xc0a80ea5> <pcomp> <accomp>]
rcvd [LCP ConfAck id=0x2 <magic 0x9ad4ff1a> <pcomp> <accomp>]
kernel does not support PPP filtering
sent [IPCP ConfReq id=0x1 <addr 0.0.0.0>]
rcvd [IPCP ConfReq id=0xd7 <addr 192.168.14.165>]
sent [IPCP ConfAck id=0xd7 <addr 192.168.14.165>]
sent [IPCP ConfReq id=0x1 <addr 0.0.0.0>]
rcvd [IPCP ConfNak id=0x1 <addr 192.168.14.166>]
sent [IPCP ConfReq id=0x2 <addr 192.168.14.166>]
sent [IPCP ConfReq id=0x2 <addr 192.168.14.166>]
rcvd [IPCP ConfAck id=0x2 <addr 192.168.14.166>]
local  IP address 192.168.14.166
remote IP address 192.168.14.165
Script /etc/ppp/ip-up started (pid 1686)
Script /etc/ppp/ip-up finished (pid 1686), status = 0x0
rcvd [LCP ConfReq id=0xf0 <mru 400> <quality lqr 00 00 01 2c> <magic
0xc0a80ea5> <pcomp> <accomp>]
Connect time 1.3 minutes.
Sent 2112 bytes, received 2344 bytes.
Script /etc/ppp/ip-down started (pid 1695)
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
sent [LCP ConfRej id=0xf0 <quality lqr 00 00 01 2c>]
Script /etc/ppp/ip-down finished (pid 1695), status = 0x0
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
LCP: timeout sending Config-Requests
Connection terminated.
Modem hangup

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: pppd hangup only when traffic sent over connection
  2015-01-23 11:56 pppd hangup only when traffic sent over connection Matt Dooner
@ 2015-01-23 12:54 ` James Carlson
  2015-01-23 13:51 ` Matt Dooner
  2015-01-23 14:14 ` James Carlson
  2 siblings, 0 replies; 4+ messages in thread
From: James Carlson @ 2015-01-23 12:54 UTC (permalink / raw)
  To: linux-ppp

On 01/23/15 06:56, Matt Dooner wrote:
> I'm experiencing an issue where pppd hangs up, but only when traffic
> is sent over the connection. The connection is stable if it is idle,
> but soon after any traffic is sent over the connection pppd hangs up.
> For example, 15-30 pings or a few lines into a telnet session kills
> the connection. I suspect this is the result of a lack of support for
> a Link Quality Report (which is Confreq'd by the target).

What type of connection is this?  Is it by any chance a 3GPP link?

I agree that the LQR request is one oddity here.  It's hard to imagine,
though, how rejection of LQR could possibly cause the link to fail.
You'd probably have to ask the administrator of that remote system about
that.

And if the refusal of that one option did in fact cause the link to
fail, I'd expect that the peer would just hang up immediately or perhaps
send an LCP Terminate-Request with some explanatory text ("sorry, I
require LQR and you can't refuse") and then hang up.  Allowing the link
to come up and then flaking out like this is deceptive behavior at best.
 Are you sure you want to connect to this peer?

The other oddities I see are the rejection of ACCM (asyncmap), which is
somewhat unusual for what appears to be a normal serial link, and the
absurdly low MRU 400.  What are they thinking?  What can one reasonably
do on an IP link with an MRU smaller than the minimum IP MTU?

It's also surprising to see a link without authentication enabled, but I
guess that's normal in your environment.  You haven't disclosed much
about what you're doing, so there's a bit of guesswork here.

There's something odd with this link.

> As you can see in the debug about below if-down is started after the
> "rcvd [LCP ConfReq id=0xf0 <mru 400> <quality lqr 00 00 01 2c> <magic
> 0xc0a80ea5> <pcomp> <accomp>]". This ConfReq occured after 17
> successful pings over the link, and the link went down after the
> ConfReq. Is this expected behaviour?

I see the local system doing the right thing with respect to the
messages it receives.  The peer appears to be behaving strangely and
you'll probably need to get in touch with the person who controls that
device for more information.

> sent [IPCP ConfReq id=0x1 <addr 0.0.0.0>]
> rcvd [IPCP ConfReq id=0xd7 <addr 192.168.14.165>]
> sent [IPCP ConfAck id=0xd7 <addr 192.168.14.165>]
> sent [IPCP ConfReq id=0x1 <addr 0.0.0.0>]
> rcvd [IPCP ConfNak id=0x1 <addr 192.168.14.166>]

Here's the first hint of some sort of communications problem.  We had to
send the IPCP Configure-Request twice before the peer responded with
that Configure-Nak.  The peer simply didn't respond the first time.
That doesn't happen if the peer is behaving normally.  (If the link has
a really long delay to it -- we can't tell because timing is omitted
here -- you'd expect to see some duplicate responses.  So this looks
like outright packet loss.)

I think the peer may be bug-ridden.  I'd take this as a sign to find
another.

> sent [IPCP ConfReq id=0x2 <addr 192.168.14.166>]
> sent [IPCP ConfReq id=0x2 <addr 192.168.14.166>]
> rcvd [IPCP ConfAck id=0x2 <addr 192.168.14.166>]

The same thing happens again here.  This isn't good at all.

> local  IP address 192.168.14.166
> remote IP address 192.168.14.165
> Script /etc/ppp/ip-up started (pid 1686)
> Script /etc/ppp/ip-up finished (pid 1686), status = 0x0
> rcvd [LCP ConfReq id=0xf0 <mru 400> <quality lqr 00 00 01 2c> <magic
> 0xc0a80ea5> <pcomp> <accomp>]

The peer requests LCP re-negotiation at this point.  That's perfectly
legal, but certainly unusual.  Yes, it causes everything to be torn down
and started over -- LCP drops out of Opened state, causing IPCP to halt.

> Connect time 1.3 minutes.
> Sent 2112 bytes, received 2344 bytes.
> Script /etc/ppp/ip-down started (pid 1695)
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> sent [LCP ConfRej id=0xf0 <quality lqr 00 00 01 2c>]

We do the right thing here: we restart LCP and reject the offered LQR
option.

> Script /etc/ppp/ip-down finished (pid 1695), status = 0x0
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
> LCP: timeout sending Config-Requests

We try multiple times to finish the re-negotiation procedure, but the
peer simply refuses to respond.  The peer is clearly broken.  Time to
find another with fewer bugs.

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: pppd hangup only when traffic sent over connection
  2015-01-23 11:56 pppd hangup only when traffic sent over connection Matt Dooner
  2015-01-23 12:54 ` James Carlson
@ 2015-01-23 13:51 ` Matt Dooner
  2015-01-23 14:14 ` James Carlson
  2 siblings, 0 replies; 4+ messages in thread
From: Matt Dooner @ 2015-01-23 13:51 UTC (permalink / raw)
  To: linux-ppp

James,

Thanks for the quick repsonse.

> What type of connection is this?  Is it by any chance a 3GPP link?
>  Are you sure you want to connect to this peer?
> It's also surprising to see a link without authentication enabled, but I guess that's normal in your environment.

The peer is a rather old wireless radio connected via RS232 and the
host is an embedded linux system in the same rack acting as a RIP
router, among other things. That should explain the lack of
authentication.

> the absurdly low MRU 400.  What are they thinking?  What can one reasonably
> do on an IP link with an MRU smaller than the minimum IP MTU?

Thanks for pointing this out. I noted that the MRU was low but it had
not occurred to me that it was actually below the minimum specified by
IP. I'll have to check with the vendor as to how they're fragmenting
the traffic. The low MRU might be related to the nature of the
wireless link, but the purpose of the device is to trasmit IP traffic
after all.

Being an RS232 link it is a bit surprising that the peer is fussy aout
the link quality. Swapping in a new cable makes no difference. We're
using ppp 2.4.5 (debian 6 package) and have tested 2.4.6 with similar
results. I have built 2.4.7 for ARM but have not been able to install
it yet. From looking at the change logs I don't expect much of a
difference in behaviour though.

If we replace the linux system with a Cisco router it works. The only
difference in the negotiation is acceptance of LQR.

I agree it looks like the peer is broken as it "gives up" and fails to
respond to the final ConfReq id=0x3.

Thanks again for the help, this is valuable information about how to
proceed from here.

Cheers,
Matt

On 23 January 2015 at 12:54, James Carlson <carlsonj@workingcode.com> wrote:
> On 01/23/15 06:56, Matt Dooner wrote:
>> I'm experiencing an issue where pppd hangs up, but only when traffic
>> is sent over the connection. The connection is stable if it is idle,
>> but soon after any traffic is sent over the connection pppd hangs up.
>> For example, 15-30 pings or a few lines into a telnet session kills
>> the connection. I suspect this is the result of a lack of support for
>> a Link Quality Report (which is Confreq'd by the target).
>
> What type of connection is this?  Is it by any chance a 3GPP link?
>
> I agree that the LQR request is one oddity here.  It's hard to imagine,
> though, how rejection of LQR could possibly cause the link to fail.
> You'd probably have to ask the administrator of that remote system about
> that.
>
> And if the refusal of that one option did in fact cause the link to
> fail, I'd expect that the peer would just hang up immediately or perhaps
> send an LCP Terminate-Request with some explanatory text ("sorry, I
> require LQR and you can't refuse") and then hang up.  Allowing the link
> to come up and then flaking out like this is deceptive behavior at best.
>  Are you sure you want to connect to this peer?
>
> The other oddities I see are the rejection of ACCM (asyncmap), which is
> somewhat unusual for what appears to be a normal serial link, and the
> absurdly low MRU 400.  What are they thinking?  What can one reasonably
> do on an IP link with an MRU smaller than the minimum IP MTU?
>
> It's also surprising to see a link without authentication enabled, but I
> guess that's normal in your environment.  You haven't disclosed much
> about what you're doing, so there's a bit of guesswork here.
>
> There's something odd with this link.
>
>> As you can see in the debug about below if-down is started after the
>> "rcvd [LCP ConfReq id=0xf0 <mru 400> <quality lqr 00 00 01 2c> <magic
>> 0xc0a80ea5> <pcomp> <accomp>]". This ConfReq occured after 17
>> successful pings over the link, and the link went down after the
>> ConfReq. Is this expected behaviour?
>
> I see the local system doing the right thing with respect to the
> messages it receives.  The peer appears to be behaving strangely and
> you'll probably need to get in touch with the person who controls that
> device for more information.
>
>> sent [IPCP ConfReq id=0x1 <addr 0.0.0.0>]
>> rcvd [IPCP ConfReq id=0xd7 <addr 192.168.14.165>]
>> sent [IPCP ConfAck id=0xd7 <addr 192.168.14.165>]
>> sent [IPCP ConfReq id=0x1 <addr 0.0.0.0>]
>> rcvd [IPCP ConfNak id=0x1 <addr 192.168.14.166>]
>
> Here's the first hint of some sort of communications problem.  We had to
> send the IPCP Configure-Request twice before the peer responded with
> that Configure-Nak.  The peer simply didn't respond the first time.
> That doesn't happen if the peer is behaving normally.  (If the link has
> a really long delay to it -- we can't tell because timing is omitted
> here -- you'd expect to see some duplicate responses.  So this looks
> like outright packet loss.)
>
> I think the peer may be bug-ridden.  I'd take this as a sign to find
> another.
>
>> sent [IPCP ConfReq id=0x2 <addr 192.168.14.166>]
>> sent [IPCP ConfReq id=0x2 <addr 192.168.14.166>]
>> rcvd [IPCP ConfAck id=0x2 <addr 192.168.14.166>]
>
> The same thing happens again here.  This isn't good at all.
>
>> local  IP address 192.168.14.166
>> remote IP address 192.168.14.165
>> Script /etc/ppp/ip-up started (pid 1686)
>> Script /etc/ppp/ip-up finished (pid 1686), status = 0x0
>> rcvd [LCP ConfReq id=0xf0 <mru 400> <quality lqr 00 00 01 2c> <magic
>> 0xc0a80ea5> <pcomp> <accomp>]
>
> The peer requests LCP re-negotiation at this point.  That's perfectly
> legal, but certainly unusual.  Yes, it causes everything to be torn down
> and started over -- LCP drops out of Opened state, causing IPCP to halt.
>
>> Connect time 1.3 minutes.
>> Sent 2112 bytes, received 2344 bytes.
>> Script /etc/ppp/ip-down started (pid 1695)
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> sent [LCP ConfRej id=0xf0 <quality lqr 00 00 01 2c>]
>
> We do the right thing here: we restart LCP and reject the offered LQR
> option.
>
>> Script /etc/ppp/ip-down finished (pid 1695), status = 0x0
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> sent [LCP ConfReq id=0x3 <asyncmap 0x0> <magic 0x49d0163a> <pcomp> <accomp>]
>> LCP: timeout sending Config-Requests
>
> We try multiple times to finish the re-negotiation procedure, but the
> peer simply refuses to respond.  The peer is clearly broken.  Time to
> find another with fewer bugs.
>
> --
> James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: pppd hangup only when traffic sent over connection
  2015-01-23 11:56 pppd hangup only when traffic sent over connection Matt Dooner
  2015-01-23 12:54 ` James Carlson
  2015-01-23 13:51 ` Matt Dooner
@ 2015-01-23 14:14 ` James Carlson
  2 siblings, 0 replies; 4+ messages in thread
From: James Carlson @ 2015-01-23 14:14 UTC (permalink / raw)
  To: linux-ppp

On 01/23/15 08:51, Matt Dooner wrote:
> James,
> 
> Thanks for the quick repsonse.
> 
>> What type of connection is this?  Is it by any chance a 3GPP link?
>>  Are you sure you want to connect to this peer?
>> It's also surprising to see a link without authentication enabled, but I guess that's normal in your environment.
> 
> The peer is a rather old wireless radio connected via RS232 and the
> host is an embedded linux system in the same rack acting as a RIP
> router, among other things. That should explain the lack of
> authentication.

Great!  Can you get debug logs from that remote system?  That's the guy
that's having trouble here.

>> the absurdly low MRU 400.  What are they thinking?  What can one reasonably
>> do on an IP link with an MRU smaller than the minimum IP MTU?
> 
> Thanks for pointing this out. I noted that the MRU was low but it had
> not occurred to me that it was actually below the minimum specified by
> IP. I'll have to check with the vendor as to how they're fragmenting
> the traffic. The low MRU might be related to the nature of the
> wireless link, but the purpose of the device is to trasmit IP traffic
> after all.

Actually, that explains the odd MRU pretty well.  If they're using
something like ALOHA, the packet sizes need to be small.  Normally,
that's intentionally hidden by the hardware using something like MNP-4,
V.42, V.120, 802.11, or similar segmentation/reassembly/error-control layer.

However, in this case, it sounds like it's just a raw radio channel.
I'm surprised it's usable.

> Being an RS232 link it is a bit surprising that the peer is fussy aout
> the link quality. Swapping in a new cable makes no difference. We're
> using ppp 2.4.5 (debian 6 package) and have tested 2.4.6 with similar
> results. I have built 2.4.7 for ARM but have not been able to install
> it yet. From looking at the change logs I don't expect much of a
> difference in behaviour though.

No.  It's not the local system that has trouble (I assume that's what
you're referring to here).  It's the remote system that's experiencing
some sort of error that causes it to renegotiate the link.

You need to find out what that error is, and the information exists only
on the peer's system.

> If we replace the linux system with a Cisco router it works. The only
> difference in the negotiation is acceptance of LQR.

Well ... that's a difference.  It's hard to tell, though, without
information from the peer whether that's really the determining
difference or if it's coincidence.

I've run into problems before that ended up being all sorts of issues
unrelated to the LCP negotiation -- such as electrical issues (RS-232
signaling voltage too low, unterminated signal wires) and in-band flow
control failure.  There are too many variables in that swap.

> I agree it looks like the peer is broken as it "gives up" and fails to
> respond to the final ConfReq id=0x3.
> 
> Thanks again for the help, this is valuable information about how to
> proceed from here.

The fact that you've got an RS-232 link is good.  You can put an
analyzer on there.  There are many to choose from (Google is your
friend).  For what it's worth, I've used this one in the past:

http://www.klos.com/products/serialview/

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-01-23 14:14 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-01-23 11:56 pppd hangup only when traffic sent over connection Matt Dooner
2015-01-23 12:54 ` James Carlson
2015-01-23 13:51 ` Matt Dooner
2015-01-23 14:14 ` James Carlson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.