linux-ppp.vger.kernel.org archive mirror
* ifconfig ppp0 errors
@ 2011-01-11 20:30 Slawomir Skret
  2011-01-11 21:04 ` James Carlson
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: Slawomir Skret @ 2011-01-11 20:30 UTC (permalink / raw)
  To: linux-ppp

Hi,

I use pppd to connect two serial bus points, i.e. I use it with the 
"local" flag and without modem controls. It all works fine, but when 
downloading files I get some errors reported by ifconfig:

ppp0      Link encap:Point-to-Point Protocol
           inet addr:192.168.1.202  P-t-P:192.168.1.201  Mask:255.255.255.255
           UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
           RX packets:3662 errors:381 dropped:0 overruns:0 frame:0
           TX packets:3536 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:3
           RX bytes:5454442 (5.2 MiB)  TX bytes:203944 (199.1 KiB)

I tried the "debug" and "kdebug" flags when starting pppd, but I get 
only configuration-related info and nothing during runtime that would 
tell me why the errors occurred or what they mean. From what I have 
read, ifconfig reports the contents of the /proc/net/dev file, which is 
updated by the kernel modules, but that is again only a statistics 
report that does not describe the nature of the errors.
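
For reference, the counters ifconfig prints can also be read directly from
/proc/net/dev. The following is a minimal Python sketch of such a reader,
not part of the original message. It assumes the usual Linux column order
given by the file's two header lines, and the SAMPLE text is constructed by
hand from the ifconfig listing above rather than captured from a real system.

```python
# Sketch: parse /proc/net/dev-style text into per-interface counters.
# Field order follows the two header lines the kernel prints; SAMPLE is
# hand-built from the ifconfig output above, not captured data.

RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "frame",
             "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "colls",
             "carrier", "compressed"]


def parse_net_dev(text):
    """Return {iface: {"rx": {...}, "tx": {...}}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:        # skip the two header lines
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        values = [int(v) for v in counters.split()]
        rx = dict(zip(RX_FIELDS, values[:8]))
        tx = dict(zip(TX_FIELDS, values[8:16]))
        stats[name.strip()] = {"rx": rx, "tx": tx}
    return stats


SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  ppp0: 5454442    3662  381    0    0     0          0         0   203944    3536    0    0    0     0       0          0
"""

if __name__ == "__main__":
    ppp0 = parse_net_dev(SAMPLE)["ppp0"]
    print("rx errs:", ppp0["rx"]["errs"], "rx frame:", ppp0["rx"]["frame"])
```

On a live system the same function could be given the contents of
/proc/net/dev itself. As noted above, this only exposes the counts, not the
reason each error was counted.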

How can I get more information about these errors?

Thanks,
Swavek



* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
@ 2011-01-11 21:04 ` James Carlson
  2011-01-11 21:57 ` Slawomir Skret
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: James Carlson @ 2011-01-11 21:04 UTC (permalink / raw)
  To: linux-ppp

Slawomir Skret wrote:
> I tried to use the "debug" and "kdebug" flags when starting pppd but get
> only the configuration related info but nothing during the runtime that
> would tell me why the errors occurred or what do they mean. From what I
> read, the ifconfig reports the contents of the /proc/net/dev file which
> is updated by the kernel modules but again it is a statistics reporter
> only without describing the nature of the errors.
> 
> How can I get more information about these errors?

How did you use "kdebug?"  It doesn't work quite the same way as "debug"
-- it takes an argument, and on most platforms, the argument is an
integer interpreted as a set of flags representing the debug information
to enable.  "kdebug 7" usually turns everything on.
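
To make the "set of flags" reading concrete, here is a small Python sketch
that decodes a kdebug value. The bit meanings are an assumption taken from
the pppd(8) description of the Linux driver (1 = general messages, 2 =
received packet contents, 4 = transmitted packet contents). Other platforms
may interpret the argument differently, so treat the table as illustrative.

```python
# Sketch: decode a pppd "kdebug" argument into the flags it enables.
# Bit meanings follow the pppd(8) text for the Linux driver (assumption);
# other kernel drivers may define the value differently.

KDEBUG_BITS = {
    1: "general kernel debug messages",
    2: "print contents of received packets",
    4: "print contents of transmitted packets",
}


def decode_kdebug(value):
    """Return the list of debug features selected by a kdebug value."""
    return [desc for bit, desc in sorted(KDEBUG_BITS.items()) if value & bit]


if __name__ == "__main__":
    for v in (1, 6, 7):
        print(v, "->", decode_kdebug(v))
```

Under that reading, 7 is simply 1 + 2 + 4, which is why it usually turns
everything on.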

The "debug" option is unlikely to help when you're talking about basic
I/O errors.  The "debug" option causes the system to log the details of
the PPP negotiation between the peers, but this usually has little to do
with I/O problems.

And where did you look for the messages generated by those options?
/etc/syslog.conf directs these things to files depending on the origin
of the message, and the severity.  You may need to modify
/etc/syslog.conf (and SIGHUP syslogd) to see everything.

Did you try using the "pppstats" command?  If the errors are related to
data compression or the like, then pppstats may well give you more
details than kdebug.  At least, I'd use pppstats to rule out other problems.

What options are you using?  Have you tried disabling data compression
with "noccp"?

How fast does your "serial bus" run?  Is it likely to drop data when
there are bursts -- as you might see with packet-oriented networking
protocols, such as PPP?

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
  2011-01-11 21:04 ` James Carlson
@ 2011-01-11 21:57 ` Slawomir Skret
  2011-01-11 22:50 ` Slawomir Skret
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Slawomir Skret @ 2011-01-11 21:57 UTC (permalink / raw)
  To: linux-ppp

Hi James,

Please see my responses inline.

Thanks,
Swavek

On Tue, 11 Jan 2011, James Carlson wrote:

> Slawomir Skret wrote:
>> I tried to use the "debug" and "kdebug" flags when starting pppd but get
>> only the configuration related info but nothing during the runtime that
>> would tell me why the errors occurred or what do they mean. From what I
>> read, the ifconfig reports the contents of the /proc/net/dev file which
>> is updated by the kernel modules but again it is a statistics reporter
>> only without describing the nature of the errors.
>>
>> How can I get more information about these errors?
>
> How did you use "kdebug?"  It doesn't work quite the same way as "debug"
> -- it takes an argument, and on most platforms, the argument is an
> integer interpreted as a set of flags representing the debug information
> to enable.  "kdebug 7" usually turns everything on.

I used kdebug with various arguments, as you mentioned, including 7.

>
> The "debug" option is unlikely to help when you're talking about basic
> I/O errors.  The "debug" option causes the system to log the details of
> the PPP negotiation between the peers, but this usually has little to do
> with I/O problems.

I noticed.

>
> And where did you look for the messages generated by those options?
> /etc/syslog.conf directs these things to files depending on the origin
> of the message, and the severity.  You may need to modify
> /etc/syslog.conf (and SIGHUP syslogd) to see everything.
>

I adjusted /etc/syslog.conf and restarted syslogd. I get 
configuration-related logs at the start and end of the ppp0 session, 
which is how I know that it works. However, nothing is logged during 
runtime, when, I presume, the errors happen.

> Did you try using the "pppstats" command?  If the errors are related to
> data compression or the like, then pppstats may well give you more
> details than kdebug.  At least, I'd use pppstats to rule out other problems.

I do not have pppstats compiled, but I will build it and use it, as you 
suggested, to rule out compression issues. I will also run pppd with the 
noccp flag to see if it makes any difference.

>
> What options are you using?  Have you tried disabling data compression
> with "noccp"?

The server side:
pppd passive local 192.168.1.201:192.168.1.202 /dev/ttyCPM3 maxfail 0

and the client:
pppd local /dev/ttyCPM1

>
> How fast does your "serial bus" run?  Is it likely to drop data when
> there are bursts -- as you might see with packet-oriented networking
> protocols, such as PPP?

The serial bus has an 8 MHz clock.

>
> -- 
> James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>
>


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
  2011-01-11 21:04 ` James Carlson
  2011-01-11 21:57 ` Slawomir Skret
@ 2011-01-11 22:50 ` Slawomir Skret
  2011-01-11 23:30 ` James Cameron
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Slawomir Skret @ 2011-01-11 22:50 UTC (permalink / raw)
  To: linux-ppp

Running the server with the noccp flag does not change the error count 
reported by ifconfig.

Now, here is the ifconfig from the server:

ppp0      Link encap:Point-to-Point Protocol
           inet addr:192.168.1.201  P-t-P:192.168.1.202  Mask:255.255.255.255
           UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
           RX packets:3319 errors:0 dropped:0 overruns:0 frame:0
           TX packets:3994 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:3
           RX bytes:191000 (186.5 KiB)  TX bytes:5948354 (5.6 MiB)

and the matching from the client:

ppp0      Link encap:Point-to-Point Protocol
           inet addr:192.168.1.202  P-t-P:192.168.1.201  Mask:255.255.255.255
           UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
           RX packets:3664 errors:322 dropped:0 overruns:0 frame:0
           TX packets:3319 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:3
           RX bytes:5454546 (5.2 MiB)  TX bytes:191000 (186.5 KiB)

The data is transferred from the server to the client, and the client shows 
322 errors, which is the delta between the server TX and client RX packet 
count.
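
As a quick check of the arithmetic, the following Python sketch uses only
the counters quoted in the two listings above. Taking those numbers at face
value, the TX/RX gap comes out at 330 packets, which is close to but not
exactly the 322 errors reported. The function name is made up for
illustration.

```python
# Sketch: compare the counters quoted in the two ifconfig listings above.
server_tx_packets = 3994
client_rx_packets = 3664
client_rx_errors = 322


def loss_summary(tx, rx, errors):
    """Return (missing packets, errors as a fraction of packets sent)."""
    missing = tx - rx
    return missing, errors / tx


if __name__ == "__main__":
    missing, rate = loss_summary(server_tx_packets, client_rx_packets,
                                 client_rx_errors)
    print("missing packets:", missing)            # 330
    print("error rate: %.1f%%" % (rate * 100))
```

Either way, roughly 8% of the packets sent by the server are affected.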

Running tcpdump on both the server and the client shows that the client 
never receives some TCP packets that the server shows as sent, and the 
client requests that they be resent. However, TCP on the client never 
sees these packets, and it is not clear what happened to them or what 
errors they had.

Can you suggest anything to shed more light onto what could be causing 
these errors?

Thanks,
Swavek



* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (2 preceding siblings ...)
  2011-01-11 22:50 ` Slawomir Skret
@ 2011-01-11 23:30 ` James Cameron
  2011-01-12  2:29 ` Paul Mackerras
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: James Cameron @ 2011-01-11 23:30 UTC (permalink / raw)
  To: linux-ppp

On Tue, Jan 11, 2011 at 05:50:23PM -0500, Slawomir Skret wrote:
> The data is transfered from the server to the client, and the client
> shows 322 errors, which is the delta between the server TX and
> client RX packet count.

Kernel source drivers/net/ppp_generic, looking at increments of
ppp->dev->stats.rx_errors ... only happens in ppp_receive_error, which
is called in several situations, some of which include a kernel message
being emitted:

- zero length skb,

- compression used by peer without compression enabled at host,

- compression not used by peer with compression enabled at host,

- no memory on VJ decompression, (KERN_ERR level message),

- VJ decompression error, (KERN_DEBUG level message),

- non linearity just prior to SLHC decompression,

- no memory during filtering.

So I don't think it will be easy to isolate unless you find a kernel
message or can prove inconsistent use of compression, or lack of the
right kind of memory.

As well as James, I'd be interested in the pppstats output.

You might also try "debug dump" on both pppd and cross check the
negotiations.

-- 
James Cameron
http://quozl.linux.org.au/


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (3 preceding siblings ...)
  2011-01-11 23:30 ` James Cameron
@ 2011-01-12  2:29 ` Paul Mackerras
  2011-01-12 13:21 ` James Carlson
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Paul Mackerras @ 2011-01-12  2:29 UTC (permalink / raw)
  To: linux-ppp

On Tue, Jan 11, 2011 at 05:50:23PM -0500, Slawomir Skret wrote:

> Can you suggest anything to shed more light onto what could be
> causing these errors?

pppstats, like James said.


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (4 preceding siblings ...)
  2011-01-12  2:29 ` Paul Mackerras
@ 2011-01-12 13:21 ` James Carlson
  2011-01-12 18:40 ` Slawomir Skret
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: James Carlson @ 2011-01-12 13:21 UTC (permalink / raw)
  To: linux-ppp

On 01/11/11 16:57, Slawomir Skret wrote:
> On Tue, 11 Jan 2011, James Carlson wrote:
>> What options are you using?  Have you tried disabling data compression
>> with "noccp"?
> 
> The server side:
> pppd passive local 192.168.1.201:192.168.1.202 /dev/ttyCPM3 maxfail 0
> 
> and the client:
> pppd local /dev/ttyCPM1

OK; sounds straightforward.

Since data loss under stress is suspected here, and data compression
adds stress, I'd recommend "noccp novj".  Turn off the "complicated"
compression mechanisms.  (Leave on ACFC and PFC; they're simple.)

>> How fast does your "serial bus" run?  Is it likely to drop data when
>> there are bursts -- as you might see with packet-oriented networking
>> protocols, such as PPP?
> 
> The serial bus has 8MHz clock.

That doesn't sound terribly fast for a modern machine, but I do think
that running a link that lacks either a flow control mechanism (as with
most async links) or a native framing mechanism (as with most
synchronous links) is bad karma.  You're just out looking for a problem.

If the serial interface has an HDLC mode, I'd use it, and tell PPP that
it has a "sync" device.

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (5 preceding siblings ...)
  2011-01-12 13:21 ` James Carlson
@ 2011-01-12 18:40 ` Slawomir Skret
  2011-01-12 19:47 ` James Carlson
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Slawomir Skret @ 2011-01-12 18:40 UTC (permalink / raw)
  To: linux-ppp

Hi James,

I added a printk statement to ppp_receive_frame that prints a message 
just before the call to ppp_receive_error(), rebuilt the kernel, and got 
that message once for every error counted. What does that mean? Is there 
anything I can turn on or instrument to tell what is wrong with these 
frames?

Also, I built pppstats and ran it:

/var # ./pppstats
       IN   PACK VJCOMP  VJUNC  VJERR  |      OUT   PACK VJCOMP  VJUNC NON-VJ
  2351664   1587      0      0      0  |    78364   1465      0      0   1465

but it does not report these errors.

Thanks,
Swavek



* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (6 preceding siblings ...)
  2011-01-12 18:40 ` Slawomir Skret
@ 2011-01-12 19:47 ` James Carlson
  2011-01-12 21:05 ` James Cameron
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: James Carlson @ 2011-01-12 19:47 UTC (permalink / raw)
  To: linux-ppp

On 01/12/11 13:40, Slawomir Skret wrote:
> I added an instrumental printk statement in the ppp_receive_frame to
> print a message before the ppp_receive_error(), rebuilt the kernel, and
> got these prints for every error count reported. What does that mean? Is
> there anything I can do/turn on/instrument/etc to tell what's wrong with
> these frames?

It means that your hardware is destroying the data in flight.  There's
not much that PPP can do about that except detect the errors (using the
Frame Check Sequence -- a CRC-16 scheme) and discard the packets that
are affected.
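
To illustrate how the Frame Check Sequence catches damaged frames, here is
a minimal Python sketch of the 16-bit FCS as described in RFC 1662
(reflected polynomial 0x8408, initial value 0xFFFF, "good" residue 0xF0B8).
It is a bitwise version written for clarity, not the table-driven code the
kernel uses, and the helper names are made up for this example.

```python
# Sketch: PPP 16-bit FCS per RFC 1662, bitwise rather than table-driven.

PPP_INIT_FCS16 = 0xFFFF   # initial FCS value
PPP_GOOD_FCS16 = 0xF0B8   # residue left after running over data + FCS


def fcs16_update(fcs, data):
    """Fold the bytes of data into a running FCS value."""
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            if fcs & 1:
                fcs = (fcs >> 1) ^ 0x8408
            else:
                fcs >>= 1
    return fcs


def append_fcs(payload):
    """Return payload followed by its FCS, least significant octet first."""
    fcs = fcs16_update(PPP_INIT_FCS16, payload) ^ 0xFFFF
    return payload + bytes([fcs & 0xFF, fcs >> 8])


def frame_is_good(frame):
    """True if the frame (payload plus trailing FCS) checks out."""
    return fcs16_update(PPP_INIT_FCS16, frame) == PPP_GOOD_FCS16


if __name__ == "__main__":
    frame = append_fcs(b"\xff\x03\x00\x21" + b"example payload")
    print("intact frame good:", frame_is_good(frame))
    flipped = bytes([frame[0] ^ 0x01]) + frame[1:]
    print("one bit flipped good:", frame_is_good(flipped))
    shortened = frame[:6] + frame[7:]      # one byte lost in mid-frame
    print("one byte dropped good:", frame_is_good(shortened))
```

A receiver that sees a bad residue can only drop the frame and count an
error; the FCS says that the frame is damaged, not how it got that way.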

You could use the "record filename" option to capture the actual raw
data.  That interposes a pty pair, so it's not completely transparent.
To do transparent analysis, you'll need an external serial analyzer.
Several vendors (such as HP) make such machines.  They're by no means
cheap, but if you're doing real hardware development, there's no
substitute for having good test gear.

Reasonable operation of PPP assumes a lower layer that has only a modest
-- and hopefully not traffic-sensitive -- error rate.  It doesn't have
to be completely error free, but even a small error rate will have a
relatively large effect on usability and performance of the link.  (And
systematic errors, such as [say] always discarding the 256th byte of
packets with 256 or more bytes, will make the link effectively useless
for ordinary networking purposes.)

(This isn't really a characteristic of PPP itself, but rather of any
interface intended for datagram networking purposes.  Error rates other
than "a little" are bad news.)

> Also, I built the pppstats and run it:
> 
> /var # ./pppstats
>       IN   PACK VJCOMP  VJUNC  VJERR  |      OUT   PACK VJCOMP  VJUNC NON-VJ
>  2351664   1587      0      0      0  |    78364   1465      0      0   1465
> 
> but it does not report these errors.

That indicates that there are no significant errors above the framing
level.  That's good, as it likely means there are no difficult software
problems to solve.

It's merely a matter of inadequate hardware.

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (7 preceding siblings ...)
  2011-01-12 19:47 ` James Carlson
@ 2011-01-12 21:05 ` James Cameron
  2011-01-12 21:23 ` James Carlson
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: James Cameron @ 2011-01-12 21:05 UTC (permalink / raw)
  To: linux-ppp

I agree with James, it smells like hardware.

You might compare clock cycle, bit or byte counters at either end.

You might check for sensitivity to specific data patterns:

- all zero (00000000 repeating), 

- all ones (11111111 repeating), 

- proportion of bits set (00000001, then 00000011, then 00000111, etc), 

- full byte range sequences (00000000 through to 11111111), 

- random numbers (a file of them for cross checking), 

- long duration patterns that might cause lower frequency effects
  (00000000 repeating for 10ms, then 11111111 for 10ms, then change the
  low frequency).
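
The patterns listed above could be generated with something like the
following Python sketch. The function names are made up for illustration,
and the "half_period" argument is expressed in bytes, since converting
10 ms into a byte count depends on the actual line rate, which is not
stated in this thread.

```python
# Sketch: generators for the test patterns listed above.
import random


def all_zeros(n):
    """00000000 repeating."""
    return bytes(n)


def all_ones(n):
    """11111111 repeating."""
    return b"\xff" * n


def bit_fill(n):
    """00000001, 00000011, 00000111, ... 11111111, then repeat."""
    steps = [(1 << k) - 1 for k in range(1, 9)]
    return bytes(steps[i % 8] for i in range(n))


def byte_ramp(n):
    """Full byte range, 00000000 through 11111111, repeating."""
    return bytes(i % 256 for i in range(n))


def pseudo_random(n, seed=0):
    """Reproducible random bytes, so both ends can cross-check."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(n))


def slow_square(n, half_period):
    """half_period bytes of 0x00, then half_period bytes of 0xFF, repeating."""
    return bytes(0x00 if (i // half_period) % 2 == 0 else 0xFF
                 for i in range(n))


if __name__ == "__main__":
    for name, data in [("zeros", all_zeros(8)), ("ones", all_ones(8)),
                       ("fill", bit_fill(8)), ("ramp", byte_ramp(8)),
                       ("random", pseudo_random(8)),
                       ("square", slow_square(8, 2))]:
        print("%-7s %s" % (name, data.hex()))
```

Writing each pattern to a file on the sending side and regenerating it on
the receiving side (same seed for the random case) gives a known-good copy
to compare against.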

You might test by excluding pppd, such as using cat, diff, and md5sum.

If you've passed all these tests already, then you should find out from
the kernel why the error count is being incremented.

-- 
James Cameron
http://quozl.linux.org.au/


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (8 preceding siblings ...)
  2011-01-12 21:05 ` James Cameron
@ 2011-01-12 21:23 ` James Carlson
  2011-01-12 21:51 ` James Cameron
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: James Carlson @ 2011-01-12 21:23 UTC (permalink / raw)
  To: linux-ppp

On 01/12/11 16:05, James Cameron wrote:
> I agree with James, it smells like hardware.
> 
> You might compare clock cycle, bit or byte counters at either end.
> 
> You might check for sensitivity to specific data patterns;

Pattern sensitivity is certainly an interesting area.  I think the ones
you've mentioned would tend to trip up poorly-designed hardware -- i.e.,
hardware that has trouble with DC restoration or noise coupled into
ground, or similar sorts of issues.  They're certainly interesting tests
to try.

I think the most prosaic cause of this sort of problem -- given the
evidence so far, which (if I recall correctly) started with OK behavior
until large data transfers were attempted -- is simple overflow.  Either
the system is just unable to keep up with the interrupt load, or
something's blocking the interrupts that do happen, or the latency into
the service routine is greater than the depth of the hardware input buffer.

An overrun would cause lost bytes in the middle of the frame, and cause
frames to be discarded as corrupt when the FCS check fails, and it'd
be aggravated by larger packet sizes, by a shorter interval between
packets, and by a system busy doing other things (such as talking to
disk).  All of those seem to line up well with the original problem report.
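
A back-of-the-envelope version of the overrun argument, as a Python
sketch. Every number here is an assumption: the thread only says the bus
has an 8 MHz clock, so one bit per clock is assumed, and the 16-byte input
buffer is a hypothetical figure chosen purely for illustration.

```python
# Sketch: how long the receiver may ignore the port before bytes are lost.
# Bit rate and FIFO depth are illustrative assumptions, not measured values.


def byte_time_us(bit_rate_hz, bits_per_byte=8):
    """Microseconds between successive bytes at full line rate."""
    return bits_per_byte * 1e6 / bit_rate_hz


def max_latency_us(bit_rate_hz, fifo_depth_bytes, bits_per_byte=8):
    """Longest service delay before a FIFO of the given depth overruns."""
    return fifo_depth_bytes * byte_time_us(bit_rate_hz, bits_per_byte)


if __name__ == "__main__":
    rate = 8_000_000          # assumes one bit per 8 MHz clock cycle
    depth = 16                # hypothetical hardware input buffer, in bytes
    print("byte time: %.2f us" % byte_time_us(rate))
    print("latency budget: %.2f us" % max_latency_us(rate, depth))
```

Under those assumptions a back-to-back 1500-byte packet leaves only a few
microseconds of slack per buffer-full, which is the kind of budget a busy
system can miss.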

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (9 preceding siblings ...)
  2011-01-12 21:23 ` James Carlson
@ 2011-01-12 21:51 ` James Cameron
  2011-01-12 22:17 ` James Carlson
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: James Cameron @ 2011-01-12 21:51 UTC (permalink / raw)
  To: linux-ppp

On Wed, Jan 12, 2011 at 04:23:29PM -0500, James Carlson wrote:
> I think the most prosaic cause of this sort of problem -- given the
> evidence so far, which (if I recall correctly) started with OK behavior
> until large data transfers were attempted -- is simple overflow.

Yes, that seems most likely.  Overflow can be tested for as well though,
by sending large amounts of data over the link, in the absence of PPP.
Careful counting will show what is lost.
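
A minimal Python sketch of that kind of counting follows, assuming the
sent and received data have each been captured to a byte string (for
example by reading the two files). The function and field names are made
up for this example, and SHA-256 is used for the digest; md5sum from the
shell does the same job.

```python
# Sketch: compare what was sent with what arrived, without PPP involved.
import hashlib


def compare_streams(sent, received):
    """Return a small report on how received differs from sent."""
    first_diff = None
    for i, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            first_diff = i
            break
    if first_diff is None and len(sent) != len(received):
        # One stream is a prefix of the other.
        first_diff = min(len(sent), len(received))
    same_digest = (hashlib.sha256(sent).digest()
                   == hashlib.sha256(received).digest())
    return {
        "sent_bytes": len(sent),
        "received_bytes": len(received),
        "missing_bytes": len(sent) - len(received),
        "first_difference": first_diff,
        "digest_match": same_digest,
    }


if __name__ == "__main__":
    sent = bytes(range(256)) * 4
    received = sent[:300] + sent[303:]     # simulate three lost bytes
    for key, value in compare_streams(sent, received).items():
        print(key, "=", value)
```

The offset of the first difference is often the most useful number, since
a loss that always lands at the same offset points at a systematic cause
rather than random noise.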

Although, if the link itself has no flow control, then some loss will
always be a possibility.  10% of packets affected by loss seems a bit
high, but I've no idea what the original poster's design threshold is.

-- 
James Cameron
http://quozl.linux.org.au/


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (10 preceding siblings ...)
  2011-01-12 21:51 ` James Cameron
@ 2011-01-12 22:17 ` James Carlson
  2011-01-14 16:58 ` Slawomir Skret
  2011-01-14 17:28 ` James Carlson
  13 siblings, 0 replies; 15+ messages in thread
From: James Carlson @ 2011-01-12 22:17 UTC (permalink / raw)
  To: linux-ppp

On 01/12/11 16:51, James Cameron wrote:
> On Wed, Jan 12, 2011 at 04:23:29PM -0500, James Carlson wrote:
>> I think the most prosaic cause of this sort of problem -- given the
>> evidence so far, which (if I recall correctly) started with OK behavior
>> until large data transfers were attempted -- is simple overflow.
> 
> Yes, that seems most likely.  Overflow can be tested for as well though,
> by sending large amounts of data over the link, in the absence of PPP.
> Careful counting will show what is lost.
> 
> Although, if the link itself has no flow control, then some loss will
> always be a possibility.  10% of packets affected by loss seems a bit
> high, but I've no idea what the original poster's design threshold is.

A possibly-helpful debug idea: while chasing a problem that ended up
being a bug inside a special optimization case in one vendor's TCP (!),
I wrote a little state machine in the kernel to detect the start of some
recognizable data (the standard "ABCD..." from chargen), and then
continue checking successive messages until a miscompare.  I panicked
the system on miscompare, and was able to chase down the bug using a
post-mortem debugger.

Such debug tools are possibly less available in the original poster's
environment, but the technique (modify the driver to detect test data
and assert its goodness) might still be useful in catching the event.
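
The technique can be sketched in user space as well. The Python below is a
simplified illustration, not the kernel code described above: it waits for
a recognizable marker ("ABCD"), then checks that every following byte is
the next one in a repeating run of printable ASCII, and records the offset
of the first miscompare. The continuous rotation used here is a
simplification and does not reproduce the exact line layout that chargen
emits; all names are made up for this example.

```python
# Sketch: state machine that locks onto known test data and flags the
# first byte that is not what should come next.

PATTERN = bytes(range(0x20, 0x7F))    # 95 printable ASCII characters
MARKER = b"ABCD"                      # recognizable start of test data


class PatternChecker:
    def __init__(self):
        self.expected = None          # None while searching for MARKER
        self.window = b""             # last few bytes seen while searching
        self.offset = 0               # bytes consumed so far
        self.miscompare_at = None     # offset of first bad byte, if any

    @staticmethod
    def _next(byte):
        """Byte that should follow the given one in the rotation."""
        return 0x20 + (byte - 0x20 + 1) % len(PATTERN)

    def feed(self, chunk):
        """Consume a chunk; return the miscompare offset or None."""
        for byte in chunk:
            if self.miscompare_at is not None:
                break                 # stay stopped at the first failure
            if self.expected is None:
                self.window = (self.window + bytes([byte]))[-len(MARKER):]
                if self.window == MARKER:
                    self.expected = self._next(byte)
            elif byte == self.expected:
                self.expected = self._next(byte)
            else:
                self.miscompare_at = self.offset
            self.offset += 1
        return self.miscompare_at


if __name__ == "__main__":
    stream = PATTERN * 3
    damaged = stream[:100] + stream[101:]      # lose one byte
    print("clean:", PatternChecker().feed(stream))
    print("damaged:", PatternChecker().feed(damaged))
```

Where the kernel version panicked on miscompare, a user-space check like
this can simply stop and report the offset, which is usually enough to
line the event up against a capture.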

Breakpointing when something doesn't happen (i.e., some bytes not
received) is sometimes hard.  ;-}

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>


* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (11 preceding siblings ...)
  2011-01-12 22:17 ` James Carlson
@ 2011-01-14 16:58 ` Slawomir Skret
  2011-01-14 17:28 ` James Carlson
  13 siblings, 0 replies; 15+ messages in thread
From: Slawomir Skret @ 2011-01-14 16:58 UTC (permalink / raw)
  To: linux-ppp

Hi James,

I ran pppd at both ends with the record flag. It shows that the sent and 
received files differ, and that the receiving file includes frames 
discarded either because of a bad FCS residue or because the frame was 
larger than the MRU. Both point to problems below the PPP layer.

I very much appreciate your comments and prompt responses.

One comment I have is that it would be useful to be able to retrieve the 
actual cause of the errors, with pppstats or in some other way, from the 
ppp driver. When they show up in ifconfig it is not clear what is causing 
them, and it takes some work to get to the actual cause.

Thanks again,
Swavek




* Re: ifconfig ppp0 errors
  2011-01-11 20:30 ifconfig ppp0 errors Slawomir Skret
                   ` (12 preceding siblings ...)
  2011-01-14 16:58 ` Slawomir Skret
@ 2011-01-14 17:28 ` James Carlson
  13 siblings, 0 replies; 15+ messages in thread
From: James Carlson @ 2011-01-14 17:28 UTC (permalink / raw)
  To: linux-ppp

Slawomir Skret wrote:
> One comment that I have is that it would be useful to be able to
> retrieve the actual cause of the errors with the pppstats or some other
> way from the ppp driver. When they show up in the ifconfig it is not
> clear what's causing them and it requires some work to get to the actual
> cause.

The information you get from ifconfig -- the count of input errors -- is
essentially all that the driver knows.  It knows that it's getting bad
frames, but it can't really know _why_ they're bad.  That takes a human.

(It'd be possible to report some more details about edge cases here, and
some of the implementations do just that, but low-level problems that
drop hunks of data are, I think, inherently hard for a PPP driver to
diagnose on its own.)

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>

