linux-kernel.vger.kernel.org archive mirror
* UPD: Frequent/consistent panics in 2.4.19 at ip_route_input_slow, in_dev_get(dev)
@ 2002-10-21  3:02 Alain Fauconnet
  2002-10-21 16:40 ` kuznet
  0 siblings, 1 reply; 7+ messages in thread
From: Alain Fauconnet @ 2002-10-21  3:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alan Cox, Alexey Kuznetsov, Vitaly E.Lavrov

Hello,

I'm updating this information: the box that was upgraded to 2.4.19
crashes as well, at exactly *the same* instruction.

I'm not sure what to do next: AFAIK the kernel crash dump patches are
not available for 2.4.19 yet, so I can't generate a core dump on this
box. Since it's definitely the same problem, I'd think that collecting
a full crash core dump under 2.4.17 or 2.4.18 would help, wouldn't it?

As I wrote below, these boxes have 200+ ppp interfaces up all the
time. Such interfaces come and go many times per minute as users
connect and disconnect. I really suspect a race condition on the net
interface structures, and the comments in ./net/ipv4/route.c imply
that such race conditions have been identified and fixed in the past.

What I can't explain is why these servers used to be quite stable.
Their load has increased, sure, but slowly. At first sight it really
looked as if some specific traffic (a kind of DoS attack) was causing
the crashes, but the crash context doesn't quite back up this guess.

I'm Cc'ing this message to the people whose names appear in the
route.c file, as per the kernel crash "howto" docs. I hope this won't
get me flamed.

Greets,
_Alain_

----- Forwarded message from Alain Fauconnet <alain@cscoms.net> -----

Date: Fri, 18 Oct 2002 10:38:49 +0700
From: Alain Fauconnet <alain@cscoms.net>
To: linux-kernel@vger.kernel.org
Subject: Frequent/consistent panics in 2.4.17/.18 at ip_route_input_slow, in_dev_get(dev) 

Hello all,

I've been getting frequent and consistent panics on three different
boxes running 2.4.17 and 2.4.18 kernels, always at the same exact
instruction, an "incl 0x4(%edx)" at the beginning of
ip_route_input_slow().

The boxes are very busy PPTP (PoPToP) servers on brand-new single-CPU
(P3, P4) SCSI-based hardware (2 x IBM, 1 x Asus MB).

Example for a 2.4.17 kernel:

Unable to handle kernel paging request at virtual address f73cdc48
Oops = 0002
EIP = 0010:c0223ff9 not tainted

(gdb) x/10i 0xc0223ff9
0xc0223ff9 <ip_route_input_slow+33>:    incl   0x4(%edx)

EAX = 0 EBX = 410c0100 ECX = f69a30ac EDX = f73cdc44

The indexed access through EDX is causing the fault

ESI = ccb23fc6 EDI = d71a10ac EBP =  c583b0a0
ESP = d01d1e6c

DS = 0010 ES = 0018 SS = 0018

Process varies, in this case pptpctrl.

Stack: 410c0100 ccb23fcb d71a10ac c583b0ad 0000003d 

Call trace:

c023f91b <raw_recvmsg+111>
c022491a <ip_route_input+338>
c0226834 <ip_rcv_finish+40>
d88251ae ?
d8825180 ?
c0117867 <update_wall_time+11>
c0117c42 <timer_bh+546>
...

Sorry, this partial data was hand-written in haste; I *really* had to
reboot this production box immediately.

Looking  at  the code in net/ipv4/route.c, I strongly suspect that the
offending line is:

int ip_route_input_slow(struct sk_buff *skb, u32 daddr, u32 saddr,
                        u8 tos, struct net_device *dev)
{
        struct rt_key   key;
        struct fib_result res;
------> struct in_device *in_dev = in_dev_get(dev); <-------

Because that's quite close to the beginning of the function:

0xc0223fd8 <ip_route_input_slow>:       sub    $0x50,%esp
0xc0223fdb <ip_route_input_slow+3>:     push   %ebp
0xc0223fdc <ip_route_input_slow+4>:     push   %edi
0xc0223fdd <ip_route_input_slow+5>:     push   %esi
0xc0223fde <ip_route_input_slow+6>:     push   %ebx
0xc0223fdf <ip_route_input_slow+7>:     mov    0x74(%esp,1),%edx
0xc0223fe3 <ip_route_input_slow+11>:    mov    0x70(%esp,1),%eax
0xc0223fe7 <ip_route_input_slow+15>:    mov    %al,0x2f(%esp,1)
0xc0223feb <ip_route_input_slow+19>:    mov    0xac(%edx),%edx
0xc0223ff1 <ip_route_input_slow+25>:    mov    %edx,0x24(%esp,1)
0xc0223ff5 <ip_route_input_slow+29>:    test   %edx,%edx
0xc0223ff7 <ip_route_input_slow+31>:    je     0xc0223ffc <ip_route_input_slow+36>
0xc0223ff9 <ip_route_input_slow+33>:    incl   0x4(%edx) <----- offending instruction

and in_dev_get() inlines to:

static __inline__ struct in_device *
in_dev_get(const struct net_device *dev)
{
        struct in_device *in_dev;

        read_lock(&inetdev_lock);
        in_dev = dev->ip_ptr;
        if (in_dev)
                atomic_inc(&in_dev->refcnt); <---- inlines to "lock incl %0"
        read_unlock(&inetdev_lock);
        return in_dev;
}

So unless I am seriously misled, we fault when trying to bump the
in_dev reference count up. What could that mean? This box is a busy
PPTP (PoPToP) server usually serving 200-350 sessions at once, with a
lot of connections/disconnections per minute. Could it be a race
condition that causes the in_dev structure to go away before we
reference it? I'm not familiar enough with the Linux kernel to
investigate further.
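
For context, here is roughly what the release side of that reference
count looks like, as I understand it from include/linux/inetdevice.h
and net/ipv4/devinet.c (paraphrased from memory, so the exact names
and details are my assumption, not a quote):

/* Sketch of the in_device release path (2.4-era, from memory --
 * treat the names and details as assumptions, not a verbatim quote).
 * When the last reference is dropped, the structure is ultimately
 * kfree()d, so a stale pointer to it would land on freed memory. */
static __inline__ void in_dev_put(struct in_device *idev)
{
        if (atomic_dec_and_test(&idev->refcnt))
                in_dev_finish_destroy(idev);    /* ends in kfree(idev) */
}

This is only meant to illustrate the kind of race I have in mind; in
theory the read_lock(&inetdev_lock) inside in_dev_get() should make
the get/put pair safe by itself.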

I've got up to 4 crashes on these 3 boxes in a single day, all at that
precise instruction. Nothing has changed lately on the servers, except
a steadily increasing load. They  had  been  rock  solid  for  several
months   before   this   started   happening   about   a   week   ago.

They run quite a simple configuration: almost vanilla hardened
Slackware with pptp (PoPToP) compiled from source (no encryption). The
only modification to the kernel is in ./net/core/dev.c, where I have
upped the maximum number of instances for a device from 100 to 500 in
dev_alloc_name(). The algorithm itself is unchanged so far (yes, I
have read the comments).

One of the boxes has been upgraded to 2.4.19 (it broke everything; I
had to get pppd 2.4.2b1 to fix it - don't know if that's to be expected)
and has been running stable since then (only 1 day, which doesn't prove
much yet). My next plan is to implement LKCD on the 2.4.17 box
(it doesn't seem to be available for 2.4.19 yet) and capture a complete
crash dump. Would that help in tracking this down?

Thanks in advance for any help,

Greets,

_Alain_
"I've RTFM. It says: `see your system administrator'. But... *I* am  
the system administrator"
(DECUS US symposium session title, author unknown, ca. 1990)


----- End forwarded message -----

* Re: UPD: Frequent/consistent panics in 2.4.19 at ip_route_input_slow, in_dev_get(dev)
  2002-10-21  3:02 UPD: Frequent/consistent panics in 2.4.19 at ip_route_input_slow, in_dev_get(dev) Alain Fauconnet
@ 2002-10-21 16:40 ` kuznet
  2002-10-22  3:24   ` Alain Fauconnet
  2002-10-28 10:19   ` Alain Fauconnet
  0 siblings, 2 replies; 7+ messages in thread
From: kuznet @ 2002-10-21 16:40 UTC (permalink / raw)
  To: Alain Fauconnet; +Cc: linux-kernel, lve

Hello!

> much yet). My next plan is to implement LKCD on the 2.4.17 box
> (it doesn't seem to be available for 2.4.19 yet) and capture a complete
> crash dump. Would that help in tracking this down?

I do not think that this will help.

It would be better to try enabling slab poisoning in slab.h. If it is
the thing I am thinking of, that would provoke a crash.

Alexey

* Re: UPD: Frequent/consistent panics in 2.4.19 at ip_route_input_slow, in_dev_get(dev)
  2002-10-21 16:40 ` kuznet
@ 2002-10-22  3:24   ` Alain Fauconnet
  2002-10-28 10:19   ` Alain Fauconnet
  1 sibling, 0 replies; 7+ messages in thread
From: Alain Fauconnet @ 2002-10-22  3:24 UTC (permalink / raw)
  To: kuznet; +Cc: linux-kernel, lve

On Mon, Oct 21, 2002 at 08:40:16PM +0400, kuznet@ms2.inr.ac.ru wrote:
> Hello!
> 
> > much yet). My next plan is to implement LKCD on the 2.4.17 box
> > (it doesn't seem to be available for 2.4.19 yet) and capture a complete
> > crash dump. Would that help in tracking this down?
> 
> I do not think that this will help.
> 
> It would be better to try enabling slab poisoning in slab.h. If it is
> the thing I am thinking of, that would provoke a crash.

Thanks for your reply,

Does that mean just enabling CONFIG_DEBUG_SLAB in the kernel config, or
something more?

I see:

#define SLAB_POISON             0x00000800UL    /* Poison objects */

in slab.h, but there's nothing settable in there.

Anyway I've recompiled the 2.4.19 kernel with CONFIG_DEBUG_SLAB=y.
Let's see what happens.
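
For the record, my (possibly wrong) reading of mm/slab.c is that
CONFIG_DEBUG_SLAB flips a few compile-time switches and then forces
red-zoning and poisoning on for the caches that can take it - the names
below are paraphrased from memory, so treat this as an assumption:

/* Paraphrased from my reading of mm/slab.c (2.4) -- not a verbatim
 * quote, details may be off. */
#ifdef CONFIG_DEBUG_SLAB
#define DEBUG           1       /* honour SLAB_RED_ZONE & SLAB_POISON */
#define STATS           1       /* collect stats for /proc/slabinfo */
#define FORCED_DEBUG    1       /* force red-zoning/poisoning where possible */
#else
#define DEBUG           0
#define STATS           0
#define FORCED_DEBUG    0
#endif

If that is right, there should be nothing else to set by hand.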

Greets,
_Alain_

* Re: UPD: Frequent/consistent panics in 2.4.19 at ip_route_input_slow, in_dev_get(dev)
  2002-10-21 16:40 ` kuznet
  2002-10-22  3:24   ` Alain Fauconnet
@ 2002-10-28 10:19   ` Alain Fauconnet
  2002-10-29  1:30     ` kuznet
  1 sibling, 1 reply; 7+ messages in thread
From: Alain Fauconnet @ 2002-10-28 10:19 UTC (permalink / raw)
  To: kuznet; +Cc: linux-kernel, lve

Hello,

On Mon, Oct 21, 2002 at 08:40:16PM +0400, kuznet@ms2.inr.ac.ru wrote:
> 
> It would be better to try enabling slab poisoning in slab.h. If it is
> the thing I am thinking of, that would provoke a crash.
> 
> Alexey
> 

I have got a new crash on the same box with CONFIG_DEBUG_SLAB=y
(reminder: Kernel 2.4.19)

Unable to handle kernel paging request at  virtual address 5a5a5a5e

EIP : 0010:[<c02318a9>] <ip_route_input_slow+33>
EAX : 00000000   EBX : 5a5a5a5a   ECX : 9c515bec
EDX : 5a5a5a5a   ESI : ccb5b542   EDI : d71a10ac
EBP : c1aae690   ESP : c02f3e44

Process Swapper (pid: 0, stackpage = c02f3000)

Stack: 5a5a5a   ccb5b542   d71a10ac   c1aae690
(sorry, first stack word was written down incorrectly by operator, one
byte lost)

Call Trace:
[<c010980d>] <get_irq_list+137>
[<c0109a71>] <do_IRQ+133>
[<c02321ca>] <ip_route_input+338>
[<c02342e4>] <ip_rcv_finish+40>

(I find this stack trace strange. Once again I  haven't  written  this
down myself so I can't be sure it's 100% accurate)


Code : ff 42 04 b8 54 24 24 89 54 24 28 c7 44 24 20 00 00 00 00 c7

Still the same exact place in ip_route_input_slow:

(gdb) x/i 0xc02318a9
0xc02318a9 <ip_route_input_slow+33>:    incl   0x4(%edx)
0xc02318ac <ip_route_input_slow+36>:    mov    0x24(%esp,1),%edx
0xc02318b0 <ip_route_input_slow+40>:    mov    %edx,0x28(%esp,1)
0xc02318b4 <ip_route_input_slow+44>:    movl   $0x0,0x20(%esp,1)
0xc02318bc <ip_route_input_slow+52>:    movl   $0x0,0x38(%esp,1)
0xc02318c4 <ip_route_input_slow+60>:    movl   $0x0,0x34(%esp,1)

It looks to me like the value in EDX is "poisoned" data, isn't it?
I assume that the kernel is trying to use dynamic memory that has been
released already, right?
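
For reference, my understanding of the poisoning (a rough sketch of
what I believe mm/slab.c does with slab debugging on; the helper name
and details are paraphrased from memory, so treat them as assumptions)
is that freed objects are overwritten with 0x5a bytes - exactly the
pattern sitting in EDX:

/* Sketch, paraphrased from mm/slab.c (2.4) -- treat as an assumption.
 * Run on an object when it is freed, so that any later load through a
 * stale pointer reads 0x5a5a5a5a instead of real data. */
#define POISON_BYTE 0x5a        /* filler for freed objects */
#define POISON_END  0xa5        /* marks the object's last byte */

static inline void kmem_poison_obj(kmem_cache_t *cachep, void *addr)
{
        int size = cachep->objsize;

        memset(addr, POISON_BYTE, size);
        *(unsigned char *)(addr + size - 1) = POISON_END;
}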

What's next in tracing this one down?

Thanks for any hint,
_Alain_


* Re: UPD: Frequent/consistent panics in 2.4.19 at ip_route_input_slow, in_dev_get(dev)
  2002-10-28 10:19   ` Alain Fauconnet
@ 2002-10-29  1:30     ` kuznet
  2002-10-29  4:13       ` Alain Fauconnet
  0 siblings, 1 reply; 7+ messages in thread
From: kuznet @ 2002-10-29  1:30 UTC (permalink / raw)
  To: Alain Fauconnet; +Cc: linux-kernel, lve

Hello!

> I assume that the kernel is trying to use dynamic memory that has been
> released already, right?

Right.

> What's next in tracing this one down?

To find out which driver exactly is doing this. Apparently, it continues
to inject packets into the stack even after the device has been destroyed.

If you did not see the message "Freeing alive device", it means that
the driver unregistered the device cleanly. The usual ppp driver seems
to be sane...

Alexey

* Re: UPD: Frequent/consistent panics in 2.4.19 at ip_route_input_slow, in_dev_get(dev)
  2002-10-29  1:30     ` kuznet
@ 2002-10-29  4:13       ` Alain Fauconnet
  2002-10-29  4:31         ` kuznet
  0 siblings, 1 reply; 7+ messages in thread
From: Alain Fauconnet @ 2002-10-29  4:13 UTC (permalink / raw)
  To: kuznet; +Cc: linux-kernel, lve

On Tue, Oct 29, 2002 at 04:30:25AM +0300, kuznet@ms2.inr.ac.ru wrote:

> > I assume that the kernel is trying to use dynamic memory that has been
> > released already, right?
> 
> Right.
> 
> > What's next in tracing this one down?
> 
> To find out which driver exactly is doing this. Apparently, it continues
> to inject packets into the stack even after the device has been destroyed.
>

In  this  case,  would  they  be  packets  TO  or FROM the ppp device?
(reminder:  crash  happens  in ip_route_input_slow, called with struct
net_device *dev presumably pointing at what used to be the PPP device)

> If you did not see the message "Freeing alive device", it means that
> the driver unregistered the device cleanly. The usual ppp driver seems
> to be sane...

No such message in logs, but  "PPPIOCDETACH  file->f_count=3"  appears
quite often. If I understand it right, that would mean the ppp interface
is destroyed while it still has channels open to it... Does that give any
hint?
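
For what it's worth, this seems to be where that message comes from in
drivers/net/ppp_generic.c - condensed and from memory, so the exact
shape of the code is my assumption: PPPIOCDETACH refuses to detach when
the file still has extra references (e.g. a dup'd descriptor or another
process still polling it):

/* Condensed sketch of the PPPIOCDETACH handling in ppp_ioctl()
 * (drivers/net/ppp_generic.c, 2.4) -- paraphrased, treat the details
 * as assumptions rather than a verbatim quote. */
if (cmd == PPPIOCDETACH) {
        if (atomic_read(&file->f_count) <= 2) {
                /* only this ioctl and our own fd hold the file */
                ppp_release(inode, file);
                err = 0;
        } else
                printk(KERN_DEBUG "PPPIOCDETACH file->f_count=%d\n",
                       atomic_read(&file->f_count));
        goto out;
}

So f_count=3 would mean something else still held a reference to the
/dev/ppp file when pppd tried to detach.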

Let's try to summarize anything unusual this box has:

- pptp  (PoPToP). But pptp is only userland software, how could it cause
such a problem? It merely forks pppd as a child.

- NAT, both SNAT and DNAT (for transparent redirection to a Squid), and
a lot of rules used to do traffic accounting, so the iptables
configuration is kind of hairy. Could it somehow cause packets to be "delayed" and
thus re-injected to the stack later than  usual?  (I'm  probably  just
talking nonsense here - sorry - trying to guess).

- asymmetrical routing: packets come from clients over the VPN, thus over
the PPP interface. They are routed back through a LAN interface that goes
to the satellite uplink. To do this, the route to the peer IP through
the PPP interface is deleted in the ip-up script.

Ah..  something  else that could be relevant (??): I see that ifconfig
-a shows "duplicate" ppp devices e.g.:

ppp96     Link encap:Point-to-Point Protocol  
          inet addr:xxx.yyy.zzz.2  P-t-P:172.16.27.104  Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:44238 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:3 
          RX bytes:2192741 (2.0 Mb)  TX bytes:153 (153.0 b)

ppp96     Link encap:Point-to-Point Protocol  
          inet addr:xxx.yyy.zzz.2  P-t-P:172.16.27.104  Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1

Is this "normal"?

Greets,
_Alain_


* Re: UPD: Frequent/consistent panics in 2.4.19 at ip_route_input_slow, in_dev_get(dev)
  2002-10-29  4:13       ` Alain Fauconnet
@ 2002-10-29  4:31         ` kuznet
  0 siblings, 0 replies; 7+ messages in thread
From: kuznet @ 2002-10-29  4:31 UTC (permalink / raw)
  To: Alain Fauconnet; +Cc: linux-kernel, lve

Hello!

> In  this  case,  would  they  be  packets  TO  or FROM the ppp device?

FROM, of course. When the packet reaches IP, the device is already
destroyed. This is impossible unless the driver is broken or we have an
error in reference counting. The second possibility is eliminated by the
fact that you do not see "Freeing alive device". So, the device was
cleanly unregistered.

> No such message in logs, but  "PPPIOCDETACH  file->f_count=3"

I will take a look. It sounds like an interesting hint.

> - pptp  (PoPToP). But pptp is only userland software, how could it cause
> such a problem?

In no way.

> thus re-injected to the stack later than  usual?  (I'm  probably  just
> talking nonsense here - sorry - trying to guess).

It is not nonsense. However, iptables holds the necessary reference counts.

> - asymmetrical routing:

This is not essential.

> -a shows "duplicate" ppp devices e.g.:
....
> Is this "normal"?

Well, ifconfig can be racy. If some device sitting in the list before ppp96
was deleted while ifconfig was scanning the list, you can see ppp96 twice.
If some device was added, ifconfig can miss a device. Awkward, of course.

Alexey
