netdev.vger.kernel.org archive mirror
* Re: Memory leak in 2.6.11-rc1?
       [not found]         ` <20050123023248.263daca9.akpm@osdl.org>
@ 2005-01-23 20:03           ` Russell King
  2005-01-24 11:48             ` Russell King
  0 siblings, 1 reply; 36+ messages in thread
From: Russell King @ 2005-01-23 20:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jens Axboe, alexn, kas, linux-kernel, netdev

On Sun, Jan 23, 2005 at 02:32:48AM -0800, Andrew Morton wrote:
> Jens Axboe <axboe@suse.de> wrote:
> >
> > But I'm still stuck with all of my ram gone after a
> >  600MB fillmem, half of it is just in swap.
> 
> Well.  Half of it has gone so far ;)
> 
> > 
> >  Attaching meminfo and sysrq-m after fillmem.
> 
> (I meant a really big fillmem: a couple of 2GB ones.  Not to worry.)
> 
> It's not in slab and the pagecache and anonymous memory stuff seems to be
> working OK.  So it has to be something else, which does a bare
> __alloc_pages().  Low-level block stuff, networking, arch code, perhaps.
> 
> I don't think I've ever really seen code to diagnose this.
> 
> A simplistic approach would be to add eight or so ulongs into struct page,
> populate them with builtin_return_address(0...7) at allocation time, then
> modify sysrq-m to walk mem_map[] printing it all out for pages which have
> page_count() > 0.  That'd find the culprit.

I think I may be seeing something odd here, maybe a possible memory leak.
The only problem I have is wondering whether I'm actually comparing like
with like.  Maybe some networking people can provide a hint?

Below is gathered from 2.6.11-rc1.

bash-2.05a# head -n2 /proc/slabinfo
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
115
ip_dst_cache         759    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
117
ip_dst_cache         770    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
133
ip_dst_cache         775    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
18
ip_dst_cache         664    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
20
ip_dst_cache         664    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
22
ip_dst_cache         673    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
23
ip_dst_cache         670    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
24
ip_dst_cache         675    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
24
ip_dst_cache         669    885    256   15    1

I'm fairly positive when I rebooted the machine a couple of days ago,
ip_dst_cache was significantly smaller for the same number of lines in
/proc/net/rt_cache.
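
To make the like-with-like comparison above repeatable, both numbers can
be reduced to a single figure. The following is only a sketch and was not
part of the original report: it assumes the slabinfo 2.1 column order
shown above, counts /proc/net/rt_cache lines exactly as `wc -l` does, and
the helper names are made up for illustration.

```python
def parse_slab_line(line):
    """Split one /proc/slabinfo line into name, active_objs and num_objs."""
    fields = line.split()
    return fields[0], int(fields[1]), int(fields[2])


def unlisted_objects(slab_line, rt_cache_lines):
    """Active slab objects minus the rt_cache line count (as wc -l reports it)."""
    _name, active_objs, _num_objs = parse_slab_line(slab_line)
    return active_objs - rt_cache_lines


# Last sample from the session above: 24 lines, 669 active objects.
sample = "ip_dst_cache         669    885    256   15    1"
print(unlisted_objects(sample, 24))
```

If that difference keeps climbing while the rt_cache line count stays
flat, the two are probably not tracking the same set of objects.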

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-23 20:03           ` Memory leak in 2.6.11-rc1? Russell King
@ 2005-01-24 11:48             ` Russell King
  2005-01-25 19:32               ` Russell King
  0 siblings, 1 reply; 36+ messages in thread
From: Russell King @ 2005-01-24 11:48 UTC (permalink / raw)
  To: Andrew Morton, Jens Axboe, alexn, kas, linux-kernel, netdev

On Sun, Jan 23, 2005 at 08:03:15PM +0000, Russell King wrote:
> I think I may be seeing something odd here, maybe a possible memory leak.
> The only problem I have is wondering whether I'm actually comparing like
> with like.  Maybe some networking people can provide a hint?
> 
> Below is gathered from 2.6.11-rc1.
> 
> bash-2.05a# head -n2 /proc/slabinfo
> slabinfo - version: 2.1
> # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 115
> ip_dst_cache         759    885    256   15    1
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 117
> ip_dst_cache         770    885    256   15    1
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 133
> ip_dst_cache         775    885    256   15    1
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 18
> ip_dst_cache         664    885    256   15    1
>...
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 24
> ip_dst_cache         675    885    256   15    1
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 24
> ip_dst_cache         669    885    256   15    1
> 
> I'm fairly positive when I rebooted the machine a couple of days ago,
> ip_dst_cache was significantly smaller for the same number of lines in
> /proc/net/rt_cache.

FYI, today it looks like this:

bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
26
ip_dst_cache         820   1065    256   15    1 

So the dst cache seems to have grown by 151 in 16 hours...  I'll continue
monitoring and providing updates.
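
As a rough check of that figure, the growth can be turned into a daily
rate. This is a sketch only: it uses the timestamps of the two mails as
an approximation of when the samples were taken, and the function name
is invented for illustration.

```python
from datetime import datetime


def growth_per_day(count_a, time_a, count_b, time_b):
    """Average growth in objects per day between two samples."""
    elapsed_days = (time_b - time_a).total_seconds() / 86400
    return (count_b - count_a) / elapsed_days


# Active ip_dst_cache objects at the time of each mail (assumed sample times).
first = datetime(2005, 1, 23, 20, 3)
second = datetime(2005, 1, 24, 11, 48)
rate = growth_per_day(669, first, 820, second)
print(820 - 669, round(rate))
```

The interval is a little under 16 hours, so the rate works out at a bit
more than 200 objects per day.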

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-24 11:48             ` Russell King
@ 2005-01-25 19:32               ` Russell King
  2005-01-27  8:28                 ` Russell King
  0 siblings, 1 reply; 36+ messages in thread
From: Russell King @ 2005-01-25 19:32 UTC (permalink / raw)
  To: Andrew Morton, Jens Axboe, alexn, kas, linux-kernel, netdev

On Mon, Jan 24, 2005 at 11:48:53AM +0000, Russell King wrote:
> On Sun, Jan 23, 2005 at 08:03:15PM +0000, Russell King wrote:
> > I think I may be seeing something odd here, maybe a possible memory leak.
> > The only problem I have is wondering whether I'm actually comparing like
> > with like.  Maybe some networking people can provide a hint?
> > 
> > Below is gathered from 2.6.11-rc1.
> > 
> > bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> > 24
> > ip_dst_cache         669    885    256   15    1
> > 
> > I'm fairly positive when I rebooted the machine a couple of days ago,
> > ip_dst_cache was significantly smaller for the same number of lines in
> > /proc/net/rt_cache.
> 
> FYI, today it looks like this:
> 
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 26
> ip_dst_cache         820   1065    256   15    1 
> 
> So the dst cache seems to have grown by 151 in 16 hours...  I'll continue
> monitoring and providing updates.

Tonight's update:
50
ip_dst_cache        1024   1245    256   15    1

As you can see, the dst cache is consistently growing by about 200
entries per day.  Given this, I predict that the box will fall over
due to "dst cache overflow" in roughly 35 days.
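
The arithmetic behind that estimate can be sketched as below. The limit
of 8192 entries is an assumption at this point; it is the figure quoted
later in this thread, and the function name is made up.

```python
def days_until_limit(current, limit, per_day):
    """Days until `current` reaches `limit` at a constant growth rate."""
    return (limit - current) / per_day


# 1024 active objects tonight, growing by about 200 objects per day.
# The 8192 limit is assumed here (the figure quoted later in the thread).
print(days_until_limit(1024, 8192, 200))
```

With those inputs the result is a little under 36 days, in line with the
rough prediction above.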

kernel network configuration:

CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_FWMARK=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_SYN_COOKIES=y
CONFIG_IPV6=y
CONFIG_NETFILTER=y
CONFIG_IP_NF_CONNTRACK=y
CONFIG_IP_NF_CONNTRACK_MARK=y
CONFIG_IP_NF_FTP=y
CONFIG_IP_NF_IRC=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_LIMIT=y
CONFIG_IP_NF_MATCH_IPRANGE=y
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=y
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=y
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=y
CONFIG_IP_NF_MATCH_STATE=y
CONFIG_IP_NF_MATCH_CONNTRACK=y
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_REALM=m
CONFIG_IP_NF_MATCH_CONNMARK=m
CONFIG_IP_NF_MATCH_HASHLIMIT=m
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_NAT=y
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=y
CONFIG_IP_NF_TARGET_REDIRECT=y
CONFIG_IP_NF_TARGET_NETMAP=y
CONFIG_IP_NF_TARGET_SAME=y
CONFIG_IP_NF_NAT_IRC=y
CONFIG_IP_NF_NAT_FTP=y
CONFIG_IP_NF_MANGLE=y
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=y
CONFIG_IP_NF_TARGET_CLASSIFY=m
CONFIG_IP_NF_TARGET_CONNMARK=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP6_NF_IPTABLES=y
CONFIG_IP6_NF_MATCH_LIMIT=y
CONFIG_IP6_NF_MATCH_MAC=y
CONFIG_IP6_NF_MATCH_RT=y
CONFIG_IP6_NF_MATCH_OPTS=y
CONFIG_IP6_NF_MATCH_FRAG=y
CONFIG_IP6_NF_MATCH_HL=y
CONFIG_IP6_NF_MATCH_MULTIPORT=y
CONFIG_IP6_NF_MATCH_MARK=y
CONFIG_IP6_NF_MATCH_AHESP=y
CONFIG_IP6_NF_MATCH_LENGTH=y
CONFIG_IP6_NF_FILTER=y
CONFIG_IP6_NF_MANGLE=y
CONFIG_IP6_NF_TARGET_MARK=y


-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-25 19:32               ` Russell King
@ 2005-01-27  8:28                 ` Russell King
  2005-01-27  8:47                   ` Andrew Morton
  0 siblings, 1 reply; 36+ messages in thread
From: Russell King @ 2005-01-27  8:28 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds, alexn, kas, linux-kernel, netdev

On Tue, Jan 25, 2005 at 07:32:07PM +0000, Russell King wrote:
> On Mon, Jan 24, 2005 at 11:48:53AM +0000, Russell King wrote:
> > On Sun, Jan 23, 2005 at 08:03:15PM +0000, Russell King wrote:
> > > I think I may be seeing something odd here, maybe a possible memory leak.
> > > The only problem I have is wondering whether I'm actually comparing like
> > > with like.  Maybe some networking people can provide a hint?
> > > 
> > > Below is gathered from 2.6.11-rc1.
> > > 
> > > bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> > > 24
> > > ip_dst_cache         669    885    256   15    1
> > > 
> > > I'm fairly positive when I rebooted the machine a couple of days ago,
> > > ip_dst_cache was significantly smaller for the same number of lines in
> > > /proc/net/rt_cache.
> > 
> > FYI, today it looks like this:
> > 
> > bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> > 26
> > ip_dst_cache         820   1065    256   15    1 
> > 
> > So the dst cache seems to have grown by 151 in 16 hours...  I'll continue
> > monitoring and providing updates.
> 
> Tonight's update:
> 50
> ip_dst_cache        1024   1245    256   15    1
> 
> As you can see, the dst cache is consistently growing by about 200
> entries per day.  Given this, I predict that the box will fall over
> due to "dst cache overflow" in roughly 35 days.

This morning's magic numbers are:

3
ip_dst_cache        1292   1485    256   15    1

Is no one interested in the fact that the DST cache is leaking and
eventually takes out machines?  I've had virtually zero interest in
this problem so far.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27  8:28                 ` Russell King
@ 2005-01-27  8:47                   ` Andrew Morton
  2005-01-27 10:19                     ` Alessandro Suardi
                                       ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Andrew Morton @ 2005-01-27  8:47 UTC (permalink / raw)
  To: Russell King; +Cc: torvalds, alexn, kas, linux-kernel, netdev

Russell King <rmk+lkml@arm.linux.org.uk> wrote:
>
> This morning's magic numbers are:
> 
>  3
>  ip_dst_cache        1292   1485    256   15    1

I just did a q-n-d test here: send one UDP frame to 1.1.1.1 up to
1.1.255.255.  The ip_dst_cache grew to ~15k entries and grew no further. 
It's now gradually shrinking.  So there doesn't appear to be a trivial
bug..
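
For anyone who wants to repeat this, a sweep along those lines could look
like the sketch below. It is not the program used for the test above; the
port number, the payload and the function names are placeholders chosen
for illustration.

```python
import ipaddress
import socket


def sweep_targets(first="1.1.1.1", last="1.1.255.255"):
    """Yield every IPv4 address from `first` to `last`, inclusive."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    for value in range(start, end + 1):
        yield str(ipaddress.IPv4Address(value))


def send_sweep(port=9, payload=b"x"):
    """Send one UDP datagram to each target; return how many were sent."""
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for address in sweep_targets():
            try:
                sock.sendto(payload, (address, port))
                sent += 1
            except OSError:
                # No route, or the address was rejected; skip it.
                pass
    return sent


# Number of destinations in the sweep (no packets are sent here).
print(sum(1 for _ in sweep_targets()))
```

Calling send_sweep() performs the actual sweep; watching ip_dst_cache in
/proc/slabinfo before and after shows whether the cache shrinks again.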

>  Is no one interested in the fact that the DST cache is leaking and
>  eventually takes out machines?  I've had virtually zero interest in
>  this problem so far.

I guess we should find a way to make it happen faster.

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27  8:47                   ` Andrew Morton
@ 2005-01-27 10:19                     ` Alessandro Suardi
  2005-01-27 12:17                     ` Martin Josefsson
  2005-01-27 12:56                     ` Robert Olsson
  2 siblings, 0 replies; 36+ messages in thread
From: Alessandro Suardi @ 2005-01-27 10:19 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Russell King, torvalds, alexn, kas, linux-kernel, netdev

On Thu, 27 Jan 2005 00:47:32 -0800, Andrew Morton <akpm@osdl.org> wrote:
> Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> >
> > This morning's magic numbers are:
> >
> >  3
> >  ip_dst_cache        1292   1485    256   15    1
> 
> I just did a q-n-d test here: send one UDP frame to 1.1.1.1 up to
> 1.1.255.255.  The ip_dst_cache grew to ~15k entries and grew no further.
> It's now gradually shrinking.  So there doesn't appear to be a trivial
> bug..
> 
> >  Is no one interested in the fact that the DST cache is leaking and
> >  eventually takes out machines?  I've had virtually zero interest in
> >  this problem so far.
> 
> I guess we should find a way to make it happen faster.

Data point... on my box, used as an ed2k/bittorrent machine, the
ip_dst_cache grows and shrinks quite fast; these two samples were
~3 minutes apart:


[root@donkey ~]# grep ip_dst /proc/slabinfo
ip_dst_cache         998   1005    256   15    1 : tunables  120   60    0 : slabdata     67     67      0
[root@donkey ~]# wc -l /proc/net/rt_cache
926 /proc/net/rt_cache

[root@donkey ~]# grep ip_dst /proc/slabinfo
ip_dst_cache         466    795    256   15    1 : tunables  120   60    0 : slabdata     53     53      0
[root@donkey ~]# wc -l /proc/net/rt_cache
443 /proc/net/rt_cache

and these were 2 seconds apart:

[root@donkey ~]# wc -l /proc/net/rt_cache
737 /proc/net/rt_cache
[root@donkey ~]# grep ip_dst /proc/slabinfo
ip_dst_cache         795    795    256   15    1 : tunables  120   60    0 : slabdata     53     53      0

[root@donkey ~]# wc -l /proc/net/rt_cache
1023 /proc/net/rt_cache
[root@donkey ~]# grep ip_dst /proc/slabinfo
ip_dst_cache        1035   1035    256   15    1 : tunables  120   60    0 : slabdata     69     69      0

--alessandro
 
  "And every dream, every, is just a dream after all"
 
     (Heather Nova, "Paper Cup")

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27  8:47                   ` Andrew Morton
  2005-01-27 10:19                     ` Alessandro Suardi
@ 2005-01-27 12:17                     ` Martin Josefsson
  2005-01-27 12:56                     ` Robert Olsson
  2 siblings, 0 replies; 36+ messages in thread
From: Martin Josefsson @ 2005-01-27 12:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Russell King, torvalds, alexn, kas, linux-kernel, netdev

On Thu, 27 Jan 2005, Andrew Morton wrote:

> Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> >
> > This morning's magic numbers are:
> >
> >  3
> >  ip_dst_cache        1292   1485    256   15    1
>
> I just did a q-n-d test here: send one UDP frame to 1.1.1.1 up to
> 1.1.255.255.  The ip_dst_cache grew to ~15k entries and grew no further.
> It's now gradually shrinking.  So there doesn't appear to be a trivial
> bug..
>
> >  Is no one interested in the fact that the DST cache is leaking and
> >  eventually takes out machines?  I've had virtually zero interest in
> >  this problem so far.
>
> I guess we should find a way to make it happen faster.

It could be a refcount problem. I think Russell is using NAT; it could be
the MASQUERADE target if that is in use. A simple test would be to switch
to SNAT and try again if possible.

/Martin

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27  8:47                   ` Andrew Morton
  2005-01-27 10:19                     ` Alessandro Suardi
  2005-01-27 12:17                     ` Martin Josefsson
@ 2005-01-27 12:56                     ` Robert Olsson
  2005-01-27 13:03                       ` Robert Olsson
  2005-01-27 16:49                       ` Russell King
  2 siblings, 2 replies; 36+ messages in thread
From: Robert Olsson @ 2005-01-27 12:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Russell King, torvalds, alexn, kas, linux-kernel, netdev


Andrew Morton writes:
 > Russell King <rmk+lkml@arm.linux.org.uk> wrote:

 > >  ip_dst_cache        1292   1485    256   15    1

 > I guess we should find a way to make it happen faster.
 
Here is a route DoS attack. Pure routing, no NAT, no filter.

Start
=====
ip_dst_cache           5     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0

After DoS
=========
ip_dst_cache       66045  76125    256   15    1 : tunables  120   60    8 : slabdata   5075   5075    480

After some GC runs.
==================
ip_dst_cache           2     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0

No problems here. I saw Martin talked about NAT...

							--ro

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 12:56                     ` Robert Olsson
@ 2005-01-27 13:03                       ` Robert Olsson
  2005-01-27 16:49                       ` Russell King
  1 sibling, 0 replies; 36+ messages in thread
From: Robert Olsson @ 2005-01-27 13:03 UTC (permalink / raw)
  To: Robert Olsson
  Cc: Andrew Morton, Russell King, torvalds, alexn, kas, linux-kernel,
	netdev


Oh. Linux version 2.6.11-rc2 was used.

Robert Olsson writes:
 > 
 > Andrew Morton writes:
 >  > Russell King <rmk+lkml@arm.linux.org.uk> wrote:
 > 
 >  > >  ip_dst_cache        1292   1485    256   15    1
 > 
 >  > I guess we should find a way to make it happen faster.
 >  
 > Here is a route DoS attack. Pure routing, no NAT, no filter.
 > 
 > Start
 > =====
 > ip_dst_cache           5     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0
 > 
 > After DoS
 > =========
 > ip_dst_cache       66045  76125    256   15    1 : tunables  120   60    8 : slabdata   5075   5075    480
 > 
 > After some GC runs.
 > ==================
 > ip_dst_cache           2     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0
 > 
 > No problems here. I saw Martin talked about NAT...
 > 
 > 							--ro

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 12:56                     ` Robert Olsson
  2005-01-27 13:03                       ` Robert Olsson
@ 2005-01-27 16:49                       ` Russell King
  2005-01-27 18:37                         ` Phil Oester
  2005-01-27 20:33                         ` David S. Miller
  1 sibling, 2 replies; 36+ messages in thread
From: Russell King @ 2005-01-27 16:49 UTC (permalink / raw)
  To: Robert Olsson; +Cc: Andrew Morton, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 01:56:30PM +0100, Robert Olsson wrote:
> 
> Andrew Morton writes:
>  > Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> 
>  > >  ip_dst_cache        1292   1485    256   15    1
> 
>  > I guess we should find a way to make it happen faster.
>  
> Here is a route DoS attack. Pure routing, no NAT, no filter.
> 
> Start
> =====
> ip_dst_cache           5     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0
> 
> After DoS
> =========
> ip_dst_cache       66045  76125    256   15    1 : tunables  120   60    8 : slabdata   5075   5075    480
> 
> After some GC runs.
> ==================
> ip_dst_cache           2     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0
> 
> No problems here. I saw Martin talked about NAT...

Yes, I can reproduce that same behaviour, where I can artificially
inflate the DST cache and the GC does run and trims it back down to
something reasonable.

BUT, over time, my DST cache just increases in size and won't trim back
down, not even when I write to the /proc/sys/net/ipv4/route/flush sysctl.
If I'm reading the code correctly, that should force an immediate flush
of the DST cache; it would be nice to have that confirmed by those who
actually know this stuff.

For instance, I have (in sequence):

# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
581
ip_dst_cache        1860   1860    256   15    1 : tunables  120   60    0 : slabdata    124    124      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
717
ip_dst_cache        1995   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
690
ip_dst_cache        1995   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
696
ip_dst_cache        1995   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
700
ip_dst_cache        1995   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
718
ip_dst_cache        1993   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
653
ip_dst_cache        1993   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
667
ip_dst_cache        1956   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
620
ip_dst_cache        1944   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
623
ip_dst_cache        1920   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
8
ip_dst_cache        1380   1980    256   15    1 : tunables  120   60    0 : slabdata    132    132      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
86
ip_dst_cache        1375   1875    256   15    1 : tunables  120   60    0 : slabdata    125    125      0

so obviously the GC does appear to be working - as can be seen from the
number of entries in /proc/net/rt_cache.  However, the number of objects
in the slab cache does grow day on day.  About 4 days ago, it was only
about 600 active objects.  Now it's more than twice that, and it'll
continue increasing until it hits 8192, whereupon it's game over.

And, here's the above with /proc/net/stat/rt_cache included:

# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo; cat /proc/net/stat/rt_cache
61
ip_dst_cache        1340   1680    256   15    1 : tunables  120   60    0 : slabdata    112    112      0
entries  in_hit in_slow_tot in_no_route in_brd in_martian_dst in_martian_src  out_hit out_slow_tot out_slow_mc  gc_total gc_ignored gc_goal_miss gc_dst_overflow in_hlist_search out_hlist_search
00000538  005c9f10 0005e163 00000000 00000013 000002e2 00000000 00000005  003102e3 00038f6d 00000000 0007887a 0005286d 00001142 00000000 00138855 0010848d

Notice how /proc/net/stat/rt_cache says there are 1336 entries in the
route cache.  _Where_ are they?  They're not there according to
/proc/net/rt_cache.

(PS, the formatting of the headings in /proc/net/stat/rt_cache doesn't
appear to tie up with the formatting of the data which is _really_
confusing.)
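
For reference, the values in /proc/net/stat/rt_cache are hexadecimal,
which is where the 1336 above comes from. The sketch below only decodes
the first column; since the headings and the data do not line up in this
output, it deliberately avoids pairing the other names with values. The
function name is made up for illustration.

```python
def rt_cache_entries(data_line):
    """Decode the first (entries) column of a /proc/net/stat/rt_cache data line."""
    return int(data_line.split()[0], 16)


line = ("00000538  005c9f10 0005e163 00000000 00000013 000002e2 00000000 "
        "00000005  003102e3 00038f6d 00000000 0007887a 0005286d 00001142 "
        "00000000 00138855 0010848d")
print(rt_cache_entries(line))   # 1336
print(len(line.split()))        # 17 data columns against 16 headings
```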

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 16:49                       ` Russell King
@ 2005-01-27 18:37                         ` Phil Oester
  2005-01-27 19:25                           ` Russell King
  2005-01-27 20:33                         ` David S. Miller
  1 sibling, 1 reply; 36+ messages in thread
From: Phil Oester @ 2005-01-27 18:37 UTC (permalink / raw)
  To: Robert Olsson, Andrew Morton, torvalds, alexn, kas, linux-kernel,
	netdev

On Thu, Jan 27, 2005 at 04:49:18PM +0000, Russell King wrote:
> so obviously the GC does appear to be working - as can be seen from the
> number of entries in /proc/net/rt_cache.  However, the number of objects
> in the slab cache does grow day on day.  About 4 days ago, it was only
> about 600 active objects.  Now it's more than twice that, and it'll
> continue increasing until it hits 8192, whereupon it's game over.

I can confirm the behavior you are seeing -- does seem to be a leak
somewhere.  Below from a heavily used gateway with 26 days uptime:

# wc -l /proc/net/rt_cache ; grep ip_dst /proc/slabinfo
  12870 /proc/net/rt_cache
ip_dst_cache       53327  57855

Eventually I get the dst_cache overflow errors and have to reboot.

Phil

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 18:37                         ` Phil Oester
@ 2005-01-27 19:25                           ` Russell King
  2005-01-27 20:40                             ` Phil Oester
  0 siblings, 1 reply; 36+ messages in thread
From: Russell King @ 2005-01-27 19:25 UTC (permalink / raw)
  To: Phil Oester
  Cc: Robert Olsson, Andrew Morton, torvalds, alexn, kas, linux-kernel,
	netdev

On Thu, Jan 27, 2005 at 10:37:45AM -0800, Phil Oester wrote:
> On Thu, Jan 27, 2005 at 04:49:18PM +0000, Russell King wrote:
> > so obviously the GC does appear to be working - as can be seen from the
> > number of entries in /proc/net/rt_cache.  However, the number of objects
> > in the slab cache does grow day on day.  About 4 days ago, it was only
> > about 600 active objects.  Now it's more than twice that, and it'll
> > continue increasing until it hits 8192, whereupon it's game over.
> 
> I can confirm the behavior you are seeing -- does seem to be a leak
> somewhere.  Below from a heavily used gateway with 26 days uptime:
> 
> # wc -l /proc/net/rt_cache ; grep ip_dst /proc/slabinfo
>   12870 /proc/net/rt_cache
> ip_dst_cache       53327  57855
> 
> Eventually I get the dst_cache overflow errors and have to reboot.

Can you provide some details, eg kernel configuration, loaded modules
and a brief overview of any netfilter modules you may be using.

Maybe we can work out what's common between our setups.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 16:49                       ` Russell King
  2005-01-27 18:37                         ` Phil Oester
@ 2005-01-27 20:33                         ` David S. Miller
  2005-01-28  0:17                           ` Russell King
  1 sibling, 1 reply; 36+ messages in thread
From: David S. Miller @ 2005-01-27 20:33 UTC (permalink / raw)
  To: Russell King
  Cc: Robert.Olsson, akpm, torvalds, alexn, kas, linux-kernel, netdev

On Thu, 27 Jan 2005 16:49:18 +0000
Russell King <rmk+lkml@arm.linux.org.uk> wrote:

> Notice how /proc/net/stat/rt_cache says there are 1336 entries in the
> route cache.  _Where_ are they?  They're not there according to
> /proc/net/rt_cache.

When the route cache is flushed, that kills a reference to each
entry in the routing cache.  If for some reason, other references
remain (route connected to socket, some leak in the stack somewhere)
the route cache entry can't be immediately completely freed up.

So they won't be listed in /proc/net/rt_cache (since they've been
removed from the lookup table) but they will be accounted for in
/proc/net/stat/rt_cache until the final release is done on the
routing cache object and it can be completely freed up.
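
As a toy model of that lifetime (plain Python, not kernel code; the class
and method names are invented for illustration), a flush removes entries
from the lookup table and drops only the table's own reference, so an
entry with a second, never-released reference stays accounted for:

```python
class RouteCache:
    """Toy model: a lookup table plus reference counts for live entries."""

    def __init__(self):
        self.table = set()   # what /proc/net/rt_cache would list
        self.refs = {}       # reference count per entry not yet freed

    def add(self, key):
        self.table.add(key)
        self.refs[key] = 1   # the table's own reference

    def hold(self, key):
        self.refs[key] += 1  # e.g. a socket caching the route

    def release(self, key):
        self.refs[key] -= 1
        if self.refs[key] == 0:
            del self.refs[key]   # final release: the object is freed

    def flush(self):
        for key in list(self.table):
            self.table.discard(key)  # no longer visible to lookups
            self.release(key)        # drop the table's reference only

    def counts(self):
        """Return (listed entries, entries still accounted for)."""
        return len(self.table), len(self.refs)


cache = RouteCache()
cache.add("10.0.0.1")
cache.add("10.0.0.2")
cache.hold("10.0.0.2")   # a second reference that is never dropped
cache.flush()
print(cache.counts())
```

After the flush the model lists no entries but still accounts for one,
which is the same shape as the numbers reported above.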

Do you happen to be using IPV6 in any way by chance?

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 19:25                           ` Russell King
@ 2005-01-27 20:40                             ` Phil Oester
  2005-01-28  9:32                               ` Russell King
  0 siblings, 1 reply; 36+ messages in thread
From: Phil Oester @ 2005-01-27 20:40 UTC (permalink / raw)
  To: Robert Olsson, Andrew Morton, torvalds, alexn, kas, linux-kernel,
	netdev

On Thu, Jan 27, 2005 at 07:25:04PM +0000, Russell King wrote:
> Can you provide some details, eg kernel configuration, loaded modules
> and a brief overview of any netfilter modules you may be using.
> 
> Maybe we can work out what's common between our setups.

Vanilla 2.6.10, though I've been seeing these problems since 2.6.8 or
earlier.  Netfilter running on all boxes, some utilizing SNAT, others
not -- none using MASQ.  This is from a box running no NAT at all,
although it has some other filter rules:

# wc -l /proc/net/rt_cache ; grep dst_cache /proc/slabinfo
     50 /proc/net/rt_cache
ip_dst_cache       84285  84285

Also with uptime of 26 days.  

These boxes are all running the quagga OSPF daemon, but those that
are lightly loaded are not exhibiting these problems.

Phil

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 20:33                         ` David S. Miller
@ 2005-01-28  0:17                           ` Russell King
  2005-01-28  0:34                             ` David S. Miller
  2005-01-28  1:41                             ` Phil Oester
  0 siblings, 2 replies; 36+ messages in thread
From: Russell King @ 2005-01-28  0:17 UTC (permalink / raw)
  To: David S. Miller
  Cc: Robert.Olsson, akpm, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 12:33:26PM -0800, David S. Miller wrote:
> So they won't be listed in /proc/net/rt_cache (since they've been
> removed from the lookup table) but they will be accounted for in
> /proc/net/stat/rt_cache until the final release is done on the
> routing cache object and it can be completely freed up.
> 
> Do you happen to be using IPV6 in any way by chance?

Yes.  Someone suggested this evening that there may have been a recent
change to do with some IPv6 refcounting which may have caused this
problem.  Is that something you can confirm?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-28  0:17                           ` Russell King
@ 2005-01-28  0:34                             ` David S. Miller
  2005-01-28  8:58                               ` Russell King
  2005-01-28  1:41                             ` Phil Oester
  1 sibling, 1 reply; 36+ messages in thread
From: David S. Miller @ 2005-01-28  0:34 UTC (permalink / raw)
  To: Russell King
  Cc: Robert.Olsson, akpm, torvalds, alexn, kas, linux-kernel, netdev

On Fri, 28 Jan 2005 00:17:01 +0000
Russell King <rmk+lkml@arm.linux.org.uk> wrote:

> Yes.  Someone suggested this evening that there may have been a recent
> change to do with some IPv6 refcounting which may have caused this
> problem.  Is that something you can confirm?

Yep, it would be this change below.  Try backing it out and see
if that makes your leak go away.

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2005/01/14 20:41:55-08:00 herbert@gondor.apana.org.au 
#   [IPV6]: Fix locking in ip6_dst_lookup().
#   
#   The caller does not necessarily have the socket locked
#   (udpv6sendmsg() is one such case) so we have to use
#   sk_dst_check() instead of __sk_dst_check().
#   
#   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
#   Signed-off-by: David S. Miller <davem@davemloft.net>
# 
# net/ipv6/ip6_output.c
#   2005/01/14 20:41:34-08:00 herbert@gondor.apana.org.au +3 -3
#   [IPV6]: Fix locking in ip6_dst_lookup().
#   
#   The caller does not necessarily have the socket locked
#   (udpv6sendmsg() is one such case) so we have to use
#   sk_dst_check() instead of __sk_dst_check().
#   
#   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
#   Signed-off-by: David S. Miller <davem@davemloft.net>
# 
diff -Nru a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
--- a/net/ipv6/ip6_output.c	2005-01-27 16:07:21 -08:00
+++ b/net/ipv6/ip6_output.c	2005-01-27 16:07:21 -08:00
@@ -745,7 +745,7 @@
 	if (sk) {
 		struct ipv6_pinfo *np = inet6_sk(sk);
 	
-		*dst = __sk_dst_check(sk, np->dst_cookie);
+		*dst = sk_dst_check(sk, np->dst_cookie);
 		if (*dst) {
 			struct rt6_info *rt = (struct rt6_info*)*dst;
 	
@@ -772,9 +772,9 @@
 			     && (np->daddr_cache == NULL ||
 				 !ipv6_addr_equal(&fl->fl6_dst, np->daddr_cache)))
 			    || (fl->oif && fl->oif != (*dst)->dev->ifindex)) {
+				dst_release(*dst);
 				*dst = NULL;
-			} else
-				dst_hold(*dst);
+			}
 		}
 	}
 

* Re: Memory leak in 2.6.11-rc1?
  2005-01-28  0:17                           ` Russell King
  2005-01-28  0:34                             ` David S. Miller
@ 2005-01-28  1:41                             ` Phil Oester
  1 sibling, 0 replies; 36+ messages in thread
From: Phil Oester @ 2005-01-28  1:41 UTC (permalink / raw)
  To: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Fri, Jan 28, 2005 at 12:17:01AM +0000, Russell King wrote:
> On Thu, Jan 27, 2005 at 12:33:26PM -0800, David S. Miller wrote:
> > So they won't be listed in /proc/net/rt_cache (since they've been
> > removed from the lookup table) but they will be accounted for in
> > /proc/net/stat/rt_cache until the final release is done on the
> > routing cache object and it can be completely freed up.
> > 
> > Do you happen to be using IPV6 in any way by chance?
> 
> Yes.  Someone suggested this evening that there may have been a recent
> change to do with some IPv6 refcounting which may have caused this
> problem.  Is that something you can confirm?

FWIW, I do not use IPv6, and it is not compiled into the kernel.

Phil

* Re: Memory leak in 2.6.11-rc1?
  2005-01-28  0:34                             ` David S. Miller
@ 2005-01-28  8:58                               ` Russell King
  2005-01-30 13:23                                 ` Russell King
  0 siblings, 1 reply; 36+ messages in thread
From: Russell King @ 2005-01-28  8:58 UTC (permalink / raw)
  To: David S. Miller
  Cc: Robert.Olsson, akpm, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 04:34:44PM -0800, David S. Miller wrote:
> On Fri, 28 Jan 2005 00:17:01 +0000
> Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> > Yes.  Someone suggested this evening that there may have been a recent
> > change to do with some IPv6 refcounting which may have caused this
> > problem.  Is that something you can confirm?
> 
> Yep, it would be this change below.  Try backing it out and see
> if that makes your leak go away.

Thanks.  I'll try it, but:

1. Looking at the date of the change it seems unlikely.  The recent
   death occurred with 2.6.10-rc2, booted on 29th November and dying
   on 19th January, which obviously predates this cset.
2. It'll take a couple of days to confirm the behaviour of the dst cache.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 20:40                             ` Phil Oester
@ 2005-01-28  9:32                               ` Russell King
  0 siblings, 0 replies; 36+ messages in thread
From: Russell King @ 2005-01-28  9:32 UTC (permalink / raw)
  To: Phil Oester
  Cc: Robert Olsson, Andrew Morton, torvalds, alexn, kas, linux-kernel,
	netdev

On Thu, Jan 27, 2005 at 12:40:12PM -0800, Phil Oester wrote:
> Vanilla 2.6.10, though I've been seeing these problems since 2.6.8 or
> earlier.

Right.  For me:

- 2.6.9-rc3 (installed 8th Oct) died with dst cache overflow on 29th November
- 2.6.10-rc2 (booted 29th Nov) died with the same on 19th January
- 2.6.11-rc1 (booted 19th Jan) appears to have the same problem, but
  it hasn't died yet.

> Netfilter running on all boxes, some utilizing SNAT, others
> not -- none using MASQ.

IPv4 filter targets: ACCEPT, DROP, REJECT, LOG
	using: state, limit & protocol

IPv4 nat targets: DNAT, MASQ
	using: protocol

IPv4 mangle targets: ACCEPT, MARK
	using: protocol

IPv6 filter targets: ACCEPT, DROP
	using: protocol

IPv6 mangle targets: none

(protocol == at least one rule matching tcp, icmp or udp packets)

IPv6 configured native on internal interface, tun6to4 for external IPv6
communication.

IPv4 and IPv6 forwarding enabled.
IPv4 rpfilter, proxyarp, syncookies enabled.
IPv4 proxy delay on internal interface set to '1'.

> These boxes are all running the quagga OSPF daemon, but those that
> are lightly loaded are not exhibiting these problems.

Running zebra (for IPv6 route advertisement on the local network only).

Network traffic-wise, 2.6.11-rc1 has this on its public facing
interface(s) in 8.5 days.

4: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    RX: bytes  packets  errors  dropped overrun mcast
    667468541  2603373  0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    1245774764 2777605  0       0       1       2252

5: tun6to4@NONE: <NOARP,UP> mtu 1480 qdisc noqueue
    RX: bytes  packets  errors  dropped overrun mcast
    19130536   84034    0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    10436749   91589    0       0       0       0


-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-28  8:58                               ` Russell King
@ 2005-01-30 13:23                                 ` Russell King
  2005-01-30 15:34                                   ` Russell King
  2005-01-30 17:23                                   ` Patrick McHardy
  0 siblings, 2 replies; 36+ messages in thread
From: Russell King @ 2005-01-30 13:23 UTC (permalink / raw)
  To: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Fri, Jan 28, 2005 at 08:58:59AM +0000, Russell King wrote:
> On Thu, Jan 27, 2005 at 04:34:44PM -0800, David S. Miller wrote:
> > On Fri, 28 Jan 2005 00:17:01 +0000
> > Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> > > Yes.  Someone suggested this evening that there may have been a recent
> > > change to do with some IPv6 refcounting which may have caused this
> > > problem.  Is that something you can confirm?
> > 
> > Yep, it would be this change below.  Try backing it out and see
> > if that makes your leak go away.
> 
> Thanks.  I'll try it, but:
> 
> 1. Looking at the date of the change it seems unlikely.  The recent
>    death occurred with 2.6.10-rc2, booted on 29th November and dying
>    on 19th January, which obviously predates this cset.
> 2. It'll take a couple of days to confirm the behaviour of the dst cache.

I have another question whether ip6_output.c is the problem - the leak
is with ip_dst_cache (== IPv4).  If the problem were ip6_output, wouldn't
we see ip6_dst_cache leaking instead?

Anyway, I've produced some code which keeps a record of the __refcnt
increments and decrements, and I think it's produced some interesting
results.  Essentially, I'm seeing the odd dst entry with a __refcnt of
14000 or so (which is still in active use, so probably ok), and a number
with 4, 7, and 13 which haven't had the refcount touched for at least 14
minutes.

One of these was created via ip_route_input_slow(), the other three via
ip_route_output_slow().  That isn't significant on its own.

However, whenever ip_copy_metadata() appears in the refcount log, I see
half the number of increments due to that still remaining to be
decremented (see the output below).  0 = "mark", positive numbers =
increment refcnt this many times, negative numbers = decrement refcnt
this many times.

I don't know if the code is using fragment lists in ip_fragment(), but
on reading the code a question comes to mind: if we have a list of
fragments, does each fragment skb have a valid (and refcounted) dst
pointer before ip_fragment() does its job?  If yes, then isn't the
first ip_copy_metadata() in ip_fragment() going to overwrite this
pointer without dropping the refcount?

All that said, it's probably far too early to read much into these
results - once the machine has been running for more than 19 minutes
and has a significant number of "stuck" dst cache entries, I think
it'll be far more conclusive.  Nevertheless, it looks like food for
thought.

dst pointer: creation time (200Hz jiffies) last reference time (200Hz jiffies)
c1c66260: ffff6c79 ffff879d:
	location count	function
        c01054f4 0      dst_alloc
        c0114a80 1      ip_route_input_slow
        c00fa95c -18    __kfree_skb
        c0115104 13     ip_route_input
        c011ae1c 8      ip_copy_metadata
        c01055ac 0      __dst_free
	untracked counts
        : 0
	total
	= 4
  next=c1c66b60 refcnt=00000004 use=0000000d dst=24f45cc3 src=0f00a8c0

c1c66b60: ffff20fe ffff5066:
        c01054f4 0      dst_alloc
        c01156e8 1      ip_route_output_slow
        c011b854 6813   ip_append_data
        c011c7e0 6813   ip_push_pending_frames
        c00fa95c -6826  __kfree_skb
        c011c8fc -6813  ip_push_pending_frames
        c0139dbc -6813  udp_sendmsg
        c0115a0c 6814   __ip_route_output_key
        c013764c -2     ip4_datagram_connect
        c011ae1c 26     ip_copy_metadata
        c01055ac 0      __dst_free
        : 0
	= 13
  next=c1c57680 refcnt=0000000d use=00001a9e dst=bbe812d4 src=bae812d4

c1c66960: ffff89ac ffffa42d:
        c01054f4 0      dst_alloc
        c01156e8 1      ip_route_output_slow
        c011b854 3028   ip_append_data
        c0139dbc -3028  udp_sendmsg
        c011c7e0 3028   ip_push_pending_frames
        c011ae1c 8      ip_copy_metadata
        c00fa95c -3032  __kfree_skb
        c011c8fc -3028  ip_push_pending_frames
        c0115a0c 3027   __ip_route_output_key
        c01055ac 0      __dst_free
        : 0
	= 4
  next=c16d1080 refcnt=00000004 use=00000bd3 dst=bbe812d4 src=bae812d4

c16d1080: ffff879b ffff89af:
        c01054f4 0      dst_alloc
        c01156e8 1      ip_route_output_slow
        c011b854 240    ip_append_data
        c011c7e0 240    ip_push_pending_frames
        c00fa95c -247   __kfree_skb
        c011c8fc -240   ip_push_pending_frames
        c0139dbc -240   udp_sendmsg
        c0115a0c 239    __ip_route_output_key
        c011ae1c 14     ip_copy_metadata
        c01055ac 0      __dst_free
        : 0
	= 7
  next=c1c66260 refcnt=00000007 use=000000ef dst=bbe812d4 src=bae812d4
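
The bookkeeping in the log above can be checked mechanically.  What
follows is only a sketch of that arithmetic, not the instrumentation
used to produce the log: the names ref_event, refcnt_total(),
refcnt_by_func() and LOG_LEN are made up for illustration, and the
two tables are copied from the c1c66260 and c1c66b60 entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One line of the refcount log: the function that touched __refcnt and
 * the net count recorded against it.  0 is a "mark" and changes nothing. */
struct ref_event {
	const char *func;
	long count;
};

#define LOG_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Sum every event; this should equal the entry's current refcnt. */
long refcnt_total(const struct ref_event *ev, size_t n)
{
	long total = 0;
	size_t i;

	assert(ev != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += ev[i].count;
	return total;
}

/* Sum only the events recorded against one function name. */
long refcnt_by_func(const struct ref_event *ev, size_t n, const char *func)
{
	long total = 0;
	size_t i;

	assert(func != NULL);
	for (i = 0; i < n; i++)
		if (strcmp(ev[i].func, func) == 0)
			total += ev[i].count;
	return total;
}

/* Log for dst entry c1c66260, copied from the output above. */
static const struct ref_event log_c1c66260[] = {
	{ "dst_alloc", 0 },
	{ "ip_route_input_slow", 1 },
	{ "__kfree_skb", -18 },
	{ "ip_route_input", 13 },
	{ "ip_copy_metadata", 8 },
	{ "__dst_free", 0 },
};

/* Log for dst entry c1c66b60, copied from the output above. */
static const struct ref_event log_c1c66b60[] = {
	{ "dst_alloc", 0 },
	{ "ip_route_output_slow", 1 },
	{ "ip_append_data", 6813 },
	{ "ip_push_pending_frames", 6813 },
	{ "__kfree_skb", -6826 },
	{ "ip_push_pending_frames", -6813 },
	{ "udp_sendmsg", -6813 },
	{ "__ip_route_output_key", 6814 },
	{ "ip4_datagram_connect", -2 },
	{ "ip_copy_metadata", 26 },
	{ "__dst_free", 0 },
};
```

Summing each table with refcnt_total() gives the "total" line of the
corresponding entry, and refcnt_by_func(..., "ip_copy_metadata") gives
the figure that is twice the remaining refcount.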


-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 13:23                                 ` Russell King
@ 2005-01-30 15:34                                   ` Russell King
  2005-01-30 16:57                                     ` Phil Oester
  2005-01-30 17:23                                   ` Patrick McHardy
  1 sibling, 1 reply; 36+ messages in thread
From: Russell King @ 2005-01-30 15:34 UTC (permalink / raw)
  To: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Sun, Jan 30, 2005 at 01:23:43PM +0000, Russell King wrote:
> Anyway, I've produced some code which keeps a record of the __refcnt
> increments and decrements, and I think it's produced some interesting
> results.  Essentially, I'm seeing the odd dst entry with a __refcnt of
> 14000 or so (which is still in active use, so probably ok), and a number
> with 4, 7, and 13 which haven't had the refcount touched for at least 14
> minutes.

An hour or so goes by.  I now have 14 dst cache entries with non-zero
refcounts, and these have the following properties:

* The five from before (with counts 13, 14473, 4, 4, 7 respectively):
  + all remain unfreed.
  + show precisely no change in the refcounts.
  + the refcount has not been touched for more than an hour.
* They have all been touched by ip_copy_metadata.
* Their remaining refcounts are precisely half the number of
  ip_copy_metadata calls in every instance.

None of the entries with a refcount of zero contain ip_copy_metadata(),
and all of them do appear in /proc/net/rt_cache.

The following may also be a pointer - from /proc/net/snmp:

Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails FragOKs FragFails FragCreates
Ip: 1 64 140510 0 0 36861 0 0 93549 131703 485 0 21 46622 15695 21 21950 0 0

Since FragCreates is 0, this means that we are using the frag_lists
rather than creating our own fragments (and indeed the first
ip_copy_metadata() call rather than the second in ip_fragment()).
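
For anyone wanting to watch these counters on their own box, here is a
small sketch of pulling one named field out of the two "Ip:" lines.
The helper names next_token() and snmp_ip_field() are hypothetical;
the only assumption about the file format is what is visible above,
namely a header line and a value line with whitespace-separated
columns in the same order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Skip whitespace at *p and return the start of the next token, storing
 * its length in *len and advancing *p past it.  NULL at end of line. */
static const char *next_token(const char **p, size_t *len)
{
	const char *s = *p + strspn(*p, " \t\n");

	if (*s == '\0')
		return NULL;
	*len = strcspn(s, " \t\n");
	*p = s + *len;
	return s;
}

/* Walk the header and value lines in step; when the header column
 * matches 'name', parse the value in the same column into *out.
 * Returns 0 on success, -1 if the column is not present. */
int snmp_ip_field(const char *header, const char *values,
		  const char *name, unsigned long *out)
{
	const char *h, *v;
	size_t hlen, vlen, nlen;

	assert(header && values && name && out);
	nlen = strlen(name);
	while ((h = next_token(&header, &hlen)) != NULL &&
	       (v = next_token(&values, &vlen)) != NULL) {
		if (hlen == nlen && memcmp(h, name, nlen) == 0) {
			*out = strtoul(v, NULL, 10);
			return 0;
		}
	}
	return -1;
}
```

Fed the two lines quoted above, looking up "FragOKs" and "FragCreates"
yields 21950 and 0 respectively, which is the combination the argument
relies on.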

I think the case against the IPv4 fragmentation code is mounting.
However, without knowing what the expected conditions for this code are
(e.g., are skbs on the fraglist supposed to have NULL skb->dst?), I'm
unable to progress this any further.  However, I think it's quite
clear that there is something bad going on here.

Why many more people aren't seeing this I've no idea.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 15:34                                   ` Russell King
@ 2005-01-30 16:57                                     ` Phil Oester
  0 siblings, 0 replies; 36+ messages in thread
From: Phil Oester @ 2005-01-30 16:57 UTC (permalink / raw)
  To: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Sun, Jan 30, 2005 at 03:34:49PM +0000, Russell King wrote:
> I think the case against the IPv4 fragmentation code is mounting.
> However, without knowing what the expected conditions for this code are
> (e.g., are skbs on the fraglist supposed to have NULL skb->dst?), I'm
> unable to progress this any further.  However, I think it's quite
> clear that there is something bad going on here.

Interesting...the gateway which exhibits the problem fastest in my
area does have a large number of fragmented UDP packets running through it,
as shown by tcpdump 'ip[6:2] & 0x1fff != 0'.
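
For reference, that filter looks at bytes 6 and 7 of the IPv4 header
(three flag bits followed by a 13-bit fragment offset) and matches
packets whose offset is non-zero, i.e. every fragment except the first
one.  A rough C equivalent is sketched below; the function and macro
names are invented for illustration and it assumes a pointer to the
start of a raw IPv4 header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IP_MF_FLAG  0x2000u	/* "more fragments" bit */
#define IP_OFF_MASK 0x1fffu	/* 13-bit fragment offset, in 8-byte units */

/* Read the big-endian flags + fragment offset field at bytes 6..7. */
static uint16_t ip_frag_field(const uint8_t *iph)
{
	return (uint16_t)((iph[6] << 8) | iph[7]);
}

/* Same test as the tcpdump expression: non-zero fragment offset.
 * Note the parentheses: in C, '!=' binds tighter than '&', so the
 * mask has to be applied explicitly before the comparison. */
int ip_is_later_fragment(const uint8_t *iph, size_t len)
{
	assert(iph != NULL);
	if (len < 20)		/* shorter than a minimal IPv4 header */
		return 0;
	return (ip_frag_field(iph) & IP_OFF_MASK) != 0;
}

/* Also catches the first fragment, which has offset 0 but MF set. */
int ip_is_any_fragment(const uint8_t *iph, size_t len)
{
	assert(iph != NULL);
	if (len < 20)
		return 0;
	return (ip_frag_field(iph) & (IP_MF_FLAG | IP_OFF_MASK)) != 0;
}
```

The second helper is there because a count based only on the offset
misses one packet per fragmented datagram.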

> Why many more people aren't seeing this I've no idea.

Perhaps you (and I) experience more fragments than the average user???

Nice detective work!

Phil

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 13:23                                 ` Russell King
  2005-01-30 15:34                                   ` Russell King
@ 2005-01-30 17:23                                   ` Patrick McHardy
  2005-01-30 17:26                                     ` Patrick McHardy
  1 sibling, 1 reply; 36+ messages in thread
From: Patrick McHardy @ 2005-01-30 17:23 UTC (permalink / raw)
  To: Russell King
  Cc: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

Russell King wrote:

>I don't know if the code is using fragment lists in ip_fragment(), but
>on reading the code a question comes to mind: if we have a list of
>fragments, does each fragment skb have a valid (and refcounted) dst
>pointer before ip_fragment() does its job?  If yes, then isn't the
>first ip_copy_metadata() in ip_fragment() going to overwrite this
>pointer without dropping the refcount?
>
Nice spotting. If conntrack isn't loaded defragmentation happens after
routing, so this is likely the cause.

Regards
Patrick

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:23                                   ` Patrick McHardy
@ 2005-01-30 17:26                                     ` Patrick McHardy
  2005-01-30 17:58                                       ` Patrick McHardy
  2005-01-30 18:01                                       ` Russell King
  0 siblings, 2 replies; 36+ messages in thread
From: Patrick McHardy @ 2005-01-30 17:26 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Russell King, David S. Miller, Robert.Olsson, akpm, torvalds,
	alexn, kas, linux-kernel, netdev

Patrick McHardy wrote:

> Russell King wrote:
>
>> I don't know if the code is using fragment lists in ip_fragment(), but
>> on reading the code a question comes to mind: if we have a list of
>> fragments, does each fragment skb have a valid (and refcounted) dst
>> pointer before ip_fragment() does its job?  If yes, then isn't the
>> first ip_copy_metadata() in ip_fragment() going to overwrite this
>> pointer without dropping the refcount?
>>
> Nice spotting. If conntrack isn't loaded defragmentation happens after
> routing, so this is likely the cause.

OTOH, if conntrack isn't loaded forwarded packets are never defragmented,
so frag_list should be empty. So probably false alarm, sorry.

> Regards
> Patrick

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:26                                     ` Patrick McHardy
@ 2005-01-30 17:58                                       ` Patrick McHardy
  2005-01-30 18:45                                         ` Russell King
                                                           ` (2 more replies)
  2005-01-30 18:01                                       ` Russell King
  1 sibling, 3 replies; 36+ messages in thread
From: Patrick McHardy @ 2005-01-30 17:58 UTC (permalink / raw)
  To: Russell King
  Cc: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 899 bytes --]

Patrick McHardy wrote:

>> Russell King wrote:
>>
>>> I don't know if the code is using fragment lists in ip_fragment(), but
>>> on reading the code a question comes to mind: if we have a list of
>>> fragments, does each fragment skb have a valid (and refcounted) dst
>>> pointer before ip_fragment() does its job?  If yes, then isn't the
>>> first ip_copy_metadata() in ip_fragment() going to overwrite this
>>> pointer without dropping the refcount?
>>>
>> Nice spotting. If conntrack isn't loaded defragmentation happens after
>> routing, so this is likely the cause.
>
>
> OTOH, if conntrack isn't loaded forwarded packets are never defragmented,
> so frag_list should be empty. So probably false alarm, sorry.

Ok, final decision: you are right :) conntrack also defragments locally
generated packets before they hit ip_fragment. In this case the fragments
have skb->dst set.
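
To make the leak concrete, here is a minimal userspace model of it.
This is a sketch under simplifying assumptions, not kernel code: the
struct layouts are reduced to the one field that matters, and every
name ending in _model, plus copy_metadata_leaky() and
copy_metadata_fixed(), is made up for illustration.  Only the idea is
taken from the thread: overwriting to->dst without releasing the old
reference loses one count per call.

```c
#include <assert.h>
#include <stddef.h>

struct dst { int refcnt; };		/* stand-in for struct dst_entry */
struct skb { struct dst *dst; };	/* stand-in for struct sk_buff */

/* Take a reference; a NULL dst is passed through untouched. */
struct dst *dst_clone_model(struct dst *d)
{
	if (d)
		d->refcnt++;
	return d;
}

/* Drop a reference; releasing NULL is a no-op. */
void dst_release_model(struct dst *d)
{
	if (d) {
		assert(d->refcnt > 0);
		d->refcnt--;
	}
}

/* Behaviour before the change: the old to->dst reference is lost. */
void copy_metadata_leaky(struct skb *to, const struct skb *from)
{
	to->dst = dst_clone_model(from->dst);
}

/* Behaviour with the change: release whatever 'to' held first. */
void copy_metadata_fixed(struct skb *to, const struct skb *from)
{
	dst_release_model(to->dst);
	to->dst = dst_clone_model(from->dst);
}

/* Freeing an skb drops the single reference it holds. */
void kfree_skb_model(struct skb *s)
{
	dst_release_model(s->dst);
	s->dst = NULL;
}
```

With a head skb and one fragment that both already hold a reference,
the leaky copy followed by freeing both skbs leaves the count at 1,
while the fixed copy brings it back to 0.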

Regards
Patrick


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 366 bytes --]

===== net/ipv4/ip_output.c 1.74 vs edited =====
--- 1.74/net/ipv4/ip_output.c	2005-01-25 01:40:10 +01:00
+++ edited/net/ipv4/ip_output.c	2005-01-30 18:54:43 +01:00
@@ -389,6 +389,7 @@
 	to->priority = from->priority;
 	to->protocol = from->protocol;
 	to->security = from->security;
+	dst_release(to->dst);
 	to->dst = dst_clone(from->dst);
 	to->dev = from->dev;
 

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:26                                     ` Patrick McHardy
  2005-01-30 17:58                                       ` Patrick McHardy
@ 2005-01-30 18:01                                       ` Russell King
  2005-01-30 18:19                                         ` Phil Oester
  1 sibling, 1 reply; 36+ messages in thread
From: Russell King @ 2005-01-30 18:01 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Sun, Jan 30, 2005 at 06:26:29PM +0100, Patrick McHardy wrote:
> Patrick McHardy wrote:
> 
> > Russell King wrote:
> >
> >> I don't know if the code is using fragment lists in ip_fragment(), but
> >> on reading the code a question comes to mind: if we have a list of
> >> fragments, does each fragment skb have a valid (and refcounted) dst
> >> pointer before ip_fragment() does its job?  If yes, then isn't the
> >> first ip_copy_metadata() in ip_fragment() going to overwrite this
> >> pointer without dropping the refcount?
> >>
> > Nice spotting. If conntrack isn't loaded defragmentation happens after
> > routing, so this is likely the cause.
> 
> OTOH, if conntrack isn't loaded forwarded packets are never defragmented,
> so frag_list should be empty. So probably false alarm, sorry.

I've just checked Phil's mails - both Phil and myself are using
netfilter on the troublesome boxen.

Also, FragCreates is zero, which means that the frag_list was non-empty
in every case so far where ip_fragment() has been called.
(Reading the code, if frag_list was empty, we'd have to create some
fragments, which increments the FragCreates statistic.)

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 18:01                                       ` Russell King
@ 2005-01-30 18:19                                         ` Phil Oester
  0 siblings, 0 replies; 36+ messages in thread
From: Phil Oester @ 2005-01-30 18:19 UTC (permalink / raw)
  To: Patrick McHardy, David S. Miller, Robert.Olsson, akpm, torvalds,
	alexn, kas, linux-kernel, netdev

On Sun, Jan 30, 2005 at 06:01:46PM +0000, Russell King wrote:
> > OTOH, if conntrack isn't loaded forwarded packets are never defragmented,
> > so frag_list should be empty. So probably false alarm, sorry.
> 
> I've just checked Phil's mails - both Phil and myself are using
> netfilter on the troublesome boxen.
> 
> Also, FragCreates is zero, which means that the frag_list was non-empty
> in every case so far where ip_fragment() has been called.
> (Reading the code, if frag_list was empty, we'd have to create some
> fragments, which increments the FragCreates statistic.)

The below testcase seems to illustrate the problem nicely -- ip_dst_cache
grows but never shrinks:

On gateway:

iptables -I FORWARD -d 10.10.10.0/24 -j DROP

On client:

for i in `seq 1 254` ; do ping -s 1500 -c 5 -w 1 -f 10.10.10.$i ; done
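
To watch the cache while the loop runs, the ip_dst_cache line of
/proc/slabinfo can be sampled before and after.  The sketch below
parses one such line; it assumes the version 2.1 column order quoted
earlier in the thread (name, active_objs, num_objs, objsize, ...), and
the names slab_counts and parse_slab_line() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct slab_counts {
	unsigned long active_objs;	/* objects currently in use */
	unsigned long num_objs;		/* objects allocated in total */
	unsigned long objsize;		/* size of one object in bytes */
};

/* Parse one /proc/slabinfo line.  Returns 0 and fills *out if the line
 * describes 'cache'; returns -1 for other caches or malformed lines
 * (which includes the "slabinfo - version" and "# name" header lines). */
int parse_slab_line(const char *line, const char *cache,
		    struct slab_counts *out)
{
	char name[64];
	unsigned long active, total, size;

	assert(line && cache && out);
	if (sscanf(line, "%63s %lu %lu %lu", name, &active, &total, &size) != 4)
		return -1;
	if (strcmp(name, cache) != 0)
		return -1;
	out->active_objs = active;
	out->num_objs = total;
	out->objsize = size;
	return 0;
}
```

Calling this on each line read from /proc/slabinfo with cache set to
"ip_dst_cache", once before the ping loop and once some time after it,
shows whether active_objs ever comes back down.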


Phil

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:58                                       ` Patrick McHardy
@ 2005-01-30 18:45                                         ` Russell King
  2005-01-31  2:48                                         ` David S. Miller
  2005-01-31  4:11                                         ` Herbert Xu
  2 siblings, 0 replies; 36+ messages in thread
From: Russell King @ 2005-01-30 18:45 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Sun, Jan 30, 2005 at 06:58:27PM +0100, Patrick McHardy wrote:
> Patrick McHardy wrote:
> >> Russell King wrote:
> >>> I don't know if the code is using fragment lists in ip_fragment(), but
> >>> on reading the code a question comes to mind: if we have a list of
> >>> fragments, does each fragment skb have a valid (and refcounted) dst
> >>> pointer before ip_fragment() does its job?  If yes, then isn't the
> >>> first ip_copy_metadata() in ip_fragment() going to overwrite this
> >>> pointer without dropping the refcount?
> >>>
> >> Nice spotting. If conntrack isn't loaded defragmentation happens after
> >> routing, so this is likely the cause.
> >
> > OTOH, if conntrack isn't loaded forwarded packets are never defragmented,
> > so frag_list should be empty. So probably false alarm, sorry.
> 
> Ok, final decision: you are right :) conntrack also defragments locally
> generated packets before they hit ip_fragment. In this case the fragments
> have skb->dst set.

Good news - with this in place, I no longer have refcounts of 14000!
After 18 minutes (the first clearout of the dst cache from 500 odd
down to 11 or so), all dst cache entries have a ref count of zero.

I'll check it again later this evening to be sure.

Thanks Patrick.

> ===== net/ipv4/ip_output.c 1.74 vs edited =====
> --- 1.74/net/ipv4/ip_output.c	2005-01-25 01:40:10 +01:00
> +++ edited/net/ipv4/ip_output.c	2005-01-30 18:54:43 +01:00
> @@ -389,6 +389,7 @@
>  	to->priority = from->priority;
>  	to->protocol = from->protocol;
>  	to->security = from->security;
> +	dst_release(to->dst);
>  	to->dst = dst_clone(from->dst);
>  	to->dev = from->dev;
>  


-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:58                                       ` Patrick McHardy
  2005-01-30 18:45                                         ` Russell King
@ 2005-01-31  2:48                                         ` David S. Miller
  2005-01-31  4:11                                         ` Herbert Xu
  2 siblings, 0 replies; 36+ messages in thread
From: David S. Miller @ 2005-01-31  2:48 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: rmk+lkml, Robert.Olsson, akpm, torvalds, alexn, kas, linux-kernel,
	netdev

On Sun, 30 Jan 2005 18:58:27 +0100
Patrick McHardy <kaber@trash.net> wrote:

> Ok, final decision: you are right :) conntrack also defragments locally
> generated packets before they hit ip_fragment. In this case the fragments
> have skb->dst set.

It's amazing how many bugs exist due to the local defragmentation and
refragmentation done by netfilter. :-)

Good catch Patrick, I'll apply this and push upstream.

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:58                                       ` Patrick McHardy
  2005-01-30 18:45                                         ` Russell King
  2005-01-31  2:48                                         ` David S. Miller
@ 2005-01-31  4:11                                         ` Herbert Xu
  2005-01-31  4:45                                           ` YOSHIFUJI Hideaki / 吉藤英明
  2 siblings, 1 reply; 36+ messages in thread
From: Herbert Xu @ 2005-01-31  4:11 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: rmk+lkml, davem, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

Patrick McHardy <kaber@trash.net> wrote:
> 
> Ok, final decision: you are right :) conntrack also defragments locally
> generated packets before they hit ip_fragment. In this case the fragments
> have skb->dst set.

Well caught.  The same thing is needed for IPv6, right?
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  4:11                                         ` Herbert Xu
@ 2005-01-31  4:45                                           ` YOSHIFUJI Hideaki / 吉藤英明
  2005-01-31  5:00                                             ` Patrick McHardy
  0 siblings, 1 reply; 36+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2005-01-31  4:45 UTC (permalink / raw)
  To: herbert, davem
  Cc: kaber, rmk+lkml, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev, yoshfuji

In article <E1CvSuS-00056x-00@gondolin.me.apana.org.au> (at Mon, 31 Jan 2005 15:11:32 +1100), Herbert Xu <herbert@gondor.apana.org.au> says:

> Patrick McHardy <kaber@trash.net> wrote:
> > 
> > Ok, final decision: you are right :) conntrack also defragments locally
> > generated packets before they hit ip_fragment. In this case the fragments
> > have skb->dst set.
> 
> Well caught.  The same thing is needed for IPv6, right?

(not yet confirmed, but) yes, please.

Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>

===== net/ipv6/ip6_output.c 1.82 vs edited =====
--- 1.82/net/ipv6/ip6_output.c	2005-01-25 09:40:10 +09:00
+++ edited/net/ipv6/ip6_output.c	2005-01-31 13:44:01 +09:00
@@ -463,6 +463,7 @@
 	to->priority = from->priority;
 	to->protocol = from->protocol;
 	to->security = from->security;
+	dst_release(to->dst);
 	to->dst = dst_clone(from->dst);
 	to->dev = from->dev;
 

--yoshfuji

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  4:45                                           ` YOSHIFUJI Hideaki / 吉藤英明
@ 2005-01-31  5:00                                             ` Patrick McHardy
  2005-01-31  5:11                                               ` David S. Miller
  2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
  0 siblings, 2 replies; 36+ messages in thread
From: Patrick McHardy @ 2005-01-31  5:00 UTC (permalink / raw)
  To: yoshfuji
  Cc: herbert, davem, rmk+lkml, Robert.Olsson, akpm, torvalds, alexn,
	kas, linux-kernel, netdev

YOSHIFUJI Hideaki / 吉藤英明 wrote:

>In article <E1CvSuS-00056x-00@gondolin.me.apana.org.au> (at Mon, 31 Jan 2005 15:11:32 +1100), Herbert Xu <herbert@gondor.apana.org.au> says:
>
>
>>Patrick McHardy <kaber@trash.net> wrote:
>>
>>>Ok, final decision: you are right :) conntrack also defragments locally
>>>generated packets before they hit ip_fragment. In this case the fragments
>>>have skb->dst set.
>>>
>>Well caught.  The same thing is needed for IPv6, right?
>>
>
>(not yet confirmed, but) yes, please.
>
We don't need this for IPv6 yet. Once we get nf_conntrack in we
might need this, but its IPv6 fragment handling is different from
ip_conntrack, I need to check first.

Regards
Patrick

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  5:00                                             ` Patrick McHardy
@ 2005-01-31  5:11                                               ` David S. Miller
  2005-01-31  5:40                                                 ` Herbert Xu
  2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
  1 sibling, 1 reply; 36+ messages in thread
From: David S. Miller @ 2005-01-31  5:11 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: yoshfuji, herbert, rmk+lkml, Robert.Olsson, akpm, torvalds, alexn,
	kas, linux-kernel, netdev

On Mon, 31 Jan 2005 06:00:40 +0100
Patrick McHardy <kaber@trash.net> wrote:

> We don't need this for IPv6 yet. Once we get nf_conntrack in we
> might need this, but its IPv6 fragment handling is different from
> ip_conntrack, I need to check first.

Right, ipv6 netfilter cannot create this situation yet.

However, logically the fix is still correct and I'll add
it into the tree.

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  5:00                                             ` Patrick McHardy
  2005-01-31  5:11                                               ` David S. Miller
@ 2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
  2005-01-31  5:42                                                 ` Yasuyuki KOZAKAI
  1 sibling, 1 reply; 36+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2005-01-31  5:16 UTC (permalink / raw)
  To: kaber, kozakai
  Cc: herbert, davem, rmk+lkml, Robert.Olsson, akpm, torvalds, alexn,
	kas, linux-kernel, netdev

In article <41FDBB78.2050403@trash.net> (at Mon, 31 Jan 2005 06:00:40 +0100), Patrick McHardy <kaber@trash.net> says:

|We don't need this for IPv6 yet. Once we get nf_conntrack in we
|might need this, but its IPv6 fragment handling is different from
|ip_conntrack, I need to check first.

Ok. It would be better to have some comment but anyway...
kozakai-san?

--yoshfuji

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  5:11                                               ` David S. Miller
@ 2005-01-31  5:40                                                 ` Herbert Xu
  0 siblings, 0 replies; 36+ messages in thread
From: Herbert Xu @ 2005-01-31  5:40 UTC (permalink / raw)
  To: David S. Miller
  Cc: Patrick McHardy, yoshfuji, rmk+lkml, Robert.Olsson, akpm,
	torvalds, alexn, kas, linux-kernel, netdev

On Sun, Jan 30, 2005 at 09:11:50PM -0800, David S. Miller wrote:
> On Mon, 31 Jan 2005 06:00:40 +0100
> Patrick McHardy <kaber@trash.net> wrote:
> 
> > We don't need this for IPv6 yet. Once we get nf_conntrack in we
> > might need this, but its IPv6 fragment handling is different from
> > ip_conntrack, I need to check first.
> 
> Right, ipv6 netfilter cannot create this situation yet.

Not through netfilter but I'm not convinced that other paths
won't do this.

For instance, what about ipv6_frag_rcv -> esp6_input -> ... -> ip6_fragment?
That would seem to be a potential path for a non-NULL dst to survive
through to ip6_fragment, no?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
@ 2005-01-31  5:42                                                 ` Yasuyuki KOZAKAI
  0 siblings, 0 replies; 36+ messages in thread
From: Yasuyuki KOZAKAI @ 2005-01-31  5:42 UTC (permalink / raw)
  To: yoshfuji
  Cc: kaber, kozakai, herbert, davem, rmk+lkml, Robert.Olsson, akpm,
	torvalds, alexn, kas, linux-kernel, netdev


Hi,

From: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Date: Mon, 31 Jan 2005 14:16:36 +0900 (JST)

> In article <41FDBB78.2050403@trash.net> (at Mon, 31 Jan 2005 06:00:40 +0100), Patrick McHardy <kaber@trash.net> says:
> 
> |We don't need this for IPv6 yet. Once we get nf_conntrack in we
> |might need this, but its IPv6 fragment handling is different from
> |ip_conntrack, I need to check first.
> 
> Ok. It would be better to have some comment but anyway...
> kozakai-san?

IMO, a fix for nf_conntrack isn't needed yet, because someone may still
change the IPv6 fragment handling in nf_conntrack.

Anyway, the current nf_conntrack passes the original (not defragmented) skb
to the IPv6 stack. nf_conntrack doesn't touch its dst.

Regards,
----------------------------------------
Yasuyuki KOZAKAI

Communication Platform Laboratory,
Corporate Research & Development Center,
Toshiba Corporation

yasuyuki.kozakai@toshiba.co.jp
----------------------------------------


Thread overview: 36+ messages
     [not found] <20050121161959.GO3922@fi.muni.cz>
     [not found] ` <1106360639.15804.1.camel@boxen>
     [not found]   ` <20050123091154.GC16648@suse.de>
     [not found]     ` <20050123011918.295db8e8.akpm@osdl.org>
     [not found]       ` <20050123095608.GD16648@suse.de>
     [not found]         ` <20050123023248.263daca9.akpm@osdl.org>
2005-01-23 20:03           ` Memory leak in 2.6.11-rc1? Russell King
2005-01-24 11:48             ` Russell King
2005-01-25 19:32               ` Russell King
2005-01-27  8:28                 ` Russell King
2005-01-27  8:47                   ` Andrew Morton
2005-01-27 10:19                     ` Alessandro Suardi
2005-01-27 12:17                     ` Martin Josefsson
2005-01-27 12:56                     ` Robert Olsson
2005-01-27 13:03                       ` Robert Olsson
2005-01-27 16:49                       ` Russell King
2005-01-27 18:37                         ` Phil Oester
2005-01-27 19:25                           ` Russell King
2005-01-27 20:40                             ` Phil Oester
2005-01-28  9:32                               ` Russell King
2005-01-27 20:33                         ` David S. Miller
2005-01-28  0:17                           ` Russell King
2005-01-28  0:34                             ` David S. Miller
2005-01-28  8:58                               ` Russell King
2005-01-30 13:23                                 ` Russell King
2005-01-30 15:34                                   ` Russell King
2005-01-30 16:57                                     ` Phil Oester
2005-01-30 17:23                                   ` Patrick McHardy
2005-01-30 17:26                                     ` Patrick McHardy
2005-01-30 17:58                                       ` Patrick McHardy
2005-01-30 18:45                                         ` Russell King
2005-01-31  2:48                                         ` David S. Miller
2005-01-31  4:11                                         ` Herbert Xu
2005-01-31  4:45                                           ` YOSHIFUJI Hideaki / 吉藤英明
2005-01-31  5:00                                             ` Patrick McHardy
2005-01-31  5:11                                               ` David S. Miller
2005-01-31  5:40                                                 ` Herbert Xu
2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
2005-01-31  5:42                                                 ` Yasuyuki KOZAKAI
2005-01-30 18:01                                       ` Russell King
2005-01-30 18:19                                         ` Phil Oester
2005-01-28  1:41                             ` Phil Oester
