netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* deleting a conntrack record
@ 2004-06-17 15:07 Tobias DiPasquale
  2004-06-17 16:02 ` Patrick McHardy
  0 siblings, 1 reply; 5+ messages in thread
From: Tobias DiPasquale @ 2004-06-17 15:07 UTC (permalink / raw)
  To: netdev, linux-net, netfilter

Hello all,

I have a module that exports a /proc entry which takes a string with 4
args in it (src IP/port and dst IP/port) and then attempts to delete
the conntrack entry for the TCP connection associated with those
arguments. Here's the code in question (keep reading past the code for
a description of the problem I'm having):

<code>
static inline int kill_ct_record( const struct ip_conntrack *c, void *p)
{
       struct ip_conntrack *q = (struct ip_conntrack *)p;

       if (!memcmp( &c->tuplehash[IP_CT_DIR_ORIGINAL],
                    &q->tuplehash[IP_CT_DIR_ORIGINAL],
                    sizeof( struct ip_conntrack_tuple_hash))) {
               ip_conntrack_put( q);
               return 1;
       }
       return 0;
}

static int delete_ct_record( u_int32_t src, u_int16_t sport, u_int32_t
dst, u_int16_t dport)
{
       struct ip_conntrack_tuple tuple;
       struct ip_conntrack_tuple_hash *h;

       memset( &tuple, 0, sizeof( tuple));
       tuple.src.ip = src;
       tuple.src.u.tcp.port = sport;
       tuple.dst.ip = dst;
       tuple.dst.u.tcp.port = dport;
       tuple.dst.protonum = IPPROTO_TCP;
       h = ip_conntrack_find_get( &tuple, NULL);
       if (!h)
               return -ENOENT;
       ip_ct_selective_cleanup( kill_ct_record, h->ctrack);
       return 1;
}
</code>

The problem is as follows:

There is a userspace script that runs from cron every 5 minutes. It
looks through the /proc/net/ip_conntrack listing to see if any 
connections are "stale" (i.e. haven't seen a packet from them in
some amount of time). It then feeds their connection information
into my module's /proc entry so that those conntrack records can
be destroyed.

In the kill_ct_record() function in the module, if the 
ip_conntrack_put() call is not commented out, this causes the box 
to go into some infinite loop after some unspecified amount of time. 
There is no LKCD dump and I don't know what happened since I wasn't 
physically present for the crash in any of the instances.

On the other hand, when the ip_conntrack_put() call _is_ commented
out, the system leaks memory from conntrack as indicated in the
ip_conntrack line in /proc/slabinfo. But the crash doesn't happen
under that condition.

So, is there a cleaner way to hand-delete a conntrack record? Or is
this the only method? Or is there some error in the way that I am
doing the above?

By the way, this is almost exactly what ctnetlink does to delete a
conntrack record so any errors discovered here will almost surely have
to be fixed there, as well.

-- 
[ Tobias DiPasquale ]
0x636f6465736c696e67657240676d61696c2e636f6d

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: deleting a conntrack record
  2004-06-17 15:07 deleting a conntrack record Tobias DiPasquale
@ 2004-06-17 16:02 ` Patrick McHardy
  2004-06-17 16:17   ` Tobias DiPasquale
  0 siblings, 1 reply; 5+ messages in thread
From: Patrick McHardy @ 2004-06-17 16:02 UTC (permalink / raw)
  To: Tobias DiPasquale; +Cc: netdev, linux-net, netfilter

Tobias DiPasquale wrote:
> Hello all,
> 
> I have a module that exports a /proc entry which takes a string with 4
> args in it (src IP/port and dst IP/port) and then attempts to delete
> the conntrack entry for the TCP connection associated with those
> arguments. Here's the code in question (keep reading past the code for
> a description of the problem I'm having):
> 
> <code>
> static inline int kill_ct_record( const struct ip_conntrack *c, void *p)
> {
>        struct ip_conntrack *q = (struct ip_conntrack *)p;
> 
>        if (!memcmp( &c->tuplehash[IP_CT_DIR_ORIGINAL],
>                     &q->tuplehash[IP_CT_DIR_ORIGINAL],
>                     sizeof( struct ip_conntrack_tuple_hash))) {
>                ip_conntrack_put( q);
>                return 1;
>        }
>        return 0;
> }
> 
> static int delete_ct_record( u_int32_t src, u_int16_t sport, u_int32_t
> dst, u_int16_t dport)
> {
>        struct ip_conntrack_tuple tuple;
>        struct ip_conntrack_tuple_hash *h;
> 
>        memset( &tuple, 0, sizeof( tuple));
>        tuple.src.ip = src;
>        tuple.src.u.tcp.port = sport;
>        tuple.dst.ip = dst;
>        tuple.dst.u.tcp.port = dport;
>        tuple.dst.protonum = IPPROTO_TCP;
>        h = ip_conntrack_find_get( &tuple, NULL);
>        if (!h)
>                return -ENOENT;
>        ip_ct_selective_cleanup( kill_ct_record, h->ctrack);
>        return 1;
> }
> </code>
> 
> The problem is as follows:
> 
> There is a userspace script that runs from cron every 5 minutes. It
> looks through the /proc/net/ip_conntrack listing to see if any 
> connections are "stale" (i.e. haven't seen a packet from them in
> some amount of time). It then feeds their connection information
> into my module's /proc entry so that those conntrack records can
> be destroyed.

Why don't you just adjust the timeout values in
/proc/sys/net/ipv4/netfilter ?

> 
> In the kill_ct_record() function in the module, if the 
> ip_conntrack_put() call is not commented out, this causes the box 
> to go into some infinite loop after some unspecified amount of time. 
> There is no LKCD dump and I don't know what happened since I wasn't 
> physically present for the crash in any of the instances.
> 
> On the other hand, when the ip_conntrack_put() call _is_ commented
> out, the system leaks memory from conntrack as indicated in the
> ip_conntrack line in /proc/slabinfo. But the crash doesn't happen
> under that condition.

The function passed to ip_ct_selective_cleanup is supposed to decide
if a conntrack should be destroyed by returning 0/1, not to do it
itself. ip_ct_selective_cleanup tries to destroy the already destroyed
conntrack.

> 
> So, is there a cleaner way to hand-delete a conntrack record? Or is
> this the only method? Or is there some error in the way that I am
> doing the above?
> 
> By the way, this is almost exactly what ctnetlink does to delete a
> conntrack record so any errors discovered here will almost surely have
> to be fixed there, as well.
> 

Thanks for pointing that out, I've fixed the ctnetlink code.

Regards
Patrick

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: deleting a conntrack record
  2004-06-17 16:02 ` Patrick McHardy
@ 2004-06-17 16:17   ` Tobias DiPasquale
  2004-06-17 16:42     ` Patrick McHardy
  0 siblings, 1 reply; 5+ messages in thread
From: Tobias DiPasquale @ 2004-06-17 16:17 UTC (permalink / raw)
  To: netdev, linux-net, netfilter

On Thu, 17 Jun 2004 18:02:16 +0200, Patrick McHardy <kaber@trash.net> wrote:
> Why don't you just adjust the timeout values in
> /proc/sys/net/ipv4/netfilter ?

Can't, because I only want to shorten the lifespans of some particular
TCP connections and also when I delete a record I need to do some
other stuff that's associated with the destruction of that connection.
 
> The function passed to ip_ct_selective_cleanup is supposed to decide
> if a conntrack should be destroyed by returning 0/1, not to do it
> itself. ip_ct_selective_cleanup tries to destroy the already destroyed
> conntrack.

This results in a memory leak in the conntrack slab cache. If you
don't call ip_conntrack_put(), the conntrack entry leaves the
ip_ct_selective_cleanup() function with a value >0 and thus is a
permanent part of the scenery in RAM. As well, its been removed from
the conntrack hash table, so it no longer appears in
/proc/net/ip_conntrack, but you can see the effects by viewing
/proc/slabinfo.

I have printed the integral value of the ct_general.use variable out
in order to confirm this on many occassions. If this is not the case,
then something extremely weird is going on with my kernel.

(btw, the kernel is a kernel.org 2.4.26 with the clear_fpu and the
2.4.26 LKCD (from the ML) patch applied)

-- 
[ Tobias DiPasquale ]
0x636f6465736c696e67657240676d61696c2e636f6d

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: deleting a conntrack record
  2004-06-17 16:17   ` Tobias DiPasquale
@ 2004-06-17 16:42     ` Patrick McHardy
  2004-06-17 23:03       ` Tobias DiPasquale
  0 siblings, 1 reply; 5+ messages in thread
From: Patrick McHardy @ 2004-06-17 16:42 UTC (permalink / raw)
  To: Tobias DiPasquale; +Cc: netdev, linux-net, netfilter

Tobias DiPasquale wrote:
> On Thu, 17 Jun 2004 18:02:16 +0200, Patrick McHardy <kaber@trash.net> wrote:
> 
>>The function passed to ip_ct_selective_cleanup is supposed to decide
>>if a conntrack should be destroyed by returning 0/1, not to do it
>>itself. ip_ct_selective_cleanup tries to destroy the already destroyed
>>conntrack.
> 
> This results in a memory leak in the conntrack slab cache. If you
> don't call ip_conntrack_put(), the conntrack entry leaves the
> ip_ct_selective_cleanup() function with a value >0 and thus is a
> permanent part of the scenery in RAM. As well, its been removed from
> the conntrack hash table, so it no longer appears in
> /proc/net/ip_conntrack, but you can see the effects by viewing
> /proc/slabinfo.

Now I remember - the reason why ctnetlink called ip_conntrack_put
in ctnetlink_kill is because it uses ip_conntrack_find_get before,
which increments the reference count. This is wrong because the
conntrack is then destroyed immediately after returning 1 to
ip_ct_selective_cleanup, but still used for comparing the tuple.
You (and ctnetlink) need to call ip_conntrack_put after the
call to ip_ct_selective_cleanup. In fact you shouldn't use
ip_ct_selective_cleanup at all but destroy it yourself. You
already have a reference, so there is no need to iterate through
the entire hash.

Regards
Patrick

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: deleting a conntrack record
  2004-06-17 16:42     ` Patrick McHardy
@ 2004-06-17 23:03       ` Tobias DiPasquale
  0 siblings, 0 replies; 5+ messages in thread
From: Tobias DiPasquale @ 2004-06-17 23:03 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, linux-net, netfilter

On Thu, 17 Jun 2004 18:42:35 +0200, Patrick McHardy <kaber@trash.net> wrote:
> In fact you shouldn't use ip_ct_selective_cleanup at all but destroy 
> it yourself. You already have a reference, so there is no need to 
> iterate through the entire hash.

In case anyone is interested, as a followup, the above advice works
perfectly. The code now looks like this:

<code>
static int delete_ct_record( u_int32_t src, u_int16_t sport, u_int32_t
dst, u_int16_t dport)
{
      struct ip_conntrack_tuple tuple;
      struct ip_conntrack_tuple_hash *h;

      memset( &tuple, 0, sizeof( tuple));
      tuple.src.ip = src;
      tuple.src.u.tcp.port = sport;
      tuple.dst.ip = dst;
      tuple.dst.u.tcp.port = dport;
      tuple.dst.protonum = IPPROTO_TCP;
      h = ip_conntrack_find_get( &tuple, NULL);
      if (!h)
              return -ENOENT;
      if (del_timer( &h->ctrack->timeout))
              h->ctrack->timeout.function( (unsigned long)h->ctrack);
      ip_conntrack_put( h->ctrack);
      return 1;
}
</code>

As well, the kill_ct_record() function has been removed as it is now
useless. Thanks for all the help, Patrick especially. :)

-- 
[ Tobias DiPasquale ]
0x636f6465736c696e67657240676d61696c2e636f6d

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2004-06-17 23:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-06-17 15:07 deleting a conntrack record Tobias DiPasquale
2004-06-17 16:02 ` Patrick McHardy
2004-06-17 16:17   ` Tobias DiPasquale
2004-06-17 16:42     ` Patrick McHardy
2004-06-17 23:03       ` Tobias DiPasquale

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).