BUG() in ip_ct_event_cache

* BUG() in ip_ct_event_cache_flush()?
@ 2005-10-06  6:40 David S. Miller
  2005-10-07  4:09 ` Harald Welte
  0 siblings, 1 reply; 6+ messages in thread
From: David S. Miller @ 2005-10-06  6:40 UTC
  To: netfilter-devel; +Cc: laforge, kaber

I've seen an OOPS backtrace with current 2.6.x kernels
that seems to be in ip_ct_event_cache_flush().

This loop there is suspect:

	for_each_cpu(cpu) {
		ecache = &per_cpu(ip_conntrack_ecache, cpu);
		if (ecache->ct)
			ip_conntrack_put(ecache->ct);
	}

This should use "for_each_online_cpu()" I think.  Non-possible
cpus end up with percpu pointers being NULL or undefined.

Comments?

* Re: BUG() in ip_ct_event_cache_flush()?
  2005-10-07  4:09 ` Harald Welte
@ 2005-10-06 20:14   ` David S. Miller
  2005-10-07 21:35     ` Harald Welte
  0 siblings, 1 reply; 6+ messages in thread
From: David S. Miller @ 2005-10-06 20:14 UTC
  To: laforge; +Cc: netfilter-devel, kaber

From: Harald Welte <laforge@netfilter.org>
Date: Fri, 7 Oct 2005 06:09:06 +0200

> On Wed, Oct 05, 2005 at 11:40:46PM -0700, David S. Miller wrote:
> > 
> > I've seen an OOPS backtrace with current 2.6.x kernels
> > that seems to be in ip_ct_event_cache_flush().
> 
> :( do you have the oops/backtrace?

It's in a Fedora bug report here:

http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=169981

> Mh, but what happens if we take a cpu offline?  Let's say the event
> cache is used on four cpus, but then one cpu goes offline.  Ideally, we
> would still flush the event cache for the cpu that has now gone
> offline, since otherwise we lose the event.

Right.

> According to the documentation in include/linux/cpumask.h, "for_each_cpu"
> actually iterates over 'cpu_possible_map'.  So how would we end up with
> a "non-possible" cpu, according to your comment?

Good point.  I'm just throwing out ideas :-)

The reporter is running an SMP AMD64 kernel on a laptop, FWIW.

* Re: BUG() in ip_ct_event_cache_flush()?
  2005-10-06  6:40 BUG() in ip_ct_event_cache_flush()? David S. Miller
@ 2005-10-07  4:09 ` Harald Welte
  2005-10-06 20:14   ` David S. Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Harald Welte @ 2005-10-07  4:09 UTC
  To: David S. Miller; +Cc: netfilter-devel, kaber

On Wed, Oct 05, 2005 at 11:40:46PM -0700, David S. Miller wrote:
> 
> I've seen an OOPS backtrace with current 2.6.x kernels
> that seems to be in ip_ct_event_cache_flush().

:( do you have the oops/backtrace?

> This loop there is suspect:
> 
> 	for_each_cpu(cpu) {
> 		ecache = &per_cpu(ip_conntrack_ecache, cpu);
> 		if (ecache->ct)
> 			ip_conntrack_put(ecache->ct);
> 	}
> 
> This should use "for_each_online_cpu()" I think.  Non-possible
> cpus end up with percpu pointers being NULL or undefined.

Mh, but what happens if we take a cpu offline?  Let's say the event
cache is used on four cpus, but then one cpu goes offline.  Ideally, we
would still flush the event cache for the cpu that has now gone
offline, since otherwise we lose the event.

According to the documentation in include/linux/cpumask.h, "for_each_cpu"
actually iterates over 'cpu_possible_map'.  So how would we end up with
a "non-possible" cpu, according to your comment?
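
The trade-off between the two iterators can be sketched in userspace C.
This is only a minimal model, not the kernel code: NR_CPUS, the
cpu_possible and cpu_online arrays, struct conn, struct ecache,
flush_ecaches() and leaked_after_flush() are hypothetical stand-ins for
the kernel's cpumasks and the per-cpu ip_conntrack_ecache.  It assumes
four possible cpus that each cache one conntrack reference, after which
cpu 3 goes offline.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8	/* hypothetical compile-time maximum */

struct conn { int refcount; };		/* stand-in for a conntrack entry */
struct ecache { struct conn *ct; };	/* stand-in for the per-cpu event cache */

static bool cpu_possible[NR_CPUS];	/* models the "possible" mask */
static bool cpu_online[NR_CPUS];	/* models the "online" mask */
static struct ecache ecaches[NR_CPUS];

/* Drop the cached reference on every cpu selected by mask.
 * Returns the number of references dropped. */
static int flush_ecaches(const bool *mask)
{
	int flushed = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!mask[cpu])
			continue;
		struct ecache *e = &ecaches[cpu];
		if (e->ct) {
			e->ct->refcount--;
			e->ct = NULL;
			flushed++;
		}
	}
	return flushed;
}

/* Set up the scenario, flush with the chosen mask, and return how many
 * references are still held afterwards. */
static int leaked_after_flush(bool use_online_mask)
{
	static struct conn conns[NR_CPUS];
	int leaked = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpu_possible[cpu] = cpu_online[cpu] = (cpu < 4);
		conns[cpu].refcount = 0;
		ecaches[cpu].ct = NULL;
		if (cpu_possible[cpu]) {
			conns[cpu].refcount = 1;
			ecaches[cpu].ct = &conns[cpu];
		}
	}
	cpu_online[3] = false;	/* cpu 3 goes offline with a cached entry */

	flush_ecaches(use_online_mask ? cpu_online : cpu_possible);

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		leaked += conns[cpu].refcount;
	return leaked;
}
```

In this model, flushing over the online mask leaves the reference cached
on cpu 3 behind, while flushing over the possible mask releases all four.
It says nothing about why the reported oops happens.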

-- 
- Harald Welte <laforge@netfilter.org>                 http://netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

* Re: BUG() in ip_ct_event_cache_flush()?
  2005-10-07 21:35     ` Harald Welte
@ 2005-10-07 20:09       ` David S. Miller
  2005-10-07 23:33       ` what is the purpose of "filter" table? Tadas
  1 sibling, 0 replies; 6+ messages in thread
From: David S. Miller @ 2005-10-07 20:09 UTC
  To: laforge; +Cc: netfilter-devel, kaber

From: Harald Welte <laforge@netfilter.org>
Date: Fri, 7 Oct 2005 23:35:25 +0200

> On Thu, Oct 06, 2005 at 01:14:08PM -0700, David S. Miller wrote:
> > 
> > > On Wed, Oct 05, 2005 at 11:40:46PM -0700, David S. Miller wrote:
> > > > 
> > > > I've seen an OOPS backtrace with current 2.6.x kernels
> > > > that seems to be in ip_ct_event_cache_flush().
> > > 
> > > :( do you have the oops/backtrace?
> > 
> > It's in a Fedora bug report here:
> > 
> > http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=169981
> 
> Oh, a Fedora kernel.  I received some reports that the Fedora kernel
> changelog claims to have the event cache fix, but in reality it is not
> applied.
> People who switched from Fedora to a current git kernel have reported
> that the problem has gone away.

That was fixed.  A bug in the Fedora .spec file caused the
git patch not to get applied; Dave Jones fixed that.

> Anyway, the bug report from the link above talks about 
> 
> Call Trace:<ffffffff88308c29>{:ip_conntrack:ip_conntrack_cleanup+20}
>        <ffffffff88306ab4>{:ip_conntrack:init_or_cleanup+676}
>        <ffffffff80157798>{sys_delete_module+490}
>        <ffffffff801118aa>{syscall_trace_enter+217>
>        <ffffffff8010eb6c>{tracesys+209}
> 
> which seems to have no relation to the event cache, but rather to some
> module unload problem. Are you sure that bug report is the right one?

Harald, it can be related to the event cache, because ip_conntrack_cleanup()
calls ip_ct_event_cache_flush() which gcc tends to inline.

> Mh, I could try that since I have a Turion64 notebook... but it
> never occurred to me to run an SMP kernel on it.

Fedora doesn't even build uniprocessor amd64 kernels any longer;
it no longer makes sense these days with so many dual-core et al.
chips out there :-)

* Re: BUG() in ip_ct_event_cache_flush()?
  2005-10-06 20:14   ` David S. Miller
@ 2005-10-07 21:35     ` Harald Welte
  2005-10-07 20:09       ` David S. Miller
  2005-10-07 23:33       ` what is the purpose of "filter" table? Tadas
  0 siblings, 2 replies; 6+ messages in thread
From: Harald Welte @ 2005-10-07 21:35 UTC
  To: David S. Miller; +Cc: netfilter-devel, kaber

On Thu, Oct 06, 2005 at 01:14:08PM -0700, David S. Miller wrote:
> 
> > On Wed, Oct 05, 2005 at 11:40:46PM -0700, David S. Miller wrote:
> > > 
> > > I've seen an OOPS backtrace with current 2.6.x kernels
> > > that seems to be in ip_ct_event_cache_flush().
> > 
> > :( do you have the oops/backtrace?
> 
> It's in a Fedora bug report here:
> 
> http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=169981

Oh, a Fedora kernel.  I received some reports that the Fedora kernel
changelog claims to have the event cache fix, but in reality it is not
applied.
People who switched from Fedora to a current git kernel have reported
that the problem has gone away.

Anyway, the bug report from the link above talks about 

Call Trace:<ffffffff88308c29>{:ip_conntrack:ip_conntrack_cleanup+20}
       <ffffffff88306ab4>{:ip_conntrack:init_or_cleanup+676}
       <ffffffff80157798>{sys_delete_module+490}
       <ffffffff801118aa>{syscall_trace_enter+217>
       <ffffffff8010eb6c>{tracesys+209}

which seems to have no relation to the event cache, but rather to some
module unload problem. Are you sure that bug report is the right one?

> The reporter is running an SMP AMD64 kernel on a laptop, FWIW.

Mh, I could try that since I have a Turion64 notebook... but it never
occurred to me to run an SMP kernel on it.

-- 
- Harald Welte <laforge@netfilter.org>                 http://netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

* what is the purpose of "filter" table?
  2005-10-07 21:35     ` Harald Welte
  2005-10-07 20:09       ` David S. Miller
@ 2005-10-07 23:33       ` Tadas
  1 sibling, 0 replies; 6+ messages in thread
From: Tadas @ 2005-10-07 23:33 UTC
  To: netfilter-devel

As far as I can see, the filter table can't do anything useful; 99% of
modules are for the mangle and nat tables.
Also, dropping packets at input or forward is useless,
because conntrack will kill the attacked machine anyway.

I think it would be a good idea to merge the raw and filter tables into one,
or to add all available hooks to the filter table, such as prerouting and
postrouting.
I don't understand what the original purpose of the filter table was,
since everything can be done in the mangle table even better.
