[PATCH] my recent meddling with ip_conntrack
From: Patrick Schaaf @ 2002-08-03 19:55 UTC
To: netfilter-devel; +Cc: netdev
[-- Attachment #1: Type: text/plain, Size: 1512 bytes --]
Hi netfilter-devel & netdev,
I have pulled my recent ip_conntrack patches up to 2.4.19, and have
that merge running now on my shiny new dual P-MMX 200. No surprises.
It's already up 40 minutes with hundreds of connections tracked!
Patch appended for curious people and would-be testers. All comments welcome.
This is not meant for inclusion anywhere right now; I'm just looking for some
eyeballs.
have a nice weekend
Patrick
Short Changelog, in order of probable importance:
- netfilter hook statistics, /proc/net/nf_stat_hook_*, as a compile option
found under "Networking options". Per-hook-function rdtscll()-based
timing and occurrence counting. See netfilter in action for yourself!
- remove unnecessary add_timer() calls from per-packet processing.
Introduces a new ip_conntrack->timeout_target field, 4 bytes in size.
The running timer is never disturbed as long as the timeout target only
increases monotonically; that covers the normal ESTABLISHED case. When
the timer runs out, it possibly restarts itself to the then-current
timeout_target. (See the sketch after this list.)
- prefer to allocate the ip_conntrack hash using __get_free_pages()
- use a singly linked list to hash them. BTW, with bucket count
autoselection, this change doubles the number of available buckets.
Saves four bytes per ip_conntrack_tuple_hash, eight per ip_conntrack.
- in include/linux/skbuff.h, introduce skb_nf_forget(), and use that to
clean up several places in the ipv4 core stack code.
- make init_conntrack() a bit saner, removing unnecessary hash
computations.
[-- Attachment #2: bof-ct-merged-20020803.Changelog --]
[-- Type: text/plain, Size: 3954 bytes --]
---------------------- Would send the following csets ---------------------
ChangeSet@1.597, 2002-08-03 18:20:26+02:00, bof@cdr.(none)
Merge bkbits-linux-2.4 after 2.4.19
ChangeSet@1.582.9.7, 2002-08-02 09:02:19+02:00, bof@cdr.(none)
ip_conntrack_standalone.c:
clean up /proc/net/ip_conntrack output - same as always, now.
ChangeSet@1.582.9.6, 2002-08-02 09:00:13+02:00, bof@cdr.(none)
ip_conntrack.h:
prototype for new ip_ct_sudden_death()
ip_conntrack_proto_icmp.c, ip_conntrack_proto_tcp.c:
use ip_ct_sudden_death(), instead of fiddling with ct->timeout directly.
ip_conntrack_core.c:
introduce ct->timeout_target, making add_timer() a rare event.
ip_conntrack_standalone.c:
show ct->timeout_target in /proc/net/ip_conntrack.
ChangeSet@1.582.9.5, 2002-08-01 22:52:01+02:00, bof@cdr.(none)
ip_conntrack_core.c:
more add_timer avoidance
ChangeSet@1.582.9.4, 2002-08-01 22:27:28+02:00, bof@cdr.(none)
ip_conntrack.h, ip_conntrack_core.c:
begin add_timer avoidance
ChangeSet@1.582.9.3, 2002-08-01 21:21:14+02:00, bof@cdr.(none)
ip_conntrack_core.c:
get_free_pages allocation for ip_conntrack_hash
ChangeSet@1.582.7.5, 2002-08-01 19:04:41+02:00, bof@cdr.(none)
netfilter.c:
remove KERN_NOTICE output
ChangeSet@1.582.9.2, 2002-08-01 09:45:24+02:00, bof@cdr.(none)
srlist.h:
fix single ring list code
ip_conntrack_core.c:
type agnostic ip_conntrack_hash allocation
ChangeSet@1.582.9.1, 2002-07-31 20:57:39+02:00, bof@cdr.(none)
include/linux/netfilter_ipv4/srlist.h:
introduced: a single ring list implementation with almost the same
interface as netfilter_ipv4/listhelp.h
include/linux/netfilter_ipv4/ip_conntrack*.h:
use srlist_head in place of list_head for conntrack tuple hashing.
net/ipv4/netfilter/ip_conntrack_{core,standalone}.c:
use srlist_head instead of list_head for conntrack tuple hashing.
ChangeSet@1.582.7.4, 2002-07-29 09:25:21+02:00, bof@cdr.(none)
netfilter.c:
some more comments, minimal cleanup, KERN_NOTICE upon (un)registration.
Configure.help:
friendly help and advice regarding CONFIG_NETFILTER_HOOK_STAT
ChangeSet@1.582.7.3, 2002-07-28 11:58:38+02:00, bof@cdr.(none)
net/core/netfilter.c:
remove debug printks related to slabifying hook statistic counters.
ChangeSet@1.582.7.2, 2002-07-28 11:54:23+02:00, bof@cdr.(none)
net/core/netfilter.c:
slabify, make per-cpu counters.
ChangeSet@1.582.7.1, 2002-07-27 19:16:23+02:00, bof@cdr.(none)
Config.in, netfilter.h, netfilter.c:
netfilter hook statistics
ChangeSet@1.582.4.2, 2002-07-22 21:07:02+02:00, bof@cdr.(none)
overall:
compiles now, skb_nf_forget() introduction probably OK.
skbuff.h:
sk_buff spelling fix
ChangeSet@1.582.4.1, 2002-07-22 20:45:22+02:00, bof@cdr.(none)
skbuff.h:
define skb_nf_forget()
skbuff.c, ip_conntrack_core.c, ipt_REJECT.c, ipip.c, ip_gre.c, sit.c:
use skb_nf_forget()
ip_input.c, ipmr.c:
use skb_nf_forget()
NOTE: original code did not clear nf_debug. Now it will.
ChangeSet@1.582.2.70, 2002-07-22 09:32:20+02:00, bof@cdr.(none)
ip_conntrack_core.c:
in init_conntrack(), rename drop_next to drop_rotor: document recent change.
ChangeSet@1.582.2.69, 2002-07-22 09:31:29+02:00, bof@cdr.(none)
ip_conntrack_core.c:
in init_conntrack(), narrow scope of static drop_next.
ChangeSet@1.582.2.68, 2002-07-22 09:30:29+02:00, bof@cdr.(none)
ip_conntrack_core.c:
sanitize typing for hash_conntrack() return value: always use u_int32_t.
ChangeSet@1.582.2.67, 2002-07-22 09:24:39+02:00, bof@cdr.(none)
ip_conntrack_core.c:
move hash calculation from the unconditional part of init_conntrack()
to the rare place where it is needed.
ChangeSet@1.582.2.66, 2002-07-22 09:23:20+02:00, bof@cdr.(none)
ip_conntrack_core.c:
remove repl_hash calculation in init_conntrack(): it was not used.
---------------------------------------------------------------------------
[-- Attachment #3: bof-ct-merged-20020803.patch --]
[-- Type: text/plain, Size: 36506 bytes --]
diff -urN ex1/Documentation/Configure.help ex2/Documentation/Configure.help
--- ex1/Documentation/Configure.help Sat Aug 3 21:21:17 2002
+++ ex2/Documentation/Configure.help Sat Aug 3 21:24:54 2002
@@ -2429,6 +2429,23 @@
You can say Y here if you want to get additional messages useful in
debugging the netfilter code.
+Netfilter hook statistics
+CONFIG_NETFILTER_HOOK_STAT
+ If you say Y here, the time spent in the various netfilter hook
+ functions is measured, using the TSC of your processor. Your
+ kernel won't boot if you don't have a working TSC.
+ Say N if you don't have a modern Intel/AMD processor.
+
+ When enabled, look at /proc/net/nf_stat_hook_* for the actual
+ measurement results, presented in a format easy to guess by
+ any well-calibrated crystal ball.
+
+ The timing imposes a processing overhead that may be relevant
+ on machines with high packet rates. The overhead is estimated
+ at about 5% of the time used by the hook functions themselves.
+
+ The safe thing is to say N.
+
Connection tracking (required for masq/NAT)
CONFIG_IP_NF_CONNTRACK
Connection tracking keeps a record of what packets have passed
diff -urN ex1/include/linux/netfilter.h ex2/include/linux/netfilter.h
--- ex1/include/linux/netfilter.h Sat Aug 3 21:21:14 2002
+++ ex2/include/linux/netfilter.h Sat Aug 3 21:24:50 2002
@@ -51,6 +51,9 @@
int hooknum;
/* Hooks are ordered in ascending priority. */
int priority;
+#ifdef CONFIG_NETFILTER_HOOK_STAT
+ void *hook_stat;
+#endif
};
struct nf_sockopt_ops
diff -urN ex1/include/linux/netfilter_ipv4/ip_conntrack.h ex2/include/linux/netfilter_ipv4/ip_conntrack.h
--- ex1/include/linux/netfilter_ipv4/ip_conntrack.h Sat Aug 3 21:21:09 2002
+++ ex2/include/linux/netfilter_ipv4/ip_conntrack.h Sat Aug 3 21:24:47 2002
@@ -97,6 +97,7 @@
volatile unsigned long status;
/* Timer function; drops refcnt when it goes off. */
+ unsigned long timeout_target;
struct timer_list timeout;
/* If we're expecting another related connection, this will be
@@ -160,6 +161,9 @@
extern int invert_tuplepr(struct ip_conntrack_tuple *inverse,
const struct ip_conntrack_tuple *orig);
+
+/* Kill this conntrack immediately, without regard to timeouts. */
+extern int ip_ct_sudden_death(struct ip_conntrack *ct);
/* Refresh conntrack for this many jiffies */
extern void ip_ct_refresh(struct ip_conntrack *ct,
diff -urN ex1/include/linux/netfilter_ipv4/ip_conntrack_core.h ex2/include/linux/netfilter_ipv4/ip_conntrack_core.h
--- ex1/include/linux/netfilter_ipv4/ip_conntrack_core.h Sat Aug 3 21:21:21 2002
+++ ex2/include/linux/netfilter_ipv4/ip_conntrack_core.h Sat Aug 3 21:24:55 2002
@@ -1,6 +1,7 @@
#ifndef _IP_CONNTRACK_CORE_H
#define _IP_CONNTRACK_CORE_H
#include <linux/netfilter_ipv4/lockhelp.h>
+#include <linux/netfilter_ipv4/srlist.h>
/* This header is used to share core functionality between the
standalone connection tracking module, and the compatibility layer's use
@@ -44,7 +45,7 @@
return NF_ACCEPT;
}
-extern struct list_head *ip_conntrack_hash;
+extern struct srlist_head *ip_conntrack_hash;
extern struct list_head expect_list;
DECLARE_RWLOCK_EXTERN(ip_conntrack_lock);
#endif /* _IP_CONNTRACK_CORE_H */
diff -urN ex1/include/linux/netfilter_ipv4/ip_conntrack_tuple.h ex2/include/linux/netfilter_ipv4/ip_conntrack_tuple.h
--- ex1/include/linux/netfilter_ipv4/ip_conntrack_tuple.h Sat Aug 3 21:21:16 2002
+++ ex2/include/linux/netfilter_ipv4/ip_conntrack_tuple.h Sat Aug 3 21:24:54 2002
@@ -1,6 +1,8 @@
#ifndef _IP_CONNTRACK_TUPLE_H
#define _IP_CONNTRACK_TUPLE_H
+#include <linux/netfilter_ipv4/srlist.h>
+
/* A `tuple' is a structure containing the information to uniquely
identify a connection. ie. if two packets have the same tuple, they
are in the same connection; if not, they are not.
@@ -85,7 +87,7 @@
/* Connections have two entries in the hash table: one for each way */
struct ip_conntrack_tuple_hash
{
- struct list_head list;
+ struct srlist_head list;
struct ip_conntrack_tuple tuple;
diff -urN ex1/include/linux/netfilter_ipv4/srlist.h ex2/include/linux/netfilter_ipv4/srlist.h
--- ex1/include/linux/netfilter_ipv4/srlist.h Thu Jan 1 01:00:00 1970
+++ ex2/include/linux/netfilter_ipv4/srlist.h Sat Aug 3 21:24:35 2002
@@ -0,0 +1,78 @@
+#ifndef __NETFILTER_IPV4_SRLIST_H
+#define __NETFILTER_IPV4_SRLIST_H
+
+struct srlist_head {
+ struct srlist_head *next;
+};
+
+#define INIT_SRLIST_HEAD(ptr) do { (ptr)->next = (ptr); } while (0)
+
+#define SRLIST_FIND(srl, cmpfn, type, args...) \
+({ \
+ struct srlist_head *__head = (struct srlist_head *) (srl); \
+ struct srlist_head *__i; \
+ \
+ ASSERT_READ_LOCK(__head); \
+ __i = __head; \
+ do { \
+ if (__i->next == __head) { __i = 0; break; } \
+ __i = __i->next; \
+ } while (!cmpfn((const type)__i , ## args)); \
+ (type)__i; \
+})
+
+#define SRLIST_FIND_W(srl, cmpfn, type, args...) \
+({ \
+ struct srlist_head *__head = (struct srlist_head *) (srl); \
+ struct srlist_head *__i; \
+ \
+ ASSERT_WRITE_LOCK(__head); \
+ __i = __head; \
+ do { \
+ if (__i->next == __head) { __i = 0; break; } \
+ __i = __i->next; \
+ } while (!cmpfn((const type)__i , ## args)); \
+ (type)__i; \
+})
+
+#ifndef CONFIG_NETFILTER_DEBUG
+#define SRLIST_DELETE_WARN(estr, e, hstr) do{}while (0)
+#else
+#define SRLIST_DELETE_WARN(estr, e, hstr) \
+ printk("TUPLE_DELETE: %s:%u `%s'(%p) not in %s.\n", \
+ __FILE__, __LINE__, estr, e, hstr)
+#endif
+
+#define SRLIST_DELETE(srl, elem) \
+do { \
+ struct srlist_head *__head = (struct srlist_head *) (srl); \
+ struct srlist_head *__elem = (struct srlist_head *) (elem); \
+ struct srlist_head *__i; \
+ \
+ ASSERT_WRITE_LOCK(__head); \
+ __i = __head; \
+ while (1) { \
+ struct srlist_head *__next = __i->next; \
+ \
+ if (__next == __head) { \
+ SRLIST_DELETE_WARN(#elem, __elem, #srl); \
+ break; \
+ } \
+ if (__next == __elem) { \
+ __i->next = __elem->next; \
+ break; \
+ } \
+ __i = __next; \
+ } \
+} while (0)
+
+#define SRLIST_PREPEND(srl, elem) \
+do { \
+ struct srlist_head *__head = (struct srlist_head *) (srl); \
+ struct srlist_head *__elem = (struct srlist_head *) (elem); \
+ \
+ __elem->next = __head->next; \
+ __head->next = __elem; \
+} while (0)
+
+#endif
diff -urN ex1/include/linux/skbuff.h ex2/include/linux/skbuff.h
--- ex1/include/linux/skbuff.h Sat Aug 3 21:20:56 2002
+++ ex2/include/linux/skbuff.h Sat Aug 3 21:24:37 2002
@@ -1144,6 +1144,17 @@
if (nfct)
atomic_inc(&nfct->master->use);
}
+static inline void
+skb_nf_forget(struct sk_buff *skb)
+{
+ nf_conntrack_put(skb->nfct);
+ skb->nfct = NULL;
+#ifdef CONFIG_NETFILTER_DEBUG
+ skb->nf_debug = 0;
+#endif
+}
+#else
+static inline void skb_nf_forget(struct sk_buff *skb) {}
#endif
#endif /* __KERNEL__ */
diff -urN ex1/net/Config.in ex2/net/Config.in
--- ex1/net/Config.in Sat Aug 3 21:21:01 2002
+++ ex2/net/Config.in Sat Aug 3 21:24:41 2002
@@ -13,6 +13,7 @@
bool 'Network packet filtering (replaces ipchains)' CONFIG_NETFILTER
if [ "$CONFIG_NETFILTER" = "y" ]; then
bool ' Network packet filtering debugging' CONFIG_NETFILTER_DEBUG
+ bool ' Netfilter hook statistics' CONFIG_NETFILTER_HOOK_STAT
fi
bool 'Socket Filtering' CONFIG_FILTER
tristate 'Unix domain sockets' CONFIG_UNIX
diff -urN ex1/net/core/netfilter.c ex2/net/core/netfilter.c
--- ex1/net/core/netfilter.c Sat Aug 3 21:21:12 2002
+++ ex2/net/core/netfilter.c Sat Aug 3 21:24:49 2002
@@ -47,6 +47,293 @@
struct list_head nf_hooks[NPROTO][NF_MAX_HOOKS];
static LIST_HEAD(nf_sockopts);
+#ifdef CONFIG_NETFILTER_HOOK_STAT
+
+/*
+ * menuconfig this under "Networking options" >> "Netfilter hook statistics"
+ *
+ * The following code, up to the next #endif, implements per hook
+ * statistics counting. If enabled, look at /proc/net/nf_stat_hook*
+ * for the results.
+ *
+ */
+
+#include <linux/slab.h>
+#include <linux/proc_fs.h>
+#include <asm/msr.h>
+
+/*
+ * nf_stat_hook_proc[pf][hooknum] is a flag per protocol/hook, telling
+ * whether we have already created the /proc/net/nf_stat_hook_X.Y file.
+ * The array is only consulted during module registration. This code
+ * never removes the proc files; when all hook functions unregister,
+ * an empty file remains.
+ *
+ * Not used under normal per-packet processing.
+ */
+static unsigned char nf_stat_hook_proc[NPROTO][NF_MAX_HOOKS];
+
+/*
+ * struct nf_stat_hook_sample is used in nf_inject(), to record the
+ * beginning of the operation. After calling the hook function,
+ * it is reused to compute the duration of the hook function call,
+ * which is then recorded in nf_hook_ops->stat[percpu].
+ *
+ * CPU-local data on the stack, unshared.
+ */
+struct nf_stat_hook_sample {
+ unsigned long long stamp;
+};
+
+/*
+ * struct nf_stat_hook is our main statistics state structure.
+ * It is kept cache-aligned and per-cpu, summing the per-cpu
+ * values only when read through the /proc interface.
+ *
+ * CPU-local data, read across all CPUs only on user request.
+ * Updated locally on each CPU, one update per packet and hook function.
+ */
+struct nf_stat_hook {
+ unsigned long long count;
+ unsigned long long sum;
+} __attribute__ ((__aligned__(SMP_CACHE_BYTES)));
+
+/*
+ * The nf_stat_hook structures come from our private slab cache.
+ */
+static kmem_cache_t *nf_stat_hook_slab;
+
+/*
+ * nf_stat_hook_zero() is the slab ctor/dtor
+ */
+static void nf_stat_hook_zero(void *data, kmem_cache_t *slab, unsigned long x)
+{
+ struct nf_stat_hook *stat = data;
+ int i;
+
+ for (i=0; i<NR_CPUS; i++,stat++)
+ stat->count = stat->sum = 0;
+}
+
+/*
+ * nf_stat_hook_setup() is the one-time initialization routine.
+ * It allocates the slab cache for our statistics counters,
+ * and initializes the "proc registration" flag array.
+ */
+static void __init nf_stat_hook_setup(void)
+{
+ /* early rdtsc to catch booboo at boot time */
+ { struct nf_stat_hook_sample sample; rdtscll(sample.stamp); }
+
+ nf_stat_hook_slab = kmem_cache_create("nf_stat_hook",
+ NR_CPUS * sizeof(struct nf_stat_hook),
+ 0, SLAB_HWCACHE_ALIGN,
+ nf_stat_hook_zero, nf_stat_hook_zero);
+ if (!nf_stat_hook_slab)
+ printk(KERN_ERR "nf_stat_hook will NOT WORK - no slab.\n");
+
+ memset(nf_stat_hook_proc, 0, sizeof(nf_stat_hook_proc));
+}
+
+/*
+ * nf_stat_hook_read_proc() is a proc_fs read_proc() callback.
+ * Called per protocol/hook, the statistics of all netfilter
+ * hook elements sitting on that hook, are shown, in priority
+ * order. On SMP, the per-cpu counters are summed here.
+ * For accuracy, maybe we need to take some write lock. Later.
+ *
+ * Readings might look strange, until such locking is done.
+ * If you need to compensate, read several times, and throw
+ * out the strange results. Look for silly non-monotony.
+ *
+ * Output fields are separated by a single blank, and represent:
+ * [0] address of 'struct nf_hook_ops'. (pointer, in unadorned 8-digit hex)
+ * [1] address of nf_hook_ops->hook() function pointer. When the
+ * hook module is built into the kernel, you can find this
+ * in System.map. (pointer, in unadorned 8-digit hex)
+ * [2] hook priority. (signed integer, in ascii)
+ * [3] number of times hook was called. (unsigned 64 bit integer, in ascii)
+ * [4] total number of cycles spent in the hook function, measured by
+ * summing the rdtscll() differences across the calls. (unsigned
+ * 64 bit integer, in ascii)
+ *
+ * Additional fields may be added in the future; if any field is eventually
+ * retired, it will be set to neutral values: '00000000' for the pointer
+ * fields, and '0' for the integer fields. That's theory, not guarantee. :)
+ */
+static int nf_stat_hook_read_proc(
+ char *page,
+ char **start,
+ off_t off,
+ int count,
+ int *eof,
+ void *data
+) {
+ struct list_head *l;
+ int res;
+
+ for ( res = 0, l = ((struct list_head *)data)->next;
+ l != data;
+ l = l->next
+ ) {
+ int i;
+ struct nf_hook_ops *elem = (struct nf_hook_ops *) l;
+ struct nf_stat_hook *stat = elem->hook_stat;
+
+ if (stat) {
+ unsigned long long count;
+ unsigned long long sum;
+ /* maybe write_lock something here */
+ for (i=0, count=0, sum=0; i<NR_CPUS; i++, stat++) {
+ count += stat->count;
+ sum += stat->sum;
+ }
+ /* and then write_unlock it here */
+ i = sprintf(page+res, "%p %p %d %Lu %Lu\n",
+ elem, elem->hook, elem->priority,
+ count, sum);
+ } else {
+ i = sprintf(page+res, "%p %p %d 0 0\n",
+ elem, elem->hook, elem->priority);
+ }
+ if (i <= 0)
+ break;
+ res += i;
+ }
+ return res;
+}
+
+/*
+ * nf_stat_hook_register() is called whenever a hook element registers.
+ * When necessary, we create a /proc/net/nf_stat_hook* file here,
+ * and we always allocate one struct nf_stat_hook.
+ */
+static void nf_stat_hook_register(struct nf_hook_ops *elem)
+{
+ elem->hook_stat = (NULL == nf_stat_hook_slab)
+ ? 0 : kmem_cache_alloc(nf_stat_hook_slab, SLAB_ATOMIC);
+ if (!elem->hook_stat) return;
+ if (!nf_stat_hook_proc[elem->pf][elem->hooknum]) {
+ char buf[64];
+ char hookname_buf[16];
+ char pfname_buf[16];
+ char *hookname;
+ char *pfname;
+ struct proc_dir_entry *proc;
+
+ switch(elem->pf) {
+ case 2:
+ pfname = "ipv4";
+ switch(elem->hooknum) {
+ case 0:
+ hookname = "PRE-ROUTING";
+ break;
+ case 1:
+ hookname = "LOCAL-IN";
+ break;
+ case 2:
+ hookname = "FORWARD";
+ break;
+ case 3:
+ hookname = "LOCAL-OUT";
+ break;
+ case 4:
+ hookname = "POST-ROUTING";
+ break;
+ default:
+ sprintf(hookname_buf, "hook%d",
+ elem->hooknum);
+ hookname = hookname_buf;
+ break;
+ }
+ break;
+ default:
+ sprintf(hookname_buf, "hook%d",
+ elem->hooknum);
+ hookname = hookname_buf;
+ sprintf(pfname_buf, "pf%d",
+ elem->pf);
+ pfname = pfname_buf;
+ break;
+ }
+ sprintf(buf, "net/nf_stat_hook_%s.%s", pfname, hookname);
+ proc = create_proc_read_entry(buf, 0644, NULL,
+ nf_stat_hook_read_proc,
+ &nf_hooks[elem->pf][elem->hooknum]
+ );
+ if (!proc) {
+ printk(KERN_ERR "cannot create %s\n", buf);
+ kmem_cache_free(nf_stat_hook_slab, elem->hook_stat);
+ elem->hook_stat = 0;
+ return;
+ }
+ proc->owner = THIS_MODULE;
+ }
+ nf_stat_hook_proc[elem->pf][elem->hooknum]++;
+}
+
+/*
+ * nf_stat_hook_unregister() is called when a hook element unregisters.
+ * The statistics structure is freed, but we NEVER remove the /proc/net
+ * file entry. Maybe we should. nf_stat_hook_proc[][] contains the correct
+ * counter, I think (modulo races).
+ */
+static void nf_stat_hook_unregister(struct nf_hook_ops *elem)
+{
+ if (elem->hook_stat) {
+ kmem_cache_free(nf_stat_hook_slab, elem->hook_stat);
+ /* only counted at registration when the stats were allocated */
+ nf_stat_hook_proc[elem->pf][elem->hooknum]--;
+ }
+}
+
+/*
+ * Finally, the next two functions implement the real timekeeping.
+ * If rdtscll() proves problematic, these have to be changed.
+ * The _begin() function is called before a specific hook entry
+ * function gets called - it starts the timer.
+ * The _end() function is called after the hook entry function,
+ * and it stops the timer, and remembers the interval in the
+ * statistics structure (per-cpu).
+ */
+
+static inline void nf_stat_hook_begin(struct nf_stat_hook_sample *sample)
+{
+ rdtscll(sample->stamp);
+}
+
+static inline void nf_stat_hook_end(
+ struct nf_stat_hook_sample *sample,
+ struct nf_hook_ops *elem,
+ int verdict
+) {
+ struct nf_stat_hook *stat = elem->hook_stat;
+ struct nf_stat_hook_sample now;
+ if (!stat) return;
+ rdtscll(now.stamp); now.stamp -= sample->stamp;
+ stat += smp_processor_id();
+ stat->count++;
+ stat->sum += now.stamp;
+}
+
+#else
+
+/*
+ * Here, a set of empty macros provides for nice ifdef free callers into
+ * this statistics code. If CONFIG_NETFILTER_HOOK_STAT is NOT defined,
+ * these should make the compiled code identical to what we had before.
+ */
+struct nf_stat_hook_sample {};
+#define nf_stat_hook_begin(a) do{}while(0)
+#define nf_stat_hook_end(a,b,c) do{}while(0)
+#define nf_stat_hook_register(a) do{}while(0)
+#define nf_stat_hook_unregister(a) do{}while(0)
+#define nf_stat_hook_setup() do{}while(0)
+
+/*
+ * End of new statistics stuff. On with the traditional net/core/netfilter.c
+ * Search below for "nf_stat_hook" to see where we call into the statistics.
+ */
+#endif
+
/*
* A queue handler may be registered for each protocol. Each is protected by
long term mutex. The handler must provide an outfn() to accept packets
@@ -68,6 +355,7 @@
if (reg->priority < ((struct nf_hook_ops *)i)->priority)
break;
}
+ nf_stat_hook_register(reg);
list_add(&reg->list, i->prev);
br_write_unlock_bh(BR_NETPROTO_LOCK);
return 0;
@@ -77,6 +365,7 @@
{
br_write_lock_bh(BR_NETPROTO_LOCK);
list_del(&reg->list);
+ nf_stat_hook_unregister(reg);
br_write_unlock_bh(BR_NETPROTO_LOCK);
}
@@ -346,14 +635,19 @@
{
for (*i = (*i)->next; *i != head; *i = (*i)->next) {
struct nf_hook_ops *elem = (struct nf_hook_ops *)*i;
+ struct nf_stat_hook_sample sample;
+ nf_stat_hook_begin(&sample);
switch (elem->hook(hook, skb, indev, outdev, okfn)) {
case NF_QUEUE:
+ nf_stat_hook_end(&sample, elem, NF_QUEUE);
return NF_QUEUE;
case NF_STOLEN:
+ nf_stat_hook_end(&sample, elem, NF_STOLEN);
return NF_STOLEN;
case NF_DROP:
+ nf_stat_hook_end(&sample, elem, NF_DROP);
return NF_DROP;
case NF_REPEAT:
@@ -369,6 +663,7 @@
elem->hook, hook);
#endif
}
+ nf_stat_hook_end(&sample, elem, NF_ACCEPT);
}
return NF_ACCEPT;
}
@@ -638,4 +933,5 @@
for (h = 0; h < NF_MAX_HOOKS; h++)
INIT_LIST_HEAD(&nf_hooks[i][h]);
}
+ nf_stat_hook_setup();
}
diff -urN ex1/net/core/skbuff.c ex2/net/core/skbuff.c
--- ex1/net/core/skbuff.c Sat Aug 3 21:21:00 2002
+++ ex2/net/core/skbuff.c Sat Aug 3 21:24:40 2002
@@ -323,9 +323,7 @@
}
skb->destructor(skb);
}
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
-#endif
+ skb_nf_forget(skb);
skb_headerinit(skb, NULL, 0); /* clean state */
kfree_skbmem(skb);
}
diff -urN ex1/net/ipv4/ip_gre.c ex2/net/ipv4/ip_gre.c
--- ex1/net/ipv4/ip_gre.c Sat Aug 3 21:21:16 2002
+++ ex2/net/ipv4/ip_gre.c Sat Aug 3 21:24:54 2002
@@ -644,13 +644,7 @@
skb->dev = tunnel->dev;
dst_release(skb->dst);
skb->dst = NULL;
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#ifdef CONFIG_NETFILTER_DEBUG
- skb->nf_debug = 0;
-#endif
-#endif
+ skb_nf_forget(skb);
ipgre_ecn_decapsulate(iph, skb);
netif_rx(skb);
read_unlock(&ipgre_lock);
@@ -876,13 +870,7 @@
}
}
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#ifdef CONFIG_NETFILTER_DEBUG
- skb->nf_debug = 0;
-#endif
-#endif
+ skb_nf_forget(skb);
IPTUNNEL_XMIT();
tunnel->recursion--;
diff -urN ex1/net/ipv4/ip_input.c ex2/net/ipv4/ip_input.c
--- ex1/net/ipv4/ip_input.c Sat Aug 3 21:20:57 2002
+++ ex2/net/ipv4/ip_input.c Sat Aug 3 21:24:37 2002
@@ -226,12 +226,9 @@
__skb_pull(skb, ihl);
-#ifdef CONFIG_NETFILTER
/* Free reference early: we don't need it any more, and it may
hold ip_conntrack module loaded indefinitely. */
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#endif /*CONFIG_NETFILTER*/
+ skb_nf_forget(skb);
/* Point into the IP datagram, just past the header. */
skb->h.raw = skb->data;
diff -urN ex1/net/ipv4/ipip.c ex2/net/ipv4/ipip.c
--- ex1/net/ipv4/ipip.c Sat Aug 3 21:21:14 2002
+++ ex2/net/ipv4/ipip.c Sat Aug 3 21:24:50 2002
@@ -493,13 +493,7 @@
skb->dev = tunnel->dev;
dst_release(skb->dst);
skb->dst = NULL;
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#ifdef CONFIG_NETFILTER_DEBUG
- skb->nf_debug = 0;
-#endif
-#endif
+ skb_nf_forget(skb);
ipip_ecn_decapsulate(iph, skb);
netif_rx(skb);
read_unlock(&ipip_lock);
@@ -644,13 +638,7 @@
if ((iph->ttl = tiph->ttl) == 0)
iph->ttl = old_iph->ttl;
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#ifdef CONFIG_NETFILTER_DEBUG
- skb->nf_debug = 0;
-#endif
-#endif
+ skb_nf_forget(skb);
IPTUNNEL_XMIT();
tunnel->recursion--;
diff -urN ex1/net/ipv4/ipmr.c ex2/net/ipv4/ipmr.c
--- ex1/net/ipv4/ipmr.c Sat Aug 3 21:21:13 2002
+++ ex2/net/ipv4/ipmr.c Sat Aug 3 21:24:49 2002
@@ -1096,10 +1096,7 @@
skb->h.ipiph = skb->nh.iph;
skb->nh.iph = iph;
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#endif
+ skb_nf_forget(skb);
}
static inline int ipmr_forward_finish(struct sk_buff *skb)
@@ -1441,10 +1438,7 @@
skb->dst = NULL;
((struct net_device_stats*)reg_dev->priv)->rx_bytes += skb->len;
((struct net_device_stats*)reg_dev->priv)->rx_packets++;
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#endif
+ skb_nf_forget(skb);
netif_rx(skb);
dev_put(reg_dev);
return 0;
@@ -1508,10 +1502,7 @@
((struct net_device_stats*)reg_dev->priv)->rx_bytes += skb->len;
((struct net_device_stats*)reg_dev->priv)->rx_packets++;
skb->dst = NULL;
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#endif
+ skb_nf_forget(skb);
netif_rx(skb);
dev_put(reg_dev);
return 0;
diff -urN ex1/net/ipv4/netfilter/ip_conntrack_core.c ex2/net/ipv4/netfilter/ip_conntrack_core.c
--- ex1/net/ipv4/netfilter/ip_conntrack_core.c Sat Aug 3 21:20:55 2002
+++ ex2/net/ipv4/netfilter/ip_conntrack_core.c Sat Aug 3 21:24:34 2002
@@ -52,9 +52,31 @@
unsigned int ip_conntrack_htable_size = 0;
static int ip_conntrack_max = 0;
static atomic_t ip_conntrack_count = ATOMIC_INIT(0);
-struct list_head *ip_conntrack_hash;
+struct srlist_head *ip_conntrack_hash;
+static int ip_conntrack_hash_vmalloced;
static kmem_cache_t *ip_conntrack_cachep;
+static __init void alloc_ip_conntrack_hash(void)
+{
+ const size_t s = sizeof(*ip_conntrack_hash) * ip_conntrack_htable_size;
+ IP_NF_ASSERT(ip_conntrack_hash == 0);
+ ip_conntrack_hash = (void *) __get_free_pages(GFP_KERNEL, get_order(s));
+ if (!ip_conntrack_hash) {
+ ip_conntrack_hash = vmalloc(s);
+ if (!ip_conntrack_hash) BUG();
+ ip_conntrack_hash_vmalloced = 1;
+ }
+}
+
+static void free_ip_conntrack_hash(void)
+{
+ const size_t s = sizeof(*ip_conntrack_hash) * ip_conntrack_htable_size;
+ if (ip_conntrack_hash_vmalloced)
+ vfree(ip_conntrack_hash);
+ else
+ free_pages((unsigned long)ip_conntrack_hash, get_order(s));
+}
+
extern struct ip_conntrack_protocol ip_conntrack_generic_protocol;
static inline int proto_cmpfn(const struct ip_conntrack_protocol *curr,
@@ -155,12 +177,12 @@
{
MUST_BE_WRITE_LOCKED(&ip_conntrack_lock);
/* Remove from both hash lists: must not NULL out next ptrs,
- otherwise we'll look unconfirmed. Fortunately, LIST_DELETE
+ otherwise we'll look unconfirmed. Fortunately, SRLIST_DELETE
doesn't do this. --RR */
- LIST_DELETE(&ip_conntrack_hash
+ SRLIST_DELETE(&ip_conntrack_hash
[hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)],
&ct->tuplehash[IP_CT_DIR_ORIGINAL]);
- LIST_DELETE(&ip_conntrack_hash
+ SRLIST_DELETE(&ip_conntrack_hash
[hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple)],
&ct->tuplehash[IP_CT_DIR_REPLY]);
/* If our expected is in the list, take it out. */
@@ -196,14 +218,46 @@
atomic_dec(&ip_conntrack_count);
}
+static inline int later_than(unsigned long this, unsigned long ref)
+{
+ return this > ref
+ || (ref > ((unsigned long)-1) - 864000 && this < ref + 864000);
+}
+
+static inline int earlier_than(unsigned long this, unsigned long ref)
+{
+ return this != ref && !later_than(this, ref);
+}
+
+static inline void activate_timeout_target(struct ip_conntrack *ct)
+{
+ ct->timeout.expires = ct->timeout_target;
+ add_timer(&ct->timeout);
+}
+
static void death_by_timeout(unsigned long ul_conntrack)
{
struct ip_conntrack *ct = (void *)ul_conntrack;
WRITE_LOCK(&ip_conntrack_lock);
+ if (later_than(ct->timeout_target, ct->timeout.expires)) {
+ activate_timeout_target(ct);
+ WRITE_UNLOCK(&ip_conntrack_lock);
+ return;
+ }
+ clean_from_lists(ct);
+ WRITE_UNLOCK(&ip_conntrack_lock);
+ ip_conntrack_put(ct);
+}
+
+int ip_ct_sudden_death(struct ip_conntrack *ct)
+{
+ if (!del_timer(&ct->timeout)) return 0;
+ WRITE_LOCK(&ip_conntrack_lock);
clean_from_lists(ct);
WRITE_UNLOCK(&ip_conntrack_lock);
ip_conntrack_put(ct);
+ return 1;
}
static inline int
@@ -223,7 +277,7 @@
struct ip_conntrack_tuple_hash *h;
MUST_BE_READ_LOCKED(&ip_conntrack_lock);
- h = LIST_FIND(&ip_conntrack_hash[hash_conntrack(tuple)],
+ h = SRLIST_FIND(&ip_conntrack_hash[hash_conntrack(tuple)],
conntrack_tuple_cmp,
struct ip_conntrack_tuple_hash *,
tuple, ignored_conntrack);
@@ -271,7 +325,7 @@
int
__ip_conntrack_confirm(struct nf_ct_info *nfct)
{
- unsigned int hash, repl_hash;
+ u_int32_t hash, repl_hash;
struct ip_conntrack *ct;
enum ip_conntrack_info ctinfo;
@@ -301,23 +355,19 @@
/* See if there's one in the list already, including reverse:
NAT could have grabbed it without realizing, since we're
not in the hash. If there is, we lost race. */
- if (!LIST_FIND(&ip_conntrack_hash[hash],
+ if (!SRLIST_FIND(&ip_conntrack_hash[hash],
conntrack_tuple_cmp,
struct ip_conntrack_tuple_hash *,
&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, NULL)
- && !LIST_FIND(&ip_conntrack_hash[repl_hash],
+ && !SRLIST_FIND(&ip_conntrack_hash[repl_hash],
conntrack_tuple_cmp,
struct ip_conntrack_tuple_hash *,
&ct->tuplehash[IP_CT_DIR_REPLY].tuple, NULL)) {
- list_prepend(&ip_conntrack_hash[hash],
+ SRLIST_PREPEND(&ip_conntrack_hash[hash],
&ct->tuplehash[IP_CT_DIR_ORIGINAL]);
- list_prepend(&ip_conntrack_hash[repl_hash],
+ SRLIST_PREPEND(&ip_conntrack_hash[repl_hash],
&ct->tuplehash[IP_CT_DIR_REPLY]);
- /* Timer relative to confirmation time, not original
- setting time, otherwise we'd get timer wrap in
- wierd delay cases. */
- ct->timeout.expires += jiffies;
- add_timer(&ct->timeout);
+ activate_timeout_target(ct);
atomic_inc(&ct->ct_general.use);
WRITE_UNLOCK(&ip_conntrack_lock);
return NF_ACCEPT;
@@ -435,19 +485,22 @@
/* There's a small race here where we may free a just-assured
connection. Too bad: we're in trouble anyway. */
-static inline int unreplied(const struct ip_conntrack_tuple_hash *i)
+static inline int unreplied(const struct ip_conntrack_tuple_hash *i,
+ struct ip_conntrack_tuple_hash **lru)
{
- return !(i->ctrack->status & IPS_ASSURED);
+ if (!(i->ctrack->status & IPS_ASSURED))
+ *lru = (struct ip_conntrack_tuple_hash *) i;
+ return 0;
}
-static int early_drop(struct list_head *chain)
+static int early_drop(struct srlist_head *chain)
{
/* Traverse backwards: gives us oldest, which is roughly LRU */
- struct ip_conntrack_tuple_hash *h;
+ struct ip_conntrack_tuple_hash *h = 0;
int dropped = 0;
READ_LOCK(&ip_conntrack_lock);
- h = LIST_FIND(chain, unreplied, struct ip_conntrack_tuple_hash *);
+ SRLIST_FIND(chain, unreplied, struct ip_conntrack_tuple_hash *, &h);
if (h)
atomic_inc(&h->ctrack->ct_general.use);
READ_UNLOCK(&ip_conntrack_lock);
@@ -455,10 +508,7 @@
if (!h)
return dropped;
- if (del_timer(&h->ctrack->timeout)) {
- death_by_timeout((unsigned long)h->ctrack);
- dropped = 1;
- }
+ dropped = ip_ct_sudden_death(h->ctrack);
ip_conntrack_put(h->ctrack);
return dropped;
}
@@ -485,22 +535,19 @@
{
struct ip_conntrack *conntrack;
struct ip_conntrack_tuple repl_tuple;
- size_t hash, repl_hash;
struct ip_conntrack_expect *expected;
int i;
- static unsigned int drop_next = 0;
-
- hash = hash_conntrack(tuple);
if (ip_conntrack_max &&
atomic_read(&ip_conntrack_count) >= ip_conntrack_max) {
/* Try dropping from random chain, or else from the
chain about to put into (in case they're trying to
bomb one hash chain). */
- unsigned int next = (drop_next++)%ip_conntrack_htable_size;
+ static u_int32_t drop_rotor = 0;
+ u_int32_t next = (drop_rotor++)%ip_conntrack_htable_size;
if (!early_drop(&ip_conntrack_hash[next])
- && !early_drop(&ip_conntrack_hash[hash])) {
+ && !early_drop(&ip_conntrack_hash[hash_conntrack(tuple)])) {
if (net_ratelimit())
printk(KERN_WARNING
"ip_conntrack: table full, dropping"
@@ -513,7 +560,6 @@
DEBUGP("Can't invert tuple.\n");
return NULL;
}
- repl_hash = hash_conntrack(&repl_tuple);
conntrack = kmem_cache_alloc(ip_conntrack_cachep, GFP_ATOMIC);
if (!conntrack) {
@@ -689,8 +735,7 @@
ret = proto->packet(ct, (*pskb)->nh.iph, (*pskb)->len, ctinfo);
if (ret == -1) {
/* Invalid */
- nf_conntrack_put((*pskb)->nfct);
- (*pskb)->nfct = NULL;
+ skb_nf_forget(*pskb);
return NF_ACCEPT;
}
@@ -699,8 +744,7 @@
ct, ctinfo);
if (ret == -1) {
/* Invalid */
- nf_conntrack_put((*pskb)->nfct);
- (*pskb)->nfct = NULL;
+ skb_nf_forget(*pskb);
return NF_ACCEPT;
}
}
@@ -808,7 +852,7 @@
return 0;
}
-static inline int unhelp(struct ip_conntrack_tuple_hash *i,
+static inline int unhelp(const struct ip_conntrack_tuple_hash *i,
const struct ip_conntrack_helper *me)
{
if (i->ctrack->helper == me) {
@@ -834,7 +878,7 @@
/* Get rid of expecteds, set helpers to NULL. */
for (i = 0; i < ip_conntrack_htable_size; i++)
- LIST_FIND_W(&ip_conntrack_hash[i], unhelp,
+ SRLIST_FIND_W(&ip_conntrack_hash[i], unhelp,
struct ip_conntrack_tuple_hash *, me);
WRITE_UNLOCK(&ip_conntrack_lock);
@@ -851,15 +895,11 @@
IP_NF_ASSERT(ct->timeout.data == (unsigned long)ct);
WRITE_LOCK(&ip_conntrack_lock);
- /* If not in hash table, timer will not be active yet */
- if (!is_confirmed(ct))
- ct->timeout.expires = extra_jiffies;
- else {
- /* Need del_timer for race avoidance (may already be dying). */
- if (del_timer(&ct->timeout)) {
- ct->timeout.expires = jiffies + extra_jiffies;
- add_timer(&ct->timeout);
- }
+ ct->timeout_target = jiffies + extra_jiffies;
+ if ( is_confirmed(ct)
+ && earlier_than(ct->timeout_target, ct->timeout.expires)
+ && del_timer(&ct->timeout)) {
+ activate_timeout_target(ct);
}
WRITE_UNLOCK(&ip_conntrack_lock);
}
@@ -942,7 +982,7 @@
READ_LOCK(&ip_conntrack_lock);
for (i = 0; !h && i < ip_conntrack_htable_size; i++) {
- h = LIST_FIND(&ip_conntrack_hash[i], do_kill,
+ h = SRLIST_FIND(&ip_conntrack_hash[i], do_kill,
struct ip_conntrack_tuple_hash *, kill, data);
}
if (h)
@@ -961,10 +1001,7 @@
/* This is order n^2, by the way. */
while ((h = get_next_corpse(kill, data)) != NULL) {
/* Time to push up daises... */
- if (del_timer(&h->ctrack->timeout))
- death_by_timeout((unsigned long)h->ctrack);
- /* ... else the timer will get him soon. */
-
+ ip_ct_sudden_death(h->ctrack);
ip_conntrack_put(h->ctrack);
}
}
@@ -1073,7 +1110,7 @@
}
kmem_cache_destroy(ip_conntrack_cachep);
- vfree(ip_conntrack_hash);
+ free_ip_conntrack_hash();
nf_unregister_sockopt(&so_getorigdst);
}
@@ -1092,7 +1129,7 @@
} else {
ip_conntrack_htable_size
= (((num_physpages << PAGE_SHIFT) / 16384)
- / sizeof(struct list_head));
+ / sizeof(*ip_conntrack_hash));
if (num_physpages > (1024 * 1024 * 1024 / PAGE_SIZE))
ip_conntrack_htable_size = 8192;
if (ip_conntrack_htable_size < 16)
@@ -1107,8 +1144,7 @@
if (ret != 0)
return ret;
- ip_conntrack_hash = vmalloc(sizeof(struct list_head)
- * ip_conntrack_htable_size);
+ alloc_ip_conntrack_hash();
if (!ip_conntrack_hash) {
nf_unregister_sockopt(&so_getorigdst);
return -ENOMEM;
@@ -1119,7 +1155,7 @@
SLAB_HWCACHE_ALIGN, NULL, NULL);
if (!ip_conntrack_cachep) {
printk(KERN_ERR "Unable to create ip_conntrack slab cache\n");
- vfree(ip_conntrack_hash);
+ free_ip_conntrack_hash();
nf_unregister_sockopt(&so_getorigdst);
return -ENOMEM;
}
@@ -1133,7 +1169,7 @@
WRITE_UNLOCK(&ip_conntrack_lock);
for (i = 0; i < ip_conntrack_htable_size; i++)
- INIT_LIST_HEAD(&ip_conntrack_hash[i]);
+ INIT_SRLIST_HEAD(&ip_conntrack_hash[i]);
/* This is fucking braindead. There is NO WAY of doing this without
the CONFIG_SYSCTL unless you don't want to detect errors.
@@ -1143,7 +1179,7 @@
= register_sysctl_table(ip_conntrack_root_table, 0);
if (ip_conntrack_sysctl_header == NULL) {
kmem_cache_destroy(ip_conntrack_cachep);
- vfree(ip_conntrack_hash);
+ free_ip_conntrack_hash();
nf_unregister_sockopt(&so_getorigdst);
return -ENOMEM;
}
diff -urN ex1/net/ipv4/netfilter/ip_conntrack_proto_icmp.c ex2/net/ipv4/netfilter/ip_conntrack_proto_icmp.c
--- ex1/net/ipv4/netfilter/ip_conntrack_proto_icmp.c Sat Aug 3 21:20:56 2002
+++ ex2/net/ipv4/netfilter/ip_conntrack_proto_icmp.c Sat Aug 3 21:24:36 2002
@@ -77,9 +77,8 @@
means this will only run once even if count hits zero twice
(theoretically possible with SMP) */
if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY) {
- if (atomic_dec_and_test(&ct->proto.icmp.count)
- && del_timer(&ct->timeout))
- ct->timeout.function((unsigned long)ct);
+ if (atomic_dec_and_test(&ct->proto.icmp.count))
+ ip_ct_sudden_death(ct);
} else {
atomic_inc(&ct->proto.icmp.count);
ip_ct_refresh(ct, ICMP_TIMEOUT);
diff -urN ex1/net/ipv4/netfilter/ip_conntrack_proto_tcp.c ex2/net/ipv4/netfilter/ip_conntrack_proto_tcp.c
--- ex1/net/ipv4/netfilter/ip_conntrack_proto_tcp.c Sat Aug 3 21:21:12 2002
+++ ex2/net/ipv4/netfilter/ip_conntrack_proto_tcp.c Sat Aug 3 21:24:49 2002
@@ -189,8 +189,7 @@
problem case, so we can delete the conntrack
immediately. --RR */
if (!(conntrack->status & IPS_SEEN_REPLY) && tcph->rst) {
- if (del_timer(&conntrack->timeout))
- conntrack->timeout.function((unsigned long)conntrack);
+ ip_ct_sudden_death(conntrack);
} else {
/* Set ASSURED if we see valid ack in ESTABLISHED after SYN_RECV */
if (oldtcpstate == TCP_CONNTRACK_SYN_RECV
diff -urN ex1/net/ipv4/netfilter/ip_conntrack_standalone.c ex2/net/ipv4/netfilter/ip_conntrack_standalone.c
--- ex1/net/ipv4/netfilter/ip_conntrack_standalone.c Sat Aug 3 21:21:12 2002
+++ ex2/net/ipv4/netfilter/ip_conntrack_standalone.c Sat Aug 3 21:24:48 2002
@@ -83,7 +83,7 @@
conntrack->tuplehash[IP_CT_DIR_ORIGINAL]
.tuple.dst.protonum,
timer_pending(&conntrack->timeout)
- ? (conntrack->timeout.expires - jiffies)/HZ : 0);
+ ? (conntrack->timeout_target - jiffies)/HZ : 0);
len += proto->print_conntrack(buffer + len, conntrack);
len += print_tuple(buffer + len,
@@ -140,7 +140,7 @@
READ_LOCK(&ip_conntrack_lock);
/* Traverse hash; print originals then reply. */
for (i = 0; i < ip_conntrack_htable_size; i++) {
- if (LIST_FIND(&ip_conntrack_hash[i], conntrack_iterate,
+ if (SRLIST_FIND(&ip_conntrack_hash[i], conntrack_iterate,
struct ip_conntrack_tuple_hash *,
buffer, offset, &upto, &len, length))
goto finished;
diff -urN ex1/net/ipv4/netfilter/ipt_REJECT.c ex2/net/ipv4/netfilter/ipt_REJECT.c
--- ex1/net/ipv4/netfilter/ipt_REJECT.c Sat Aug 3 21:21:04 2002
+++ ex2/net/ipv4/netfilter/ipt_REJECT.c Sat Aug 3 21:24:43 2002
@@ -69,12 +69,8 @@
return;
/* This packet will not be the same as the other: clear nf fields */
- nf_conntrack_put(nskb->nfct);
- nskb->nfct = NULL;
nskb->nfcache = 0;
-#ifdef CONFIG_NETFILTER_DEBUG
- nskb->nf_debug = 0;
-#endif
+ skb_nf_forget(nskb);
tcph = (struct tcphdr *)((u_int32_t*)nskb->nh.iph + nskb->nh.iph->ihl);
diff -urN ex1/net/ipv6/sit.c ex2/net/ipv6/sit.c
--- ex1/net/ipv6/sit.c Sat Aug 3 21:21:17 2002
+++ ex2/net/ipv6/sit.c Sat Aug 3 21:24:54 2002
@@ -403,13 +403,7 @@
skb->dev = tunnel->dev;
dst_release(skb->dst);
skb->dst = NULL;
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#ifdef CONFIG_NETFILTER_DEBUG
- skb->nf_debug = 0;
-#endif
-#endif
+ skb_nf_forget(skb);
ipip6_ecn_decapsulate(iph, skb);
netif_rx(skb);
read_unlock(&ipip6_lock);
@@ -600,13 +594,7 @@
if ((iph->ttl = tiph->ttl) == 0)
iph->ttl = iph6->hop_limit;
-#ifdef CONFIG_NETFILTER
- nf_conntrack_put(skb->nfct);
- skb->nfct = NULL;
-#ifdef CONFIG_NETFILTER_DEBUG
- skb->nf_debug = 0;
-#endif
-#endif
+ skb_nf_forget(skb);
IPTUNNEL_XMIT();
tunnel->recursion--;