Netdev List
 help / color / mirror / Atom feed
* [Patch] xfrm_policy_destroy: rename and relative fixes
From: WANG Cong @ 2008-01-03 12:05 UTC (permalink / raw)
  To: LKML; +Cc: Herbert Xu, David Miller, netdev


Since __xfrm_policy_destroy is used to destory the resources
allocated by xfrm_policy_alloc. So using the name
__xfrm_policy_destroy is not correspond with xfrm_policy_alloc.
Rename it to xfrm_policy_destroy.

And along with some instances that call xfrm_policy_alloc
but not using xfrm_policy_destroy to destroy the resource,
fix them.

Cc: David Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>

---

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 58dfa82..6eff085 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -505,12 +505,12 @@ static inline void xfrm_pol_hold(struct xfrm_policy *policy)
 		atomic_inc(&policy->refcnt);
 }
 
-extern void __xfrm_policy_destroy(struct xfrm_policy *policy);
+extern void xfrm_policy_destroy(struct xfrm_policy *policy);
 
 static inline void xfrm_pol_put(struct xfrm_policy *policy)
 {
 	if (atomic_dec_and_test(&policy->refcnt))
-		__xfrm_policy_destroy(policy);
+		xfrm_policy_destroy(policy);
 }
 
 #ifdef CONFIG_XFRM_SUB_POLICY
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 26d5e63..3667f44 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -2291,8 +2291,7 @@ static int pfkey_spdadd(struct sock *sk, struct sk_buff *skb, struct sadb_msg *h
 	return 0;
 
 out:
-	security_xfrm_policy_free(xp);
-	kfree(xp);
+	xfrm_policy_destroy(xp);
 	return err;
 }
 
@@ -3236,8 +3235,7 @@ static struct xfrm_policy *pfkey_compile_policy(struct sock *sk, int opt,
 	return xp;
 
 out:
-	security_xfrm_policy_free(xp);
-	kfree(xp);
+	xfrm_policy_destroy(xp);
 	return NULL;
 }
 
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 26b846e..087484e 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -206,7 +206,7 @@ EXPORT_SYMBOL(xfrm_policy_alloc);
 
 /* Destroy xfrm_policy: descendant resources must be released to this moment. */
 
-void __xfrm_policy_destroy(struct xfrm_policy *policy)
+void xfrm_policy_destroy(struct xfrm_policy *policy)
 {
 	BUG_ON(!policy->dead);
 
@@ -218,7 +218,7 @@ void __xfrm_policy_destroy(struct xfrm_policy *policy)
 	security_xfrm_policy_free(policy);
 	kfree(policy);
 }
-EXPORT_SYMBOL(__xfrm_policy_destroy);
+EXPORT_SYMBOL(xfrm_policy_destroy);
 
 static void xfrm_policy_gc_kill(struct xfrm_policy *policy)
 {
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index e75dbdc..73cc755 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1048,7 +1048,7 @@ static struct xfrm_policy *xfrm_policy_construct(struct xfrm_userpolicy_info *p,
 	return xp;
  error:
 	*errp = err;
-	kfree(xp);
+	xfrm_policy_destroy(xp);
 	return NULL;
 }
 

^ permalink raw reply related

* [NET]: Avoid divides in net/core/gen_estimator.c
From: Eric Dumazet @ 2008-01-03 13:12 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org

We can void divides (as seen with CONFIG_CC_OPTIMIZE_FOR_SIZE=y on x86)
changing ((HZ<<idx)/4) to ((HZ/4) << idx)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index daadbcc..86037d1 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -135,7 +135,7 @@ skip:
 	}
 
 	if (!list_empty(&elist[idx].list))
-		mod_timer(&elist[idx].timer, jiffies + ((HZ<<idx)/4));
+		mod_timer(&elist[idx].timer, jiffies + ((HZ/4) << idx));
 	rcu_read_unlock();
 }
 
@@ -191,7 +191,7 @@ int gen_new_estimator(struct gnet_stats_basic *bstats,
 	}
 
 	if (list_empty(&elist[idx].list))
-		mod_timer(&elist[idx].timer, jiffies + ((HZ<<idx)/4));
+		mod_timer(&elist[idx].timer, jiffies + ((HZ/4) << idx));
 
 	list_add_rcu(&est->list, &elist[idx].list);
 	return 0;


^ permalink raw reply related

* Re: [RFC PATCH] NET: Clone the sk_buff->iif field properly
From: Paul Moore @ 2008-01-03 14:01 UTC (permalink / raw)
  To: hadi; +Cc: Jarek Poplawski, netdev
In-Reply-To: <1199359402.4710.17.camel@localhost>

On Thursday 03 January 2008 6:23:22 am jamal wrote:
> On Thu, 2008-03-01 at 10:58 +0100, Jarek Poplawski wrote:
> > On 02-01-2008 17:01, Paul Moore wrote:
> > > This patch is needed by some of the labeled networking changes proposed
> > > for 2.6.25, does anyone have any objections?
> >
> > Probably Jamal could be the most interested (added to CC):
>
> Gracias Jarek.

Yes, thank you.  One of these days I need to learn some git commands other 
than clone, update, and push ;)

> Paul, (out of curiosity more than anything) what are the circumstances
> of the cloned skb - are you going to reinject it back at some point?

Well, I'm not planning on reinjecting the cloned skb at present (doesn't mean 
I won't think up some contrived use in the future) but the stack appears to 
do this already in a few cases and it is causing problems when we try to 
perform access control on the cloned skb.  The git-lblnet "horkage" in 
the -mm tree just before the holiday is the most notable example.

> I cant think of any good reason why iif shouldnt be copied - thats how
> its been from the begining (dammit;->). The reason it hasnt mattered so
> far is everything that needs to write the iif never copied (refer to
> Documentation/networking/tc-actions-env-rules.txt). For correctness i
> think it should be copied. So no objections;

Great.

> The better patch would be to just put it in skb_clone and remove it from
> tc_act_clone.

I assume you mean skb_act_clone()?  That sounds like the best idea, I'll fixup 
the patch and resend it today for more review.

Thanks guys.

-- 
paul moore
linux security @ hp

^ permalink raw reply

* RE: [PATCH 2.6.23.12] net/bonding: option to specify initial bond interface number
From: Jari Takkala @ 2008-01-03 15:23 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev
In-Reply-To: <22042.1199312663@death>

On Wednesday, January 02, 2008 17:24, Jay Vosburgh wrote:
> 	What advantage does this have over:
> 
> # echo +bond5 > /sys/class/net/bonding_masters
> 
> 	which will create a new bonding master for the already-loaded driver?
>

The advantage is that you can load multiple instances of the bonding driver and control the name of the bond interface that will be created. Normally the bond interface name would take the next available number.

In our startup scripts we need to be able to ensure that the interface name is consistent across reboots. Sometimes bond1 may be brought up before bond0 and it may have different options (requiring a different instance of the bonding driver).

I understand that the startup scripts could be modified to account for this, however we also have an IOS like interface to an embedded system where the user can create new bond interfaces and specify the interface number, they may create interfaces out of order and this feature enables them to accomplish that.

Jari

^ permalink raw reply

* RE: [PATCH 2.6.23.12] net/bonding: option to specify initial bond interface number
From: Jari Takkala @ 2008-01-03 15:26 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: netdev
In-Reply-To: <20080102135554.65f86364.randy.dunlap@oracle.com>

On Wednesday, January 02, 2008 16:56, Randy Dunlap wrote:
> You could (should) make <ifnum> be unsigned int and then use
> module_param(ifnum, uint, 0); and then ...
> 
> then this block is mostly useless since ifnum cannot be < 0.
> And how could it ever be > INT_MAX (when ifnum was an int)?
> 
> If <ifnum> is unsigned int but you want to limit it to INT_MAX,
> then half of this if-test would be OK.
> 

Thanks, that makes sense. I simply copied some of the max_bonds code which uses a signed int. I suppose that could be changed as well. I will post back a new patch.

Jari

^ permalink raw reply

* Re: 2.6.24-rc6-mm1
From: Torsten Kaiser @ 2008-01-03 15:37 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Herbert Xu, Andrew Morton, linux-kernel, Neil Brown, netdev,
	Tom Tucker
In-Reply-To: <20080102215706.GB6480@fieldses.org>

On Jan 2, 2008 10:57 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> On Thu, Jan 03, 2008 at 08:51:54AM +1100, Herbert Xu wrote:
> > The two specific trees of interest would be git-nfsd and git-net.
>
> Also, if it's git-nfsd, it'd be useful to test with the current git-nfsd
> from the for-mm branch at:
>
>         git://linux-nfs.org/~bfields/linus.git for-mm
>
> and then any bisection results (even partial) from that tree would help
> immensely....

Wrong URL, its (now?) at git://git.linux-nfs.org/projects/bfields/linux.git

Using "HEAD is now at cd7e1c9... Merge commit 'server-xprt-switch^' into for-mm"
I was able to compile&install 54 packages, so this seems to be working.

Now git-fetch'ing
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25.git

Torsten

^ permalink raw reply

* [LIB] pcounter : unline too big functions
From: Eric Dumazet @ 2008-01-03 15:52 UTC (permalink / raw)
  To: David Miller; +Cc: acme@redhat.com, Herbert Xu, netdev@vger.kernel.org

Before pushing pcounter to Linus tree, I would like to make some adjustments.

Goal is to reduce kernel text size, by unlining too big functions.

When a pcounter is bound to a statically defined per_cpu variable,
we define two small helpers functions. (No more folding function
using the fat for_each_possible_cpu(cpu) ... )

static DEFINE_PER_CPU(int, NAME##_pcounter_values);            
static void NAME##_pcounter_add(struct pcounter *self, int val)
{                                                             
       __get_cpu_var(NAME##_pcounter_values) += val;          
}                                                             
static int NAME##_pcounter_getval(const struct pcounter *self, int cpu)
{                                                                
       return per_cpu(NAME##_pcounter_values, cpu);
}

Fast path is therefore unchanged, while folding/alloc/free is now unlined.

This saves 228 bytes on i386

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

 include/linux/pcounter.h |   80 ++++++++++++-------------------------
 lib/pcounter.c           |   42 +++++++++++++++++--
 2 files changed, 63 insertions(+), 59 deletions(-)


diff --git a/include/linux/pcounter.h b/include/linux/pcounter.h
index 9c4760a..a82d9f2 100644
--- a/include/linux/pcounter.h
+++ b/include/linux/pcounter.h
@@ -1,41 +1,39 @@
 #ifndef __LINUX_PCOUNTER_H
 #define __LINUX_PCOUNTER_H
-
+/*
+ * Using a dynamic percpu 'int' variable has a cost :
+ * 1) Extra dereference
+ * Current per_cpu_ptr() implementation uses an array per 'percpu variable'.
+ * 2) memory cost of NR_CPUS*(32+sizeof(void *)) instead of num_possible_cpus()*4
+ *
+ * This pcounter implementation is an abstraction to be able to use
+ * either a static or a dynamic per cpu variable.
+ * One dynamic per cpu variable gets a fast & cheap implementation, we can
+ * change pcounter implementation too.
+ */
 struct pcounter {
 #ifdef CONFIG_SMP
 	void		(*add)(struct pcounter *self, int inc);
-	int		(*getval)(const struct pcounter *self);
+	int		(*getval)(const struct pcounter *self, int cpu);
 	int		*per_cpu_values;
 #else
 	int		val;
 #endif
 };
 
-/*
- * Special macros to let pcounters use a fast version of {getvalue|add}
- * using a static percpu variable per pcounter instead of an allocated one,
- * saving one dereference.
- * This might be changed if/when dynamic percpu vars become fast.
- */
 #ifdef CONFIG_SMP
-#include <linux/cpumask.h>
 #include <linux/percpu.h>
 
-#define DEFINE_PCOUNTER(NAME)					\
-static DEFINE_PER_CPU(int, NAME##_pcounter_values);		\
-static void NAME##_pcounter_add(struct pcounter *self, int inc)	\
-{								\
-       __get_cpu_var(NAME##_pcounter_values) += inc;		\
-}								\
-								\
-static int NAME##_pcounter_getval(const struct pcounter *self)	\
-{								\
-       int res = 0, cpu;					\
-								\
-       for_each_possible_cpu(cpu)				\
-               res += per_cpu(NAME##_pcounter_values, cpu);	\
-       return res;						\
-}
+#define DEFINE_PCOUNTER(NAME)						\
+static DEFINE_PER_CPU(int, NAME##_pcounter_values);			\
+static void NAME##_pcounter_add(struct pcounter *self, int val)		\
+{									\
+       __get_cpu_var(NAME##_pcounter_values) += val;			\
+}									\
+static int NAME##_pcounter_getval(const struct pcounter *self, int cpu)	\
+{									\
+	return per_cpu(NAME##_pcounter_values, cpu);			\
+}									\
 
 #define PCOUNTER_MEMBER_INITIALIZER(NAME, MEMBER)		\
 	MEMBER = {						\
@@ -43,42 +41,16 @@ static int NAME##_pcounter_getval(const struct pcounter *self)	\
 		.getval = NAME##_pcounter_getval,		\
 	}
 
-extern void pcounter_def_add(struct pcounter *self, int inc);
-extern int pcounter_def_getval(const struct pcounter *self);
-
-static inline int pcounter_alloc(struct pcounter *self)
-{
-	int rc = 0;
-	if (self->add == NULL) {
-		self->per_cpu_values = alloc_percpu(int);
-		if (self->per_cpu_values != NULL) {
-			self->add    = pcounter_def_add;
-			self->getval = pcounter_def_getval;
-		} else
-			rc = 1;
-	}
-	return rc;
-}
-
-static inline void pcounter_free(struct pcounter *self)
-{
-	if (self->per_cpu_values != NULL) {
-		free_percpu(self->per_cpu_values);
-		self->per_cpu_values = NULL;
-		self->getval = NULL;
-		self->add = NULL;
-	}
-}
 
 static inline void pcounter_add(struct pcounter *self, int inc)
 {
 	self->add(self, inc);
 }
 
-static inline int pcounter_getval(const struct pcounter *self)
-{
-	return self->getval(self);
-}
+extern int pcounter_getval(const struct pcounter *self);
+extern int pcounter_alloc(struct pcounter *self);
+extern void pcounter_free(struct pcounter *self);
+
 
 #else /* CONFIG_SMP */
 
diff --git a/lib/pcounter.c b/lib/pcounter.c
index 93feea5..9b56807 100644
--- a/lib/pcounter.c
+++ b/lib/pcounter.c
@@ -7,20 +7,52 @@
 #include <linux/module.h>
 #include <linux/pcounter.h>
 #include <linux/smp.h>
+#include <linux/cpumask.h>
 
-void pcounter_def_add(struct pcounter *self, int inc)
+static void pcounter_dyn_add(struct pcounter *self, int inc)
 {
 	per_cpu_ptr(self->per_cpu_values, smp_processor_id())[0] += inc;
 }
 
-EXPORT_SYMBOL_GPL(pcounter_def_add);
+static int pcounter_dyn_getval(const struct pcounter *self, int cpu)
+{
+	return per_cpu_ptr(self->per_cpu_values, cpu)[0];
+}
 
-int pcounter_def_getval(const struct pcounter *self)
+int pcounter_getval(const struct pcounter *self)
 {
 	int res = 0, cpu;
+
 	for_each_possible_cpu(cpu)
-		res += per_cpu_ptr(self->per_cpu_values, cpu)[0];
+		res += self->getval(self, cpu);
+
 	return res;
 }
+EXPORT_SYMBOL_GPL(pcounter_getval);
+
+int pcounter_alloc(struct pcounter *self)
+{
+	int rc = 0;
+	if (self->add == NULL) {
+		self->per_cpu_values = alloc_percpu(int);
+		if (self->per_cpu_values != NULL) {
+			self->add    = pcounter_dyn_add;
+			self->getval = pcounter_dyn_getval;
+		} else
+			rc = 1;
+	}
+	return rc;
+}
+EXPORT_SYMBOL_GPL(pcounter_alloc);
+
+void pcounter_free(struct pcounter *self)
+{
+	if (self->per_cpu_values != NULL) {
+		free_percpu(self->per_cpu_values);
+		self->per_cpu_values = NULL;
+		self->getval = NULL;
+		self->add = NULL;
+	}
+}
+EXPORT_SYMBOL_GPL(pcounter_free);
 
-EXPORT_SYMBOL_GPL(pcounter_def_getval);

^ permalink raw reply related

* [PATCH] [INET] Fix netdev renaming and inet address labels
From: Mark McLoughlin @ 2008-01-03 15:57 UTC (permalink / raw)
  To: netdev; +Cc: Mark McLoughlin

When re-naming an interface, the previous secondary address
labels get lost e.g.

  $> brctl addbr foo
  $> ip addr add 192.168.0.1 dev foo
  $> ip addr add 192.168.0.2 dev foo label foo:00
  $> ip addr show dev foo | grep inet
    inet 192.168.0.1/32 scope global foo
    inet 192.168.0.2/32 scope global foo:00
  $> ip link set foo name bar
  $> ip addr show dev bar | grep inet
    inet 192.168.0.1/32 scope global bar
    inet 192.168.0.2/32 scope global bar:2

Turns out to be a simple thinko in inetdev_changename() - clearly we
want to look at the address label, rather than the device name, for
a suffix to retain.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
---
 net/ipv4/devinet.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 3168c3d..b42f746 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1027,7 +1027,7 @@ static void inetdev_changename(struct net_device *dev, struct in_device *in_dev)
 		memcpy(ifa->ifa_label, dev->name, IFNAMSIZ);
 		if (named++ == 0)
 			continue;
-		dot = strchr(ifa->ifa_label, ':');
+		dot = strchr(old, ':');
 		if (dot == NULL) {
 			sprintf(old, ":%d", named);
 			dot = old;
-- 
1.5.3.6


^ permalink raw reply related

* Re: [RFC PATCH] NET: Clone the sk_buff->iif field properly
From: Paul Moore @ 2008-01-03 16:15 UTC (permalink / raw)
  To: hadi, netdev; +Cc: Jarek Poplawski
In-Reply-To: <1199359402.4710.17.camel@localhost>

On Thursday 03 January 2008 6:23:22 am jamal wrote:
> On Thu, 2008-03-01 at 10:58 +0100, Jarek Poplawski wrote:
> > On 02-01-2008 17:01, Paul Moore wrote:
> > > This patch is needed by some of the labeled networking changes
> > > proposed for 2.6.25, does anyone have any objections?
> >
> > Probably Jamal could be the most interested (added to CC):
>
> Gracias Jarek.
> Paul, (out of curiosity more than anything) what are the
> circumstances of the cloned skb - are you going to reinject it back
> at some point?
>
> I cant think of any good reason why iif shouldnt be copied - thats
> how its been from the begining (dammit;->). The reason it hasnt
> mattered so far is everything that needs to write the iif never
> copied (refer to Documentation/networking/tc-actions-env-rules.txt).
> For correctness i think it should be copied. So no objections;
> The better patch would be to just put it in skb_clone and remove it
> from tc_act_clone.

While I'm at it, is there some reason for this #define in __skb_clone()?

 #define C(x) n->x = skb->x

... it seems kinda silly to me and I tend to think the code would be 
better without it.

-- 
paul moore
linux security @ hp

^ permalink raw reply

* Re: [PATCH 2.6.23.12] net/bonding: option to specify initial bond interface number
From: Jay Vosburgh @ 2008-01-03 17:03 UTC (permalink / raw)
  To: Jari Takkala; +Cc: netdev
In-Reply-To: <413FEEF1743111439393FB76D0221E4809790A30@leopard.zoo.q9networks.com>

Jari Takkala <Jari.Takkala@Q9.com> wrote:

>On Wednesday, January 02, 2008 17:24, Jay Vosburgh wrote:
>> 	What advantage does this have over:
>> 
>> # echo +bond5 > /sys/class/net/bonding_masters
>> 
>> 	which will create a new bonding master for the already-loaded driver?
>>
>
>The advantage is that you can load multiple instances of the bonding
>driver and control the name of the bond interface that will be
>created. Normally the bond interface name would take the next available
>number.
>
>In our startup scripts we need to be able to ensure that the interface
>name is consistent across reboots. Sometimes bond1 may be brought up
>before bond0 and it may have different options (requiring a different
>instance of the bonding driver).

	With the sysfs interface to bonding, your last statement is not
true; any number of bonding interfaces, with arbitrary names, can be
created and have their options set without loading multiple instances of
the bonding driver.

>I understand that the startup scripts could be modified to account for
>this, however we also have an IOS like interface to an embedded system
>where the user can create new bond interfaces and specify the interface
>number, they may create interfaces out of order and this feature enables
>them to accomplish that.

	Does your embedded system have sysfs available?  If it does,
then it's to your advantage to use the sysfs API; for one thing, the
single instance of the bonding driver with all interfaces through it
should utilize fewer resources than loading the driver repeatedly.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* Re: [patch 7/9][NETNS][IPV6] make mld_max_msf per namespace
From: David Stevens @ 2008-01-03 17:05 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: davem, netdev, netdev-owner
In-Reply-To: <477CC060.10700@fr.ibm.com>

Daniel Lezcano <dlezcano@fr.ibm.com> wrote on 01/03/2008 03:00:48 AM:
...
> With this solution, we can handle different values for the namespaces 
> but these values are driven by the initial network namespace because 
> their values are lesser or equal to the one from the initial network 
> namespace.
> 
> Is it acceptable ?

Daniel,
        If you have the premise that there's a reason for them to be
different, then your original implementation is fine already. It
requires root privilege to change the value, so I don't mind the
ability to raise it to a higher value later.
        I don't object, but I don't understand. I can't think of
any circumstances where I would want to modify it per namespace.
Making it small is not an effective restriction, since someone
*wanting* to use lots of sources can simply do them on different
sockets of the same group. The point is to catch accidental silly
use and it's protecting a global resource so differing values
just change the threshold at which you catch accidental silly
use in different namespaces.
        Setting it to "0" might be a method of preventing its
use entirely in some namespaces, but it's part of the socket
interface-- disabling it isn't something you generally want to
do, either.
        Are you intending to convert all variables to be
per-namespace? If not -- that is, if you will have global
sysctl variables, then I think this one should be one of
those. Actually, all of the IGMP & MLD variables are tied
naturally to the shared interfaces, so should be global,
I think.

                                                        +-DLS


^ permalink raw reply

* [XFRM]: Do not define km_migrate() if !CONFIG_XFRM_MIGRATE
From: Eric Dumazet @ 2008-01-03 17:24 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org

In include/net/xfrm.h we find :

#ifdef CONFIG_XFRM_MIGRATE
extern int km_migrate(struct xfrm_selector *sel, u8 dir, u8 type,
                      struct xfrm_migrate *m, int num_bundles);
...
#endif

We can also guard the function body itself in net/xfrm/xfrm_state.c
with same condition.

(Problem spoted by sparse checker)
make C=2 net/xfrm/xfrm_state.o
...
net/xfrm/xfrm_state.c:1765:5: warning: symbol 'km_migrate' was not declared. Should it be static?
...

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 4dec655..65f5ea4 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -1762,6 +1762,7 @@ void km_policy_expired(struct xfrm_policy *pol, int dir, int hard, u32 pid)
 }
 EXPORT_SYMBOL(km_policy_expired);
 
+#ifdef CONFIG_XFRM_MIGRATE
 int km_migrate(struct xfrm_selector *sel, u8 dir, u8 type,
 	       struct xfrm_migrate *m, int num_migrate)
 {
@@ -1781,6 +1782,7 @@ int km_migrate(struct xfrm_selector *sel, u8 dir, u8 type,
 	return err;
 }
 EXPORT_SYMBOL(km_migrate);
+#endif
 
 int km_report(u8 proto, struct xfrm_selector *sel, xfrm_address_t *addr)
 {

^ permalink raw reply related

* [PATCH 0/2] Labeled networking core stack changes for 2.6.25
From: Paul Moore @ 2008-01-03 17:25 UTC (permalink / raw)
  To: netdev; +Cc: davem

Two small patches for 2.6.25 which enable some new labeled networking features
and fixes.  One patch introduces a new outbound packet LSM hook and one adds
the skb 'iif' field to the list of fields copied during a clone operation.

Both of these patches are part of a much larger patchset that has been under
review on the SELinux and LSM mailing lists for the past few months (some bits
have been under review since this past spring).  I'll save everyone the spam
and not post the entire patchset here, but if you want to take a look you can
find the latest bits here:

 * git://git.infradead.org/users/pcmoore/lblnet-2.6_testing

I'm posting these patches here for review and hopefully an 'Acked-by', not
inclusion into net-2.6.25.  If these patches are acceptable then they will
pushed upstream with the rest of the changes when 2.6.25 is ready.

Thanks.

-- 
paul moore
linux security @ hp

^ permalink raw reply

* [PATCH 1/2] LSM: Add inet_sys_snd_skb() LSM hook
From: Paul Moore @ 2008-01-03 17:25 UTC (permalink / raw)
  To: netdev; +Cc: davem
In-Reply-To: <20080103171649.14445.65274.stgit@flek.americas.hpqcorp.net>

Add an inet_sys_snd_skb() LSM hook to allow the LSM to provide packet level
access control for all outbound packets.  Using the existing postroute_last
netfilter hook turns out to be problematic as it is can be invoked multiple
times for a single packet, e.g. individual IPsec transforms, adding unwanted
overhead and complicating the security policy.

Signed-off-by: Paul Moore <paul.moore@hp.com>
---

 include/linux/security.h |   11 +++++++++++
 net/ipv4/ip_output.c     |    7 +++++++
 net/ipv6/ip6_output.c    |    5 +++++
 security/dummy.c         |    8 +++++++-
 security/security.c      |    6 ++++++
 5 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index db19c92..1b8d332 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -876,6 +876,10 @@ struct request_sock;
  *     Sets the connection's peersid to the secmark on skb.
  * @req_classify_flow:
  *	Sets the flow's sid to the openreq sid.
+ * @inet_sys_snd_skb:
+ *	Check permissions on outgoing network packets.
+ *	@skb is the packet to check
+ *	@family is the packet's address family
  *
  * Security hooks for XFRM operations.
  *
@@ -1416,6 +1420,7 @@ struct security_operations {
 	void (*inet_csk_clone)(struct sock *newsk, const struct request_sock *req);
 	void (*inet_conn_established)(struct sock *sk, struct sk_buff *skb);
 	void (*req_classify_flow)(const struct request_sock *req, struct flowi *fl);
+	int (*inet_sys_snd_skb)(struct sk_buff *skb, int family);
 #endif	/* CONFIG_SECURITY_NETWORK */
 
 #ifdef CONFIG_SECURITY_NETWORK_XFRM
@@ -2328,6 +2333,7 @@ void security_sk_free(struct sock *sk);
 void security_sk_clone(const struct sock *sk, struct sock *newsk);
 void security_sk_classify_flow(struct sock *sk, struct flowi *fl);
 void security_req_classify_flow(const struct request_sock *req, struct flowi *fl);
+int security_inet_sys_snd_skb(struct sk_buff *skb, int family);
 void security_sock_graft(struct sock*sk, struct socket *parent);
 int security_inet_conn_request(struct sock *sk,
 			struct sk_buff *skb, struct request_sock *req);
@@ -2471,6 +2477,11 @@ static inline void security_req_classify_flow(const struct request_sock *req, st
 {
 }
 
+static inline int security_inet_sys_snd_skb(struct sk_buff *skb, int family)
+{
+	return 0;
+}
+
 static inline void security_sock_graft(struct sock* sk, struct socket *parent)
 {
 }
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index fd99fbd..82a7297 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -204,6 +204,8 @@ static inline int ip_skb_dst_mtu(struct sk_buff *skb)
 
 static int ip_finish_output(struct sk_buff *skb)
 {
+	int err;
+
 #if defined(CONFIG_NETFILTER) && defined(CONFIG_XFRM)
 	/* Policy lookup after SNAT yielded a new policy */
 	if (skb->dst->xfrm != NULL) {
@@ -211,6 +213,11 @@ static int ip_finish_output(struct sk_buff *skb)
 		return dst_output(skb);
 	}
 #endif
+
+	err = security_inet_sys_snd_skb(skb, AF_INET);
+	if (err)
+		return err;
+
 	if (skb->len > ip_skb_dst_mtu(skb) && !skb_is_gso(skb))
 		return ip_fragment(skb, ip_finish_output2);
 	else
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6338a9c..44ddf32 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -72,8 +72,13 @@ static __inline__ void ipv6_select_ident(struct sk_buff *skb, struct frag_hdr *f
 
 static int ip6_output_finish(struct sk_buff *skb)
 {
+	int err;
 	struct dst_entry *dst = skb->dst;
 
+	err = security_inet_sys_snd_skb(skb, AF_INET6);
+	if (err)
+		return err;
+
 	if (dst->hh)
 		return neigh_hh_output(dst->hh, skb);
 	else if (dst->neighbour)
diff --git a/security/dummy.c b/security/dummy.c
index 0b62f95..384979a 100644
--- a/security/dummy.c
+++ b/security/dummy.c
@@ -848,6 +848,11 @@ static inline void dummy_req_classify_flow(const struct request_sock *req,
 			struct flowi *fl)
 {
 }
+
+static inline int dummy_inet_sys_snd_skb(struct sk_buff *skb, int family)
+{
+	return 0;
+}
 #endif	/* CONFIG_SECURITY_NETWORK */
 
 #ifdef CONFIG_SECURITY_NETWORK_XFRM
@@ -1122,7 +1127,8 @@ void security_fixup_ops (struct security_operations *ops)
 	set_to_dummy_if_null(ops, inet_csk_clone);
 	set_to_dummy_if_null(ops, inet_conn_established);
 	set_to_dummy_if_null(ops, req_classify_flow);
- #endif	/* CONFIG_SECURITY_NETWORK */
+	set_to_dummy_if_null(ops, inet_sys_snd_skb);
+#endif	/* CONFIG_SECURITY_NETWORK */
 #ifdef  CONFIG_SECURITY_NETWORK_XFRM
 	set_to_dummy_if_null(ops, xfrm_policy_alloc_security);
 	set_to_dummy_if_null(ops, xfrm_policy_clone_security);
diff --git a/security/security.c b/security/security.c
index 3bdcada..7f55459 100644
--- a/security/security.c
+++ b/security/security.c
@@ -961,6 +961,12 @@ void security_req_classify_flow(const struct request_sock *req, struct flowi *fl
 }
 EXPORT_SYMBOL(security_req_classify_flow);
 
+int security_inet_sys_snd_skb(struct sk_buff *skb, int family)
+{
+	return security_ops->inet_sys_snd_skb(skb, family);
+}
+EXPORT_SYMBOL(security_inet_sys_snd_skb);
+
 void security_sock_graft(struct sock *sk, struct socket *parent)
 {
 	security_ops->sock_graft(sk, parent);


^ permalink raw reply related

* [PATCH 2/2] NET: Clone the sk_buff 'iif' field in __skb_clone()
From: Paul Moore @ 2008-01-03 17:25 UTC (permalink / raw)
  To: netdev; +Cc: davem
In-Reply-To: <20080103171649.14445.65274.stgit@flek.americas.hpqcorp.net>

Both NetLabel and SELinux (other LSMs may grow to use it as well) rely on the
'iif' field to determine the receiving network interface of inbound packets.
Unfortunately, at present this field is not preserved across a skb clone
operation which can lead to garbage values if the cloned skb is sent back
through the network stack.  This patch corrects this problem by properly
copying the 'iif' field in __skb_clone() and removing the 'iif' field
assignment from skb_act_clone() since it is no longer needed.

Also, while we are here, get rid of that silly C() macro.

Signed-off-by: Paul Moore <paul.moore@hp.com>
---

 include/net/sch_generic.h |    1 -
 net/core/skbuff.c         |   20 +++++++++-----------
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index c926551..4c3b351 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -325,7 +325,6 @@ static inline struct sk_buff *skb_act_clone(struct sk_buff *skb, gfp_t gfp_mask)
 		n->tc_verd = SET_TC_VERD(n->tc_verd, 0);
 		n->tc_verd = CLR_TC_OK2MUNGE(n->tc_verd);
 		n->tc_verd = CLR_TC_MUNGED(n->tc_verd);
-		n->iif = skb->iif;
 	}
 	return n;
 }
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 5b4ce9b..c726cd4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -407,31 +407,29 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 
 static struct sk_buff *__skb_clone(struct sk_buff *n, struct sk_buff *skb)
 {
-#define C(x) n->x = skb->x
-
 	n->next = n->prev = NULL;
 	n->sk = NULL;
 	__copy_skb_header(n, skb);
 
-	C(len);
-	C(data_len);
-	C(mac_len);
+	n->iif = skb->iif;
+	n->len = skb->len;
+	n->data_len = skb->data_len;
+	n->mac_len = skb->mac_len;
 	n->cloned = 1;
 	n->hdr_len = skb->nohdr ? skb_headroom(skb) : skb->hdr_len;
 	n->nohdr = 0;
 	n->destructor = NULL;
-	C(truesize);
+	n->truesize = skb->truesize;
 	atomic_set(&n->users, 1);
-	C(head);
-	C(data);
-	C(tail);
-	C(end);
+	n->head = skb->head;
+	n->data = skb->data;
+	n->tail = skb->tail;
+	n->end = skb->end;
 
 	atomic_inc(&(skb_shinfo(skb)->dataref));
 	skb->cloned = 1;
 
 	return n;
-#undef C
 }
 
 /**


^ permalink raw reply related

* [NET]: prot_inuse cleanups and optimizations
From: Eric Dumazet @ 2008-01-03 17:41 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org, acme@redhat.com

1) Cleanups (all functions are prefixed by sock_prot_inuse)

sock_prot_inc_use(prot) -> sock_prot_inuse_add(prot,-1)
sock_prot_dec_use(prot) -> sock_prot_inuse_add(prot,-1)
sock_prot_inuse()       -> sock_prot_inuse_get()

New functions :

sock_prot_inuse_init() and sock_prot_inuse_free() to abstract pcounter use.

2) if CONFIG_PROC_FS=n, we can zap 'inuse' member from "struct proto",
since nobody wants to read the inuse value.

This saves 1372 bytes on i386/SMP and some cpu cycles.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

 include/net/inet_hashtables.h |    2 -
 include/net/sock.h            |   40 ++++++++++++++++++++++++--------
 include/net/udp.h             |    2 -
 net/core/sock.c               |    6 ++--
 net/ipv4/inet_hashtables.c    |    6 ++--
 net/ipv4/inet_timewait_sock.c |    2 -
 net/ipv4/proc.c               |    9 ++++---
 net/ipv4/raw.c                |    4 +--
 net/ipv4/udp.c                |    2 -
 net/ipv6/inet6_hashtables.c   |    4 +--
 net/ipv6/ipv6_sockglue.c      |    8 +++---
 net/ipv6/proc.c               |    8 +++---
 12 files changed, 57 insertions(+), 36 deletions(-)


diff --git a/include/net/sock.h b/include/net/sock.h
index a52c2ea..89e04e4 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -548,7 +548,9 @@ struct proto {
 	int			(*get_port)(struct sock *sk, unsigned short snum);
 
 	/* Keeping track of sockets in use */
+#ifdef CONFIG_PROC_FS
 	struct pcounter		inuse;
+#endif
 
 	/* Memory pressure */
 	void			(*enter_memory_pressure)(void);
@@ -584,9 +586,6 @@ struct proto {
 #endif
 };
 
-#define DEFINE_PROTO_INUSE(NAME) DEFINE_PCOUNTER(NAME)
-#define REF_PROTO_INUSE(NAME) PCOUNTER_MEMBER_INITIALIZER(NAME, .inuse)
-
 extern int proto_register(struct proto *prot, int alloc_slab);
 extern void proto_unregister(struct proto *prot);
 
@@ -615,21 +614,42 @@ static inline void sk_refcnt_debug_release(const struct sock *sk)
 #define sk_refcnt_debug_release(sk) do { } while (0)
 #endif /* SOCK_REFCNT_DEBUG */
 
+
+#ifdef CONFIG_PROC_FS
+# define DEFINE_PROTO_INUSE(NAME) DEFINE_PCOUNTER(NAME)
+# define REF_PROTO_INUSE(NAME) PCOUNTER_MEMBER_INITIALIZER(NAME, .inuse)
 /* Called with local bh disabled */
-static __inline__ void sock_prot_inc_use(struct proto *prot)
+static inline void sock_prot_inuse_add(struct proto *prot, int inc)
 {
-	pcounter_add(&prot->inuse, 1);
+	pcounter_add(&prot->inuse, inc);
 }
-
-static __inline__ void sock_prot_dec_use(struct proto *prot)
+static inline int sock_prot_inuse_init(struct proto *proto)
 {
-	pcounter_add(&prot->inuse, -1);
+	return pcounter_alloc(&proto->inuse);
 }
-
-static __inline__ int sock_prot_inuse(struct proto *proto)
+static inline int sock_prot_inuse_get(struct proto *proto)
 {
 	return pcounter_getval(&proto->inuse);
 }
+static inline void sock_prot_inuse_free(struct proto *proto)
+{
+	pcounter_free(&proto->inuse);
+}
+#else
+# define DEFINE_PROTO_INUSE(NAME)
+# define REF_PROTO_INUSE(NAME)
+static void inline sock_prot_inuse_add(struct proto *prot, int inc)
+{
+}
+static int inline sock_prot_inuse_init(struct proto *proto)
+{
+	return 0;
+}
+static void inline sock_prot_inuse_free(struct proto *proto)
+{
+}
+#endif
+
 
 /* With per-bucket locks this operation is not-atomic, so that
  * this version is not worse.
diff --git a/include/net/udp.h b/include/net/udp.h
index 93796be..c6669c0 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -115,7 +115,7 @@ static inline void udp_lib_unhash(struct sock *sk)
 	write_lock_bh(&udp_hash_lock);
 	if (sk_del_node_init(sk)) {
 		inet_sk(sk)->num = 0;
-		sock_prot_dec_use(sk->sk_prot);
+		sock_prot_inuse_add(sk->sk_prot, -1);
 	}
 	write_unlock_bh(&udp_hash_lock);
 }
diff --git a/net/core/sock.c b/net/core/sock.c
index 3d7757e..1c4b1cd 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1913,7 +1913,7 @@ int proto_register(struct proto *prot, int alloc_slab)
 	char *request_sock_slab_name = NULL;
 	char *timewait_sock_slab_name;
 
-	if (pcounter_alloc(&prot->inuse) != 0) {
+	if (sock_prot_inuse_init(prot) != 0) {
 		printk(KERN_CRIT "%s: Can't alloc inuse counters!\n", prot->name);
 		goto out;
 	}
@@ -1984,7 +1984,7 @@ out_free_sock_slab:
 	kmem_cache_destroy(prot->slab);
 	prot->slab = NULL;
 out_free_inuse:
-	pcounter_free(&prot->inuse);
+	sock_prot_inuse_free(prot);
 out:
 	return -ENOBUFS;
 }
@@ -1997,7 +1997,7 @@ void proto_unregister(struct proto *prot)
 	list_del(&prot->node);
 	write_unlock(&proto_list_lock);
 
-	pcounter_free(&prot->inuse);
+	sock_prot_inuse_free(prot);
 
 	if (prot->slab != NULL) {
 		kmem_cache_destroy(prot->slab);
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 88a059e..619c63c 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -278,7 +278,7 @@ unique:
 	sk->sk_hash = hash;
 	BUG_TRAP(sk_unhashed(sk));
 	__sk_add_node(sk, &head->chain);
-	sock_prot_inc_use(sk->sk_prot);
+	sock_prot_inuse_add(sk->sk_prot, 1);
 	write_unlock(lock);
 
 	if (twp) {
@@ -321,7 +321,7 @@ void __inet_hash_nolisten(struct inet_hashinfo *hashinfo, struct sock *sk)
 
 	write_lock(lock);
 	__sk_add_node(sk, list);
-	sock_prot_inc_use(sk->sk_prot);
+	sock_prot_inuse_add(sk->sk_prot, 1);
 	write_unlock(lock);
 }
 EXPORT_SYMBOL_GPL(__inet_hash_nolisten);
@@ -342,7 +342,7 @@ void __inet_hash(struct inet_hashinfo *hashinfo, struct sock *sk)
 
 	inet_listen_wlock(hashinfo);
 	__sk_add_node(sk, list);
-	sock_prot_inc_use(sk->sk_prot);
+	sock_prot_inuse_add(sk->sk_prot, 1);
 	write_unlock(lock);
 	wake_up(&hashinfo->lhash_wait);
 }
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index 1b7db42..876169f 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -91,7 +91,7 @@ void __inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
 
 	/* Step 2: Remove SK from established hash. */
 	if (__sk_del_node_init(sk))
-		sock_prot_dec_use(sk->sk_prot);
+		sock_prot_inuse_add(sk->sk_prot, -1);
 
 	/* Step 3: Hash TW into TIMEWAIT chain. */
 	inet_twsk_add_node(tw, &ehead->twchain);
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 53bc010..cb3787f 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -53,13 +53,14 @@ static int sockstat_seq_show(struct seq_file *seq, void *v)
 {
 	socket_seq_show(seq);
 	seq_printf(seq, "TCP: inuse %d orphan %d tw %d alloc %d mem %d\n",
-		   sock_prot_inuse(&tcp_prot), atomic_read(&tcp_orphan_count),
+		   sock_prot_inuse_get(&tcp_prot),
+		   atomic_read(&tcp_orphan_count),
 		   tcp_death_row.tw_count, atomic_read(&tcp_sockets_allocated),
 		   atomic_read(&tcp_memory_allocated));
-	seq_printf(seq, "UDP: inuse %d mem %d\n", sock_prot_inuse(&udp_prot),
+	seq_printf(seq, "UDP: inuse %d mem %d\n", sock_prot_inuse_get(&udp_prot),
 		   atomic_read(&udp_memory_allocated));
-	seq_printf(seq, "UDPLITE: inuse %d\n", sock_prot_inuse(&udplite_prot));
-	seq_printf(seq, "RAW: inuse %d\n", sock_prot_inuse(&raw_prot));
+	seq_printf(seq, "UDPLITE: inuse %d\n", sock_prot_inuse_get(&udplite_prot));
+	seq_printf(seq, "RAW: inuse %d\n", sock_prot_inuse_get(&raw_prot));
 	seq_printf(seq,  "FRAG: inuse %d memory %d\n",
 			ip_frag_nqueues(), ip_frag_mem());
 	return 0;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 1bd188f..b1c423a 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -92,7 +92,7 @@ void raw_hash_sk(struct sock *sk, struct raw_hashinfo *h)
 
 	write_lock_bh(&h->lock);
 	sk_add_node(sk, head);
-	sock_prot_inc_use(sk->sk_prot);
+	sock_prot_inuse_add(sk->sk_prot, 1);
 	write_unlock_bh(&h->lock);
 }
 EXPORT_SYMBOL_GPL(raw_hash_sk);
@@ -101,7 +101,7 @@ void raw_unhash_sk(struct sock *sk, struct raw_hashinfo *h)
 {
 	write_lock_bh(&h->lock);
 	if (sk_del_node_init(sk))
-		sock_prot_dec_use(sk->sk_prot);
+		sock_prot_inuse_add(sk->sk_prot, -1);
 	write_unlock_bh(&h->lock);
 }
 EXPORT_SYMBOL_GPL(raw_unhash_sk);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 02fcccd..cb2411c 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -230,7 +230,7 @@ gotit:
 	if (sk_unhashed(sk)) {
 		head = &udptable[snum & (UDP_HTABLE_SIZE - 1)];
 		sk_add_node(sk, head);
-		sock_prot_inc_use(sk->sk_prot);
+		sock_prot_inuse_add(sk->sk_prot, 1);
 	}
 	error = 0;
 fail:
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index adc73ad..d0b3447 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -43,7 +43,7 @@ void __inet6_hash(struct inet_hashinfo *hashinfo,
 	}
 
 	__sk_add_node(sk, list);
-	sock_prot_inc_use(sk->sk_prot);
+	sock_prot_inuse_add(sk->sk_prot, 1);
 	write_unlock(lock);
 }
 EXPORT_SYMBOL(__inet6_hash);
@@ -216,7 +216,7 @@ unique:
 	BUG_TRAP(sk_unhashed(sk));
 	__sk_add_node(sk, &head->chain);
 	sk->sk_hash = hash;
-	sock_prot_inc_use(sk->sk_prot);
+	sock_prot_inuse_add(sk->sk_prot, 1);
 	write_unlock(lock);
 
 	if (twp != NULL) {
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 20fece4..bf2a686 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -268,8 +268,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				struct inet_connection_sock *icsk = inet_csk(sk);
 
 				local_bh_disable();
-				sock_prot_dec_use(sk->sk_prot);
-				sock_prot_inc_use(&tcp_prot);
+				sock_prot_inuse_add(sk->sk_prot, -1);
+				sock_prot_inuse_add(&tcp_prot, 1);
 				local_bh_enable();
 				sk->sk_prot = &tcp_prot;
 				icsk->icsk_af_ops = &ipv4_specific;
@@ -282,8 +282,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				if (sk->sk_protocol == IPPROTO_UDPLITE)
 					prot = &udplite_prot;
 				local_bh_disable();
-				sock_prot_dec_use(sk->sk_prot);
-				sock_prot_inc_use(prot);
+				sock_prot_inuse_add(sk->sk_prot, -1);
+				sock_prot_inuse_add(prot, 1);
 				local_bh_enable();
 				sk->sk_prot = prot;
 				sk->sk_socket->ops = &inet_dgram_ops;
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 9fc2428..6b0314e 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -36,13 +36,13 @@ static struct proc_dir_entry *proc_net_devsnmp6;
 static int sockstat6_seq_show(struct seq_file *seq, void *v)
 {
 	seq_printf(seq, "TCP6: inuse %d\n",
-		       sock_prot_inuse(&tcpv6_prot));
+		       sock_prot_inuse_get(&tcpv6_prot));
 	seq_printf(seq, "UDP6: inuse %d\n",
-		       sock_prot_inuse(&udpv6_prot));
+		       sock_prot_inuse_get(&udpv6_prot));
 	seq_printf(seq, "UDPLITE6: inuse %d\n",
-			sock_prot_inuse(&udplitev6_prot));
+			sock_prot_inuse_get(&udplitev6_prot));
 	seq_printf(seq, "RAW6: inuse %d\n",
-		       sock_prot_inuse(&rawv6_prot));
+		       sock_prot_inuse_get(&rawv6_prot));
 	seq_printf(seq, "FRAG6: inuse %d memory %d\n",
 		       ip6_frag_nqueues(), ip6_frag_mem());
 	return 0;
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 65ddb25..761bdc0 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -293,7 +293,7 @@ static inline void inet_unhash(struct inet_hashinfo *hashinfo, struct sock *sk)
 	}
 
 	if (__sk_del_node_init(sk))
-		sock_prot_dec_use(sk->sk_prot);
+		sock_prot_inuse_add(sk->sk_prot, -1);
 	write_unlock_bh(lock);
 out:
 	if (sk->sk_state == TCP_LISTEN)

^ permalink raw reply related

* Re: [PATCH 2/2] NET: Clone the sk_buff 'iif' field in __skb_clone()
From: Joe Perches @ 2008-01-03 18:33 UTC (permalink / raw)
  To: Paul Moore; +Cc: netdev, David Miller
In-Reply-To: <20080103172545.14445.95317.stgit@flek.americas.hpqcorp.net>

On Thu, 2008-01-03 at 12:25 -0500, Paul Moore wrote:
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 5b4ce9b..c726cd4 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -407,31 +407,29 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
>  
>  static struct sk_buff *__skb_clone(struct sk_buff *n, struct sk_buff *skb)
>  {
> -#define C(x) n->x = skb->x
> -
>  	n->next = n->prev = NULL;
>  	n->sk = NULL;
>  	__copy_skb_header(n, skb);
>  
> -	C(len);
> -	C(data_len);
> -	C(mac_len);
> +	n->iif = skb->iif;
> +	n->len = skb->len;
> +	n->data_len = skb->data_len;
> +	n->mac_len = skb->mac_len;
>  	n->cloned = 1;
>  	n->hdr_len = skb->nohdr ? skb_headroom(skb) : skb->hdr_len;
>  	n->nohdr = 0;

To reduce possible cacheline bounces, shouldn't the order of
operation on the elements be in struct order?

iif should be after destructor.

nohdr then hdr_len

>  	n->destructor = NULL;
> -	C(truesize);
> +	n->truesize = skb->truesize;
>  	atomic_set(&n->users, 1);
> -	C(head);
> -	C(data);
> -	C(tail);
> -	C(end);
> +	n->head = skb->head;
> +	n->data = skb->data;
> +	n->tail = skb->tail;
> +	n->end = skb->end;

and perhaps tail,end,head,data,truesize,users?



^ permalink raw reply

* Re: [PATCH 2/2] NET: Clone the sk_buff 'iif' field in __skb_clone()
From: Paul Moore @ 2008-01-03 18:40 UTC (permalink / raw)
  To: Joe Perches; +Cc: netdev, David Miller
In-Reply-To: <1199385186.10508.11.camel@localhost>

On Thursday 03 January 2008 1:33:05 pm Joe Perches wrote:
> On Thu, 2008-01-03 at 12:25 -0500, Paul Moore wrote:
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 5b4ce9b..c726cd4 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -407,31 +407,29 @@ static void __copy_skb_header(struct sk_buff
> > *new, const struct sk_buff *old)
> >
> >  static struct sk_buff *__skb_clone(struct sk_buff *n, struct
> > sk_buff *skb) {
> > -#define C(x) n->x = skb->x
> > -
> >  	n->next = n->prev = NULL;
> >  	n->sk = NULL;
> >  	__copy_skb_header(n, skb);
> >
> > -	C(len);
> > -	C(data_len);
> > -	C(mac_len);
> > +	n->iif = skb->iif;
> > +	n->len = skb->len;
> > +	n->data_len = skb->data_len;
> > +	n->mac_len = skb->mac_len;
> >  	n->cloned = 1;
> >  	n->hdr_len = skb->nohdr ? skb_headroom(skb) : skb->hdr_len;
> >  	n->nohdr = 0;
>
> To reduce possible cacheline bounces, shouldn't the order of
> operation on the elements be in struct order?

Sounds reasonable to me, I'll adjust the function to match the field 
offsets.

-- 
paul moore
linux security @ hp

^ permalink raw reply

* Re: 2.6.24-rc6-mm1
From: J. Bruce Fields @ 2008-01-03 18:52 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Herbert Xu, Andrew Morton, linux-kernel, Neil Brown, netdev,
	Tom Tucker
In-Reply-To: <64bb37e0801030737r435ff2b1r6632ab98c2286f46@mail.gmail.com>

On Thu, Jan 03, 2008 at 04:37:46PM +0100, Torsten Kaiser wrote:
> On Jan 2, 2008 10:57 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > On Thu, Jan 03, 2008 at 08:51:54AM +1100, Herbert Xu wrote:
> > > The two specific trees of interest would be git-nfsd and git-net.
> >
> > Also, if it's git-nfsd, it'd be useful to test with the current git-nfsd
> > from the for-mm branch at:
> >
> >         git://linux-nfs.org/~bfields/linus.git for-mm
> >
> > and then any bisection results (even partial) from that tree would help
> > immensely....
> 
> Wrong URL, its (now?) at git://git.linux-nfs.org/projects/bfields/linux.git

Whoops, apologies.  Actually, the only change required should have been
to change "linus" to "linux".  I should cut-n-paste and not relying on
typing it right each time....

> Using "HEAD is now at cd7e1c9... Merge commit 'server-xprt-switch^' into for-mm"
> I was able to compile&install 54 packages, so this seems to be working.

Hm, OK, thanks again for all this followup.

--b.

> 
> Now git-fetch'ing
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25.git

^ permalink raw reply

* RE: [PATCH 2.6.23.12] net/bonding: option to specify initial bond interface number
From: Jari Takkala @ 2008-01-03 19:19 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev
In-Reply-To: <29661.1199379831@death>

On Thursday, January 03, 2008 12:04, Jay Vosburgh wrote:
>> In our startup scripts we need to be able to ensure that the
>> interface name is consistent across reboots. Sometimes bond1 may be
>> brought up before bond0 and it may have different options (requiring
>> a different instance of the bonding driver).
> 
> 	With the sysfs interface to bonding, your last statement is not
> true; any number of bonding interfaces, with arbitrary names, can be
> created and have their options set without loading multiple instances
> of the bonding driver.
> 
> 	Does your embedded system have sysfs available?  If it does,
> then it's to your advantage to use the sysfs API; for one thing, the
> single instance of the bonding driver with all interfaces through it
> should utilize fewer resources than loading the driver repeatedly.
> 

We do have sysfs available. I actually did not realize that it is possible to have one bonding instance and create multiple interfaces with different options. It looks like when I initially wrote this patch  (~2.6.20) either this feature was not available, or the bonding documentation may have been out of date. I see now in the latest Documentation/networking/bonding.txt that this is explained clearly.

Thank you for your help, I will drop the patch and move to using the sysfs method. Although I may need to patch it for our own uses to either not create bond0 on module load, or I may just run 'echo -bond0 > /sys/class/net/bonding_masters' immediately after loading the module.

Jari


^ permalink raw reply

* Re: [RFC PATCH] NET: Clone the sk_buff->iif field properly
From: Jarek Poplawski @ 2008-01-03 21:13 UTC (permalink / raw)
  To: Paul Moore; +Cc: hadi, netdev
In-Reply-To: <200801031115.34886.paul.moore@hp.com>

On Thu, Jan 03, 2008 at 11:15:34AM -0500, Paul Moore wrote:
...
> While I'm at it, is there some reason for this #define in __skb_clone()?
> 
>  #define C(x) n->x = skb->x
> 
> ... it seems kinda silly to me and I tend to think the code would be 
> better without it.

IMHO, if there are a lot of this, it's definitely more readable: easier
to check which values are simply copied and which need something more.
But, as usual, it's probably a question of taste, and of course without
it it would definitely look classier...

Cheers,
Jarek P.

PS: I hope you didn't suggest earlier my (better?) knowlege of git;
otherwise don't bother: with your git push you are far ahead of my
gitweb 'degree'.

^ permalink raw reply

* Re: [RFC PATCH] NET: Clone the sk_buff->iif field properly
From: Paul Moore @ 2008-01-03 21:20 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: hadi, netdev
In-Reply-To: <20080103211312.GA7258@ami.dom.local>

On Thursday 03 January 2008 4:13:12 pm Jarek Poplawski wrote:
> On Thu, Jan 03, 2008 at 11:15:34AM -0500, Paul Moore wrote:
> ...
>
> > While I'm at it, is there some reason for this #define in
> > __skb_clone()?
> >
> >  #define C(x) n->x = skb->x
> >
> > ... it seems kinda silly to me and I tend to think the code would
> > be better without it.
>
> IMHO, if there are a lot of this, it's definitely more readable:
> easier to check which values are simply copied and which need
> something more. But, as usual, it's probably a question of taste, and
> of course without it it would definitely look classier...

For me personally, I would argue the readability bit.  Whenever I see a 
function/macro call I have to go find the function/macro definition 
before I can understand what it is doing.  Granted, the macro is 
defined "local" to the function but my point is that being able to look 
at a line of code and understand it without having to look elsewhere is 
a nice quality.  To loose that simply because someone wants to save a 
few keystrokes is a mistake from my point of view.

Besides, if we are really interested in writing a kernel with the least 
number of keystrokes possible wouldn't we be doing it in perl?  I'm 
sure somebody out there has ported the current kernel source to a 
single line of perl ... ;)

> PS: I hope you didn't suggest earlier my (better?) knowlege of git;
> otherwise don't bother: with your git push you are far ahead of my
> gitweb 'degree'.

 ;)

On a serious note, your comment about gitweb made me poke around with 
some of the extra little features ... that 'history' link for each file 
is pretty cool!

-- 
paul moore
linux security @ hp

^ permalink raw reply

* Re: [RFC PATCH] NET: Clone the sk_buff->iif field properly
From: Jarek Poplawski @ 2008-01-03 22:06 UTC (permalink / raw)
  To: Paul Moore; +Cc: hadi, netdev
In-Reply-To: <200801031620.06603.paul.moore@hp.com>

On Thu, Jan 03, 2008 at 04:20:06PM -0500, Paul Moore wrote:
...
> For me personally, I would argue the readability bit.  Whenever I see a 
> function/macro call I have to go find the function/macro definition 
> before I can understand what it is doing.  Granted, the macro is 
> defined "local" to the function but my point is that being able to look 
> at a line of code and understand it without having to look elsewhere is 
> a nice quality.  To loose that simply because someone wants to save a 
> few keystrokes is a mistake from my point of view.

When I first read this __skb_clone() I had mixed emotions about this
macro too. Later I didn't think about it, and only now, after your
question I've done a quick test, compared with __copy_skb_header() and
it seems there is really less rightwords eye moving. So, it was only
about this one very "local" macro.

I'm not macros fan in general: just yesterday I've cursed a bit at some
guy (I forgot the name...), who gave all these "meaningful" names to
macros in linux/pkt_cls.h. But, maybe after some time I'll start to
defend them as well... Especially when I try to imagine doing the same
without them.

Jarek P.

^ permalink raw reply

* Re: [RFC PATCH] NET: Clone the sk_buff->iif field properly
From: Jarek Poplawski @ 2008-01-03 22:49 UTC (permalink / raw)
  To: Paul Moore; +Cc: hadi, netdev
In-Reply-To: <20080103220608.GB7258@ami.dom.local>

On Thu, Jan 03, 2008 at 11:06:08PM +0100, Jarek Poplawski wrote:
...
> I'm not macros fan in general: just yesterday I've cursed a bit at some
> guy (I forgot the name...), who gave all these "meaningful" names to
> macros in linux/pkt_cls.h. But, maybe after some time I'll start to
> defend them as well... Especially when I try to imagine doing the same
> without them.

...Jamal, as a matter of fact, I don't know why, just about writing this
previous message, I started to like all these names without exception!
Isn't it a miracle?

Jarek P.

^ permalink raw reply

* [PATCH 1/1] r8169: fix missing loop variable increment
From: Francois Romieu @ 2008-01-03 22:38 UTC (permalink / raw)
  To: jeff
  Cc: netdev, Andrew Morton, David S. Miller,
	"Citizen Lee <citizen_lee

Spotted-by: Citizen Lee <citizen_lee@thecus.com>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
---
 drivers/net/r8169.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 5863190..6ee9db1 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -2002,7 +2002,7 @@ static void rtl8169_set_magic_reg(void __iomem *ioaddr, unsigned mac_version)
 	u32 clk;
 
 	clk = RTL_R8(Config2) & PCI_Clock_66MHz;
-	for (i = 0; i < ARRAY_SIZE(cfg2_info); i++) {
+	for (i = 0; i < ARRAY_SIZE(cfg2_info); i++, p++) {
 		if ((p->mac_version == mac_version) && (p->clk == clk)) {
 			RTL_W32(0x7c, p->val);
 			break;
-- 
1.5.3.3


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox