Netdev List


* Re: 2.6.38-rc3-git1: Reported regressions 2.6.36 -> 2.6.37
From: Carlos R. Mafra @ 2011-02-04  7:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Keith Packard, Dave Airlie, Dave Airlie, Rafael J. Wysocki,
	Takashi Iwai, Linux Kernel Mailing List, Maciej Rutecki,
	Florian Mickler, Andrew Morton, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	Linux Wireless List, DRI
In-Reply-To: <AANLkTin-9a5Z3qq4t8UakRvgB1G3_CT2RLKMVaHXvnLr-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Thu  3.Feb'11 at 17:11:14 -0800, Linus Torvalds wrote:
> On Thu, Feb 3, 2011 at 5:05 PM, Keith Packard <keithp-aN4HjG94KOLQT0dZR+AlfA@public.gmane.org> wrote:
> >
> > The goal is to make it so that when you *do* set a mode, DPMS gets set
> > to ON (as the monitor will actually be "on" at that point). Here's a
> > patch which does the DPMS_ON precisely when setting a mode.
> 
> Ok, patch looks sane, but it does leave me with the "what about the
> 'fb_changed' case?" question. Is that case basically guaranteed to not
> change any existing dpms state?
> 
> > (note, this patch compiles, but is otherwise only lightly tested).
> 
> Carlos? Takashi? Ignore my crazy patch, try this one instead. Does it
> fix things for you?

Yes! (tested on top of 2.6.38-rc3+).

Thanks to everyone involved!

^ permalink raw reply

* Re: [PATCH] be2net: use device model DMA API
From: David Miller @ 2011-02-04  4:49 UTC (permalink / raw)
  To: ajit.khaparde; +Cc: ivecera, netdev, sathya.perla, subramanian.seetharaman
In-Reply-To: <c6e8a0f7-efdf-498e-9647-b544bc4adf34@exht1.ad.emulex.com>

From: Ajit Khaparde <ajit.khaparde@emulex.com>
Date: Thu, 3 Feb 2011 22:39:25 -0600

> -----Original Message-----
>> From: David Miller [mailto:davem@davemloft.net] 
>> Sent: Wednesday, February 02, 2011 4:57 PM
>> To: ivecera@redhat.com
>> Cc: netdev@vger.kernel.org; Perla, Sathya; Seetharaman, Subramanian; Bandi, Sarveshwar; Khaparde, Ajit
>> Subject: Re: [PATCH] be2net: use device model DMA API
> 
>> From: Ivan Vecera <ivecera@redhat.com>
>> Date: Wed,  2 Feb 2011 19:05:12 +0100
> 
>> > Use DMA API as PCI equivalents will be deprecated.
>> > 
>> > Signed-off-by: Ivan Vecera <ivecera@redhat.com>
> 
>> Looks good to me, can I get some review from the be2net maintainers?
> 
> Looks good. Thanks.
> 
> Acked-by: Ajit Khaparde <ajit.khaparde@emulex.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] include/net/genetlink.h: Allow genlmsg_cancel to accept a NULL argument
From: David Miller @ 2011-02-04  4:43 UTC (permalink / raw)
  To: julia; +Cc: netdev, linux-kernel, paul.moore, kernel-janitors
In-Reply-To: <Pine.LNX.4.64.1102020659520.9302@ask.diku.dk>

From: Julia Lawall <julia@diku.dk>
Date: Wed, 2 Feb 2011 07:17:29 +0100 (CET)

> This pattern occurred in eg:
> 
> net/netlabel/netlabel_unlabeled.c
> 
> in the function netlbl_unlabel_staticlist_gen and in other netlabel code, 
> as well as in net/wireless/nl80211.c, but with the function nl80211hdr_put 
> instead of genlmsg_put.  I submitted patches for all of these cases, so 
> that is perhaps why you don't see them.  But someone suggested to change 
> genlmsg_cancel as well, to be as permissive as nlmsg_cancel.
> 
> For nlmsg_cancel, there are two occurrences in 
> net/netfilter/nf_conntrack_netlink.c where nlmsg_cancel is reachable with 
> the second argument NULL.
> 
> For nlmsg_cancel the ability to accept NULL as a second argument comes 
> from the fact that it only calls nlmsg_trim, which does nothing if NULL is 
> the second argument.  nlmsg_trim is also called by nla_nest_cancel.  There 
> are many calls to nla_nest_cancel with NULL as the second argument in the 
> directory net/sched, for example in the function gred_dump in 
> net/sched/sch_gred.c.  net/sched also contains a call to nlmsg_trim with 
> NULL as the second argument, in the function flow_dump, in 
> net/sched/cls_flow.c.
> 
> The whole thing seems somewhat sloppy.  I'm sure that all of the 
> above-cited occurrences could be rewritten as outlined above to skip over 
> the cancel/trim function.

Thanks for the analysis Julia.

I think the only safe thing to do in net-2.6 and -stable is to add
the NULL check to genlmsg_cancel() as your patch did.

If we later want to move things such that, consistently, we never
call *nlmsg_cancel() with a NULL second arg, that's fine.

I'll apply your genlmsg_cancel() patch, thanks Julia.
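
For readers following the thread, the NULL-tolerant cancel pattern discussed above can be sketched in userspace as follows. This is only an illustration, not the kernel code: 'struct msg_buf', msg_put() and msg_cancel() are hypothetical stand-ins for the netlink buffer and the *_put()/*_cancel() helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a netlink message buffer. */
struct msg_buf {
	unsigned char data[256];
	size_t len;		/* bytes currently in use */
};

/* Reserve room for a header.  Returns NULL when the buffer is full,
 * which is exactly the case the cancel path has to tolerate. */
static void *msg_put(struct msg_buf *m, size_t hdrlen)
{
	void *hdr;

	if (hdrlen > sizeof(m->data) - m->len)
		return NULL;
	hdr = m->data + m->len;
	m->len += hdrlen;
	return hdr;
}

/* Roll the buffer back to 'mark'.  A NULL mark is a no-op, so callers
 * can share one error path whether or not msg_put() succeeded. */
static void msg_cancel(struct msg_buf *m, void *mark)
{
	if (mark == NULL)
		return;
	assert((unsigned char *)mark >= m->data &&
	       (unsigned char *)mark <= m->data + m->len);
	m->len = (size_t)((unsigned char *)mark - m->data);
}
```

With the check inside msg_cancel(), a caller whose msg_put() failed can still jump to a common failure label and call msg_cancel(m, hdr) with hdr == NULL without touching the buffer.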

^ permalink raw reply

* [PATCH 5/5] ipv6: share sysctl net/ipv6/conf/DEVNAME/ tables
From: Lucian Adrian Grijincu @ 2011-02-04  4:37 UTC (permalink / raw)
  To: linux-kernel, netdev, Eric W. Biederman, Eric Dumazet,
	David S. Miller, Oct
  Cc: Lucian Adrian Grijincu
In-Reply-To: <cover.1296793770.git.lucian.grijincu@gmail.com>

Similar to the ipv4 patch:

Before this, for each network device DEVNAME that supports ipv6 a new
sysctl table was registered in $PROC/sys/net/ipv6/conf/DEVNAME/.

The sysctl table was identical for all network devices, except for:
* data: pointer to the data to be accessed in the sysctl
* extra1: the 'struct inet6_dev*' of the network device
* extra2: the 'struct net*' of the network namespace

Assuming we have a device name and a 'struct net*', we can get the
'struct net_device*'. From there we can compute:
* data: each entry corresponds to a position in 'struct ipv6_devconf*'
* extra1: 'struct inet6_dev*' can be reached from 'struct net_device*'
* extra2: the 'struct net*' that we assume we have

The device name is determined from the path to the file (the name of
the parent dentry).

The 'struct net*' is stored in the parent 'struct ctl_table*' path by
register_net_sysctl_table_pathdata().
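
The 'data' computation described above can be sketched in userspace roughly as follows. All names here ('struct devconf', 'struct table_entry', template_conf, entry_data()) are hypothetical, much reduced stand-ins chosen for illustration. Unlike the patch, the sketch takes the offset inside the template object first and then applies it to the target, which avoids subtracting pointers into two different objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much reduced stand-in for 'struct ipv6_devconf'. */
struct devconf {
	int forwarding;
	int hop_limit;
	int mtu6;
};

/* Hypothetical, reduced stand-in for 'struct ctl_table'. */
struct table_entry {
	const char *procname;
	void *data;
};

/* Template instance: the shared table points into this object only
 * to record which field each entry refers to. */
static struct devconf template_conf;

static const struct table_entry shared_table[] = {
	{ "forwarding", &template_conf.forwarding },
	{ "hop_limit",  &template_conf.hop_limit },
	{ "mtu",        &template_conf.mtu6 },
	{ NULL, NULL }	/* sentinel */
};

/* Translate a shared entry into a pointer to the same field inside
 * the per-device 'cnf'. */
static int *entry_data(const struct table_entry *e, struct devconf *cnf)
{
	ptrdiff_t off = (char *)e->data - (char *)&template_conf;

	assert(off >= 0 && (size_t)off < sizeof(template_conf));
	return (int *)((char *)cnf + off);
}
```

Because the table itself is never modified, a single const copy can serve every device; only the small per-call translation differs.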

Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
---
 include/linux/ipv6.h |   15 ++++-
 net/ipv6/addrconf.c  |  192 +++++++++++++++++++++++++++++++++----------------
 2 files changed, 143 insertions(+), 64 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 0c99776..623761d 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -129,6 +129,17 @@ struct ipv6hdr {
 };
 
 #ifdef __KERNEL__
+
+#ifdef CONFIG_SYSCTL
+struct addrconf_sysctl {
+	/* dev_name holds a copy of dev_name, because '.procname' is
+	 * regarded as const by sysctl and we wouldn't want anyone to
+	 * change it under our feet (see SIOCSIFNAME). */
+	char *dev_name;
+	struct ctl_table_header *sysctl_header;
+};
+#endif
+
 /*
  * This structure contains configuration options per IPv6 link.
  */
@@ -172,7 +183,9 @@ struct ipv6_devconf {
 	__s32		disable_ipv6;
 	__s32		accept_dad;
 	__s32		force_tllao;
-	void		*sysctl;
+#ifdef CONFIG_SYSCTL
+	struct addrconf_sysctl addrconf_sysctl;
+#endif
 };
 
 struct ipv6_params {
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index fd6782e..27fd8a1 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -364,7 +364,8 @@ static struct inet6_dev * ipv6_add_dev(struct net_device *dev)
 
 	memcpy(&ndev->cnf, dev_net(dev)->ipv6.devconf_dflt, sizeof(ndev->cnf));
 	ndev->cnf.mtu6 = dev->mtu;
-	ndev->cnf.sysctl = NULL;
+	ndev->cnf.addrconf_sysctl.dev_name = NULL;
+	ndev->cnf.addrconf_sysctl.sysctl_header = NULL;
 	ndev->nd_parms = neigh_parms_alloc(dev, &nd_tbl);
 	if (ndev->nd_parms == NULL) {
 		kfree(ndev);
@@ -4249,90 +4250,176 @@ int addrconf_sysctl_disable(ctl_table *ctl, int write,
 	return ret;
 }
 
-static struct addrconf_sysctl_table
+static int addrconf_handler(ctl_table *ctl, int write,
+			    void __user *buffer,
+			    size_t *lenp, loff_t *ppos,
+			    struct file *filp,
+			    proc_handler *proc_handler)
 {
-	struct ctl_table_header *sysctl_header;
-	ctl_table addrconf_vars[DEVCONF_MAX+1];
-	char *dev_name;
-} addrconf_sysctl __read_mostly = {
-	.sysctl_header = NULL,
-	.addrconf_vars = {
+	/* The path to this file is of the form:
+	 *  $PROC_MOUNT/sys/net/ipv6/conf/$DEVNAME/$CTL
+	 *
+	 * The array of 'struct ctl_table' of devinet entries is
+	 * shared between all ipv6 network devices and the 'data'
+	 * field of each structure only holds the offset into the
+	 * 'data' field of 'struct ipv6_devconf'.
+	 *
+	 * To find the proper location of the data that must be
+	 * accessed by this handler we need the device name and the
+	 * network namespace in which it belongs.
+	 */
+
+	/* We store the network namespace in the parent table's ->extra2 */
+	struct inode *parent_inode = filp->f_path.dentry->d_parent->d_inode;
+	struct ctl_table *parent_table = PROC_I(parent_inode)->sysctl_entry;
+	struct net *net = parent_table->extra2;
+
+	const char *dev_name = filp->f_path.dentry->d_parent->d_name.name;
+	struct ctl_table tmp_ctl;
+	struct net_device *dev = NULL;
+	struct inet6_dev *in6_dev = NULL;
+	struct ipv6_devconf *cnf;
+	int ret;
+
+	if (strcmp(dev_name, "all") == 0) {
+		cnf = net->ipv6.devconf_all;
+	} else if (strcmp(dev_name, "default") == 0) {
+		cnf = net->ipv6.devconf_dflt;
+	} else {
+		/* the device could have been renamed (SIOCSIFNAME) or
+		 * deleted since we started accessing its proc sysctl */
+		dev = dev_get_by_name(net, dev_name);
+		if (dev == NULL)
+			return -ENOENT;
+		in6_dev = in6_dev_get(dev);
+		cnf = &in6_dev->cnf;
+	}
+
+	tmp_ctl = *ctl;
+	tmp_ctl.data += (char *)cnf - (char *)&ipv6_devconf;
+	tmp_ctl.extra1 = in6_dev;
+	tmp_ctl.extra2 = net;
+
+	ret = proc_handler(&tmp_ctl, write, buffer, lenp, ppos);
+
+	if (in6_dev)
+		in6_dev_put(in6_dev);
+	if (dev)
+		dev_put(dev);
+	return ret;
+}
+
+
+static int  addrconf_proc_dointvec(ctl_table *ctl, int write,
+				   void __user *buffer, size_t *lenp,
+				   loff_t *ppos, struct file *filp)
+{
+	return addrconf_handler(ctl, write, buffer, lenp, ppos, filp,
+				proc_dointvec);
+}
+
+static int  addrconf_proc_dointvec_jiffies(ctl_table *ctl, int write,
+					   void __user *buffer, size_t *lenp,
+					   loff_t *ppos, struct file *filp)
+{
+	return addrconf_handler(ctl, write, buffer, lenp, ppos, filp,
+				proc_dointvec_jiffies);
+}
+
+static int addrconf_sysctl_forward__(ctl_table *ctl, int write,
+				     void __user *buffer, size_t *lenp,
+				     loff_t *ppos, struct file *filp)
+{
+	return addrconf_handler(ctl, write, buffer, lenp, ppos, filp,
+				addrconf_sysctl_forward);
+}
+
+
+static int addrconf_sysctl_disable__(ctl_table *ctl, int write,
+				     void __user *buffer, size_t *lenp,
+				     loff_t *ppos, struct file *filp)
+{
+	return addrconf_handler(ctl, write, buffer, lenp, ppos, filp,
+				addrconf_sysctl_disable);
+}
+
+static const struct ctl_table ipv6_addrconf_sysctl_table[DEVCONF_MAX+1] = {
 		{
 			.procname	= "forwarding",
 			.data		= &ipv6_devconf.forwarding,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= addrconf_sysctl_forward,
+			.proc_handler	= addrconf_sysctl_forward__,
 		},
 		{
 			.procname	= "hop_limit",
 			.data		= &ipv6_devconf.hop_limit,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "mtu",
 			.data		= &ipv6_devconf.mtu6,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "accept_ra",
 			.data		= &ipv6_devconf.accept_ra,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "accept_redirects",
 			.data		= &ipv6_devconf.accept_redirects,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "autoconf",
 			.data		= &ipv6_devconf.autoconf,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "dad_transmits",
 			.data		= &ipv6_devconf.dad_transmits,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "router_solicitations",
 			.data		= &ipv6_devconf.rtr_solicits,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "router_solicitation_interval",
 			.data		= &ipv6_devconf.rtr_solicit_interval,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec_jiffies,
+			.proc_handler	= addrconf_proc_dointvec_jiffies,
 		},
 		{
 			.procname	= "router_solicitation_delay",
 			.data		= &ipv6_devconf.rtr_solicit_delay,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec_jiffies,
+			.proc_handler	= addrconf_proc_dointvec_jiffies,
 		},
 		{
 			.procname	= "force_mld_version",
 			.data		= &ipv6_devconf.force_mld_version,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 #ifdef CONFIG_IPV6_PRIVACY
 		{
@@ -4340,35 +4427,35 @@ static struct addrconf_sysctl_table
 			.data		= &ipv6_devconf.use_tempaddr,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "temp_valid_lft",
 			.data		= &ipv6_devconf.temp_valid_lft,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "temp_prefered_lft",
 			.data		= &ipv6_devconf.temp_prefered_lft,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "regen_max_retry",
 			.data		= &ipv6_devconf.regen_max_retry,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "max_desync_factor",
 			.data		= &ipv6_devconf.max_desync_factor,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 #endif
 		{
@@ -4376,21 +4463,21 @@ static struct addrconf_sysctl_table
 			.data		= &ipv6_devconf.max_addresses,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "accept_ra_defrtr",
 			.data		= &ipv6_devconf.accept_ra_defrtr,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "accept_ra_pinfo",
 			.data		= &ipv6_devconf.accept_ra_pinfo,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 #ifdef CONFIG_IPV6_ROUTER_PREF
 		{
@@ -4398,14 +4485,14 @@ static struct addrconf_sysctl_table
 			.data		= &ipv6_devconf.accept_ra_rtr_pref,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "router_probe_interval",
 			.data		= &ipv6_devconf.rtr_probe_interval,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec_jiffies,
+			.proc_handler	= addrconf_proc_dointvec_jiffies,
 		},
 #ifdef CONFIG_IPV6_ROUTE_INFO
 		{
@@ -4413,7 +4500,7 @@ static struct addrconf_sysctl_table
 			.data		= &ipv6_devconf.accept_ra_rt_info_max_plen,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 #endif
 #endif
@@ -4422,14 +4509,14 @@ static struct addrconf_sysctl_table
 			.data		= &ipv6_devconf.proxy_ndp,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname	= "accept_source_route",
 			.data		= &ipv6_devconf.accept_source_route,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 #ifdef CONFIG_IPV6_OPTIMISTIC_DAD
 		{
@@ -4437,7 +4524,7 @@ static struct addrconf_sysctl_table
 			.data           = &ipv6_devconf.optimistic_dad,
 			.maxlen         = sizeof(int),
 			.mode           = 0644,
-			.proc_handler   = proc_dointvec,
+			.proc_handler   = addrconf_proc_dointvec,
 
 		},
 #endif
@@ -4447,7 +4534,7 @@ static struct addrconf_sysctl_table
 			.data		= &ipv6_devconf.mc_forwarding,
 			.maxlen		= sizeof(int),
 			.mode		= 0444,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 #endif
 		{
@@ -4455,33 +4542,31 @@ static struct addrconf_sysctl_table
 			.data		= &ipv6_devconf.disable_ipv6,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= addrconf_sysctl_disable,
+			.proc_handler	= addrconf_sysctl_disable__,
 		},
 		{
 			.procname	= "accept_dad",
 			.data		= &ipv6_devconf.accept_dad,
 			.maxlen		= sizeof(int),
 			.mode		= 0644,
-			.proc_handler	= proc_dointvec,
+			.proc_handler	= addrconf_proc_dointvec,
 		},
 		{
 			.procname       = "force_tllao",
 			.data           = &ipv6_devconf.force_tllao,
 			.maxlen         = sizeof(int),
 			.mode           = 0644,
-			.proc_handler   = proc_dointvec
+			.proc_handler   = addrconf_proc_dointvec
 		},
 		{
 			/* sentinel */
 		}
-	},
 };
 
 static int __addrconf_sysctl_register(struct net *net, char *dev_name,
 		struct inet6_dev *idev, struct ipv6_devconf *p)
 {
-	int i;
-	struct addrconf_sysctl_table *t;
+	struct addrconf_sysctl *t = &p->addrconf_sysctl;
 
 #define ADDRCONF_CTL_PATH_DEV	3
 
@@ -4494,16 +4579,6 @@ static int __addrconf_sysctl_register(struct net *net, char *dev_name,
 	};
 
 
-	t = kmemdup(&addrconf_sysctl, sizeof(*t), GFP_KERNEL);
-	if (t == NULL)
-		goto out;
-
-	for (i = 0; t->addrconf_vars[i].data; i++) {
-		t->addrconf_vars[i].data += (char *)p - (char *)&ipv6_devconf;
-		t->addrconf_vars[i].extra1 = idev; /* embedded; no ref */
-		t->addrconf_vars[i].extra2 = net;
-	}
-
 	/*
 	 * Make a copy of dev_name, because '.procname' is regarded as const
 	 * by sysctl and we wouldn't want anyone to change it under our feet
@@ -4511,38 +4586,29 @@ static int __addrconf_sysctl_register(struct net *net, char *dev_name,
 	 */
 	t->dev_name = kstrdup(dev_name, GFP_KERNEL);
 	if (!t->dev_name)
-		goto free;
+		goto out;
 
 	addrconf_ctl_path[ADDRCONF_CTL_PATH_DEV].procname = t->dev_name;
 
-	t->sysctl_header = register_net_sysctl_table(net, addrconf_ctl_path,
-			t->addrconf_vars);
+	t->sysctl_header = register_net_sysctl_table_pathdata(net,
+			      addrconf_ctl_path, ipv6_addrconf_sysctl_table, net);
 	if (t->sysctl_header == NULL)
 		goto free_procname;
 
-	p->sysctl = t;
 	return 0;
 
 free_procname:
 	kfree(t->dev_name);
-free:
-	kfree(t);
 out:
 	return -ENOBUFS;
 }
 
 static void __addrconf_sysctl_unregister(struct ipv6_devconf *p)
 {
-	struct addrconf_sysctl_table *t;
-
-	if (p->sysctl == NULL)
-		return;
+	struct addrconf_sysctl *t = &p->addrconf_sysctl;
 
-	t = p->sysctl;
-	p->sysctl = NULL;
 	unregister_sysctl_table(t->sysctl_header);
 	kfree(t->dev_name);
-	kfree(t);
 }
 
 static void addrconf_sysctl_register(struct inet6_dev *idev)
-- 
1.7.4.rc1.7.g2cf08.dirty


^ permalink raw reply related

* [PATCH 4/5] ipv4: share sysctl net/ipv4/conf/DEVNAME/ tables
From: Lucian Adrian Grijincu @ 2011-02-04  4:37 UTC (permalink / raw)
  To: linux-kernel, netdev, Eric W. Biederman, Eric Dumazet,
	David S. Miller, Oct
  Cc: Lucian Adrian Grijincu
In-Reply-To: <cover.1296793770.git.lucian.grijincu@gmail.com>

Before this, for each network device DEVNAME that supports ipv4 a new
sysctl table was registered in $PROC/sys/net/ipv4/conf/DEVNAME/.

The sysctl table was identical for all network devices, except for:
* data: pointer to the data to be accessed in the sysctl
* extra1: the 'struct ipv4_devconf*' of the network device
* extra2: the 'struct net*' of the network namespace

Assuming we have a device name and a 'struct net*', we can get the
'struct net_device*'. From there we can compute:
* data: each entry corresponds to a position in 'struct ipv4_devconf*'
* extra1: 'struct ipv4_devconf*' can be reached from 'struct net_device*'
* extra2: the 'struct net*' that we assume we have

The device name is determined from the path to the file (the name of
the parent dentry).

The 'struct net*' is stored in the parent 'struct ctl_table*' path by
register_net_sysctl_table_pathdata().
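
The lookup by parent directory name described above can be modelled in userspace along these lines. This is a sketch under stated assumptions: 'struct netns', 'struct device' and conf_for_dir() are hypothetical, and a plain array scan stands in for the kernel's device lookup and reference counting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced per-device configuration. */
struct devconf {
	int forwarding;
};

struct device {
	const char *name;
	struct devconf cnf;
};

/* Hypothetical stand-in for 'struct net'. */
struct netns {
	struct devconf conf_all;
	struct devconf conf_dflt;
	struct device *devs;
	size_t ndevs;
};

/* Pick the configuration that belongs to the directory name 'dir'.
 * "all" and "default" are per-namespace; anything else is looked up
 * as a device name. */
static struct devconf *conf_for_dir(struct netns *net, const char *dir)
{
	size_t i;

	assert(net != NULL && dir != NULL);
	if (strcmp(dir, "all") == 0)
		return &net->conf_all;
	if (strcmp(dir, "default") == 0)
		return &net->conf_dflt;
	for (i = 0; i < net->ndevs; i++)
		if (strcmp(net->devs[i].name, dir) == 0)
			return &net->devs[i].cnf;
	return NULL;	/* renamed or removed: caller reports -ENOENT */
}
```

The NULL return covers the window in which a device has been renamed or removed while its sysctl file is still open.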

Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
---
 fs/proc/proc_sysctl.c      |   16 +++-
 include/linux/inetdevice.h |   12 +++-
 net/ipv4/devinet.c         |  203 +++++++++++++++++++++++++++++---------------
 3 files changed, 161 insertions(+), 70 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index fb707e0..fe392f1 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -128,6 +128,11 @@ out:
 	return err;
 }
 
+
+typedef int proc_handler_extended(struct ctl_table *ctl, int write,
+				  void __user *buffer, size_t *lenp, loff_t *ppos,
+				  struct file *filp);
+
 static ssize_t proc_sys_call_handler(struct file *filp, void __user *buf,
 		size_t count, loff_t *ppos, int write)
 {
@@ -136,6 +141,7 @@ static ssize_t proc_sys_call_handler(struct file *filp, void __user *buf,
 	struct ctl_table *table = PROC_I(inode)->sysctl_entry;
 	ssize_t error;
 	size_t res;
+	proc_handler_extended *phx = (proc_handler_extended *) table->proc_handler;
 
 	if (IS_ERR(head))
 		return PTR_ERR(head);
@@ -155,7 +161,15 @@ static ssize_t proc_sys_call_handler(struct file *filp, void __user *buf,
 
 	/* careful: calling conventions are nasty here */
 	res = count;
-	error = table->proc_handler(table, write, buf, &res, ppos);
+	/* Most handlers only use the first 5 arguments (without @filp).
+	 * Changing them all is too much work since, at the time of writing,
+	 * only the devinet.c proc_handlers know about and use @filp.
+	 *
+	 * This is just a HACK for now: I did it this way to avoid
+	 * wasting time changing all the handlers. In the final version
+	 * I'll change all the handlers if there's no other solution.
+	 */
+	error = phx(table, write, buf, &res, ppos, filp);
 	if (!error)
 		error = res;
 out:
diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index ae8fdc5..caf06b3 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -43,8 +43,18 @@ enum
 
 #define IPV4_DEVCONF_MAX (__IPV4_DEVCONF_MAX - 1)
 
+
+struct devinet_sysctl {
+	/* dev_name holds a copy of dev_name, because '.procname' is
+	 * regarded as const by sysctl and we wouldn't want anyone to
+	 * change it under our feet (see SIOCSIFNAME). */
+	char *dev_name;
+	struct ctl_table_header *sysctl_header;
+};
+
+
 struct ipv4_devconf {
-	void	*sysctl;
+	struct devinet_sysctl devinet_sysctl;
 	int	data[IPV4_DEVCONF_MAX];
 	DECLARE_BITMAP(state, IPV4_DEVCONF_MAX);
 };
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 748cb5b..774d347 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -147,7 +147,7 @@ void in_dev_finish_destroy(struct in_device *idev)
 }
 EXPORT_SYMBOL(in_dev_finish_destroy);
 
-static struct in_device *inetdev_init(struct net_device *dev)
+struct in_device *inetdev_init(struct net_device *dev)
 {
 	struct in_device *in_dev;
 
@@ -158,7 +158,8 @@ static struct in_device *inetdev_init(struct net_device *dev)
 		goto out;
 	memcpy(&in_dev->cnf, dev_net(dev)->ipv4.devconf_dflt,
 			sizeof(in_dev->cnf));
-	in_dev->cnf.sysctl = NULL;
+	in_dev->cnf.devinet_sysctl.dev_name = NULL;
+	in_dev->cnf.devinet_sysctl.sysctl_header = NULL;
 	in_dev->dev = dev;
 	in_dev->arp_parms = neigh_parms_alloc(dev, &arp_tbl);
 	if (!in_dev->arp_parms)
@@ -1375,6 +1376,67 @@ static void inet_forward_change(struct net *net)
 	}
 }
 
+
+
+static int devinet_conf_handler(ctl_table *ctl, int write,
+				void __user *buffer,
+				size_t *lenp, loff_t *ppos,
+				struct file *filp,
+				proc_handler *proc_handler)
+{
+	/* The path to this file is of the form:
+	 *  $PROC_MOUNT/sys/net/ipv4/conf/$DEVNAME/$CTL
+	 *
+	 * The array of 'struct ctl_table' of devinet entries is
+	 * shared between all ipv4 network devices and the 'data'
+	 * field of each structure only holds the offset into the
+	 * 'data' field of 'struct ipv4_devconf'.
+	 *
+	 * To find the proper location of the data that must be
+	 * accessed by this handler we need the device name and the
+	 * network namespace in which it belongs.
+	 */
+
+	/* We store the network namespace in the parent table's ->extra2 */
+	struct inode *parent_inode = filp->f_path.dentry->d_parent->d_inode;
+	struct ctl_table *parent_table = PROC_I(parent_inode)->sysctl_entry;
+	struct net *net = parent_table->extra2;
+
+	const char *dev_name = filp->f_path.dentry->d_parent->d_name.name;
+	struct ctl_table tmp_ctl;
+	struct net_device *dev = NULL;
+	struct in_device *in_dev = NULL;
+	struct ipv4_devconf *cnf;
+	int ret;
+
+	if (strcmp(dev_name, "all") == 0) {
+		cnf = net->ipv4.devconf_all;
+	} else if (strcmp(dev_name, "default") == 0) {
+		cnf = net->ipv4.devconf_dflt;
+	} else {
+		/* the device could have been renamed (SIOCSIFNAME) or
+		 * deleted since we started accessing its proc sysctl */
+		dev = dev_get_by_name(net, dev_name);
+		if (dev == NULL)
+			return -ENOENT;
+		in_dev = in_dev_get(dev);
+		cnf = &in_dev->cnf;
+	}
+
+	tmp_ctl = *ctl;
+	tmp_ctl.data += (char *)cnf - (char *)&ipv4_devconf;
+	tmp_ctl.extra1 = cnf;
+	tmp_ctl.extra2 = net;
+
+	ret = proc_handler(&tmp_ctl, write, buffer, lenp, ppos);
+
+	if (in_dev)
+		in_dev_put(in_dev);
+	if (dev)
+		dev_put(dev);
+	return ret;
+}
+
 static int devinet_conf_proc(ctl_table *ctl, int write,
 			     void __user *buffer,
 			     size_t *lenp, loff_t *ppos)
@@ -1445,6 +1507,33 @@ static int ipv4_doint_and_flush(ctl_table *ctl, int write,
 	return ret;
 }
 
+static int devinet_conf_proc__(ctl_table *ctl, int write,
+			       void __user *buffer,
+			       size_t *lenp, loff_t *ppos,
+			       struct file *filp)
+{
+	return devinet_conf_handler(ctl, write, buffer, lenp, ppos, filp,
+				    devinet_conf_proc);
+}
+
+static int devinet_sysctl_forward__(ctl_table *ctl, int write,
+				    void __user *buffer,
+				    size_t *lenp, loff_t *ppos,
+				    struct file *filp)
+{
+	return devinet_conf_handler(ctl, write, buffer, lenp, ppos, filp,
+				    devinet_sysctl_forward);
+}
+
+static int ipv4_doint_and_flush__(ctl_table *ctl, int write,
+				  void __user *buffer,
+				  size_t *lenp, loff_t *ppos,
+				  struct file *filp)
+{
+	return devinet_conf_handler(ctl, write, buffer, lenp, ppos, filp,
+				    ipv4_doint_and_flush);
+}
+
 #define DEVINET_SYSCTL_ENTRY(attr, name, mval, proc) \
 	{ \
 		.procname	= name, \
@@ -1452,67 +1541,60 @@ static int ipv4_doint_and_flush(ctl_table *ctl, int write,
 				  IPV4_DEVCONF_ ## attr - 1, \
 		.maxlen		= sizeof(int), \
 		.mode		= mval, \
-		.proc_handler	= proc, \
-		.extra1		= &ipv4_devconf, \
+		.proc_handler	= (proc_handler *) proc, \
 	}
 
 #define DEVINET_SYSCTL_RW_ENTRY(attr, name) \
-	DEVINET_SYSCTL_ENTRY(attr, name, 0644, devinet_conf_proc)
+	DEVINET_SYSCTL_ENTRY(attr, name, 0644, devinet_conf_proc__)
 
 #define DEVINET_SYSCTL_RO_ENTRY(attr, name) \
-	DEVINET_SYSCTL_ENTRY(attr, name, 0444, devinet_conf_proc)
+	DEVINET_SYSCTL_ENTRY(attr, name, 0444, devinet_conf_proc__)
 
 #define DEVINET_SYSCTL_COMPLEX_ENTRY(attr, name, proc) \
 	DEVINET_SYSCTL_ENTRY(attr, name, 0644, proc)
 
 #define DEVINET_SYSCTL_FLUSHING_ENTRY(attr, name) \
-	DEVINET_SYSCTL_COMPLEX_ENTRY(attr, name, ipv4_doint_and_flush)
-
-static struct devinet_sysctl_table {
-	struct ctl_table_header *sysctl_header;
-	struct ctl_table devinet_vars[__IPV4_DEVCONF_MAX];
-	char *dev_name;
-} devinet_sysctl = {
-	.devinet_vars = {
-		DEVINET_SYSCTL_COMPLEX_ENTRY(FORWARDING, "forwarding",
-					     devinet_sysctl_forward),
-		DEVINET_SYSCTL_RO_ENTRY(MC_FORWARDING, "mc_forwarding"),
-
-		DEVINET_SYSCTL_RW_ENTRY(ACCEPT_REDIRECTS, "accept_redirects"),
-		DEVINET_SYSCTL_RW_ENTRY(SECURE_REDIRECTS, "secure_redirects"),
-		DEVINET_SYSCTL_RW_ENTRY(SHARED_MEDIA, "shared_media"),
-		DEVINET_SYSCTL_RW_ENTRY(RP_FILTER, "rp_filter"),
-		DEVINET_SYSCTL_RW_ENTRY(SEND_REDIRECTS, "send_redirects"),
-		DEVINET_SYSCTL_RW_ENTRY(ACCEPT_SOURCE_ROUTE,
-					"accept_source_route"),
-		DEVINET_SYSCTL_RW_ENTRY(ACCEPT_LOCAL, "accept_local"),
-		DEVINET_SYSCTL_RW_ENTRY(SRC_VMARK, "src_valid_mark"),
-		DEVINET_SYSCTL_RW_ENTRY(PROXY_ARP, "proxy_arp"),
-		DEVINET_SYSCTL_RW_ENTRY(MEDIUM_ID, "medium_id"),
-		DEVINET_SYSCTL_RW_ENTRY(BOOTP_RELAY, "bootp_relay"),
-		DEVINET_SYSCTL_RW_ENTRY(LOG_MARTIANS, "log_martians"),
-		DEVINET_SYSCTL_RW_ENTRY(TAG, "tag"),
-		DEVINET_SYSCTL_RW_ENTRY(ARPFILTER, "arp_filter"),
-		DEVINET_SYSCTL_RW_ENTRY(ARP_ANNOUNCE, "arp_announce"),
-		DEVINET_SYSCTL_RW_ENTRY(ARP_IGNORE, "arp_ignore"),
-		DEVINET_SYSCTL_RW_ENTRY(ARP_ACCEPT, "arp_accept"),
-		DEVINET_SYSCTL_RW_ENTRY(ARP_NOTIFY, "arp_notify"),
-		DEVINET_SYSCTL_RW_ENTRY(PROXY_ARP_PVLAN, "proxy_arp_pvlan"),
-
-		DEVINET_SYSCTL_FLUSHING_ENTRY(NOXFRM, "disable_xfrm"),
-		DEVINET_SYSCTL_FLUSHING_ENTRY(NOPOLICY, "disable_policy"),
-		DEVINET_SYSCTL_FLUSHING_ENTRY(FORCE_IGMP_VERSION,
-					      "force_igmp_version"),
-		DEVINET_SYSCTL_FLUSHING_ENTRY(PROMOTE_SECONDARIES,
-					      "promote_secondaries"),
-	},
+	DEVINET_SYSCTL_COMPLEX_ENTRY(attr, name, ipv4_doint_and_flush__)
+
+const struct ctl_table ipv4_devinet_sysctl_table[__IPV4_DEVCONF_MAX] = {
+	DEVINET_SYSCTL_COMPLEX_ENTRY(FORWARDING, "forwarding",
+				     devinet_sysctl_forward__),
+	DEVINET_SYSCTL_RO_ENTRY(MC_FORWARDING, "mc_forwarding"),
+
+	DEVINET_SYSCTL_RW_ENTRY(ACCEPT_REDIRECTS, "accept_redirects"),
+	DEVINET_SYSCTL_RW_ENTRY(SECURE_REDIRECTS, "secure_redirects"),
+	DEVINET_SYSCTL_RW_ENTRY(SHARED_MEDIA, "shared_media"),
+	DEVINET_SYSCTL_RW_ENTRY(RP_FILTER, "rp_filter"),
+	DEVINET_SYSCTL_RW_ENTRY(SEND_REDIRECTS, "send_redirects"),
+	DEVINET_SYSCTL_RW_ENTRY(ACCEPT_SOURCE_ROUTE,
+				"accept_source_route"),
+	DEVINET_SYSCTL_RW_ENTRY(ACCEPT_LOCAL, "accept_local"),
+	DEVINET_SYSCTL_RW_ENTRY(SRC_VMARK, "src_valid_mark"),
+	DEVINET_SYSCTL_RW_ENTRY(PROXY_ARP, "proxy_arp"),
+	DEVINET_SYSCTL_RW_ENTRY(MEDIUM_ID, "medium_id"),
+	DEVINET_SYSCTL_RW_ENTRY(BOOTP_RELAY, "bootp_relay"),
+	DEVINET_SYSCTL_RW_ENTRY(LOG_MARTIANS, "log_martians"),
+	DEVINET_SYSCTL_RW_ENTRY(TAG, "tag"),
+	DEVINET_SYSCTL_RW_ENTRY(ARPFILTER, "arp_filter"),
+	DEVINET_SYSCTL_RW_ENTRY(ARP_ANNOUNCE, "arp_announce"),
+	DEVINET_SYSCTL_RW_ENTRY(ARP_IGNORE, "arp_ignore"),
+	DEVINET_SYSCTL_RW_ENTRY(ARP_ACCEPT, "arp_accept"),
+	DEVINET_SYSCTL_RW_ENTRY(ARP_NOTIFY, "arp_notify"),
+	DEVINET_SYSCTL_RW_ENTRY(PROXY_ARP_PVLAN, "proxy_arp_pvlan"),
+
+	DEVINET_SYSCTL_FLUSHING_ENTRY(NOXFRM, "disable_xfrm"),
+	DEVINET_SYSCTL_FLUSHING_ENTRY(NOPOLICY, "disable_policy"),
+	DEVINET_SYSCTL_FLUSHING_ENTRY(FORCE_IGMP_VERSION,
+				      "force_igmp_version"),
+	DEVINET_SYSCTL_FLUSHING_ENTRY(PROMOTE_SECONDARIES,
+				      "promote_secondaries"),
+	{ }
 };
 
 static int __devinet_sysctl_register(struct net *net, char *dev_name,
-					struct ipv4_devconf *p)
+				     struct ipv4_devconf *cnf)
 {
-	int i;
-	struct devinet_sysctl_table *t;
+	struct devinet_sysctl *t = &cnf->devinet_sysctl;
 
 #define DEVINET_CTL_PATH_DEV	3
 
@@ -1524,16 +1606,6 @@ static int __devinet_sysctl_register(struct net *net, char *dev_name,
 		{ },
 	};
 
-	t = kmemdup(&devinet_sysctl, sizeof(*t), GFP_KERNEL);
-	if (!t)
-		goto out;
-
-	for (i = 0; i < ARRAY_SIZE(t->devinet_vars) - 1; i++) {
-		t->devinet_vars[i].data += (char *)p - (char *)&ipv4_devconf;
-		t->devinet_vars[i].extra1 = p;
-		t->devinet_vars[i].extra2 = net;
-	}
-
 	/*
 	 * Make a copy of dev_name, because '.procname' is regarded as const
 	 * by sysctl and we wouldn't want anyone to change it under our feet
@@ -1541,37 +1613,32 @@ static int __devinet_sysctl_register(struct net *net, char *dev_name,
 	 */
 	t->dev_name = kstrdup(dev_name, GFP_KERNEL);
 	if (!t->dev_name)
-		goto free;
+		goto out;
 
 	devinet_ctl_path[DEVINET_CTL_PATH_DEV].procname = t->dev_name;
 
-	t->sysctl_header = register_net_sysctl_table(net, devinet_ctl_path,
-			t->devinet_vars);
+	t->sysctl_header = register_net_sysctl_table_pathdata(net,
+			 devinet_ctl_path, ipv4_devinet_sysctl_table, net);
 	if (!t->sysctl_header)
 		goto free_procname;
 
-	p->sysctl = t;
 	return 0;
 
 free_procname:
 	kfree(t->dev_name);
-free:
-	kfree(t);
 out:
 	return -ENOBUFS;
 }
 
 static void __devinet_sysctl_unregister(struct ipv4_devconf *cnf)
 {
-	struct devinet_sysctl_table *t = cnf->sysctl;
+	struct devinet_sysctl *t = &cnf->devinet_sysctl;
 
 	if (t == NULL)
 		return;
 
-	cnf->sysctl = NULL;
 	unregister_sysctl_table(t->sysctl_header);
 	kfree(t->dev_name);
-	kfree(t);
 }
 
 static void devinet_sysctl_register(struct in_device *idev)
-- 
1.7.4.rc1.7.g2cf08.dirty

^ permalink raw reply related

* [PATCH 3/5] sysctl: write ctl_table->extra2 to entries created from ctl_path
From: Lucian Adrian Grijincu @ 2011-02-04  4:37 UTC (permalink / raw)
  To: linux-kernel, netdev, Eric W. Biederman, Eric Dumazet,
	David S. Miller, Oct
  Cc: Lucian Adrian Grijincu
In-Reply-To: <cover.1296793770.git.lucian.grijincu@gmail.com>

For each entry in an array of 'struct ctl_path' we were registering a
'struct ctl_table' array with two entries:
- one to store the name + permissions,
- one as an end-of-array marker (completely blank).

We were not using any of the data storage fields
(data, extra1, extra2) in the first 'struct ctl_table'.

This patch adds the possibility of storing a user-provided
pointer in the 'extra2' field.

All users of the following functions store NULL in the 'extra2'
field, as they did before this patch:
* register_sysctl_paths
* register_net_sysctl_table
* register_net_sysctl_rotable

Until now sysctl_check_table considered that the 'struct ctl_table' of
directories may not store anything in the 'extra2' field. We no longer
consider this a fault.
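
For illustration only (this is not part of the patch): a minimal
userspace sketch of what the path walk in __register_sysctl_paths does
once 'pathdata' is passed in. The structures are cut-down stand-ins for
the kernel ones and build_dirs() is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down stand-ins for the kernel structures (assumption: only the
 * fields that matter here are kept). */
struct ctl_table {
	const char *procname;
	unsigned int mode;
	struct ctl_table *child;
	void *extra2;
};

struct ctl_path {
	const char *procname;
};

/*
 * build_dirs() is a hypothetical helper. For every component of 'path'
 * it allocates a two-entry table (directory + blank terminator), stores
 * 'pathdata' in the directory's ->extra2 and finally hangs 'leaf' below
 * the last directory.
 */
static struct ctl_table *build_dirs(const struct ctl_path *path,
				    struct ctl_table *leaf, void *pathdata)
{
	struct ctl_table *head = NULL, **prevp = &head;

	assert(path && leaf);
	for (; path->procname; path++) {
		/* calloc zeroes both entries, so dir[1] is the terminator */
		struct ctl_table *dir = calloc(2, sizeof(*dir));

		if (!dir)
			return NULL;	/* sketch only: earlier levels leak */
		dir->procname = path->procname;
		dir->mode = 0555;
		dir->extra2 = pathdata;

		*prevp = dir;
		prevp = &dir->child;
	}
	*prevp = leaf;
	return head;
}
```

Every directory created from the path carries the same pointer, while
the leaf table passed in by the caller is left untouched.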

Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
---
 include/linux/sysctl.h      |    2 +-
 include/net/net_namespace.h |    2 ++
 kernel/sysctl.c             |    7 +++++--
 kernel/sysctl_check.c       |    2 --
 net/sysctl_net.c            |   20 ++++++++++++++------
 5 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 1f1da4b..090b9a3 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -1057,7 +1057,7 @@ struct ctl_path {
 void register_sysctl_root(struct ctl_table_root *root);
 struct ctl_table_header *__register_sysctl_paths(
 	struct ctl_table_root *root, struct nsproxy *namespaces,
-	const struct ctl_path *path, struct ctl_table *table);
+	const struct ctl_path *path, struct ctl_table *table, void *pathdata);
 struct ctl_table_header *register_sysctl_table(struct ctl_table * table);
 struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
 						struct ctl_table *table);
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 1bf812b..42d4d61 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -272,6 +272,8 @@ struct ctl_table_header;
 
 extern struct ctl_table_header *register_net_sysctl_table(struct net *net,
 	const struct ctl_path *path, struct ctl_table *table);
+struct ctl_table_header *register_net_sysctl_table_pathdata(struct net *net,
+	const struct ctl_path *path, struct ctl_table *table, void *pathdata);
 extern struct ctl_table_header *register_net_sysctl_rotable(
 	const struct ctl_path *path, struct ctl_table *table);
 extern void unregister_net_sysctl_table(struct ctl_table_header *header);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 42025ec..9b67c9e 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1759,6 +1759,8 @@ static void try_attach(struct ctl_table_header *p, struct ctl_table_header *q)
  * @namespaces: Data to compute which lists of sysctl entries are visible
  * @path: The path to the directory the sysctl table is in.
  * @table: the top-level table structure
+ * @pathdata: user provided pointer to data that will be stored in ->extra2
+ *            for every ctl_table node allocated for entries in @path
  *
  * Register a sysctl table hierarchy. @table should be a filled in ctl_table
  * array. A completely 0 filled entry terminates the table.
@@ -1809,7 +1811,7 @@ static void try_attach(struct ctl_table_header *p, struct ctl_table_header *q)
 struct ctl_table_header *__register_sysctl_paths(
 	struct ctl_table_root *root,
 	struct nsproxy *namespaces,
-	const struct ctl_path *path, struct ctl_table *table)
+	const struct ctl_path *path, struct ctl_table *table, void *pathdata)
 {
 	struct ctl_table_header *header;
 	struct ctl_table *new, **prevp;
@@ -1841,6 +1843,7 @@ struct ctl_table_header *__register_sysctl_paths(
 		/* Copy the procname */
 		new->procname = path->procname;
 		new->mode     = 0555;
+		new->extra2   = pathdata;
 
 		*prevp = new;
 		prevp = &new->child;
@@ -1895,7 +1898,7 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
 						struct ctl_table *table)
 {
 	return __register_sysctl_paths(&sysctl_table_root, current->nsproxy,
-					path, table);
+				       path, table, NULL);
 }
 
 /**
diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index b7d9c66..e09f47f 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -112,8 +112,6 @@ static int __sysctl_check_table(struct nsproxy *namespaces,
 				SET_FAIL("Directory with proc_handler");
 			if (table->extra1)
 				SET_FAIL("Directory with extra1");
-			if (table->extra2)
-				SET_FAIL("Directory with extra2");
 		} else {
 			if ((table->proc_handler == proc_dostring) ||
 			    (table->proc_handler == proc_dointvec) ||
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index ca84212..9c92cac 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -103,22 +103,30 @@ out:
 }
 subsys_initcall(sysctl_init);
 
-struct ctl_table_header *register_net_sysctl_table(struct net *net,
-	const struct ctl_path *path, struct ctl_table *table)
+struct ctl_table_header *register_net_sysctl_table_pathdata(struct net *net,
+	const struct ctl_path *path, struct ctl_table *table, void *pathdata)
 {
 	struct nsproxy namespaces;
 	namespaces = *current->nsproxy;
 	namespaces.net_ns = net;
-	return __register_sysctl_paths(&net_sysctl_root,
-					&namespaces, path, table);
+	return __register_sysctl_paths(&net_sysctl_root, &namespaces,
+				       path, table, pathdata);
+}
+EXPORT_SYMBOL_GPL(register_net_sysctl_table_pathdata);
+
+struct ctl_table_header *register_net_sysctl_table(struct net *net,
+	const struct ctl_path *path, struct ctl_table *table)
+{
+	return register_net_sysctl_table_pathdata(net, path, table, NULL);
 }
 EXPORT_SYMBOL_GPL(register_net_sysctl_table);
 
+
 struct ctl_table_header *register_net_sysctl_rotable(const
 		struct ctl_path *path, struct ctl_table *table)
 {
-	return __register_sysctl_paths(&net_sysctl_ro_root,
-			&init_nsproxy, path, table);
+	return __register_sysctl_paths(&net_sysctl_ro_root, &init_nsproxy,
+				       path, table, NULL);
 }
 EXPORT_SYMBOL_GPL(register_net_sysctl_rotable);
 
-- 
1.7.4.rc1.7.g2cf08.dirty

^ permalink raw reply related

* [PATCH 2/5] sysctl: remove useless ctl_table->parent field
From: Lucian Adrian Grijincu @ 2011-02-04  4:37 UTC (permalink / raw)
  To: linux-kernel, netdev, Eric W. Biederman, Eric Dumazet,
	David S. Miller, Oct
  Cc: Lucian Adrian Grijincu
In-Reply-To: <cover.1296793770.git.lucian.grijincu@gmail.com>

The 'parent' field was added for selinux in:
    commit d912b0cc1a617d7c590d57b7ea971d50c7f02503
    [PATCH] sysctl: add a parent entry to ctl_table and set the parent entry

and then was used for sysctl_check_table.

Both of the users have found other implementations.

CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
---
 include/linux/sysctl.h |    1 -
 kernel/sysctl.c        |   11 -----------
 kernel/sysctl_check.c  |    4 ++--
 3 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 7bb5cb6..1f1da4b 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -1018,7 +1018,6 @@ struct ctl_table
 	int maxlen;
 	mode_t mode;
 	struct ctl_table *child;
-	struct ctl_table *parent;	/* Automatically set */
 	proc_handler *proc_handler;	/* Callback for text formatting */
 	void *extra1;
 	void *extra2;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 56f6fc1..42025ec 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1695,18 +1695,8 @@ int sysctl_perm(struct ctl_table_root *root, struct ctl_table *table, int op)
 	return test_perm(mode, op);
 }
 
-static void sysctl_set_parent(struct ctl_table *parent, struct ctl_table *table)
-{
-	for (; table->procname; table++) {
-		table->parent = parent;
-		if (table->child)
-			sysctl_set_parent(table, table->child);
-	}
-}
-
 static __init int sysctl_init(void)
 {
-	sysctl_set_parent(NULL, root_table);
 #ifdef CONFIG_SYSCTL_SYSCALL_CHECK
 	sysctl_check_table(current->nsproxy, root_table);
 #endif
@@ -1864,7 +1854,6 @@ struct ctl_table_header *__register_sysctl_paths(
 	header->used = 0;
 	header->unregistering = NULL;
 	header->root = root;
-	sysctl_set_parent(NULL, header->ctl_table);
 	header->count = 1;
 #ifdef CONFIG_SYSCTL_SYSCALL_CHECK
 	if (sysctl_check_table(namespaces, header->ctl_table)) {
diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index 9b4fecd..b7d9c66 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -95,8 +95,8 @@ static int __sysctl_check_table(struct nsproxy *namespaces,
 	for (; table->procname; table++) {
 		const char *fail = NULL;
 
-		if (table->parent) {
-			if (table->procname && !table->parent->procname)
+		if (depth != 0) { /* has parent */
+			if (table->procname && !parents[depth - 1]->procname)
 				SET_FAIL("Parent without procname");
 		}
 		if (!table->procname)
-- 
1.7.4.rc1.7.g2cf08.dirty

^ permalink raw reply related

* [PATCH 1/5] sysctl: faster reimplementation of sysctl_check_table
From: Lucian Adrian Grijincu @ 2011-02-04  4:37 UTC (permalink / raw)
  To: linux-kernel, netdev, Eric W. Biederman, Eric Dumazet,
	David S. Miller, Oct
  Cc: Lucian Adrian Grijincu
In-Reply-To: <cover.1296793770.git.lucian.grijincu@gmail.com>

Determining the parent of a node at depth d
- previous implementation: O(d)
- current  implementation: O(1)

Printing the path to a node at depth d
- previous implementation: O(d^2)
- current  implementation: O(d)

This comes at a cost: we use an array ('parents') holding as many
pointers as there can be sysctl levels (currently CTL_MAXNAME=10).

The 'parents' array of pointers holds the same values as the
ctl_table->parent field because the function that updates ->parent
(sysctl_set_parent) is called with either NULL (for root nodes) or
with sysctl_set_parent(table, table->child).
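
To make the bookkeeping concrete, here is a small userspace sketch of
the same idea (not the kernel code): the walk records the current
ancestor chain in 'parents', so the parent of a node at depth d is
parents[d-1] and a path is printed in one pass. 'struct node',
format_path() and walk() are hypothetical names; MAX_DEPTH stands in
for CTL_MAXNAME.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 10	/* mirrors CTL_MAXNAME */

struct node {
	const char *procname;
	const struct node *child;	/* array ended by a NULL procname */
};

/* Write "/a/b/c" for 'table' at 'depth' into buf: one pass, O(depth). */
static void format_path(char *buf, size_t len, const struct node *table,
			const struct node **parents, int depth)
{
	size_t off = 0;
	int i;

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < depth && off < len; i++)
		off += snprintf(buf + off, len - off, "/%s",
				parents[i]->procname);
	if (off < len)
		snprintf(buf + off, len - off, "/%s", table->procname);
}

/* Depth-first walk: prints the path of every leaf, returns leaf count. */
static int walk(const struct node *table, const struct node **parents,
		int depth)
{
	char buf[256];
	int count = 0;

	assert(depth < MAX_DEPTH);
	for (; table->procname; table++) {
		if (table->child) {
			/* O(1) bookkeeping: remember the parent of the level below */
			parents[depth] = table;
			count += walk(table->child, parents, depth + 1);
		} else {
			format_path(buf, sizeof(buf), table, parents, depth);
			puts(buf);
			count++;
		}
	}
	return count;
}
```

The array only needs to be as long as the deepest possible nesting,
which is why a fixed-size array of pointers is enough.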

Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
---
 kernel/sysctl_check.c |  121 ++++++++++++++++++++++++-------------------------
 1 files changed, 60 insertions(+), 61 deletions(-)

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index 10b90d8..9b4fecd 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -6,58 +6,34 @@
 #include <net/ip_vs.h>
 
 
-static int sysctl_depth(struct ctl_table *table)
-{
-	struct ctl_table *tmp;
-	int depth;
-
-	depth = 0;
-	for (tmp = table; tmp->parent; tmp = tmp->parent)
-		depth++;
-
-	return depth;
-}
-
-static struct ctl_table *sysctl_parent(struct ctl_table *table, int n)
+static void sysctl_print_path(struct ctl_table *table,
+			      struct ctl_table **parents, int depth)
 {
+	struct ctl_table *p;
 	int i;
-
-	for (i = 0; table && i < n; i++)
-		table = table->parent;
-
-	return table;
-}
-
-
-static void sysctl_print_path(struct ctl_table *table)
-{
-	struct ctl_table *tmp;
-	int depth, i;
-	depth = sysctl_depth(table);
 	if (table->procname) {
-		for (i = depth; i >= 0; i--) {
-			tmp = sysctl_parent(table, i);
-			printk("/%s", tmp->procname?tmp->procname:"");
+		for (i = 0; i < depth; i++) {
+			p = parents[i];
+			printk("/%s", p->procname ? p->procname : "");
 		}
+		printk("/%s", table->procname);
 	}
 	printk(" ");
 }
 
 static struct ctl_table *sysctl_check_lookup(struct nsproxy *namespaces,
-						struct ctl_table *table)
+	     struct ctl_table *table, struct ctl_table **parents, int depth)
 {
 	struct ctl_table_header *head;
 	struct ctl_table *ref, *test;
-	int depth, cur_depth;
-
-	depth = sysctl_depth(table);
+	int cur_depth;
 
 	for (head = __sysctl_head_next(namespaces, NULL); head;
 	     head = __sysctl_head_next(namespaces, head)) {
 		cur_depth = depth;
 		ref = head->ctl_table;
 repeat:
-		test = sysctl_parent(table, cur_depth);
+		test = parents[depth - cur_depth];
 		for (; ref->procname; ref++) {
 			int match = 0;
 			if (cur_depth && !ref->child)
@@ -83,11 +59,12 @@ out:
 	return ref;
 }
 
-static void set_fail(const char **fail, struct ctl_table *table, const char *str)
+static void set_fail(const char **fail, struct ctl_table *table,
+	     const char *str, struct ctl_table **parents, int depth)
 {
 	if (*fail) {
 		printk(KERN_ERR "sysctl table check failed: ");
-		sysctl_print_path(table);
+		sysctl_print_path(table, parents, depth);
 		printk(" %s\n", *fail);
 		dump_stack();
 	}
@@ -95,16 +72,24 @@ static void set_fail(const char **fail, struct ctl_table *table, const char *str
 }
 
 static void sysctl_check_leaf(struct nsproxy *namespaces,
-				struct ctl_table *table, const char **fail)
+			      struct ctl_table *table, const char **fail,
+			      struct ctl_table **parents, int depth)
 {
 	struct ctl_table *ref;
 
-	ref = sysctl_check_lookup(namespaces, table);
-	if (ref && (ref != table))
-		set_fail(fail, table, "Sysctl already exists");
+	ref = sysctl_check_lookup(namespaces, table, parents, depth);
+	if (ref && (ref != table)) {
+		printk(KERN_ALERT "sysctl_check_leaf ref[%s], table[%s]\n", ref->procname, table->procname);
+		set_fail(fail, table, "Sysctl already exists", parents, depth);
+	}
 }
 
-int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
+
+
+#define SET_FAIL(str) set_fail(&fail, table, str, parents, depth)
+
+static int __sysctl_check_table(struct nsproxy *namespaces,
+	struct ctl_table *table, struct ctl_table **parents, int depth)
 {
 	int error = 0;
 	for (; table->procname; table++) {
@@ -112,23 +97,23 @@ int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
 
 		if (table->parent) {
 			if (table->procname && !table->parent->procname)
-				set_fail(&fail, table, "Parent without procname");
+				SET_FAIL("Parent without procname");
 		}
 		if (!table->procname)
-			set_fail(&fail, table, "No procname");
+			SET_FAIL("No procname");
 		if (table->child) {
 			if (table->data)
-				set_fail(&fail, table, "Directory with data?");
+				SET_FAIL("Directory with data?");
 			if (table->maxlen)
-				set_fail(&fail, table, "Directory with maxlen?");
+				SET_FAIL("Directory with maxlen?");
 			if ((table->mode & (S_IRUGO|S_IXUGO)) != table->mode)
-				set_fail(&fail, table, "Writable sysctl directory");
+				SET_FAIL("Writable sysctl directory");
 			if (table->proc_handler)
-				set_fail(&fail, table, "Directory with proc_handler");
+				SET_FAIL("Directory with proc_handler");
 			if (table->extra1)
-				set_fail(&fail, table, "Directory with extra1");
+				SET_FAIL("Directory with extra1");
 			if (table->extra2)
-				set_fail(&fail, table, "Directory with extra2");
+				SET_FAIL("Directory with extra2");
 		} else {
 			if ((table->proc_handler == proc_dostring) ||
 			    (table->proc_handler == proc_dointvec) ||
@@ -139,28 +124,42 @@ int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
 			    (table->proc_handler == proc_doulongvec_minmax) ||
 			    (table->proc_handler == proc_doulongvec_ms_jiffies_minmax)) {
 				if (!table->data)
-					set_fail(&fail, table, "No data");
+					SET_FAIL("No data");
 				if (!table->maxlen)
-					set_fail(&fail, table, "No maxlen");
+					SET_FAIL("No maxlen");
 			}
 #ifdef CONFIG_PROC_SYSCTL
 			if (table->procname && !table->proc_handler)
-				set_fail(&fail, table, "No proc_handler");
-#endif
-#if 0
-			if (!table->procname && table->proc_handler)
-				set_fail(&fail, table, "proc_handler without procname");
+				SET_FAIL("No proc_handler");
 #endif
-			sysctl_check_leaf(namespaces, table, &fail);
+			parents[depth] = table;
+			sysctl_check_leaf(namespaces, table, &fail,
+					  parents, depth);
 		}
 		if (table->mode > 0777)
-			set_fail(&fail, table, "bogus .mode");
+			SET_FAIL("bogus .mode");
 		if (fail) {
-			set_fail(&fail, table, NULL);
+			SET_FAIL(NULL);
 			error = -EINVAL;
 		}
-		if (table->child)
-			error |= sysctl_check_table(namespaces, table->child);
+		if (table->child) {
+			parents[depth] = table;
+			error |= __sysctl_check_table(namespaces, table->child,
+						      parents, depth + 1);
+		}
 	}
 	return error;
 }
+
+
+int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
+{
+	struct ctl_table *parents[CTL_MAXNAME];
+	/* Keep track of parents as we go down into the tree.
+	 *
+	 * parents[i-1] will be the parent for parents[i].
+	 * The node at depth 'd' will have the parent at parents[d-1].
+	 * The root node (depth=0) has no parent in this array.
+	 */
+	return __sysctl_check_table(namespaces, table, parents, 0);
+}
-- 
1.7.4.rc1.7.g2cf08.dirty

^ permalink raw reply related

* [PATCH 0/5] net: sysctl: share ipv4/ipv6 sysctl tables
From: Lucian Adrian Grijincu @ 2011-02-04  4:37 UTC (permalink / raw)
  To: linux-kernel, netdev, Eric W. Biederman, Eric Dumazet,
	David S. Miller, Oct
  Cc: Lucian Adrian Grijincu

Each network device gets the same 25/24 sysctl entries for ipv4/ipv6
in /proc/sys/net/ipv4/conf/DEVNAME and /proc/sys/net/ipv6/conf/DEVNAME

Unfortunately, space is wasted holding very similar data.
Fortunately, with some tricks these entries can be shared between all
network devices.
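
As a rough userspace illustration of why sharing is possible (this is
an assumed technique, not necessarily the mechanism used in patches 4
and 5): the per-device copies differ only in where their data lives, so
a single table can store field offsets and the per-device base pointer
can be applied at access time. All names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct devconf {		/* hypothetical per-device settings */
	int forwarding;
	int rp_filter;
	int log_martians;
};

struct shared_entry {		/* one copy, used by every device */
	const char *procname;
	size_t offset;		/* position of the value in struct devconf */
};

static const struct shared_entry shared_table[] = {
	{ "forwarding",   offsetof(struct devconf, forwarding) },
	{ "rp_filter",    offsetof(struct devconf, rp_filter) },
	{ "log_martians", offsetof(struct devconf, log_martians) },
	{ NULL, 0 }
};

/* Resolve a shared entry against one device's settings at access time. */
static int *entry_data(const struct shared_entry *e, struct devconf *cnf)
{
	assert(e->procname);
	return (int *)((char *)cnf + e->offset);
}
```

With this layout the table is allocated once, and each device only
needs its own 'struct devconf' plus a way to find it from the sysctl
directory.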

The single entry in 'struct ctl_table' that was modified at runtime
for leaf ctl_table nodes and prevented sharing was 'parent'. This
field was first introduced for selinux and then was used to implement
sysctl_check_table. Selinux recently removed the need for this field:
* http://thread.gmane.org/gmane.linux.kernel.lsm/12623
* LKML-Reference: 1296519474-15714-1-git-send-email-lucian.grijincu@gmail.com

Remove the need for 'parent' in sysctl_check_table and remove the
'parent' field:

  [PATCH 1/5] sysctl: faster reimplementation of sysctl_check_table
  [PATCH 2/5] sysctl: remove useless ctl_table->parent field

Pave the way for sharing of ipv4/6 tables: allow data to be stored in
the nodes above the leaves that will be shared:

  [PATCH 3/5] sysctl: write ctl_table->extra2 to entries created from ctl_path

Finally share the leaf sysctl tables for ipv4/ipv6:

  [PATCH 4/5] ipv4: share sysctl net/ipv4/conf/DEVNAME/ tables
  [PATCH 5/5] ipv6: share sysctl net/ipv6/conf/DEVNAME/ tables

 fs/proc/proc_sysctl.c       |   16 +++-
 include/linux/inetdevice.h  |   12 +++-
 include/linux/ipv6.h        |   15 +++-
 include/linux/sysctl.h      |    3 +-
 include/net/net_namespace.h |    2 +
 kernel/sysctl.c             |   18 +---
 kernel/sysctl_check.c       |  125 +++++++++++++--------------
 net/ipv4/devinet.c          |  203 ++++++++++++++++++++++++++++--------------
 net/ipv6/addrconf.c         |  192 +++++++++++++++++++++++++++-------------
 net/sysctl_net.c            |   20 +++--
 10 files changed, 387 insertions(+), 219 deletions(-)

-- 
1.7.4.rc1.7.g2cf08.dirty

^ permalink raw reply

* Submitting new device driver
From: Nik Trevallyn-Jones @ 2011-02-04  3:05 UTC (permalink / raw)
  To: netdev

Dear network kernel maintainers,

I maintain a device driver for a family of wireless broadband devices 
for the iBurst (TM) system.

I am being asked by the community when (or possibly if) the driver will 
be incorporated into the mainline kernel so users no longer need to 
download and build the driver themselves.

To that end, I've been reading the various documents on requirements for 
submitting a driver, and am now trying to contact the appropriate 
maintainers.

The drivers include support for PCMCIA and USB hardware, and are 
currently available from here:

http://sourceforge.net/projects/ibdriver/

I have recently made changes to support the 2.6.36 kernel, and will 
shortly review all the code to ensure compliance with the various 
guidelines regarding mainline drivers.

Could you please either let me know who I should direct my 
request/submission to, or point me to the appropriate document that 
tells me?

Many thanks for any and all responses,

Cheers!
Nik

^ permalink raw reply

* Re: [PATCH] net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT
From: Eric W. Biederman @ 2011-02-04  2:07 UTC (permalink / raw)
  To: David Miller; +Cc: arnd, netdev, eric.dumazet, kaber
In-Reply-To: <20110203.171918.28815936.davem@davemloft.net>

David Miller <davem@davemloft.net> writes:

> From: Arnd Bergmann <arnd@arndb.de>
> Date: Sun, 30 Jan 2011 19:59:53 +0100
>
>> * ip multicast actually needs support for SIOCGETVIFCNT in
>>   addition to SIOCGETSGCNT to be complete.
>> * ipv6 multicast needs the same patch as ipv4 multicast for
>>   SIOCGETMIFCNT_IN6/SIOCGETSGCNT_IN6.
>> 
>> It would probably be a good idea if someone could complete the
>> work on ipv4/v6 multicast compat_ioctl, on top of your patch.
>
> Actually, on top of this, Eric's patch is buggy.

Ouch.  Thanks for catching that.

> He defines the "struct compat_sioc_sg_req" but doesn't actually
> use it.
>
> I'll fix that, then take care of the missing cases.  Thanks Arnd.

Eric

^ permalink raw reply

* Re: [PATCH] tcp: Increase the initial congestion window to 10.
From: H.K. Jerry Chu @ 2011-02-04  2:01 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: David Miller, Netdev, therbert, hkchu
In-Reply-To: <alpine.DEB.2.00.1102040027060.25125@melkinpaasi.cs.helsinki.fi>

Hi Ilpo,

On Thu, Feb 3, 2011 at 2:43 PM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
> It would perhaps be useful to change receiver advertized window to include
> some extra segs initially. It should be >= IW + peer's dupThresh-1 as
> otherwise limited transmit won't work for the initial window because we
> won't open more receiver window with dupacks (IIRC, I suppose Jerry might
> be able to correct me right away if I'm wrong and we open window with
> dupacks too?).

Sorry I don't know how the receive window is updated in Linux,
autotuning or not. But I just wonder why would it have to do with
dupacks, i.e., why would it not slide forward as long as the left edge
of the window slides forward, regardless of OOO pkt arrival?

I am of the opinion that rwnd is for flow control purpose only thus
should be fully decoupled from the cwnd of the other (sender) side.
Therefore initrwnd should normally be based on projected BDP and local
memory pressure, e.g., 64KB, not bearing any relation with IW of the
other side. Only under special circumstances should it be used to
constrain the sender, e.g., for devices behind slow links with very
small buffer.

Jerry

>I think initial receiver window code used to have some
> surplus but it was broken by the rfc3390-func conversion (against my
> advice on how to do the conversion).
>
> --
>  i.
>

^ permalink raw reply

* Re: [PATCH] net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6.
From: David Miller @ 2011-02-04  2:00 UTC (permalink / raw)
  To: netdev; +Cc: arnd, ebiederm
In-Reply-To: <20110203.175448.245404709.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Thu, 03 Feb 2011 17:54:48 -0800 (PST)

> 
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
>  include/linux/mroute6.h |    1 +
>  net/ipv6/ip6mr.c        |   74 +++++++++++++++++++++++++++++++++++++++++++++++
>  net/ipv6/raw.c          |   18 +++++++++++
>  3 files changed, 93 insertions(+), 0 deletions(-)

Build testing on x86_64 showed that net/ipv6/raw.c needs a linux/compat.h
include.  I've made that change in my local repo.

^ permalink raw reply

* [PATCH] net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6.
From: David Miller @ 2011-02-04  1:54 UTC (permalink / raw)
  To: netdev; +Cc: arnd, ebiederm


Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/linux/mroute6.h |    1 +
 net/ipv6/ip6mr.c        |   74 +++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/raw.c          |   18 +++++++++++
 3 files changed, 93 insertions(+), 0 deletions(-)

diff --git a/include/linux/mroute6.h b/include/linux/mroute6.h
index 6091ab7..9d2deb2 100644
--- a/include/linux/mroute6.h
+++ b/include/linux/mroute6.h
@@ -136,6 +136,7 @@ extern int ip6_mroute_setsockopt(struct sock *, int, char __user *, unsigned int
 extern int ip6_mroute_getsockopt(struct sock *, int, char __user *, int __user *);
 extern int ip6_mr_input(struct sk_buff *skb);
 extern int ip6mr_ioctl(struct sock *sk, int cmd, void __user *arg);
+extern int ip6mr_compat_ioctl(struct sock *sk, unsigned int cmd, void __user *arg);
 extern int ip6_mr_init(void);
 extern void ip6_mr_cleanup(void);
 #else
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 9fab274..5c07092 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -1804,6 +1804,80 @@ int ip6mr_ioctl(struct sock *sk, int cmd, void __user *arg)
 	}
 }
 
+#ifdef CONFIG_COMPAT
+struct compat_sioc_sg_req6 {
+	struct sockaddr_in6 src;
+	struct sockaddr_in6 grp;
+	compat_ulong_t pktcnt;
+	compat_ulong_t bytecnt;
+	compat_ulong_t wrong_if;
+};
+
+struct compat_sioc_mif_req6 {
+	mifi_t	mifi;
+	compat_ulong_t icount;
+	compat_ulong_t ocount;
+	compat_ulong_t ibytes;
+	compat_ulong_t obytes;
+};
+
+int ip6mr_compat_ioctl(struct sock *sk, unsigned int cmd, void __user *arg)
+{
+	struct compat_sioc_sg_req6 sr;
+	struct compat_sioc_mif_req6 vr;
+	struct mif_device *vif;
+	struct mfc6_cache *c;
+	struct net *net = sock_net(sk);
+	struct mr6_table *mrt;
+
+	mrt = ip6mr_get_table(net, raw6_sk(sk)->ip6mr_table ? : RT6_TABLE_DFLT);
+	if (mrt == NULL)
+		return -ENOENT;
+
+	switch (cmd) {
+	case SIOCGETMIFCNT_IN6:
+		if (copy_from_user(&vr, arg, sizeof(vr)))
+			return -EFAULT;
+		if (vr.mifi >= mrt->maxvif)
+			return -EINVAL;
+		read_lock(&mrt_lock);
+		vif = &mrt->vif6_table[vr.mifi];
+		if (MIF_EXISTS(mrt, vr.mifi)) {
+			vr.icount = vif->pkt_in;
+			vr.ocount = vif->pkt_out;
+			vr.ibytes = vif->bytes_in;
+			vr.obytes = vif->bytes_out;
+			read_unlock(&mrt_lock);
+
+			if (copy_to_user(arg, &vr, sizeof(vr)))
+				return -EFAULT;
+			return 0;
+		}
+		read_unlock(&mrt_lock);
+		return -EADDRNOTAVAIL;
+	case SIOCGETSGCNT_IN6:
+		if (copy_from_user(&sr, arg, sizeof(sr)))
+			return -EFAULT;
+
+		read_lock(&mrt_lock);
+		c = ip6mr_cache_find(mrt, &sr.src.sin6_addr, &sr.grp.sin6_addr);
+		if (c) {
+			sr.pktcnt = c->mfc_un.res.pkt;
+			sr.bytecnt = c->mfc_un.res.bytes;
+			sr.wrong_if = c->mfc_un.res.wrong_if;
+			read_unlock(&mrt_lock);
+
+			if (copy_to_user(arg, &sr, sizeof(sr)))
+				return -EFAULT;
+			return 0;
+		}
+		read_unlock(&mrt_lock);
+		return -EADDRNOTAVAIL;
+	default:
+		return -ENOIOCTLCMD;
+	}
+}
+#endif
 
 static inline int ip6mr_forward2_finish(struct sk_buff *skb)
 {
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 86c3952..e728804 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1157,6 +1157,23 @@ static int rawv6_ioctl(struct sock *sk, int cmd, unsigned long arg)
 	}
 }
 
+#ifdef CONFIG_COMPAT
+static int compat_rawv6_ioctl(struct sock *sk, unsigned int cmd, unsigned long arg)
+{
+	switch (cmd) {
+	case SIOCOUTQ:
+	case SIOCINQ:
+		return -ENOIOCTLCMD;
+	default:
+#ifdef CONFIG_IPV6_MROUTE
+		return ip6mr_compat_ioctl(sk, cmd, compat_ptr(arg));
+#else
+		return -ENOIOCTLCMD;
+#endif
+	}
+}
+#endif
+
 static void rawv6_close(struct sock *sk, long timeout)
 {
 	if (inet_sk(sk)->inet_num == IPPROTO_RAW)
@@ -1215,6 +1232,7 @@ struct proto rawv6_prot = {
 #ifdef CONFIG_COMPAT
 	.compat_setsockopt = compat_rawv6_setsockopt,
 	.compat_getsockopt = compat_rawv6_getsockopt,
+	.compat_ioctl	   = compat_rawv6_ioctl,
 #endif
 };
 
-- 
1.7.4


^ permalink raw reply related

* [PATCH] net: Support compat SIOCGETVIFCNT ioctl in ipv4.
From: David Miller @ 2011-02-04  1:54 UTC (permalink / raw)
  To: netdev; +Cc: arnd, ebiederm


Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv4/ipmr.c |   30 ++++++++++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index fce10a0..8b65a12 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1444,9 +1444,19 @@ struct compat_sioc_sg_req {
 	compat_ulong_t wrong_if;
 };
 
+struct compat_sioc_vif_req {
+	vifi_t	vifi;		/* Which iface */
+	compat_ulong_t icount;
+	compat_ulong_t ocount;
+	compat_ulong_t ibytes;
+	compat_ulong_t obytes;
+};
+
 int ipmr_compat_ioctl(struct sock *sk, unsigned int cmd, void __user *arg)
 {
 	struct compat_sioc_sg_req sr;
+	struct compat_sioc_vif_req vr;
+	struct vif_device *vif;
 	struct mfc_cache *c;
 	struct net *net = sock_net(sk);
 	struct mr_table *mrt;
@@ -1456,6 +1466,26 @@ int ipmr_compat_ioctl(struct sock *sk, unsigned int cmd, void __user *arg)
 		return -ENOENT;
 
 	switch (cmd) {
+	case SIOCGETVIFCNT:
+		if (copy_from_user(&vr, arg, sizeof(vr)))
+			return -EFAULT;
+		if (vr.vifi >= mrt->maxvif)
+			return -EINVAL;
+		read_lock(&mrt_lock);
+		vif = &mrt->vif_table[vr.vifi];
+		if (VIF_EXISTS(mrt, vr.vifi)) {
+			vr.icount = vif->pkt_in;
+			vr.ocount = vif->pkt_out;
+			vr.ibytes = vif->bytes_in;
+			vr.obytes = vif->bytes_out;
+			read_unlock(&mrt_lock);
+
+			if (copy_to_user(arg, &vr, sizeof(vr)))
+				return -EFAULT;
+			return 0;
+		}
+		read_unlock(&mrt_lock);
+		return -EADDRNOTAVAIL;
 	case SIOCGETSGCNT:
 		if (copy_from_user(&sr, arg, sizeof(sr)))
 			return -EFAULT;
-- 
1.7.4


^ permalink raw reply related

* [PATCH] net: Fix bug in compat SIOCGETSGCNT handling.
From: David Miller @ 2011-02-04  1:54 UTC (permalink / raw)
  To: netdev; +Cc: arnd, ebiederm

Commit 709b46e8d90badda1898caea50483c12af178e96 ("net: Add compat
ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT") added the
correct plumbing to handle SIOCGETSGCNT properly.

However, whilst it defined a proper "struct compat_sioc_sg_req", the
struct isn't actually used in ipmr_compat_ioctl().

Correct this oversight.
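
For readers wondering why the struct type matters, here is a userspace
sketch (not kernel code) of the two layouts. It assumes the request
holds two IPv4 addresses followed by three 'unsigned long' counters,
and models them with fixed-width types; the struct and function names
are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout seen by a 64-bit kernel: unsigned long is 8 bytes. */
struct sg_req_native64 {
	uint32_t src;		/* stands in for struct in_addr */
	uint32_t grp;
	uint64_t pktcnt;
	uint64_t bytecnt;
	uint64_t wrong_if;
};

/* Layout used by a 32-bit process: unsigned long is 4 bytes. */
struct sg_req_compat32 {
	uint32_t src;
	uint32_t grp;
	uint32_t pktcnt;
	uint32_t bytecnt;
	uint32_t wrong_if;
};

/* Bytes a native-sized copy would touch beyond the 32-bit buffer. */
static size_t overrun_bytes(void)
{
	assert(sizeof(struct sg_req_native64) >
	       sizeof(struct sg_req_compat32));
	return sizeof(struct sg_req_native64) -
	       sizeof(struct sg_req_compat32);
}
```

Copying with sizeof() of the native struct would read and write past
the buffer supplied by a 32-bit caller and put the counters at the
wrong offsets, which is what using the compat struct avoids.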

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv4/ipmr.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 7e41ac0..fce10a0 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1446,7 +1446,7 @@ struct compat_sioc_sg_req {

 int ipmr_compat_ioctl(struct sock *sk, unsigned int cmd, void __user *arg)
 {
-	struct sioc_sg_req sr;
+	struct compat_sioc_sg_req sr;
 	struct mfc_cache *c;
 	struct net *net = sock_net(sk);
 	struct mr_table *mrt;
-- 
1.7.4

^ permalink raw reply related

* Re: 2.6.38-rc3-git1: Reported regressions 2.6.36 -> 2.6.37
From: Dave Airlie @ 2011-02-04  1:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Keith Packard, Carlos Mafra, Dave Airlie, Rafael J. Wysocki,
	Takashi Iwai, Linux Kernel Mailing List, Maciej Rutecki,
	Florian Mickler, Andrew Morton, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	Linux Wireless List, DRI
In-Reply-To: <AANLkTin-9a5Z3qq4t8UakRvgB1G3_CT2RLKMVaHXvnLr@mail.gmail.com>

On Fri, Feb 4, 2011 at 11:11 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Feb 3, 2011 at 5:05 PM, Keith Packard <keithp@keithp.com> wrote:
>>
>> The goal is to make it so that when you *do* set a mode, DPMS gets set
>> to ON (as the monitor will actually be "on" at that point). Here's a
>> patch which does the DPMS_ON precisely when setting a mode.
>
> Ok, patch looks sane, but it does leave me with the "what about the
> 'fb_changed' case?" question. Is that case basically guaranteed to not
> change any existing dpms state?

Yes, it's inconsistent behaviour, but nothing in the fb_changed case
will affect the DPMS state. I expect we should probably do that so all
paths via that function turn DPMS on and it'll be consistent; that
might be something for 39.

Dave.

^ permalink raw reply

* Re: 2.6.38-rc3-git1: Reported regressions 2.6.36 -> 2.6.37
From: Keith Packard @ 2011-02-04  1:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Airlie, Carlos Mafra, Dave Airlie, Rafael J. Wysocki,
	Takashi Iwai, Linux Kernel Mailing List, Maciej Rutecki,
	Florian Mickler, Andrew Morton, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	Linux Wireless List, DRI
In-Reply-To: <AANLkTin-9a5Z3qq4t8UakRvgB1G3_CT2RLKMVaHXvnLr@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 546 bytes --]

On Thu, 3 Feb 2011 17:11:14 -0800, Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Ok, patch looks sane, but it does leave me with the "what about the
> 'fb_changed' case?" question. Is that case basically guaranteed to not
> change any existing dpms state?

None of the existing drivers turn anything on or off in the
mode_set_base code; the fix I intended was purely for the mode_set case,
which always turns on all of the connected outputs. I just screwed up
and stuck it in the wrong place.

-- 
keith.packard@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: [PATCH] net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT
From: David Miller @ 2011-02-04  1:19 UTC (permalink / raw)
  To: arnd; +Cc: ebiederm, netdev, eric.dumazet, kaber
In-Reply-To: <201101301959.53661.arnd@arndb.de>

From: Arnd Bergmann <arnd@arndb.de>
Date: Sun, 30 Jan 2011 19:59:53 +0100

> * ip multicast actually needs support for SIOCGETVIFCNT in
>   addition to SIOCGETSGCNT to be complete.
> * ipv6 multicast needs the same patch as ipv4 multicast for
>   SIOCGETMIFCNT_IN6/SIOCGETSGCNT_IN6.
> 
> It would probably be a good idea if someone could complete the
> work on ipv4/v6 multicast compat_ioctl, on top of your patch.

Actually, on top of this, Eric's patch is buggy.

He defines the "struct compat_sioc_sg_req" but doesn't actually
use it.

I'll fix that, then take care of the missing cases.  Thanks Arnd.

^ permalink raw reply

* Re: 2.6.38-rc3-git1: Reported regressions 2.6.36 -> 2.6.37
From: Linus Torvalds @ 2011-02-04  1:11 UTC (permalink / raw)
  To: Keith Packard
  Cc: Dave Airlie, Carlos Mafra, Dave Airlie, Rafael J. Wysocki,
	Takashi Iwai, Linux Kernel Mailing List, Maciej Rutecki,
	Florian Mickler, Andrew Morton, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	Linux Wireless List, DRI
In-Reply-To: <yund3n88v7x.fsf@aiko.keithp.com>

On Thu, Feb 3, 2011 at 5:05 PM, Keith Packard <keithp@keithp.com> wrote:
>
> The goal is to make it so that when you *do* set a mode, DPMS gets set
> to ON (as the monitor will actually be "on" at that point). Here's a
> patch which does the DPMS_ON precisely when setting a mode.

Ok, patch looks sane, but it does leave me with the "what about the
'fb_changed' case?" question. Is that case basically guaranteed to not
change any existing dpms state?

> (note, this patch compiles, but is otherwise only lightly tested).

Carlos? Takashi? Ignore my crazy patch, try this one instead. Does it
fix things for you?

                      Linus

^ permalink raw reply

* Re: 2.6.38-rc3-git1: Reported regressions 2.6.36 -> 2.6.37
From: Keith Packard @ 2011-02-04  1:05 UTC (permalink / raw)
  To: Linus Torvalds, Dave Airlie
  Cc: Linux SCSI List, Linux ACPI, Takashi Iwai, Carlos Mafra,
	Linux Wireless List, Linux Kernel Mailing List, DRI,
	Rafael J. Wysocki, Florian Mickler, Network Development,
	Dave Airlie, Andrew Morton, Kernel Testers List, Linux PM List,
	Maciej Rutecki
In-Reply-To: <AANLkTimQPPDyBwkN0HWM2+bPcbzVd8YHZvn2iR8MVzfL@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 4011 bytes --]

On Thu, 3 Feb 2011 16:30:56 -0800, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Thu, Feb 3, 2011 at 4:06 PM, Dave Airlie <airlied@gmail.com> wrote:
> >
> > If we are setting a mode on a connector it automatically will end up
> > in a DPMS on state,
> > so this seemed correct from what I can see.
> 
> The more I look at that function, the more I disagree with you and
> with that patch.
> 
> The code is just crazy.
> 
> First off, it isn't even necessarily setting a mode to begin with,
> as far as I can tell. If the mode doesn't change, neither
> mode_changed nor fb_changed will be true, afaik. There seems to be a
> fair amount of code there explicitly to avoid changing modes if not
> necessary.
> 
> But even _if_ we are setting a mode, if I read the code correctly, the
> mode may be set to NULL - which seems to mean "turn it off". In which
> case it looks to me that drm_helper_disable_unused_functions() will
> actually do a
> 
>    (*crtc_funcs->dpms)(crtc, DRM_MODE_DPMS_OFF);
> 
> call on the crtc in question. So then blindly just saying "it's mode
> DRM_MODE_DPMS_ON" afterwards looks rather bogus to me.
> 
> _Maybe_ it would work if it was done before that whole
> "disable_unused" logic. Or maybe it should just be done in
> drm_crtc_helper_set_mode(), which is what actually sets the mode (but
> there's the 'fb_changed' case too)
> 
> > A future mode set shouldn't ever not turn the connector on, since
> > modesetting is an implicit
> > DPMS,
> >
> > It sounds like something more subtle than that, though I'm happy to
> > revert this for now, and let Keith
> > think about it a bit more.
> 
> So I haven't heard anything from Keith. Keith? Just revert it? Or do
> you have a patch for people to try?

The goal is to make it so that when you *do* set a mode, DPMS gets set
to ON (as the monitor will actually be "on" at that point). Here's a
patch which does the DPMS_ON precisely when setting a mode.

Dave thinks we should instead force dpms to match crtc->enabled, I'd
rather have dpms get set when we know the hardware has been changed.

(note, this patch compiles, but is otherwise only lightly tested).

From 38507bb3a67441425e11085d17d727f3b230f927 Mon Sep 17 00:00:00 2001
From: Keith Packard <keithp@keithp.com>
Date: Thu, 3 Feb 2011 16:57:28 -0800
Subject: [PATCH] drm: Only set DPMS ON when actually configuring a mode

In drm_crtc_helper_set_config, instead of always forcing all outputs
to DRM_MODE_DPMS_ON, only set them if the CRTC is actually getting a
mode set, as any mode set will turn all outputs on.

Signed-off-by: Keith Packard <keithp@keithp.com>
---
 drivers/gpu/drm/drm_crtc_helper.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 952b3d4..17459ee 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -665,6 +665,12 @@ int drm_crtc_helper_set_config(struct drm_mode_set *set)
 				ret = -EINVAL;
 				goto fail;
 			}
+			DRM_DEBUG_KMS("Setting connector DPMS state to on\n");
+			for (i = 0; i < set->num_connectors; i++) {
+				DRM_DEBUG_KMS("\t[CONNECTOR:%d:%s] set DPMS on\n", set->connectors[i]->base.id,
+					      drm_get_connector_name(set->connectors[i]));
+				set->connectors[i]->dpms = DRM_MODE_DPMS_ON;
+			}
 		}
 		drm_helper_disable_unused_functions(dev);
 	} else if (fb_changed) {
@@ -681,12 +687,6 @@ int drm_crtc_helper_set_config(struct drm_mode_set *set)
 			goto fail;
 		}
 	}
-	DRM_DEBUG_KMS("Setting connector DPMS state to on\n");
-	for (i = 0; i < set->num_connectors; i++) {
-		DRM_DEBUG_KMS("\t[CONNECTOR:%d:%s] set DPMS on\n", set->connectors[i]->base.id,
-			      drm_get_connector_name(set->connectors[i]));
-		set->connectors[i]->dpms = DRM_MODE_DPMS_ON;
-	}
 
 	kfree(save_connectors);
 	kfree(save_encoders);
-- 
1.7.2.3

-- 
keith.packard@intel.com

[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related
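
The control flow the patch above aims for can be modelled in a few
lines of standalone C. This is a simplified sketch, not the DRM helper
itself: all demo_* names are hypothetical, and has_mode stands in for
the assumption that the block receiving the new loop only runs when a
non-NULL mode was successfully programmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative DPMS states; the values are not taken from DRM headers. */
enum demo_dpms {
	DEMO_DPMS_ON,
	DEMO_DPMS_OFF
};

struct demo_connector {
	enum demo_dpms dpms;
};

struct demo_mode_set {
	bool has_mode;			/* false models "mode == NULL" */
	struct demo_connector **connectors;
	size_t num_connectors;
};

static void demo_set_config(struct demo_mode_set *set,
			    bool mode_changed, bool fb_changed)
{
	size_t i;

	assert(set != NULL);

	if (mode_changed) {
		if (set->has_mode) {
			/* A mode was programmed, so these outputs are on. */
			for (i = 0; i < set->num_connectors; i++)
				set->connectors[i]->dpms = DEMO_DPMS_ON;
		}
		/* With no mode, unused outputs would be disabled here. */
	} else if (fb_changed) {
		/* Only the framebuffer base moves; DPMS is left untouched. */
	}
}
```

In this model a call with neither flag set, or with only fb_changed
set, leaves the recorded DPMS state alone, which is the behaviour
discussed in the replies.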

* Re: 2.6.38-rc3-git1: Reported regressions 2.6.36 -> 2.6.37
From: Dave Airlie @ 2011-02-04  0:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Carlos Mafra, Keith Packard, Dave Airlie, Rafael J. Wysocki,
	Takashi Iwai, Linux Kernel Mailing List, Maciej Rutecki,
	Florian Mickler, Andrew Morton, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	Linux Wireless List, DRI
In-Reply-To: <AANLkTimQPPDyBwkN0HWM2+bPcbzVd8YHZvn2iR8MVzfL@mail.gmail.com>

On Fri, Feb 4, 2011 at 10:30 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Feb 3, 2011 at 4:06 PM, Dave Airlie <airlied@gmail.com> wrote:
>>
>> If we are setting a mode on a connector it automatically will end up
>> in a DPMS on state,
>> so this seemed correct from what I can see.
>
> The more I look at that function, the more I disagree with you and
> with that patch.
>
> The code is just crazy.

Good point, I'm just trying to get -rc3 onto a machine where I can
reproduce this now, unfortunately that looks like the machine with the
1.8" disk, so this could take a little while.

hopefully Keith will decloak and tell us more.

Dave.

^ permalink raw reply

* Re: 2.6.38-rc3-git1: Reported regressions 2.6.36 -> 2.6.37
From: Linus Torvalds @ 2011-02-04  0:30 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Carlos Mafra, Keith Packard, Dave Airlie, Rafael J. Wysocki,
	Takashi Iwai, Linux Kernel Mailing List, Maciej Rutecki,
	Florian Mickler, Andrew Morton, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	Linux Wireless List, DRI
In-Reply-To: <AANLkTin7kBur95H=UBcYEcjYsUkrdG8znX-7PFZ4rrHf@mail.gmail.com>

On Thu, Feb 3, 2011 at 4:06 PM, Dave Airlie <airlied@gmail.com> wrote:
>
> If we are setting a mode on a connector it automatically will end up
> in a DPMS on state,
> so this seemed correct from what I can see.

The more I look at that function, the more I disagree with you and
with that patch.

The code is just crazy.

First off, it isn't even necessarily setting a mode to begin with,
as far as I can tell. If the mode doesn't change, neither
mode_changed nor fb_changed will be true, afaik. There seems to be a
fair amount of code there explicitly to avoid changing modes if not
necessary.

But even _if_ we are setting a mode, if I read the code correctly, the
mode may be set to NULL - which seems to mean "turn it off". In which
case it looks to me that drm_helper_disable_unused_functions() will
actually do a

   (*crtc_funcs->dpms)(crtc, DRM_MODE_DPMS_OFF);

call on the crtc in question. So then blindly just saying "it's mode
DRM_MODE_DPMS_ON" afterwards looks rather bogus to me.

_Maybe_ it would work if it was done before that whole
"disable_unused" logic. Or maybe it should just be done in
drm_crtc_helper_set_mode(), which is what actually sets the mode (but
there's the 'fb_changed' case too)

> A future mode set shouldn't ever not turn the connector on, since
> modesetting is an implicit
> DPMS,
>
> It sounds like something more subtle than that, though I'm happy to
> revert this for now, and let Keith
> think about it a bit more.

So I haven't heard anything from Keith. Keith? Just revert it? Or do
you have a patch for people to try?

                               Linus

^ permalink raw reply

* [PATCH] niu: Fix races between up/down and get_stats.
From: David Miller @ 2011-02-04  0:25 UTC (permalink / raw)
  To: netdev; +Cc: fleitner


As reported by Flavio Leitner, there is no synchronization to protect
NIU's get_stats method from seeing a NULL pointer in either
np->rx_rings or np->tx_rings.  In fact, as far as ->ndo_get_stats
is concerned, these values are set completely asynchronously.

Flavio attempted to fix this using a RW semaphore, which in fact
works most of the time.  However, dev_get_stats() can be invoked
from non-sleepable contexts in some cases, so this fix doesn't
work in all cases.

So instead, control the visibility of the np->{rx,tx}_ring pointers
when the device is being brought up, and use properties of the device
down sequence to our advantage.

In niu_get_stats(), return immediately if netif_running() is false.
The device shutdown sequence first marks the device as not running (by
clearing the __LINK_STATE_START bit), then it performs a
synchronize_rcu() (in dev_deactivate_many()), and then finally it
invokes the driver ->ndo_stop() method.

This guarantees that all invocations of niu_get_stats() either see
netif_running() as false, or they see the channel pointers before
->ndo_stop() clears them out.

If netif_running() is true, protect against startup races by loading
the np->{rx,tx}_rings pointer into a local variable, and punting if
it is NULL.  Use ACCESS_ONCE to prevent the compiler from reloading
the pointer on us.

Also, during open, control the order in which the pointers and the
ring counts become visible globally using SMP write memory barriers.
We make sure the np->num_{rx,tx}_rings value is stable and visible
before np->{rx,tx}_rings is.

Such visibility control is not necessary on the niu_free_channels()
side because of the RCU sequencing that happens during device down as
described above.  We are always guaranteed that all niu_get_stats
calls are finished, or will see netif_running() false, by the time
->ndo_stop is invoked.

Reported-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/niu.c |   61 +++++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 45 insertions(+), 16 deletions(-)

diff --git a/drivers/net/niu.c b/drivers/net/niu.c
index 2541321..9fb59d3 100644
--- a/drivers/net/niu.c
+++ b/drivers/net/niu.c
@@ -4489,6 +4489,9 @@ static int niu_alloc_channels(struct niu *np)
 {
 	struct niu_parent *parent = np->parent;
 	int first_rx_channel, first_tx_channel;
+	int num_rx_rings, num_tx_rings;
+	struct rx_ring_info *rx_rings;
+	struct tx_ring_info *tx_rings;
 	int i, port, err;
 
 	port = np->port;
@@ -4498,18 +4501,21 @@ static int niu_alloc_channels(struct niu *np)
 		first_tx_channel += parent->txchan_per_port[i];
 	}
 
-	np->num_rx_rings = parent->rxchan_per_port[port];
-	np->num_tx_rings = parent->txchan_per_port[port];
+	num_rx_rings = parent->rxchan_per_port[port];
+	num_tx_rings = parent->txchan_per_port[port];
 
-	netif_set_real_num_rx_queues(np->dev, np->num_rx_rings);
-	netif_set_real_num_tx_queues(np->dev, np->num_tx_rings);
-
-	np->rx_rings = kcalloc(np->num_rx_rings, sizeof(struct rx_ring_info),
-			       GFP_KERNEL);
+	rx_rings = kcalloc(num_rx_rings, sizeof(struct rx_ring_info),
+			   GFP_KERNEL);
 	err = -ENOMEM;
-	if (!np->rx_rings)
+	if (!rx_rings)
 		goto out_err;
 
+	np->num_rx_rings = num_rx_rings;
+	smp_wmb();
+	np->rx_rings = rx_rings;
+
+	netif_set_real_num_rx_queues(np->dev, num_rx_rings);
+
 	for (i = 0; i < np->num_rx_rings; i++) {
 		struct rx_ring_info *rp = &np->rx_rings[i];
 
@@ -4538,12 +4544,18 @@ static int niu_alloc_channels(struct niu *np)
 			return err;
 	}
 
-	np->tx_rings = kcalloc(np->num_tx_rings, sizeof(struct tx_ring_info),
-			       GFP_KERNEL);
+	tx_rings = kcalloc(num_tx_rings, sizeof(struct tx_ring_info),
+			   GFP_KERNEL);
 	err = -ENOMEM;
-	if (!np->tx_rings)
+	if (!tx_rings)
 		goto out_err;
 
+	np->num_tx_rings = num_tx_rings;
+	smp_wmb();
+	np->tx_rings = tx_rings;
+
+	netif_set_real_num_tx_queues(np->dev, num_tx_rings);
+
 	for (i = 0; i < np->num_tx_rings; i++) {
 		struct tx_ring_info *rp = &np->tx_rings[i];
 
@@ -6246,11 +6258,17 @@ static void niu_sync_mac_stats(struct niu *np)
 static void niu_get_rx_stats(struct niu *np)
 {
 	unsigned long pkts, dropped, errors, bytes;
+	struct rx_ring_info *rx_rings;
 	int i;
 
 	pkts = dropped = errors = bytes = 0;
+
+	rx_rings = ACCESS_ONCE(np->rx_rings);
+	if (!rx_rings)
+		goto no_rings;
+
 	for (i = 0; i < np->num_rx_rings; i++) {
-		struct rx_ring_info *rp = &np->rx_rings[i];
+		struct rx_ring_info *rp = &rx_rings[i];
 
 		niu_sync_rx_discard_stats(np, rp, 0);
 
@@ -6259,6 +6277,8 @@ static void niu_get_rx_stats(struct niu *np)
 		dropped += rp->rx_dropped;
 		errors += rp->rx_errors;
 	}
+
+no_rings:
 	np->dev->stats.rx_packets = pkts;
 	np->dev->stats.rx_bytes = bytes;
 	np->dev->stats.rx_dropped = dropped;
@@ -6268,16 +6288,24 @@ static void niu_get_rx_stats(struct niu *np)
 static void niu_get_tx_stats(struct niu *np)
 {
 	unsigned long pkts, errors, bytes;
+	struct tx_ring_info *tx_rings;
 	int i;
 
 	pkts = errors = bytes = 0;
+
+	tx_rings = ACCESS_ONCE(np->tx_rings);
+	if (!tx_rings)
+		goto no_rings;
+
 	for (i = 0; i < np->num_tx_rings; i++) {
-		struct tx_ring_info *rp = &np->tx_rings[i];
+		struct tx_ring_info *rp = &tx_rings[i];
 
 		pkts += rp->tx_packets;
 		bytes += rp->tx_bytes;
 		errors += rp->tx_errors;
 	}
+
+no_rings:
 	np->dev->stats.tx_packets = pkts;
 	np->dev->stats.tx_bytes = bytes;
 	np->dev->stats.tx_errors = errors;
@@ -6287,9 +6315,10 @@ static struct net_device_stats *niu_get_stats(struct net_device *dev)
 {
 	struct niu *np = netdev_priv(dev);
 
-	niu_get_rx_stats(np);
-	niu_get_tx_stats(np);
-
+	if (netif_running(dev)) {
+		niu_get_rx_stats(np);
+		niu_get_tx_stats(np);
+	}
 	return &dev->stats;
 }
 
-- 
1.7.4


^ permalink raw reply related
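
The publish-then-read ordering described in the niu patch above can be
sketched in userspace with C11 atomics. This is an analogy under stated
assumptions, not the driver code: the demo_* names are hypothetical, a
release store plays the role of "set the count, smp_wmb(), set the
pointer", and a single acquire load plays the role of the ACCESS_ONCE()
read. The acquire ordering on the reader is a choice made for this
sketch so that it is correct as standalone C11.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_ring {
	unsigned long packets;
};

struct demo_dev {
	int num_rings;
	_Atomic(struct demo_ring *) rings;	/* NULL until published */
};

/* Writer: make the count visible before the pointer becomes non-NULL. */
static int demo_publish_rings(struct demo_dev *d, int n)
{
	struct demo_ring *r;

	assert(n > 0);
	r = calloc((size_t)n, sizeof(*r));
	if (!r)
		return -1;

	d->num_rings = n;
	/* Release: the num_rings store is ordered before this store. */
	atomic_store_explicit(&d->rings, r, memory_order_release);
	return 0;
}

/* Reader: load the pointer exactly once and bail out if it is NULL. */
static unsigned long demo_sum_packets(struct demo_dev *d)
{
	struct demo_ring *r;
	unsigned long total = 0;
	int i;

	r = atomic_load_explicit(&d->rings, memory_order_acquire);
	if (!r)
		return 0;

	/* Index the local copy, never the shared pointer again. */
	for (i = 0; i < d->num_rings; i++)
		total += r[i].packets;
	return total;
}
```

The sketch covers only the bring-up side. Teardown in the patch relies
on netif_running() and the RCU grace period in the device down path,
which this model does not attempt to reproduce.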
