Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next-2.6] phonet: use call_rcu for phonet device free
From: Eric Dumazet @ 2010-06-07 13:49 UTC (permalink / raw)
  To: Rémi Denis-Courmont; +Cc: Jiri Pirko, netdev, davem
In-Reply-To: <474f08fed4a406e929af3d4142d3e185@chewa.net>

Le lundi 07 juin 2010 à 15:43 +0200, Rémi Denis-Courmont a écrit :
> On Mon, 7 Jun 2010 15:27:39 +0200, Jiri Pirko <jpirko@redhat.com> wrote:
> > Use call_rcu rather than synchronize_rcu.
> > 
> > Signed-off-by: Jiri Pirko <jpirko@redhat.com>
> 
> This looks fine to me, but what is the goal here? The RCU documentation
> seems to imply that synchronize_rcu() is preferable over call_rcu() when at
> all possible.
> 

Thats not exactly that.

synchronize_rcu() is easier, in respect of memory use.
But its drawback is current thread is blocked for several milli seconds.

In the end, call_rcu() is more scalable.

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>



^ permalink raw reply

* Re: [Linux-ATM-General] RX/close vcc race with solos/atmtcp/usbatm/he
From: Chas Williams (CONTRACTOR) @ 2010-06-07 13:44 UTC (permalink / raw)
  To: David Woodhouse; +Cc: linux-atm-general, netdev
In-Reply-To: <1274872584.20576.13579.camel@macbook.infradead.org>

In message <1274872584.20576.13579.camel@macbook.infradead.org>,David Woodhouse
 writes:
>The problem is that the 'find_vcc' functions in these drivers are
>returning a vcc with the ATM_VF_READY bit cleared, because it's already
>in the process of being destroyed. If we fix that simple oversight,
>there's still a race condition because the socket can still be closed
>(and completely freed, afaict) between our call to find_vcc() and our
>call to vcc->push() in the RX tasklet.
...
>Can anyone see a better approach -- short of rewriting the whole ATM
>layer to make the locking saner?

vcc's are really sockets, so you could just increase the refcount --
sock_hold().  after you push the packet, drop the refcount, sock_put()
and hopefully things will be well.  however, i think the close routines
dont really expect this behavior so the card driver's vcc close might
need to be changed to wait around if the refcount on the vcc is > 1.
or perhaps the driver independent part needs to do this.

the he driver works around this issue by holding vcc_sklist_lock around
the find_vcc and ->push() which happen to occur in the same tasklet.

^ permalink raw reply

* [PATCH net-next-2.6] net: avoid two atomic ops in ip_rcv_options()
From: Eric Dumazet @ 2010-06-07 13:54 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

in_dev_get() -> __in_dev_get_rcu() in a rcu protected function.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/ipv4/ip_input.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index d52c9da..27b5a09 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -298,18 +298,16 @@ static inline int ip_rcv_options(struct sk_buff *skb)
 	}
 
 	if (unlikely(opt->srr)) {
-		struct in_device *in_dev = in_dev_get(dev);
+		struct in_device *in_dev = __in_dev_get_rcu(dev);
+
 		if (in_dev) {
 			if (!IN_DEV_SOURCE_ROUTE(in_dev)) {
 				if (IN_DEV_LOG_MARTIANS(in_dev) &&
 				    net_ratelimit())
 					printk(KERN_INFO "source route option %pI4 -> %pI4\n",
 					       &iph->saddr, &iph->daddr);
-				in_dev_put(in_dev);
 				goto drop;
 			}
-
-			in_dev_put(in_dev);
 		}
 
 		if (ip_options_rcv_srr(skb))



^ permalink raw reply related

* Re: [Linux-ATM-General] RX/close vcc race with solos/atmtcp/usbatm/he
From: David Woodhouse @ 2010-06-07 14:13 UTC (permalink / raw)
  To: chas3; +Cc: linux-atm-general, netdev
In-Reply-To: <201006071344.o57DiiCx018593@thirdoffive.cmf.nrl.navy.mil>

On Mon, 2010-06-07 at 09:44 -0400, Chas Williams (CONTRACTOR) wrote:
> vcc's are really sockets, so you could just increase the refcount --
> sock_hold().

There are rules about where we're allowed to call sock_hold(), and I
don't think our find_vcc() functions can be made to meet them.

> however, i think the close routines dont really expect this behavior
> so the card driver's vcc close might need to be changed to wait around
> if the refcount on the vcc is > 1.

In that case I think we might as well stick with the RCU-like solution I
already implemented in the vcc close function -- which is just to wait
for the tasklet to complete, thus ensuring that it's no longer using the
defunct vcc.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

^ permalink raw reply

* [RFC] lunar manet routing module
From: Christophe Jelger @ 2010-06-07 14:19 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1275917415.2545.79.camel@edumazet-laptop>

[-- Attachment #1: Type: text/plain, Size: 809 bytes --]

Eric Dumazet wrote:
> 
> Christophe,
> 
> Unless patch is really huge, its ok to send it on netdev, with a RFC
> label, so that only people with free time take a look, eventually.
> 
> [RFC] lunar: ....

[not sure what 'huge' means, I'm sending 60k -- sorry for the pollution]

Eric: thanks for the advice. Instead of a patch I attach a .tgz (hope 
it's ok) of the module code with a README explaining everything: it 
compiles for 2.6.31 and 2.6.32, didn't try more recent kernels because I 
actually want to deploy the whole thing on OpenWRT 2.6.32 on Linksys 
devices.

To all: the lunar module crashes the kernel, so be careful. I tried 
debugging with qemu and gdb server but could not find the bug(s) -- my 
experience for kernel debugging is in fact limited.

thanks in advance for any help,
Christophe

[-- Attachment #2: lunar.tgz --]
[-- Type: application/x-compressed-tar, Size: 59498 bytes --]

^ permalink raw reply

* [PATCH net-next-2.6] pkt_sched: gen_estimator: kill est_lock rwlock
From: Eric Dumazet @ 2010-06-07 14:32 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Stephen Hemminger, Jarek Poplawski

We can use RCU in est_timer() and kill est_lock rwlock.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/core/gen_estimator.c |   45 ++++++++++++++-----------------------
 1 file changed, 18 insertions(+), 27 deletions(-)

diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index cf8e703..406d880 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -102,9 +102,6 @@ struct gen_estimator_head
 
 static struct gen_estimator_head elist[EST_MAX_INTERVAL+1];
 
-/* Protects against NULL dereference */
-static DEFINE_RWLOCK(est_lock);
-
 /* Protects against soft lockup during large deletion */
 static struct rb_root est_root = RB_ROOT;
 
@@ -115,29 +112,25 @@ static void est_timer(unsigned long arg)
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(e, &elist[idx].list, list) {
-		u64 nbytes;
-		u64 brate;
-		u32 npackets;
-		u32 rate;
+		struct gnet_stats_basic_packed *bstats;
 
 		spin_lock(e->stats_lock);
-		read_lock(&est_lock);
-		if (e->bstats == NULL)
-			goto skip;
-
-		nbytes = e->bstats->bytes;
-		npackets = e->bstats->packets;
-		brate = (nbytes - e->last_bytes)<<(7 - idx);
-		e->last_bytes = nbytes;
-		e->avbps += (brate >> e->ewma_log) - (e->avbps >> e->ewma_log);
-		e->rate_est->bps = (e->avbps+0xF)>>5;
-
-		rate = (npackets - e->last_packets)<<(12 - idx);
-		e->last_packets = npackets;
-		e->avpps += (rate >> e->ewma_log) - (e->avpps >> e->ewma_log);
-		e->rate_est->pps = (e->avpps+0x1FF)>>10;
-skip:
-		read_unlock(&est_lock);
+		bstats = rcu_dereference(e->bstats);
+		if (bstats) {
+			u64 nbytes = ACCESS_ONCE(bstats->bytes);
+			u32 npackets = ACCESS_ONCE(bstats->packets);
+			u64 brate = (nbytes - e->last_bytes)<<(7 - idx);
+			u32 rate;
+
+			e->last_bytes = nbytes;
+			e->avbps += (brate >> e->ewma_log) - (e->avbps >> e->ewma_log);
+			e->rate_est->bps = (e->avbps+0xF)>>5;
+
+			rate = (npackets - e->last_packets)<<(12 - idx);
+			e->last_packets = npackets;
+			e->avpps += (rate >> e->ewma_log) - (e->avpps >> e->ewma_log);
+			e->rate_est->pps = (e->avpps+0x1FF)>>10;
+		}
 		spin_unlock(e->stats_lock);
 	}
 
@@ -271,9 +264,7 @@ void gen_kill_estimator(struct gnet_stats_basic_packed *bstats,
 	while ((e = gen_find_node(bstats, rate_est))) {
 		rb_erase(&e->node, &est_root);
 
-		write_lock_bh(&est_lock);
-		e->bstats = NULL;
-		write_unlock_bh(&est_lock);
+		rcu_assign_pointer(e->bstats, NULL);
 
 		list_del_rcu(&e->list);
 		call_rcu(&e->e_rcu, __gen_kill_estimator);



^ permalink raw reply related

* [patch] caif: fix a couple range checks
From: Dan Carpenter @ 2010-06-07 14:51 UTC (permalink / raw)
  To: Sjur Braendeland; +Cc: David S. Miller, netdev, kernel-janitors

The extra ! character means that these conditions are always false.

Signed-off-by: Dan Carpenter <error27@gmail.com>

diff --git a/net/caif/cfveil.c b/net/caif/cfveil.c
index 0fd827f..e04f7d9 100644
--- a/net/caif/cfveil.c
+++ b/net/caif/cfveil.c
@@ -84,7 +84,7 @@ static int cfvei_transmit(struct cflayer *layr, struct cfpkt *pkt)
 		return ret;
 	caif_assert(layr->dn != NULL);
 	caif_assert(layr->dn->transmit != NULL);
-	if (!cfpkt_getlen(pkt) > CAIF_MAX_PAYLOAD_SIZE) {
+	if (cfpkt_getlen(pkt) > CAIF_MAX_PAYLOAD_SIZE) {
 		pr_warning("CAIF: %s(): Packet too large - size=%d\n",
 			   __func__, cfpkt_getlen(pkt));
 		return -EOVERFLOW;
diff --git a/net/caif/cfrfml.c b/net/caif/cfrfml.c
index cd2830f..fd27b17 100644
--- a/net/caif/cfrfml.c
+++ b/net/caif/cfrfml.c
@@ -83,7 +83,7 @@ static int cfrfml_transmit(struct cflayer *layr, struct cfpkt *pkt)
 	if (!cfsrvl_ready(service, &ret))
 		return ret;
 
-	if (!cfpkt_getlen(pkt) > CAIF_MAX_PAYLOAD_SIZE) {
+	if (cfpkt_getlen(pkt) > CAIF_MAX_PAYLOAD_SIZE) {
 		pr_err("CAIF: %s():Packet too large - size=%d\n",
 			__func__, cfpkt_getlen(pkt));
 		return -EOVERFLOW;

^ permalink raw reply related

* Re: [PATCH net-next-2.6] pkt_sched: gen_estimator: kill est_lock rwlock
From: Changli Gao @ 2010-06-07 14:53 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Stephen Hemminger, Jarek Poplawski
In-Reply-To: <1275921171.2545.102.camel@edumazet-laptop>

On Mon, Jun 7, 2010 at 10:32 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> We can use RCU in est_timer() and kill est_lock rwlock.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  net/core/gen_estimator.c |   45 ++++++++++++++-----------------------
>  1 file changed, 18 insertions(+), 27 deletions(-)
>
> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> index cf8e703..406d880 100644
> --- a/net/core/gen_estimator.c
> +++ b/net/core/gen_estimator.c
> @@ -102,9 +102,6 @@ struct gen_estimator_head
>
>  static struct gen_estimator_head elist[EST_MAX_INTERVAL+1];
>
> -/* Protects against NULL dereference */
> -static DEFINE_RWLOCK(est_lock);
> -
>  /* Protects against soft lockup during large deletion */
>  static struct rb_root est_root = RB_ROOT;
>
> @@ -115,29 +112,25 @@ static void est_timer(unsigned long arg)
>
>        rcu_read_lock();
>        list_for_each_entry_rcu(e, &elist[idx].list, list) {
> -               u64 nbytes;
> -               u64 brate;
> -               u32 npackets;
> -               u32 rate;
> +               struct gnet_stats_basic_packed *bstats;
>
>                spin_lock(e->stats_lock);
> -               read_lock(&est_lock);
> -               if (e->bstats == NULL)
> -                       goto skip;
> -
> -               nbytes = e->bstats->bytes;
> -               npackets = e->bstats->packets;
> -               brate = (nbytes - e->last_bytes)<<(7 - idx);
> -               e->last_bytes = nbytes;
> -               e->avbps += (brate >> e->ewma_log) - (e->avbps >> e->ewma_log);
> -               e->rate_est->bps = (e->avbps+0xF)>>5;
> -
> -               rate = (npackets - e->last_packets)<<(12 - idx);
> -               e->last_packets = npackets;
> -               e->avpps += (rate >> e->ewma_log) - (e->avpps >> e->ewma_log);
> -               e->rate_est->pps = (e->avpps+0x1FF)>>10;
> -skip:
> -               read_unlock(&est_lock);
> +               bstats = rcu_dereference(e->bstats);
> +               if (bstats) {
> +                       u64 nbytes = ACCESS_ONCE(bstats->bytes);
> +                       u32 npackets = ACCESS_ONCE(bstats->packets);
> +                       u64 brate = (nbytes - e->last_bytes)<<(7 - idx);
> +                       u32 rate;
> +
> +                       e->last_bytes = nbytes;
> +                       e->avbps += (brate >> e->ewma_log) - (e->avbps >> e->ewma_log);
> +                       e->rate_est->bps = (e->avbps+0xF)>>5;
> +
> +                       rate = (npackets - e->last_packets)<<(12 - idx);
> +                       e->last_packets = npackets;
> +                       e->avpps += (rate >> e->ewma_log) - (e->avpps >> e->ewma_log);
> +                       e->rate_est->pps = (e->avpps+0x1FF)>>10;
> +               }
>                spin_unlock(e->stats_lock);
>        }
>
> @@ -271,9 +264,7 @@ void gen_kill_estimator(struct gnet_stats_basic_packed *bstats,
>        while ((e = gen_find_node(bstats, rate_est))) {
>                rb_erase(&e->node, &est_root);
>
> -               write_lock_bh(&est_lock);
> -               e->bstats = NULL;
> -               write_unlock_bh(&est_lock);
> +               rcu_assign_pointer(e->bstats, NULL);
>
>                list_del_rcu(&e->list);
>                call_rcu(&e->e_rcu, __gen_kill_estimator);
>

Hmm. I don't think it is correct.

Look at this code:

void xt_rateest_put(struct xt_rateest *est)
{
        mutex_lock(&xt_rateest_mutex);
        if (--est->refcnt == 0) {
                hlist_del(&est->list);
                gen_kill_estimator(&est->bstats, &est->rstats);
                kfree(est);
        }
        mutex_unlock(&xt_rateest_mutex);
}

est will be released after gen_kill_estimator() returns. After est is
freed, there may still be some other CPUs running in the branch:
     if (bstats) {
         ....
     }

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)

^ permalink raw reply

* Re: no reassembly for outgoing packets on RAW socket
From: Jiri Olsa @ 2010-06-07 14:55 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev
In-Reply-To: <4C08EB85.3050900@trash.net>

On Fri, Jun 04, 2010 at 02:03:17PM +0200, Patrick McHardy wrote:
> Jiri Olsa wrote:
> > hi,
> > 
> > I'd like to be able to sendout a single IP packet with MF flag set.
> > 
> > When using RAW sockets the packet will get stuck in the
> > netfilter (NF_INET_LOCAL_OUT nf_defrag_ipv4 reassembly unit)
> > and wont ever make it out..
> > 
> > I made a change which bypass the outgoing reassembly for
> > RAW sockets, but I'm not sure wether it's too invasive..
> 
> That would break reassembly (and thus connection tracking) for cases
> where its really intended.
> 
> > Is there any standard for RAW sockets behaviour?
> > Or another way around? :)
> 
> You could use the NOTRACK target to bypass connection tracking.

ok,

I tried the NOTRACK target, but the packet is still going
throught reassembly, because the RAW filter has lower priority
then the connection track defragmentation..

I was able to get it bypassed by attached patch and following
command:

	iptables -v -t raw -A OUTPUT -p icmp -j NOTRACK

again, not sure if this is too invasive ;)

If this is not the way, I'd appreciatte any hint..  my goal is
to put malformed packet on the wire (more frags bit set for a
non fragmented packet)


thanks for help,
jirka

---
diff --git a/include/linux/netfilter_ipv4.h b/include/linux/netfilter_ipv4.h
index 29c7727..d249b6a 100644
--- a/include/linux/netfilter_ipv4.h
+++ b/include/linux/netfilter_ipv4.h
@@ -53,8 +53,8 @@
 
 enum nf_ip_hook_priorities {
 	NF_IP_PRI_FIRST = INT_MIN,
-	NF_IP_PRI_CONNTRACK_DEFRAG = -400,
-	NF_IP_PRI_RAW = -300,
+	NF_IP_PRI_RAW = -400,
+	NF_IP_PRI_CONNTRACK_DEFRAG = -300,
 	NF_IP_PRI_SELINUX_FIRST = -225,
 	NF_IP_PRI_CONNTRACK = -200,
 	NF_IP_PRI_MANGLE = -150,
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
index cb763ae..cb865d1 100644
--- a/net/ipv4/netfilter/nf_defrag_ipv4.c
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -74,6 +74,9 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
 		return NF_ACCEPT;
 #endif
 #endif
+	if (nf_ct_is_untracked(skb))
+		return NF_ACCEPT;
+
 	/* Gather fragments. */
 	if (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)) {
 		enum ip_defrag_users user = nf_ct_defrag_user(hooknum, skb);

^ permalink raw reply related

* Re: [Linux-ATM-General] RX/close vcc race with solos/atmtcp/usbatm/he
From: Chas Williams (CONTRACTOR) @ 2010-06-07 15:10 UTC (permalink / raw)
  To: David Woodhouse; +Cc: linux-atm-general, netdev
In-Reply-To: <1275920035.17903.4998.camel@macbook.infradead.org>

In message <1275920035.17903.4998.camel@macbook.infradead.org>,David Woodhouse 
writes:
>On Mon, 2010-06-07 at 09:44 -0400, Chas Williams (CONTRACTOR) wrote:
>> vcc's are really sockets, so you could just increase the refcount --
>> sock_hold().
>
>There are rules about where we're allowed to call sock_hold(), and I
>don't think our find_vcc() functions can be made to meet them.

if you are using find_vcc() then you should already have a lock on the
hash table for the vccs.  you can safely increment the ref count at
this point.

>In that case I think we might as well stick with the RCU-like solution I
>already implemented in the vcc close function -- which is just to wait
>for the tasklet to complete, thus ensuring that it's no longer using the
>defunct vcc.

the driver shouldnt close the vcc while the tasklet is running and
using the vcc in question.  i guess the safest thing is to simply
do as you are doing and not close while running the tasklet.

^ permalink raw reply

* Re: [PATCH net-next-2.6] pkt_sched: gen_estimator: kill est_lock rwlock
From: Eric Dumazet @ 2010-06-07 15:30 UTC (permalink / raw)
  To: Changli Gao; +Cc: David Miller, netdev, Stephen Hemminger, Jarek Poplawski
In-Reply-To: <AANLkTimgmYjA7Xp19gycRHDXmwP1w1twn92cfT08tCv2@mail.gmail.com>

Le lundi 07 juin 2010 à 22:53 +0800, Changli Gao a écrit :

> Hmm. I don't think it is correct.
> 
> Look at this code:
> 
> void xt_rateest_put(struct xt_rateest *est)
> {
>         mutex_lock(&xt_rateest_mutex);
>         if (--est->refcnt == 0) {
>                 hlist_del(&est->list);
>                 gen_kill_estimator(&est->bstats, &est->rstats);
>                 kfree(est);
>         }
>         mutex_unlock(&xt_rateest_mutex);
> }
> 
> est will be released after gen_kill_estimator() returns. After est is
> freed, there may still be some other CPUs running in the branch:
>      if (bstats) {
>          ....
>      }
> 

Oh well, I think I knew this from a previous attempt, but then I
forgot :)

I'll provide an updated patch, so that no other attempt is tried next
yer, thanks !




^ permalink raw reply

* Re: [PATCH] phylib: Add support for the LXT973 phy.
From: Richard Cochran @ 2010-06-07 15:39 UTC (permalink / raw)
  To: David Miller; +Cc: afleming, netdev
In-Reply-To: <20100607.011835.55846752.davem@davemloft.net>

On Mon, Jun 07, 2010 at 01:18:35AM -0700, David Miller wrote:
> 
> Richard, please just resubmit your original patch and I will apply it.

Okay, here it is.

Thanks,
Richard


This patch implements a work around for Erratum 5, "3.3 V Fiber Speed
Selection." If the hardware wiring does not respect this erratum, then
fiber optic mode will not work properly.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
 drivers/net/phy/lxt.c |   51 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 50 insertions(+), 1 deletions(-)

diff --git a/drivers/net/phy/lxt.c b/drivers/net/phy/lxt.c
index 057ecaa..c3de582 100644
--- a/drivers/net/phy/lxt.c
+++ b/drivers/net/phy/lxt.c
@@ -53,6 +53,9 @@
 
 #define MII_LXT971_ISR		19  /* Interrupt Status Register */
 
+/* register definitions for the 973 */
+#define MII_LXT973_PCR 16 /* Port Configuration Register */
+#define PCR_FIBER_SELECT 1
 
 MODULE_DESCRIPTION("Intel LXT PHY driver");
 MODULE_AUTHOR("Andy Fleming");
@@ -119,6 +122,33 @@ static int lxt971_config_intr(struct phy_device *phydev)
 	return err;
 }
 
+static int lxt973_probe(struct phy_device *phydev)
+{
+	int val = phy_read(phydev, MII_LXT973_PCR);
+
+	if (val & PCR_FIBER_SELECT) {
+		/*
+		 * If fiber is selected, then the only correct setting
+		 * is 100Mbps, full duplex, and auto negotiation off.
+		 */
+		val = phy_read(phydev, MII_BMCR);
+		val |= (BMCR_SPEED100 | BMCR_FULLDPLX);
+		val &= ~BMCR_ANENABLE;
+		phy_write(phydev, MII_BMCR, val);
+		/* Remember that the port is in fiber mode. */
+		phydev->priv = lxt973_probe;
+	} else {
+		phydev->priv = NULL;
+	}
+	return 0;
+}
+
+static int lxt973_config_aneg(struct phy_device *phydev)
+{
+	/* Do nothing if port is in fiber mode. */
+	return phydev->priv ? 0 : genphy_config_aneg(phydev);
+}
+
 static struct phy_driver lxt970_driver = {
 	.phy_id		= 0x78100000,
 	.name		= "LXT970",
@@ -146,6 +176,18 @@ static struct phy_driver lxt971_driver = {
 	.driver 	= { .owner = THIS_MODULE,},
 };
 
+static struct phy_driver lxt973_driver = {
+	.phy_id		= 0x00137a10,
+	.name		= "LXT973",
+	.phy_id_mask	= 0xfffffff0,
+	.features	= PHY_BASIC_FEATURES,
+	.flags		= 0,
+	.probe		= lxt973_probe,
+	.config_aneg	= lxt973_config_aneg,
+	.read_status	= genphy_read_status,
+	.driver 	= { .owner = THIS_MODULE,},
+};
+
 static int __init lxt_init(void)
 {
 	int ret;
@@ -157,9 +199,15 @@ static int __init lxt_init(void)
 	ret = phy_driver_register(&lxt971_driver);
 	if (ret)
 		goto err2;
+
+	ret = phy_driver_register(&lxt973_driver);
+	if (ret)
+		goto err3;
 	return 0;
 
- err2:	
+ err3:
+	phy_driver_unregister(&lxt971_driver);
+ err2:
 	phy_driver_unregister(&lxt970_driver);
  err1:
 	return ret;
@@ -169,6 +217,7 @@ static void __exit lxt_exit(void)
 {
 	phy_driver_unregister(&lxt970_driver);
 	phy_driver_unregister(&lxt971_driver);
+	phy_driver_unregister(&lxt973_driver);
 }
 
 module_init(lxt_init);
-- 
1.6.3.3



^ permalink raw reply related

* Re: [PATCH] phylib: Add support for the LXT973 phy.
From: Andy Fleming @ 2010-06-07 15:39 UTC (permalink / raw)
  To: David Miller; +Cc: richardcochran, netdev
In-Reply-To: <20100607.011835.55846752.davem@davemloft.net>

On Mon, Jun 7, 2010 at 3:18 AM, David Miller <davem@davemloft.net> wrote:
> From: Richard Cochran <richardcochran@gmail.com>
> Date: Sat, 5 Jun 2010 16:00:17 +0200
>
>> On Wed, Jun 02, 2010 at 02:32:11PM -0500, Andy Fleming wrote:
>>> Yeah, I was clearly not thinking clearly.  dev_flags will be
>>> overwritten, and is not meant for this.  I believe, what we should do
>>> is add a "port" field to the PHY device, and if PCR_FIBER_SELECT is
>>> set, then set the port field to PORT_FIBRE.  I'm not entirely clear on
>>> the semantics of that field in the ethtool cmd, but it seems like the
>>> right idea.
>>
>> Here is another try. Is that more like it?
>
> I think this is overkill.

Well, I meant for it to be the same as the ethtool one, and not a flags field.

>
> One, and only one, PHY driver wants to maintain a boolean state and
> we're adding a full 32-bit flags member to the generic PHY device
> struct to accomodate this?

To be fair, this is a generic problem, with a simple solution.  I was
suggesting that a tiny patch would pave the way for others to follow
suit.  Instead, now a tiny patch will do something strange and
incomprehensible, and then will be changed later when some arbitrary
threshold is reached of PHY drivers needing to know what type of port
they are hooked up to.

>
> Andy if you have ideas to use that state via ethtool or whatever in
> the future, you come back to us when the future arrives and you've
> implemented the code in the generic PHY lib code to do that stuff.

Is there some reason for resistance to taking small steps in the right
direction of an obviously good destination (recording the port type)?
At the very least, can I ask that instead of assigning phydev->priv to
the address of the probe function, that we do something like:

phydev->priv = (void *) PCR_FIBER_SELECT;

And then check to make sure it is that value?  It's still hacky
(IMHO), but at least it's self-documenting.

>
> As it stands right now, that code doesn't exist so accomodating it is
> just silly.
>
> I also think worrying about the phy_dev->priv pointer being misused in
> the future inside of this driver is a completely bogus argument by any
> measure.
>
> We have many cases all over the kernel tree, in drivers and elsewhere,
> where we encode integer states in what are normally pointer values
> when we need to maintain a small piece of state and don't need to do a
> full blown memory allocation to allocate a piece of extra memory to
> hold that state.

I would consider it a case of last resort, for when you need to avoid
a memory allocation for performance reasons, and you must use a
generic structure for a non-generic task.  This is something detected
in slow-path code, and is a generic task, so we're only hitting 1/3 of
those conditions.  I won't speculate as to how many of those other
cases in the tree I would find annoying.  ;)

Andy

^ permalink raw reply

* Re: [Patch] infiniband: check local reserved ports
From: Roland Dreier @ 2010-06-07 15:45 UTC (permalink / raw)
  To: Cong Wang
  Cc: linux-kernel, netdev, Tetsuo Handa, davem, linux-rdma, sean.hefty
In-Reply-To: <4C0CB610.4010305@redhat.com>

 > So this patch looks good for you? :)

Yes, will queue it up, thanks.
-- 
Roland Dreier <rolandd@cisco.com> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html

^ permalink raw reply

* Re: [PATCH net-next-2.6] pkt_sched: gen_estimator: kill est_lock rwlock
From: Eric Dumazet @ 2010-06-07 15:55 UTC (permalink / raw)
  To: Changli Gao; +Cc: David Miller, netdev, Stephen Hemminger, Jarek Poplawski
In-Reply-To: <1275924638.2545.121.camel@edumazet-laptop>

Le lundi 07 juin 2010 à 17:30 +0200, Eric Dumazet a écrit :
> Le lundi 07 juin 2010 à 22:53 +0800, Changli Gao a écrit :
> 
> > Hmm. I don't think it is correct.
> > 
> > Look at this code:
> > 
> > void xt_rateest_put(struct xt_rateest *est)
> > {
> >         mutex_lock(&xt_rateest_mutex);
> >         if (--est->refcnt == 0) {
> >                 hlist_del(&est->list);
> >                 gen_kill_estimator(&est->bstats, &est->rstats);
> >                 kfree(est);
> >         }
> >         mutex_unlock(&xt_rateest_mutex);
> > }
> > 
> > est will be released after gen_kill_estimator() returns. After est is
> > freed, there may still be some other CPUs running in the branch:
> >      if (bstats) {
> >          ....
> >      }
> > 
> 
> Oh well, I think I knew this from a previous attempt, but then I
> forgot :)
> 
> I'll provide an updated patch, so that no other attempt is tried next
> yer, thanks !
> 
> 

For your information, bug is already there before my patch.

So this est_lock is a wrong protection, in the sense its so convoluted
that nobody but you and me even noticed it was buggy in the first place.

(see commit 5d944c640b4 for a first patch)




^ permalink raw reply

* Re: [Linux-ATM-General] RX/close vcc race with solos/atmtcp/usbatm/he
From: David Woodhouse @ 2010-06-07 16:04 UTC (permalink / raw)
  To: chas3; +Cc: linux-atm-general, netdev
In-Reply-To: <201006071510.o57FA6uP010212@thirdoffive.cmf.nrl.navy.mil>

On Mon, 2010-06-07 at 11:10 -0400, Chas Williams (CONTRACTOR) wrote:
> In message <1275920035.17903.4998.camel@macbook.infradead.org>,David Woodhouse 
> writes:
> >On Mon, 2010-06-07 at 09:44 -0400, Chas Williams (CONTRACTOR) wrote:
> >> vcc's are really sockets, so you could just increase the refcount --
> >> sock_hold().
> >
> >There are rules about where we're allowed to call sock_hold(), and I
> >don't think our find_vcc() functions can be made to meet them.
> 
> if you are using find_vcc() then you should already have a lock on the
> hash table for the vccs.  you can safely increment the ref count at
> this point.

You can still hit the oops that way -- br2684_push() is setting
vcc->user_back to NULL before the final sock_put() anyway, and that's
what was causing the oops.

> >In that case I think we might as well stick with the RCU-like solution I
> >already implemented in the vcc close function -- which is just to wait
> >for the tasklet to complete, thus ensuring that it's no longer using the
> >defunct vcc.
> 
> the driver shouldnt close the vcc while the tasklet is running and
> using the vcc in question.  i guess the safest thing is to simply
> do as you are doing and not close while running the tasklet.

OK. :)

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation


^ permalink raw reply

* Re: [Linux-ATM-General] RX/close vcc race with solos/atmtcp/usbatm/he
From: Chas Williams (CONTRACTOR) @ 2010-06-07 16:37 UTC (permalink / raw)
  To: David Woodhouse; +Cc: linux-atm-general, netdev
In-Reply-To: <1275926647.17903.5077.camel@macbook.infradead.org>

In message <1275926647.17903.5077.camel@macbook.infradead.org>,David Woodhouse 
writes:
>You can still hit the oops that way -- br2684_push() is setting
>vcc->user_back to NULL before the final sock_put() anyway, and that's
>what was causing the oops.

i dont understand.  if you do a sock_hold() in find_vcc(), and then call
vcc->push() you should be able to call vcc->push() and then sock_put().
the solos driver is broken since the result find_vcc() does not do
anything to keep the resulting vcc from evaporating.  find_vcc() should
be taking a reference on the vcc, or the lock needs to held around
any usage of the results from find_vcc().

^ permalink raw reply

* [PATCH net-next-2.6 v2] pkt_sched: gen_estimator: kill est_lock rwlock
From: Eric Dumazet @ 2010-06-07 16:56 UTC (permalink / raw)
  To: Changli Gao, David Miller
  Cc: netdev, Stephen Hemminger, Jarek Poplawski, Patrick McHardy
In-Reply-To: <1275926151.2545.126.camel@edumazet-laptop>


> 
> For your information, bug is already there before my patch.
> 
> So this est_lock is a wrong protection, in the sense its so convoluted
> that nobody but you and me even noticed it was buggy in the first place.
> 
> (see commit 5d944c640b4 for a first patch)
> 
> 

Here is v2 of the patch.

Even if its a bug correction, I cooked it for net-next-2.6 since bug
probably never occured, and patch is too large to be sent to
net-2.6/linux-2.6 before testing.

Another bug comes from net/netfilter/xt_RATEEST.c : It apparently
calls gen_kill_estimator() / gen_new_estimator() without holding RTNL ?

So we should add another lock to protect things (est_root, elist[], ...)

David, I can send a net-2.6 patch for this one, since it should be small
enough. If yes, I'll respin this patch of course ;)

Thanks

[PATCH net-next-2.6] pkt_sched: gen_kill_estimator() rcu fixes

gen_kill_estimator() API is fundamentaly wrong, since caller should make
sure an RCU grace period is respected before freeing bstats or lock.

This was partially addressed in commit 5d944c640b4 (gen_estimator:
deadlock fix), but same problem exist for all gen_kill_estimator()
users.

Change its name to gen_kill_estimator_rcu() and change all callers to
use RCU grace period before freeing bstats container.

As a bonus, est_lock rwlock can be killed for good.

Special thanks to Changli Gao for commenting on a previous patch, this
made this bug clear to me.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/act_api.h              |    2 
 include/net/gen_stats.h            |    2 
 include/net/netfilter/xt_rateest.h |    1 
 net/core/gen_estimator.c           |   58 ++++++++++++---------------
 net/netfilter/xt_RATEEST.c         |   10 +++-
 net/sched/act_api.c                |    9 +++-
 net/sched/act_police.c             |   12 ++++-
 net/sched/sch_cbq.c                |   11 ++++-
 net/sched/sch_drr.c                |   11 ++++-
 net/sched/sch_generic.c            |    4 -
 net/sched/sch_hfsc.c               |   11 ++++-
 net/sched/sch_htb.c                |   11 ++++-
 12 files changed, 92 insertions(+), 50 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index c05fd71..bab385f 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -20,6 +20,7 @@ struct tcf_common {
 	struct gnet_stats_queue		tcfc_qstats;
 	struct gnet_stats_rate_est	tcfc_rate_est;
 	spinlock_t			tcfc_lock;
+	struct rcu_head			tcfc_rcu;
 };
 #define tcf_next	common.tcfc_next
 #define tcf_index	common.tcfc_index
@@ -32,6 +33,7 @@ struct tcf_common {
 #define tcf_qstats	common.tcfc_qstats
 #define tcf_rate_est	common.tcfc_rate_est
 #define tcf_lock	common.tcfc_lock
+#define tcf_rcu		common.tcfc_rcu
 
 struct tcf_police {
 	struct tcf_common	common;
diff --git a/include/net/gen_stats.h b/include/net/gen_stats.h
index fa15771..3545696 100644
--- a/include/net/gen_stats.h
+++ b/include/net/gen_stats.h
@@ -40,7 +40,7 @@ extern int gnet_stats_finish_copy(struct gnet_dump *d);
 extern int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
 			     struct gnet_stats_rate_est *rate_est,
 			     spinlock_t *stats_lock, struct nlattr *opt);
-extern void gen_kill_estimator(struct gnet_stats_basic_packed *bstats,
+extern void gen_kill_estimator_rcu(struct gnet_stats_basic_packed *bstats,
 			       struct gnet_stats_rate_est *rate_est);
 extern int gen_replace_estimator(struct gnet_stats_basic_packed *bstats,
 				 struct gnet_stats_rate_est *rate_est,
diff --git a/include/net/netfilter/xt_rateest.h b/include/net/netfilter/xt_rateest.h
index ddbf37e..5e14277 100644
--- a/include/net/netfilter/xt_rateest.h
+++ b/include/net/netfilter/xt_rateest.h
@@ -9,6 +9,7 @@ struct xt_rateest {
 	struct gnet_estimator		params;
 	struct gnet_stats_rate_est	rstats;
 	struct gnet_stats_basic_packed	bstats;
+	struct rcu_head			rcu;
 };
 
 extern struct xt_rateest *xt_rateest_lookup(const char *name);
diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index cf8e703..bd06b4a 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -102,9 +102,6 @@ struct gen_estimator_head
 
 static struct gen_estimator_head elist[EST_MAX_INTERVAL+1];
 
-/* Protects against NULL dereference */
-static DEFINE_RWLOCK(est_lock);
-
 /* Protects against soft lockup during large deletion */
 static struct rb_root est_root = RB_ROOT;
 
@@ -115,29 +112,25 @@ static void est_timer(unsigned long arg)
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(e, &elist[idx].list, list) {
-		u64 nbytes;
-		u64 brate;
-		u32 npackets;
-		u32 rate;
+		struct gnet_stats_basic_packed *bstats;
 
 		spin_lock(e->stats_lock);
-		read_lock(&est_lock);
-		if (e->bstats == NULL)
-			goto skip;
-
-		nbytes = e->bstats->bytes;
-		npackets = e->bstats->packets;
-		brate = (nbytes - e->last_bytes)<<(7 - idx);
-		e->last_bytes = nbytes;
-		e->avbps += (brate >> e->ewma_log) - (e->avbps >> e->ewma_log);
-		e->rate_est->bps = (e->avbps+0xF)>>5;
-
-		rate = (npackets - e->last_packets)<<(12 - idx);
-		e->last_packets = npackets;
-		e->avpps += (rate >> e->ewma_log) - (e->avpps >> e->ewma_log);
-		e->rate_est->pps = (e->avpps+0x1FF)>>10;
-skip:
-		read_unlock(&est_lock);
+		bstats = rcu_dereference(e->bstats);
+		if (bstats) {
+			u64 nbytes = ACCESS_ONCE(bstats->bytes);
+			u32 npackets = ACCESS_ONCE(bstats->packets);
+			u64 brate = (nbytes - e->last_bytes)<<(7 - idx);
+			u32 rate;
+
+			e->last_bytes = nbytes;
+			e->avbps += (brate >> e->ewma_log) - (e->avbps >> e->ewma_log);
+			e->rate_est->bps = (e->avbps+0xF)>>5;
+
+			rate = (npackets - e->last_packets)<<(12 - idx);
+			e->last_packets = npackets;
+			e->avpps += (rate >> e->ewma_log) - (e->avpps >> e->ewma_log);
+			e->rate_est->pps = (e->avpps+0x1FF)>>10;
+		}
 		spin_unlock(e->stats_lock);
 	}
 
@@ -255,15 +248,18 @@ static void __gen_kill_estimator(struct rcu_head *head)
 }
 
 /**
- * gen_kill_estimator - remove a rate estimator
+ * gen_kill_estimator_rcu - remove a rate estimator
  * @bstats: basic statistics
  * @rate_est: rate estimator statistics
  *
  * Removes the rate estimator specified by &bstats and &rate_est.
  *
- * NOTE: Called under rtnl_mutex
+ * NOTES:
+ *	Called under rtnl_mutex
+ *	Because est_timer() requirements (RCU protection), caller
+ *	should respect an RCU grace period before freeing bstats
  */
-void gen_kill_estimator(struct gnet_stats_basic_packed *bstats,
+void gen_kill_estimator_rcu(struct gnet_stats_basic_packed *bstats,
 			struct gnet_stats_rate_est *rate_est)
 {
 	struct gen_estimator *e;
@@ -271,15 +267,13 @@ void gen_kill_estimator(struct gnet_stats_basic_packed *bstats,
 	while ((e = gen_find_node(bstats, rate_est))) {
 		rb_erase(&e->node, &est_root);
 
-		write_lock_bh(&est_lock);
-		e->bstats = NULL;
-		write_unlock_bh(&est_lock);
+		rcu_assign_pointer(e->bstats, NULL);
 
 		list_del_rcu(&e->list);
 		call_rcu(&e->e_rcu, __gen_kill_estimator);
 	}
 }
-EXPORT_SYMBOL(gen_kill_estimator);
+EXPORT_SYMBOL(gen_kill_estimator_rcu);
 
 /**
  * gen_replace_estimator - replace rate estimator configuration
@@ -297,7 +291,7 @@ int gen_replace_estimator(struct gnet_stats_basic_packed *bstats,
 			  struct gnet_stats_rate_est *rate_est,
 			  spinlock_t *stats_lock, struct nlattr *opt)
 {
-	gen_kill_estimator(bstats, rate_est);
+	gen_kill_estimator_rcu(bstats, rate_est);
 	return gen_new_estimator(bstats, rate_est, stats_lock, opt);
 }
 EXPORT_SYMBOL(gen_replace_estimator);
diff --git a/net/netfilter/xt_RATEEST.c b/net/netfilter/xt_RATEEST.c
index 69c01e1..55747cd 100644
--- a/net/netfilter/xt_RATEEST.c
+++ b/net/netfilter/xt_RATEEST.c
@@ -60,13 +60,18 @@ struct xt_rateest *xt_rateest_lookup(const char *name)
 }
 EXPORT_SYMBOL_GPL(xt_rateest_lookup);
 
+static void xt_rateeset_free_rcu(struct rcu_head *head)
+{
+	kfree(container_of(head, struct xt_rateest, rcu));
+}
+
 void xt_rateest_put(struct xt_rateest *est)
 {
 	mutex_lock(&xt_rateest_mutex);
 	if (--est->refcnt == 0) {
 		hlist_del(&est->list);
-		gen_kill_estimator(&est->bstats, &est->rstats);
-		kfree(est);
+		gen_kill_estimator_rcu(&est->bstats, &est->rstats);
+		call_rcu(&est->rcu, xt_rateeset_free_rcu);
 	}
 	mutex_unlock(&xt_rateest_mutex);
 }
@@ -179,6 +184,7 @@ static int __init xt_rateest_tg_init(void)
 static void __exit xt_rateest_tg_fini(void)
 {
 	xt_unregister_target(&xt_rateest_tg_reg);
+	rcu_barrier(); /* Wait for completion of call_rcu()'s */
 }
 
 
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 972378f..597b260 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -26,6 +26,11 @@
 #include <net/act_api.h>
 #include <net/netlink.h>
 
+static void tcf_common_free_rcu(struct rcu_head *head)
+{
+	kfree(container_of(head, struct tcf_common, tcfc_rcu));
+}
+
 void tcf_hash_destroy(struct tcf_common *p, struct tcf_hashinfo *hinfo)
 {
 	unsigned int h = tcf_hash(p->tcfc_index, hinfo->hmask);
@@ -36,9 +41,9 @@ void tcf_hash_destroy(struct tcf_common *p, struct tcf_hashinfo *hinfo)
 			write_lock_bh(hinfo->lock);
 			*p1p = p->tcfc_next;
 			write_unlock_bh(hinfo->lock);
-			gen_kill_estimator(&p->tcfc_bstats,
+			gen_kill_estimator_rcu(&p->tcfc_bstats,
 					   &p->tcfc_rate_est);
-			kfree(p);
+			call_rcu(&p->tcfc_rcu, tcf_common_free_rcu);
 			return;
 		}
 	}
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 654f73d..c3afcba 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -97,6 +97,11 @@ nla_put_failure:
 	goto done;
 }
 
+static void tcf_police_free_rcu(struct rcu_head *head)
+{
+	kfree(container_of(head, struct tcf_police, tcf_rcu));
+}
+
 static void tcf_police_destroy(struct tcf_police *p)
 {
 	unsigned int h = tcf_hash(p->tcf_index, POL_TAB_MASK);
@@ -107,13 +112,13 @@ static void tcf_police_destroy(struct tcf_police *p)
 			write_lock_bh(&police_lock);
 			*p1p = p->tcf_next;
 			write_unlock_bh(&police_lock);
-			gen_kill_estimator(&p->tcf_bstats,
-					   &p->tcf_rate_est);
+			gen_kill_estimator_rcu(&p->tcf_bstats,
+					       &p->tcf_rate_est);
 			if (p->tcfp_R_tab)
 				qdisc_put_rtab(p->tcfp_R_tab);
 			if (p->tcfp_P_tab)
 				qdisc_put_rtab(p->tcfp_P_tab);
-			kfree(p);
+			call_rcu(&p->tcf_rcu, tcf_police_free_rcu);
 			return;
 		}
 	}
@@ -397,6 +402,7 @@ static void __exit
 police_cleanup_module(void)
 {
 	tcf_unregister_action(&act_police_ops);
+	rcu_barrier(); /* Wait for completion of call_rcu()'s */
 }
 
 module_init(police_init_module);
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 28c01ef..d61b3b3 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -140,6 +140,7 @@ struct cbq_class
 	int			filters;
 
 	struct cbq_class 	*defaults[TC_PRIO_MAX+1];
+	struct rcu_head		rcu;
 };
 
 struct cbq_sched_data
@@ -1671,6 +1672,11 @@ static unsigned long cbq_get(struct Qdisc *sch, u32 classid)
 	return 0;
 }
 
+static void cbq_class_free_rcu(struct rcu_head *head)
+{
+	kfree(container_of(head, struct cbq_class, rcu));
+}
+
 static void cbq_destroy_class(struct Qdisc *sch, struct cbq_class *cl)
 {
 	struct cbq_sched_data *q = qdisc_priv(sch);
@@ -1680,9 +1686,9 @@ static void cbq_destroy_class(struct Qdisc *sch, struct cbq_class *cl)
 	tcf_destroy_chain(&cl->filter_list);
 	qdisc_destroy(cl->q);
 	qdisc_put_rtab(cl->R_tab);
-	gen_kill_estimator(&cl->bstats, &cl->rate_est);
+	gen_kill_estimator_rcu(&cl->bstats, &cl->rate_est);
 	if (cl != &q->link)
-		kfree(cl);
+		call_rcu(&cl->rcu, cbq_class_free_rcu);
 }
 
 static void
@@ -2066,6 +2072,7 @@ static int __init cbq_module_init(void)
 static void __exit cbq_module_exit(void)
 {
 	unregister_qdisc(&cbq_qdisc_ops);
+	rcu_barrier(); /* Wait for completion of call_rcu()'s */
 }
 module_init(cbq_module_init)
 module_exit(cbq_module_exit)
diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c
index b74046a..f0d5aae 100644
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -31,6 +31,7 @@ struct drr_class {
 
 	u32				quantum;
 	u32				deficit;
+	struct rcu_head			rcu;
 };
 
 struct drr_sched {
@@ -136,11 +137,16 @@ static int drr_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
 	return 0;
 }
 
+static void drr_class_free_rcu(struct rcu_head *head)
+{
+	kfree(container_of(head, struct drr_class, rcu));
+}
+
 static void drr_destroy_class(struct Qdisc *sch, struct drr_class *cl)
 {
-	gen_kill_estimator(&cl->bstats, &cl->rate_est);
+	gen_kill_estimator_rcu(&cl->bstats, &cl->rate_est);
 	qdisc_destroy(cl->qdisc);
-	kfree(cl);
+	call_rcu(&cl->rcu, drr_class_free_rcu);
 }
 
 static int drr_delete_class(struct Qdisc *sch, unsigned long arg)
@@ -522,6 +528,7 @@ static int __init drr_init(void)
 static void __exit drr_exit(void)
 {
 	unregister_qdisc(&drr_qdisc_ops);
+	rcu_barrier(); /* Wait for completion of call_rcu()'s */
 }
 
 module_init(drr_init);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index d20fcd2..22120d4 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -632,7 +632,7 @@ void qdisc_destroy(struct Qdisc *qdisc)
 
 	qdisc_put_stab(qdisc->stab);
 #endif
-	gen_kill_estimator(&qdisc->bstats, &qdisc->rate_est);
+	gen_kill_estimator_rcu(&qdisc->bstats, &qdisc->rate_est);
 	if (ops->reset)
 		ops->reset(qdisc);
 	if (ops->destroy)
@@ -643,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc)
 
 	kfree_skb(qdisc->gso_skb);
 	/*
-	 * gen_estimator est_timer() might access qdisc->q.lock,
+	 * gen_estimator est_timer() might access qdisc->q.lock or bstats
 	 * wait a RCU grace period before freeing qdisc.
 	 */
 	call_rcu(&qdisc->rcu_head, qdisc_rcu_free);
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index abd904b..c433aa8 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -174,6 +174,7 @@ struct hfsc_class
 	unsigned long	cl_vtperiod;	/* vt period sequence number */
 	unsigned long	cl_parentperiod;/* parent's vt period sequence number*/
 	unsigned long	cl_nactive;	/* number of active children */
+	struct rcu_head rcu;
 };
 
 struct hfsc_sched
@@ -1111,6 +1112,11 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
 	return 0;
 }
 
+static void hfsc_class_free_rcu(struct rcu_head *head)
+{
+	kfree(container_of(head, struct hfsc_class, rcu));
+}
+
 static void
 hfsc_destroy_class(struct Qdisc *sch, struct hfsc_class *cl)
 {
@@ -1118,9 +1124,9 @@ hfsc_destroy_class(struct Qdisc *sch, struct hfsc_class *cl)
 
 	tcf_destroy_chain(&cl->filter_list);
 	qdisc_destroy(cl->qdisc);
-	gen_kill_estimator(&cl->bstats, &cl->rate_est);
+	gen_kill_estimator_rcu(&cl->bstats, &cl->rate_est);
 	if (cl != &q->root)
-		kfree(cl);
+		call_rcu(&cl->rcu, hfsc_class_free_rcu);
 }
 
 static int
@@ -1742,6 +1748,7 @@ static void __exit
 hfsc_cleanup(void)
 {
 	unregister_qdisc(&hfsc_qdisc_ops);
+	rcu_barrier(); /* Wait for completion of call_rcu()'s */
 }
 
 MODULE_LICENSE("GPL");
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 0b52b8d..b7b5e29 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -123,6 +123,7 @@ struct htb_class {
 	psched_tdiff_t mbuffer;	/* max wait time */
 	long tokens, ctokens;	/* current number of tokens */
 	psched_time_t t_c;	/* checkpoint time */
+	struct rcu_head rcu;
 };
 
 struct htb_sched {
@@ -1190,18 +1191,23 @@ static void htb_parent_to_leaf(struct htb_sched *q, struct htb_class *cl,
 	parent->cmode = HTB_CAN_SEND;
 }
 
+static void htb_class_free_rcu(struct rcu_head *head)
+{
+	kfree(container_of(head, struct htb_class, rcu));
+}
+
 static void htb_destroy_class(struct Qdisc *sch, struct htb_class *cl)
 {
 	if (!cl->level) {
 		WARN_ON(!cl->un.leaf.q);
 		qdisc_destroy(cl->un.leaf.q);
 	}
-	gen_kill_estimator(&cl->bstats, &cl->rate_est);
+	gen_kill_estimator_rcu(&cl->bstats, &cl->rate_est);
 	qdisc_put_rtab(cl->rate);
 	qdisc_put_rtab(cl->ceil);
 
 	tcf_destroy_chain(&cl->filter_list);
-	kfree(cl);
+	call_rcu(&cl->rcu, htb_class_free_rcu);
 }
 
 static void htb_destroy(struct Qdisc *sch)
@@ -1573,6 +1579,7 @@ static int __init htb_module_init(void)
 static void __exit htb_module_exit(void)
 {
 	unregister_qdisc(&htb_qdisc_ops);
+	rcu_barrier(); /* Wait for completion of call_rcu()'s */
 }
 
 module_init(htb_module_init)



^ permalink raw reply related

* [PATCH net-2.6] pkt_sched: gen_estimator: add a new lock
From: Eric Dumazet @ 2010-06-07 17:18 UTC (permalink / raw)
  To: Changli Gao, David Miller
  Cc: netdev, Stephen Hemminger, Jarek Poplawski, Patrick McHardy
In-Reply-To: <1275929761.2545.159.camel@edumazet-laptop>

Le lundi 07 juin 2010 à 18:56 +0200, Eric Dumazet a écrit :
> > 
> > For your information, bug is already there before my patch.
> > 
> > So this est_lock is a wrong protection, in the sense its so convoluted
> > that nobody but you and me even noticed it was buggy in the first place.
> > 
> > (see commit 5d944c640b4 for a first patch)
> > 
> > 
> 
> Here is v2 of the patch.
> 
> Even if its a bug correction, I cooked it for net-next-2.6 since bug
> probably never occured, and patch is too large to be sent to
> net-2.6/linux-2.6 before testing.
> 
> Another bug comes from net/netfilter/xt_RATEEST.c : It apparently
> calls gen_kill_estimator() / gen_new_estimator() without holding RTNL ?
> 
> So we should add another lock to protect things (est_root, elist[], ...)
> 
> David, I can send a net-2.6 patch for this one, since it should be small
> enough. If yes, I'll respin this patch of course ;)

[PATCH net-2.6] pkt_sched: gen_estimator: add a new lock

gen_kill_estimator() / gen_new_estimator() is not always called with
RTNL held.

net/netfilter/xt_RATEEST.c is one user of these API that do not hold
RTNL, so random corruptions can occur between "tc" and "iptables"

Add a new fine grained lock instead of trying to use RTNL in xt_RATEEST

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/core/gen_estimator.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index cf8e703..3d11203 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -107,6 +107,7 @@ static DEFINE_RWLOCK(est_lock);
 
 /* Protects against soft lockup during large deletion */
 static struct rb_root est_root = RB_ROOT;
+static DEFINE_SPINLOCK(est_tree_lock);
 
 static void est_timer(unsigned long arg)
 {
@@ -201,7 +202,6 @@ struct gen_estimator *gen_find_node(const struct gnet_stats_basic_packed *bstats
  *
  * Returns 0 on success or a negative error code.
  *
- * NOTE: Called under rtnl_mutex
  */
 int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
 		      struct gnet_stats_rate_est *rate_est,
@@ -222,6 +222,7 @@ int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
 	if (est == NULL)
 		return -ENOBUFS;
 
+	spin_lock(&est_tree_lock);
 	idx = parm->interval + 2;
 	est->bstats = bstats;
 	est->rate_est = rate_est;
@@ -242,6 +243,7 @@ int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
 
 	list_add_rcu(&est->list, &elist[idx].list);
 	gen_add_node(est);
+	spin_unlock(&est_tree_lock);
 
 	return 0;
 }
@@ -261,13 +263,13 @@ static void __gen_kill_estimator(struct rcu_head *head)
  *
  * Removes the rate estimator specified by &bstats and &rate_est.
  *
- * NOTE: Called under rtnl_mutex
  */
 void gen_kill_estimator(struct gnet_stats_basic_packed *bstats,
 			struct gnet_stats_rate_est *rate_est)
 {
 	struct gen_estimator *e;
 
+	spin_lock(&est_tree_lock);
 	while ((e = gen_find_node(bstats, rate_est))) {
 		rb_erase(&e->node, &est_root);
 
@@ -278,6 +280,7 @@ void gen_kill_estimator(struct gnet_stats_basic_packed *bstats,
 		list_del_rcu(&e->list);
 		call_rcu(&e->e_rcu, __gen_kill_estimator);
 	}
+	spin_unlock(&est_tree_lock);
 }
 EXPORT_SYMBOL(gen_kill_estimator);
 



^ permalink raw reply related

* 2.6.35-rc2-git1 - include/linux/cgroup.h:534 invoked rcu_dereference_check() without protection!
From: Miles Lane @ 2010-06-07 18:14 UTC (permalink / raw)
  To: paulmck
  Cc: Vivek Goyal, Eric Paris, Lai Jiangshan, Ingo Molnar,
	Peter Zijlstra, LKML, nauman, eric.dumazet, netdev, Jens Axboe,
	Gui Jianfeng, Li Zefan, Johannes Berg

Hi All,

I just reproduced a warning I reported quite a while ago.  Is a patch
for this in the pipeline?

[    0.167267] [ INFO: suspicious rcu_dereference_check() usage. ]
[    0.167396] ---------------------------------------------------
[    0.167526] include/linux/cgroup.h:534 invoked
rcu_dereference_check() without protection!
[    0.167728]
[    0.167729] other info that might help us debug this:
[    0.167731]
[    0.168092]
[    0.168093] rcu_scheduler_active = 1, debug_locks = 1
[    0.168337] no locks held by watchdog/0/5.
[    0.168462]
[    0.168463] stack backtrace:
[    0.168704] Pid: 5, comm: watchdog/0 Not tainted 2.6.35-rc2-git1 #8
[    0.168834] Call Trace:
[    0.168965]  [<ffffffff81064e9c>] lockdep_rcu_dereference+0x9d/0xa5
[    0.169100]  [<ffffffff8102c1ce>] task_subsys_state+0x59/0x70
[    0.169232]  [<ffffffff8103189b>] __sched_setscheduler+0x19d/0x2f8
[    0.169365]  [<ffffffff8102a5ef>] ? need_resched+0x1e/0x28
[    0.169497]  [<ffffffff813c7d01>] ? schedule+0x586/0x619
[    0.169628]  [<ffffffff81081c33>] ? watchdog+0x0/0x8c
[    0.169758]  [<ffffffff81031a11>] sched_setscheduler+0xe/0x10
[    0.169889]  [<ffffffff81081c5d>] watchdog+0x2a/0x8c
[    0.170010]  [<ffffffff81081c33>] ? watchdog+0x0/0x8c
[    0.170141]  [<ffffffff81054a82>] kthread+0x89/0x91
[    0.170274]  [<ffffffff81003054>] kernel_thread_helper+0x4/0x10
[    0.170405]  [<ffffffff813ca480>] ? restore_args+0x0/0x30
[    0.170536]  [<ffffffff810549f9>] ? kthread+0x0/0x91
[    0.170667]  [<ffffffff81003050>] ? kernel_thread_helper+0x0/0x10
[    0.176751] lockdep: fixing up alternatives.

^ permalink raw reply

* 2.6.35-rc2-git1 - lib/idr.c:605 invoked rcu_dereference_check() without protection!
From: Miles Lane @ 2010-06-07 18:23 UTC (permalink / raw)
  To: paulmck
  Cc: Vivek Goyal, Eric Paris, David Woodhouse, Lai Jiangshan,
	Ingo Molnar, Peter Zijlstra, LKML, nauman, eric.dumazet, netdev,
	Jens Axboe, Gui Jianfeng, Li Zefan, Johannes Berg

[    2.677955] [ INFO: suspicious rcu_dereference_check() usage. ]
[    2.679089] ---------------------------------------------------
[    2.680276] lib/idr.c:605 invoked rcu_dereference_check() without protection!
[    2.681499]
[    2.681500] other info that might help us debug this:
[    2.681501]
[    2.685509]
[    2.685510] rcu_scheduler_active = 1, debug_locks = 1
[    2.688221] 1 lock held by swapper/1:
[    2.689587]  #0:  (mtd_table_mutex){+.+...}, at:
[<ffffffff812bea45>] register_mtd_user+0x1a/0x69
[    2.691096]
[    2.691098] stack backtrace:
[    2.694059] Pid: 1, comm: swapper Not tainted 2.6.35-rc2-git1 #8
[    2.695601] Call Trace:
[    2.697243]  [<ffffffff81064e9c>] lockdep_rcu_dereference+0x9d/0xa5
[    2.698868]  [<ffffffff811b9c86>] idr_get_next+0x60/0x124
[    2.700556]  [<ffffffff812be779>] __mtd_next_device+0x1b/0x1d
[    2.702238]  [<ffffffff812bea7c>] register_mtd_user+0x51/0x69
[    2.703964]  [<ffffffff816cca45>] init_mtdchar+0xb3/0xd3
[    2.705686]  [<ffffffff816cc992>] ? init_mtdchar+0x0/0xd3
[    2.707470]  [<ffffffff810001ef>] do_one_initcall+0x59/0x14e
[    2.709255]  [<ffffffff816a768a>] kernel_init+0x144/0x1ce
[    2.711082]  [<ffffffff81003054>] kernel_thread_helper+0x4/0x10
[    2.712862]  [<ffffffff813ca480>] ? restore_args+0x0/0x30
[    2.714647]  [<ffffffff816a7546>] ? kernel_init+0x0/0x1ce
[    2.716415]  [<ffffffff81003050>] ? kernel_thread_helper+0x0/0x10

^ permalink raw reply

* 2.6.35-rc2-git1 - net/mac80211/sta_info.c:125 invoked rcu_dereference_check() without protection!
From: Miles Lane @ 2010-06-07 18:25 UTC (permalink / raw)
  To: paulmck
  Cc: Vivek Goyal, Eric Paris, Lai Jiangshan, Ingo Molnar,
	Peter Zijlstra, LKML, nauman, eric.dumazet, netdev, Jens Axboe,
	Gui Jianfeng, Li Zefan, Johannes Berg

[   43.478812] [ INFO: suspicious rcu_dereference_check() usage. ]
[   43.478815] ---------------------------------------------------
[   43.478820] net/mac80211/sta_info.c:125 invoked
rcu_dereference_check() without protection!
[   43.478824]
[   43.478824] other info that might help us debug this:
[   43.478826]
[   43.478829]
[   43.478830] rcu_scheduler_active = 1, debug_locks = 1
[   43.478834] no locks held by NetworkManager/4017.
[   43.478836]
[   43.478837] stack backtrace:
[   43.478842] Pid: 4017, comm: NetworkManager Not tainted 2.6.35-rc2-git1 #8
[   43.478846] Call Trace:
[   43.478849]  <IRQ>  [<ffffffff81064e9c>] lockdep_rcu_dereference+0x9d/0xa5
[   43.478876]  [<ffffffffa010cb3c>] sta_info_get_bss+0x71/0x12d [mac80211]
[   43.478889]  [<ffffffffa010cc0d>] ieee80211_find_sta+0x15/0x2f [mac80211]
[   43.478902]  [<ffffffffa019ae16>] iwlagn_tx_queue_reclaim+0xe7/0x1bb [iwlagn]
[   43.478909]  [<ffffffff8106530b>] ? mark_lock+0x2d/0x262
[   43.478920]  [<ffffffffa019e270>] iwlagn_rx_reply_tx+0x4cd/0x58a [iwlagn]
[   43.478928]  [<ffffffff811ca313>] ? is_swiotlb_buffer+0x2e/0x3b
[   43.478937]  [<ffffffffa0192ec2>] iwl_rx_handle+0x161/0x2bf [iwlagn]
[   43.478946]  [<ffffffffa019370a>] iwl_irq_tasklet+0x2eb/0x408 [iwlagn]
[   43.478953]  [<ffffffff8103f892>] tasklet_action+0xa7/0x10f
[   43.478960]  [<ffffffff810404cf>] __do_softirq+0x148/0x25a
[   43.478966]  [<ffffffff8100314c>] call_softirq+0x1c/0x28
[   43.478972]  [<ffffffff81004d20>] do_softirq+0x38/0x80
[   43.478977]  [<ffffffff8103ffd0>] irq_exit+0x45/0x94
[   43.478983]  [<ffffffff81004465>] do_IRQ+0xad/0xc4
[   43.478989]  [<ffffffff813ca3d3>] ret_from_intr+0x0/0xf
[   43.478993]  <EOI>

^ permalink raw reply

* Re: [PATCH net-next 00/11] tg3: Bugfixes and 5719 support
From: Matt Carlson @ 2010-06-07 18:40 UTC (permalink / raw)
  To: David Miller; +Cc: Matthew Carlson, netdev@vger.kernel.org, andy@greyhouse.net
In-Reply-To: <20100607.002805.232920368.davem@davemloft.net>

On Mon, Jun 07, 2010 at 12:28:05AM -0700, David Miller wrote:
> From: "Matt Carlson" <mcarlson@broadcom.com>
> Date: Sun, 6 Jun 2010 20:59:34 -0700
> 
> > On Sun, Jun 06, 2010 at 06:00:29PM -0700, David Miller wrote:
> > The programming manual says this bit prevents a tx state machine lockup
> > due to tx mbuf corruption.  The conditions under which the tx mbufs get
> > corrupted is complicated, but the net effect of this bit is that the
> > RDMA engine acts a little more conservatively.
> 
> That last phrase is a good description and could have been used to
> better name the register bit macro in question.  You're basically
> confirming that you named the register bit too tersely.
> 
> > The problem is that register definitions can change from asic rev to
> > asic rev.
> 
> Frankly, this kind of justification really ticks me off.
> 
> How the heck does that make a difference?  Describe which chip revs
> the register bit is valid for in a comment for crying out loud.
> 
> Document your hardware properly, not selectively or where you see
> it fit to do so.
> 
> It's bad enough that you guys don't publish your hardware specs.
> As a consequence as much knowledge as possible must go into the
> driver sources.  Any piece of information you take away is bad.
> 
> There are tons of chip register bits in this driver already which are
> only valid on certain hardware revs.  In fact, there are many.  Yet
> they are all still there in tg3.h whether they are used by the driver
> or not.  In fact, if I recall correctly, the DMA burst size controls
> are pretty much different on every single rev of the chip.
> 
> Documenting where for which chips a register bit is valid is
> pervasively done elsewhere, and is nothing new.  Your driver and your
> hardware is not special.

Maybe we don't have to take this path.

Our hardware specs are available.  You can find them at:

http://www.broadcom.com/support/ethernet_nic/open_source.php?source=top

The specs for the latest asic revs won't be there yet, but I'm sure
they'll get there.

In light of this, do you still feel we need to take this stance with
the tg3 driver?


^ permalink raw reply

* [PATCH 2/2] [PATCH] ucc_geth: fix for RX skb buffers recycling
From: Sergey Matyukevich @ 2010-06-07 18:38 UTC (permalink / raw)
  To: netdev; +Cc: Li Yang, Anton Vorontsov, Sergey Matyukevich
In-Reply-To: <1275935894-30483-1-git-send-email-geomatsi@gmail.com>

This patch implements a proper recycling of skb buffers belonging to RX error
path. The suggested fix actually follows the recycling scheme implemented for
TX skb buffers in the same driver (see 'ucc_geth_tx' function): skb buffers
are checked by 'skb_recycle_check' function and deleted if can't be recycled.

This problem in recycling of skb buffers was discovered by accident in a setup
when ethernet interface on one link end was full-duplex while another was
half-duplex. In this case numerous corrupted frames were received by
full-duplex interface due to late collisions. RX skb buffers with error
frames were not properly recycled, that is why overflow occured from time to
time on the next use of those buffers. Here is example of crush dump:

[    2.587886] Freeing unused kernel memory: 148k init
[    3.563785] PHY: mdio@80103720:00 - Link is Up - 100/Full
[    5.440474] skb_over_panic: text:c01bf710 len:1514 put:1514 head:cf9d4000 data:cf9d4040 tail:0xcf9d466a end:0xcf9d4660 dev:eth0
[    5.452042] ------------[ cut here ]------------
[    5.456654] kernel BUG at net/core/skbuff.c:127!
[    5.461270] Oops: Exception in kernel mode, sig: 5 [#1]
[    5.469099] Modules linked in:
[    5.472155] NIP: c01cd4f4 LR: c01cd4f4 CTR: c01919a4
[    5.477117] REGS: c0325cf0 TRAP: 0700   Not tainted  (2.6.33)
[    5.482854] MSR: 00029032 <EE,ME,CE,IR,DR>  CR: 22048022  XER: 20000000
[    5.489514] TASK = c030b3e8[0] 'swapper' THREAD: c0324000
[    5.494730] GPR00: c01cd4f4 c0325da0 c030b3e8 00000089 00002e7d ffffffff c018ef88 00002e7d
[    5.503126] GPR08: 00000030 c0323290 00002e7d c032b5a8 82048022 1001a100 c02c0340 c026dd84
[    5.511519] GPR16: 00000000 c033c4a8 00000000 00000000 00000001 00000000 cf88357c 0000003f
[    5.519915] GPR24: cf8832e0 cf883000 cf8832e0 cf883550 1c0005ee 000005ea cf96eaf0 cf9d4080
[    5.528518] NIP [c01cd4f4] skb_over_panic+0x48/0x5c
[    5.533395] LR [c01cd4f4] skb_over_panic+0x48/0x5c
[    5.538177] Call Trace:
[    5.540628] [c0325da0] [c01cd4f4] skb_over_panic+0x48/0x5c (unreliable)
[    5.547251] [c0325db0] [c01cec10] skb_put+0x5c/0x60
[    5.552145] [c0325dc0] [c01bf710] ucc_geth_poll+0x404/0x478
[    5.557735] [c0325e20] [c01daef4] net_rx_action+0x9c/0x1a4

Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
---
 drivers/net/ucc_geth.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 538148a..033b7d6 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -3214,8 +3214,13 @@ static int ucc_geth_rx(struct ucc_geth_private *ugeth, u8 rxQ, int rx_work_limit
 				ugeth_err("%s, %d: ERROR!!! skb - 0x%08x",
 					   __func__, __LINE__, (u32) skb);
 			if (skb) {
-				skb->data = skb->head + NET_SKB_PAD;
-				__skb_queue_head(&ugeth->rx_recycle, skb);
+				if (skb_queue_len(&ugeth->rx_recycle) < RX_BD_RING_LEN &&
+						skb_recycle_check(skb,
+							ugeth->ug_info->uf_info.max_rx_buf_length +
+							UCC_GETH_RX_DATA_BUF_ALIGNMENT))
+					__skb_queue_head(&ugeth->rx_recycle, skb);
+				else
+					dev_kfree_skb(skb);
 			}
 
 			ugeth->rx_skbuff[rxQ][ugeth->skb_currx[rxQ]] = NULL;
-- 
1.6.2.5


^ permalink raw reply related

* [PATCH 1/2] [PATCH] ucc_geth driver: add ioctl
From: Sergey Matyukevich @ 2010-06-07 18:38 UTC (permalink / raw)
  To: netdev; +Cc: Li Yang, Anton Vorontsov, Sergey Matyukevich

ioctl operation (ndo_do_ioctl) is added to make mii-tools work

Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
---
 drivers/net/ucc_geth.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 4a34833..538148a 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -3702,6 +3702,19 @@ static phy_interface_t to_phy_interface(const char *phy_connection_type)
 	return PHY_INTERFACE_MODE_MII;
 }
 
+static int ucc_geth_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+	struct ucc_geth_private *ugeth = netdev_priv(dev);
+
+	if (!netif_running(dev))
+		return -EINVAL;
+
+	if (!ugeth->phydev)
+		return -ENODEV;
+
+	return phy_mii_ioctl(ugeth->phydev, if_mii(rq), cmd);
+}
+
 static const struct net_device_ops ucc_geth_netdev_ops = {
 	.ndo_open		= ucc_geth_open,
 	.ndo_stop		= ucc_geth_close,
@@ -3711,6 +3724,7 @@ static const struct net_device_ops ucc_geth_netdev_ops = {
 	.ndo_change_mtu		= eth_change_mtu,
 	.ndo_set_multicast_list	= ucc_geth_set_multi,
 	.ndo_tx_timeout		= ucc_geth_timeout,
+	.ndo_do_ioctl		= ucc_geth_ioctl,
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= ucc_netpoll,
 #endif
-- 
1.6.2.5


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox