Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] neigh: reorder struct neighbour fields
From: Eric Dumazet @ 2010-10-11 22:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <1286834567.30423.102.camel@edumazet-laptop>

Le mardi 12 octobre 2010 à 00:02 +0200, Eric Dumazet a écrit :
> Here is the followup patch.
> 
> Thanks !
> 

Oops, this was an old version, the up2date ones also took care of "used"
field.

I guess its time for a sleep, sorry again.


[PATCH net-next V2] neigh: reorder struct neighbour fields

(refcnt) and (ha_lock, ha, used, dev, output, ops, primary_key) should
be placed on a separate cache lines.

refcnt can be often written, while other fields are mostly read.

This gave me good result on stress test :

before:

real    0m45.570s
user    0m15.525s
sys     9m56.669s

After:

real    0m41.841s
user    0m15.261s
sys     8m45.949s

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index f04e7a2..55590ab 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -94,8 +94,6 @@ struct neighbour {
 	struct neighbour __rcu	*next;
 	struct neigh_table	*tbl;
 	struct neigh_parms	*parms;
-	struct net_device	*dev;
-	unsigned long		used;
 	unsigned long		confirmed;
 	unsigned long		updated;
 	__u8			flags;
@@ -103,16 +101,18 @@ struct neighbour {
 	__u8			type;
 	__u8			dead;
 	atomic_t		refcnt;
+	struct sk_buff_head	arp_queue;
+	struct timer_list	timer;
+	unsigned long		used;
 	atomic_t		probes;
 	rwlock_t		lock;
 	seqlock_t		ha_lock;
 	unsigned char		ha[ALIGN(MAX_ADDR_LEN, sizeof(unsigned long))];
 	struct hh_cache		*hh;
 	int			(*output)(struct sk_buff *skb);
-	struct sk_buff_head	arp_queue;
-	struct timer_list	timer;
 	const struct neigh_ops	*ops;
 	struct rcu_head		rcu;
+	struct net_device	*dev;
 	u8			primary_key[0];
 };
 



^ permalink raw reply related

* Re: tbf/htb qdisc limitations
From: Steven Brudenell @ 2010-10-11 22:27 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev
In-Reply-To: <4CB1A22B.9090701@gmail.com>

> I'm not sure you checked how the "burst" works, and doubt it could
> help you here. Anyway, do you think: rate 2KB/s with burst 5GB
> config would be useful for you?

i actually really do want something like 2KB/s with 5GB burst
(modifying parameters such that burst + rate * 30 days <= 5GB, but you
get the idea). but this isn't possible given the implementation:

i see that overall, virtual "tokens" map to "scheduler ticks", where a
"scheduler tick" is 64ns. (net/sched/sch_{tbf,htb}.c,
include/net/pkt_sched.h -- these 64ns units are called "ticks" despite
being unrelated to HZ). the "burst" parameter is also stored and
passed from userspace as a u32. so, the maximum configurable burst in
both cases is rate * 275s, since we can only track 275s worth of
"scheduler ticks" in a u32 ( (1<<32) / NSEC_PER_SEC * 64 =~ 275s ).

> My proposal is you don't bother with 1) and 2), but first do the
> hack in tbf or htb directly, using or omitting rate tables, how
> you like, and test this idea.

i'll give it a shot, though given that i hate writing the same code
twice, i would prefer to know the right way to change netlink before i
write a functional test.

due to the implementation coupling i don't see any way to make any
permanent change *without* changing the netlink interface -- even
changing that u32 to a u64, which would only need to be a u64 in
userspace because userspace does the munging today!

(what's worse, today userspace has to specify the full rate table over
netlink, instead of just specifying the rate and having the kernel
driver compute the table or whatever other data structure it deems
necessary. i think decoupling interface from implementation is a
worthy goal by itself. if they were decoupled, i could have just coded
a patch and not bothered y'all in the first place....)

> But it seems the right way is to collect monthly stats with some
> userspace tool and change qdisc config dynamically. You might
> look at network admins' lists for small ISPs exemplary scripts
> doing such nasty things to their users, or have a look at ppp
> accounting tools.

<non technical sidetrack>
i disagree outright that a userspace tool is the "right" way to solve
my constraints.

my constraints are:
1) i need to guarantee i never ever go over the monthly transfer limit
(bad experiences with Comcast... you can check out of Red Tape Hotel
any time you like, but you can never leave).
2) i want to be able to transfer short bursts at top speed whenever
possible (that's what i'm paying for in the first place).
3) i need to ration transfer usage so i am never stuck in a situation
of being limited to snail speeds until the end of the month (on a
Comcast connection in my area, i can reasonably sustain 2MB/sec
downstream, which eats 250GB in ~36 hours, so this constraint becomes
important).

tbf with a large burst size seems ideal for my constraints. i can't
quantify this, but it seems like no simpler strategy satisfies the
constraints well and no more complex strategy is necessary. i think
any userspace solution i could write would end up trying to emulate
tbf with large burst.

a userspace tool updating qdisc parameters, even if run in an infinite
loop, would always have a big chunky time resolution compared to an
inline packet shaper (which is important for #2, and for #1 to a
degree). i could write a packet shaper in userspace, but this does not
make sense to given that kernel qos already exists, and already has a
tbf implementation that just needs a little love.
</non technical sidetrack>

given all that, i'd just like to know

1) whether it's forbidden or bad to do floating point math in a packet
scheduler, and

2) the best way to go about making breaking changes to netlink.

^ permalink raw reply

* Re: [PATCH] atl1c: Add support for Atheros AR8152 and AR8152
From: Ben Hutchings @ 2010-10-11 22:28 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: David Miller, Luis Rodriguez, netdev@vger.kernel.org, Jie Yang,
	linux-team
In-Reply-To: <20101011184835.GA10049@tux>

[-- Attachment #1: Type: text/plain, Size: 1796 bytes --]

On Mon, 2010-10-11 at 11:48 -0700, Luis R. Rodriguez wrote:
> On Sun, Oct 10, 2010 at 09:03:04PM -0700, David Miller wrote:
> > From: Ben Hutchings <ben@decadent.org.uk>
> > Date: Mon, 11 Oct 2010 02:18:50 +0100
> > 
> > > Your commit 496c185c9495629ef1c65387cb2594578393cfe0 "atl1c: Add support
> > > for Atheros AR8152 and AR8152" included the following changes:
> >  ...
> > >> +		if (hw->nic_type == athr_l1c || hw->nic_type == athr_l2c_b) {
> >  ...
> > >> +	if ((hw->nic_type == athr_l1c || hw->nic_type == athr_l2c)) {
> >  ...
> > > Shouldn't the first if-statement use the same condition as the second
> > > i.e. matching the previously-defined hardware types athr_l1c and
> > > athr_l2c?
> > 
> > Yeah that definitely looks like a bug to me.
> 
> Good catch, unfortunatley I don't have the source code I used to port
> this work the day I did this anymore locally, so adding 
> Jie Yang who is actually our maintainer for this driver.
> 
> Jie, can you please confirm if this patch is correct?

I was suggesting that the first condition was wrong and the second was
right.

Ben.

> diff --git a/drivers/net/atl1c/atl1c_hw.c b/drivers/net/atl1c/atl1c_hw.c
> index d8501f0..0a7b786 100644
> --- a/drivers/net/atl1c/atl1c_hw.c
> +++ b/drivers/net/atl1c/atl1c_hw.c
> @@ -132,7 +132,7 @@ static int atl1c_get_permanent_address(struct atl1c_hw *hw)
>  			return -1;
>  	}
>  	/* Disable OTP_CLK */
> -	if ((hw->nic_type == athr_l1c || hw->nic_type == athr_l2c)) {
> +	if ((hw->nic_type == athr_l1c || hw->nic_type == athr_l2c_b)) {
>  		otp_ctrl_data &= ~OTP_CTRL_CLK_EN;
>  		AT_WRITE_REG(hw, REG_OTP_CTRL, otp_ctrl_data);
>  		msleep(1);
> 
>   Luis
> 

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply

* Re: [PATCH net-2.6] tg3: restore rx_dropped accounting
From: Matt Carlson @ 2010-10-11 22:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Matthew Carlson, Michael Chan
In-Reply-To: <1286776552.2692.232.camel@edumazet-laptop>

On Sun, Oct 10, 2010 at 10:55:52PM -0700, Eric Dumazet wrote:
> commit 511d22247be7 (tg3: 64 bit stats on all arches), overlooked the
> rx_dropped accounting.
> 
> We use a full "struct rtnl_link_stats64" to hold rx_dropped value, but
> forgot to report it in tg3_get_stats64().
> 
> Use an "unsigned long" instead to shrink "struct tg3" by 176 bytes, and
> report this value to stats readers.
> 
> Increment rx_dropped counter for oversized frames.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Acked-by: Matt Carlson <mcarlson@broadcom.com>

> CC: Michael Chan <mchan@broadcom.com>
> CC: Matt Carlson <mcarlson@broadcom.com>
> ---
>  drivers/net/tg3.c |    6 ++++--
>  drivers/net/tg3.h |    2 +-
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
> index bc3af78..1ec4b9e 100644
> --- a/drivers/net/tg3.c
> +++ b/drivers/net/tg3.c
> @@ -4666,7 +4666,7 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
>  				       desc_idx, *post_ptr);
>  		drop_it_no_recycle:
>  			/* Other statistics kept track of by card. */
> -			tp->net_stats.rx_dropped++;
> +			tp->rx_dropped++;
>  			goto next_pkt;
>  		}
>  
> @@ -4726,7 +4726,7 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
>  		if (len > (tp->dev->mtu + ETH_HLEN) &&
>  		    skb->protocol != htons(ETH_P_8021Q)) {
>  			dev_kfree_skb(skb);
> -			goto next_pkt;
> +			goto drop_it_no_recycle;
>  		}
>  
>  		if (desc->type_flags & RXD_FLAG_VLAN &&
> @@ -9240,6 +9240,8 @@ static struct rtnl_link_stats64 *tg3_get_stats64(struct net_device *dev,
>  	stats->rx_missed_errors = old_stats->rx_missed_errors +
>  		get_stat64(&hw_stats->rx_discards);
>  
> +	stats->rx_dropped = tp->rx_dropped;
> +
>  	return stats;
>  }
>  
> diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
> index 4937bd1..be7ff13 100644
> --- a/drivers/net/tg3.h
> +++ b/drivers/net/tg3.h
> @@ -2759,7 +2759,7 @@ struct tg3 {
>  
>  
>  	/* begin "everything else" cacheline(s) section */
> -	struct rtnl_link_stats64	net_stats;
> +	unsigned long			rx_dropped;
>  	struct rtnl_link_stats64	net_stats_prev;
>  	struct tg3_ethtool_stats	estats;
>  	struct tg3_ethtool_stats	estats_prev;
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply

* Re: [RFC PATCH 1/2] r8169: check dma mapping failures
From: Francois Romieu @ 2010-10-11 22:08 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: netdev, Denis Kirjanov
In-Reply-To: <1286548203-10831-1-git-send-email-sgruszka@redhat.com>

Stanislaw Gruszka <sgruszka@redhat.com> :
[...]
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index bc669a4..b3b28b1 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -4018,20 +4018,24 @@ static struct sk_buff *rtl8169_alloc_rx_skb(struct pci_dev *pdev,

You can replace the pci_dev parameter with pdev->dev

>  
>  	skb = __netdev_alloc_skb(dev, rx_buf_sz + pad, gfp);
>  	if (!skb)
> -		goto err_out;
> +		goto err0;

err_out_0 (with a big please)

>  	skb_reserve(skb, align ? ((pad - 1) & (unsigned long)skb->data) : pad);
>  
>  	mapping = dma_map_single(&pdev->dev, skb->data, rx_buf_sz,
>  				 PCI_DMA_FROMDEVICE);
> +	if (dma_mapping_error(&pdev->dev, mapping))
> +		goto err1;

err_free_skb_1

>  	rtl8169_map_to_asic(desc, mapping, rx_buf_sz);
> -out:
> +
>  	return skb;
>  
> -err_out:
> +err1:
> +	dev_kfree_skb(skb);
> +err0:
>  	rtl8169_make_unusable_by_asic(desc);
> -	goto out;
> +	return NULL;

I'd keep the 'goto out' and NULL the skb after the dev_kfree_skb but
it's up to you.

>  }
>  
>  static void rtl8169_rx_clear(struct rtl8169_private *tp)
> @@ -4115,11 +4119,11 @@ static void rtl8169_unmap_tx_skb(struct pci_dev *pdev, struct ring_info *tx_skb,
>  	tx_skb->len = 0;
>  }
>  
> -static void rtl8169_tx_clear(struct rtl8169_private *tp)
> +static void rtl8169_tx_clear_range(struct rtl8169_private *tp, u32 start, u32 end)
>  {
> -	unsigned int i;
> +	u32 i;
>  
> -	for (i = tp->dirty_tx; i < tp->dirty_tx + NUM_TX_DESC; i++) {
> +	for (i = start; i < end; i++) {

Feel free to fix the (existing) wraparound bug.

>  		unsigned int entry = i % NUM_TX_DESC;
>  		struct ring_info *tx_skb = tp->tx_skb + entry;
>  		unsigned int len = tx_skb->len;
> @@ -4136,6 +4140,11 @@ static void rtl8169_tx_clear(struct rtl8169_private *tp)
>  			tp->dev->stats.tx_dropped++;
>  		}
>  	}
> +}
> +
> +static inline void rtl8169_tx_clear(struct rtl8169_private *tp)

Though used several times, it's hardly time critical : drop the inline.

> +{
> +	rtl8169_tx_clear_range(tp, tp->dirty_tx, tp->dirty_tx + NUM_TX_DESC);
>  	tp->cur_tx = tp->dirty_tx = 0;
>  }
>  
> @@ -4254,6 +4263,8 @@ static int rtl8169_xmit_frags(struct rtl8169_private *tp, struct sk_buff *skb,

Please add a local &tp->pci_dev->dev variable.

>  		addr = ((void *) page_address(frag->page)) + frag->page_offset;
>  		mapping = dma_map_single(&tp->pci_dev->dev, addr, len,
>  					 PCI_DMA_TODEVICE);
> +		if (dma_mapping_error(&tp->pci_dev->dev, mapping))
> +			return -cur_frag;

I ponder adding an 'unlikely', goto some cleaning statement and return a
signification deprieved "I failed" value. The caller does it anyway but
I am not sure if it really is its business.

>  
>  		/* anti gcc 2.95.3 bugware (sic) */
>  		status = opts1 | len | (RingEnd * !((entry + 1) % NUM_TX_DESC));
> @@ -4314,7 +4325,10 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>  	opts1 = DescOwn | rtl8169_tso_csum(skb, dev);
>  
>  	frags = rtl8169_xmit_frags(tp, skb, opts1);
> -	if (frags) {
> +	if (frags < 0) {

'unsigned int frags' problem.

[...]
> @@ -4355,6 +4371,10 @@ err_stop:
>  	netif_stop_queue(dev);
>  	dev->stats.tx_dropped++;
>  	return NETDEV_TX_BUSY;
> +
> +err_dma:
> +	rtl8169_tx_clear_range(tp, entry, entry + frags + 1);
> +	return NETDEV_TX_OK;

Third return statement, second NETDEV_TX_OK, no tx_dropped increase. May be
a bit heavy handed.

Otherwise it's cool.

-- 
Ueimor

^ permalink raw reply

* [PATCH] b44: fix carrier detection on bind
From: Paul Fertser @ 2010-10-11 22:42 UTC (permalink / raw)
  To: Gary Zambrano; +Cc: David S. Miller, netdev, Paul Fertser, stable

For carrier detection to work properly when binding the driver with a cable
unplugged, netif_carrier_off() should be called after register_netdev(),
not before.

Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Cc: stable@kernel.org
---

Without this it's quite annoying to boot a laptop without ethernet
connection as the userspace is getting confused and can even add a default
route through a non-functional interface (when it's configured to use
static ip).

 drivers/net/b44.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/b44.c b/drivers/net/b44.c
index 37617ab..d711eb6 100644
--- a/drivers/net/b44.c
+++ b/drivers/net/b44.c
@@ -2161,8 +2161,6 @@ static int __devinit b44_init_one(struct ssb_device *sdev,
 	dev->irq = sdev->irq;
 	SET_ETHTOOL_OPS(dev, &b44_ethtool_ops);

-	netif_carrier_off(dev);
-
 	err = ssb_bus_powerup(sdev->bus, 0);
 	if (err) {
 		dev_err(sdev->dev,
@@ -2204,6 +2202,8 @@ static int __devinit b44_init_one(struct ssb_device *sdev,
 		goto err_out_powerdown;
 	}

+	netif_carrier_off(dev);
+
 	ssb_set_drvdata(sdev, dev);

 	/* Chip reset provides power to the b44 MAC & PCI cores, which
-- 
1.7.2.2

^ permalink raw reply related

* Re: [PATCH] b44: fix carrier detection on bind
From: David Miller @ 2010-10-11 22:45 UTC (permalink / raw)
  To: fercerpav; +Cc: zambrano, netdev, stable
In-Reply-To: <1286836950-3161-1-git-send-email-fercerpav@gmail.com>

From: Paul Fertser <fercerpav@gmail.com>
Date: Tue, 12 Oct 2010 02:42:30 +0400

> For carrier detection to work properly when binding the driver with a cable
> unplugged, netif_carrier_off() should be called after register_netdev(),
> not before.
> 
> Signed-off-by: Paul Fertser <fercerpav@gmail.com>

Applied, thanks.

^ permalink raw reply

* [PATCH net-next] bnx2x: dont use netdev_alloc_skb()
From: Eric Dumazet @ 2010-10-11 23:03 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Michael Chan, Eilon Greenstein

netdev_alloc_skb() is a very wrong interface, really.

We should remove/deprecate it.

For multi queue devices, it makes more sense to allocate skb on local
node of the cpu handling RX interrupts. This allow each cpu to
manipulate its own slub/slab queues/structures without doing expensive
cross-node business.

For non multi queue devices, IRQ affinity should be set so that a cpu
close to the device services interrupts. Even if not set, using
dev_alloc_skb() is faster.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/bnx2x/bnx2x_cmn.c     |   11 +++++------
 drivers/net/bnx2x/bnx2x_cmn.h     |    2 +-
 drivers/net/bnx2x/bnx2x_ethtool.c |    2 +-
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_cmn.c b/drivers/net/bnx2x/bnx2x_cmn.c
index 97ef674..0561bd9 100644
--- a/drivers/net/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/bnx2x/bnx2x_cmn.c
@@ -335,7 +335,7 @@ static void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 	struct sw_rx_bd *rx_buf = &fp->tpa_pool[queue];
 	struct sk_buff *skb = rx_buf->skb;
 	/* alloc new skb */
-	struct sk_buff *new_skb = netdev_alloc_skb(bp->dev, bp->rx_buf_size);
+	struct sk_buff *new_skb = dev_alloc_skb(bp->rx_buf_size);
 
 	/* Unmap skb in the pool anyway, as we are going to change
 	   pool entry status to BNX2X_TPA_STOP even if new skb allocation
@@ -576,8 +576,7 @@ int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 			    (len <= RX_COPY_THRESH)) {
 				struct sk_buff *new_skb;
 
-				new_skb = netdev_alloc_skb(bp->dev,
-							   len + pad);
+				new_skb = dev_alloc_skb(len + pad);
 				if (new_skb == NULL) {
 					DP(NETIF_MSG_RX_ERR,
 					   "ERROR  packet dropped "
@@ -587,9 +586,9 @@ int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 				}
 
 				/* aligned copy */
-				skb_copy_from_linear_data_offset(skb, pad,
-						    new_skb->data + pad, len);
 				skb_reserve(new_skb, pad);
+				skb_copy_from_linear_data_offset(skb, pad,
+						    new_skb->data, len);
 				skb_put(new_skb, len);
 
 				bnx2x_reuse_rx_skb(fp, bd_cons, bd_prod);
@@ -841,7 +840,7 @@ void bnx2x_init_rx_rings(struct bnx2x *bp)
 		if (!fp->disable_tpa) {
 			for (i = 0; i < max_agg_queues; i++) {
 				fp->tpa_pool[i].skb =
-				   netdev_alloc_skb(bp->dev, bp->rx_buf_size);
+				   dev_alloc_skb(bp->rx_buf_size);
 				if (!fp->tpa_pool[i].skb) {
 					BNX2X_ERR("Failed to allocate TPA "
 						  "skb pool for queue[%d] - "
diff --git a/drivers/net/bnx2x/bnx2x_cmn.h b/drivers/net/bnx2x/bnx2x_cmn.h
index 7f52cec..f422cfc 100644
--- a/drivers/net/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/bnx2x/bnx2x_cmn.h
@@ -833,7 +833,7 @@ static inline int bnx2x_alloc_rx_skb(struct bnx2x *bp,
 	struct eth_rx_bd *rx_bd = &fp->rx_desc_ring[index];
 	dma_addr_t mapping;
 
-	skb = netdev_alloc_skb(bp->dev, bp->rx_buf_size);
+	skb = dev_alloc_skb(bp->rx_buf_size);
 	if (unlikely(skb == NULL))
 		return -ENOMEM;
 
diff --git a/drivers/net/bnx2x/bnx2x_ethtool.c b/drivers/net/bnx2x/bnx2x_ethtool.c
index 54fe061..5224daa 100644
--- a/drivers/net/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/bnx2x/bnx2x_ethtool.c
@@ -1430,7 +1430,7 @@ static int bnx2x_run_loopback(struct bnx2x *bp, int loopback_mode, u8 link_up)
 	/* prepare the loopback packet */
 	pkt_size = (((bp->dev->mtu < ETH_MAX_PACKET_SIZE) ?
 		     bp->dev->mtu : ETH_MAX_PACKET_SIZE) + ETH_HLEN);
-	skb = netdev_alloc_skb(bp->dev, bp->rx_buf_size);
+	skb = dev_alloc_skb(bp->rx_buf_size);
 	if (!skb) {
 		rc = -ENOMEM;
 		goto test_loopback_exit;



^ permalink raw reply related

* Re: [PATCH net-2.6] tg3: restore rx_dropped accounting
From: David Miller @ 2010-10-11 23:06 UTC (permalink / raw)
  To: mcarlson; +Cc: eric.dumazet, netdev, mchan
In-Reply-To: <20101011222657.GA7307@mcarlson.broadcom.com>

From: "Matt Carlson" <mcarlson@broadcom.com>
Date: Mon, 11 Oct 2010 15:26:57 -0700

> On Sun, Oct 10, 2010 at 10:55:52PM -0700, Eric Dumazet wrote:
>> commit 511d22247be7 (tg3: 64 bit stats on all arches), overlooked the
>> rx_dropped accounting.
>> 
>> We use a full "struct rtnl_link_stats64" to hold rx_dropped value, but
>> forgot to report it in tg3_get_stats64().
>> 
>> Use an "unsigned long" instead to shrink "struct tg3" by 176 bytes, and
>> report this value to stats readers.
>> 
>> Increment rx_dropped counter for oversized frames.
>> 
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> Acked-by: Matt Carlson <mcarlson@broadcom.com>

Applied, thanks Matt.

^ permalink raw reply

* Re: [PATCH net-next] neigh: reorder struct neighbour fields
From: David Miller @ 2010-10-11 23:09 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1286835654.30423.107.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 12 Oct 2010 00:20:54 +0200

> Le mardi 12 octobre 2010 à 00:02 +0200, Eric Dumazet a écrit :
>> Here is the followup patch.
>> 
>> Thanks !
>> 
> 
> Oops, this was an old version, the up2date ones also took care of "used"
> field.
...
> [PATCH net-next V2] neigh: reorder struct neighbour fields

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] net dst: use a percpu_counter to track entries
From: David Miller @ 2010-10-11 23:10 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1286832573.30423.93.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 11 Oct 2010 23:29:33 +0200

> Le lundi 11 octobre 2010 à 13:07 -0700, David Miller a écrit :
> 
>> When I first read the subject line for this patch, I was
>> scared, because I thought you were using a percpu counter
>> for dst entry refcounts :-)
>> 
>> Anyways this is fine, applied, thanks!
> 
> I though about a crazy idea for loopback device (but not a percpu
> counter for dst refcounts)
> 
> Add a cpu field somewhere so that each cpu manipulates a different dst.
> 
> Or maybe just never cache such dst, so that each tcp socket uses its own
> dst.

We could even make the cpu (in lieu of a socket) a key in the lookup,
but that would complicate handling things like PMTU indications and
redirects.

^ permalink raw reply

* Re: [PATCH 1/2 net-next] bnx2: Update firmware to 6.0.x.
From: David Miller @ 2010-10-11 23:12 UTC (permalink / raw)
  To: mchan; +Cc: netdev
In-Reply-To: <1286494973-5115-1-git-send-email-mchan@broadcom.com>

From: "Michael Chan" <mchan@broadcom.com>
Date: Thu, 7 Oct 2010 16:42:52 -0700

> - Improved flow control and simplified interface
> - Use hardware RSS indirection table instead of the slower firmware-
>   based table
> - Lower latency interrupt on 5709
> 
> Signed-off-by: Michael Chan <mchan@broadcom.com>
> Reviewed-by: Benjamin Li <benli@broadcom.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/2 net-next] bnx2: Enable AER on PCIE devices only
From: David Miller @ 2010-10-11 23:12 UTC (permalink / raw)
  To: mchan; +Cc: netdev
In-Reply-To: <1286494973-5115-2-git-send-email-mchan@broadcom.com>

From: "Michael Chan" <mchan@broadcom.com>
Date: Thu, 7 Oct 2010 16:42:53 -0700

> To prevent unnecessary error message.  pci_save_state() is also moved to
> the end of ->probe() so that all PCI config, including AER state, will be
> saved.
> 
> Update version to 2.0.18.
> 
> Signed-off-by: Michael Chan <mchan@broadcom.com>
> Reviewed-by: Benjamin Li <mchan@broadcom.com>

Applied.

^ permalink raw reply

* [PATCH] b44: fix resume, request_irq after hw reset
From: James Hogan @ 2010-10-11 23:22 UTC (permalink / raw)
  To: Gary Zambrano, Jiri Pirko, FUJITA Tomonori, Hauke Mehrtens,
	Larry Finger <Larry.Finger@
  Cc: David S. Miller

This driver was hanging on resume because it was requesting a shared irq
that it wasn't ready to immediately handle, which was tested in
request_irq because of the CONFIG_DEBUG_SHIRQ config option. The
interrupt handler tried to read the interrupt status register but for
some reason it hung the system.

The request_irq is now moved a bit later after resetting the hardware
which seems to fix it.

Signed-off-by: James Hogan <james@albanarts.com>
---
 drivers/net/b44.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/b44.c b/drivers/net/b44.c
index 1e620e2..dbba981 100644
--- a/drivers/net/b44.c
+++ b/drivers/net/b44.c
@@ -2296,12 +2296,6 @@ static int b44_resume(struct ssb_device *sdev)
 	if (!netif_running(dev))
 		return 0;
 
-	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev);
-	if (rc) {
-		netdev_err(dev, "request_irq failed\n");
-		return rc;
-	}
-
 	spin_lock_irq(&bp->lock);
 
 	b44_init_rings(bp);
@@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev)
 	netif_device_attach(bp->dev);
 	spin_unlock_irq(&bp->lock);
 
+	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev);
+	if (rc) {
+		netdev_err(dev, "request_irq failed\n");
+		return rc;
+	}
+
 	b44_enable_ints(bp);
 	netif_wake_queue(dev);
 
-- 
1.7.2.3


^ permalink raw reply related

* Re: [PATCH net-next] bnx2x: dont use netdev_alloc_skb()
From: Eric Dumazet @ 2010-10-11 23:22 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Michael Chan, Eilon Greenstein
In-Reply-To: <1286838210.30423.128.camel@edumazet-laptop>

Le mardi 12 octobre 2010 à 01:03 +0200, Eric Dumazet a écrit :
> netdev_alloc_skb() is a very wrong interface, really.
> 
> We should remove/deprecate it.
> 
> For multi queue devices, it makes more sense to allocate skb on local
> node of the cpu handling RX interrupts. This allow each cpu to
> manipulate its own slub/slab queues/structures without doing expensive
> cross-node business.
> 
> For non multi queue devices, IRQ affinity should be set so that a cpu
> close to the device services interrupts. Even if not set, using
> dev_alloc_skb() is faster.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Or maybe revert :

commit b30973f877fea1a3fb84e05599890fcc082a88e5
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Dec 6 20:32:36 2006 -0800

    [PATCH] node-aware skb allocation
    
    Node-aware allocation of skbs for the receive path.
    
    Details:
    
      - __alloc_skb gets a new node argument and cals the node-aware
        slab functions with it.
      - netdev_alloc_skb passed the node number it gets from dev_to_node
        to it, everyone else passes -1 (any node)
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Cc: Christoph Lameter <clameter@engr.sgi.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Signed-off-by: Andrew Morton <akpm@osdl.org>


Apparently, only Christoph and Andrew signed it.




^ permalink raw reply

* Re: [Bugme-new] [Bug 19992] New: b44 + CONFIG_DEBUG_SHIRQ (=y on fedora) fails to resume
From: James Hogan @ 2010-10-11 23:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: bugzilla-daemon, bugme-daemon, netdev, Gary Zambrano
In-Reply-To: <20101011131539.bbb99afe.akpm@linux-foundation.org>

On Monday 11 October 2010 21:15:39 Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Sun, 10 Oct 2010 16:57:11 GMT
> 
> bugzilla-daemon@bugzilla.kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=19992
> > 
> >            Summary: b44 + CONFIG_DEBUG_SHIRQ (=y on fedora) fails to
> >            
> >                     resume
> >            
> >            Product: Drivers
> >            Version: 2.5
> >     
> >     Kernel Version: 2.6.36-rc7
> >     
> >           Platform: All
> >         
> >         OS/Version: Linux
> >         
> >               Tree: Mainline
> >             
> >             Status: NEW
> >           
> >           Severity: high
> >           Priority: P1
> >          
> >          Component: Network
> >         
> >         AssignedTo: drivers_network@kernel-bugs.osdl.org
> >         ReportedBy: james@albanarts.com
> >         Regression: Yes
> > 
> > b44 network driver causes system to hang on resume when
> > CONFIG_DEBUG_SHIRQ=y. I've done some TRACE_RESUME'ing and the following
> > happens:
> > * b44_resume() (drivers/net/b44.c) calls request_irq with IRQF_SHARED
> > (after freeing it in the suspend function)
> > * request_irq() (kernel/irq/manage.c) calls the interrupt handler
> > directly if IRQF_SHARED and CONFIG_DEBUG_SHIRQ=y. It says "It's a shared
> > IRQ -- the driver ought to be prepared for it to happen immediately, so
> > let's make sure...."
> > 
> > * b44_interrupt() gets as far as the first br32 and no further:
> >   istat = br32(bp, B44_ISTAT);
> > 
> > I presume it hasn't yet woken the device up so reading a register somehow
> > fails and hangs the system.
> > 
> > If I comment out the code in request_irq() to test the shared irq handler
> > all works fine.
> > 
> > I'm guessing either the b44 driver shouldn't be freeing/requesting irqs
> > in suspend/resume functions, or should be resetting the hardware first
> > so that the test handler call doesn't fail, but I don't know enough
> > about why it is freeing the irq across suspend to be confident fixing
> > it.
> > 
> > This has been like this for a while (2.6.34 at least). Suspend used to
> > work on fedora with this hardware so I think this is a regression. I'm
> > happy to test any patches.
> 
> Thanks.  Yup, if the driver/device isn't ready to accept an IRQ when
> request_irq() is called then there might be a problem should a real
> interrupt happen very shortly after request_irq() is called.
> 
> The code looks OK to me so perhaps it is indeed some weird hardware
> problem.  Maybe a little delay after the ssb_bus_powerup() is needed?

Thanks for the ideas. I tried a delay and it didn't work, but when I moved the 
request_irq after the spinlocked code which appears to reset the hardware, all 
was fine, which kind of makes sense.

See patch "b44: fix resume, request_irq after hw reset"

Cheers
James

^ permalink raw reply

* Re: [PATCH] b44: fix resume, request_irq after hw reset
From: Andrew Morton @ 2010-10-11 23:34 UTC (permalink / raw)
  To: James Hogan
  Cc: Gary Zambrano, Jiri Pirko, FUJITA Tomonori, Hauke Mehrtens,
	Larry Finger, netdev, linux-kernel, David S. Miller
In-Reply-To: <201010120022.13171.james@albanarts.com>

On Tue, 12 Oct 2010 00:22:12 +0100
James Hogan <james@albanarts.com> wrote:

> This driver was hanging on resume because it was requesting a shared irq
> that it wasn't ready to immediately handle, which was tested in
> request_irq because of the CONFIG_DEBUG_SHIRQ config option. The
> interrupt handler tried to read the interrupt status register but for
> some reason it hung the system.
> 
> The request_irq is now moved a bit later after resetting the hardware
> which seems to fix it.
> 
> Signed-off-by: James Hogan <james@albanarts.com>
> ---
>  drivers/net/b44.c |   12 ++++++------
>  1 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/b44.c b/drivers/net/b44.c
> index 1e620e2..dbba981 100644
> --- a/drivers/net/b44.c
> +++ b/drivers/net/b44.c
> @@ -2296,12 +2296,6 @@ static int b44_resume(struct ssb_device *sdev)
>  	if (!netif_running(dev))
>  		return 0;
>  
> -	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev);
> -	if (rc) {
> -		netdev_err(dev, "request_irq failed\n");
> -		return rc;
> -	}
> -
>  	spin_lock_irq(&bp->lock);
>  
>  	b44_init_rings(bp);
> @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev)
>  	netif_device_attach(bp->dev);
>  	spin_unlock_irq(&bp->lock);
>  
> +	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev);
> +	if (rc) {
> +		netdev_err(dev, "request_irq failed\n");
> +		return rc;
> +	}
> +
>  	b44_enable_ints(bp);
>  	netif_wake_queue(dev);

OK, running the interrupt handler before b44_init_hw() is presumably
the problem here.

The hardware surely won't be generating interrupts until we've run
b44_init_hw() and b44_enable_ints(), so this patch really is only to
keep CONFIG_DEBUG_SHIRQ happy.

^ permalink raw reply

* Re: [RFC PATCH net-next] drivers/net Documentation/networking: Create directory intel_wired_lan
From: Jeff Kirsher @ 2010-10-11 23:52 UTC (permalink / raw)
  To: Joe Perches
  Cc: e1000-devel, Bruce Allan, Jesse Brandeburg, linux-kernel,
	Greg Rose, John Ronciak, netdev
In-Reply-To: <1286743352.11039.165.camel@Joe-Laptop>

On Sun, Oct 10, 2010 at 13:42, Joe Perches <joe@perches.com> wrote:
> Perhaps it's better to move drivers from the very populated
> drivers/net directory into vendor specific directories similar
> to the Atheros approach used for drivers/net/wireless/ath/
>
> Move intel drivers and Documentation to separate directories
> Create drivers/net/intel_wired_lan/Kconfig.<speed> and Makefile
> Modify drivers/net/Kconfig and Makefile
> Update MAINTAINERS
>
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
>  .../networking/{ => intel_wired_lan}/e100.txt      |    0
>  .../networking/{ => intel_wired_lan}/e1000.txt     |    0
>  .../networking/{ => intel_wired_lan}/igb.txt       |    0
>  .../networking/{ => intel_wired_lan}/igbvf.txt     |    0
>  .../networking/{ => intel_wired_lan}/ixgb.txt      |    0
>  .../networking/{ => intel_wired_lan}/ixgbe.txt     |    0
>  .../networking/{ => intel_wired_lan}/ixgbevf.txt   |    0
>  MAINTAINERS                                        |   18 +--
>  drivers/net/Kconfig                                |  214 +-------------------
>  drivers/net/Makefile                               |    8 -
>  drivers/net/intel_wired_lan/Kconfig.100            |   25 +++
>  drivers/net/intel_wired_lan/Kconfig.1000           |  102 ++++++++++
>  drivers/net/intel_wired_lan/Kconfig.10000          |   81 ++++++++
>  drivers/net/intel_wired_lan/Makefile               |    9 +
>  drivers/net/{ => intel_wired_lan}/e100.c           |    0
>  drivers/net/{ => intel_wired_lan}/e1000/Makefile   |    0
>  drivers/net/{ => intel_wired_lan}/e1000/e1000.h    |    0
>  .../{ => intel_wired_lan}/e1000/e1000_ethtool.c    |    0
>  drivers/net/{ => intel_wired_lan}/e1000/e1000_hw.c |    0
>  drivers/net/{ => intel_wired_lan}/e1000/e1000_hw.h |    0
>  .../net/{ => intel_wired_lan}/e1000/e1000_main.c   |    0
>  .../net/{ => intel_wired_lan}/e1000/e1000_osdep.h  |    0
>  .../net/{ => intel_wired_lan}/e1000/e1000_param.c  |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/82571.c   |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/Makefile  |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/defines.h |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/e1000.h   |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/es2lan.c  |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/ethtool.c |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/hw.h      |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/ich8lan.c |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/lib.c     |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/netdev.c  |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/param.c   |    0
>  drivers/net/{ => intel_wired_lan}/e1000e/phy.c     |    0
>  drivers/net/{ => intel_wired_lan}/igb/Makefile     |    0
>  .../net/{ => intel_wired_lan}/igb/e1000_82575.c    |    0
>  .../net/{ => intel_wired_lan}/igb/e1000_82575.h    |    0
>  .../net/{ => intel_wired_lan}/igb/e1000_defines.h  |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_hw.h   |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_mac.c  |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_mac.h  |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_mbx.c  |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_mbx.h  |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_nvm.c  |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_nvm.h  |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_phy.c  |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_phy.h  |    0
>  drivers/net/{ => intel_wired_lan}/igb/e1000_regs.h |    0
>  drivers/net/{ => intel_wired_lan}/igb/igb.h        |    0
>  .../net/{ => intel_wired_lan}/igb/igb_ethtool.c    |    0
>  drivers/net/{ => intel_wired_lan}/igb/igb_main.c   |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/Makefile   |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/defines.h  |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/ethtool.c  |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/igbvf.h    |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/mbx.c      |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/mbx.h      |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/netdev.c   |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/regs.h     |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/vf.c       |    0
>  drivers/net/{ => intel_wired_lan}/igbvf/vf.h       |    0
>  drivers/net/{ => intel_wired_lan}/ixgb/Makefile    |    0
>  drivers/net/{ => intel_wired_lan}/ixgb/ixgb.h      |    0
>  drivers/net/{ => intel_wired_lan}/ixgb/ixgb_ee.c   |    0
>  drivers/net/{ => intel_wired_lan}/ixgb/ixgb_ee.h   |    0
>  .../net/{ => intel_wired_lan}/ixgb/ixgb_ethtool.c  |    0
>  drivers/net/{ => intel_wired_lan}/ixgb/ixgb_hw.c   |    0
>  drivers/net/{ => intel_wired_lan}/ixgb/ixgb_hw.h   |    0
>  drivers/net/{ => intel_wired_lan}/ixgb/ixgb_ids.h  |    0
>  drivers/net/{ => intel_wired_lan}/ixgb/ixgb_main.c |    0
>  .../net/{ => intel_wired_lan}/ixgb/ixgb_osdep.h    |    0
>  .../net/{ => intel_wired_lan}/ixgb/ixgb_param.c    |    0
>  drivers/net/{ => intel_wired_lan}/ixgbe/Makefile   |    0
>  drivers/net/{ => intel_wired_lan}/ixgbe/ixgbe.h    |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_82598.c  |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_82599.c  |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_common.c |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_common.h |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_dcb.c    |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_dcb.h    |    0
>  .../{ => intel_wired_lan}/ixgbe/ixgbe_dcb_82598.c  |    0
>  .../{ => intel_wired_lan}/ixgbe/ixgbe_dcb_82598.h  |    0
>  .../{ => intel_wired_lan}/ixgbe/ixgbe_dcb_82599.c  |    0
>  .../{ => intel_wired_lan}/ixgbe/ixgbe_dcb_82599.h  |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_dcb_nl.c |    0
>  .../{ => intel_wired_lan}/ixgbe/ixgbe_ethtool.c    |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_fcoe.c   |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_fcoe.h   |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_main.c   |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_mbx.c    |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_mbx.h    |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_phy.c    |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_phy.h    |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_sriov.c  |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_sriov.h  |    0
>  .../net/{ => intel_wired_lan}/ixgbe/ixgbe_type.h   |    0
>  drivers/net/{ => intel_wired_lan}/ixgbevf/Makefile |    0
>  .../net/{ => intel_wired_lan}/ixgbevf/defines.h    |    0
>  .../net/{ => intel_wired_lan}/ixgbevf/ethtool.c    |    0
>  .../net/{ => intel_wired_lan}/ixgbevf/ixgbevf.h    |    0
>  .../{ => intel_wired_lan}/ixgbevf/ixgbevf_main.c   |    0
>  drivers/net/{ => intel_wired_lan}/ixgbevf/mbx.c    |    0
>  drivers/net/{ => intel_wired_lan}/ixgbevf/mbx.h    |    0
>  drivers/net/{ => intel_wired_lan}/ixgbevf/regs.h   |    0
>  drivers/net/{ => intel_wired_lan}/ixgbevf/vf.c     |    0
>  drivers/net/{ => intel_wired_lan}/ixgbevf/vf.h     |    0
>  107 files changed, 224 insertions(+), 233 deletions(-)
>

NAK
I agree with Stephen that this will generate a lot of confusion and....

First, I think we need to keep the documentation in /Documentation/networking.
Second, the changes are extensive and would create a lot of regression testing.

We have been looking at solutions like this for future
drivers/hardware and is on the list of items we are currently working
on, but feel it should not be made retroactively due to the regression
testing and massive changes that would need to be made.

-- 
Cheers,
Jeff

------------------------------------------------------------------------------
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today.
http://p.sf.net/sfu/beautyoftheweb
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply

* Re: [RFC PATCH net-next] drivers/net Documentation/networking: Create directory intel_wired_lan
From: Joe Perches @ 2010-10-12  0:00 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Don, e1000-devel, Bruce Allan, Jesse Brandeburg, linux-kernel,
	Greg Rose, John Ronciak, netdev
In-Reply-To: <AANLkTikVKufQRuMGd=cdig7ff4k2aw9Q1FWwMz3pgon-@mail.gmail.com>

On Mon, 2010-10-11 at 16:52 -0700, Jeff Kirsher wrote:
> On Sun, Oct 10, 2010 at 13:42, Joe Perches <joe@perches.com> wrote:
> > Perhaps it's better to move drivers from the very populated
> > drivers/net directory into vendor specific directories similar
> > to the Atheros approach used for drivers/net/wireless/ath/
> NAK
> First, I think we need to keep the documentation in /Documentation/networking.
> Second, the changes are extensive and would create a lot of regression testing.

I don't see any actual changes here other than layout.
What kind of regression testing do you think necessary?

> We have been looking at solutions like this for future
> drivers/hardware and is on the list of items we are currently working
> on, but feel it should not be made retroactively due to the regression
> testing and massive changes that would need to be made.

Might as well start somewhere.


------------------------------------------------------------------------------
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today.
http://p.sf.net/sfu/beautyoftheweb
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply

* Re: [PATCH net-next] ixgbe: fix stats handling
From: Jeff Kirsher @ 2010-10-12  0:13 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, Peter Waskiewicz, Emil Tantilov, netdev
In-Reply-To: <1286799439.2737.21.camel@edumazet-laptop>

On Mon, Oct 11, 2010 at 05:17, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Hi
>
> I am sending this patch for Intel people review/test and acknowledge.
>
> Thanks !
>
> [PATCH net-next] ixgbe: fix stats handling
>
> Current ixgbe stats have following problems :
>
> - Not 64 bit safe (on 32bit arches)
>
> - Not safe in ixgbe_clean_rx_irq() :
>   All cpus dirty a common location (netdev->stats.rx_bytes &
> netdev->stats.rx_packets) without proper synchronization.
>   This slow down a bit multiqueue operations, and possibly miss some
> updates.
>
> Fixes :
>
> Implement ndo_get_stats64() method to provide accurate 64bit rx|tx
> bytes/packets counters, using 64bit safe infrastructure.
>
> ixgbe_get_ethtool_stats() also use this infrastructure to provide 64bit
> safe counters.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Peter Waskiewicz <peter.p.waskiewicz.jr@intel.com>
> CC: Emil Tantilov <emil.s.tantilov@intel.com>
> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>  drivers/net/ixgbe/ixgbe.h         |    3 +-
>  drivers/net/ixgbe/ixgbe_ethtool.c |   29 +++++++++++---------
>  drivers/net/ixgbe/ixgbe_main.c    |   40 +++++++++++++++++++++++++---
>  3 files changed, 56 insertions(+), 16 deletions(-)
>

Thanks Eric, I have added the patch to my queue.

-- 
Cheers,
Jeff

^ permalink raw reply

* linux-next: manual merge of the net tree with the  tree
From: Stephen Rothwell @ 2010-10-12  0:17 UTC (permalink / raw)
  To: David Miller, netdev
  Cc: linux-next, linux-kernel, Stefan Richter, linux1394-devel,
	Ben Hutchings

[-- Attachment #1: Type: text/plain, Size: 528 bytes --]

Hi all,

Today's linux-next merge of the net tree got a conflict in
drivers/ieee1394/eth1394.c between commit
66fa12c571d35e3cd62574c65f1785a460105397 ("ieee1394: remove the old IEEE
1394 driver stack") from the  tree and commit
01414802054c382072b6cb9a1bdc6e243c74b2d5 ("ethtool: Provide a default
implementation of ethtool_ops::get_drvinfo") from the net tree.

The former removes the file.  I just did that.
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: [PATCH net-next V2] igb: fix stats handling
From: Jeff Kirsher @ 2010-10-12  0:12 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tantilov, Emil S, Jesper Dangaard Brouer, Duyck, Alexander H,
	Jesper Dangaard Brouer, David S. Miller, netdev, Wyborny, Carolyn
In-Reply-To: <1286698484.2692.205.camel@edumazet-laptop>

On Sun, Oct 10, 2010 at 01:14, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le samedi 09 octobre 2010 à 17:57 -0600, Tantilov, Emil S a écrit :
>> Eric Dumazet wrote:
>> > Le mercredi 06 octobre 2010 à 05:28 +0200, Eric Dumazet a écrit :
>> >
>> >> I'll let Intel guys doing the backporting work, but for old kernels,
>> >> you'll probably need to use "unsigned long" instead of "u64"
>> >>
>> >> My plan is :
>> >>
>> >> - Provide 64bit counters even on 32bit arch
>> >> - with proper synchro (include/linux/u64_stats_sync.h)
>> >> - Add a spinlock so we can apply Jesper patch.
>> >
>> > Here is the net-next-2.6 patch, I am currently enable to test it, the
>> > dev machine with IGB NIC cannot be restarted until tomorrow, my son
>> > Nicolas is currently using it ;)
>> >
>> > Could you and/or Jesper test it, possibly on 32 and 64 bit kernels ?
>> >
>> > Thanks !
>> >
>> > [PATCH net-next] igb: fix stats handling
>> >
>> > There are currently some problems with igb.
>> >
>> > - On 32bit arches, maintaining 64bit counters without proper
>> > synchronization between writers and readers.
>> >
>> > - Stats updated every two seconds, as reported by Jesper.
>> >    (Jesper provided a patch for this)
>> >
>> > - Potential problem between worker thread and ethtool -S
>> >
>> > This patch uses u64_stats_sync, and convert everything to be 64bit
>> > safe,
>> > SMP safe, even on 32bit arches.
>> >
>> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> > ---
>> >  drivers/net/igb/igb.h         |    7 +-
>> >  drivers/net/igb/igb_ethtool.c |   10 +-
>> >  drivers/net/igb/igb_main.c    |  111 +++++++++++++++++++++++---------
>> >  3 files changed, 94 insertions(+), 34 deletions(-)
>>
>> This patch is causing a hang when testing with 2 sessions in a while loop reading /proc/net/dev/ and ethtool -S. I think even just reading /proc/net/dev/ is sufficient, but have not confirmed it yet. I have seen the hang somewhere between 15 min to an hour. Without the patch same test ran 24+ hours without issues.
>>
>> There was no trace on the screen, I got this with magic sysrq:
>>
>> [15388.393579] SysRq : Show Regs
>> [15388.397341] Modules linked in: igb [last unloaded: scsi_wait_scan] [15388.404846]
>> [15388.406889] Pid: 16218, comm: kworker/4:1 Not tainted 2.6.36-rc3-net-next-igb-100810+ #2 S5520HC/S5520HC
>> [15388.418393] EIP: 0060:[<c13fead2>] EFLAGS: 00000297 CPU: 4
>> [15388.424908] EIP is at _raw_spin_lock+0x13/0x19
>> [15388.430257] EAX: f6eab55c EBX: f6eab380 ECX: 00000001 EDX: 00004e4a [15388.437629] ESI: f6eab000 EDI: f6eab41c EBP: f3d9bf4c ESP: f3d9bf4c [15388.445011]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>> [15388.451422] Process kworker/4:1 (pid: 16218, ti=f3d9a000 task=f5ce0ed0 task.ti=f3d9a000)
>> [15388.461173] Stack:
>> [15388.463796]  f3d9bf64 f8586c57 f3d9bf5c f6eab41c f6901e00 c8703cc0 f3d9bf90 c1041379
>> [15388.473116] <0> 00000004 00000000 f8586b16 00000000 c8707b05 c8707b00 f6901e00 c8703cc4
>> [15388.483379] <0> c8703cc0 f3d9bfb8 c1042879 c16bb640 c8703cc4 00000c54 c16bb640 f6901e10
>> [15388.494219] Call Trace:
>> [15388.497336]  [<f8586c57>] ? igb_watchdog_task+0x141/0x21a [igb]
>> [15388.504336]  [<c1041379>] ? process_one_work+0x18e/0x265
>> [15388.510643]  [<f8586b16>] ? igb_watchdog_task+0x0/0x21a [igb]
>> [15388.517455]  [<c1042879>] ? worker_thread+0xf3/0x1ef
>> [15388.523384]  [<c1042786>] ? worker_thread+0x0/0x1ef
>> [15388.529222]  [<c104506b>] ? kthread+0x62/0x67
>> [15388.534475]  [<c1045009>] ? kthread+0x0/0x67
>> [15388.539623]  [<c1002d36>] ? kernel_thread_helper+0x6/0x10
>> [15388.546034] Code: 00 75 05 f0 66 0f b1 0a 0f 94 c1 0f b6 c1 85 c0 0f 95 c0 0f b6 c0 5d c3 55 ba 00 01 00 00 89 e5 f0 66 0f c1 10 38 f2 74 06 f3 90 <8a> 10 eb f6 5d c3 55 89 e5 9c 59 fa ba 00 01 00 00 f0 66 0f c1
>>
>> Thanks,
>> Emil
>>
>
> Hi Emil, thanks for testing.
>
> It seems one one the u64_stats_sync invariant is not met.
>
> I believe problem comes from "restart_queue"
>
> This one can be updated in // by two cpus.
>
> So we can lose one of the u64_stats_update_begin() /
> u64_stats_update_end() increment and freeze a reader.
>
> So igb had a previous bug here, un-noticed :)
>
> restart_queue can be updated by start_xmit() (so only one cpu can be
> there, because of txqueue lock serialization), but it also can be
> updated by the igb_clean_tx_irq() function (one cpu there too).
>
> One solution to this problem is to use two separate counters, with two
> separate syncp.
>
> [PATCH net-next V2] igb: fix stats handling
>
> There are currently some problems with igb.
>
> - On 32bit arches, maintaining 64bit counters without proper
> synchronization between writers and readers.
>
> - Stats updated every two seconds, as reported by Jesper.
>   (Jesper provided a patch for this)
>
> - Potential problem between worker thread and ethtool -S
>
> This patch uses u64_stats_sync, and convert everything to be 64bit safe,
> SMP safe, even on 32bit arches. It integrates Jesper idea of providing
> accurate stats at the time user reads them.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> V2: Add a second restart_queue field, with separate syncp, because
> original restart_queue was potentially udpated by two cpus.
>   Corrected igb_get_ethtool_stats() to also use appropriate syncp,
>   and do the sum of two restart_queue fields.
>
>  drivers/net/igb/igb.h         |    9 ++
>  drivers/net/igb/igb_ethtool.c |   52 ++++++++++----
>  drivers/net/igb/igb_main.c    |  113 +++++++++++++++++++++++---------
>  3 files changed, 129 insertions(+), 45 deletions(-)
>

Thanks Eric, I have added the updated patch to my queue.

-- 
Cheers,
Jeff

^ permalink raw reply

* RE: [PATCH v12 06/17] Use callback to deal with skb_release_data() specially.
From: Xin, Xiaohui @ 2010-10-12  1:24 UTC (permalink / raw)
  To: Eric Dumazet, David Miller
  Cc: netdev@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, mst@redhat.com, mingo@elte.hu,
	herbert@gondor.apana.org.au, jdike@linux.intel.com
In-Reply-To: <1286811731.2737.28.camel@edumazet-laptop>

>-----Original Message-----
>From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
>Sent: Monday, October 11, 2010 11:42 PM
>To: David Miller
>Cc: Xin, Xiaohui; netdev@vger.kernel.org; kvm@vger.kernel.org;
>linux-kernel@vger.kernel.org; mst@redhat.com; mingo@elte.hu;
>herbert@gondor.apana.org.au; jdike@linux.intel.com
>Subject: Re: [PATCH v12 06/17] Use callback to deal with skb_release_data() specially.
>
>Le lundi 11 octobre 2010 à 08:27 -0700, David Miller a écrit :
>> From: "Xin, Xiaohui" <xiaohui.xin@intel.com>
>> Date: Mon, 11 Oct 2010 15:06:05 +0800
>>
>> > That's to avoid the new cache miss caused by using destructor_arg in data path
>> > like skb_release_data().
>> > That's based on the comment from Eric Dumazet on v7 patches.
>>
>> Thanks for the explanation.
>
>Anyway, frags[] must be the last field of "struct skb_shared_info"
>since commit fed66381 (net: pskb_expand_head() optimization)
>
>It seems Xin worked on a quite old tree.
>
>
I will rebase soon.

Thanks
Xiaohui

^ permalink raw reply

* Re: linux-next: manual merge of the net tree with the  tree
From: Stephen Rothwell @ 2010-10-12  3:51 UTC (permalink / raw)
  To: David Miller, netdev
  Cc: linux-next, linux-kernel, Stefan Richter, linux1394-devel,
	Ben Hutchings
In-Reply-To: <20101012111716.0c54b093.sfr@canb.auug.org.au>

[-- Attachment #1: Type: text/plain, Size: 509 bytes --]

Hi all,

On Tue, 12 Oct 2010 11:17:16 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> Today's linux-next merge of the net tree got a conflict in
> drivers/ieee1394/eth1394.c between commit
> 66fa12c571d35e3cd62574c65f1785a460105397 ("ieee1394: remove the old IEEE
> 1394 driver stack") from the  tree and commit
                               ^
I meant to say "the ieee1394 tree".

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: [PATCH net-next] bnx2x: dont use netdev_alloc_skb()
From: Tom Herbert @ 2010-10-12  5:03 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Michael Chan, Eilon Greenstein
In-Reply-To: <1286839363.30423.130.camel@edumazet-laptop>

On Mon, Oct 11, 2010 at 4:22 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 12 octobre 2010 à 01:03 +0200, Eric Dumazet a écrit :
>> netdev_alloc_skb() is a very wrong interface, really.
>>
>> We should remove/deprecate it.
>>
>> For multi queue devices, it makes more sense to allocate skb on local
>> node of the cpu handling RX interrupts. This allow each cpu to
>> manipulate its own slub/slab queues/structures without doing expensive
>> cross-node business.
>>
>> For non multi queue devices, IRQ affinity should be set so that a cpu
>> close to the device services interrupts. Even if not set, using
>> dev_alloc_skb() is faster.
>>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>
> Or maybe revert :
>
> commit b30973f877fea1a3fb84e05599890fcc082a88e5
> Author: Christoph Hellwig <hch@lst.de>
> Date:   Wed Dec 6 20:32:36 2006 -0800
>

I second this revert.  Node aware allocation by device's node makes
little sense on a multi-queue device and leads to mediocre
performance.

>    [PATCH] node-aware skb allocation
>
>    Node-aware allocation of skbs for the receive path.
>
>    Details:
>
>      - __alloc_skb gets a new node argument and cals the node-aware
>        slab functions with it.
>      - netdev_alloc_skb passed the node number it gets from dev_to_node
>        to it, everyone else passes -1 (any node)
>
>    Signed-off-by: Christoph Hellwig <hch@lst.de>
>    Cc: Christoph Lameter <clameter@engr.sgi.com>
>    Cc: "David S. Miller" <davem@davemloft.net>
>    Signed-off-by: Andrew Morton <akpm@osdl.org>
>
>
> Apparently, only Christoph and Andrew signed it.
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox