Netdev List
* Re: [PATCH 1/2] r8169: allocate with GFP_KERNEL flag when able to sleep
From: David Miller @ 2010-10-09 16:17 UTC (permalink / raw)
  To: eric.dumazet; +Cc: sgruszka, romieu, netdev
In-Reply-To: <1286610844.2692.11.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 09 Oct 2010 09:54:04 +0200

> On Friday 08 October 2010 at 16:25 +0200, Stanislaw Gruszka wrote:
>> We have a Fedora bug report where the driver fails to initialize after
>> suspend/resume because of memory allocation errors:
>> https://bugzilla.redhat.com/show_bug.cgi?id=629158
>>
>> To fix this, use GFP_KERNEL allocation where possible.
>> 
>> Tested-by: Neal Becker <ndbecker2@gmail.com>
>> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
> 
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.

^ permalink raw reply

* [PATCH] net: introduce alloc_skb_order0
From: Eric Dumazet @ 2010-10-09 15:59 UTC (permalink / raw)
  To: Stanislaw Gruszka, David Miller; +Cc: Francois Romieu, netdev
In-Reply-To: <20101008160341.GC10393@redhat.com>

On Friday 08 October 2010 at 18:03 +0200, Stanislaw Gruszka wrote:
> On Fri, Oct 08, 2010 at 05:04:07PM +0200, Eric Dumazet wrote:

> > Switch to SLAB -> no more problem ;)
> 
> Yeah, I wish I could, but Fedora uses SLUB because of some debugging
> capabilities.

Yes, of course, I was kidding :)

echo 0 >/sys/kernel/slab/kmalloc-2048/order
echo 0 >/sys/kernel/slab/kmalloc-1024/order
echo 0 >/sys/kernel/slab/kmalloc-512/order

That should do the trick: no more high-order allocations for MTU=1500
frames.


For MTU=9000 frames, we probably need something like this patch:

(Not yet for inclusion; this is an RFC, and it will need to be split
into two separate patches.)

[PATCH] net: introduce alloc_skb_order0()

Reception of big frames hits a memory allocation problem because of
high-order page allocations (sometimes order-3 for MTU=9000). This patch
introduces alloc_skb_order0() to build skbs with order-0 pages only.

Their headlen is at most SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN)
(3648 bytes on x86_64, 3840 bytes on x86_32).

As net drivers might use skb_store_bits() to copy data to this newly
allocated skb, could we even use __GFP_HIGHMEM for the fragments?

Note: Use the GFP_NOWAIT | __GFP_NOWARN mask to allocate pages, since we
don't want to let big packets exhaust the GFP_ATOMIC pool.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/r8169.c    |   19 ++++++---------
 include/linux/skbuff.h |    1 
 net/core/skbuff.c      |   47 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 56 insertions(+), 11 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index fe3b762..f4220db 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4468,27 +4468,24 @@ static inline void rtl8169_rx_csum(struct sk_buff *skb, u32 opts1)
 		skb_checksum_none_assert(skb);
 }
 
-static inline bool rtl8169_try_rx_copy(struct sk_buff **sk_buff,
+static inline bool rtl8169_try_rx_copy(struct sk_buff **pskb,
 				       struct rtl8169_private *tp, int pkt_size,
 				       dma_addr_t addr)
 {
 	struct sk_buff *skb;
-	bool done = false;
 
 	if (pkt_size >= rx_copybreak)
-		goto out;
+		return false;
 
-	skb = netdev_alloc_skb_ip_align(tp->dev, pkt_size);
+	skb = alloc_skb_order0(pkt_size);
 	if (!skb)
-		goto out;
+		return false;
 
 	pci_dma_sync_single_for_cpu(tp->pci_dev, addr, pkt_size,
 				    PCI_DMA_FROMDEVICE);
-	skb_copy_from_linear_data(*sk_buff, skb->data, pkt_size);
-	*sk_buff = skb;
-	done = true;
-out:
-	return done;
+	skb_store_bits(skb, 0, (*pskb)->data, pkt_size);
+	*pskb = skb;
+	return true;
 }
 
 /*
@@ -4559,10 +4556,10 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 				pci_unmap_single(pdev, addr, tp->rx_buf_sz,
 						 PCI_DMA_FROMDEVICE);
 				tp->Rx_skbuff[entry] = NULL;
+				skb_put(skb, pkt_size);
 			}
 
 			rtl8169_rx_csum(skb, status);
-			skb_put(skb, pkt_size);
 			skb->protocol = eth_type_trans(skb, dev);
 
 			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 0b53c43..2cc161a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1841,6 +1841,7 @@ extern int	       skb_copy_bits(const struct sk_buff *skb, int offset,
 				     void *to, int len);
 extern int	       skb_store_bits(struct sk_buff *skb, int offset,
 				      const void *from, int len);
+extern struct sk_buff *alloc_skb_order0(int pkt_size);
 extern __wsum	       skb_copy_and_csum_bits(const struct sk_buff *skb,
 					      int offset, u8 *to, int len,
 					      __wsum csum);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 752c197..4a6195d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1664,6 +1664,53 @@ fault:
 }
 EXPORT_SYMBOL(skb_store_bits);
 
+/**
+ * alloc_skb_order0 - allocate skb with order-0 requirements
+ * @pkt_size: packet size
+ * 
+ * Allocate an skb with a head small enough that skb->data should not
+ * require a high-order page allocation, and complete it with fragments
+ * if pkt_size is too big. May be used in a driver's RX path: we reserve
+ * NET_SKB_PAD + NET_IP_ALIGN bytes and use GFP_ATOMIC allocations.
+ * We also set skb->len to pkt_size, so the driver should not call skb_put().
+ */
+struct sk_buff *alloc_skb_order0(int pkt_size)
+{
+	int head = min_t(int, pkt_size, SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN));
+	struct sk_buff *skb;
+
+	skb = alloc_skb(head + NET_SKB_PAD + NET_IP_ALIGN,
+			GFP_ATOMIC | __GFP_NOWARN);
+	if (!skb)
+		return NULL;
+	skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
+	skb_put(skb, head);
+	pkt_size -= head;
+
+	skb->len += pkt_size;
+	skb->data_len += pkt_size;
+	skb->truesize += pkt_size;
+	while (pkt_size) {
+		int i = skb_shinfo(skb)->nr_frags++;
+		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+		int fragsize = min_t(int, pkt_size, PAGE_SIZE);
+		struct page *page = alloc_page(GFP_NOWAIT | __GFP_NOWARN);
+
+		if (!page)
+			goto error;
+		frag->page = page;
+		frag->size = fragsize;
+		frag->page_offset = 0;
+		pkt_size -= fragsize;
+	}
+	return skb;
+
+error:
+	kfree_skb(skb);
+	return NULL;	
+}
+EXPORT_SYMBOL(alloc_skb_order0);
+
 /* Checksum skb data. */
 
 __wsum skb_checksum(const struct sk_buff *skb, int offset,



^ permalink raw reply related
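
As a rough illustration of the head/fragment split described in the RFC
above, here is a userspace sketch. It assumes 4096-byte pages and the
3648-byte x86_64 head limit quoted in the message; struct order0_plan and
plan_order0() are hypothetical names used only for this example, not
kernel API.

```c
#include <assert.h>

/* Hypothetical stand-ins: a 4096-byte page and the 3648-byte head
 * limit quoted above for x86_64. Neither is read from kernel headers. */
#define DEMO_PAGE_SIZE 4096
#define DEMO_MAX_HEAD  3648

struct order0_plan {
	int head;        /* bytes placed in the linear area */
	int nr_frags;    /* number of order-0 pages needed */
	int last_frag;   /* bytes used in the final page (0 if no frags) */
};

/* Split pkt_size the way the RFC does: fill the head first, then
 * spread the remainder over page-sized fragments. */
static struct order0_plan plan_order0(int pkt_size)
{
	struct order0_plan p = { 0, 0, 0 };
	int rest;

	assert(pkt_size >= 0);
	p.head = pkt_size < DEMO_MAX_HEAD ? pkt_size : DEMO_MAX_HEAD;
	rest = pkt_size - p.head;
	while (rest > 0) {
		int fragsize = rest < DEMO_PAGE_SIZE ? rest : DEMO_PAGE_SIZE;

		p.nr_frags++;
		p.last_frag = fragsize;
		rest -= fragsize;
	}
	return p;
}
```

With these assumed constants, a 9000-byte frame becomes a 3648-byte head
plus two page fragments (4096 and 1256 bytes), while a 1500-byte frame
fits entirely in the head. In a real implementation it is probably safer
to bump the fragment count only after a page has been attached, so that
an error path never releases a page that was not allocated.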

* Re: [PATCH -next] sundance: Add initial ethtool stats support
From: Denis Kirjanov @ 2010-10-09 14:40 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: davem, netdev, Eric Dumazet
In-Reply-To: <20101009132749.GA15074@solarflare.com>

On 10/09/2010 05:27 PM, Ben Hutchings wrote:
> Denis Kirjanov wrote:
>> Add initial ethtool statistics support 
> [...]
>> +static void get_ethtool_stats(struct net_device *dev,
>> +		struct ethtool_stats *stats, u64 *data)
>> +{
>> +	struct net_device_stats *netdev_stats = get_stats(dev);
>> +	int i = 0;
>> +
>> +	data[i++] = netdev_stats->tx_packets;
>> +	data[i++] = netdev_stats->tx_bytes;
>> +	data[i++] = netdev_stats->rx_packets;
>> +	data[i++] = netdev_stats->rx_bytes;
>> +	data[i++] = netdev_stats->tx_errors;
>> +	data[i++] = netdev_stats->tx_dropped;
>> +	data[i++] = netdev_stats->rx_errors;
>> +}
> [...]
> 
> There is no point in adding ethtool stats that merely mirror the baseline
> net device stats.
Fair enough,
I'll add extra stats shortly
> Ben.
> 


^ permalink raw reply

* Re: [PATCH] SIW: Object management
From: Bernard Metzler @ 2010-10-09 14:10 UTC (permalink / raw)
  To: Steve Wise
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Roland Dreier
In-Reply-To: <4CAB464D.5030702-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>

<snip>

> > 
> >> the post_send/recv and other functions in your driver are called
> >> directly (almost) by kernel users like NFSRDMA.  These users may be
> >> calling in an interrupt context and thus you cannot block/sleep.
> >>
> >> 
> > OK, very convincing. not a big change since siw_wqe_get/_put()
> > already maintain a list of pre-allocated wqe's (currently for
> > the read.responses).
> > but, would it be ok if the code distinguishes between user
> > land and in-kernel consumers? i would be very happy if we could
> > keep the pre-allocations per user land connection to its very
> > minimum...
> >
> > 
> 
> I think that's ok, but it's bending the core locking rules a little, I
> guess.  But the intent is that kernel users can definitely
> send/recv/poll in interrupt context, so possibly blocking for user mode
> QPs in on-kernel-bypass operations is probably ok...
> 
i think it's best if the code can be optimized this way. i will provide
a patch following that path soon after i am back from vacation
(next week i am off).
while respecting the specific kernel user requirements, i really want
to keep memory allocations small for user land applications.

thanks,
bernard.

^ permalink raw reply
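
The pre-allocation idea discussed above (callers in interrupt context
must never sleep, so work-queue elements come from a list prepared in
advance) can be sketched in userspace as follows. This is only an
illustration under stated assumptions: demo_wqe, demo_wqe_pool and the
pool size are hypothetical and are not taken from the SIW code, and a
kernel version would additionally need an interrupt-safe lock around
the free list.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_POOL_SIZE 4   /* hypothetical per-connection preallocation */

struct demo_wqe {
	struct demo_wqe *next;   /* free-list link */
	int in_use;
};

struct demo_wqe_pool {
	struct demo_wqe slots[DEMO_POOL_SIZE];
	struct demo_wqe *free;   /* head of the free list */
};

/* Put every slot on the free list; done once, where sleeping is fine. */
static void demo_pool_init(struct demo_wqe_pool *pool)
{
	int i;

	pool->free = NULL;
	for (i = 0; i < DEMO_POOL_SIZE; i++) {
		pool->slots[i].in_use = 0;
		pool->slots[i].next = pool->free;
		pool->free = &pool->slots[i];
	}
}

/* Never allocates and never blocks: returns NULL when the pool is empty. */
static struct demo_wqe *demo_wqe_get(struct demo_wqe_pool *pool)
{
	struct demo_wqe *wqe = pool->free;

	if (wqe) {
		pool->free = wqe->next;
		wqe->next = NULL;
		wqe->in_use = 1;
	}
	return wqe;
}

/* Return an element to the free list for reuse. */
static void demo_wqe_put(struct demo_wqe_pool *pool, struct demo_wqe *wqe)
{
	assert(wqe->in_use);
	wqe->in_use = 0;
	wqe->next = pool->free;
	pool->free = wqe;
}
```

The trade-off raised in the thread is visible here: the pool size is
memory paid up front per connection, which is why keeping it minimal for
user-land consumers matters.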

* Re: [PATCH -next] sundance: Add initial ethtool stats support
From: Ben Hutchings @ 2010-10-09 13:27 UTC (permalink / raw)
  To: Denis Kirjanov; +Cc: davem, netdev
In-Reply-To: <20101009095346.GA12951@hera.kernel.org>

Denis Kirjanov wrote:
> Add initial ethtool statistics support 
[...]
> +static void get_ethtool_stats(struct net_device *dev,
> +		struct ethtool_stats *stats, u64 *data)
> +{
> +	struct net_device_stats *netdev_stats = get_stats(dev);
> +	int i = 0;
> +
> +	data[i++] = netdev_stats->tx_packets;
> +	data[i++] = netdev_stats->tx_bytes;
> +	data[i++] = netdev_stats->rx_packets;
> +	data[i++] = netdev_stats->rx_bytes;
> +	data[i++] = netdev_stats->tx_errors;
> +	data[i++] = netdev_stats->tx_dropped;
> +	data[i++] = netdev_stats->rx_errors;
> +}
[...]

There is no point in adding ethtool stats that merely mirror the baseline
net device stats.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCHv5] net: Add batman-adv meshing protocol
From: Sven Eckelmann @ 2010-10-09 12:34 UTC (permalink / raw)
  To: b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, David Miller
In-Reply-To: <20100924.134334.28812338.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>


David Miller wrote:
> From: Sven Eckelmann <sven.eckelmann-Mmb7MZpHnFY@public.gmane.org>
> Date: Sat, 18 Sep 2010 21:03:30 +0200
> 
> > B.A.T.M.A.N. (better approach to mobile ad-hoc networking) is a routing
> > protocol for multi-hop ad-hoc mesh networks. The networks may be wired or
> > wireless. See http://www.open-mesh.org/ for more information and user
> > space tools.
> > 
> > Signed-off-by: Sven Eckelmann <sven.eckelmann-Mmb7MZpHnFY@public.gmane.org>
> 
> The only thing remaining which I really don't like is this hash helper
> library thing in here.
> 
> It's a terrible abstraction and very inefficient.  Iteration uses
> function calls, as does removal.  Key comparisons use callbacks, via
> indirection function pointers, also very inefficient.

I would completely agree.

> Just use the "struct hlist_head" and "struct hlist_node" objects we
> have generically already.  Inline the list iteration, as well as the
> key comparisons and the node linking/unlinking.

hlist_head and hlist_node are the right thing to use, but I am a little
puzzled by the rest.

OK, no hash implementation built from scratch, but there is functionality
shared by the four hash tables in use, which I would not like to
"implement" again everywhere. For example, "add to the hash if the data
isn't already there". This can easily be done with a static inline
function which receives a comparison function and a choose/key function
(both of which can also be inlined by the compiler) and does the rest
using hlist_*.

Is there any objection to sharing such helpers inside batman-adv only?

thanks,
	Sven


^ permalink raw reply
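
The kind of shared helper proposed above ("add to the hash if the data
isn't already there", parameterized by a comparison and a choose/key
function) might look roughly like the following userspace sketch. It is
an illustration only: demo_node stands in for struct hlist_node, and
every demo_* name is hypothetical rather than batman-adv or kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUCKETS 8

/* Minimal singly linked node standing in for struct hlist_node. */
struct demo_node {
	struct demo_node *next;
	const void *data;
};

struct demo_hash {
	struct demo_node *table[DEMO_BUCKETS];
};

/* compare returns nonzero when both keys are equal. */
typedef int (*demo_cmp_fn)(const void *a, const void *b);
/* choose maps a key to a bucket index in [0, size). */
typedef unsigned int (*demo_choose_fn)(const void *data, unsigned int size);

/* Link node into the table unless an equal element is already present.
 * Returns 0 on insertion, 1 if a duplicate was found. */
static inline int demo_hash_add(struct demo_hash *hash,
				demo_cmp_fn compare, demo_choose_fn choose,
				struct demo_node *node)
{
	unsigned int index = choose(node->data, DEMO_BUCKETS);
	struct demo_node *pos;

	for (pos = hash->table[index]; pos; pos = pos->next)
		if (compare(pos->data, node->data))
			return 1;

	node->next = hash->table[index];
	hash->table[index] = node;
	return 0;
}

/* Example callbacks for NUL-terminated string keys. */
static int demo_str_equal(const void *a, const void *b)
{
	return strcmp(a, b) == 0;
}

static unsigned int demo_str_choose(const void *data, unsigned int size)
{
	const unsigned char *s = data;
	unsigned int h = 0;

	while (*s)
		h = h * 31 + *s++;
	return h % size;
}
```

Because the callbacks are passed to a static inline function at each
call site, a compiler is free to inline them, which is the point made in
the message; whether it actually does so depends on the compiler and
optimization level.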

* Re: [PATCH -next] sundance: Add initial ethtool stats support
From: Eric Dumazet @ 2010-10-09 12:33 UTC (permalink / raw)
  To: Denis Kirjanov; +Cc: davem, netdev
In-Reply-To: <20101009095346.GA12951@hera.kernel.org>

On Saturday 09 October 2010 at 09:53 +0000, Denis Kirjanov wrote:
> Add initial ethtool statistics support 
> 
> Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
> ---
>  drivers/net/sundance.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 46 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/sundance.c b/drivers/net/sundance.c
> index 27d69aa..685845b 100644
> --- a/drivers/net/sundance.c
> +++ b/drivers/net/sundance.c
> @@ -1564,6 +1564,18 @@ static int __set_mac_addr(struct net_device *dev)
>  	return 0;
>  }
>  
> +static const struct {
> +	const char name[ETH_GSTRING_LEN];
> +} sundance_stats[] = {
> +	{ "tx_packets" },
> +	{ "tx_bytes" },
> +	{ "rx_packets" },
> +	{ "rx_bytes" },
> +	{ "tx_errors" },
> +	{ "tx_dropped" },
> +	{ "rx_errors" },
> +};
> +
>  static int check_if_running(struct net_device *dev)
>  {
>  	if (!netif_running(dev))
> @@ -1622,6 +1634,37 @@ static void set_msglevel(struct net_device *dev, u32 val)
>  	np->msg_enable = val;
>  }
>  
> +static void get_strings(struct net_device *dev, u32 stringset,
> +		u8 *data)
> +{

	if (stringset != ETH_SS_STATS)
		return;

> +	memcpy(data, sundance_stats, sizeof(sundance_stats));
> +}
> +
> +static int get_sset_count(struct net_device *dev, int sset)
> +{
> +	switch (sset) {
> +	case ETH_SS_STATS:
> +		return ARRAY_SIZE(sundance_stats);
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static void get_ethtool_stats(struct net_device *dev,
> +		struct ethtool_stats *stats, u64 *data)
> +{
> +	struct net_device_stats *netdev_stats = get_stats(dev);
> +	int i = 0;
> +
> +	data[i++] = netdev_stats->tx_packets;
> +	data[i++] = netdev_stats->tx_bytes;
> +	data[i++] = netdev_stats->rx_packets;
> +	data[i++] = netdev_stats->rx_bytes;
> +	data[i++] = netdev_stats->tx_errors;
> +	data[i++] = netdev_stats->tx_dropped;
> +	data[i++] = netdev_stats->rx_errors;
> +}
> +
>  static const struct ethtool_ops ethtool_ops = {
>  	.begin = check_if_running,
>  	.get_drvinfo = get_drvinfo,
> @@ -1631,6 +1674,9 @@ static const struct ethtool_ops ethtool_ops = {
>  	.get_link = get_link,
>  	.get_msglevel = get_msglevel,
>  	.set_msglevel = set_msglevel,
> +	.get_strings = get_strings,
> +	.get_sset_count = get_sset_count,
> +	.get_ethtool_stats = get_ethtool_stats,
>  };
>  
>  static int netdev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)



^ permalink raw reply
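
The review comment above (return early unless the requested string set
is the statistics set) can be tried out in a small userspace mock. This
is a sketch under stated assumptions: the DEMO_* constants and demo_*
functions are local stand-ins, not the kernel's ETH_SS_* definitions or
the ethtool_ops callbacks themselves.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the kernel's ETH_GSTRING_LEN / ETH_SS_* constants;
 * the values here are local to this sketch. */
#define DEMO_GSTRING_LEN 32
enum demo_stringset { DEMO_SS_TEST, DEMO_SS_STATS };

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const struct {
	const char name[DEMO_GSTRING_LEN];
} demo_stats[] = {
	{ "tx_packets" },
	{ "tx_bytes" },
	{ "rx_packets" },
};

/* Number of entries in a string set, or -1 for sets we do not provide. */
static int demo_get_sset_count(int sset)
{
	return sset == DEMO_SS_STATS ? (int)DEMO_ARRAY_SIZE(demo_stats) : -1;
}

/* Only touch the caller's buffer for the string set we actually serve. */
static void demo_get_strings(unsigned int stringset, unsigned char *data)
{
	if (stringset != DEMO_SS_STATS)
		return;
	memcpy(data, demo_stats, sizeof(demo_stats));
}
```

Without the guard, a request for any other string set would still copy
the statistics names into a buffer that was sized for that other set.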

* [PATCH net-next] sundance: get_stats proper locking
From: Eric Dumazet @ 2010-10-09 12:17 UTC (permalink / raw)
  To: Denis Kirjanov, David Miller; +Cc: netdev
In-Reply-To: <20101009095346.GA12951@hera.kernel.org>

On Saturday 09 October 2010 at 09:53 +0000, Denis Kirjanov wrote:
> Add initial ethtool statistics support 

OK, I guess it's time to add proper locking to sundance after all ;)

[PATCH net-next] sundance: get_stats proper locking

sundance get_stats() should not be run concurrently; add a lock to avoid
potentially losing counts.

Note: Remove unused rx_lock field

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/sundance.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/sundance.c b/drivers/net/sundance.c
index 27d69aa..4283cc5 100644
--- a/drivers/net/sundance.c
+++ b/drivers/net/sundance.c
@@ -365,7 +365,6 @@ struct netdev_private {
 	struct timer_list timer;		/* Media monitoring timer. */
 	/* Frequently used values: keep some adjacent for cache effect. */
 	spinlock_t lock;
-	spinlock_t rx_lock;			/* Group with Tx control cache line. */
 	int msg_enable;
 	int chip_id;
 	unsigned int cur_rx, dirty_rx;		/* Producer/consumer ring indices */
@@ -390,6 +389,7 @@ struct netdev_private {
 	unsigned char phys[MII_CNT];		/* MII device addresses, only first one used. */
 	struct pci_dev *pci_dev;
 	void __iomem *base;
+	spinlock_t statlock;
 };
 
 /* The station address location in the EEPROM. */
@@ -514,6 +514,7 @@ static int __devinit sundance_probe1 (struct pci_dev *pdev,
 	np->chip_id = chip_idx;
 	np->msg_enable = (1 << debug) - 1;
 	spin_lock_init(&np->lock);
+	spin_lock_init(&np->statlock);
 	tasklet_init(&np->rx_tasklet, rx_poll, (unsigned long)dev);
 	tasklet_init(&np->tx_tasklet, tx_poll, (unsigned long)dev);
 
@@ -1486,10 +1487,9 @@ static struct net_device_stats *get_stats(struct net_device *dev)
 	struct netdev_private *np = netdev_priv(dev);
 	void __iomem *ioaddr = np->base;
 	int i;
+	unsigned long flags;
 
-	/* We should lock this segment of code for SMP eventually, although
-	   the vulnerability window is very small and statistics are
-	   non-critical. */
+	spin_lock_irqsave(&np->statlock, flags);
 	/* The chip only need report frame silently dropped. */
 	dev->stats.rx_missed_errors	+= ioread8(ioaddr + RxMissed);
 	dev->stats.tx_packets += ioread16(ioaddr + TxFramesOK);
@@ -1506,6 +1506,8 @@ static struct net_device_stats *get_stats(struct net_device *dev)
 	dev->stats.rx_bytes += ioread16(ioaddr + RxOctetsLow);
 	dev->stats.rx_bytes += ioread16(ioaddr + RxOctetsHigh) << 16;
 
+	spin_unlock_irqrestore(&np->statlock, flags);
+
 	return &dev->stats;
 }
 



^ permalink raw reply related
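
To show why the read-and-accumulate sequence in the patch above wants a
lock, here is a userspace model. It is a sketch under stated
assumptions: the hardware counter is modelled as clearing on read (which
the driver's use of "+=" suggests, but which is not confirmed here), a
pthread mutex stands in for the spinlock, and all demo_* names are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Models a hardware counter that resets to zero when read
 * (an assumption suggested by the driver's use of "+="). */
struct demo_hw {
	unsigned int tx_frames_ok;
};

struct demo_dev {
	struct demo_hw hw;
	unsigned long tx_packets;     /* software running total */
	pthread_mutex_t statlock;     /* serializes demo_get_stats() */
};

/* Return the register value and reset the register to zero. */
static unsigned int demo_read_and_clear(unsigned int *reg)
{
	unsigned int val = *reg;

	*reg = 0;
	return val;
}

static void demo_dev_init(struct demo_dev *dev)
{
	dev->hw.tx_frames_ok = 0;
	dev->tx_packets = 0;
	pthread_mutex_init(&dev->statlock, NULL);
}

/* Fold the hardware counter into the running total under the lock, so
 * two callers cannot interleave their read-modify-write of the total. */
static unsigned long demo_get_stats(struct demo_dev *dev)
{
	unsigned long total;

	pthread_mutex_lock(&dev->statlock);
	dev->tx_packets += demo_read_and_clear(&dev->hw.tx_frames_ok);
	total = dev->tx_packets;
	pthread_mutex_unlock(&dev->statlock);
	return total;
}
```

Without the lock, two concurrent callers could both load the old total
and each store their own sum, so one caller's increment would be lost.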

* Re: [Bugme-new] [Bug 19692] New: linux-2.6.36-rc5 crash with gianfar ethernet at full line rate traffic
From: emin ak @ 2010-10-09 12:10 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Andrew Morton, netdev, bugzilla-daemon, bugme-daemon,
	Anton Vorontsov
In-Reply-To: <20101008092427.GA2327@del.dom.local>

> Andrew Morton wrote:
>> (switched to email.  Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Mon, 4 Oct 2010 06:25:14 GMT
>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=19692
>>>
>>>            Summary: linux-2.6.36-rc5 crash with gianfar ethernet at full
>>>                     line rate traffic
> ...
>
> Emin, until there is something better I hope you could try this patch.
> (not tested nor compiled)
>
> Thanks,
> Jarek P.

Hi Jarek,
My test setup is at my office, so I will try your patch first thing on
Monday and report the results back to you.
Thanks.
Emin

^ permalink raw reply

* Fwd: [PATCH] forcedeth: reconfigure multicast packet filter only when needed
From: Jindřich Makovička @ 2010-10-09 11:26 UTC (permalink / raw)
  To: linux-kernel, netdev; +Cc: davem, shemminger, aabdulla, ditto, makovick
In-Reply-To: <20100812115430.2b5d8683@starbug.prg01.itonis.net>


Currently, the forcedeth driver reconfigures the packet filter every
time a multicast stream is (un-)subscribed. As the receiving has to be
stopped and started in this case, any multicast subscription can cause
packet loss, allowing userspace applications to disrupt incoming
traffic.

With the following patch, nv_set_multicast first checks the cached
state of the packet filter, and skips the reconfiguration if the state
does not change. With the default settings, this can reduce some
useless reconfiguration attempts. When switched to promiscuous mode,
all reconfigurations are skipped with the patch, which can be used to
mitigate packet loss problems when receiving and re-subscribing many
multicasts simultaneously on a single machine.

Signed-off-by: Jindrich Makovicka <makovick@gmail.com>

-- 
Jindrich Makovicka

[-- Attachment #2: forcedeth.c.diff --]
[-- Type: text/x-patch, Size: 2212 bytes --]

--- forcedeth.c.orig	2010-04-26 16:48:30.000000000 +0200
+++ forcedeth.c	2010-05-21 13:22:25.705907294 +0200
@@ -837,6 +837,11 @@
 	char name_rx[IFNAMSIZ + 3];       /* -rx    */
 	char name_tx[IFNAMSIZ + 3];       /* -tx    */
 	char name_other[IFNAMSIZ + 6];    /* -other */
+
+	/* current packet filter state */
+	u32 cur_pff;
+	u32 cur_addr[2];
+	u32 cur_mask[2];
 };
 
 /*
@@ -3128,17 +3133,28 @@
 	}
 	addr[0] |= NVREG_MCASTADDRA_FORCE;
 	pff |= NVREG_PFF_ALWAYS;
-	spin_lock_irq(&np->lock);
-	nv_stop_rx(dev);
-	writel(addr[0], base + NvRegMulticastAddrA);
-	writel(addr[1], base + NvRegMulticastAddrB);
-	writel(mask[0], base + NvRegMulticastMaskA);
-	writel(mask[1], base + NvRegMulticastMaskB);
-	writel(pff, base + NvRegPacketFilterFlags);
-	dprintk(KERN_INFO "%s: reconfiguration for multicast lists.\n",
-		dev->name);
-	nv_start_rx(dev);
-	spin_unlock_irq(&np->lock);
+	if (np->cur_pff != (pff & ~NVREG_PFF_PAUSE_RX)
+	    || memcmp(np->cur_addr, addr, sizeof(np->cur_addr)) != 0
+	    || memcmp(np->cur_mask, mask, sizeof(np->cur_mask)) != 0)
+	{
+		dprintk(KERN_INFO "%s: reconfiguration for multicast lists.\n",
+			dev->name);
+		spin_lock_irq(&np->lock);
+		nv_stop_rx(dev);
+		writel(addr[0], base + NvRegMulticastAddrA);
+		writel(addr[1], base + NvRegMulticastAddrB);
+		writel(mask[0], base + NvRegMulticastMaskA);
+		writel(mask[1], base + NvRegMulticastMaskB);
+		writel(pff, base + NvRegPacketFilterFlags);
+		nv_start_rx(dev);
+		spin_unlock_irq(&np->lock);
+		memcpy(np->cur_addr, addr, sizeof(np->cur_addr));
+		memcpy(np->cur_mask, mask, sizeof(np->cur_mask));
+		np->cur_pff = pff & ~NVREG_PFF_PAUSE_RX;
+	} else {
+		dprintk(KERN_INFO "%s: pff state unchanged - skipping reconfiguration.\n",
+			dev->name);
+	}
 }
 
 static void nv_update_pause(struct net_device *dev, u32 pause_flags)
@@ -5369,6 +5385,12 @@
 	writel(NVREG_MCASTMASKB_NONE, base + NvRegMulticastMaskB);
 	writel(0, base + NvRegPacketFilterFlags);
 
+	np->cur_pff = 0;
+	np->cur_addr[0] = NVREG_MCASTADDRA_FORCE;
+	np->cur_addr[1] = 0;
+	np->cur_mask[0] = NVREG_MCASTMASKA_NONE;
+	np->cur_mask[1] = NVREG_MCASTMASKB_NONE;
+
 	writel(0, base + NvRegTransmitterControl);
 	writel(0, base + NvRegReceiverControl);
 

^ permalink raw reply
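
The approach described in the forcedeth patch above (compare the
requested packet filter with a cached copy and skip the stop/write/start
sequence when nothing changed) can be sketched in userspace like this.
It is an illustration under stated assumptions: demo_filter, demo_nic
and demo_set_filter() are hypothetical names, and the register writes
are replaced by a counter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct demo_filter {
	uint32_t pff;       /* packet filter flags */
	uint32_t addr[2];   /* multicast address words */
	uint32_t mask[2];   /* multicast mask words */
};

struct demo_nic {
	struct demo_filter cur;   /* last state written to the hardware */
	int reconfig_count;       /* how many times RX was stopped/restarted */
};

/* Returns 1 if the hardware was reprogrammed, 0 if the request matched
 * the cached state and was skipped. */
static int demo_set_filter(struct demo_nic *nic, const struct demo_filter *want)
{
	if (nic->cur.pff == want->pff &&
	    memcmp(nic->cur.addr, want->addr, sizeof(want->addr)) == 0 &&
	    memcmp(nic->cur.mask, want->mask, sizeof(want->mask)) == 0)
		return 0;

	/* A real driver would stop RX, write the registers, restart RX here. */
	nic->reconfig_count++;
	nic->cur = *want;
	return 1;
}
```

The cache is only useful if it is initialized to whatever the hardware
is reset to, which is what the last hunk of the patch takes care of.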

* Re: [PATCH 2/4] Phonet: advise against enabling the pipe controller
From: Rémi Denis-Courmont @ 2010-10-09 10:20 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20101008.141730.258104062.davem@davemloft.net>

On Saturday 9 October 2010 00:17:30, David Miller wrote:
> From: Rémi Denis-Courmont <remi@remlab.net>
> Date: Fri,  8 Oct 2010 17:02:01 +0300
> 
> > From: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
> > 
> > As it currently is, the new code path is not compatible with existing
> > Nokia modems. This would break existing userspace for Nokia modem, such
> > as the existing oFono ISI driver.
> > 
> > Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
> 
> This really needs to be designed such that it can co-exist with current
> userland technology.

Yeah, don't tell me. This kind of static option works for a Linux kernel
on an embedded device like my employer's, but it does not fly on the
desktop.

This big patch came as a surprise, and so did its merging. I don't know
when it can be improved.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis

^ permalink raw reply

* [PATCH -next] sundance: Add initial ethtool stats support
From: Denis Kirjanov @ 2010-10-09  9:53 UTC (permalink / raw)
  To: davem; +Cc: netdev

Add initial ethtool statistics support 

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
---
 drivers/net/sundance.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/drivers/net/sundance.c b/drivers/net/sundance.c
index 27d69aa..685845b 100644
--- a/drivers/net/sundance.c
+++ b/drivers/net/sundance.c
@@ -1564,6 +1564,18 @@ static int __set_mac_addr(struct net_device *dev)
 	return 0;
 }
 
+static const struct {
+	const char name[ETH_GSTRING_LEN];
+} sundance_stats[] = {
+	{ "tx_packets" },
+	{ "tx_bytes" },
+	{ "rx_packets" },
+	{ "rx_bytes" },
+	{ "tx_errors" },
+	{ "tx_dropped" },
+	{ "rx_errors" },
+};
+
 static int check_if_running(struct net_device *dev)
 {
 	if (!netif_running(dev))
@@ -1622,6 +1634,37 @@ static void set_msglevel(struct net_device *dev, u32 val)
 	np->msg_enable = val;
 }
 
+static void get_strings(struct net_device *dev, u32 stringset,
+		u8 *data)
+{
+	memcpy(data, sundance_stats, sizeof(sundance_stats));
+}
+
+static int get_sset_count(struct net_device *dev, int sset)
+{
+	switch (sset) {
+	case ETH_SS_STATS:
+		return ARRAY_SIZE(sundance_stats);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static void get_ethtool_stats(struct net_device *dev,
+		struct ethtool_stats *stats, u64 *data)
+{
+	struct net_device_stats *netdev_stats = get_stats(dev);
+	int i = 0;
+
+	data[i++] = netdev_stats->tx_packets;
+	data[i++] = netdev_stats->tx_bytes;
+	data[i++] = netdev_stats->rx_packets;
+	data[i++] = netdev_stats->rx_bytes;
+	data[i++] = netdev_stats->tx_errors;
+	data[i++] = netdev_stats->tx_dropped;
+	data[i++] = netdev_stats->rx_errors;
+}
+
 static const struct ethtool_ops ethtool_ops = {
 	.begin = check_if_running,
 	.get_drvinfo = get_drvinfo,
@@ -1631,6 +1674,9 @@ static const struct ethtool_ops ethtool_ops = {
 	.get_link = get_link,
 	.get_msglevel = get_msglevel,
 	.set_msglevel = set_msglevel,
+	.get_strings = get_strings,
+	.get_sset_count = get_sset_count,
+	.get_ethtool_stats = get_ethtool_stats,
 };
 
 static int netdev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
-- 
1.7.0


^ permalink raw reply related

* Re: [PATCH 2/2] r8169: use device model DMA API
From: Eric Dumazet @ 2010-10-09  7:57 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: Francois Romieu, netdev
In-Reply-To: <1286547901-10782-2-git-send-email-sgruszka@redhat.com>

On Friday 08 October 2010 at 16:25 +0200, Stanislaw Gruszka wrote:
> Use the DMA API, as the PCI equivalents will be deprecated. This change
> also allows allocating with GFP_KERNEL where possible.
> 
> Tested-by: Neal Becker <ndbecker2@gmail.com>
> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
> ---
>  drivers/net/r8169.c |   53 +++++++++++++++++++++++++++-----------------------
>  1 files changed, 29 insertions(+), 24 deletions(-)

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>



^ permalink raw reply

* Re: [PATCH 1/2] r8169: allocate with GFP_KERNEL flag when able to sleep
From: Eric Dumazet @ 2010-10-09  7:54 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: Francois Romieu, netdev
In-Reply-To: <1286547901-10782-1-git-send-email-sgruszka@redhat.com>

On Friday 08 October 2010 at 16:25 +0200, Stanislaw Gruszka wrote:
> We have a Fedora bug report where the driver fails to initialize after
> suspend/resume because of memory allocation errors:
> https://bugzilla.redhat.com/show_bug.cgi?id=629158
>
> To fix this, use GFP_KERNEL allocation where possible.
> 
> Tested-by: Neal Becker <ndbecker2@gmail.com>
> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>



^ permalink raw reply

* Re: [PATCH net-next] net: percpu net_device refcount
From: Eric Dumazet @ 2010-10-09  6:23 UTC (permalink / raw)
  To: paulmck; +Cc: Stephen Hemminger, David Miller, netdev
In-Reply-To: <20101008215604.GF2408@linux.vnet.ibm.com>

On Friday 08 October 2010 at 14:56 -0700, Paul E. McKenney wrote:
> On Thu, Oct 07, 2010 at 10:30:51AM -0700, Stephen Hemminger wrote:
> > On Thu, 07 Oct 2010 19:12:35 +0200
> > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > 
> > > We tried very hard to remove all possible dev_hold()/dev_put() pairs in
> > > network stack, using RCU conversions.
> > > 
> > > There is still an unavoidable device refcount change for every dst we
> > > create/destroy, and this can slow down some workloads (routers or some
> > > app servers)
> > > 
> > > We can switch to a percpu refcount implementation, now dynamic per_cpu
> > > infrastructure is mature. On a 64 cpus machine, this consumes 256 bytes
> > > per device.
> > 
> > It makes sense, but what about 256 cores and 1024 Vlans?
> > That adds up to 4M of memory which might be noticeable.
> 
> I bet that systems that have 256 cores have >100GB of memory, at which
> point 4MB is way down in the noise.

Well, first, it's 1MB added, and secondly, we added percpu stats for vlan
devices, and this consumed 8x more:

(struct vlan_rx_stats is 32 bytes per cpu and per vlan:
32*256*1024  ->  8 Mbytes)

Some strange machines have many cores sharing a small amount of memory,
but I am not sure they want to run many net devices ;)




^ permalink raw reply
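
The figures in the exchange above can be checked with a few lines of C.
This is only a worked calculation: the 4 bytes per cpu is inferred from
the "256 bytes per device on a 64 cpus machine" statement in the quoted
patch description, and demo_percpu_bytes() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Total bytes used by one per-cpu object replicated for every device. */
static size_t demo_percpu_bytes(size_t bytes_per_cpu, size_t cpus,
				size_t devices)
{
	return bytes_per_cpu * cpus * devices;
}
```

With 4 bytes per cpu, 256 cpus and 1024 vlans give 1 MiB for the
refcount, and the 32-byte per-cpu vlan stats give 8 MiB for the same
machine, matching the 1MB and 8 Mbytes quoted in the reply.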

* Re: [PATCH] Tarpit target for the last stable (2.6.35.7): right version
From: Eric Dumazet @ 2010-10-09  6:10 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Ben Hutchings, Nicola, netfilter-devel, netdev
In-Reply-To: <alpine.LNX.2.01.1010090609350.1542@obet.zrqbmnf.qr>

On Saturday 09 October 2010 at 06:21 +0200, Jan Engelhardt wrote:

> I must say I am surprised.
> 
> There already exists a maintained xt_TARPIT in Xtables-addons, and
> it seems like your patch is a duplicate effort of it.
> 
> If you have any updates to xt_TARPIT, I would recommend the strategy
> that Luciano Coelho used when he wanted to have xt_condition in
> the main kernel:
> 
> 1. take the maintained version from Xt-a
> 2. submit any updates to the Xt-a file, if feasible
> 3. once processed, submit that then for Linux kernel inclusion
> 4. apply further updates to the LK file, if needed
> 
> That way it is way easier to know what specifically new features you 
> have added to xt_TARPIT, if any.
> 

Agreed, so many messages on netdev and netfilter-devel, only to rewrite
a TARPIT module :-(




^ permalink raw reply


* Re: [PATCH] Tarpit target for the last stable (2.6.35.7): right version
From: Jan Engelhardt @ 2010-10-09  4:21 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Nicola, netfilter-devel, netdev
In-Reply-To: <1286565042.2282.6.camel@achroite.uk.solarflarecom.com>


On Friday 2010-10-08 21:10, Ben Hutchings wrote:
>On Fri, 2010-10-08 at 21:01 +0200, Nicola wrote:
>> [Sorry for the inconvenience: I just noticed that Gmail has a bad 
>> formatting. I resend the patch hoping that now things are right.]
>[...]
>
>It's still word-wrapped.  You may have to make the patch an attachment,
>but see Documentation/email-clients.txt for configuration hints.
>
>It is also not based on the right version; networking patches adding new
>features should be based on David Miller's net-next-2.6 tree.

I must say I am surprised.

There already exists a maintained xt_TARPIT in Xtables-addons, and
it seems like your patch is a duplicate effort of it.

If you have any updates to xt_TARPIT, I would recommend the strategy
that Luciano Coelho used when he wanted to have xt_condition in
the main kernel:

1. take the maintained version from Xt-a
2. submit any updates to the Xt-a file, if feasible
3. once processed, submit that then for Linux kernel inclusion
4. apply further updates to the LK file, if needed

That way it is way easier to know what specifically new features you 
have added to xt_TARPIT, if any.


thanks,
Jan

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2010-10-09  3:30 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) Memory out-of-bounds accesses in SCTP, from Dan Rosenberg.

2) We accidentally handle v3 queries improperly when in IGMP v2 compat mode.
   Fix from David Stevens.

3) Ref-after-put bug, and lack of dev_get_by_index() failure checking
   in caif_connect(), from Eric Dumazet.

4) Bluetooth can deadlock on writes when socket send buffer fills up,
   but the packets that would release that space are in the backlog,
   because we hold the socket lock while sleeping and therefore can't
   process those backlog packets.

   Fix from Gustavo F. Padovan.

5) IPV6 doesn't check all of the necessary routes when updating the
   PMTU in response to an ICMP.  Fix from Maciej Żenczykowski.

6) When the global TCP memory limit is hit, we hang because the sk_wait_event()
   third argument is wrong in sk_stream_wait_memory().

   This is a very very old bug, I am very surprised nobody else has hit
   this problem.

   Fix by Nagendra Singh Tomar.

7) An old commit I made to fix a build failure with some drivers
   was wrong, and didn't fix the INET_LRO dependency problems at
   all.

   So revert that and Ben Hutchings fixes it right by adding the
   necessary "INET" dependencies to drivers which need "INET_LRO".

8) Signedness overflow in offset check of cls_u32 packet scheduler
   classifier module, from Dan Carpenter.

9) Update Intel ethernet driver docs and MAINTAINERS entry, from Jeff Kirsher.

10) SKGE on some motherboards can't do 64-bit DMA properly, fix from
    Stanislaw Gruszka.

11) bond_masters sysfs file gives bogus WARN_ON, fix from Neil Horman.

12) Fix ath9k clock rate calculation regression, from Felix Fietkau.

13) Kill AddBA timer, otherwise we reference a freed object.  Fix
    from Johannes Berg.

14) Revert mac80211 change to use netif_receive_skb() instead of netif_rx()
    since sometimes this ends up executing in the wrong context, from
    John W. Linville.

15) Use strlcpy() in isdn to avoid going past the end of a buffer.  From
    Dan Carpenter.

16) ETHTOOL_GRXCLSRLALL exposes kernel heap to userspace.  Fix from
    Kees Cook.

17) Revert ConfigReq rejection commit that causes failures in bluetooth
    validation suite.  From Gustavo F. Padovan.

18) rfcomm_sk_state_change() needs to disable local interrupts to avoid
    illegal lock state, also from Gustavo F. Padovan.
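
As a side note on item 8, the general class of bug is a signed offset
that is only compared against an upper bound, so a negative value slips
through. The sketch below is a generic illustration only: the function
names (offset_ok_buggy, offset_ok_fixed) are hypothetical, and this is
not the cls_u32 code nor necessarily the shape of the actual fix.

```c
#include <stdbool.h>

/* Hypothetical illustration; not taken from net/sched/cls_u32.c. */

/* Buggy: a negative offset is below max_off, so it passes the check. */
bool offset_ok_buggy(int off, int max_off)
{
	return off <= max_off;
}

/* Fixed: reject negative offsets before comparing against the bound. */
bool offset_ok_fixed(int off, int max_off)
{
	return off >= 0 && off <= max_off;
}
```

With off = -32 and max_off = 64 the buggy variant accepts the offset
while the fixed variant rejects it.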

Please pull, thanks a lot!

The following changes since commit 6b0cd00bc396daf5c2dcf17a8d82055335341f46:

  Merge branch 'hwpoison-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 (2010-10-07 13:59:32 -0700)

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Andrei Emeltchenko (1):
      Bluetooth: fix MTU L2CAP configuration parameter

Ben Hutchings (2):
      Revert "ipv4: Make INET_LRO a bool instead of tristate."
      netdev: Depend on INET before selecting INET_LRO

Dan Carpenter (2):
      cls_u32: signedness bug
      isdn: strcpy() => strlcpy()

Dan Rosenberg (2):
      sctp: prevent reading out-of-bounds memory
      sctp: Fix out-of-bounds reading in sctp_asoc_get_hmac()

David S. Miller (3):
      Merge branch 'master' of master.kernel.org:/.../torvalds/linux-2.6
      Merge branch 'master' of git://git.kernel.org/.../padovan/bluetooth-2.6
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6

David Stevens (1):
      ipv4: correct IGMP behavior on v3 query during v2-compatibility mode

Eric Dumazet (1):
      caif: fix two caif_connect() bugs

Felix Fietkau (1):
      ath9k_hw: fix regression in ANI listen time calculation

Gustavo F. Padovan (5):
      Bluetooth: Simplify L2CAP Streaming mode sending
      Bluetooth: Fix inconsistent lock state with RFCOMM
      Revert "Bluetooth: Don't accept ConfigReq if we aren't in the BT_CONFIG state"
      Bluetooth: Fix deadlock in the ERTM logic
      Bluetooth: Disallow to change L2CAP_OPTIONS values when connected

Jeff Kirsher (4):
      ixgbevf.txt: Update ixgbevf documentation
      e1000.txt: Update e1000 documentation
      e1000e.txt: Add e1000e documentation
      MAINTAINERS: update Intel LAN Ethernet info

Johannes Berg (1):
      mac80211: delete AddBA response timer

John W. Linville (1):
      Revert "mac80211: use netif_receive_skb in ieee80211_tx_status callpath"

Kees Cook (1):
      net: clear heap allocation for ETHTOOL_GRXCLSRLALL

Maciej Żenczykowski (1):
      net: Fix IPv6 PMTU disc. w/ asymmetric routes

Mat Martineau (1):
      Bluetooth: Only enable L2CAP FCS for ERTM or streaming

Nagendra Tomar (1):
      net: Fix the condition passed to sk_wait_event()

Neil Horman (1):
      bonding: fix WARN_ON when writing to bond_master sysfs file

Stanislaw Gruszka (1):
      skge: add quirk to limit DMA

 Documentation/networking/e1000.txt   |  373 +++++++++-------------------------
 Documentation/networking/e1000e.txt  |  302 +++++++++++++++++++++++++++
 Documentation/networking/ixgbevf.txt |   40 +----
 MAINTAINERS                          |   16 ++-
 drivers/isdn/sc/interrupt.c          |   18 ++-
 drivers/net/Kconfig                  |    4 +-
 drivers/net/bonding/bond_main.c      |    9 +
 drivers/net/skge.c                   |   18 ++-
 drivers/net/wireless/ath/ath9k/ani.c |    2 +-
 include/net/bluetooth/bluetooth.h    |   18 ++
 net/bluetooth/l2cap.c                |   62 +++---
 net/bluetooth/rfcomm/sock.c          |    4 +
 net/caif/caif_socket.c               |   21 ++-
 net/core/ethtool.c                   |    2 +-
 net/core/stream.c                    |    8 +-
 net/ipv4/Kconfig                     |    2 +-
 net/ipv4/igmp.c                      |   14 ++-
 net/ipv6/route.c                     |   28 +++-
 net/mac80211/agg-tx.c                |    2 +
 net/mac80211/status.c                |    4 +-
 net/sched/cls_u32.c                  |    2 +-
 net/sctp/auth.c                      |    8 +-
 net/sctp/socket.c                    |   13 +-
 23 files changed, 590 insertions(+), 380 deletions(-)
 create mode 100644 Documentation/networking/e1000e.txt
 mode change 100755 => 100644 Documentation/networking/ixgbevf.txt


* [RFC] ixgbe: v3 normalize frag_list usage
From: Duyck, Alexander H @ 2010-10-08 23:57 UTC (permalink / raw)
  To: David Miller
  Cc: Kirsher, Jeffrey T, Brandeburg, Jesse, Allan, Bruce W,
	netdev@vger.kernel.org
In-Reply-To: <20101006.235837.241424681.davem@davemloft.net>

Dave,

Below is the new and improved version of the RSC chaining approach.  Basically
I am holding off on merging the SKB into the frame until the SKB has data in
order to make it take a more standard approach.

Let me know if this will work with the new pointer structure.

Thanks,

Alex

---

This change drops the RSC queue approach and instead creates a normalized
frag_list skb but the tail is kept active and regularly merged into the
host SKB every time it is completed.  In order to identify the tail skb
as a tail we set the head pointer in the RSC CB block of the skb. To locate
the head we just need to check to see if skb->prev is set and make sure to
clean up the pointer before we pass it up to the stack.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ixgbe/ixgbe.h      |    3 +
 drivers/net/ixgbe/ixgbe_main.c |   96 +++++++++++++++++++++++-----------------
 2 files changed, 58 insertions(+), 41 deletions(-)


diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index a62d19c..397f9e1 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -469,8 +469,9 @@ enum ixbge_state_t {
 };
 
 struct ixgbe_rsc_cb {
+	struct sk_buff *head;
 	dma_addr_t dma;
-	u16 skb_cnt;
+	u16 append_cnt;
 	bool delay_unmap;
 };
 #define IXGBE_RSC_CB(skb) ((struct ixgbe_rsc_cb *)(skb)->cb)
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 8edce66..104a833 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -1251,34 +1251,51 @@ static inline u16 ixgbe_get_hlen(union ixgbe_adv_rx_desc *rx_desc)
 }
 
 /**
- * ixgbe_transform_rsc_queue - change rsc queue into a full packet
- * @skb: pointer to the last skb in the rsc queue
+ * ixgbe_merge_active_tail - merge active tail into frag_list skb
+ * @tail: pointer to active tail in frag_list
  *
- * This function changes a queue full of hw rsc buffers into a completed
- * packet.  It uses the ->prev pointers to find the first packet and then
- * turns it into the frag list owner.
+ * This function merges the length and data of an active tail into the
+ * skb containing the frag_list.  It resets the tail's pointer to the head,
+ * but it leaves the head's pointer to tail intact.
  **/
-static inline struct sk_buff *ixgbe_transform_rsc_queue(struct sk_buff *skb)
+static inline struct sk_buff *ixgbe_merge_active_tail(struct sk_buff *tail)
 {
-	unsigned int frag_list_size = 0;
-	unsigned int skb_cnt = 1;
+	struct sk_buff *head = IXGBE_RSC_CB(tail)->head;
 
-	while (skb->prev) {
-		struct sk_buff *prev = skb->prev;
-		frag_list_size += skb->len;
-		skb->prev = NULL;
-		skb = prev;
-		skb_cnt++;
-	}
+	if (!head)
+		return tail;
+
+	IXGBE_RSC_CB(tail)->head = NULL;
+
+	if (head->prev)
+		head->prev->next = tail;
+	else
+		skb_shinfo(head)->frag_list = tail;
+
+	head->len += tail->len;
+	head->data_len += tail->len;
+	head->truesize += tail->len;
 
-	skb_shinfo(skb)->frag_list = skb->next;
-	skb->next = NULL;
-	skb->len += frag_list_size;
-	skb->data_len += frag_list_size;
-	skb->truesize += frag_list_size;
-	IXGBE_RSC_CB(skb)->skb_cnt = skb_cnt;
+	head->prev = tail;
+	IXGBE_RSC_CB(head)->append_cnt++;
 
-	return skb;
+	return head;
+}
+
+/**
+ * ixgbe_close_active_frag_list - cleanup pointers on a frag_list skb
+ * @head: pointer to head of an active frag list
+ *
+ * This function will clear the frag_tail_tracker pointer on an active
+ * frag_list and returns true if the pointer was actually set
+ **/
+static inline bool ixgbe_close_active_frag_list(struct sk_buff *head)
+{
+	if (head->prev) {
+		head->prev = NULL;
+		return true;
+	}
+	return false;
 }
 
 static inline bool ixgbe_get_rsc_state(union ixgbe_adv_rx_desc *rx_desc)
@@ -1397,6 +1414,8 @@ static void ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 			next_buffer = &rx_ring->rx_buffer_info[i];
 		}
 
+		skb = ixgbe_merge_active_tail(skb);
+
 		if (!(staterr & IXGBE_RXD_STAT_EOP)) {
 			if (ring_is_ps_enabled(rx_ring)) {
 				rx_buffer_info->skb = next_buffer->skb;
@@ -1404,15 +1423,13 @@ static void ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 				next_buffer->skb = skb;
 				next_buffer->dma = 0;
 			} else {
-				skb->next = next_buffer->skb;
-				skb->next->prev = skb;
+				IXGBE_RSC_CB(next_buffer->skb)->head = skb;
 			}
 			rx_ring->rx_stats.non_eop_descs++;
 			goto next_desc;
 		}
 
-		if (skb->prev) {
-			skb = ixgbe_transform_rsc_queue(skb);
+		if (ixgbe_close_active_frag_list(skb)) {
 			/* if we got here without RSC the packet is invalid */
 			if (!pkt_is_rsc) {
 				__pskb_trim(skb, 0);
@@ -1437,7 +1454,7 @@ static void ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 					skb_shinfo(skb)->nr_frags;
 			else
 				rx_ring->rx_stats.rsc_count +=
-					IXGBE_RSC_CB(skb)->skb_cnt;
+					IXGBE_RSC_CB(skb)->append_cnt;
 			rx_ring->rx_stats.rsc_flush++;
 		}
 
@@ -3907,19 +3924,18 @@ static void ixgbe_clean_rx_ring(struct ixgbe_ring *rx_ring)
 		if (rx_buffer_info->skb) {
 			struct sk_buff *skb = rx_buffer_info->skb;
 			rx_buffer_info->skb = NULL;
-			do {
-				struct sk_buff *this = skb;
-				if (IXGBE_RSC_CB(this)->delay_unmap) {
-					dma_unmap_single(dev,
-							 IXGBE_RSC_CB(this)->dma,
-							 rx_ring->rx_buf_len,
-							 DMA_FROM_DEVICE);
-					IXGBE_RSC_CB(this)->dma = 0;
-					IXGBE_RSC_CB(skb)->delay_unmap = false;
-				}
-				skb = skb->prev;
-				dev_kfree_skb(this);
-			} while (skb);
+			/* We need to clean up RSC frag lists */
+			skb = ixgbe_merge_active_tail(skb);
+			ixgbe_close_active_frag_list(skb);
+			if (IXGBE_RSC_CB(skb)->delay_unmap) {
+				dma_unmap_single(dev,
+						 IXGBE_RSC_CB(skb)->dma,
+						 rx_ring->rx_buf_len,
+						 DMA_FROM_DEVICE);
+				IXGBE_RSC_CB(skb)->dma = 0;
+				IXGBE_RSC_CB(skb)->delay_unmap = false;
+			}
+			dev_kfree_skb(skb);
 		}
 		if (!rx_buffer_info->page)
 			continue;
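
The pointer handling described in the changelog can be modelled in
userspace roughly as follows. This is a minimal sketch, assuming a
simplified stand-in for struct sk_buff: struct buf, its cb_head field
(standing in for IXGBE_RSC_CB(skb)->head) and the demo function are
hypothetical, and truesize, DMA and descriptor handling are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct sk_buff; names are illustrative only. */
struct buf {
	struct buf *next;        /* next buffer in the frag_list chain */
	struct buf *prev;        /* on a head: last merged tail, else NULL */
	struct buf *frag_list;   /* on a head: first merged tail */
	struct buf *cb_head;     /* on an active tail: the head it belongs to */
	unsigned int len;
	unsigned int data_len;
	unsigned int append_cnt; /* number of tails merged into this head */
};

/* Fold a completed tail into its head; returns the head, or the buffer
 * itself when it is not an active tail. */
struct buf *merge_active_tail(struct buf *tail)
{
	struct buf *head = tail->cb_head;

	if (!head)
		return tail;

	tail->cb_head = NULL;

	/* head->prev tracks the last merged tail, so appending is O(1) */
	if (head->prev)
		head->prev->next = tail;
	else
		head->frag_list = tail;

	head->len += tail->len;
	head->data_len += tail->len;
	head->prev = tail;
	head->append_cnt++;

	return head;
}

/* Clear the tail tracker before the head is handed up the stack. */
bool close_active_frag_list(struct buf *head)
{
	if (head->prev) {
		head->prev = NULL;
		return true;
	}
	return false;
}

/* Demo: a 100-byte head receives tails of 50 and 30 bytes. */
unsigned int demo_total_len(void)
{
	struct buf head = { .len = 100 };
	struct buf t1 = { .len = 50, .cb_head = &head };
	struct buf t2 = { .len = 30, .cb_head = &head };

	merge_active_tail(&t1);
	merge_active_tail(&t2);
	close_active_frag_list(&head);

	return head.len;
}
```

In this model the head always carries the merged length so far, and the
only per-tail state is the back pointer to the head.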


* Re: [PATCH net-next] net: percpu net_device refcount
From: Paul E. McKenney @ 2010-10-08 21:56 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <20101007103051.63b5177c@nehalam>

On Thu, Oct 07, 2010 at 10:30:51AM -0700, Stephen Hemminger wrote:
> On Thu, 07 Oct 2010 19:12:35 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > We tried very hard to remove all possible dev_hold()/dev_put() pairs in
> > network stack, using RCU conversions.
> > 
> > There is still an unavoidable device refcount change for every dst we
> > create/destroy, and this can slow down some workloads (routers or some
> > app servers)
> > 
> > We can switch to a percpu refcount implementation, now dynamic per_cpu
> > infrastructure is mature. On a 64 cpus machine, this consumes 256 bytes
> > per device.
> 
> It makes sense, but what about 256 cores and 1024 Vlans?
> That adds up to 4M of memory, which might be noticeable.

I bet that systems that have 256 cores have >100GB of memory, at which
point 4MB is way down in the noise.

							Thanx, Paul
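
For readers following the sizing argument, here is a rough userspace
model of a per-CPU refcount and its memory footprint. It is a sketch
under stated assumptions: MODEL_BYTES_PER_CPU = 4 is inferred from the
"256 bytes for 64 cpus" figure quoted above, and all names (model_dev,
model_dev_hold, and so on) are hypothetical, not the kernel API.

```c
#include <stddef.h>

#define MODEL_NR_CPUS       4 /* hypothetical CPU count for the model */
#define MODEL_BYTES_PER_CPU 4 /* assumption: 256 bytes / 64 CPUs */

struct model_dev {
	int pcpu_refcnt[MODEL_NR_CPUS]; /* one counter slot per CPU */
};

/* Take a reference from a given CPU: only that CPU's slot is touched. */
void model_dev_hold(struct model_dev *dev, int cpu)
{
	dev->pcpu_refcnt[cpu]++;
}

/* Drop a reference, possibly on a different CPU than the hold. */
void model_dev_put(struct model_dev *dev, int cpu)
{
	dev->pcpu_refcnt[cpu]--;
}

/* The real count is the sum; individual slots may go negative. */
int model_dev_refcnt_read(const struct model_dev *dev)
{
	int i, sum = 0;

	for (i = 0; i < MODEL_NR_CPUS; i++)
		sum += dev->pcpu_refcnt[i];
	return sum;
}

/* Footprint of the counters for ndevs devices on ncpus CPUs. */
size_t model_refcnt_bytes(size_t ncpus, size_t ndevs)
{
	return ncpus * ndevs * MODEL_BYTES_PER_CPU;
}

/* Demo: two holds and one put on different CPUs leave one reference. */
int demo_hold_put(void)
{
	struct model_dev dev = { { 0 } };

	model_dev_hold(&dev, 0);
	model_dev_hold(&dev, 1);
	model_dev_put(&dev, 3); /* slot 3 drops to -1 */

	return model_dev_refcnt_read(&dev);
}
```

Under the 4-byte assumption, 64 CPUs and one device give 256 bytes, and
256 cores with 1024 VLANs give 1 MiB; a larger per-CPU footprint scales
the figure proportionally.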


* Re: [PATCH net-next-2.6] net: Fix rxq ref counting
From: David Miller @ 2010-10-08 21:35 UTC (permalink / raw)
  To: eric.dumazet; +Cc: therbert, netdev
In-Reply-To: <1286485149.3745.93.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 07 Oct 2010 22:59:09 +0200

> Le jeudi 07 octobre 2010 à 13:09 -0700, Tom Herbert a écrit :
>> The rx->count reference is used to track reference counts to the
>> number of rx-queue kobjects created for the device.  This patch
>> eliminates initialization of the counter in netif_alloc_rx_queues
>> and instead increments the counter each time a kobject is created.
>> This is now symmetric with the decrement that is done when an object is
>> released.
>> 
>> Signed-off-by: Tom Herbert <therbert@google.com>
 ...
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks.
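
The symmetry described in the changelog can be pictured with a small
userspace model. This is only a sketch: model_rxq_group and its helpers
are hypothetical names, and the real code works on kobjects rather than
a plain counter.

```c
#include <stdbool.h>

struct model_rxq_group {
	int count;  /* number of live rx-queue objects */
	bool freed; /* set when the last object has been released */
};

/* Called once per object actually created: one increment each. */
void model_rxq_create(struct model_rxq_group *g)
{
	g->count++;
}

/* Called once per object released: one decrement each. */
void model_rxq_release(struct model_rxq_group *g)
{
	if (--g->count == 0)
		g->freed = true;
}

/* Demo: only two objects get created; releasing both frees the group. */
bool demo_partial_create(void)
{
	struct model_rxq_group g = { 0, false };

	model_rxq_create(&g);
	model_rxq_create(&g);
	model_rxq_release(&g);
	model_rxq_release(&g);

	return g.freed;
}
```

Because the count reflects the objects that exist rather than a number
set up front, it reaches zero exactly when the last one is released.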


* Re: [PATCH 4/4] Phonet: mark the pipe controller as EXPERIMENTAL
From: David Miller @ 2010-10-08 21:17 UTC (permalink / raw)
  To: remi; +Cc: netdev, remi.denis-courmont
In-Reply-To: <1286546523-3340-4-git-send-email-remi@remlab.net>

From: Rémi Denis-Courmont <remi@remlab.net>
Date: Fri,  8 Oct 2010 17:02:03 +0300

> From: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
> 
> There are a bunch of issues that need to be fixed, including:
>  - GFP_KERNEL allocations from atomic context
>    (and GFP_ATOMIC in process context),
>  - abuse of the setsockopt() call convention,
>  - unprotected/unlocked static variables...
> 
> IMHO, we will need to alter the userspace ABI when we fix it. So mark
> the configuration option as EXPERIMENTAL for the time being (or should
> it be BROKEN instead?).
> 
> Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>

Applied.


* Re: [PATCH 3/4] Phonet: cleanup pipe enable socket option
From: David Miller @ 2010-10-08 21:17 UTC (permalink / raw)
  To: remi; +Cc: netdev, remi.denis-courmont, kumar.sanghvi
In-Reply-To: <1286546523-3340-3-git-send-email-remi@remlab.net>

From: Rémi Denis-Courmont <remi@remlab.net>
Date: Fri,  8 Oct 2010 17:02:02 +0300

> From: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
> 
> The current code works like this:
 ...
> ...which does not follow the usual socket option pattern. This patch
> merges all three "options" into a single gettable&settable option,
> before Linux 2.6.37 gets out:
 ...
> This also fixes the error code from EFAULT to ENOTCONN.
> 
> Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
> Cc: Kumar Sanghvi <kumar.sanghvi@stericsson.com>

Applied.


* Re: [PATCH 2/4] Phonet: advise against enabling the pipe controller
From: David Miller @ 2010-10-08 21:17 UTC (permalink / raw)
  To: remi; +Cc: netdev, remi.denis-courmont
In-Reply-To: <1286546523-3340-2-git-send-email-remi@remlab.net>

From: Rémi Denis-Courmont <remi@remlab.net>
Date: Fri,  8 Oct 2010 17:02:01 +0300

> From: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
> 
> As it currently is, the new code path is not compatible with existing
> Nokia modems. This would break existing userspace for Nokia modem, such
> as the existing oFono ISI driver.
> 
> Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>

This really needs to be designed such that it can co-exist with current
userland technology.

Applied.


