Netdev List
 help / color / mirror / Atom feed
* Re: Bug in skb_segment: fskb->len != len
From: Herbert Xu @ 2013-10-30  1:50 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Christoph Paasch, netdev
In-Reply-To: <1383059293.5464.31.camel@edumazet-glaptop.roam.corp.google.com>

On Tue, Oct 29, 2013 at 08:08:13AM -0700, Eric Dumazet wrote:
>
> GRO layer was updated to be able to stack two or three sk_buff,
> fully populated with page frags.
> 
> Thats quite mandatory to support line rate for 40Gb links.

Indeed I missed this.  Which commit introduced this change?

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH v2 net-next] net: introduce gro_frag_list_enable sysctl
From: David Miller @ 2013-10-30  2:02 UTC (permalink / raw)
  To: eric.dumazet; +Cc: christoph.paasch, herbert, netdev, hkchu, mwdalton
In-Reply-To: <1383094428.4857.16.camel@edumazet-glaptop.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 29 Oct 2013 17:53:48 -0700

> So should we apply the first fix to avoid the BUG_ON() ?

Please be more specific, are you talking about splitting up
this patch in some way?

^ permalink raw reply

* Re: [PATCH v2 net-next] net: introduce gro_frag_list_enable sysctl
From: Herbert Xu @ 2013-10-30  2:05 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, christoph.paasch, netdev, hkchu, mwdalton
In-Reply-To: <20131029.220253.1263087684709722001.davem@davemloft.net>

On Tue, Oct 29, 2013 at 10:02:53PM -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Tue, 29 Oct 2013 17:53:48 -0700
> 
> > So should we apply the first fix to avoid the BUG_ON() ?
> 
> Please be more specific, are you talking about splitting up
> this patch in some way?

I think Eric is referring to the patch that removes the BUG_ON
in skb_segment and deals with the new mega-GRO packets.

I think that's fine for stable, but for the long term we should
fix it properly as these new meag-GRO packets still retain the
existing packet boundaries and are trivially segmentable.

If we are indeed able to do that, I doubt we would even need
the sysctl patch since the GRO performance should be vastly
superior to the non-GRO case, even for a router/bridge.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH net-next 08/11] openvswitch: Per cpu flow stats.
From: David Miller @ 2013-10-30  2:11 UTC (permalink / raw)
  To: jesse-l0M0P4e3n4LQT0dZR+AlfA
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1383092544-50599-9-git-send-email-jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>

From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
Date: Tue, 29 Oct 2013 17:22:21 -0700

> From: Pravin B Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
> 
> With mega flow implementation ovs flow can be shared between
> multiple CPUs which makes stats updates highly contended
> operation. Following patch allocates separate stats for each
> CPU to make stats update scalable.
> 
> Signed-off-by: Pravin B Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
> Signed-off-by: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>

Please properly use __percpu memory for this.

Using arrays for this is very strongly discouraged, please instead
use the kernel facility designed exactly for this.

^ permalink raw reply

* Re: [PATCH v2 net-next] net: introduce gro_frag_list_enable sysctl
From: Jerry Chu @ 2013-10-30  2:13 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Miller, Eric Dumazet, Christoph Paasch,
	netdev@vger.kernel.org, Michael Dalton
In-Reply-To: <20131030020543.GA1925@gondor.apana.org.au>

On Tue, Oct 29, 2013 at 7:05 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Tue, Oct 29, 2013 at 10:02:53PM -0400, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Tue, 29 Oct 2013 17:53:48 -0700
>>
>> > So should we apply the first fix to avoid the BUG_ON() ?
>>
>> Please be more specific, are you talking about splitting up
>> this patch in some way?
>
> I think Eric is referring to the patch that removes the BUG_ON
> in skb_segment and deals with the new mega-GRO packets.
>
> I think that's fine for stable, but for the long term we should
> fix it properly as these new meag-GRO packets still retain the
> existing packet boundaries and are trivially segmentable.
>
> If we are indeed able to do that, I doubt we would even need
> the sysctl patch since the GRO performance should be vastly
> superior to the non-GRO case, even for a router/bridge.

Probably not the case for the simple forwarding case. See my
test result of some small (5-8%) CPU+throughput penalty from
GRO (over GRE tunnel) posted previously. But I can believe
the number may be very different if the forwarding path involves
more work (NAT, iptables filtering,...,etc) resulting in a higher per
pkt cost.

Best,

Jerry

>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* linux-next: manual merge of the net-next tree with the net tree
From: Stephen Rothwell @ 2013-10-30  2:14 UTC (permalink / raw)
  To: David Miller, netdev
  Cc: linux-next, linux-kernel, Nikolay Aleksandrov, Joe Perches

[-- Attachment #1: Type: text/plain, Size: 1360 bytes --]

Hi all,

Today's linux-next merge of the net-next tree got a conflict in
drivers/net/netconsole.c between commit c7c6effdeffc ("netconsole: fix
multiple race conditions") from the net tree and commit 22ded57729e6
("netconsole: Convert to pr_<level>") from the net-next tree.

(Thanks for removing that spare blank line :-))

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

diff --cc drivers/net/netconsole.c
index c9a15925a1f7,a8ef4c4b94be..000000000000
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@@ -333,18 -336,14 +335,18 @@@ static ssize_t store_enabled(struct net
  		netpoll_print_options(&nt->np);
  
  		err = netpoll_setup(&nt->np);
 -		if (err) {
 -			mutex_unlock(&nt->mutex);
 +		if (err)
  			return err;
 -		}
  
- 		printk(KERN_INFO "netconsole: network logging started\n");
+ 		pr_info("network logging started\n");
 -
  	} else {	/* 0 */
 +		/* We need to disable the netconsole before cleaning it up
 +		 * otherwise we might end up in write_msg() with
 +		 * nt->np.dev == NULL and nt->enabled == 1
 +		 */
 +		spin_lock_irqsave(&target_list_lock, flags);
 +		nt->enabled = 0;
 +		spin_unlock_irqrestore(&target_list_lock, flags);
  		netpoll_cleanup(&nt->np);
  	}
  

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH v2 net-next] net: introduce gro_frag_list_enable sysctl
From: Herbert Xu @ 2013-10-30  2:19 UTC (permalink / raw)
  To: Jerry Chu
  Cc: David Miller, Eric Dumazet, Christoph Paasch,
	netdev@vger.kernel.org, Michael Dalton
In-Reply-To: <CAPshTCgykFMa=4rnQ3TKa2cfgWYj4TgByO2GbzLzq71Y_=oc=Q@mail.gmail.com>

On Tue, Oct 29, 2013 at 07:13:50PM -0700, Jerry Chu wrote:
> On Tue, Oct 29, 2013 at 7:05 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> > If we are indeed able to do that, I doubt we would even need
> > the sysctl patch since the GRO performance should be vastly
> > superior to the non-GRO case, even for a router/bridge.
> 
> Probably not the case for the simple forwarding case. See my
> test result of some small (5-8%) CPU+throughput penalty from
> GRO (over GRE tunnel) posted previously. But I can believe
> the number may be very different if the forwarding path involves
> more work (NAT, iptables filtering,...,etc) resulting in a higher per
> pkt cost.

Your numbers are with Eric's current patch that just linearises
the packet, what I'm saying is that you don't need to linearise
these packets since the packet boundaries are still there, just
hidden inside each frag_list.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH v2] bgmac: don't update slot on skb alloc/dma mapping error
From: Nathan Hintz @ 2013-10-30  2:32 UTC (permalink / raw)
  To: netdev; +Cc: zajec5

Don't update the slot in "bgmac_dma_rx_skb_for_slot" unless both the
skb alloc and dma mapping are successful; and free the newly allocated
skb if a dma mapping error occurs.  This relieves the caller of the need
to deduce/execute the appropriate cleanup action required when an error
occurs.

Signed-off-by: Nathan Hintz <nlhintz@hotmail.com>
Acked-by: Rafał Miłecki <zajec5@gmail.com>

---

v2: Revised commit message
---
 drivers/net/ethernet/broadcom/bgmac.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
index 7eca5a1..11e5d8d 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -252,25 +252,33 @@ static int bgmac_dma_rx_skb_for_slot(struct bgmac *bgmac,
 				     struct bgmac_slot_info *slot)
 {
 	struct device *dma_dev = bgmac->core->dma_dev;
+	struct sk_buff *skb;
+	dma_addr_t dma_addr;
 	struct bgmac_rx_header *rx;
 
 	/* Alloc skb */
-	slot->skb = netdev_alloc_skb(bgmac->net_dev, BGMAC_RX_BUF_SIZE);
-	if (!slot->skb)
+	skb = netdev_alloc_skb(bgmac->net_dev, BGMAC_RX_BUF_SIZE);
+	if (!skb)
 		return -ENOMEM;
 
 	/* Poison - if everything goes fine, hardware will overwrite it */
-	rx = (struct bgmac_rx_header *)slot->skb->data;
+	rx = (struct bgmac_rx_header *)skb->data;
 	rx->len = cpu_to_le16(0xdead);
 	rx->flags = cpu_to_le16(0xbeef);
 
 	/* Map skb for the DMA */
-	slot->dma_addr = dma_map_single(dma_dev, slot->skb->data,
-					BGMAC_RX_BUF_SIZE, DMA_FROM_DEVICE);
-	if (dma_mapping_error(dma_dev, slot->dma_addr)) {
+	dma_addr = dma_map_single(dma_dev, skb->data,
+				  BGMAC_RX_BUF_SIZE, DMA_FROM_DEVICE);
+	if (dma_mapping_error(dma_dev, dma_addr)) {
 		bgmac_err(bgmac, "DMA mapping error\n");
+		dev_kfree_skb(skb);
 		return -ENOMEM;
 	}
+
+	/* Update the slot */
+	slot->skb = skb;
+	slot->dma_addr = dma_addr;
+
 	if (slot->dma_addr & 0xC0000000)
 		bgmac_warn(bgmac, "DMA address using 0xC0000000 bit(s), it may need translation trick\n");
 
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH v2 net-next] net: introduce gro_frag_list_enable sysctl
From: David Miller @ 2013-10-30  2:33 UTC (permalink / raw)
  To: hkchu; +Cc: herbert, eric.dumazet, christoph.paasch, netdev, mwdalton
In-Reply-To: <CAPshTCgykFMa=4rnQ3TKa2cfgWYj4TgByO2GbzLzq71Y_=oc=Q@mail.gmail.com>

From: Jerry Chu <hkchu@google.com>
Date: Tue, 29 Oct 2013 19:13:50 -0700

> On Tue, Oct 29, 2013 at 7:05 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>> On Tue, Oct 29, 2013 at 10:02:53PM -0400, David Miller wrote:
>>> From: Eric Dumazet <eric.dumazet@gmail.com>
>>> Date: Tue, 29 Oct 2013 17:53:48 -0700
>>>
>>> > So should we apply the first fix to avoid the BUG_ON() ?
>>>
>>> Please be more specific, are you talking about splitting up
>>> this patch in some way?
>>
>> I think Eric is referring to the patch that removes the BUG_ON
>> in skb_segment and deals with the new mega-GRO packets.
>>
>> I think that's fine for stable, but for the long term we should
>> fix it properly as these new meag-GRO packets still retain the
>> existing packet boundaries and are trivially segmentable.
>>
>> If we are indeed able to do that, I doubt we would even need
>> the sysctl patch since the GRO performance should be vastly
>> superior to the non-GRO case, even for a router/bridge.
> 
> Probably not the case for the simple forwarding case. See my
> test result of some small (5-8%) CPU+throughput penalty from
> GRO (over GRE tunnel) posted previously. But I can believe
> the number may be very different if the forwarding path involves
> more work (NAT, iptables filtering,...,etc) resulting in a higher per
> pkt cost.

It's that way because it's not implemented properly.

GRO should always win, even on a router, because it decreases the
number of fundamental operations (routing lookups) that the stack
needs to perform.

^ permalink raw reply

* Re: [PATCH v2 net-next] net: introduce gro_frag_list_enable sysctl
From: David Miller @ 2013-10-30  2:34 UTC (permalink / raw)
  To: herbert; +Cc: hkchu, eric.dumazet, christoph.paasch, netdev, mwdalton
In-Reply-To: <20131030021905.GA2089@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 30 Oct 2013 10:19:06 +0800

> On Tue, Oct 29, 2013 at 07:13:50PM -0700, Jerry Chu wrote:
>> On Tue, Oct 29, 2013 at 7:05 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>>
>> > If we are indeed able to do that, I doubt we would even need
>> > the sysctl patch since the GRO performance should be vastly
>> > superior to the non-GRO case, even for a router/bridge.
>> 
>> Probably not the case for the simple forwarding case. See my
>> test result of some small (5-8%) CPU+throughput penalty from
>> GRO (over GRE tunnel) posted previously. But I can believe
>> the number may be very different if the forwarding path involves
>> more work (NAT, iptables filtering,...,etc) resulting in a higher per
>> pkt cost.
> 
> Your numbers are with Eric's current patch that just linearises
> the packet, what I'm saying is that you don't need to linearise
> these packets since the packet boundaries are still there, just
> hidden inside each frag_list.

Agreed.

^ permalink raw reply

* Re: [PATCH] net/cdc_ncm: fix null pointer panic at usbnet_link_change
From: David Miller @ 2013-10-30  2:38 UTC (permalink / raw)
  To: changbinx.du; +Cc: oliver, linux-usb, netdev, linux-kernel
In-Reply-To: <0C18FE92A7765D4EB9EE5D38D86A563A019F450F@SHSMSX103.ccr.corp.intel.com>

From: "Du, ChangbinX" <changbinx.du@intel.com>
Date: Tue, 29 Oct 2013 03:30:42 +0000

> In cdc_ncm_bind() function, it call cdc_ncm_bind_common() to setup usb.
> But cdc_ncm_bind_common() may meet error and cause usbnet_disconnect()
> be called which calls free_netdev(net). Thus usbnet structure(alloced
> with net_device structure) will be freed,too.
> So we cannot call usbnet_link_change() if cdc_ncm_bind_common() return
> error.

This is not the bug.

The problem is in cdc_ncm_bind_common().

It seems to leave dangling interface data pointers in some cases, and
then branches just to "error" so that they don't get cleared back out.

This bypasses the protection present in cdc_ncm_disconnect() meant to
avoid this problem.

^ permalink raw reply

* Re: [PATCH] bnx2: Use dev_kfree_skb_any() in bnx2_tx_int()
From: David Miller @ 2013-10-30  2:42 UTC (permalink / raw)
  To: tdmackey; +Cc: mchan, netdev, linux-kernel
In-Reply-To: <1383084998-22673-1-git-send-email-tdmackey@booleanhaiku.com>

From: David Mackey <tdmackey@booleanhaiku.com>
Date: Tue, 29 Oct 2013 15:16:38 -0700

> Using dev_kfree_skb_any() will resolve the below issue when a
> netconsole message is transmitted in an irq.
 ...
> Signed-off-by: David Mackey <tdmackey@booleanhaiku.com>

This is absolutely not the correct fix.

The netpoll facility must invoke ->poll() in an environment which
is compatible, locking and interrupt/soft-interrupt wise, as that
in which it is normally called.

Therefore, bnx2_tx_int(), which is invoked from the driver's ->poll()
method, should not need to use dev_kfree_skb_any().  The real problem
is somewhere else.

^ permalink raw reply

* Re: [PATCH net] virtio-net: correctly handle cpu hotplug notifier during resuming
From: David Miller @ 2013-10-30  2:44 UTC (permalink / raw)
  To: jasowang; +Cc: mst, netdev, linux-kernel, virtualization
In-Reply-To: <1383030667-14343-1-git-send-email-jasowang@redhat.com>

From: Jason Wang <jasowang@redhat.com>
Date: Tue, 29 Oct 2013 15:11:07 +0800

> commit 3ab098df35f8b98b6553edc2e40234af512ba877 (virtio-net: don't respond to
> cpu hotplug notifier if we're not ready) tries to bypass the cpu hotplug
> notifier by checking the config_enable and does nothing is it was false. So it
> need to try to hold the config_lock mutex which may happen in atomic
> environment which leads the following warnings:
 ...
> A correct fix is to unregister the hotcpu notifier during restore and register a
> new one in resume.
> 
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Tested-by: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Wanlong Gao <gaowanlong@cn.fujitsu.com>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> This patch is needed for 3.8 and above.

Applied and queued up for -stable, thanks Jason.

^ permalink raw reply

* Re: [PATCH net-next] tcp: temporarily disable Fast Open on SYN timeout
From: David Miller @ 2013-10-30  2:51 UTC (permalink / raw)
  To: ycheng; +Cc: ncardwell, edumazet, netdev
In-Reply-To: <1383066545-27348-1-git-send-email-ycheng@google.com>

From: Yuchung Cheng <ycheng@google.com>
Date: Tue, 29 Oct 2013 10:09:05 -0700

> Fast Open currently has a fall back feature to address SYN-data
> being dropped by but it requires the middle-box to pass on regular
> SYN retry after SYN-data. This is implemented in commit aab487435
> ("net-tcp: Fast Open client - detecting SYN-data drops")
> 
> However some NAT boxes will drop all subsequent packets after first
> SYN-data and blackholes the entire connections.  An example is incommit
> 356d7d8 "netfilter: nf_conntrack: fix tcp_in_window for Fast Open".
> 
> The sender should note such incidents and falls back to use regular
> TCP handshake on subsequent attempt temporarily as well: after the
> second SYN timeouts the original Fast Open SYN is most likely lost.
> When such an event recurs Fast Open is disabled based on the number
> of recurrences exponentially.
> 
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH v2] octeon_mgmt: drop redundant mac address check
From: David Miller @ 2013-10-30  2:53 UTC (permalink / raw)
  To: luka; +Cc: netdev
In-Reply-To: <1383088152-24869-1-git-send-email-luka@openwrt.org>

From: Luka Perkov <luka@openwrt.org>
Date: Wed, 30 Oct 2013 00:09:12 +0100

> Checking if MAC address is valid using is_valid_ether_addr() is already done in
> of_get_mac_address().
> 
> Signed-off-by: Luka Perkov <luka@openwrt.org>
> Acked-by: David Daney <david.daney@cavium.com>
> CC: David Miller <davem@davemloft.net>

Applied.

^ permalink raw reply

* Re: [PATCH v2] mvneta: drop redundant mac address check
From: David Miller @ 2013-10-30  2:53 UTC (permalink / raw)
  To: luka; +Cc: netdev
In-Reply-To: <1383088201-25136-1-git-send-email-luka@openwrt.org>

From: Luka Perkov <luka@openwrt.org>
Date: Wed, 30 Oct 2013 00:10:01 +0100

> Checking if MAC address is valid using is_valid_ether_addr() is already done in
> of_get_mac_address().
> 
> Signed-off-by: Luka Perkov <luka@openwrt.org>
> Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2] arc_emac: drop redundant mac address check
From: David Miller @ 2013-10-30  2:53 UTC (permalink / raw)
  To: luka; +Cc: netdev, Alexey.Brodkin
In-Reply-To: <1383088260-25422-1-git-send-email-luka@openwrt.org>

From: Luka Perkov <luka@openwrt.org>
Date: Wed, 30 Oct 2013 00:11:00 +0100

> Checking if MAC address is valid using is_valid_ether_addr() is already done in
> of_get_mac_address(). While at it, reorganize checking so it matches checks in
> other drivers.
> 
> Signed-off-by: Luka Perkov <luka@openwrt.org>

Applied.

^ permalink raw reply

* Re: [PATCH 1/2] ibm emac: Don't call napi_complete if napi_reschedule failed
From: David Miller @ 2013-10-30  2:58 UTC (permalink / raw)
  To: alistair; +Cc: netdev, benh
In-Reply-To: <1383090638-21526-1-git-send-email-alistair@popple.id.au>

From: Alistair Popple <alistair@popple.id.au>
Date: Wed, 30 Oct 2013 10:50:37 +1100

> This patch fixes a bug which would trigger the BUG_ON() at
> net/core/dev.c:4156. It was found that this was due to continuing
> processing in the current poll call even when the call to
> napi_reschedule failed, indicating the device was already on the
> polling list. This resulted in an extra call to napi_complete which
> triggered the BUG_ON().
> 
> This patch ensures that we only contine processing rotting packets in
> the current mal_poll call if we are not already on the polling list.
> 
> Signed-off-by: Alistair Popple <alistair@popple.id.au>

Applied.

^ permalink raw reply

* Re: [PATCH 2/2] ibm emac: Fix locking for enable/disable eob irq
From: David Miller @ 2013-10-30  2:58 UTC (permalink / raw)
  To: alistair; +Cc: netdev, benh
In-Reply-To: <1383090638-21526-2-git-send-email-alistair@popple.id.au>

From: Alistair Popple <alistair@popple.id.au>
Date: Wed, 30 Oct 2013 10:50:38 +1100

> Calls to mal_enable_eob_irq perform a read-write-modify of a dcr to
> enable device irqs which is protected by a spin lock. However calls to
> mal_disable_eob_irq do not take the corresponding lock.
> 
> This patch resolves the problem by ensuring that calls to
> mal_disable_eob_irq also take the lock.
> 
> Signed-off-by: Alistair Popple <alistair@popple.id.au>

Applied.

^ permalink raw reply

* Re: [PATCH v2] bgmac: don't update slot on skb alloc/dma mapping error
From: David Miller @ 2013-10-30  2:59 UTC (permalink / raw)
  To: nlhintz; +Cc: netdev, zajec5
In-Reply-To: <BLU0-SMTP255BD097115C7E52E264D41AC0A0@phx.gbl>

From: Nathan Hintz <nlhintz@hotmail.com>
Date: Tue, 29 Oct 2013 19:32:01 -0700

> Don't update the slot in "bgmac_dma_rx_skb_for_slot" unless both the
> skb alloc and dma mapping are successful; and free the newly allocated
> skb if a dma mapping error occurs.  This relieves the caller of the need
> to deduce/execute the appropriate cleanup action required when an error
> occurs.
> 
> Signed-off-by: Nathan Hintz <nlhintz@hotmail.com>
> Acked-by: Rafał Miłecki <zajec5@gmail.com>

Applied, thanks.

^ permalink raw reply

* RE: [PATCH net 2/3] r8152: modify the tx flow
From: hayeswang @ 2013-10-30  3:03 UTC (permalink / raw)
  To: 'David Miller'
  Cc: netdev, 'nic_swsd', linux-kernel, linux-usb
In-Reply-To: <20131029.174936.13317032766742363.davem@davemloft.net>

 David Miller [mailto:davem@davemloft.net] 
> Sent: Wednesday, October 30, 2013 5:50 AM
> To: Hayeswang
> Cc: netdev@vger.kernel.org; nic_swsd; 
> linux-kernel@vger.kernel.org; linux-usb@vger.kernel.org
> Subject: Re: [PATCH net 2/3] r8152: modify the tx flow
> 
> From: Hayes Wang <hayeswang@realtek.com>
> Date: Tue, 29 Oct 2013 15:56:16 +0800
> 
> > Support stopping and waking tx queue. The maximum tx queue length
> > is 60.
> 
> What is so special about the number 60?  It seems arbitrary, and if
> it isn't arbitrary you haven't described why this value was choosen.

The value is arbitrary. I think it is better to stop tx when
queuing many packets, otherwise all the available memory may
be used for tx skb. The queue length could be any value or
unlimited if the memory is enough. Should I remove it?

> I've asked you politely last time around to significantly improve
> the quality of your commit messages, and you haven't done this at
> all.

I thought you indicated the last patch only and the others are clear enough.
I would improve them.

> I'm not applying any of these patches until your commit messages
> properly describe your changes completely.
> 

^ permalink raw reply

* Re: [PATCH net 2/3] r8152: modify the tx flow
From: David Miller @ 2013-10-30  3:08 UTC (permalink / raw)
  To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel, linux-usb
In-Reply-To: <7F69554959F74FA58AE07F06B9FCF029@realtek.com.tw>

From: hayeswang <hayeswang@realtek.com>
Date: Wed, 30 Oct 2013 11:03:55 +0800

>  David Miller [mailto:davem@davemloft.net] 
>> Sent: Wednesday, October 30, 2013 5:50 AM
>> To: Hayeswang
>> Cc: netdev@vger.kernel.org; nic_swsd; 
>> linux-kernel@vger.kernel.org; linux-usb@vger.kernel.org
>> Subject: Re: [PATCH net 2/3] r8152: modify the tx flow
>> 
>> From: Hayes Wang <hayeswang@realtek.com>
>> Date: Tue, 29 Oct 2013 15:56:16 +0800
>> 
>> > Support stopping and waking tx queue. The maximum tx queue length
>> > is 60.
>> 
>> What is so special about the number 60?  It seems arbitrary, and if
>> it isn't arbitrary you haven't described why this value was choosen.
> 
> The value is arbitrary. I think it is better to stop tx when
> queuing many packets, otherwise all the available memory may
> be used for tx skb. The queue length could be any value or
> unlimited if the memory is enough. Should I remove it?

You should at least pick some value that you have analyzed in some
way.  We've done a lot of work to strongly limit the amount of SKB
data which sits in device queues on transmit, and what you're doing
here works against to those goals.

Ideally you should pick a value which is sufficient to meet two
goals at the same time:

1) With constant transmit traffic coming from the networking stack,
   the device never starves for new transmit data to send.

2) We never queue up more traffic than we need to satisfy requirement
   #1.

^ permalink raw reply

* [PATCH net-next v2] tipc: remove two indentation levels in tipc_recv_msg routine
From: Ying Xue @ 2013-10-30  3:26 UTC (permalink / raw)
  To: davem, David.Laight, maloy
  Cc: Paul.Gortmaker, jon.maloy, erik.hugne, andreas.bofjall, ying.xue,
	tipc-discussion, netdev

The message dispatching part of tipc_recv_msg() is wrapped layers of
while/if/if/switch, causing out-of-control indentation and does not
look very good. We reduce two indentation levels by separating the
message dispatching from the blocks that checks link state and
sequence numbers, allowing longer function and arg names to be
consistently indented without wrapping. Additionally we also rename
"cont" label to "discard" and add one new label called "unlock_discard"
to make code clearer. In all, these are cosmetic changes that do not
alter the operation of TIPC in any way.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Cc: David Laight <david.laight@aculab.com>
Cc: Andreas Bofjäll <andreas.bofjall@ericsson.com>
---
v2: Incorporated comments from David Laight and Andreas Bofjäll

 net/tipc/link.c |  173 +++++++++++++++++++++++++++----------------------------
 1 file changed, 84 insertions(+), 89 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index e8153f6..54163f9 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1507,15 +1507,15 @@ void tipc_recv_msg(struct sk_buff *head, struct tipc_bearer *b_ptr)
 
 		/* Ensure bearer is still enabled */
 		if (unlikely(!b_ptr->active))
-			goto cont;
+			goto discard;
 
 		/* Ensure message is well-formed */
 		if (unlikely(!link_recv_buf_validate(buf)))
-			goto cont;
+			goto discard;
 
 		/* Ensure message data is a single contiguous unit */
 		if (unlikely(skb_linearize(buf)))
-			goto cont;
+			goto discard;
 
 		/* Handle arrival of a non-unicast link message */
 		msg = buf_msg(buf);
@@ -1531,20 +1531,18 @@ void tipc_recv_msg(struct sk_buff *head, struct tipc_bearer *b_ptr)
 		/* Discard unicast link messages destined for another node */
 		if (unlikely(!msg_short(msg) &&
 			     (msg_destnode(msg) != tipc_own_addr)))
-			goto cont;
+			goto discard;
 
 		/* Locate neighboring node that sent message */
 		n_ptr = tipc_node_find(msg_prevnode(msg));
 		if (unlikely(!n_ptr))
-			goto cont;
+			goto discard;
 		tipc_node_lock(n_ptr);
 
 		/* Locate unicast link endpoint that should handle message */
 		l_ptr = n_ptr->links[b_ptr->identity];
-		if (unlikely(!l_ptr)) {
-			tipc_node_unlock(n_ptr);
-			goto cont;
-		}
+		if (unlikely(!l_ptr))
+			goto unlock_discard;
 
 		/* Verify that communication with node is currently allowed */
 		if ((n_ptr->block_setup & WAIT_PEER_DOWN) &&
@@ -1554,10 +1552,8 @@ void tipc_recv_msg(struct sk_buff *head, struct tipc_bearer *b_ptr)
 			!msg_redundant_link(msg))
 			n_ptr->block_setup &= ~WAIT_PEER_DOWN;
 
-		if (n_ptr->block_setup) {
-			tipc_node_unlock(n_ptr);
-			goto cont;
-		}
+		if (n_ptr->block_setup)
+			goto unlock_discard;
 
 		/* Validate message sequence number info */
 		seq_no = msg_seqno(msg);
@@ -1593,98 +1589,97 @@ void tipc_recv_msg(struct sk_buff *head, struct tipc_bearer *b_ptr)
 
 		/* Now (finally!) process the incoming message */
 protocol_check:
-		if (likely(link_working_working(l_ptr))) {
-			if (likely(seq_no == mod(l_ptr->next_in_no))) {
-				l_ptr->next_in_no++;
-				if (unlikely(l_ptr->oldest_deferred_in))
-					head = link_insert_deferred_queue(l_ptr,
-									  head);
-deliver:
-				if (likely(msg_isdata(msg))) {
-					tipc_node_unlock(n_ptr);
-					tipc_port_recv_msg(buf);
-					continue;
-				}
-				switch (msg_user(msg)) {
-					int ret;
-				case MSG_BUNDLER:
-					l_ptr->stats.recv_bundles++;
-					l_ptr->stats.recv_bundled +=
-						msg_msgcnt(msg);
-					tipc_node_unlock(n_ptr);
-					tipc_link_recv_bundle(buf);
-					continue;
-				case NAME_DISTRIBUTOR:
-					n_ptr->bclink.recv_permitted = true;
-					tipc_node_unlock(n_ptr);
-					tipc_named_recv(buf);
-					continue;
-				case BCAST_PROTOCOL:
-					tipc_link_recv_sync(n_ptr, buf);
-					tipc_node_unlock(n_ptr);
-					continue;
-				case CONN_MANAGER:
-					tipc_node_unlock(n_ptr);
-					tipc_port_recv_proto_msg(buf);
-					continue;
-				case MSG_FRAGMENTER:
-					l_ptr->stats.recv_fragments++;
-					ret = tipc_link_recv_fragment(
-						&l_ptr->defragm_buf,
-						&buf, &msg);
-					if (ret == 1) {
-						l_ptr->stats.recv_fragmented++;
-						goto deliver;
-					}
-					if (ret == -1)
-						l_ptr->next_in_no--;
-					break;
-				case CHANGEOVER_PROTOCOL:
-					type = msg_type(msg);
-					if (link_recv_changeover_msg(&l_ptr,
-								     &buf)) {
-						msg = buf_msg(buf);
-						seq_no = msg_seqno(msg);
-						if (type == ORIGINAL_MSG)
-							goto deliver;
-						goto protocol_check;
-					}
-					break;
-				default:
-					kfree_skb(buf);
-					buf = NULL;
-					break;
-				}
+		if (unlikely(!link_working_working(l_ptr))) {
+			if (msg_user(msg) == LINK_PROTOCOL) {
+				link_recv_proto_msg(l_ptr, buf);
+				head = link_insert_deferred_queue(l_ptr, head);
+				tipc_node_unlock(n_ptr);
+				continue;
+			}
+
+			/* Traffic message. Conditionally activate link */
+			link_state_event(l_ptr, TRAFFIC_MSG_EVT);
+
+			if (link_working_working(l_ptr)) {
+				/* Re-insert buffer in front of queue */
+				buf->next = head;
+				head = buf;
 				tipc_node_unlock(n_ptr);
-				tipc_net_route_msg(buf);
 				continue;
 			}
+			goto unlock_discard;
+		}
+
+		/* Link is now in state WORKING_WORKING */
+		if (unlikely(seq_no != mod(l_ptr->next_in_no))) {
 			link_handle_out_of_seq_msg(l_ptr, buf);
 			head = link_insert_deferred_queue(l_ptr, head);
 			tipc_node_unlock(n_ptr);
 			continue;
 		}
-
-		/* Link is not in state WORKING_WORKING */
-		if (msg_user(msg) == LINK_PROTOCOL) {
-			link_recv_proto_msg(l_ptr, buf);
+		l_ptr->next_in_no++;
+		if (unlikely(l_ptr->oldest_deferred_in))
 			head = link_insert_deferred_queue(l_ptr, head);
+deliver:
+		if (likely(msg_isdata(msg))) {
 			tipc_node_unlock(n_ptr);
+			tipc_port_recv_msg(buf);
 			continue;
 		}
-
-		/* Traffic message. Conditionally activate link */
-		link_state_event(l_ptr, TRAFFIC_MSG_EVT);
-
-		if (link_working_working(l_ptr)) {
-			/* Re-insert buffer in front of queue */
-			buf->next = head;
-			head = buf;
+		switch (msg_user(msg)) {
+			int ret;
+		case MSG_BUNDLER:
+			l_ptr->stats.recv_bundles++;
+			l_ptr->stats.recv_bundled += msg_msgcnt(msg);
+			tipc_node_unlock(n_ptr);
+			tipc_link_recv_bundle(buf);
+			continue;
+		case NAME_DISTRIBUTOR:
+			n_ptr->bclink.recv_permitted = true;
+			tipc_node_unlock(n_ptr);
+			tipc_named_recv(buf);
+			continue;
+		case BCAST_PROTOCOL:
+			tipc_link_recv_sync(n_ptr, buf);
 			tipc_node_unlock(n_ptr);
 			continue;
+		case CONN_MANAGER:
+			tipc_node_unlock(n_ptr);
+			tipc_port_recv_proto_msg(buf);
+			continue;
+		case MSG_FRAGMENTER:
+			l_ptr->stats.recv_fragments++;
+			ret = tipc_link_recv_fragment(&l_ptr->defragm_buf,
+						      &buf, &msg);
+			if (ret == 1) {
+				l_ptr->stats.recv_fragmented++;
+				goto deliver;
+			}
+			if (ret == -1)
+				l_ptr->next_in_no--;
+			break;
+		case CHANGEOVER_PROTOCOL:
+			type = msg_type(msg);
+			if (link_recv_changeover_msg(&l_ptr, &buf)) {
+				msg = buf_msg(buf);
+				seq_no = msg_seqno(msg);
+				if (type == ORIGINAL_MSG)
+					goto deliver;
+				goto protocol_check;
+			}
+			break;
+		default:
+			kfree_skb(buf);
+			buf = NULL;
+			break;
 		}
 		tipc_node_unlock(n_ptr);
-cont:
+		tipc_net_route_msg(buf);
+		continue;
+unlock_discard:
+
+		tipc_node_unlock(n_ptr);
+discard:
 		kfree_skb(buf);
 	}
 	read_unlock_bh(&tipc_net_lock);
-- 
1.7.9.5

^ permalink raw reply related

* Re: [PATCH] bnx2: Use dev_kfree_skb_any() in bnx2_tx_int()
From: Cong Wang @ 2013-10-30  3:50 UTC (permalink / raw)
  To: David Miller; +Cc: tdmackey, mchan, Linux Kernel Network Developers, LKML
In-Reply-To: <20131029.224227.1700872646112764527.davem@davemloft.net>

On Tue, Oct 29, 2013 at 7:42 PM, David Miller <davem@davemloft.net> wrote:
> From: David Mackey <tdmackey@booleanhaiku.com>
> Date: Tue, 29 Oct 2013 15:16:38 -0700
>
>> Using dev_kfree_skb_any() will resolve the below issue when a
>> netconsole message is transmitted in an irq.
>  ...
>> Signed-off-by: David Mackey <tdmackey@booleanhaiku.com>
>
> This is absolutely not the correct fix.
>
> The netpoll facility must invoke ->poll() in an environment which
> is compatible, locking and interrupt/soft-interrupt wise, as that
> in which it is normally called.
>

Normally ->poll() is called in softirq context, while netpoll could
be called in any context depending on its caller.

Also, netpoll disables IRQ for tx, which is another difference.

^ permalink raw reply

* Re: Bug in skb_segment: fskb->len != len
From: Eric Dumazet @ 2013-10-30  4:03 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Christoph Paasch, netdev
In-Reply-To: <20131030015002.GB1347@gondor.apana.org.au>

On Wed, 2013-10-30 at 09:50 +0800, Herbert Xu wrote:
> On Tue, Oct 29, 2013 at 08:08:13AM -0700, Eric Dumazet wrote:
> >
> > GRO layer was updated to be able to stack two or three sk_buff,
> > fully populated with page frags.
> > 
> > Thats quite mandatory to support line rate for 40Gb links.
> 
> Indeed I missed this.  Which commit introduced this change?

This was mentioned in the changelog :

<quote>

Christoph Paasch and Jerry Chu reported crashes in skb_segment() caused
by commit 8a29111c7ca6 ("net: gro: allow to build full sized skb")

</quote>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox