Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [net-next] dummy: expend mtu range for dummy device
From: David Miller @ 2016-12-07 18:30 UTC (permalink / raw)
  To: zhangshengju; +Cc: jarod, netdev
In-Reply-To: <1481103693-16394-1-git-send-email-zhangshengju@cmss.chinamobile.com>

From: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Date: Wed,  7 Dec 2016 17:41:33 +0800

> After commit 61e84623ace3 ("net: centralize net_device min/max MTU checking"),
> the mtu range for dummy device becomes [68, 1500].
> 
> This patch expends it to [0, 65535].
> 
> Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>

Applied with "extends" typo fixed, thanks.

^ permalink raw reply

* Misalignment, MIPS, and ip_hdr(skb)->version
From: Jason A. Donenfeld @ 2016-12-07 18:35 UTC (permalink / raw)
  To: Netdev, linux-mips; +Cc: LKML, WireGuard mailing list

Hey MIPS Networking People,

I receive encrypted packets with a 13 byte header. I decrypt the
ciphertext in place, and then discard the header. I then pass the
plaintext to the rest of the networking stack. The plaintext is an IP
packet. Due to the 13 byte header that was discarded, the plaintext
possibly begins at an unaligned location (depending on whether
dev->needed_headroom was respected).

Does this matter? Is this bad? Will there be a necessary performance hit?

In order to find out, I instrumented the MIPS unaligned access
exception handler to see where I was actually in trouble.
Surprisingly, the only part of the stack that seemed to be upset was
on calls to ip_hdr(skb)->version.

Two things disturb me about this. First, this seems too good to be
true. Does it seem reasonable to you that this is actually the only
place that would be problematic? Or was my testing methodology wrong
to arrive at such an optimistic conclusion?

Secondly, why should a call to ip_hdr(skb)->version cause an unaligned
access anyway? This struct member is simply the second half of a
single byte in a bit field. I'd expect for the compiler to generate a
single byte load, followed by a bitshift or a mask. Instead, the
compiler appears to generate a double byte load, hence the exception.
What's up with this? Stupid compiler that should be fixed? Some odd
optimization? What to do?

I'm considering just adding an extra byte of padding (see discussion
in [1]), but before I make any decision like that (and hopefully it
won't be necessary), I'd like to completely and entirely understand
the full effects and consequences of calling netif_rx(skb) when
skb->data is unaligned. Any insight you have to offer would be most
welcome.

Thanks,
Jason

[1] https://lists.zx2c4.com/pipermail/wireguard/2016-December/000709.html

^ permalink raw reply

* Re: 4.9.0-rc8: tg3 dead after resume
From: Billy Shuman @ 2016-12-07 18:39 UTC (permalink / raw)
  To: Michael Chan; +Cc: Netdev, Siva Reddy Kallam
In-Reply-To: <CACKFLinTBokAXnOri9_sxuFby0y4VxxrhEOh1aayPsyPWwDahw@mail.gmail.com>

On Wed, Dec 7, 2016 at 12:37 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> On Wed, Dec 7, 2016 at 7:20 AM, Billy Shuman <wshuman3@gmail.com> wrote:
>> After resume on 4.9.0-rc8 tg3 is dead.
>>
>> In logs I see:
>> kernel: tg3 0000:44:00.0: phy probe failed, err -19
>> kernel: tg3 0000:44:00.0: Problem fetching invariants of chip, aborting
>
> -19 is -ENODEV which means tg3 cannot read the PHY ID.
>
> If it's a true suspend/resume operation, the driver does not have to
> go through probe during resume.  Please explain how you do
> suspend/resume.

I used systemd
>
> Did this work before?  There has been very few changes to tg3 recently.
>
>>
>> rmmod and modprobe does not fix the problem only a reboot resolves the issue.
>>
>> Billy

^ permalink raw reply

* Re: Misalignment, MIPS, and ip_hdr(skb)->version
From: Dave Taht @ 2016-12-07 18:47 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: Netdev, linux-mips, LKML, WireGuard mailing list
In-Reply-To: <CAHmME9o_eCNXpVztOZKW55kpRtE+1KSEQTQOjUBVn68Y2+or2g@mail.gmail.com>

The openwrt tree has long contained a set of patches that correct for
unaligned issues throughout the linux network stack.

https://git.lede-project.org/?p=openwrt/source.git;a=blob;f=target/linux/ar71xx/patches-4.4/910-unaligned_access_hacks.patch;h=b4b749e4b9c02a74a9f712a2740d63e554de5c64;hb=ee53a240ac902dc83209008a2671e7fdcf55957a

unaligned access traps in the packet processing path on certain versions of
the mips architecture is horrifically bad. I had kind of hoped these
patches in some form would have made it upstream by now. (or the
arches that have the issue retired, I think it's mostly just mips24k)

^ permalink raw reply

* Re: Misalignment, MIPS, and ip_hdr(skb)->version
From: David Miller @ 2016-12-07 18:51 UTC (permalink / raw)
  To: dave.taht; +Cc: netdev, linux-kernel, wireguard, linux-mips
In-Reply-To: <CAA93jw7hcmkcyD=t4VRrQFfHk+n+EkSVgY6KFDq0_-DGpMADYw@mail.gmail.com>

From: Dave Taht <dave.taht@gmail.com>
Date: Wed, 7 Dec 2016 10:47:16 -0800

> https://git.lede-project.org/?p=openwrt/source.git;a=blob;f=target/linux/ar71xx/patches-4.4/910-unaligned_access_hacks.patch;h=b4b749e4b9c02a74a9f712a2740d63e554de5c64;hb=ee53a240ac902dc83209008a2671e7fdcf55957a

It's so much better to analyze properly where the misalignment comes from
and address it at the source, as we have for various cases that trip up
Sparc too.

Marking structures "packed" is going to kill performance and is not
the answer.

^ permalink raw reply

* Re: 4.9.0-rc8: tg3 dead after resume
From: Billy Shuman @ 2016-12-07 18:44 UTC (permalink / raw)
  To: Michael Chan; +Cc: Netdev, Siva Reddy Kallam
In-Reply-To: <CACKFLinTBokAXnOri9_sxuFby0y4VxxrhEOh1aayPsyPWwDahw@mail.gmail.com>

On Wed, Dec 7, 2016 at 12:37 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> On Wed, Dec 7, 2016 at 7:20 AM, Billy Shuman <wshuman3@gmail.com> wrote:
>> After resume on 4.9.0-rc8 tg3 is dead.
>>
>> In logs I see:
>> kernel: tg3 0000:44:00.0: phy probe failed, err -19
>> kernel: tg3 0000:44:00.0: Problem fetching invariants of chip, aborting
>
> -19 is -ENODEV which means tg3 cannot read the PHY ID.
>
> If it's a true suspend/resume operation, the driver does not have to
> go through probe during resume.  Please explain how you do
> suspend/resume.
>

Sorry my previous message was accidentally sent to early.

I used systemd (systemctl suspend) to suspend.

> Did this work before?  There has been very few changes to tg3 recently.
>

This is a new laptop for me, but the same behavior is seen on 4.4.36 and 4.8.12.

>>
>> rmmod and modprobe does not fix the problem only a reboot resolves the issue.
>>
>> Billy

^ permalink raw reply

* Re: Misalignment, MIPS, and ip_hdr(skb)->version
From: Jason A. Donenfeld @ 2016-12-07 18:54 UTC (permalink / raw)
  To: David Miller; +Cc: Netdev, WireGuard mailing list, LKML, linux-mips
In-Reply-To: <20161207.135127.789629809982860453.davem@davemloft.net>

On Wed, Dec 7, 2016 at 7:51 PM, David Miller <davem@davemloft.net> wrote:
> It's so much better to analyze properly where the misalignment comes from
> and address it at the source, as we have for various cases that trip up
> Sparc too.

That's sort of my attitude too, hence starting this thread. Any
pointers you have about this would be most welcome, so as not to
perpetuate what already seems like an issue in other parts of the
stack.

^ permalink raw reply

* [PATCH net-next] bpf: fix state equivalence
From: Alexei Starovoitov @ 2016-12-07 18:57 UTC (permalink / raw)
  To: David S . Miller; +Cc: Daniel Borkmann, Josef Bacik, Thomas Graf, netdev

Commmits 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
and 484611357c19 ("bpf: allow access into map value arrays") by themselves
are correct, but in combination they make state equivalence ignore 'id' field
of the register state which can lead to accepting invalid program.

Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Fixes: 484611357c19 ("bpf: allow access into map value arrays")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
---
this fix has to be done before net-next closes next week,
so sending it now. Will add the test to test_verifier later.

 include/linux/bpf_verifier.h | 14 +++++++-------
 kernel/bpf/verifier.c        |  2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 7453c1281531..a13b031dc6b8 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -18,13 +18,6 @@
 
 struct bpf_reg_state {
 	enum bpf_reg_type type;
-	/*
-	 * Used to determine if any memory access using this register will
-	 * result in a bad access.
-	 */
-	s64 min_value;
-	u64 max_value;
-	u32 id;
 	union {
 		/* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
 		s64 imm;
@@ -40,6 +33,13 @@ struct bpf_reg_state {
 		 */
 		struct bpf_map *map_ptr;
 	};
+	u32 id;
+	/* Used to determine if any memory access using this register will
+	 * result in a bad access. These two fields must be last.
+	 * See states_equal()
+	 */
+	s64 min_value;
+	u64 max_value;
 };
 
 enum bpf_stack_slot_type {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index da9fb2a9b7eb..5b14f85f45c6 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2528,7 +2528,7 @@ static bool states_equal(struct bpf_verifier_env *env,
 		 * we didn't do a variable access into a map then we are a-ok.
 		 */
 		if (!varlen_map_access &&
-		    rold->type == rcur->type && rold->imm == rcur->imm)
+		    memcmp(rold, rcur, offsetofend(struct bpf_reg_state, id)) == 0)
 			continue;
 
 		/* If we didn't map access then again we don't care about the
-- 
2.8.0

^ permalink raw reply related

* Re: [patch] ser_gigaset: return -ENOMEM on error instead of success
From: Paul Bolle @ 2016-12-07 19:06 UTC (permalink / raw)
  To: Dan Carpenter, Tilman Schmidt
  Cc: Karsten Keil, David S. Miller, gigaset307x-common, netdev,
	kernel-janitors
In-Reply-To: <20161207112203.GC5507@elgon.mountain>

On Wed, 2016-12-07 at 14:22 +0300, Dan Carpenter wrote:
> If we can't allocate the resources in gigaset_initdriver() then we
> should return -ENOMEM instead of zero.

That's entirely correct.

> Fixes: 2869b23e4b95 ("[PATCH] drivers/isdn/gigaset: new M101 driver (v2)")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> Ancient code.

For ancient hardware.

I'll be back (probably tomorrow) after a short test to see whether this really
needs to go into stable. It almost certainly should, but I'd like to first see
the mess the current code leaves behind once gigaset_initdriver() fails before
saying so.

Thanks for reporting this!

Paul Bolle

^ permalink raw reply

* [PATCH] [v4] net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause
From: Timur Tabi @ 2016-12-07 19:20 UTC (permalink / raw)
  To: David Miller, Florian Fainelli, netdev, jon.mason, nks.gnu

Instead of having individual PHY drivers set the SUPPORTED_Pause and
SUPPORTED_Asym_Pause flags, phylib itself should set those flags,
unless there is a hardware erratum or other special case.  During
autonegotiation, the PHYs will determine whether to enable pause
frame support.

Pause frames are a feature that is supported by the MAC.  It is the MAC
that generates the frames and that processes them.  The PHY can only be
configured to allow them to pass through.

This commit also effectively reverts the recently applied c7a61319
("net: phy: dp83848: Support ethernet pause frames").

So the new process is:

1) Unless the PHY driver overrides it, phylib sets the SUPPORTED_Pause
and SUPPORTED_AsymPause bits in phydev->supported.  This indicates that
the PHY supports pause frames.

2) The MAC driver checks phydev->supported before it calls phy_start().
If (SUPPORTED_Pause | SUPPORTED_AsymPause) is set, then the MAC driver
sets those bits in phydev->advertising, if it wants to enable pause
frame support.

3) When the link state changes, the MAC driver checks phydev->pause and
phydev->asym_pause,  If the bits are set, then it enables the corresponding
features in the MAC.  The algorithm is:

	if (phydev->pause)
		The MAC should be programmed to receive and honor
                pause frames it receives, i.e. enable receive flow control.

	if (phydev->pause != phydev->asym_pause)
		The MAC should be programmed to transmit pause
		frames when needed, i.e. enable transmit flow control.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
---

v4:
  added dp83848 driver
  removed extraneous line in phy_device

v3:
  set bits only if the PHY driver doesn't set them


 drivers/net/phy/bcm-cygnus.c |  3 +--
 drivers/net/phy/bcm7xxx.c    |  6 ++----
 drivers/net/phy/broadcom.c   | 42 ++++++++++++++----------------------------
 drivers/net/phy/dp83848.c    |  4 +---
 drivers/net/phy/icplus.c     |  6 ++----
 drivers/net/phy/intel-xway.c | 24 ++++++++----------------
 drivers/net/phy/micrel.c     | 30 ++++++++++++------------------
 drivers/net/phy/microchip.c  |  3 +--
 drivers/net/phy/national.c   |  2 +-
 drivers/net/phy/phy_device.c | 19 +++++++++++++++++++
 drivers/net/phy/smsc.c       | 18 ++++++------------
 11 files changed, 67 insertions(+), 90 deletions(-)

diff --git a/drivers/net/phy/bcm-cygnus.c b/drivers/net/phy/bcm-cygnus.c
index 196400c..3fe8cc5 100644
--- a/drivers/net/phy/bcm-cygnus.c
+++ b/drivers/net/phy/bcm-cygnus.c
@@ -134,8 +134,7 @@ static int bcm_cygnus_resume(struct phy_device *phydev)
 	.phy_id        = PHY_ID_BCM_CYGNUS,
 	.phy_id_mask   = 0xfffffff0,
 	.name          = "Broadcom Cygnus PHY",
-	.features      = PHY_GBIT_FEATURES |
-			SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features      = PHY_GBIT_FEATURES,
 	.config_init   = bcm_cygnus_config_init,
 	.config_aneg   = genphy_config_aneg,
 	.read_status   = genphy_read_status,
diff --git a/drivers/net/phy/bcm7xxx.c b/drivers/net/phy/bcm7xxx.c
index aae00bd..264b085 100644
--- a/drivers/net/phy/bcm7xxx.c
+++ b/drivers/net/phy/bcm7xxx.c
@@ -386,8 +386,7 @@ static int bcm7xxx_28nm_probe(struct phy_device *phydev)
 	.phy_id		= (_oui),					\
 	.phy_id_mask	= 0xfffffff0,					\
 	.name		= _name,					\
-	.features	= PHY_GBIT_FEATURES |				\
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,	\
+	.features	= PHY_GBIT_FEATURES,				\
 	.flags		= PHY_IS_INTERNAL,				\
 	.config_init	= bcm7xxx_28nm_config_init,			\
 	.config_aneg	= genphy_config_aneg,				\
@@ -406,8 +405,7 @@ static int bcm7xxx_28nm_probe(struct phy_device *phydev)
 	.phy_id         = (_oui),					\
 	.phy_id_mask    = 0xfffffff0,					\
 	.name           = _name,					\
-	.features       = PHY_BASIC_FEATURES |				\
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,	\
+	.features       = PHY_BASIC_FEATURES,				\
 	.flags          = PHY_IS_INTERNAL,				\
 	.config_init    = bcm7xxx_config_init,				\
 	.config_aneg    = genphy_config_aneg,				\
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c
index 409b365..4223e35 100644
--- a/drivers/net/phy/broadcom.c
+++ b/drivers/net/phy/broadcom.c
@@ -525,8 +525,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM5411,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM5411",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -537,8 +536,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM5421,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM5421",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -549,8 +547,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM5461,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM5461",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -561,8 +558,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM54612E,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM54612E",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= bcm54612e_config_aneg,
@@ -573,8 +569,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM54616S,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM54616S",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -585,8 +580,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM5464,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM5464",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -597,8 +591,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM5481,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM5481",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= bcm5481_config_aneg,
@@ -609,8 +602,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id         = PHY_ID_BCM54810,
 	.phy_id_mask    = 0xfffffff0,
 	.name           = "Broadcom BCM54810",
-	.features       = PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features       = PHY_GBIT_FEATURES,
 	.flags          = PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init    = bcm54xx_config_init,
 	.config_aneg    = bcm5481_config_aneg,
@@ -621,8 +613,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM5482,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM5482",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm5482_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -633,8 +624,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM50610,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM50610",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -645,8 +635,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM50610M,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM50610M",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -657,8 +646,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM57780,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM57780",
-	.features	= PHY_GBIT_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -669,8 +657,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCMAC131,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCMAC131",
-	.features	= PHY_BASIC_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= brcm_fet_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -681,8 +668,7 @@ static int brcm_fet_config_intr(struct phy_device *phydev)
 	.phy_id		= PHY_ID_BCM5241,
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Broadcom BCM5241",
-	.features	= PHY_BASIC_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= brcm_fet_config_init,
 	.config_aneg	= genphy_config_aneg,
diff --git a/drivers/net/phy/dp83848.c b/drivers/net/phy/dp83848.c
index 320d0dc..800b39f 100644
--- a/drivers/net/phy/dp83848.c
+++ b/drivers/net/phy/dp83848.c
@@ -88,9 +88,7 @@ static int dp83848_config_intr(struct phy_device *phydev)
 		.phy_id		= _id,				\
 		.phy_id_mask	= 0xfffffff0,			\
 		.name		= _name,			\
-		.features	= (PHY_BASIC_FEATURES |		\
-				   SUPPORTED_Pause |		\
-				   SUPPORTED_Asym_Pause),	\
+		.features	= PHY_BASIC_FEATURES,		\
 		.flags		= PHY_HAS_INTERRUPT,		\
 								\
 		.soft_reset	= genphy_soft_reset,		\
diff --git a/drivers/net/phy/icplus.c b/drivers/net/phy/icplus.c
index e5f251b..22b51f0 100644
--- a/drivers/net/phy/icplus.c
+++ b/drivers/net/phy/icplus.c
@@ -225,8 +225,7 @@ static int ip101a_g_ack_interrupt(struct phy_device *phydev)
 	.phy_id		= 0x02430d90,
 	.name		= "ICPlus IP1001",
 	.phy_id_mask	= 0x0ffffff0,
-	.features	= PHY_GBIT_FEATURES | SUPPORTED_Pause |
-			  SUPPORTED_Asym_Pause,
+	.features	= PHY_GBIT_FEATURES,
 	.config_init	= &ip1001_config_init,
 	.config_aneg	= &genphy_config_aneg,
 	.read_status	= &genphy_read_status,
@@ -236,8 +235,7 @@ static int ip101a_g_ack_interrupt(struct phy_device *phydev)
 	.phy_id		= 0x02430c54,
 	.name		= "ICPlus IP101A/G",
 	.phy_id_mask	= 0x0ffffff0,
-	.features	= PHY_BASIC_FEATURES | SUPPORTED_Pause |
-			  SUPPORTED_Asym_Pause,
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.ack_interrupt	= ip101a_g_ack_interrupt,
 	.config_init	= &ip101a_g_config_init,
diff --git a/drivers/net/phy/intel-xway.c b/drivers/net/phy/intel-xway.c
index c300ab5..b1fd7bb 100644
--- a/drivers/net/phy/intel-xway.c
+++ b/drivers/net/phy/intel-xway.c
@@ -239,8 +239,7 @@ static int xway_gphy_config_intr(struct phy_device *phydev)
 		.phy_id		= PHY_ID_PHY11G_1_3,
 		.phy_id_mask	= 0xffffffff,
 		.name		= "Intel XWAY PHY11G (PEF 7071/PEF 7072) v1.3",
-		.features	= (PHY_GBIT_FEATURES | SUPPORTED_Pause |
-				   SUPPORTED_Asym_Pause),
+		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= xway_gphy14_config_aneg,
@@ -254,8 +253,7 @@ static int xway_gphy_config_intr(struct phy_device *phydev)
 		.phy_id		= PHY_ID_PHY22F_1_3,
 		.phy_id_mask	= 0xffffffff,
 		.name		= "Intel XWAY PHY22F (PEF 7061) v1.3",
-		.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause |
-				   SUPPORTED_Asym_Pause),
+		.features	= PHY_BASIC_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= xway_gphy14_config_aneg,
@@ -269,8 +267,7 @@ static int xway_gphy_config_intr(struct phy_device *phydev)
 		.phy_id		= PHY_ID_PHY11G_1_4,
 		.phy_id_mask	= 0xffffffff,
 		.name		= "Intel XWAY PHY11G (PEF 7071/PEF 7072) v1.4",
-		.features	= (PHY_GBIT_FEATURES | SUPPORTED_Pause |
-				   SUPPORTED_Asym_Pause),
+		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= xway_gphy14_config_aneg,
@@ -284,8 +281,7 @@ static int xway_gphy_config_intr(struct phy_device *phydev)
 		.phy_id		= PHY_ID_PHY22F_1_4,
 		.phy_id_mask	= 0xffffffff,
 		.name		= "Intel XWAY PHY22F (PEF 7061) v1.4",
-		.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause |
-				   SUPPORTED_Asym_Pause),
+		.features	= PHY_BASIC_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= xway_gphy14_config_aneg,
@@ -299,8 +295,7 @@ static int xway_gphy_config_intr(struct phy_device *phydev)
 		.phy_id		= PHY_ID_PHY11G_1_5,
 		.phy_id_mask	= 0xffffffff,
 		.name		= "Intel XWAY PHY11G (PEF 7071/PEF 7072) v1.5 / v1.6",
-		.features	= (PHY_GBIT_FEATURES | SUPPORTED_Pause |
-				   SUPPORTED_Asym_Pause),
+		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= genphy_config_aneg,
@@ -314,8 +309,7 @@ static int xway_gphy_config_intr(struct phy_device *phydev)
 		.phy_id		= PHY_ID_PHY22F_1_5,
 		.phy_id_mask	= 0xffffffff,
 		.name		= "Intel XWAY PHY22F (PEF 7061) v1.5 / v1.6",
-		.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause |
-				   SUPPORTED_Asym_Pause),
+		.features	= PHY_BASIC_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= genphy_config_aneg,
@@ -329,8 +323,7 @@ static int xway_gphy_config_intr(struct phy_device *phydev)
 		.phy_id		= PHY_ID_PHY11G_VR9,
 		.phy_id_mask	= 0xffffffff,
 		.name		= "Intel XWAY PHY11G (xRX integrated)",
-		.features	= (PHY_GBIT_FEATURES | SUPPORTED_Pause |
-				   SUPPORTED_Asym_Pause),
+		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= genphy_config_aneg,
@@ -344,8 +337,7 @@ static int xway_gphy_config_intr(struct phy_device *phydev)
 		.phy_id		= PHY_ID_PHY22F_VR9,
 		.phy_id_mask	= 0xffffffff,
 		.name		= "Intel XWAY PHY22F (xRX integrated)",
-		.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause |
-				   SUPPORTED_Asym_Pause),
+		.features	= PHY_BASIC_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= genphy_config_aneg,
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index ea92d52..9a77289 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -790,7 +790,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KS8737,
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
 	.name		= "Micrel KS8737",
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ks8737_type,
 	.config_init	= kszphy_config_init,
@@ -807,8 +807,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ8021,
 	.phy_id_mask	= 0x00ffffff,
 	.name		= "Micrel KSZ8021 or KSZ8031",
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause |
-			   SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ksz8021_type,
 	.probe		= kszphy_probe,
@@ -826,8 +825,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ8031,
 	.phy_id_mask	= 0x00ffffff,
 	.name		= "Micrel KSZ8031",
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause |
-			   SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ksz8021_type,
 	.probe		= kszphy_probe,
@@ -845,8 +843,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ8041,
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
 	.name		= "Micrel KSZ8041",
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
-				| SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ksz8041_type,
 	.probe		= kszphy_probe,
@@ -864,8 +861,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ8041RNLI,
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
 	.name		= "Micrel KSZ8041RNLI",
-	.features	= PHY_BASIC_FEATURES |
-			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ksz8041_type,
 	.probe		= kszphy_probe,
@@ -883,8 +879,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ8051,
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
 	.name		= "Micrel KSZ8051",
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
-				| SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ksz8051_type,
 	.probe		= kszphy_probe,
@@ -902,7 +897,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ8001,
 	.name		= "Micrel KSZ8001 or KS8721",
 	.phy_id_mask	= 0x00fffffc,
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ksz8041_type,
 	.probe		= kszphy_probe,
@@ -920,7 +915,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ8081,
 	.name		= "Micrel KSZ8081 or KSZ8091",
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ksz8081_type,
 	.probe		= kszphy_probe,
@@ -938,7 +933,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ8061,
 	.name		= "Micrel KSZ8061",
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= kszphy_config_init,
 	.config_aneg	= genphy_config_aneg,
@@ -954,7 +949,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ9021,
 	.phy_id_mask	= 0x000ffffe,
 	.name		= "Micrel KSZ9021 Gigabit PHY",
-	.features	= (PHY_GBIT_FEATURES | SUPPORTED_Pause),
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ksz9021_type,
 	.config_init	= ksz9021_config_init,
@@ -973,7 +968,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ9031,
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
 	.name		= "Micrel KSZ9031 Gigabit PHY",
-	.features	= (PHY_GBIT_FEATURES | SUPPORTED_Pause),
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.driver_data	= &ksz9021_type,
 	.config_init	= ksz9031_config_init,
@@ -990,7 +985,6 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ8873MLL,
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
 	.name		= "Micrel KSZ8873MLL Switch",
-	.features	= (SUPPORTED_Pause | SUPPORTED_Asym_Pause),
 	.flags		= PHY_HAS_MAGICANEG,
 	.config_init	= kszphy_config_init,
 	.config_aneg	= ksz8873mll_config_aneg,
@@ -1004,7 +998,7 @@ static int kszphy_probe(struct phy_device *phydev)
 	.phy_id		= PHY_ID_KSZ886X,
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
 	.name		= "Micrel KSZ886X Switch",
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= kszphy_config_init,
 	.config_aneg	= genphy_config_aneg,
diff --git a/drivers/net/phy/microchip.c b/drivers/net/phy/microchip.c
index 12825a5..324fbf6 100644
--- a/drivers/net/phy/microchip.c
+++ b/drivers/net/phy/microchip.c
@@ -146,8 +146,7 @@ static int lan88xx_config_aneg(struct phy_device *phydev)
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "Microchip LAN88xx",
 
-	.features	= (PHY_GBIT_FEATURES |
-			   SUPPORTED_Pause | SUPPORTED_Asym_Pause),
+	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT | PHY_HAS_MAGICANEG,
 
 	.probe		= lan88xx_probe,
diff --git a/drivers/net/phy/national.c b/drivers/net/phy/national.c
index 2a1b490..2addf1d 100644
--- a/drivers/net/phy/national.c
+++ b/drivers/net/phy/national.c
@@ -133,7 +133,7 @@ static int ns_config_init(struct phy_device *phydev)
 	.phy_id = DP83865_PHY_ID,
 	.phy_id_mask = 0xfffffff0,
 	.name = "NatSemi DP83865",
-	.features = PHY_GBIT_FEATURES | SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.features = PHY_GBIT_FEATURES,
 	.flags = PHY_HAS_INTERRUPT,
 	.config_init = ns_config_init,
 	.config_aneg = genphy_config_aneg,
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index aeaf1bc..416640d 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1662,6 +1662,25 @@ static int phy_probe(struct device *dev)
 	 */
 	of_set_phy_eee_broken(phydev);
 
+	/* The Pause Frame bits indicate that the PHY can support passing
+	 * pause frames. During autonegotiation, the PHYs will determine if
+	 * they should allow pause frames to pass.  The MAC driver should then
+	 * use that result to determine whether to enable flow control via
+	 * pause frames.
+	 *
+	 * Normally, PHY drivers should not set the Pause bits, and instead
+	 * allow phylib to do that.  However, there may be some situations
+	 * (e.g. hardware erratum) where the driver wants to set only one
+	 * of these bits.
+	 */
+	if (phydrv->features & (SUPPORTED_Pause | SUPPORTED_Asym_Pause)) {
+		phydev->supported &= ~(SUPPORTED_Pause | SUPPORTED_Asym_Pause);
+		phydev->supported |= phydrv->features &
+				     (SUPPORTED_Pause | SUPPORTED_Asym_Pause);
+	} else {
+		phydev->supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
+	}
+
 	/* Set the state to READY by default */
 	phydev->state = PHY_READY;
 
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
index b62c4aa..fb32eaf 100644
--- a/drivers/net/phy/smsc.c
+++ b/drivers/net/phy/smsc.c
@@ -168,8 +168,7 @@ static int smsc_phy_probe(struct phy_device *phydev)
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "SMSC LAN83C185",
 
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
-				| SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT | PHY_HAS_MAGICANEG,
 
 	.probe		= smsc_phy_probe,
@@ -191,8 +190,7 @@ static int smsc_phy_probe(struct phy_device *phydev)
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "SMSC LAN8187",
 
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
-				| SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT | PHY_HAS_MAGICANEG,
 
 	.probe		= smsc_phy_probe,
@@ -214,8 +212,7 @@ static int smsc_phy_probe(struct phy_device *phydev)
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "SMSC LAN8700",
 
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
-				| SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT | PHY_HAS_MAGICANEG,
 
 	.probe		= smsc_phy_probe,
@@ -237,8 +234,7 @@ static int smsc_phy_probe(struct phy_device *phydev)
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "SMSC LAN911x Internal PHY",
 
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
-				| SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT | PHY_HAS_MAGICANEG,
 
 	.probe		= smsc_phy_probe,
@@ -259,8 +255,7 @@ static int smsc_phy_probe(struct phy_device *phydev)
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "SMSC LAN8710/LAN8720",
 
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
-				| SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT | PHY_HAS_MAGICANEG,
 
 	.probe		= smsc_phy_probe,
@@ -282,8 +277,7 @@ static int smsc_phy_probe(struct phy_device *phydev)
 	.phy_id_mask	= 0xfffffff0,
 	.name		= "SMSC LAN8740",
 
-	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
-				| SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT | PHY_HAS_MAGICANEG,
 
 	.probe		= smsc_phy_probe,
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related

* Re: [PATCH 1/1] ixgbe: write flush vfta registers
From: Lino Sanfilippo @ 2016-12-07 19:24 UTC (permalink / raw)
  To: zhuyj; +Cc: e1000-devel@lists.sourceforge.net, netdev, intel-wired-lan
In-Reply-To: <CAD=hENcE58xM4hBm-8N-eNREhBbP4gC-Vm89krQ6aj_8bDZmJg@mail.gmail.com>

Hi Zhu,

On 07.12.2016 04:05, zhuyj wrote:
> After several week tests, your advice still make this bug appear. But
> my patch make this bug disappear.
> 
> Zhu Yanjun
> 

In your commit message you wrote
"Sometimes vfta registers can not be written successfully in dcb mode." 
Do you mean that the writes to the registers dont seem to have any effect at all? Or is the
effect only delayed? What exactly is your test case?

(BTW: please dont top-post).

Regards,
Lino

------------------------------------------------------------------------------
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today.http://sdm.link/xeonphi
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply

* Re: [PATCH net-next] udp: under rx pressure, try to condense skbs
From: Eric Dumazet @ 2016-12-07 19:31 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Paolo Abeni
In-Reply-To: <1481131173.4930.36.camel@edumazet-glaptop3.roam.corp.google.com>

On Wed, 2016-12-07 at 09:19 -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>

> This patch gave me about 5 % increase in throughput in my tests.
> 

BTW, this reduces the gap with GRO being on or off for me :

mlx4 drivers has a copybreak feature when GRO is off :

Packets below 256 bytes are simply copied into a linear skb
 ( done in mlx4_en_rx_skb() )

^ permalink raw reply

* Western Union Spende
From: Western Union @ 2016-12-08  3:41 UTC (permalink / raw)
  To: Recipients

Sir/Madam,
Hereby we inform you that WESTERN UNION has international donations project you expressed as a cash grant of awarded [85,000.00 EURO], promoted as a charity donation by Western union international, United Kingdom, in conjunction with the children's Fund of the United Nations [UNICEF].
To request more information about the processing and payment of your grants claims please contact Mr. Mike Morris, the National Secretary of the Foundation with your qualifying number [WNI/101/231/BDB] as soon as possible.
Western Union District Manager (Mr Mike MORIS)
Website: www.westernunion.com
Email:  Wu.transfer101@gmail.com

^ permalink raw reply

* Re: [RFC PATCH net-next v3 1/2] macb: Add 1588 support in Cadence GEM.
From: Richard Cochran @ 2016-12-07 19:39 UTC (permalink / raw)
  To: Andrei Pistirica
  Cc: netdev, linux-kernel, linux-arm-kernel, davem, nicolas.ferre,
	harinikatakamlinux, harini.katakam, punnaia, michals, anirudh,
	boris.brezillon, alexandre.belloni, tbultel, rafalo
In-Reply-To: <1481134912-2243-1-git-send-email-andrei.pistirica@microchip.com>

On Wed, Dec 07, 2016 at 08:21:51PM +0200, Andrei Pistirica wrote:
> +#ifdef CONFIG_MACB_USE_HWSTAMP
> +void gem_ptp_init(struct net_device *ndev);
> +void gem_ptp_remove(struct net_device *ndev);
> +
> +void gem_ptp_do_txstamp(struct macb *bp, struct sk_buff *skb);
> +void gem_ptp_do_rxstamp(struct macb *bp, struct sk_buff *skb);

These are in the hot path, and so you should do the test before
calling the global function, something like this:

void gem_ptp_txstamp(struct macb *bp, struct sk_buff *skb);

static void gem_ptp_do_txstamp(struct macb *bp, struct sk_buff *skb)
{
	if (!bp->hwts_tx_en)
		return;
	gem_ptp_txstamp(bp, skb);
}

Ditto for Rx.

> +#else
> +static inline void gem_ptp_init(struct net_device *ndev) { }
> +static inline void gem_ptp_remove(struct net_device *ndev) { }
> +
> +static inline void gem_ptp_do_txstamp(struct macb *bp, struct sk_buff *skb) { }
> +static inline void gem_ptp_do_rxstamp(struct macb *bp, struct sk_buff *skb) { }
> +#endif
> +

> +static int gem_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
> +{
> +	struct macb *bp = container_of(ptp, struct macb, ptp_caps);
> +	u32 word, diff;
> +	u64 adj, rate;
> +	int neg_adj = 0;
> +
> +	if (scaled_ppm < 0) {
> +		neg_adj = 1;
> +		scaled_ppm = -scaled_ppm;
> +	}
> +	rate = scaled_ppm;
> +
> +	/* word: unused(8bit) | ns(8bit) | fractions(16bit) */
> +	word = (bp->ns_incr << 16) + bp->subns_incr;
> +
> +	adj = word;
> +	adj *= rate;
> +	adj >>= 16; /* remove fractions */
> +	adj += 500000UL;
> +	diff = div_u64(adj, 1000000UL);

In order to round correctly, shouldn't this be?

	adj *= rate;
	adj += 500000UL << 16;
	adj >>= 16;
	diff = div_u64(adj, 1000000UL);

> +	word = neg_adj ? word - diff : word + diff;
> +
> +	spin_lock(&bp->tsu_clk_lock);
> +
> +	gem_writel(bp, TISUBN, GEM_BF(SUBNSINCR, (word & 0xffff)));
> +	gem_writel(bp, TI, GEM_BF(NSINCR, (word >> 16)));
> +
> +	spin_unlock(&bp->tsu_clk_lock);
> +	return 0;
> +}

> +static s32 gem_ptp_max_adj(unsigned int f_nom)
> +{
> +	u64 adj;
> +
> +	/* The 48 bits of seconds for the GEM overflows every:
> +	 * 2^48/(365.25 * 24 * 60 *60) =~ 8 925 512 years (~= 9 mil years),
> +	 * thus the maximum adjust frequency must not overflow CNS register:
> +	 *
> +	 * addend  = 10^9/nominal_freq
> +	 * adj_max = +/- addend*ppb_max/10^9
> +	 * max_ppb = (2^8-1)*nominal_freq-10^9
> +	 */
> +	adj = f_nom;
> +	adj *= 0xffff;
> +	adj -= 1000000000ULL;

What is this computation, and how does it relate to the comment?

> +	return adj;
> +}

> +/* While GEM can timestamp PTP packets, it does not mark the RX descriptor

Does it timestamp PTP event packets only, or all packets?

(See my comment in patch 2/2)

> + * to identify them. UDP packets must be parsed to identify PTP packets.
> + *
> + * Note: Inspired from drivers/net/ethernet/ti/cpts.c
> + */

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH 5/7] Documentation: DT: net: cpsw: allow to specify descriptors pool size
From: Grygorii Strashko @ 2016-12-07 19:41 UTC (permalink / raw)
  To: Ivan Khoronzhuk
  Cc: David S. Miller, netdev, Mugunthan V N, Sekhar Nori, linux-kernel,
	linux-omap
In-Reply-To: <23737800-cca5-fbd4-e21c-5430ecf7ab34@ti.com>



On 12/02/2016 11:21 AM, Grygorii Strashko wrote:
> 
> 
> On 12/02/2016 05:28 AM, Ivan Khoronzhuk wrote:
>> On Thu, Dec 01, 2016 at 05:34:30PM -0600, Grygorii Strashko wrote:
>>> Add optional property "descs_pool_size" to specify buffer descriptor's
>>> pool size. The "descs_pool_size" should define total number of CPDMA
>>> CPPI descriptors to be used for both ingress/egress packets
>>> processing. If not specified - the default value 256 will be used
>>> which will allow to place descriptor's pool into the internal CPPI
>>> RAM on most of TI SoC.
>>>
>>> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
>>> ---
>>>  Documentation/devicetree/bindings/net/cpsw.txt | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/Documentation/devicetree/bindings/net/cpsw.txt b/Documentation/devicetree/bindings/net/cpsw.txt
>>> index 5ad439f..b99d196 100644
>>> --- a/Documentation/devicetree/bindings/net/cpsw.txt
>>> +++ b/Documentation/devicetree/bindings/net/cpsw.txt
>>> @@ -35,6 +35,11 @@ Optional properties:
>>>  			  For example in dra72x-evm, pcf gpio has to be
>>>  			  driven low so that cpsw slave 0 and phy data
>>>  			  lines are connected via mux.
>>> +- descs_pool_size	: total number of CPDMA CPPI descriptors to be used for
>>> +			  both ingress/egress packets processing. if not
>>> +			  specified the default value 256 will be used which
>>> +			  will allow to place descriptors pool into the
>>> +			  internal CPPI RAM.
>> Does it describe h/w? Why now module parameter? or even smth like ethtool num
>> ring entries?
>>
> 
> It can be module parameter too. for the use cases i'm aware of -
> this is one-time boot setting only.  
> 
> ----- OR
> So, do you propose to use 
>        ethtool -g ethX
> 
>        ethtool -G ethX [rx N] [tx N]
> ?
> 
> Now cpdma has one pool for all RX/TX channels, so changing this settings
> by ethtool will require: pause interfaces, reallocate cpdma pool, 
> re-arrange buffers between channels, resume interface. Correct?
> 
> How do you think - we can move forward with one pool or better to have two (Rx and Tx)?
> 
> Wouldn't it be reasonable to still have DT (or module) parameter to avoid 
> cpdma reconfiguration on system startup (pause/resume interfaces) (faster boot)?
> 
> How about cpdma re-allocation policy (with expectation that is shouldn't happen too often)?
> - increasing of Rx, Tx will grow total number of physically allocated buffers (total_desc_num)
> - decreasing of Rx, Tx will just change number of available buffers (no memory re-allocation)
> 
> ----- OR ----
> Can we move forward with current patch (total number of CPDMA CPPI descriptors defined in DT) 
> and add ethtool -G ethX [rx N] [tx N] which will allow to re-split descs between RX and TX?
> 
> 

if no comments here, I'll rework patches to use module parameter for descs_pool_size and
will add possibility to re-split RX/TX buffers using  ethtool -G ethX [rx N] [tx N]

-- 
regards,
-grygorii

^ permalink raw reply

* Re: [RFC PATCH net-next v3 2/2] macb: Enable 1588 support in SAMA5Dx platforms.
From: Richard Cochran @ 2016-12-07 19:43 UTC (permalink / raw)
  To: Andrei Pistirica
  Cc: tbultel, boris.brezillon, rafalo, netdev, alexandre.belloni,
	nicolas.ferre, linux-kernel, harinikatakamlinux, michals, anirudh,
	punnaia, harini.katakam, davem, linux-arm-kernel
In-Reply-To: <1481134912-2243-2-git-send-email-andrei.pistirica@microchip.com>

On Wed, Dec 07, 2016 at 08:21:52PM +0200, Andrei Pistirica wrote:
> +static int gem_hwtst_set(struct net_device *netdev,
> +			 struct ifreq *ifr, int cmd)
> +{

...

> +	switch (config.rx_filter) {
> +	case HWTSTAMP_FILTER_NONE:
> +		if (priv->hwts_rx_en)
> +			priv->hwts_rx_en = 0;
> +		break;
> +	case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
> +	case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
> +	case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
> +	case HWTSTAMP_FILTER_ALL:
> +	case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
> +	case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
> +	case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
> +	case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
> +	case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
> +	case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
> +	case HWTSTAMP_FILTER_PTP_V2_EVENT:
> +	case HWTSTAMP_FILTER_PTP_V2_SYNC:
> +	case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
> +		config.rx_filter = HWTSTAMP_FILTER_ALL;

Does the device really time stamp all packets?

Or did you mean "all PTP packets?  For that, use
HWTSTAMP_FILTER_PTP_V2_EVENT.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH net-next] bpf: fix state equivalence
From: Thomas Graf @ 2016-12-07 19:47 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: David S . Miller, Daniel Borkmann, Josef Bacik, netdev
In-Reply-To: <1481137079-2205635-1-git-send-email-ast@fb.com>

On 12/07/16 at 10:57am, Alexei Starovoitov wrote:
> Commmits 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
> and 484611357c19 ("bpf: allow access into map value arrays") by themselves
> are correct, but in combination they make state equivalence ignore 'id' field
> of the register state which can lead to accepting invalid program.
> 
> Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
> Fixes: 484611357c19 ("bpf: allow access into map value arrays")
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Thomas Graf <tgraf@suug.ch>

^ permalink raw reply

* Re: Misalignment, MIPS, and ip_hdr(skb)->version
From: David Miller @ 2016-12-07 19:52 UTC (permalink / raw)
  To: Jason; +Cc: netdev, wireguard, linux-kernel, linux-mips
In-Reply-To: <CAHmME9oLgjDA2F0gkFzHU2Es8-XCxQHRABS18OKF0EnZgt1=LQ@mail.gmail.com>

From: "Jason A. Donenfeld" <Jason@zx2c4.com>
Date: Wed, 7 Dec 2016 19:54:12 +0100

> On Wed, Dec 7, 2016 at 7:51 PM, David Miller <davem@davemloft.net> wrote:
>> It's so much better to analyze properly where the misalignment comes from
>> and address it at the source, as we have for various cases that trip up
>> Sparc too.
> 
> That's sort of my attitude too, hence starting this thread. Any
> pointers you have about this would be most welcome, so as not to
> perpetuate what already seems like an issue in other parts of the
> stack.

The only truly difficult case to handle is GRE encapsulation.  Is
that the situation you are running into?

If not, please figure out what the header configuration looks like
in the case that hits for you, and what the originating device is
just in case it is a device driver issue.

Thanks.

^ permalink raw reply

* [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
From: Lino Sanfilippo @ 2016-12-07 20:05 UTC (permalink / raw)
  To: bh74.an, ks.giri, vipul.pandya, peppe.cavallaro, alexandre.torgue
  Cc: pavel, davem, linux-kernel, netdev, Lino Sanfilippo
In-Reply-To: <1481141138-19466-1-git-send-email-LinoSanfilippo@gmx.de>

The driver uses a private lock for synchronization between the xmit
function and the xmit completion handler, but since the NETIF_F_LLTX flag
is not set, the xmit function is also called with the xmit_lock held.

On the other hand the xmit completion handler first takes the private lock
and (in case that the tx queue has been stopped) the xmit_lock, leading
to a reverse locking order and the potential danger of a deadlock.

Fix this by removing the private lock completely and synchronizing the xmit
function and completion handler solely by means of the xmit_lock. By doing
this also remove the now unnecessary double check for a stopped tx queue.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
---
 drivers/net/ethernet/samsung/sxgbe/sxgbe_common.h |  1 -
 drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c   | 27 +++++------------------
 2 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/samsung/sxgbe/sxgbe_common.h b/drivers/net/ethernet/samsung/sxgbe/sxgbe_common.h
index 5cb51b6..c61f260 100644
--- a/drivers/net/ethernet/samsung/sxgbe/sxgbe_common.h
+++ b/drivers/net/ethernet/samsung/sxgbe/sxgbe_common.h
@@ -384,7 +384,6 @@ struct sxgbe_tx_queue {
 	dma_addr_t *tx_skbuff_dma;
 	struct sk_buff **tx_skbuff;
 	struct timer_list txtimer;
-	spinlock_t tx_lock;	/* lock for tx queues */
 	unsigned int cur_tx;
 	unsigned int dirty_tx;
 	u32 tx_count_frames;
diff --git a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
index ea44a24..22d3b0b 100644
--- a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
+++ b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
@@ -426,9 +426,6 @@ static int init_tx_ring(struct device *dev, u8 queue_no,
 	tx_ring->dirty_tx = 0;
 	tx_ring->cur_tx = 0;
 
-	/* initialise TX queue lock */
-	spin_lock_init(&tx_ring->tx_lock);
-
 	return 0;
 
 dmamem_err:
@@ -743,7 +740,7 @@ static void sxgbe_tx_queue_clean(struct sxgbe_tx_queue *tqueue)
 
 	dev_txq = netdev_get_tx_queue(priv->dev, queue_no);
 
-	spin_lock(&tqueue->tx_lock);
+	__netif_tx_lock(dev_txq, smp_processor_id());
 
 	priv->xstats.tx_clean++;
 	while (tqueue->dirty_tx != tqueue->cur_tx) {
@@ -781,18 +778,13 @@ static void sxgbe_tx_queue_clean(struct sxgbe_tx_queue *tqueue)
 
 	/* wake up queue */
 	if (unlikely(netif_tx_queue_stopped(dev_txq) &&
-		     sxgbe_tx_avail(tqueue, tx_rsize) > SXGBE_TX_THRESH(priv))) {
-		netif_tx_lock(priv->dev);
-		if (netif_tx_queue_stopped(dev_txq) &&
-		    sxgbe_tx_avail(tqueue, tx_rsize) > SXGBE_TX_THRESH(priv)) {
-			if (netif_msg_tx_done(priv))
-				pr_debug("%s: restart transmit\n", __func__);
-			netif_tx_wake_queue(dev_txq);
-		}
-		netif_tx_unlock(priv->dev);
+	    sxgbe_tx_avail(tqueue, tx_rsize) > SXGBE_TX_THRESH(priv))) {
+		if (netif_msg_tx_done(priv))
+			pr_debug("%s: restart transmit\n", __func__);
+		netif_tx_wake_queue(dev_txq);
 	}
 
-	spin_unlock(&tqueue->tx_lock);
+	__netif_tx_unlock(dev_txq);
 }
 
 /**
@@ -1304,9 +1296,6 @@ static netdev_tx_t sxgbe_xmit(struct sk_buff *skb, struct net_device *dev)
 		      tqueue->hwts_tx_en)))
 		ctxt_desc_req = 1;
 
-	/* get the spinlock */
-	spin_lock(&tqueue->tx_lock);
-
 	if (priv->tx_path_in_lpi_mode)
 		sxgbe_disable_eee_mode(priv);
 
@@ -1316,8 +1305,6 @@ static netdev_tx_t sxgbe_xmit(struct sk_buff *skb, struct net_device *dev)
 			netdev_err(dev, "%s: Tx Ring is full when %d queue is awake\n",
 				   __func__, txq_index);
 		}
-		/* release the spin lock in case of BUSY */
-		spin_unlock(&tqueue->tx_lock);
 		return NETDEV_TX_BUSY;
 	}
 
@@ -1436,8 +1423,6 @@ static netdev_tx_t sxgbe_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	priv->hw->dma->enable_dma_transmission(priv->ioaddr, txq_index);
 
-	spin_unlock(&tqueue->tx_lock);
-
 	return NETDEV_TX_OK;
 }
 
-- 
1.9.1

^ permalink raw reply related

* [PATCH 2/2] net: ethernet: stmmac: remove private tx queue lock
From: Lino Sanfilippo @ 2016-12-07 20:05 UTC (permalink / raw)
  To: bh74.an, ks.giri, vipul.pandya, peppe.cavallaro, alexandre.torgue
  Cc: pavel, davem, linux-kernel, netdev, Lino Sanfilippo
In-Reply-To: <1481141138-19466-1-git-send-email-LinoSanfilippo@gmx.de>

The driver uses a private lock for synchronization between the xmit
function and the xmit completion handler, but since the NETIF_F_LLTX flag
is not set, the xmit function is also called with the xmit_lock held.

On the other hand the xmit completion handler first takes the private lock
and (in case that the tx queue has been stopped) the xmit_lock, leading to
a reverse locking order and the potential danger of a deadlock.

Fix this by removing the private lock completely and synchronizing the xmit
function and completion handler solely by means of the xmit_lock. By doing
this remove also the now unnecessary double check for a stopped tx queue.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac.h      |  1 -
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 28 +++++------------------
 2 files changed, 6 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 4d2a759..7e69b11 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -64,7 +64,6 @@ struct stmmac_priv {
 	dma_addr_t dma_tx_phy;
 	int tx_coalesce;
 	int hwts_tx_en;
-	spinlock_t tx_lock;
 	bool tx_path_in_lpi_mode;
 	struct timer_list txtimer;
 	bool tso;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index caf069a..db46ec4 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1307,7 +1307,7 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 	unsigned int bytes_compl = 0, pkts_compl = 0;
 	unsigned int entry = priv->dirty_tx;
 
-	spin_lock(&priv->tx_lock);
+	netif_tx_lock(priv->dev);
 
 	priv->xstats.tx_clean++;
 
@@ -1378,22 +1378,17 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 	netdev_completed_queue(priv->dev, pkts_compl, bytes_compl);
 
 	if (unlikely(netif_queue_stopped(priv->dev) &&
-		     stmmac_tx_avail(priv) > STMMAC_TX_THRESH)) {
-		netif_tx_lock(priv->dev);
-		if (netif_queue_stopped(priv->dev) &&
-		    stmmac_tx_avail(priv) > STMMAC_TX_THRESH) {
-			if (netif_msg_tx_done(priv))
-				pr_debug("%s: restart transmit\n", __func__);
-			netif_wake_queue(priv->dev);
-		}
-		netif_tx_unlock(priv->dev);
+	    stmmac_tx_avail(priv) > STMMAC_TX_THRESH)) {
+		if (netif_msg_tx_done(priv))
+			pr_debug("%s: restart transmit\n", __func__);
+		netif_wake_queue(priv->dev);
 	}
 
 	if ((priv->eee_enabled) && (!priv->tx_path_in_lpi_mode)) {
 		stmmac_enable_eee_mode(priv);
 		mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(eee_timer));
 	}
-	spin_unlock(&priv->tx_lock);
+	netif_tx_unlock(priv->dev);
 }
 
 static inline void stmmac_enable_dma_irq(struct stmmac_priv *priv)
@@ -1998,8 +1993,6 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
 	u8 proto_hdr_len;
 	int i;
 
-	spin_lock(&priv->tx_lock);
-
 	/* Compute header lengths */
 	proto_hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
 
@@ -2011,7 +2004,6 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
 			/* This is a hard error, log it. */
 			pr_err("%s: Tx Ring full when queue awake\n", __func__);
 		}
-		spin_unlock(&priv->tx_lock);
 		return NETDEV_TX_BUSY;
 	}
 
@@ -2146,11 +2138,9 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
 	priv->hw->dma->set_tx_tail_ptr(priv->ioaddr, priv->tx_tail_addr,
 				       STMMAC_CHAN0);
 
-	spin_unlock(&priv->tx_lock);
 	return NETDEV_TX_OK;
 
 dma_map_err:
-	spin_unlock(&priv->tx_lock);
 	dev_err(priv->device, "Tx dma map failed\n");
 	dev_kfree_skb(skb);
 	priv->dev->stats.tx_dropped++;
@@ -2182,10 +2172,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 			return stmmac_tso_xmit(skb, dev);
 	}
 
-	spin_lock(&priv->tx_lock);
-
 	if (unlikely(stmmac_tx_avail(priv) < nfrags + 1)) {
-		spin_unlock(&priv->tx_lock);
 		if (!netif_queue_stopped(dev)) {
 			netif_stop_queue(dev);
 			/* This is a hard error, log it. */
@@ -2357,11 +2344,9 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		priv->hw->dma->set_tx_tail_ptr(priv->ioaddr, priv->tx_tail_addr,
 					       STMMAC_CHAN0);
 
-	spin_unlock(&priv->tx_lock);
 	return NETDEV_TX_OK;
 
 dma_map_err:
-	spin_unlock(&priv->tx_lock);
 	dev_err(priv->device, "Tx dma map failed\n");
 	dev_kfree_skb(skb);
 	priv->dev->stats.tx_dropped++;
@@ -3347,7 +3332,6 @@ int stmmac_dvr_probe(struct device *device,
 	netif_napi_add(ndev, &priv->napi, stmmac_poll, 64);
 
 	spin_lock_init(&priv->lock);
-	spin_lock_init(&priv->tx_lock);
 
 	ret = register_netdev(ndev);
 	if (ret) {
-- 
1.9.1

^ permalink raw reply related

* Remove private locks to avoid possible deadlock
From: Lino Sanfilippo @ 2016-12-07 20:05 UTC (permalink / raw)
  To: bh74.an, ks.giri, vipul.pandya, peppe.cavallaro, alexandre.torgue
  Cc: pavel, davem, linux-kernel, netdev

Hi,

these patches fix possible deadlock situations in the sxgbe and stmmac
driver. Please note that the patches are only compile tested so it would
be great if someone could do tests with the concerning HW.

Regards,
Lino

^ permalink raw reply

* [net-next PATCH v5 0/6] XDP for virtio_net
From: John Fastabend @ 2016-12-07 20:10 UTC (permalink / raw)
  To: daniel, mst, shm, davem, tgraf, alexei.starovoitov
  Cc: john.r.fastabend, netdev, john.fastabend, brouer

This implements virtio_net for the mergeable buffers and big_packet
modes. I tested this with vhost_net running on qemu and did not see
any issues. For testing num_buf > 1 I added a hack to vhost driver
to only but 100 bytes per buffer.

There are some restrictions for XDP to be enabled and work well
(see patch 3) for more details.

  1. LRO must be off
  2. MTU must be less than PAGE_SIZE
  3. queues must be available to dedicate to XDP
  4. num_bufs received in mergeable buffers must be 1
  5. big_packet mode must have all data on single page

To test this I used pktgen in the hypervisor and ran the XDP sample
programs xdp1 and xdp2 from ./samples/bpf in the host. The default
mode that is used with these patches with Linux guest and QEMU/Linux
hypervisor is the mergeable buffers mode. I tested this mode for 2+
days running xdp2 without issues. Additionally I did a series of
driver unload/load tests to check the allocate/release paths.

To test the big_packets path I applied the following simple patch against
the virtio driver forcing big_packets mode,

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2242,7 +2242,7 @@ static int virtnet_probe(struct virtio_device *vdev)
                vi->big_packets = true;

        if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
-               vi->mergeable_rx_bufs = true;
+               vi->mergeable_rx_bufs = false;

        if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
            virtio_has_feature(vdev, VIRTIO_F_VERSION_1))

I then repeated the tests with xdp1 and xdp2. After letting them run
for a few hours I called it good enough.

Testing the unexpected case where virtio receives a packet across
multiple buffers required patching the hypervisor vhost driver to
convince it to send these unexpected packets. Then I used ping with
the -s option to trigger the case with multiple buffers. This mode
is not expected to be used but as MST pointed out per spec it is
not strictly speaking illegal to generate multi-buffer packets so we
need someway to handle these. The following patch can be used to
generate multiple buffers,

--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1777,7 +1777,8 @@ static int translate_desc(struct vhost_virtqueue
*vq, u64

                _iov = iov + ret;
                size = node->size - addr + node->start;
-               _iov->iov_len = min((u64)len - s, size);
+               printk("%s: build 100 length headers!\n", __func__);
+               _iov->iov_len = min((u64)len - s, (u64)100);//size);
                _iov->iov_base = (void __user *)(unsigned long)
                        (node->userspace_addr + addr - node->start);
                s += size;

Please review any comments/feedback welcome as always.

Thanks,
John

---

John Fastabend (6):
      net: virtio dynamically disable/enable LRO
      net: xdp: add invalid buffer warning
      virtio_net: Add XDP support
      virtio_net: add dedicated XDP transmit queues
      virtio_net: add XDP_TX support
      virtio_net: xdp, add slowpath case for non contiguous buffers

 drivers/net/virtio_net.c |  403 +++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/filter.h   |    1 
 net/core/filter.c        |    6 +
 3 files changed, 402 insertions(+), 8 deletions(-)

^ permalink raw reply

* [net-next PATCH v5 1/6] net: virtio dynamically disable/enable LRO
From: John Fastabend @ 2016-12-07 20:11 UTC (permalink / raw)
  To: daniel, mst, shm, davem, tgraf, alexei.starovoitov
  Cc: john.r.fastabend, netdev, john.fastabend, brouer
In-Reply-To: <20161207200139.28121.4811.stgit@john-Precision-Tower-5810>

This adds support for dynamically setting the LRO feature flag. The
message to control guest features in the backend uses the
CTRL_GUEST_OFFLOADS msg type.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/virtio_net.c |   40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index a21d93a..a5c47b1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1419,6 +1419,36 @@ static void virtnet_init_settings(struct net_device *dev)
 	.set_settings = virtnet_set_settings,
 };
 
+static int virtnet_set_features(struct net_device *netdev,
+				netdev_features_t features)
+{
+	struct virtnet_info *vi = netdev_priv(netdev);
+	struct virtio_device *vdev = vi->vdev;
+	struct scatterlist sg;
+	u64 offloads = 0;
+
+	if (features & NETIF_F_LRO)
+		offloads |= (1 << VIRTIO_NET_F_GUEST_TSO4) |
+			    (1 << VIRTIO_NET_F_GUEST_TSO6);
+
+	if (features & NETIF_F_RXCSUM)
+		offloads |= (1 << VIRTIO_NET_F_GUEST_CSUM);
+
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS)) {
+		sg_init_one(&sg, &offloads, sizeof(uint64_t));
+		if (!virtnet_send_command(vi,
+					  VIRTIO_NET_CTRL_GUEST_OFFLOADS,
+					  VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET,
+					  &sg)) {
+			dev_warn(&netdev->dev,
+				 "Failed to set guest offloads by virtnet command.\n");
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
 static const struct net_device_ops virtnet_netdev = {
 	.ndo_open            = virtnet_open,
 	.ndo_stop   	     = virtnet_close,
@@ -1435,6 +1465,7 @@ static void virtnet_init_settings(struct net_device *dev)
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	.ndo_busy_poll		= virtnet_busy_poll,
 #endif
+	.ndo_set_features	= virtnet_set_features,
 };
 
 static void virtnet_config_changed_work(struct work_struct *work)
@@ -1815,6 +1846,12 @@ static int virtnet_probe(struct virtio_device *vdev)
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
 		dev->features |= NETIF_F_RXCSUM;
 
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) &&
+	    virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6)) {
+		dev->features |= NETIF_F_LRO;
+		dev->hw_features |= NETIF_F_LRO;
+	}
+
 	dev->vlan_features = dev->features;
 
 	/* MTU range: 68 - 65535 */
@@ -2057,7 +2094,8 @@ static int virtnet_restore(struct virtio_device *vdev)
 	VIRTIO_NET_F_CTRL_RX, VIRTIO_NET_F_CTRL_VLAN, \
 	VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ, \
 	VIRTIO_NET_F_CTRL_MAC_ADDR, \
-	VIRTIO_NET_F_MTU
+	VIRTIO_NET_F_MTU, \
+	VIRTIO_NET_F_CTRL_GUEST_OFFLOADS
 
 static unsigned int features[] = {
 	VIRTNET_FEATURES,

^ permalink raw reply related

* [net-next PATCH v5 2/6] net: xdp: add invalid buffer warning
From: John Fastabend @ 2016-12-07 20:11 UTC (permalink / raw)
  To: daniel, mst, shm, davem, tgraf, alexei.starovoitov
  Cc: john.r.fastabend, netdev, john.fastabend, brouer
In-Reply-To: <20161207200139.28121.4811.stgit@john-Precision-Tower-5810>

This adds a warning for drivers to use when encountering an invalid
buffer for XDP. For normal cases this should not happen but to catch
this in virtual/qemu setups that I may not have expected from the
emulation layer having a standard warning is useful.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 include/linux/filter.h |    1 +
 net/core/filter.c      |    6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index f078d2b..860f953 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -600,6 +600,7 @@ int sk_get_filter(struct sock *sk, struct sock_filter __user *filter,
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
 				       const struct bpf_insn *patch, u32 len);
 void bpf_warn_invalid_xdp_action(u32 act);
+void bpf_warn_invalid_xdp_buffer(void);
 
 #ifdef CONFIG_BPF_JIT
 extern int bpf_jit_enable;
diff --git a/net/core/filter.c b/net/core/filter.c
index b751202..849f23d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2948,6 +2948,12 @@ void bpf_warn_invalid_xdp_action(u32 act)
 }
 EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action);
 
+void bpf_warn_invalid_xdp_buffer(void)
+{
+	WARN_ONCE(1, "Illegal XDP buffer encountered, expect throughput degradation\n");
+}
+EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_buffer);
+
 static u32 sk_filter_convert_ctx_access(enum bpf_access_type type, int dst_reg,
 					int src_reg, int ctx_off,
 					struct bpf_insn *insn_buf,

^ permalink raw reply related

* [net-next PATCH v5 5/6] virtio_net: add XDP_TX support
From: John Fastabend @ 2016-12-07 20:12 UTC (permalink / raw)
  To: daniel, mst, shm, davem, tgraf, alexei.starovoitov
  Cc: john.r.fastabend, netdev, john.fastabend, brouer
In-Reply-To: <20161207200139.28121.4811.stgit@john-Precision-Tower-5810>

This adds support for the XDP_TX action to virtio_net. When an XDP
program is run and returns the XDP_TX action the virtio_net XDP
implementation will transmit the packet on a TX queue that aligns
with the current CPU that the XDP packet was processed on.

Before sending the packet the header is zeroed.  Also XDP is expected
to handle checksum correctly so no checksum offload  support is
provided.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/virtio_net.c |   99 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 92 insertions(+), 7 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 28b1196..8e5b13c 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -330,12 +330,57 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 	return skb;
 }
 
+static void virtnet_xdp_xmit(struct virtnet_info *vi,
+			     struct receive_queue *rq,
+			     struct send_queue *sq,
+			     struct xdp_buff *xdp)
+{
+	struct page *page = virt_to_head_page(xdp->data);
+	struct virtio_net_hdr_mrg_rxbuf *hdr;
+	unsigned int num_sg, len;
+	void *xdp_sent;
+	int err;
+
+	/* Free up any pending old buffers before queueing new ones. */
+	while ((xdp_sent = virtqueue_get_buf(sq->vq, &len)) != NULL) {
+		struct page *sent_page = virt_to_head_page(xdp_sent);
+
+		if (vi->mergeable_rx_bufs)
+			put_page(sent_page);
+		else
+			give_pages(rq, sent_page);
+	}
+
+	/* Zero header and leave csum up to XDP layers */
+	hdr = xdp->data;
+	memset(hdr, 0, vi->hdr_len);
+
+	num_sg = 1;
+	sg_init_one(sq->sg, xdp->data, xdp->data_end - xdp->data);
+	err = virtqueue_add_outbuf(sq->vq, sq->sg, num_sg,
+				   xdp->data, GFP_ATOMIC);
+	if (unlikely(err)) {
+		if (vi->mergeable_rx_bufs)
+			put_page(page);
+		else
+			give_pages(rq, page);
+	} else if (!vi->mergeable_rx_bufs) {
+		/* If not mergeable bufs must be big packets so cleanup pages */
+		give_pages(rq, (struct page *)page->private);
+		page->private = 0;
+	}
+
+	virtqueue_kick(sq->vq);
+}
+
 static u32 do_xdp_prog(struct virtnet_info *vi,
+		       struct receive_queue *rq,
 		       struct bpf_prog *xdp_prog,
 		       struct page *page, int offset, int len)
 {
 	int hdr_padded_len;
 	struct xdp_buff xdp;
+	unsigned int qp;
 	u32 act;
 	u8 *buf;
 
@@ -353,9 +398,15 @@ static u32 do_xdp_prog(struct virtnet_info *vi,
 	switch (act) {
 	case XDP_PASS:
 		return XDP_PASS;
+	case XDP_TX:
+		qp = vi->curr_queue_pairs -
+			vi->xdp_queue_pairs +
+			smp_processor_id();
+		xdp.data = buf + (vi->mergeable_rx_bufs ? 0 : 4);
+		virtnet_xdp_xmit(vi, rq, &vi->sq[qp], &xdp);
+		return XDP_TX;
 	default:
 		bpf_warn_invalid_xdp_action(act);
-	case XDP_TX:
 	case XDP_ABORTED:
 	case XDP_DROP:
 		return XDP_DROP;
@@ -390,9 +441,17 @@ static struct sk_buff *receive_big(struct net_device *dev,
 
 		if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
 			goto err_xdp;
-		act = do_xdp_prog(vi, xdp_prog, page, 0, len);
-		if (act == XDP_DROP)
+		act = do_xdp_prog(vi, rq, xdp_prog, page, 0, len);
+		switch (act) {
+		case XDP_PASS:
+			break;
+		case XDP_TX:
+			rcu_read_unlock();
+			goto xdp_xmit;
+		case XDP_DROP:
+		default:
 			goto err_xdp;
+		}
 	}
 	rcu_read_unlock();
 
@@ -407,6 +466,7 @@ static struct sk_buff *receive_big(struct net_device *dev,
 err:
 	dev->stats.rx_dropped++;
 	give_pages(rq, page);
+xdp_xmit:
 	return NULL;
 }
 
@@ -425,6 +485,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	struct bpf_prog *xdp_prog;
 	unsigned int truesize;
 
+	head_skb = NULL;
+
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(rq->xdp_prog);
 	if (xdp_prog) {
@@ -448,9 +510,17 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
 			goto err_xdp;
 
-		act = do_xdp_prog(vi, xdp_prog, page, offset, len);
-		if (act == XDP_DROP)
+		act = do_xdp_prog(vi, rq, xdp_prog, page, offset, len);
+		switch (act) {
+		case XDP_PASS:
+			break;
+		case XDP_TX:
+			rcu_read_unlock();
+			goto xdp_xmit;
+		case XDP_DROP:
+		default:
 			goto err_xdp;
+		}
 	}
 	rcu_read_unlock();
 
@@ -528,6 +598,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 err_buf:
 	dev->stats.rx_dropped++;
 	dev_kfree_skb(head_skb);
+xdp_xmit:
 	return NULL;
 }
 
@@ -1734,6 +1805,16 @@ static void free_receive_page_frags(struct virtnet_info *vi)
 			put_page(vi->rq[i].alloc_frag.page);
 }
 
+static bool is_xdp_queue(struct virtnet_info *vi, int q)
+{
+	if (q < (vi->curr_queue_pairs - vi->xdp_queue_pairs))
+		return false;
+	else if (q < vi->curr_queue_pairs)
+		return true;
+	else
+		return false;
+}
+
 static void free_unused_bufs(struct virtnet_info *vi)
 {
 	void *buf;
@@ -1741,8 +1822,12 @@ static void free_unused_bufs(struct virtnet_info *vi)
 
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		struct virtqueue *vq = vi->sq[i].vq;
-		while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
-			dev_kfree_skb(buf);
+		while ((buf = virtqueue_detach_unused_buf(vq)) != NULL) {
+			if (!is_xdp_queue(vi, i))
+				dev_kfree_skb(buf);
+			else
+				put_page(virt_to_head_page(buf));
+		}
 	}
 
 	for (i = 0; i < vi->max_queue_pairs; i++) {

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox