Netdev List
 help / color / mirror / Atom feed
* [PATCH net-next] net: phy: don't double-read link status register if link is up
From: Heiner Kallweit @ 2019-02-07 19:22 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org

The link status register latches link-down events. Therefore, if link
is reported as being up, there's no need for a second read.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy-c45.c    | 2 ++
 drivers/net/phy/phy_device.c | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
index d429c6a3e..19334fe5f 100644
--- a/drivers/net/phy/phy-c45.c
+++ b/drivers/net/phy/phy-c45.c
@@ -147,6 +147,8 @@ int genphy_c45_read_link(struct phy_device *phydev)
 			val = phy_read_mmd(phydev, devad, MDIO_STAT1);
 			if (val < 0)
 				return val;
+			else if (val & MDIO_STAT1_LSTATUS)
+				continue;
 		}
 
 		val = phy_read_mmd(phydev, devad, MDIO_STAT1);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index d490cd2a8..9369e1323 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1735,8 +1735,12 @@ int genphy_update_link(struct phy_device *phydev)
 	 */
 	if (!phy_polling_mode(phydev)) {
 		status = phy_read(phydev, MII_BMSR);
-		if (status < 0)
+		if (status < 0) {
 			return status;
+		} else if (status & BMSR_LSTATUS) {
+			phydev->link = 1;
+			return 0;
+		}
 	}
 
 	/* Read link and autonegotiation status */
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH bpf-next] tools/bpf: add missing strings.h include
From: Andrii Nakryiko @ 2019-02-07 19:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Yonghong Song, Song Liu, Alexei Starovoitov,
	Martin Lau, netdev, kernel-team
In-Reply-To: <20190207192111.GB6835@kernel.org>

On Thu, Feb 7, 2019 at 11:21 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Thu, Feb 07, 2019 at 09:50:27AM -0800, Andrii Nakryiko escreveu:
> > Few files in libbpf are using bzero() function (defined in strings.h header), but
> > don't include corresponding header. When libbpf is added as a dependency to pahole,
> > this undeterministically causes warnings on some machines:
> >
> > bpf.c:225:2: warning: implicit declaration of function ‘bzero’ [-Wimplicit-function-declaration]
> >   bzero(&attr, sizeof(attr));
>      ^~~~~
>
> You forgot your:
>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>
> And a:
>
> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> :-)
>

argh... v2 incoming, thanks!

> - Arnaldo
>
> > ---
> >  tools/lib/bpf/bpf.c    | 1 +
> >  tools/lib/bpf/btf.c    | 1 +
> >  tools/lib/bpf/libbpf.c | 1 +
> >  3 files changed, 3 insertions(+)
> >
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index 3defad77dc7a..92fd27fe0599 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
> > @@ -22,6 +22,7 @@
> >   */
> >
> >  #include <stdlib.h>
> > +#include <strings.h>
> >  #include <memory.h>
> >  #include <unistd.h>
> >  #include <asm/unistd.h>
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index ab6528c935a1..4324eb47d214 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -4,6 +4,7 @@
> >  #include <stdio.h>
> >  #include <stdlib.h>
> >  #include <string.h>
> > +#include <strings.h>
> >  #include <unistd.h>
> >  #include <errno.h>
> >  #include <linux/err.h>
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index 47969aa0faf8..8d64ada5f728 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -18,6 +18,7 @@
> >  #include <libgen.h>
> >  #include <inttypes.h>
> >  #include <string.h>
> > +#include <strings.h>
> >  #include <unistd.h>
> >  #include <fcntl.h>
> >  #include <errno.h>
> > --
> > 2.17.1
>
> --
>
> - Arnaldo

^ permalink raw reply

* [PATCH v2 bpf-next] tools/bpf: add missing strings.h include
From: Andrii Nakryiko @ 2019-02-07 19:29 UTC (permalink / raw)
  To: acme, andrii.nakryiko, yhs, songliubraving, ast, kafai, netdev,
	kernel-team
  Cc: Andrii Nakryiko

Few files in libbpf are using bzero() function (defined in strings.h header), but
don't include corresponding header. When libbpf is added as a dependency to pahole,
this undeterministically causes warnings on some machines:

bpf.c:225:2: warning: implicit declaration of function ‘bzero’ [-Wimplicit-function-declaration]
  bzero(&attr, sizeof(attr));
    ^~~~~

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/bpf/bpf.c    | 1 +
 tools/lib/bpf/btf.c    | 1 +
 tools/lib/bpf/libbpf.c | 1 +
 3 files changed, 3 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 3defad77dc7a..92fd27fe0599 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -22,6 +22,7 @@
  */
 
 #include <stdlib.h>
+#include <strings.h>
 #include <memory.h>
 #include <unistd.h>
 #include <asm/unistd.h>
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index ab6528c935a1..4324eb47d214 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -4,6 +4,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <strings.h>
 #include <unistd.h>
 #include <errno.h>
 #include <linux/err.h>
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 47969aa0faf8..8d64ada5f728 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -18,6 +18,7 @@
 #include <libgen.h>
 #include <inttypes.h>
 #include <string.h>
+#include <strings.h>
 #include <unistd.h>
 #include <fcntl.h>
 #include <errno.h>
-- 
2.17.1


^ permalink raw reply related

* Re: stmmac / meson8b-dwmac
From: Simon Huelck @ 2019-02-07 19:30 UTC (permalink / raw)
  To: Emiliano Ingrassia, Martin Blumenstingl
  Cc: Gpeppe.cavallaro, alexandre.torgue, linux-amlogic, netdev
In-Reply-To: <20190206103645.GA2482@ingrassia.epigenesys.com>

Hi Guys,


i can confirm better performance with 4.14.29

- ~900 MBits with iperf2 in one way
-~ 500 - 600MBits with iperf2 in duplex in both directions


This wasnt the case with 4.17.9, not with 4.18, 4.19 or the 5.0 series.....


How can i help further ?

regards,
Simon


Am 06.02.2019 um 11:36 schrieb Emiliano Ingrassia:
> Hi Martin, Hi Simon,
>
> On Mon, Feb 04, 2019 at 03:34:41PM +0100, Martin Blumenstingl wrote:
>> On Thu, Jan 17, 2019 at 10:23 PM Simon Huelck <simonmail@gmx.de> wrote:
>> [...]
>>>>> I got problems with my ODROID c2 running on 4.19.16 ( and some releases
>>>>> earlier ). the stmmac / dwmac driver doesnt provide the 800M/900M
>>>>> performance that i was used to earlier.
>>>>>
> Simon, did you ever reach 1 Gbps full duplex speed?
> If yes, what was the kernel version did you use?
>
>>>>> Now im stuck near 550M/600M in the same environment. but what really
>>>>> confuses me that duplex does hurt even more.
>>>> interesting that you see this on the Odroid-C2 as well.
>>>> previously I have only observed it on an Odroid-C1
>>>>
>>>>> PC --- VLAN3 --> switch --VLAN3--> ODROID
>>>>>
>>>>> NAS <-- VLAN1 -- switch <-- VLAN1-- ODROID
>>>>>
>>>>>
>>>>> this means when im doing a iperf from PC to NAS, that my ODROID has load
>>>>> on RX/TX same time (duplex). this shouldnt be an issue , all is 1GBits
>>>>> FD. And in the past that wasnt an issue.
>> +Cc Emiliano who has seen a similar duplex issue on his Odroid-C1: [0]
>> (please note that all kernels prior to v5.1 with the pending patches
>> from [1] applied are only receiving data on RXD0 and RXD1 but not on
>> RXD2 and RXD3)
>>
>> Emiliano, can you confirm the duplex issue observed by Simon is
>> similar to the one you see on your Odroid-C1?
>>
> It could be but, if I understand correctly, Simon is limited in
> speed also in half duplex transmission (~550/600 Mbps), while we can
> reach at least 900 Mbps.
>
>>>>>
>>>>> Now what happens:
>>>>>
>>>>> - benchmark between PC - ODROID is roughly 550M
>>>>>
>>>>> - benchmark between NAS - ODROID is roughly 550M
>>>>>
>>>>> - benchmark between PC - NAS is only around 300M
>>>>>
>>>>>
>>>>> and like i said i was easliy able to hit 800 or even 900M to my NAS
>>>>> earlier. I applied some .dtb fixes for interrupt levels for the
>>>>> meson-gx.dtsi and meson-gxbb-odroid-c2.dtb, which will be mainlined ,
>>>>> but the effect stayed identical.
>>>> good that you have the interrupt patches already applied
>>>> I believe it don't fix any performance issues - it's a fix for the
>>>> Ethernet controller seemingly getting "stuck" (not processing data
>>>> anymore). however, that already rules out one potential issue
>>>>
>>>>> are you aware of this problem ? Earlier kernel versions were all
>>>>> perfectly fine and i stepped ( self compiled) kernel through all major
>>>>> releases since odroid c2 was mainlined.
>> Guiseppe, Alexandre: what kind of data do you need from us if we see
>> the speeds drop (in both directions) when we send and receive at the
>> same time?
>>
>> [...]
>>> the problem is that i dont have these kernel sources anymore :-(. but i
>>> can provide some testing and numbers. maybe i dig if i got these kernel
>>> configs somewhere around but i did not change much during migrating
>> do you remember the kernel version where it worked fine?
>>
>>> im using a zyxel gs1900-8 switch and a qnap ts231p , and as i said i
>>> didnt change my setup. i was able to hit 100MByte/s from my NAS , so
>>> close to the benchmarks of 900MBit/s
>> I typically only do small transfers or I have traffic only in one direction.
>> thus it's likely that I missed this in my own tests
>>
>>
>> Regards
>> Martin
>>
>>
>> [0] http://lists.infradead.org/pipermail/linux-amlogic/2018-December/009679.html
>> [1] https://patchwork.kernel.org/cover/10744905/
> Regards,
>
> Emiliano



^ permalink raw reply

* Re: [PATCH net-next 0/2] qed*: SmartAN query support
From: Jakub Kicinski @ 2019-02-07 19:54 UTC (permalink / raw)
  To: Sudarsana Reddy Kalluru; +Cc: davem, netdev, aelior, mkalderon
In-Reply-To: <20190207142012.4521-1-skalluru@marvell.com>

On Thu, 7 Feb 2019 06:20:10 -0800, Sudarsana Reddy Kalluru wrote:
> SmartAN feature detects the peer/cable capabilities and establishes the
> link in the best possible configuration.

It sounds familiar, I need to check with FW team, but I think we may be
doing a similar thing, and adding a common API rather than ethtool flag
would be preferable.

Could you please share a little bit more detail?  What are the
configurations this would choose between?

^ permalink raw reply

* [PATCH bpf] bpf: only adjust gso_size on bytestream protocols
From: Willem de Bruijn @ 2019-02-07 19:54 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, posk.devel, dja, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

bpf_skb_change_proto and bpf_skb_adjust_room change skb header length.
For GSO packets they adjust gso_size to maintain the same MTU.

The gso size can only be safely adjusted on bytestream protocols.
Commit d02f51cbcf12 ("bpf: fix bpf_skb_adjust_net/bpf_skb_proto_xlat
to deal with gso sctp skbs") excluded SKB_GSO_SCTP.

Since then type SKB_GSO_UDP_L4 has been added, whose contents are one
gso_size unit per datagram. Also exclude these.

Move from a blacklist to a whitelist check to future proof against
additional such new GSO types, e.g., for fraglist based GRO.

Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT")
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 include/linux/skbuff.h |  6 ++++++
 net/core/filter.c      | 12 ++++--------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 95d25b010a257..5a7a8b93a5abf 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4212,6 +4212,12 @@ static inline bool skb_is_gso_sctp(const struct sk_buff *skb)
 	return skb_shinfo(skb)->gso_type & SKB_GSO_SCTP;
 }
 
+static inline bool skb_is_gso_tcp(const struct sk_buff *skb)
+{
+	return skb_is_gso(skb) &&
+	       skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6);
+}
+
 static inline void skb_gso_reset(struct sk_buff *skb)
 {
 	skb_shinfo(skb)->gso_size = 0;
diff --git a/net/core/filter.c b/net/core/filter.c
index 7a54dc11ac2d3..f7d0004fc1609 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2789,8 +2789,7 @@ static int bpf_skb_proto_4_to_6(struct sk_buff *skb)
 	u32 off = skb_mac_header_len(skb);
 	int ret;
 
-	/* SCTP uses GSO_BY_FRAGS, thus cannot adjust it. */
-	if (skb_is_gso(skb) && unlikely(skb_is_gso_sctp(skb)))
+	if (!skb_is_gso_tcp(skb))
 		return -ENOTSUPP;
 
 	ret = skb_cow(skb, len_diff);
@@ -2831,8 +2830,7 @@ static int bpf_skb_proto_6_to_4(struct sk_buff *skb)
 	u32 off = skb_mac_header_len(skb);
 	int ret;
 
-	/* SCTP uses GSO_BY_FRAGS, thus cannot adjust it. */
-	if (skb_is_gso(skb) && unlikely(skb_is_gso_sctp(skb)))
+	if (!skb_is_gso_tcp(skb))
 		return -ENOTSUPP;
 
 	ret = skb_unclone(skb, GFP_ATOMIC);
@@ -2957,8 +2955,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 len_diff)
 	u32 off = skb_mac_header_len(skb) + bpf_skb_net_base_len(skb);
 	int ret;
 
-	/* SCTP uses GSO_BY_FRAGS, thus cannot adjust it. */
-	if (skb_is_gso(skb) && unlikely(skb_is_gso_sctp(skb)))
+	if (!skb_is_gso_tcp(skb))
 		return -ENOTSUPP;
 
 	ret = skb_cow(skb, len_diff);
@@ -2987,8 +2984,7 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 len_diff)
 	u32 off = skb_mac_header_len(skb) + bpf_skb_net_base_len(skb);
 	int ret;
 
-	/* SCTP uses GSO_BY_FRAGS, thus cannot adjust it. */
-	if (skb_is_gso(skb) && unlikely(skb_is_gso_sctp(skb)))
+	if (!skb_is_gso_tcp(skb))
 		return -ENOTSUPP;
 
 	ret = skb_unclone(skb, GFP_ATOMIC);
-- 
2.20.1.611.gfbb209baf1-goog


^ permalink raw reply related

* Re: [PATCH net-next] net: phy: let genphy_c45_read_link manage the devices to check
From: Andrew Lunn @ 2019-02-07 19:57 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Florian Fainelli, David Miller, Russell King,
	netdev@vger.kernel.org
In-Reply-To: <e784e845-5e25-9916-343a-0d60fbf7fc9b@gmail.com>

On Thu, Feb 07, 2019 at 08:05:55PM +0100, Heiner Kallweit wrote:
> Let genphy_c45_read_link manage the devices to check, this removes
> overhead from callers. Devices VEND1 and VEND2 will never be checked,
> for now adopt the logic of the Marvell driver to also exclude PHY XS.
> 
> At the moment we have very few clause 45 PHY drivers, so we are
> lacking experience whether other drivers will have to exclude further
> devices, or may need to check PHY XS. If we should figure out that
> list of devices to check needs to be configurable, I think best will
> be to add a device list member to struct phy_driver.

Hi Heiner

For the Aquantia PHY you probably need to exclude MDIO_MMD_C22EXT.

There is no register 1D:1 listed in the datasheet.

PHY XS is the interface towards the MAC. You cannot configure the MAC
to use the correct interface mode until the PHY has link to its peer
and the link mode has been determined. So you need the PHY to signal
link up independent of the MAC-PHY link, to kick off the configuration
of the MAC. When using phylink, the MAC can indicate it has a link to
the PHY using phylink_mac_change().

      Andrew
 

^ permalink raw reply

* Re: [PATCH v2 bpf-next 2/2] btf: expose API to work with raw btf data
From: Andrii Nakryiko @ 2019-02-07 20:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Song Liu, Yonghong Song, Alexei Starovoitov,
	Martin Lau, netdev, Daniel Borkmann
In-Reply-To: <CAEf4Bzbj8=YBYDs52TA8r4HAKS64rZMg9X0WGTdcJBLauL+F3A@mail.gmail.com>

On Thu, Feb 7, 2019 at 11:21 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Feb 5, 2019 at 10:25 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Feb 05, 2019 at 09:46:14PM -0800, Andrii Nakryiko wrote:
> > > On Tue, Feb 5, 2019 at 7:07 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Tue, Feb 05, 2019 at 04:29:49PM -0800, Andrii Nakryiko wrote:
> > > > > This patch exposes two new APIs btf__get_raw_data_size() and
> > > > > btf__get_raw_data() that allows to get a copy of raw BTF data out of
> > > > > struct btf. This is useful for external programs that need to manipulate
> > > > > raw data, e.g., pahole using btf__dedup() to deduplicate BTF type info
> > > > > and then writing it back to file.
> > > > >
> > > > > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > > > > Acked-by: Song Liu <songliubraving@fb.com>
> > > > > ---
> > > > >  tools/lib/bpf/btf.c      | 10 ++++++++++
> > > > >  tools/lib/bpf/btf.h      |  2 ++
> > > > >  tools/lib/bpf/libbpf.map |  2 ++
> > > > >  3 files changed, 14 insertions(+)
> > > > >
> > > > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > > > index 1c2ba7182400..34bfb3641aac 100644
> > > > > --- a/tools/lib/bpf/btf.c
> > > > > +++ b/tools/lib/bpf/btf.c
> > > > > @@ -437,6 +437,16 @@ int btf__fd(const struct btf *btf)
> > > > >       return btf->fd;
> > > > >  }
> > > > >
> > > > > +__u32 btf__get_raw_data_size(const struct btf *btf)
> > > > > +{
> > > > > +     return btf->data_size;
> > > > > +}
> > > > > +
> > > > > +void btf__get_raw_data(const struct btf *btf, char *data)
> > > > > +{
> > > > > +     memcpy(data, btf->data, btf->data_size);
> > > > > +}
> > > >
> > > > I cannot think of any other way to use this api,
> > > > but to call btf__get_raw_data_size() first,
> > > > then malloc that much memory and then call btf__get_raw_data()
> > > > to store btf into it.
> > > >
> > > > If so, may be api should be single call that allocates, copies,
> > > > and returns pointer to new mem and its size?
> > > > Probably less error prone?
> > > >
> > >
> > > I don't have strong preference, but providing pointer to allocated memory
> > > seems more flexible and allows more clever/optimal use of memory from caller
> > > side. E.g., instead of doing two mallocs, you can imagine doing something
> > > like:
> > >
> > > int max_size = max(btf__get_raw_data_size(btf),
> > >                    btf_ext__get_raw_data_size(btf_ext));
> > > char *m = malloc(max_size);
> > > btf__get_raw_data(btf, m);
> > > dump_btf_section_to_file(m, some_file);
> > > btf_ext__get_raw_data(btf_ext, m);
> > > dump_btf_ext_section_to_file(m, some_file);
> > > free(m);
> > >
> > > Also, pointer to memory could be mmap()'ed file, for instance. In general,
> > > for a library it might be a good thing to not be prescriptive as to how one
> > > gets that piece of memory.
> >
> > Plausible, but I'd like to see pahole patches to be convinced ;)
> >
>
> Here's a summary of proposed ways to expose raw data through new api,
> with pros/cons.
>
> 1. Originally proposed two functions. `int btf__get_raw_data_size()`
> to get size, `void btf__get_raw_data(void* buf)` to write raw data to
> a provided buf.
>
> Pros:
>   - allows maximal flexibility for users of this API. They can manage
> memory as it's convenient for them (e.g., reusing same buffer for
> multiple btf and btf_ext raw data).
>   - allows using mmap()'ed memory, as allocation and memory ownership
> is delegated to user
>
> Cons:
>   - has potential of buffer overflows, if user doesn't provide big enough buffer
>
>
> 2. Alexei's proposal to combine getting size in single function that
> internally allocates new memory buffer, copies data and returns it to
> users to use and later free.
>
> Pros:
>   - one less API function
>   - more straightforward usage, it's hard to misuse it (except for
> memory leaking, if memory is not freed)
>
> Cons:
>   - always allocated for each call
>   - least flexible approach, doesn't allow caller to manage memory,
> prevents any kind of direct write to mmap()'ed file
>
> 3. Daniel proposed realloc-like approach, where caller optionally
> provides memory buffer, but we always call realloc() internally to
> ensure we have long enough buffer.
>
> Pros:
>   - allows callers to provide their memory buffer (similar to approach
> #1, but see cons below)
>   - prevents user error with providing too small buffer (similar to approach #2)
>
> Cons:
>   - realloc expects that memory was allocated by previous malloc()
> call, so caller can't allocate bigger chunk of memory and provide
> pointer inside that area (behavior is undefined in that case). This
> requirement is not immediately obvious, so this approach feels more
> error prone than either of approach #1 and #2
>   - still doesn't allow mmap()'ed usage, again due to realloc()'s requirements
>

There is actually approach #4 - just return const void* to an internal
memory buffer. This is trivial for struct btf, will require just
slight changes for struct btf_ext, but it puts all the control in
user's hands without imposing any unnecessary allocations. This
approach seems to provide best of all approaches with no downsides.

>
> Approach #3 looks most subtly-error-prone, as it's too easy to just
> provide pointer that's not at the beginning of malloc()'ed memory, but
> this might not be detected immediately, and could potentially lead to
> silent memory corruption.
>
> I'd still go with approach #1 as it provides least restrictive API,
> even though approach #2 will provide marginally better usability for
> common cases.
>
> Alexei, Daniel, which approach you'd prefer in the end after
> considering all pros and cons?

^ permalink raw reply

* [PATCH v2 net-next] net: fixed-phy: Add fixed_phy_register_with_gpiod() API
From: Moritz Fischer @ 2019-02-07 20:14 UTC (permalink / raw)
  To: netdev; +Cc: andrew, f.fainelli, hkallweit1, davem, linux-kernel,
	Moritz Fischer

Add fixed_phy_register_with_gpiod() API. It lets users create a
fixed_phy instance that uses a GPIO descriptor which was obtained
externally e.g. through platform data.
This enables platform devices (non-DT based) to use GPIOs for link
status.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---

Changes from v1:
- Added Florian's Reviewed-by: tag

Changes from RFC:
- Implemented Andrew's/Florians suggestion to add new API
  instead of changing old API

---
 drivers/net/phy/fixed_phy.c | 32 +++++++++++++++++++++++++-------
 include/linux/phy_fixed.h   | 15 +++++++++++++++
 2 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index d810f914aaa4..b0d1368c3400 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -229,12 +229,12 @@ static struct gpio_desc *fixed_phy_get_gpiod(struct device_node *np)
 }
 #endif
 
-struct phy_device *fixed_phy_register(unsigned int irq,
-				      struct fixed_phy_status *status,
-				      struct device_node *np)
+static struct phy_device *__fixed_phy_register(unsigned int irq,
+					       struct fixed_phy_status *status,
+					       struct device_node *np,
+					       struct gpio_desc *gpiod)
 {
 	struct fixed_mdio_bus *fmb = &platform_fmb;
-	struct gpio_desc *gpiod = NULL;
 	struct phy_device *phy;
 	int phy_addr;
 	int ret;
@@ -243,9 +243,11 @@ struct phy_device *fixed_phy_register(unsigned int irq,
 		return ERR_PTR(-EPROBE_DEFER);
 
 	/* Check if we have a GPIO associated with this fixed phy */
-	gpiod = fixed_phy_get_gpiod(np);
-	if (IS_ERR(gpiod))
-		return ERR_CAST(gpiod);
+	if (!gpiod) {
+		gpiod = fixed_phy_get_gpiod(np);
+		if (IS_ERR(gpiod))
+			return ERR_CAST(gpiod);
+	}
 
 	/* Get the next available PHY address, up to PHY_MAX_ADDR */
 	phy_addr = ida_simple_get(&phy_fixed_ida, 0, PHY_MAX_ADDR, GFP_KERNEL);
@@ -308,8 +310,24 @@ struct phy_device *fixed_phy_register(unsigned int irq,
 
 	return phy;
 }
+
+struct phy_device *fixed_phy_register(unsigned int irq,
+				      struct fixed_phy_status *status,
+				      struct device_node *np)
+{
+	return __fixed_phy_register(irq, status, np, NULL);
+}
 EXPORT_SYMBOL_GPL(fixed_phy_register);
 
+struct phy_device *
+fixed_phy_register_with_gpiod(unsigned int irq,
+			      struct fixed_phy_status *status,
+			      struct gpio_desc *gpiod)
+{
+	return __fixed_phy_register(irq, status, NULL, gpiod);
+}
+EXPORT_SYMBOL_GPL(fixed_phy_register_with_gpiod);
+
 void fixed_phy_unregister(struct phy_device *phy)
 {
 	phy_device_remove(phy);
diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
index c78fc203db43..1e5d86ebdaeb 100644
--- a/include/linux/phy_fixed.h
+++ b/include/linux/phy_fixed.h
@@ -19,6 +19,12 @@ extern int fixed_phy_add(unsigned int irq, int phy_id,
 extern struct phy_device *fixed_phy_register(unsigned int irq,
 					     struct fixed_phy_status *status,
 					     struct device_node *np);
+
+extern struct phy_device *
+fixed_phy_register_with_gpiod(unsigned int irq,
+			      struct fixed_phy_status *status,
+			      struct gpio_desc *gpiod);
+
 extern void fixed_phy_unregister(struct phy_device *phydev);
 extern int fixed_phy_set_link_update(struct phy_device *phydev,
 			int (*link_update)(struct net_device *,
@@ -35,6 +41,15 @@ static inline struct phy_device *fixed_phy_register(unsigned int irq,
 {
 	return ERR_PTR(-ENODEV);
 }
+
+static inline struct phy_device *
+fixed_phy_register_with_gpiod(unsigned int irq,
+			      struct fixed_phy_status *status,
+			      struct gpio_desc *gpiod)
+{
+	return ERR_PTR(-ENODEV);
+}
+
 static inline void fixed_phy_unregister(struct phy_device *phydev)
 {
 }
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH net-next] net: phy: let genphy_c45_read_link manage the devices to check
From: Heiner Kallweit @ 2019-02-07 20:21 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, David Miller, Russell King,
	netdev@vger.kernel.org
In-Reply-To: <20190207195720.GP32483@lunn.ch>

On 07.02.2019 20:57, Andrew Lunn wrote:
> On Thu, Feb 07, 2019 at 08:05:55PM +0100, Heiner Kallweit wrote:
>> Let genphy_c45_read_link manage the devices to check, this removes
>> overhead from callers. Devices VEND1 and VEND2 will never be checked,
>> for now adopt the logic of the Marvell driver to also exclude PHY XS.
>>
>> At the moment we have very few clause 45 PHY drivers, so we are
>> lacking experience whether other drivers will have to exclude further
>> devices, or may need to check PHY XS. If we should figure out that
>> list of devices to check needs to be configurable, I think best will
>> be to add a device list member to struct phy_driver.
> 
> Hi Heiner
> 
> For the Aquantia PHY you probably need to exclude MDIO_MMD_C22EXT.
> 
> There is no register 1D:1 listed in the datasheet.
> 
Thanks, Andrew. Right, the Aquantia PHY doesn't seem to have the C22EXT
MMD. Because the Aquantia PHY has no device 29 in its package the code
should work.
I also checked the 802.3 spec and it says that registers 29.0 to
29.4 are reserved. At another location the 802.3 spec states that read
access to non-existing registers should return a zero value.
Therefore current code (when reading a 0 from 29.1) may come to the
false conclusion that device 29 reports link down.
So it seems we have to exclude device C22EXT in general. I'll add that
and submit a v2.

> PHY XS is the interface towards the MAC. You cannot configure the MAC
> to use the correct interface mode until the PHY has link to its peer
> and the link mode has been determined. So you need the PHY to signal
> link up independent of the MAC-PHY link, to kick off the configuration
> of the MAC. When using phylink, the MAC can indicate it has a link to
> the PHY using phylink_mac_change().
> 
OK, so excluding PHY XS is right.

>       Andrew
>  
> 
Heiner

^ permalink raw reply

* [PATCH net] vxlan: test dev->flags & IFF_UP before calling netif_rx()
From: Eric Dumazet @ 2019-02-07 20:27 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, Petr Machata, Ido Schimmel,
	Roopa Prabhu, Stefano Brivio

netif_rx() must be called under a strict contract.

At device dismantle phase, core networking clears IFF_UP
and flush_all_backlogs() is called after rcu grace period
to make sure no incoming packet might be in a cpu backlog
and still referencing the device.

Most drivers call netif_rx() from their interrupt handler,
and since the interrupts are disabled at device dismantle,
netif_rx() does not have to check dev->flags & IFF_UP

Virtual drivers do not have this guarantee, and must
therefore make the check themselves.

Otherwise we risk use-after-free and/or crashes.

Note this patch also fixes a small issue that came
with commit ce6502a8f957 ("vxlan: fix a use after free
in vxlan_encap_bypass"), since the dev->stats.rx_dropped
change was done on the wrong device.

Fixes: d342894c5d2f ("vxlan: virtual extensible lan")
Fixes: ce6502a8f957 ("vxlan: fix a use after free in vxlan_encap_bypass")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Petr Machata <petrm@mellanox.com>
Cc: Ido Schimmel <idosch@mellanox.com>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Stefano Brivio <sbrivio@redhat.com>
---
 drivers/net/vxlan.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 5209ee9aac47846367d7f469a7e69d08c030087e..2aae11feff0c22c1b073a6a119fc6c311cbe1691 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2219,7 +2219,7 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan,
 	struct pcpu_sw_netstats *tx_stats, *rx_stats;
 	union vxlan_addr loopback;
 	union vxlan_addr *remote_ip = &dst_vxlan->default_dst.remote_ip;
-	struct net_device *dev = skb->dev;
+	struct net_device *dev;
 	int len = skb->len;
 
 	tx_stats = this_cpu_ptr(src_vxlan->dev->tstats);
@@ -2239,9 +2239,15 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan,
 #endif
 	}
 
+	rcu_read_lock();
+	dev = skb->dev;
+	if (unlikely(!(dev->flags & IFF_UP))) {
+		kfree_skb(skb);
+		goto drop;
+	}
+
 	if (dst_vxlan->cfg.flags & VXLAN_F_LEARN)
-		vxlan_snoop(skb->dev, &loopback, eth_hdr(skb)->h_source, 0,
-			    vni);
+		vxlan_snoop(dev, &loopback, eth_hdr(skb)->h_source, 0, vni);
 
 	u64_stats_update_begin(&tx_stats->syncp);
 	tx_stats->tx_packets++;
@@ -2254,8 +2260,10 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan,
 		rx_stats->rx_bytes += len;
 		u64_stats_update_end(&rx_stats->syncp);
 	} else {
+drop:
 		dev->stats.rx_dropped++;
 	}
+	rcu_read_unlock();
 }
 
 static int encap_bypass_if_local(struct sk_buff *skb, struct net_device *dev,
-- 
2.20.1.791.gb4d0f1c61a-goog


^ permalink raw reply related

* Re: [PATCH net] vxlan: test dev->flags & IFF_UP before calling netif_rx()
From: Eric Dumazet @ 2019-02-07 20:34 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Petr Machata, Ido Schimmel, Roopa Prabhu,
	Stefano Brivio
In-Reply-To: <20190207202738.155940-1-edumazet@google.com>

On Thu, Feb 7, 2019 at 12:27 PM Eric Dumazet <edumazet@google.com> wrote:
>
> netif_rx() must be called under a strict contract.
>
> At device dismantle phase, core networking clears IFF_UP
> and flush_all_backlogs() is called after rcu grace period
> to make sure no incoming packet might be in a cpu backlog
> and still referencing the device.
>
> Most drivers call netif_rx() from their interrupt handler,
> and since the interrupts are disabled at device dismantle,
> netif_rx() does not have to check dev->flags & IFF_UP
>
> Virtual drivers do not have this guarantee, and must
> therefore make the check themselves.

Since other netif_rx() callers seem to have the same issue, I wonder
if we simply could add a generic check.

Most high performance drivers no longer call netif_rx() so an extra check
in it wont be a problem.

^ permalink raw reply

* [PATCH net] af_key: unconditionally clone on broadcast
From: Sean Tranchetti @ 2019-02-07 20:33 UTC (permalink / raw)
  To: eric.dumazet, davem, netdev; +Cc: Sean Tranchetti

Attempting to avoid cloning the skb when broadcasting by inflating
the refcount with sock_hold/sock_put while under RCU lock is dangerous
and violates RCU principles. It leads to subtle race conditions when
attempting to free the SKB, as we may reference sockets that have
already been freed by the stack.

Unable to handle kernel paging request at virtual address 6b6b6b6b6b6c4b
[006b6b6b6b6b6c4b] address between user and kernel address ranges
Internal error: Oops: 96000004 [#1] PREEMPT SMP
task: fffffff78f65b380 task.stack: ffffff8049a88000
pc : sock_rfree+0x38/0x6c
lr : skb_release_head_state+0x6c/0xcc
Process repro (pid: 7117, stack limit = 0xffffff8049a88000)
Call trace:
	sock_rfree+0x38/0x6c
	skb_release_head_state+0x6c/0xcc
	skb_release_all+0x1c/0x38
	__kfree_skb+0x1c/0x30
	kfree_skb+0xd0/0xf4
	pfkey_broadcast+0x14c/0x18c
	pfkey_sendmsg+0x1d8/0x408
	sock_sendmsg+0x44/0x60
	___sys_sendmsg+0x1d0/0x2a8
	__sys_sendmsg+0x64/0xb4
	SyS_sendmsg+0x34/0x4c
	el0_svc_naked+0x34/0x38
Kernel panic - not syncing: Fatal exception

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
---
Realized I never actually sent this patch out after testing the changes
Eric recommended. Whoops. Better late then never, I suppose...

 net/key/af_key.c | 40 +++++++++++++++-------------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 655c787..e800891 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -196,30 +196,22 @@ static int pfkey_release(struct socket *sock)
 	return 0;
 }
 
-static int pfkey_broadcast_one(struct sk_buff *skb, struct sk_buff **skb2,
-			       gfp_t allocation, struct sock *sk)
+static int pfkey_broadcast_one(struct sk_buff *skb, gfp_t allocation,
+			       struct sock *sk)
 {
 	int err = -ENOBUFS;
 
-	sock_hold(sk);
-	if (*skb2 == NULL) {
-		if (refcount_read(&skb->users) != 1) {
-			*skb2 = skb_clone(skb, allocation);
-		} else {
-			*skb2 = skb;
-			refcount_inc(&skb->users);
-		}
-	}
-	if (*skb2 != NULL) {
-		if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf) {
-			skb_set_owner_r(*skb2, sk);
-			skb_queue_tail(&sk->sk_receive_queue, *skb2);
-			sk->sk_data_ready(sk);
-			*skb2 = NULL;
-			err = 0;
-		}
+	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
+		return err;
+
+	skb = skb_clone(skb, allocation);
+
+	if (skb) {
+		skb_set_owner_r(skb, sk);
+		skb_queue_tail(&sk->sk_receive_queue, skb);
+		sk->sk_data_ready(sk);
+		err = 0;
 	}
-	sock_put(sk);
 	return err;
 }
 
@@ -234,7 +226,6 @@ static int pfkey_broadcast(struct sk_buff *skb, gfp_t allocation,
 {
 	struct netns_pfkey *net_pfkey = net_generic(net, pfkey_net_id);
 	struct sock *sk;
-	struct sk_buff *skb2 = NULL;
 	int err = -ESRCH;
 
 	/* XXX Do we need something like netlink_overrun?  I think
@@ -253,7 +244,7 @@ static int pfkey_broadcast(struct sk_buff *skb, gfp_t allocation,
 		 * socket.
 		 */
 		if (pfk->promisc)
-			pfkey_broadcast_one(skb, &skb2, GFP_ATOMIC, sk);
+			pfkey_broadcast_one(skb, GFP_ATOMIC, sk);
 
 		/* the exact target will be processed later */
 		if (sk == one_sk)
@@ -268,7 +259,7 @@ static int pfkey_broadcast(struct sk_buff *skb, gfp_t allocation,
 				continue;
 		}
 
-		err2 = pfkey_broadcast_one(skb, &skb2, GFP_ATOMIC, sk);
+		err2 = pfkey_broadcast_one(skb, GFP_ATOMIC, sk);
 
 		/* Error is cleared after successful sending to at least one
 		 * registered KM */
@@ -278,9 +269,8 @@ static int pfkey_broadcast(struct sk_buff *skb, gfp_t allocation,
 	rcu_read_unlock();
 
 	if (one_sk != NULL)
-		err = pfkey_broadcast_one(skb, &skb2, allocation, one_sk);
+		err = pfkey_broadcast_one(skb, allocation, one_sk);
 
-	kfree_skb(skb2);
 	kfree_skb(skb);
 	return err;
 }
-- 
1.9.1


^ permalink raw reply related

* [PATCH v2 net-next] net: phy: let genphy_c45_read_link manage the devices to check
From: Heiner Kallweit @ 2019-02-07 20:41 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller, Russell King
  Cc: netdev@vger.kernel.org

Let genphy_c45_read_link manage the devices to check, this removes
overhead from callers. Add C22EXT to the list of excluded devices
because it doesn't implement the status register. According to the
802.3 clause 45 spec registers 29.0 - 29.4 are reserved.

At the moment we have very few clause 45 PHY drivers, so we are
lacking experience whether other drivers will have to exclude further
devices, or may need to check PHY XS. If we should figure out that
list of devices to check needs to be configurable, I think best will
be to add a device list member to struct phy_driver.

v2:
- adjusted commit message
- exclude also device C22EXT from link checking

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/marvell10g.c | 10 +---------
 drivers/net/phy/phy-c45.c    | 18 ++++++++++--------
 include/linux/phy.h          |  2 +-
 include/uapi/linux/mdio.h    |  2 ++
 4 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c
index 296a537cd..96a79c6c7 100644
--- a/drivers/net/phy/marvell10g.c
+++ b/drivers/net/phy/marvell10g.c
@@ -428,16 +428,8 @@ static int mv3310_read_10gbr_status(struct phy_device *phydev)
 
 static int mv3310_read_status(struct phy_device *phydev)
 {
-	u32 mmd_mask = phydev->c45_ids.devices_in_package;
 	int val;
 
-	/* The vendor devads do not report link status.  Avoid the PHYXS
-	 * instance as there are three, and its status depends on the MAC
-	 * being appropriately configured for the negotiated speed.
-	 */
-	mmd_mask &= ~(BIT(MDIO_MMD_VEND1) | BIT(MDIO_MMD_VEND2) |
-		      BIT(MDIO_MMD_PHYXS));
-
 	phydev->speed = SPEED_UNKNOWN;
 	phydev->duplex = DUPLEX_UNKNOWN;
 	linkmode_zero(phydev->lp_advertising);
@@ -453,7 +445,7 @@ static int mv3310_read_status(struct phy_device *phydev)
 	if (val & MDIO_STAT1_LSTATUS)
 		return mv3310_read_10gbr_status(phydev);
 
-	val = genphy_c45_read_link(phydev, mmd_mask);
+	val = genphy_c45_read_link(phydev);
 	if (val < 0)
 		return val;
 
diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
index b09a5e134..eff9e5a4d 100644
--- a/drivers/net/phy/phy-c45.c
+++ b/drivers/net/phy/phy-c45.c
@@ -118,17 +118,24 @@ EXPORT_SYMBOL_GPL(genphy_c45_aneg_done);
 /**
  * genphy_c45_read_link - read the overall link status from the MMDs
  * @phydev: target phy_device struct
- * @mmd_mask: MMDs to read status from
  *
  * Read the link status from the specified MMDs, and if they all indicate
  * that the link is up, set phydev->link to 1.  If an error is encountered,
  * a negative errno will be returned, otherwise zero.
  */
-int genphy_c45_read_link(struct phy_device *phydev, u32 mmd_mask)
+int genphy_c45_read_link(struct phy_device *phydev)
 {
+	u32 mmd_mask = phydev->c45_ids.devices_in_package;
 	int val, devad;
 	bool link = true;
 
+	/* The vendor devads and C22EXT do not report link status. Avoid the
+	 * PHYXS instance as its status may depend on the MAC being
+	 * appropriately configured for the negotiated speed.
+	 */
+	mmd_mask &= ~(MDIO_DEVS_VEND1 | MDIO_DEVS_VEND2 | MDIO_DEVS_C22EXT |
+		      MDIO_DEVS_PHYXS);
+
 	while (mmd_mask && link) {
 		devad = __ffs(mmd_mask);
 		mmd_mask &= ~BIT(devad);
@@ -274,16 +281,11 @@ EXPORT_SYMBOL_GPL(gen10g_config_aneg);
 
 int gen10g_read_status(struct phy_device *phydev)
 {
-	u32 mmd_mask = phydev->c45_ids.devices_in_package;
-
 	/* For now just lie and say it's 10G all the time */
 	phydev->speed = SPEED_10000;
 	phydev->duplex = DUPLEX_FULL;
 
-	/* Avoid reading the vendor MMDs */
-	mmd_mask &= ~(BIT(MDIO_MMD_VEND1) | BIT(MDIO_MMD_VEND2));
-
-	return genphy_c45_read_link(phydev, mmd_mask);
+	return genphy_c45_read_link(phydev);
 }
 EXPORT_SYMBOL_GPL(gen10g_read_status);
 
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 237dd0358..f41bf651f 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1094,7 +1094,7 @@ int genphy_write_mmd_unsupported(struct phy_device *phdev, int devnum,
 /* Clause 45 PHY */
 int genphy_c45_restart_aneg(struct phy_device *phydev);
 int genphy_c45_aneg_done(struct phy_device *phydev);
-int genphy_c45_read_link(struct phy_device *phydev, u32 mmd_mask);
+int genphy_c45_read_link(struct phy_device *phydev);
 int genphy_c45_read_lpa(struct phy_device *phydev);
 int genphy_c45_read_pma(struct phy_device *phydev);
 int genphy_c45_pma_setup_forced(struct phy_device *phydev);
diff --git a/include/uapi/linux/mdio.h b/include/uapi/linux/mdio.h
index d435b00d6..2e6e309f0 100644
--- a/include/uapi/linux/mdio.h
+++ b/include/uapi/linux/mdio.h
@@ -123,6 +123,8 @@
 #define MDIO_DEVS_TC			MDIO_DEVS_PRESENT(MDIO_MMD_TC)
 #define MDIO_DEVS_AN			MDIO_DEVS_PRESENT(MDIO_MMD_AN)
 #define MDIO_DEVS_C22EXT		MDIO_DEVS_PRESENT(MDIO_MMD_C22EXT)
+#define MDIO_DEVS_VEND1			MDIO_DEVS_PRESENT(MDIO_MMD_VEND1)
+#define MDIO_DEVS_VEND2			MDIO_DEVS_PRESENT(MDIO_MMD_VEND2)
 
 /* Control register 2. */
 #define MDIO_PMA_CTRL2_TYPE		0x000f	/* PMA/PMD type selection */
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH net-next] net: phy: let genphy_c45_read_link manage the devices to check
From: Andrew Lunn @ 2019-02-07 20:50 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Florian Fainelli, David Miller, Russell King,
	netdev@vger.kernel.org
In-Reply-To: <25962cfd-f759-10ab-e7a5-9816852412c1@gmail.com>

> Thanks, Andrew. Right, the Aquantia PHY doesn't seem to have the C22EXT
> MMD. Because the Aquantia PHY has no device 29 in its package the code
> should work.

It lists device 29 in its devices in package. So the current code does
look there.

> So it seems we have to exclude device C22EXT in general. I'll add that
> and submit a v2.

Yes.

	Andrew

^ permalink raw reply

* Re: [PATCH v2 net-next] net: phy: let genphy_c45_read_link manage the devices to check
From: Andrew Lunn @ 2019-02-07 20:59 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Florian Fainelli, David Miller, Russell King,
	netdev@vger.kernel.org
In-Reply-To: <7c00df20-b3e1-7ea9-2adb-004e6291d52c@gmail.com>

On Thu, Feb 07, 2019 at 09:41:46PM +0100, Heiner Kallweit wrote:
> Let genphy_c45_read_link manage the devices to check, this removes
> overhead from callers. Add C22EXT to the list of excluded devices
> because it doesn't implement the status register. According to the
> 802.3 clause 45 spec registers 29.0 - 29.4 are reserved.
> 
> At the moment we have very few clause 45 PHY drivers, so we are
> lacking experience whether other drivers will have to exclude further
> devices, or may need to check PHY XS. If we should figure out that
> list of devices to check needs to be configurable, I think best will
> be to add a device list member to struct phy_driver.
> 
> v2:
> - adjusted commit message
> - exclude also device C22EXT from link checking
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net-next 0/2] qed*: SmartAN query support
From: David Miller @ 2019-02-07 21:22 UTC (permalink / raw)
  To: skalluru; +Cc: netdev, aelior, mkalderon
In-Reply-To: <20190207142012.4521-1-skalluru@marvell.com>

From: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Date: Thu, 7 Feb 2019 06:20:10 -0800

> SmartAN feature detects the peer/cable capabilities and establishes the
> link in the best possible configuration.
> The patch series adds support for querying the capability. Please consider
> applying it net-next.

Series applied, thanks.

^ permalink raw reply

* Re: [RFC, PATCH] net: page_pool: Don't use page->private to store dma_addr_t
From: David Miller @ 2019-02-07 21:25 UTC (permalink / raw)
  To: ilias.apalodimas; +Cc: willy, brouer, tariqt, toke, netdev, mgorman, linux-mm
In-Reply-To: <20190207152034.GA3295@apalos>

From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Date: Thu, 7 Feb 2019 17:20:34 +0200

> Well updating struct page is the final goal, hence the comment. I am mostly
> looking for opinions here since we are trying to store dma addresses which are
> irrelevant to pages. Having dma_addr_t definitions in mm-related headers is a
> bit controversial isn't it ? If we can add that, then yes the code would look
> better

I fundamentally disagree.

One of the core operations performed on a page is mapping it so that a device
and use it.

Why have ancillary data structure support for this all over the place, rather
than in the common spot which is the page.

A page really is not just a 'mm' structure, it is a system structure.

^ permalink raw reply

* Re: [PATCH net-next 0/2] qed*: SmartAN query support
From: David Miller @ 2019-02-07 21:32 UTC (permalink / raw)
  To: skalluru; +Cc: netdev, aelior, mkalderon
In-Reply-To: <20190207.132235.1892910927125881925.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Thu, 07 Feb 2019 13:22:35 -0800 (PST)

> From: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> Date: Thu, 7 Feb 2019 06:20:10 -0800
> 
>> SmartAN feature detects the peer/cable capabilities and establishes the
>> link in the best possible configuration.
>> The patch series adds support for querying the capability. Please consider
>> applying it net-next.
> 
> Series applied, thanks.

Nevermind, I reverted before pushing out as I see Jakub has some questions.

^ permalink raw reply

* Re: [RFC, PATCH] net: page_pool: Don't use page->private to store dma_addr_t
From: Matthew Wilcox @ 2019-02-07 21:34 UTC (permalink / raw)
  To: David Miller
  Cc: ilias.apalodimas, brouer, tariqt, toke, netdev, mgorman, linux-mm
In-Reply-To: <20190207.132519.1698007650891404763.davem@davemloft.net>

On Thu, Feb 07, 2019 at 01:25:19PM -0800, David Miller wrote:
> From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> Date: Thu, 7 Feb 2019 17:20:34 +0200
> 
> > Well updating struct page is the final goal, hence the comment. I am mostly
> > looking for opinions here since we are trying to store dma addresses which are
> > irrelevant to pages. Having dma_addr_t definitions in mm-related headers is a
> > bit controversial isn't it ? If we can add that, then yes the code would look
> > better
> 
> I fundamentally disagree.
> 
> One of the core operations performed on a page is mapping it so that a device
> and use it.
> 
> Why have ancillary data structure support for this all over the place, rather
> than in the common spot which is the page.
> 
> A page really is not just a 'mm' structure, it is a system structure.

+1

The fundamental point of computing is to do I/O.

^ permalink raw reply

* Re: [RFC, PATCH] net: page_pool: Don't use page->private to store dma_addr_t
From: Ilias Apalodimas @ 2019-02-07 21:37 UTC (permalink / raw)
  To: David Miller; +Cc: willy, brouer, tariqt, toke, netdev, mgorman, linux-mm
In-Reply-To: <20190207.132519.1698007650891404763.davem@davemloft.net>

Hi David,

On Thu, Feb 07, 2019 at 01:25I:19PM -0800, David Miller wrote:
> From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> Date: Thu, 7 Feb 2019 17:20:34 +0200
> 
> > Well updating struct page is the final goal, hence the comment. I am mostly
> > looking for opinions here since we are trying to store dma addresses which are
> > irrelevant to pages. Having dma_addr_t definitions in mm-related headers is a
> > bit controversial isn't it ? If we can add that, then yes the code would look
> > better
> 
> I fundamentally disagree.
> 
> One of the core operations performed on a page is mapping it so that a device
> and use it.
> 
> Why have ancillary data structure support for this all over the place, rather
> than in the common spot which is the page.

You are right on that. Moreover the intention of this change is to facilitate
the page recycling patches we proposed with Jesper. In that context we do need
the dma mapping information in a common spot since we'll need to access it from
drivers, networking code etc. The struct page *is* the best place for that.

> 
> A page really is not just a 'mm' structure, it is a system structure.

Well if you put it that way i completely agree (also it makes our life a *lot*
easier :))


Thanks
/Ilias

^ permalink raw reply

* Re: [ISSUE][4.20.6] mlx5 and checksum failures
From: Marcelo Ricardo Leitner @ 2019-02-07 21:40 UTC (permalink / raw)
  To: Ian Kumlien
  Cc: Cong Wang, David Miller, Saeed Mahameed,
	Linux Kernel Network Developers
In-Reply-To: <CAA85sZsbmJ=c69h6yzd0LAzd198=mppjqjPZu_EXhQP+soe4Yw@mail.gmail.com>

On Wed, Feb 06, 2019 at 11:41:28PM +0100, Ian Kumlien wrote:
> On Wed, Feb 6, 2019 at 11:38 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > On Wed, Feb 6, 2019 at 2:15 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > > Could we please schedule this for 4.19 and 4.20 - it's kinda breaking things
> >
> > It doesn't break anything, packets are _not_ dropped, only that the
> > warning itself is noisy.
> 
> Not my experience, to me it slows the machine down and looses packets,
> I don't however know
> if this is the only culprit
> 
> You can actually see it on ping where it start out with 0.0xyx and
> ends up at ~10ms

Serial console is a/the killer in these situations.

^ permalink raw reply

* Re: [RFC, PATCH] net: page_pool: Don't use page->private to store dma_addr_t
From: Ilias Apalodimas @ 2019-02-07 21:42 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: David Miller, brouer, tariqt, toke, netdev, mgorman, linux-mm
In-Reply-To: <20190207213400.GA21860@bombadil.infradead.org>

Hi Matthew,

On Thu, Feb 07, 2019 at 01:34:00PM -0800, Matthew Wilcox wrote:
> On Thu, Feb 07, 2019 at 01:25:19PM -0800, David Miller wrote:
> > From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> > Date: Thu, 7 Feb 2019 17:20:34 +0200
> > 
> > > Well updating struct page is the final goal, hence the comment. I am mostly
> > > looking for opinions here since we are trying to store dma addresses which are
> > > irrelevant to pages. Having dma_addr_t definitions in mm-related headers is a
> > > bit controversial isn't it ? If we can add that, then yes the code would look
> > > better
> > 
> > I fundamentally disagree.
> > 
> > One of the core operations performed on a page is mapping it so that a device
> > and use it.
> > 
> > Why have ancillary data structure support for this all over the place, rather
> > than in the common spot which is the page.
> > 
> > A page really is not just a 'mm' structure, it is a system structure.
> 
> +1
> 
> The fundamental point of computing is to do I/O.
Ok, great that should sort it out then.
I'll use your proposal and base the patch on that. 

Thanks for taking the time with this

/Ilias

^ permalink raw reply

* Re: [PATCH v2] rhashtable: make walk safe from softirq context
From: Herbert Xu @ 2019-02-07 21:48 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless, netdev, Jouni Malinen, Thomas Graf
In-Reply-To: <fee1bdca8bb21a472a75c1a43fa61aad7fe4bff5.camel@sipsolutions.net>

On Thu, Feb 07, 2019 at 02:50:54PM +0100, Johannes Berg wrote:
> 
> > This interface wasn't designed for use in softirq contexts.
> 
> Well, it clearly was used there. You even gave it a gfp_t argument in
> rhashtable_walk_init(), so you can't really claim it wasn't designed for
> this. I see now that it's ignored, but still?

I see.  This was added behind my back so I wasn't aware of it.

> > Could you please show me who is doing this so I can review that
> > to see whether it's a legitimate use of this API?
> 
> I'm sure you'll say it's not legitimate, but it still exists ;-)
> 
> mesh_plink_broken() gets called from the TX status path, via
> ieee80211s_update_metric().

Let me take a look.

Thanks!
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH net-next] net: phy: let genphy_c45_read_link manage the devices to check
From: Heiner Kallweit @ 2019-02-07 21:54 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, David Miller, Russell King,
	netdev@vger.kernel.org
In-Reply-To: <20190207205054.GA5223@lunn.ch>

On 07.02.2019 21:50, Andrew Lunn wrote:
>> Thanks, Andrew. Right, the Aquantia PHY doesn't seem to have the C22EXT
>> MMD. Because the Aquantia PHY has no device 29 in its package the code
>> should work.
> 
> It lists device 29 in its devices in package. So the current code does
> look there.
> 
I just looked for a description of a device 29. Strange that the device
list states it's there and then it's not there.

When checking that I was scratching my head because of the following code
in genphy_c45_read_link:

devad = __ffs(mmd_mask);
mmd_mask &= ~BIT(devad);

AFAIK __ffs() returns the posix bit number, means it returns 1 for bit 0.
Then this code piece seems to be wrong because I think the intention is
to clear the bit we just found. Instead we clear the next bit.

And device 0 isn't really a device but a flag "Clause 22 registers present".
So far we may have been lucky because to supported 10G PHY has this flag set.
But if this flag is set and we try to access a register 0.1 then we may be
in trouble again. Therefore I think we need to exclude also device 0.
Or do I miss something?

>> So it seems we have to exclude device C22EXT in general. I'll add that
>> and submit a v2.
> 
> Yes.
> 
> 	Andrew
> 
Heiner

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox