* [PATCH RFC 4/4] net: phy: fixed_phy: let genphy driver set supported and advertised modes
From: Heiner Kallweit @ 2019-08-13 21:26 UTC (permalink / raw)
To: Andrew Lunn, Florian Fainelli, Marek Behun, David Miller
Cc: netdev@vger.kernel.org
In-Reply-To: <ac3471d5-deb7-b711-6e74-23f59914758a@gmail.com>
A fixed phy as special swphy binds to the genphy driver that calls
genphy_read_abilities(). This function populates the supported and
advertised modes, so we don't have to do it manually.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
drivers/net/phy/fixed_phy.c | 23 -----------------------
1 file changed, 23 deletions(-)
diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index 7c5265fd2..db4d96f2f 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -282,29 +282,6 @@ static struct phy_device *__fixed_phy_register(unsigned int irq,
phy->mdio.dev.of_node = np;
phy->is_pseudo_fixed_link = true;
- switch (status->speed) {
- case SPEED_1000:
- linkmode_set_bit(ETHTOOL_LINK_MODE_1000baseT_Half_BIT,
- phy->supported);
- linkmode_set_bit(ETHTOOL_LINK_MODE_1000baseT_Full_BIT,
- phy->supported);
- /* fall through */
- case SPEED_100:
- linkmode_set_bit(ETHTOOL_LINK_MODE_100baseT_Half_BIT,
- phy->supported);
- linkmode_set_bit(ETHTOOL_LINK_MODE_100baseT_Full_BIT,
- phy->supported);
- /* fall through */
- case SPEED_10:
- default:
- linkmode_set_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT,
- phy->supported);
- linkmode_set_bit(ETHTOOL_LINK_MODE_10baseT_Full_BIT,
- phy->supported);
- }
-
- phy_advertise_supported(phy);
-
ret = phy_device_register(phy);
if (ret) {
phy_device_free(phy);
--
2.22.0
^ permalink raw reply related
* [PATCH RFC 3/4] net: phy: swphy: bind swphy to genphy driver at probe time
From: Heiner Kallweit @ 2019-08-13 21:26 UTC (permalink / raw)
To: Andrew Lunn, Florian Fainelli, Marek Behun, David Miller
Cc: netdev@vger.kernel.org
In-Reply-To: <ac3471d5-deb7-b711-6e74-23f59914758a@gmail.com>
Let a swphy bind to the genphy driver at probe time. This provides
automatic feature detection even if the swphy never gets attached to a
net_device. So far the genphy driver binds to a PHY as fallback only
once the PHY is attached to a net_device.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
drivers/net/phy/swphy.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/swphy.c b/drivers/net/phy/swphy.c
index 53c214a22..7ac5054fa 100644
--- a/drivers/net/phy/swphy.c
+++ b/drivers/net/phy/swphy.c
@@ -151,8 +151,9 @@ int swphy_read_reg(int reg, const struct fixed_phy_status *state)
case MII_BMSR:
return bmsr;
case MII_PHYSID1:
+ return GENPHY_ID_HIGH;
case MII_PHYSID2:
- return 0;
+ return GENPHY_ID_LOW;
case MII_LPA:
return lpa;
case MII_STAT1000:
--
2.22.0
^ permalink raw reply related
* [PATCH RFC 2/4] net: phy: allow to bind genphy driver at probe time
From: Heiner Kallweit @ 2019-08-13 21:25 UTC (permalink / raw)
To: Andrew Lunn, Florian Fainelli, Marek Behun, David Miller
Cc: netdev@vger.kernel.org
In-Reply-To: <ac3471d5-deb7-b711-6e74-23f59914758a@gmail.com>
In cases like a fixed phy that is never attached to a net_device we
may want to bind the genphy driver at probe time. Setting a PHY ID of
0xffffffff to bind the genphy driver would fail due to a check in
get_phy_device(). Therefore let's change the PHY ID the genphy driver
binds to to 0xfffffffe. This still shouldn't match any real PHY,
and it will pass the check in get_phy_devcie().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
drivers/net/phy/phy_device.c | 3 +--
include/linux/phy.h | 4 ++++
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 163295dbc..54f80af31 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -2388,8 +2388,7 @@ void phy_drivers_unregister(struct phy_driver *drv, int n)
EXPORT_SYMBOL(phy_drivers_unregister);
static struct phy_driver genphy_driver = {
- .phy_id = 0xffffffff,
- .phy_id_mask = 0xffffffff,
+ PHY_ID_MATCH_EXACT(GENPHY_ID),
.name = "Generic PHY",
.soft_reset = genphy_no_soft_reset,
.get_features = genphy_read_abilities,
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 5ac7d2137..3b07bce78 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -37,6 +37,10 @@
#define PHY_1000BT_FEATURES (SUPPORTED_1000baseT_Half | \
SUPPORTED_1000baseT_Full)
+#define GENPHY_ID_HIGH 0xffffU
+#define GENPHY_ID_LOW 0xfffeU
+#define GENPHY_ID ((GENPHY_ID_HIGH << 16) | GENPHY_ID_LOW)
+
extern __ETHTOOL_DECLARE_LINK_MODE_MASK(phy_basic_features) __ro_after_init;
extern __ETHTOOL_DECLARE_LINK_MODE_MASK(phy_basic_t1_features) __ro_after_init;
extern __ETHTOOL_DECLARE_LINK_MODE_MASK(phy_gbit_features) __ro_after_init;
--
2.22.0
^ permalink raw reply related
* [PATCH RFC 1/4] net: phy: swphy: emulate register MII_ESTATUS
From: Heiner Kallweit @ 2019-08-13 21:24 UTC (permalink / raw)
To: Andrew Lunn, Florian Fainelli, Marek Behun, David Miller
Cc: netdev@vger.kernel.org
In-Reply-To: <ac3471d5-deb7-b711-6e74-23f59914758a@gmail.com>
When the genphy driver binds to a swphy it will call
genphy_read_abilites that will try to read MII_ESTATUS if BMSR_ESTATEN
is set in MII_BMSR. So far this would read the default value 0xffff
and 1000FD and 1000HD are reported as supported just by chance.
Better add explicit support for emulating MII_ESTATUS.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
drivers/net/phy/swphy.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/phy/swphy.c b/drivers/net/phy/swphy.c
index dad22481d..53c214a22 100644
--- a/drivers/net/phy/swphy.c
+++ b/drivers/net/phy/swphy.c
@@ -22,6 +22,7 @@ struct swmii_regs {
u16 bmsr;
u16 lpa;
u16 lpagb;
+ u16 estat;
};
enum {
@@ -48,6 +49,7 @@ static const struct swmii_regs speed[] = {
[SWMII_SPEED_1000] = {
.bmsr = BMSR_ESTATEN,
.lpagb = LPA_1000FULL | LPA_1000HALF,
+ .estat = ESTATUS_1000_TFULL | ESTATUS_1000_THALF,
},
};
@@ -56,11 +58,13 @@ static const struct swmii_regs duplex[] = {
.bmsr = BMSR_ESTATEN | BMSR_100HALF,
.lpa = LPA_10HALF | LPA_100HALF,
.lpagb = LPA_1000HALF,
+ .estat = ESTATUS_1000_THALF,
},
[SWMII_DUPLEX_FULL] = {
.bmsr = BMSR_ESTATEN | BMSR_100FULL,
.lpa = LPA_10FULL | LPA_100FULL,
.lpagb = LPA_1000FULL,
+ .estat = ESTATUS_1000_TFULL,
},
};
@@ -112,6 +116,7 @@ int swphy_read_reg(int reg, const struct fixed_phy_status *state)
{
int speed_index, duplex_index;
u16 bmsr = BMSR_ANEGCAPABLE;
+ u16 estat = 0;
u16 lpagb = 0;
u16 lpa = 0;
@@ -125,6 +130,7 @@ int swphy_read_reg(int reg, const struct fixed_phy_status *state)
duplex_index = state->duplex ? SWMII_DUPLEX_FULL : SWMII_DUPLEX_HALF;
bmsr |= speed[speed_index].bmsr & duplex[duplex_index].bmsr;
+ estat |= speed[speed_index].estat & duplex[duplex_index].estat;
if (state->link) {
bmsr |= BMSR_LSTATUS | BMSR_ANEGCOMPLETE;
@@ -151,6 +157,8 @@ int swphy_read_reg(int reg, const struct fixed_phy_status *state)
return lpa;
case MII_STAT1000:
return lpagb;
+ case MII_ESTATUS:
+ return estat;
/*
* We do not support emulating Clause 45 over Clause 22 register
--
2.22.0
^ permalink raw reply related
* [PATCH RFC 0/4] net: phy: improve fixed_phy / swphy handling
From: Heiner Kallweit @ 2019-08-13 21:23 UTC (permalink / raw)
To: Andrew Lunn, Florian Fainelli, Marek Behun, David Miller
Cc: netdev@vger.kernel.org
Based on discussion [0] I prepared a patch set for improving few
aspects of swphy and fixed_phy handling. So far it's compile-tested
only. I'd appreciate testing on different devices.
[0] https://marc.info/?t=156553610700001&r=1&w=2
Heiner Kallweit (4):
net: phy: swphy: emulate register MII_ESTATUS
net: phy: allow to bind genphy driver at probe time
net: phy: swphy: bind swphy to genphy driver at probe time
net: phy: fixed_phy: let genphy driver set supported and advertised
modes
drivers/net/phy/fixed_phy.c | 23 -----------------------
drivers/net/phy/phy_device.c | 3 +--
drivers/net/phy/swphy.c | 11 ++++++++++-
include/linux/phy.h | 4 ++++
4 files changed, 15 insertions(+), 26 deletions(-)
--
2.22.0
^ permalink raw reply
* Re: [PATCH bpf-next 0/2] libbpf: make use of BTF through sysfs
From: Daniel Borkmann @ 2019-08-13 21:22 UTC (permalink / raw)
To: Andrii Nakryiko, bpf, netdev, ast, acme; +Cc: andrii.nakryiko, kernel-team
In-Reply-To: <20190813185443.437829-1-andriin@fb.com>
On 8/13/19 8:54 PM, Andrii Nakryiko wrote:
> Now that kernel's BTF is exposed through sysfs at well-known location, attempt
> to load it first as a target BTF for the purpose of BPF CO-RE relocations.
>
> Patch #1 is a follow-up patch to rename /sys/kernel/btf/kernel into
> /sys/kernel/btf/vmlinux.
> Patch #2 adds ability to load raw BTF contents from sysfs and expands the list
> of locations libbpf attempts to load vmlinux BTF from.
>
> Andrii Nakryiko (2):
> btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux
> libbpf: attempt to load kernel BTF from sysfs first
>
> Documentation/ABI/testing/sysfs-kernel-btf | 2 +-
> kernel/bpf/sysfs_btf.c | 30 +++++-----
> scripts/link-vmlinux.sh | 18 +++---
> tools/lib/bpf/libbpf.c | 64 +++++++++++++++++++---
> 4 files changed, 82 insertions(+), 32 deletions(-)
>
LGTM, applied thanks!
^ permalink raw reply
* Re: [PATCH net-next v2 07/12] net: stmmac: Add ethtool register dump for XGMAC cores
From: Jakub Kicinski @ 2019-08-13 21:19 UTC (permalink / raw)
To: Jose Abreu
Cc: netdev, Joao Pinto, Giuseppe Cavallaro, Alexandre Torgue,
David S. Miller, Maxime Coquelin, linux-stm32, linux-arm-kernel,
linux-kernel
In-Reply-To: <3d860a78ce4e98941f7e292d251d7360755fdf2e.1565602974.git.joabreu@synopsys.com>
On Mon, 12 Aug 2019 11:44:06 +0200, Jose Abreu wrote:
> static void stmmac_ethtool_gregs(struct net_device *dev,
> struct ethtool_regs *regs, void *space)
> {
> - u32 *reg_space = (u32 *) space;
> -
> struct stmmac_priv *priv = netdev_priv(dev);
> + int size = stmmac_ethtool_get_regs_len(dev);
> + u32 *reg_space = (u32 *) space;
>
> - memset(reg_space, 0x0, REG_SPACE_SIZE);
> + memset(reg_space, 0x0, size);
no need to zero regs, ethtool core zallocs them
^ permalink raw reply
* Re: Error when loading BPF_CGROUP_INET_EGRESS program with bpftool
From: Andrii Nakryiko @ 2019-08-13 21:18 UTC (permalink / raw)
To: Fejes Ferenc; +Cc: netdev@vger.kernel.org
In-Reply-To: <CAAej5NbwZ80MNQYxP4NiJXheAn1DcSgm+O3zQQgCoP03HGHEgQ@mail.gmail.com>
On Mon, Aug 12, 2019 at 1:48 PM Fejes Ferenc <fejes@inf.elte.hu> wrote:
>
> Thanks for the answer, I really appreciate it. I tried omitting
Please reply inline, no top posting on kernel mailing lists.
> "cgroup/skb" to let libbpf guess the attach type, but I got the same
> error. Really interesting, because I got the error
> > libbpf: failed to load program 'cgroup_skb/egress'
> wich is weird because it shows that libbpf guess the program type
> correctly. So something definitely on my side - thank you for verifyng
> that - I try to investigate it!
What was the error message you got after you provided correct program
attach type?
>
> Ferenc
> Andrii Nakryiko <andrii.nakryiko@gmail.com> ezt írta (időpont: 2019.
> aug. 12., H, 20:27):
> >
> > On Mon, Aug 12, 2019 at 1:59 AM Fejes Ferenc <fejes@inf.elte.hu> wrote:
> > >
> > > Greetings!
> > >
> > > I found a strange error when I tried to load a BPF_CGROUP_INET_EGRESS
> > > prog with bpftool. Loading the same program from C code with
> > > bpf_prog_load_xattr works without problem.
> > >
> > > The error message I got:
> > > bpftool prog loadall hbm_kern.o /sys/fs/bpf/hbm type cgroup/skb
> >
> > You need "cgroup_skb/egress" instead of "cgroup/skb" (or try just
> > dropping it, bpftool will try to guess the type from program's section
> > name, which would be correct in this case).
> >
> > > libbpf: load bpf program failed: Invalid argument
> > > libbpf: -- BEGIN DUMP LOG ---
> > > libbpf:
> > > ; return ALLOW_PKT | REDUCE_CW;
> > > 0: (b7) r0 = 3
> > > 1: (95) exit
> > > At program exit the register R0 has value (0x3; 0x0) should have been
> > > in (0x0; 0x1)
> > > processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0
> > > peak_states 0 mark_read 0
> > >
> > > libbpf: -- END LOG --
> > > libbpf: failed to load program 'cgroup_skb/egress'
> > > libbpf: failed to load object 'hbm_kern.o'
> > > Error: failed to load object file
> > >
> > >
> > > My environment: 5.3-rc3 / net-next master (both producing the error).
> > > Libbpf and bpftool installed from the kernel source (cleaned and
> > > reinstalled when I tried a new kernel). I compiled the program with
> > > Clang 8, on Ubuntu 19.10 server image, the source:
> > >
> > > #include <linux/bpf.h>
> > > #include "bpf_helpers.h"
> > >
> > > #define DROP_PKT 0
> > > #define ALLOW_PKT 1
> > > #define REDUCE_CW 2
> > >
> > > SEC("cgroup_skb/egress")
> > > int hbm(struct __sk_buff *skb)
> > > {
> > > return ALLOW_PKT | REDUCE_CW;
> > > }
> > > char _license[] SEC("license") = "GPL";
> > >
> > >
> > > I also tried to trace down the bug with gdb. It seems like the
> > > section_names array in libbpf.c filled with garbage, especially the
> >
> > I did the same, section_names appears to be correct, not sure what was
> > going on in your case. The problem is that "cgroup/skb", which you
> > provided on command line, overrides this section name and forces
> > bpftool to guess program type as BPF_CGROUP_INET_INGRESS, which
> > restricts return codes to just 0 or 1, while for
> > BPF_CGROUP_INET_EGRESS i is [0, 3].
> >
> > > expected_attach_type fields (in my case, this contains
> > > BPF_CGROUP_INET_INGRESS instead of BPF_CGROUP_INET_EGRESS).
> > >
> > > Thanks!
^ permalink raw reply
* Re: [PATCH net-next] net/ncsi: allow to customize BMC MAC Address offset
From: Terry Duncan @ 2019-08-13 21:15 UTC (permalink / raw)
To: Ben Wei, Tao Ren
Cc: Andrew Lunn, Jakub Kicinski, netdev@vger.kernel.org,
openbmc@lists.ozlabs.org, linux-kernel@vger.kernel.org,
Samuel Mendoza-Jonas, David S.Miller, William Kennington
In-Reply-To: <CH2PR15MB3686B3A20A231FC111C42F40A3D20@CH2PR15MB3686.namprd15.prod.outlook.com>
On 8/13/19 12:52 PM, Ben Wei wrote:
>> On 8/13/19 9:31 AM, Terry Duncan wrote:
>>> Tao, in your new patch will it be possible to disable the setting of the BMC MAC? I would like to be able to send NCSI_OEM_GET_MAC perhaps with netlink (TBD) to get the system address without it affecting the BMC address.
>>>
>>> I was about to send patches to add support for the Intel adapters when I saw this thread.
>>> Thanks,
>>> Terry
>>
> Hi Terry,
> Do you plan to monitor and configure NIC through use space programs via netlink? I'm curious if you have additional use cases.
> I had planned to add some daemon in user space to monitor NIC through NC-SI, primarily to log AENs, check link status and retrieve NIC counters.
> We can collaborate on these if they align with your needs as well.
>
> Also about Intel NIC, do you not plan to derive system address from BMC MAC? Is the BMC MAC independent of system address?
>
> Thanks,
> -Ben
>
Hi Ben,
Intel has in the past programmed BMC MACs with an offset from the system
MAC address of the first shared interface. We have found this approach
causes problems and adds complexity when system interfaces / OCP cards
are added or removed or disabled in BIOS. Our approach in OpenBMC is to
keep this simple - provide means for the BMC MAC addresses to be set in
manufacturing and persist them.
We do also have some of the same use cases you mention.
Thanks,
Terry
^ permalink raw reply
* Re: [PATCH bpf-next v2 2/4] bpf: support cloning sk storage on accept()
From: Daniel Borkmann @ 2019-08-13 21:12 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: Stanislav Fomichev, netdev, bpf, davem, ast, Martin KaFai Lau,
Yonghong Song
In-Reply-To: <20190812175249.GF2820@mini-arch>
On 8/12/19 7:52 PM, Stanislav Fomichev wrote:
> On 08/12, Daniel Borkmann wrote:
>> On 8/9/19 6:10 PM, Stanislav Fomichev wrote:
>>> Add new helper bpf_sk_storage_clone which optionally clones sk storage
>>> and call it from sk_clone_lock.
>>>
>>> Cc: Martin KaFai Lau <kafai@fb.com>
>>> Cc: Yonghong Song <yhs@fb.com>
>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>> [...]
>>> +int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
>>> +{
>>> + struct bpf_sk_storage *new_sk_storage = NULL;
>>> + struct bpf_sk_storage *sk_storage;
>>> + struct bpf_sk_storage_elem *selem;
>>> + int ret;
>>> +
>>> + RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
>>> +
>>> + rcu_read_lock();
>>> + sk_storage = rcu_dereference(sk->sk_bpf_storage);
>>> +
>>> + if (!sk_storage || hlist_empty(&sk_storage->list))
>>> + goto out;
>>> +
>>> + hlist_for_each_entry_rcu(selem, &sk_storage->list, snode) {
>>> + struct bpf_sk_storage_elem *copy_selem;
>>> + struct bpf_sk_storage_map *smap;
>>> + struct bpf_map *map;
>>> + int refold;
>>> +
>>> + smap = rcu_dereference(SDATA(selem)->smap);
>>> + if (!(smap->map.map_flags & BPF_F_CLONE))
>>> + continue;
>>> +
>>> + map = bpf_map_inc_not_zero(&smap->map, false);
>>> + if (IS_ERR(map))
>>> + continue;
>>> +
>>> + copy_selem = bpf_sk_storage_clone_elem(newsk, smap, selem);
>>> + if (!copy_selem) {
>>> + ret = -ENOMEM;
>>> + bpf_map_put(map);
>>> + goto err;
>>> + }
>>> +
>>> + if (new_sk_storage) {
>>> + selem_link_map(smap, copy_selem);
>>> + __selem_link_sk(new_sk_storage, copy_selem);
>>> + } else {
>>> + ret = sk_storage_alloc(newsk, smap, copy_selem);
>>> + if (ret) {
>>> + kfree(copy_selem);
>>> + atomic_sub(smap->elem_size,
>>> + &newsk->sk_omem_alloc);
>>> + bpf_map_put(map);
>>> + goto err;
>>> + }
>>> +
>>> + new_sk_storage = rcu_dereference(copy_selem->sk_storage);
>>> + }
>>> + bpf_map_put(map);
>>
>> The map get/put combination /under/ RCU read lock seems a bit odd to me, could
>> you exactly describe the race that this would be preventing?
> There is a race between sk storage release and sk storage clone.
> bpf_sk_storage_map_free uses synchronize_rcu to wait for all existing
> users to finish and the new ones are prevented via map's refcnt being
> zero; we need to do something like that for the clone.
> Martin suggested to use bpf_map_inc_not_zero/bpf_map_put.
> If I read everythin correctly, I think without map_inc/map_put we
> get the following race:
>
> CPU0 CPU1
>
> bpf_map_put
> bpf_sk_storage_map_free(smap)
> synchronize_rcu
>
> // no more users via bpf or
> // syscall, but clone
> // can still happen
>
> for each (bucket)
> selem_unlink
> selem_unlink_map(smap)
>
> // adding anything at
> // this point to the
> // bucket will leak
>
> rcu_read_lock
> tcp_v4_rcv
> tcp_v4_do_rcv
> // sk is lockless TCP_LISTEN
> tcp_v4_cookie_check
> tcp_v4_syn_recv_sock
> bpf_sk_storage_clone
> rcu_dereference(sk->sk_bpf_storage)
> selem_link_map(smap, copy)
> // adding new element to the
> // map -> leak
> rcu_read_unlock
>
> selem_unlink_sk
> sk->sk_bpf_storage = NULL
>
> synchronize_rcu
>
Makes sense, thanks for clarifying. Perhaps a small comment on top of
the bpf_map_inc_not_zero() would be great as well, so it's immediately
clear also from this location when reading the code why this is done.
Thanks,
Daniel
^ permalink raw reply
* Re: [PATCH net-next v2 02/12] net: stmmac: Prepare to add Split Header support
From: Jakub Kicinski @ 2019-08-13 21:11 UTC (permalink / raw)
To: Jose Abreu
Cc: netdev, Joao Pinto, Giuseppe Cavallaro, Alexandre Torgue,
David S. Miller, Maxime Coquelin, linux-stm32, linux-arm-kernel,
linux-kernel
In-Reply-To: <342007d6ac2b44db03d113e7e8bf0310caa77ea0.1565602974.git.joabreu@synopsys.com>
On Mon, 12 Aug 2019 11:44:01 +0200, Jose Abreu wrote:
> In order to add Split Header support, stmmac_rx() needs to take into
> account that packet may be split accross multiple descriptors.
>
> Refactor the logic of this function in order to support this scenario.
>
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
>
> ---
> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
> Cc: Alexandre Torgue <alexandre.torgue@st.com>
> Cc: Jose Abreu <joabreu@synopsys.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
> Cc: netdev@vger.kernel.org
> Cc: linux-stm32@st-md-mailman.stormreply.com
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 6 +
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 149 +++++++++++++---------
> 2 files changed, 95 insertions(+), 60 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> index 80276587048a..56158e1448ac 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> @@ -74,6 +74,12 @@ struct stmmac_rx_queue {
> u32 rx_zeroc_thresh;
> dma_addr_t dma_rx_phy;
> u32 rx_tail_addr;
> + unsigned int state_saved;
> + struct {
> + struct sk_buff *skb;
> + unsigned int len;
> + unsigned int error;
> + } state;
> };
>
> struct stmmac_channel {
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index b2e5f4ecd551..a093eb4ec275 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -3353,9 +3353,10 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
> {
> struct stmmac_rx_queue *rx_q = &priv->rx_queue[queue];
> struct stmmac_channel *ch = &priv->channel[queue];
> + unsigned int count = 0, error = 0, len = 0;
> + int status = 0, coe = priv->hw->rx_csum;
> unsigned int next_entry = rx_q->cur_rx;
> - int coe = priv->hw->rx_csum;
> - unsigned int count = 0;
> + struct sk_buff *skb = NULL;
>
> if (netif_msg_rx_status(priv)) {
> void *rx_head;
> @@ -3369,9 +3370,27 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
> stmmac_display_ring(priv, rx_head, DMA_RX_SIZE, true);
> }
> while (count < limit) {
> + enum pkt_hash_types hash_type;
> struct stmmac_rx_buffer *buf;
> + unsigned int prev_len = 0;
> struct dma_desc *np, *p;
> - int entry, status;
> + int entry;
> + u32 hash;
> +
> + if (!count && rx_q->state_saved) {
> + skb = rx_q->state.skb;
> + error = rx_q->state.error;
> + len = rx_q->state.len;
> + } else {
> + rx_q->state_saved = false;
> + skb = NULL;
> + error = 0;
> + len = 0;
> + }
> +
> +read_again:
> + if (count >= limit)
> + break;
Is this stopping the NAPI poll once @limit descriptors were seen?
It should probably be okay to ignore the limit until you get a full
frame? I'd think it'd be best to finish up the frame while the state
is hot in the CPU cache.. WDYT?
> entry = next_entry;
> buf = &rx_q->buf_pool[entry];
> @@ -3407,28 +3426,24 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
> page_pool_recycle_direct(rx_q->page_pool, buf->page);
> priv->dev->stats.rx_errors++;
> buf->page = NULL;
> + error = 1;
> + }
> +
> + if (unlikely(error & (status & rx_not_ls)))
Looks suspicious - sure this is supposed to be error & (status & bla)
and not error && ... ?
> + goto read_again;
> + if (unlikely(error)) {
> + if (skb)
> + dev_kfree_skb(skb);
> + continue;
> + }
> +
> + /* Buffer is good. Go on. */
> +
> + if (likely(status & rx_not_ls)) {
> + len += priv->dma_buf_sz;
> } else {
> - enum pkt_hash_types hash_type;
> - struct sk_buff *skb;
> - unsigned int des;
> - int frame_len;
> - u32 hash;
> -
> - stmmac_get_desc_addr(priv, p, &des);
> - frame_len = stmmac_get_rx_frame_len(priv, p, coe);
> -
> - /* If frame length is greater than skb buffer size
> - * (preallocated during init) then the packet is
> - * ignored
> - */
> - if (frame_len > priv->dma_buf_sz) {
> - if (net_ratelimit())
> - netdev_err(priv->dev,
> - "len %d larger than size (%d)\n",
> - frame_len, priv->dma_buf_sz);
> - priv->dev->stats.rx_length_errors++;
> - continue;
> - }
> + prev_len = len;
> + len = stmmac_get_rx_frame_len(priv, p, coe);
>
> /* ACS is set; GMAC core strips PAD/FCS for IEEE 802.3
> * Type frames (LLC/LLC-SNAP)
> @@ -3439,57 +3454,71 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
> */
> if (unlikely(priv->synopsys_id >= DWMAC_CORE_4_00) ||
> unlikely(status != llc_snap))
> - frame_len -= ETH_FCS_LEN;
> -
> - if (netif_msg_rx_status(priv)) {
> - netdev_dbg(priv->dev, "\tdesc: %p [entry %d] buff=0x%x\n",
> - p, entry, des);
> - netdev_dbg(priv->dev, "frame size %d, COE: %d\n",
> - frame_len, status);
> - }
> + len -= ETH_FCS_LEN;
> + }
>
> - skb = netdev_alloc_skb_ip_align(priv->dev, frame_len);
> - if (unlikely(!skb)) {
> + if (!skb) {
> + skb = netdev_alloc_skb_ip_align(priv->dev, len);
Since you're in NAPI call perhaps something like napi_alloc_skb() could
speed things up a little? But please also see below..
> + if (!skb) {
> priv->dev->stats.rx_dropped++;
> continue;
> }
>
> - dma_sync_single_for_cpu(priv->device, buf->addr,
> - frame_len, DMA_FROM_DEVICE);
> + dma_sync_single_for_cpu(priv->device, buf->addr, len,
> + DMA_FROM_DEVICE);
> skb_copy_to_linear_data(skb, page_address(buf->page),
> - frame_len);
> - skb_put(skb, frame_len);
> + len);
> + skb_put(skb, len);
>
> - if (netif_msg_pktdata(priv)) {
> - netdev_dbg(priv->dev, "frame received (%dbytes)",
> - frame_len);
> - print_pkt(skb->data, frame_len);
> - }
> + /* Data payload copied into SKB, page ready for recycle */
> + page_pool_recycle_direct(rx_q->page_pool, buf->page);
> + buf->page = NULL;
> + } else {
> + unsigned int buf_len = len - prev_len;
>
> - stmmac_get_rx_hwtstamp(priv, p, np, skb);
> + if (likely(status & rx_not_ls))
> + buf_len = priv->dma_buf_sz;
>
> - stmmac_rx_vlan(priv->dev, skb);
> + dma_sync_single_for_cpu(priv->device, buf->addr,
> + buf_len, DMA_FROM_DEVICE);
> + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
> + buf->page, 0, buf_len,
> + priv->dma_buf_sz);
>
> - skb->protocol = eth_type_trans(skb, priv->dev);
> + /* Data payload appended into SKB */
> + page_pool_release_page(rx_q->page_pool, buf->page);
> + buf->page = NULL;
> + }
>
> - if (unlikely(!coe))
> - skb_checksum_none_assert(skb);
> - else
> - skb->ip_summed = CHECKSUM_UNNECESSARY;
> + if (likely(status & rx_not_ls))
> + goto read_again;
>
> - if (!stmmac_get_rx_hash(priv, p, &hash, &hash_type))
> - skb_set_hash(skb, hash, hash_type);
> + /* Got entire packet into SKB. Finish it. */
>
> - skb_record_rx_queue(skb, queue);
> - napi_gro_receive(&ch->rx_napi, skb);
> + stmmac_get_rx_hwtstamp(priv, p, np, skb);
> + stmmac_rx_vlan(priv->dev, skb);
> + skb->protocol = eth_type_trans(skb, priv->dev);
>
> - /* Data payload copied into SKB, page ready for recycle */
> - page_pool_recycle_direct(rx_q->page_pool, buf->page);
> - buf->page = NULL;
> + if (unlikely(!coe))
> + skb_checksum_none_assert(skb);
> + else
> + skb->ip_summed = CHECKSUM_UNNECESSARY;
>
> - priv->dev->stats.rx_packets++;
> - priv->dev->stats.rx_bytes += frame_len;
> - }
> + if (!stmmac_get_rx_hash(priv, p, &hash, &hash_type))
> + skb_set_hash(skb, hash, hash_type);
> +
> + skb_record_rx_queue(skb, queue);
> + napi_gro_receive(&ch->rx_napi, skb);
Did you look into using napi_gro_frags() family of APIs?
I think Eric said those are more efficient from GRO perspective..
> + priv->dev->stats.rx_packets++;
> + priv->dev->stats.rx_bytes += len;
> + }
> +
> + if (status & rx_not_ls) {
> + rx_q->state_saved = true;
> + rx_q->state.skb = skb;
> + rx_q->state.error = error;
> + rx_q->state.len = len;
> }
>
> stmmac_rx_refill(priv, queue);
^ permalink raw reply
* Re: [PATCH net-next v2 01/12] net: stmmac: Get correct timestamp values from XGMAC
From: Jakub Kicinski @ 2019-08-13 20:57 UTC (permalink / raw)
To: Jose Abreu
Cc: netdev, Joao Pinto, Giuseppe Cavallaro, Alexandre Torgue,
David S. Miller, Maxime Coquelin, linux-stm32, linux-arm-kernel,
linux-kernel
In-Reply-To: <195f374a0b46e5e65a691742fc2dbeffacfaf148.1565602974.git.joabreu@synopsys.com>
On Mon, 12 Aug 2019 11:44:00 +0200, Jose Abreu wrote:
> TX Timestamp in XGMAC comes from MAC instead of descriptors. Implement
> this in a new callback.
>
> Also, RX Timestamp in XGMAC must be cheked against corruption and we need
> a barrier to make sure that descriptor fields are read correctly.
>
> Changes from v1:
> - Rework the get timestamp function (David)
>
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
The barrier sounds like it might be a bug fix, does it occur in he wild?
> @@ -113,13 +119,11 @@ static int dwxgmac2_get_rx_timestamp_status(void *desc, void *next_desc,
> unsigned int rdes3 = le32_to_cpu(p->des3);
> int ret = -EBUSY;
>
> - if (likely(rdes3 & XGMAC_RDES3_CDA)) {
> + if (likely(rdes3 & XGMAC_RDES3_CDA))
> ret = dwxgmac2_rx_check_timestamp(next_desc);
> - if (ret)
> - return ret;
> - }
> -
> - return ret;
> + if (!ret)
> + return 1;
> + return 0;
nit:
return !ret;
> }
>
> static void dwxgmac2_init_rx_desc(struct dma_desc *p, int disable_rx_ic,
^ permalink raw reply
* Re: [PATCH net-next] net/ncsi: allow to customize BMC MAC Address offset
From: Terry Duncan @ 2019-08-13 20:54 UTC (permalink / raw)
To: Tao Ren
Cc: Andrew Lunn, Jakub Kicinski, netdev@vger.kernel.org,
openbmc@lists.ozlabs.org, Ben Wei, linux-kernel@vger.kernel.org,
Samuel Mendoza-Jonas, David S.Miller, William Kennington
In-Reply-To: <faa1b3c9-9ba3-0fff-e1d4-f6dddb60c52c@fb.com>
On 8/13/19 11:28 AM, Tao Ren wrote:
> On 8/13/19 9:31 AM, Terry Duncan wrote:
>> Tao, in your new patch will it be possible to disable the setting of the BMC MAC? I would like to be able to send NCSI_OEM_GET_MAC perhaps with netlink (TBD) to get the system address without it affecting the BMC address.
>>
>> I was about to send patches to add support for the Intel adapters when I saw this thread.
>>
>> Thanks,
>>
>> Terry
>
> Hi Terry,
>
> Sounds like you are planning to configure BMC MAC address from user space via netlink? Ben Wei <benwei@fb.com> started a thread "Out-of-band NIC management" in openbmc community for NCSI management using netlink, and you may follow up with him for details.
>
> I haven't decided what to do in my v2 patch: maybe using device tree, maybe moving the logic to uboot, and I'm also evaluating the netlink option. But it shouldn't impact your patch, because you can disable NCSI_OEM_GET_MAC option from your config file.
Thanks Tao. I see now that disabling the NCSI_OEM_GET_MAC option will do
what I want.
Best,
Terry
^ permalink raw reply
* Re: [PATCH v6 4/4] net: phy: realtek: Add LED configuration support for RTL8211E
From: Matthias Kaehlcke @ 2019-08-13 20:46 UTC (permalink / raw)
To: Andrew Lunn
Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
Heiner Kallweit, netdev, devicetree, linux-kernel,
Douglas Anderson
In-Reply-To: <20190813201411.GL15047@lunn.ch>
On Tue, Aug 13, 2019 at 10:14:11PM +0200, Andrew Lunn wrote:
> > +static int rtl8211e_config_led(struct phy_device *phydev, int led,
> > + struct phy_led_config *cfg)
> > +{
> > + u16 lacr_bits = 0, lcr_bits = 0;
> > + int oldpage, ret;
> > +
>
> You should probably check that led is 0 or 1.
Actually this PHY has up to 3 LEDs, but yes, good point to validate
the parameter. I'll wait a day or two for if there is other feedback
and send a new version with the check.
> Otherwise this looks good.
Thanks for the reviews!
Matthias
^ permalink raw reply
* Re: [PATCH net-next] net: dsa: mv88e6xxx: check for mode change in port_setup_mac
From: Marek Behun @ 2019-08-13 20:45 UTC (permalink / raw)
To: Andrew Lunn; +Cc: netdev, Vivien Didelot, Heiner Kallweit
In-Reply-To: <20190813203234.GO15047@lunn.ch>
On Tue, 13 Aug 2019 22:32:34 +0200
Andrew Lunn <andrew@lunn.ch> wrote:
> On Tue, Aug 13, 2019 at 07:12:43PM +0200, Marek Behún wrote:
> > @@ -598,12 +599,49 @@ int mv88e6352_port_link_state(struct mv88e6xxx_chip *chip, int port,
> > struct phylink_link_state *state)
> > {
> > int err;
> > - u16 reg;
> > + u16 reg, mac;
> >
> > err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_STS, ®);
> > if (err)
> > return err;
>
> Hi Marek
>
> It seems a bit off putting this block of code here, after reading
> MV88E6XXX_PORT_STS but before using the value. You don't need STS to
> determine the interface mode.
>
> If you keep the code together, you can then reuse reg, rather than add
> mac.
>
> Apart from that, this looks O.K.
>
> Andrew
Hi Andrew,
you are right, I shall rewrite this.
Thank you.
Marek
^ permalink raw reply
* Re: [PATCH net] netdevsim: Restore per-network namespace accounting for fib entries
From: Jiri Pirko @ 2019-08-13 20:37 UTC (permalink / raw)
To: David Miller; +Cc: dsahern, netdev, dsahern
In-Reply-To: <20190813.104054.140346412563349218.davem@davemloft.net>
Tue, Aug 13, 2019 at 07:40:54PM CEST, davem@davemloft.net wrote:
>From: Jiri Pirko <jiri@resnulli.us>
>Date: Tue, 13 Aug 2019 09:14:45 +0200
>
>> Mon, Aug 12, 2019 at 05:28:02PM CEST, davem@davemloft.net wrote:
>>>From: Jiri Pirko <jiri@resnulli.us>
>>>Date: Mon, 12 Aug 2019 10:36:35 +0200
>>>
>>>> I understand it with real devices, but dummy testing device, who's
>>>> purpose is just to test API. Why?
>>>
>>>Because you'll break all of the wonderful testing infrastructure
>>>people like David have created.
>>
>> Are you referring to selftests? There is no such test there :(
>> But if it would be, could implement the limitations
>> properly (like using cgroups), change the tests and remove this
>> code from netdevsim?
>
>What about older kernels? Can't you see how illogical this is?
Not really, since this is a dummy testing device. Not like we would
break anybody. Well, I changed instantiation of netdevsim from rtnetlink
to sysfs, that is even bigger breakage, nobody cared (of course).
Well sure we can comeup with netdevsim2, netdevsimN, to maintain backward
compatibility of netdevsim. I find it ridiculous. But anyway, I don't
really care that much. I resign as this seems futile :(
^ permalink raw reply
* Re: [PATCH net-next] net: dsa: mv88e6xxx: check for mode change in port_setup_mac
From: Andrew Lunn @ 2019-08-13 20:32 UTC (permalink / raw)
To: Marek Behún; +Cc: netdev, Vivien Didelot, Heiner Kallweit
In-Reply-To: <20190813171243.27898-1-marek.behun@nic.cz>
On Tue, Aug 13, 2019 at 07:12:43PM +0200, Marek Behún wrote:
> @@ -598,12 +599,49 @@ int mv88e6352_port_link_state(struct mv88e6xxx_chip *chip, int port,
> struct phylink_link_state *state)
> {
> int err;
> - u16 reg;
> + u16 reg, mac;
>
> err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_STS, ®);
> if (err)
> return err;
Hi Marek
It seems a bit off putting this block of code here, after reading
MV88E6XXX_PORT_STS but before using the value. You don't need STS to
determine the interface mode.
If you keep the code together, you can then reuse reg, rather than add
mac.
Apart from that, this looks O.K.
Andrew
^ permalink raw reply
* Re: [PATCH net-next v2 2/3] net: phy: add phy_speed_down_core and phy_resolve_min_speed
From: Andrew Lunn @ 2019-08-13 20:16 UTC (permalink / raw)
To: Heiner Kallweit; +Cc: Florian Fainelli, David Miller, netdev@vger.kernel.org
In-Reply-To: <d0c44f84-d441-9d4e-96e1-d8abb7c0e508@gmail.com>
On Mon, Aug 12, 2019 at 11:51:27PM +0200, Heiner Kallweit wrote:
> phy_speed_down_core provides most of the functionality for
> phy_speed_down. It makes use of new helper phy_resolve_min_speed that is
> based on the sorting of the settings[] array. In certain cases it may be
> helpful to be able to exclude legacy half duplex modes, therefore
> prepare phy_resolve_min_speed() for it.
>
> v2:
> - rename __phy_speed_down to phy_speed_down_core
>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH net-next v2 1/3] net: phy: add __set_linkmode_max_speed
From: Andrew Lunn @ 2019-08-13 20:16 UTC (permalink / raw)
To: Heiner Kallweit; +Cc: Florian Fainelli, David Miller, netdev@vger.kernel.org
In-Reply-To: <4c77b801-6005-834c-da0e-f32847961f81@gmail.com>
On Mon, Aug 12, 2019 at 11:50:30PM +0200, Heiner Kallweit wrote:
> We will need the functionality of __set_linkmode_max_speed also for
> linkmode bitmaps other than phydev->supported. Therefore split it.
>
> v2:
> - remove unused parameter from __set_linkmode_max_speed
>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH v6 4/4] net: phy: realtek: Add LED configuration support for RTL8211E
From: Andrew Lunn @ 2019-08-13 20:14 UTC (permalink / raw)
To: Matthias Kaehlcke
Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
Heiner Kallweit, netdev, devicetree, linux-kernel,
Douglas Anderson
In-Reply-To: <20190813191147.19936-5-mka@chromium.org>
> +static int rtl8211e_config_led(struct phy_device *phydev, int led,
> + struct phy_led_config *cfg)
> +{
> + u16 lacr_bits = 0, lcr_bits = 0;
> + int oldpage, ret;
> +
You should probably check that led is 0 or 1.
Otherwise this looks good.
Andrew
^ permalink raw reply
* Re: [PATCH v6 3/4] net: phy: realtek: Add helpers for accessing RTL8211x extension pages
From: Andrew Lunn @ 2019-08-13 20:09 UTC (permalink / raw)
To: Matthias Kaehlcke
Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
Heiner Kallweit, netdev, devicetree, linux-kernel,
Douglas Anderson
In-Reply-To: <20190813191147.19936-4-mka@chromium.org>
On Tue, Aug 13, 2019 at 12:11:46PM -0700, Matthias Kaehlcke wrote:
> Some RTL8211x PHYs have extension pages, which can be accessed
> after selecting a page through a custom method. Add a function to
> modify bits in a register of an extension page and a helper for
> selecting an ext page. Use rtl8211x_modify_ext_paged() in
> rtl8211e_config_init() instead of doing things 'manually'.
>
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH v6 2/4] net: phy: Add support for generic LED configuration through the DT
From: Andrew Lunn @ 2019-08-13 19:56 UTC (permalink / raw)
To: Matthias Kaehlcke
Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
Heiner Kallweit, netdev, devicetree, linux-kernel,
Douglas Anderson
In-Reply-To: <20190813191147.19936-3-mka@chromium.org>
On Tue, Aug 13, 2019 at 12:11:45PM -0700, Matthias Kaehlcke wrote:
> For PHYs with a device tree node look for LED trigger configuration
> using the generic binding, if it exists try to apply it via the new
> driver hook .config_led.
>
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH net-next] net: phy: modify assignment to OR for dev_flags in phy_attach_direct
From: Florian Fainelli @ 2019-08-13 19:54 UTC (permalink / raw)
To: Tao Ren, Andrew Lunn, Heiner Kallweit, David S . Miller,
Vladimir Oltean, Arun Parameswaran, Justin Chen, netdev,
linux-kernel, openbmc
In-Reply-To: <20190805185551.3140564-1-taoren@fb.com>
On 8/5/19 11:55 AM, Tao Ren wrote:
> Modify the assignment to OR when dealing with phydev->dev_flags in
> phy_attach_direct function, and this is to make sure dev_flags set in
> driver's probe callback won't be lost.
As Andrew pointed out already, this probably needs to be reworked, but
for now this looks reasonable:
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
--
Florian
^ permalink raw reply
* Re: [PATCH v6 1/4] dt-bindings: net: phy: Add subnode for LED configuration
From: Andrew Lunn @ 2019-08-13 19:54 UTC (permalink / raw)
To: Matthias Kaehlcke
Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
Heiner Kallweit, netdev, devicetree, linux-kernel,
Douglas Anderson
In-Reply-To: <20190813191147.19936-2-mka@chromium.org>
On Tue, Aug 13, 2019 at 12:11:44PM -0700, Matthias Kaehlcke wrote:
> The LED behavior of some Ethernet PHYs is configurable. Add an
> optional 'leds' subnode with a child node for each LED to be
> configured. The binding aims to be compatible with the common
> LED binding (see devicetree/bindings/leds/common.txt).
>
> A LED can be configured to be:
>
> - 'on' when a link is active, some PHYs allow configuration for
> certain link speeds
> speeds
> - 'off'
> - blink on RX/TX activity, some PHYs allow configuration for
> certain link speeds
>
> For the configuration to be effective it needs to be supported by
> the hardware and the corresponding PHY driver.
>
> Suggested-by: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH net-next] mcast: ensure L-L IPv6 packets are accepted by bridge
From: Ido Schimmel @ 2019-08-13 19:53 UTC (permalink / raw)
To: Patrick Ruddy; +Cc: netdev, roopa, nikolay, linus.luessing
In-Reply-To: <20190813141804.20515-1-pruddy@vyatta.att-mail.com>
+ Bridge maintainers, Linus
On Tue, Aug 13, 2019 at 03:18:04PM +0100, Patrick Ruddy wrote:
> At present only all-nodes IPv6 multicast packets are accepted by
> a bridge interface that is not in multicast router mode. Since
> other protocols can be running in the absense of multicast
> forwarding e.g. OSPFv3 IPv6 ND. Change the test to allow
> all of the FFx2::/16 range to be accepted when not in multicast
> router mode. This aligns the code with IPv4 link-local reception
> and RFC4291
Can you please quote the relevant part from RFC 4291?
>
> Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> ---
> include/net/addrconf.h | 15 +++++++++++++++
> net/bridge/br_multicast.c | 2 +-
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/addrconf.h b/include/net/addrconf.h
> index becdad576859..05b42867e969 100644
> --- a/include/net/addrconf.h
> +++ b/include/net/addrconf.h
> @@ -434,6 +434,21 @@ static inline void addrconf_addr_solict_mult(const struct in6_addr *addr,
> htonl(0xFF000000) | addr->s6_addr32[3]);
> }
>
> +/*
> + * link local multicast address range ffx2::/16 rfc4291
> + */
> +static inline bool ipv6_addr_is_ll_mcast(const struct in6_addr *addr)
> +{
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
> + __be64 *p = (__be64 *)addr;
> + return ((p[0] & cpu_to_be64(0xff0f000000000000UL))
> + ^ cpu_to_be64(0xff02000000000000UL)) == 0UL;
> +#else
> + return ((addr->s6_addr32[0] & htonl(0xff0f0000)) ^
> + htonl(0xff020000)) == 0;
> +#endif
> +}
> +
> static inline bool ipv6_addr_is_ll_all_nodes(const struct in6_addr *addr)
> {
> #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
> diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
> index 9b379e110129..ed3957381fa2 100644
> --- a/net/bridge/br_multicast.c
> +++ b/net/bridge/br_multicast.c
> @@ -1664,7 +1664,7 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
> err = ipv6_mc_check_mld(skb);
>
> if (err == -ENOMSG) {
> - if (!ipv6_addr_is_ll_all_nodes(&ipv6_hdr(skb)->daddr))
> + if (!ipv6_addr_is_ll_mcast(&ipv6_hdr(skb)->daddr))
> BR_INPUT_SKB_CB(skb)->mrouters_only = 1;
IIUC, you want IPv6 link-local packets to be locally received, but this
also changes how these packets are flooded. RFC 4541 says that packets
addressed to the all hosts address are a special case and should be
forwarded to all ports:
"In IPv6, the data forwarding rules are more straight forward because MLD is
mandated for addresses with scope 2 (link-scope) or greater. The only exception
is the address FF02::1 which is the all hosts link-scope address for which MLD
messages are never sent. Packets with the all hosts link-scope address should
be forwarded on all ports."
Maybe you want something like:
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 09b1dd8cd853..9f312a73f61c 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -132,7 +132,8 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
br_multicast_querier_exists(br, eth_hdr(skb))) {
if ((mdst && mdst->host_joined) ||
- br_multicast_is_router(br)) {
+ br_multicast_is_router(br) ||
+ BR_INPUT_SKB_CB_LOCAL_RECEIVE(skb)) {
local_rcv = true;
br->dev->stats.multicast++;
}
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 9b379e110129..f03cecf6174e 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1667,6 +1667,9 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
if (!ipv6_addr_is_ll_all_nodes(&ipv6_hdr(skb)->daddr))
BR_INPUT_SKB_CB(skb)->mrouters_only = 1;
+ if (ipv6_addr_is_ll_mcast(&ipv6_hdr(skb)->daddr))
+ BR_INPUT_SKB_CB(skb)->local_receive = 1;
+
if (ipv6_addr_is_all_snoopers(&ipv6_hdr(skb)->daddr)) {
err = br_ip6_multicast_mrd_rcv(br, port, skb);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index b7a4942ff1b3..d76394ca4059 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -426,6 +426,7 @@ struct br_input_skb_cb {
#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
u8 igmp;
u8 mrouters_only:1;
+ u8 local_receive:1;
#endif
u8 proxyarp_replied:1;
u8 src_port_isolated:1;
@@ -445,8 +446,10 @@ struct br_input_skb_cb {
#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
# define BR_INPUT_SKB_CB_MROUTERS_ONLY(__skb) (BR_INPUT_SKB_CB(__skb)->mrouters_only)
+# define BR_INPUT_SKB_CB_LOCAL_RECEIVE(__skb) (BR_INPUT_SKB_CB(__skb)->local_receive)
#else
# define BR_INPUT_SKB_CB_MROUTERS_ONLY(__skb) (0)
+# define BR_INPUT_SKB_CB_LOCAL_RECEIVE(__skb) (0)
#endif
#define br_printk(level, br, format, args...) \
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox