* Re: [PATCH repost net-next] dsa: mv88e6xxx: Optimise atu_get
From: Andrew Lunn @ 2017-01-04 21:19 UTC (permalink / raw)
To: David Miller; +Cc: volodymyr.bendiuga, vivien.didelot, f.fainelli, netdev
In-Reply-To: <20170104.161103.175830630875484681.davem@davemloft.net>
On Wed, Jan 04, 2017 at 04:11:03PM -0500, David Miller wrote:
> From: Andrew Lunn <andrew@lunn.ch>
> Date: Wed, 4 Jan 2017 19:56:24 +0100
>
> > +static inline u64 ether_addr_to_u64(const u8 *addr)
> > +{
> > + u64 u = 0;
> > + int i;
> > +
> > + for (i = 0; i < ETH_ALEN; i++)
> > + u = u << 8 | addr[i];
> > +
> > + return u;
> > +}
> ...
> > +static inline void u64_to_ether_addr(u64 u, u8 *addr)
> > +{
> > + int i;
> > +
> > + for (i = ETH_ALEN - 1; i >= 0; i--) {
> > + addr[i] = u & 0xff;
> > + u = u >> 8;
> > + }
> > +}
>
> I think these two routines behave differently on big vs little
> endian. And I doubt this was your intention.
I don't have a big endian system to test on.
I tried to avoid the usual pitfalls. I don't cast a collection of
bytes to a u64, which i know has no chance of working. Accessing a MAC
address as a byte array should be endian safe. The shift operation
should also be endian safe.
What exactly do you think will behave differently?
Andrew
^ permalink raw reply
* Re: [PATCH repost net-next] dsa: mv88e6xxx: Optimise atu_get
From: Florian Fainelli @ 2017-01-04 21:27 UTC (permalink / raw)
To: Andrew Lunn, David Miller; +Cc: volodymyr.bendiuga, vivien.didelot, netdev
In-Reply-To: <20170104211957.GD4229@lunn.ch>
On 01/04/2017 01:19 PM, Andrew Lunn wrote:
> On Wed, Jan 04, 2017 at 04:11:03PM -0500, David Miller wrote:
>> From: Andrew Lunn <andrew@lunn.ch>
>> Date: Wed, 4 Jan 2017 19:56:24 +0100
>>
>>> +static inline u64 ether_addr_to_u64(const u8 *addr)
>>> +{
>>> + u64 u = 0;
>>> + int i;
>>> +
>>> + for (i = 0; i < ETH_ALEN; i++)
>>> + u = u << 8 | addr[i];
>>> +
>>> + return u;
>>> +}
>> ...
>>> +static inline void u64_to_ether_addr(u64 u, u8 *addr)
>>> +{
>>> + int i;
>>> +
>>> + for (i = ETH_ALEN - 1; i >= 0; i--) {
>>> + addr[i] = u & 0xff;
>>> + u = u >> 8;
>>> + }
>>> +}
>>
>> I think these two routines behave differently on big vs little
>> endian. And I doubt this was your intention.
>
> I don't have a big endian system to test on.
You could build the driver for e.g: a MIPS Malta board and use the
qemu-system-mips to validate this, there could be a way to do that on
ARM too although it's a different kind of BE (BE8 vs. BE32) AFAIR.
>
> I tried to avoid the usual pitfalls. I don't cast a collection of
> bytes to a u64, which i know has no chance of working. Accessing a MAC
> address as a byte array should be endian safe. The shift operation
> should also be endian safe.
>
> What exactly do you think will behave differently?
>
> Andrew
>
--
Florian
^ permalink raw reply
* Re: [PATCH repost net-next] dsa: mv88e6xxx: Optimise atu_get
From: David Miller @ 2017-01-04 21:27 UTC (permalink / raw)
To: andrew; +Cc: volodymyr.bendiuga, vivien.didelot, f.fainelli, netdev
In-Reply-To: <20170104211957.GD4229@lunn.ch>
From: Andrew Lunn <andrew@lunn.ch>
Date: Wed, 4 Jan 2017 22:19:57 +0100
> On Wed, Jan 04, 2017 at 04:11:03PM -0500, David Miller wrote:
>> From: Andrew Lunn <andrew@lunn.ch>
>> Date: Wed, 4 Jan 2017 19:56:24 +0100
>>
>> > +static inline u64 ether_addr_to_u64(const u8 *addr)
>> > +{
>> > + u64 u = 0;
>> > + int i;
>> > +
>> > + for (i = 0; i < ETH_ALEN; i++)
>> > + u = u << 8 | addr[i];
>> > +
>> > + return u;
>> > +}
>> ...
>> > +static inline void u64_to_ether_addr(u64 u, u8 *addr)
>> > +{
>> > + int i;
>> > +
>> > + for (i = ETH_ALEN - 1; i >= 0; i--) {
>> > + addr[i] = u & 0xff;
>> > + u = u >> 8;
>> > + }
>> > +}
>>
>> I think these two routines behave differently on big vs little
>> endian. And I doubt this was your intention.
>
> I don't have a big endian system to test on.
>
> I tried to avoid the usual pitfalls. I don't cast a collection of
> bytes to a u64, which i know has no chance of working. Accessing a MAC
> address as a byte array should be endian safe. The shift operation
> should also be endian safe.
>
> What exactly do you think will behave differently?
Maybe I over-reacted.
I just ran some test programs in userspace on both little and big
endian and they checked out.
Sorry for the false alarm.
I'll apply this, thanks.
^ permalink raw reply
* [PATCH] sh_eth: R8A7740 supports packet shecksumming
From: Sergei Shtylyov @ 2017-01-04 21:29 UTC (permalink / raw)
To: netdev, linux-renesas-soc
The R8A7740 GEther controller supports the packet checksum offloading
but the 'hw_crc' (bad name, I'll fix it) flag isn't set in the R8A7740
data, thus CSMR isn't cleared...
Fixes: 73a0d907301e ("net: sh_eth: add support R8A7740")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
---
This patch is against DaveM's 'net.git' repo plus the fixes sent recently...
drivers/net/ethernet/renesas/sh_eth.c | 1 +
1 file changed, 1 insertion(+)
Index: net/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net/drivers/net/ethernet/renesas/sh_eth.c
@@ -574,6 +574,7 @@ static struct sh_eth_cpu_data r8a7740_da
.rpadir_value = 2 << 16,
.no_trimd = 1,
.no_ade = 1,
+ .hw_crc = 1,
.tsu = 1,
.select_mii = 1,
.shift_rd0 = 1,
^ permalink raw reply
* Re: [PATCH net-next 0/6] Prepare BPF for VLAN_TAG_PRESENT cleanup
From: Michał Mirosław @ 2017-01-04 21:30 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: netdev, alexei.starovoitov
In-Reply-To: <586D570D.9060800@iogearbox.net>
On Wed, Jan 04, 2017 at 09:11:57PM +0100, Daniel Borkmann wrote:
> On 01/04/2017 02:18 AM, Michał Mirosław wrote:
> > Those patches prepare BPF ant its JITs for removal of VLAN_TAG_PRESENT.
> > The set depends on "Preparation for VLAN_TAG_PRESENT cleanup" patchset.
> >
> > The series is supposed to be bisect-friendly and that requires temporary
> > insertion of #define VLAN_TAG_PRESENT in BPF code to be able to split
> > JIT changes per architecture.
> >
> > Michał Mirosław (6):
> > net/skbuff: add macros for VLAN_PRESENT bit
> > net/bpf_jit: ARM: split VLAN_PRESENT bit handling from VLAN_TCI
> > net/bpf_jit: MIPS: split VLAN_PRESENT bit handling from VLAN_TCI
> > net/bpf_jit: PPC: split VLAN_PRESENT bit handling from VLAN_TCI
> > net/bpf_jit: SPARC: split VLAN_PRESENT bit handling from VLAN_TCI
> > net/bpf: split VLAN_PRESENT bit handling from VLAN_TCI
>
> Please add a proper changelog to all the individual patches, right now
> they have none. Also, how was this runtime tested? Did you run BPF selftest
> suite with them? Seems like they weren't even compile tested properly
> given the kbuild bot barking on sparc ...
Compile bot is barking because it doesn't have patches, which this set depends on.
Sorry about that.
Best Regards,
Michał Mirosław
^ permalink raw reply
* Re: [PATCH] phy state machine: failsafe leave invalid RUNNING state
From: Florian Fainelli @ 2017-01-04 21:44 UTC (permalink / raw)
To: Zefir Kurtisi, netdev; +Cc: andrew
In-Reply-To: <8521b51f-04f7-aeef-f862-bb150257cfa4@neratec.com>
On 01/04/2017 08:10 AM, Zefir Kurtisi wrote:
> On 01/04/2017 04:30 PM, Florian Fainelli wrote:
>>
>>
>> On 01/04/2017 07:27 AM, Zefir Kurtisi wrote:
>>> On 01/04/2017 04:13 PM, Florian Fainelli wrote:
>>>>
>>>>
>>>> On 01/04/2017 07:04 AM, Zefir Kurtisi wrote:
>>>>> While in RUNNING state, phy_state_machine() checks for link changes by
>>>>> comparing phydev->link before and after calling phy_read_status().
>>>>> This works as long as it is guaranteed that phydev->link is never
>>>>> changed outside the phy_state_machine().
>>>>>
>>>>> If in some setups this happens, it causes the state machine to miss
>>>>> a link loss and remain RUNNING despite phydev->link being 0.
>>>>>
>>>>> This has been observed running a dsa setup with a process continuously
>>>>> polling the link states over ethtool each second (SNMPD RFC-1213
>>>>> agent). Disconnecting the link on a phy followed by a ETHTOOL_GSET
>>>>> causes dsa_slave_get_settings() / dsa_slave_get_link_ksettings() to
>>>>> call phy_read_status() and with that modify the link status - and
>>>>> with that bricking the phy state machine.
>>>>
>>>> That's the interesting part of the analysis, how does this brick the PHY
>>>> state machine? Is the PHY driver changing the link status in the
>>>> read_status callback that it implements?
>>>>
>>> phydev->read_status points to genphy_read_status(), where the first call goes to
>>> genphy_update_link() which updates the link status.
>>>
>>> Thereafter phy_state_machine():RUNNING won't be able to detect the link loss
>>> anymore unless the link state changes again.
>>>
>>>
>>> I was trying to figure out if there is a rule that forbids changing phydev->link
>>> from outside the state machine, but found several places where it happens (either
>>> directly, or over genphy_read_status() or over genphy_update_link()).
>>>
>>> Curious how this did not show up before, since within the dsa setup it is very
>>> easy to trigger:
>>> a) physically disconnect link
>>> b) within one second run ethtool ethX
>>
>> You need to be more specific here about what "the dsa setup" is, drivers
>> involved, which ports of the switch you are seeing this with (user
>> facing, CPU port, DSA port?) etc.
>>
> I am working on top of LEDE and with that at kernel 4.4.21 - alas I checked the
> related source files and believe the effect should be reproducible with HEAD.
>
> The setup is as follows:
> mv88e6321:
> * ports 0+1 connected to fibre-optics transceivers at fixed 100 Mbps
> * port 4 is CPU port
> * custom phy driver (replacement for marvell.ko) only populated with
> * .config_init to
> * set fixed speed for ports 0+1 (when in FO mode)
> * run genphy_config_init() for all other modes (here: CPU port)
> * .config_aneg=genphy_config_aneg, .read_status=genphy_read_status
>
>
> To my understanding, the exact setup is irrelevant - to reproduce the issue it is
> enough to have a means of running genphy_update_link() (as done in e.g.
> mediatek/mtk_eth_soc.c, dsa/slave.c), or genphy_read_status() (as done in e.g.
> hisilicon/hns/hns_enet.c) or phy_read_status() (as done in e.g.
> ethernet/ti/netcp_ethss.c, ethernet/aeroflex/greth.c, etc.). In the observed
> drivers it is mostly implemented in the ETHTOOL_GSET execution path.
>
> Once you get the link state updated outside the phy state machine, it remains in
> invalid RUNNING. To prevent that invalid state, to my understanding upper layer
> drivers (Ethernet, dsa) must not modify link-states in any case (including calling
> the functions noted above), or we need the proposed fail-safe mechanism to prevent
> getting stuck.
OK, I see the code path involved now, sorry -ENOCOFFEE when I initially
responded. Yes, clearly, we should not be mangling the PHY device's link
by calling genphy_read_status(). At first glance, none of the users
below should be doing what they are doing, but let's kick a separate
patch series to collect feedback from the driver writes.
Thanks!
--
Florian
^ permalink raw reply
* Re: [PATCH repost net-next] dsa: mv88e6xxx: Optimise atu_get
From: Andrew Lunn @ 2017-01-04 21:45 UTC (permalink / raw)
To: David Miller; +Cc: volodymyr.bendiuga, vivien.didelot, f.fainelli, netdev
In-Reply-To: <20170104.162718.86114638722981428.davem@davemloft.net>
> Maybe I over-reacted.
I'm happy somebody other than me is thinking about this.
> I just ran some test programs in userspace on both little and big
> endian and they checked out.
Great, thanks for testing.
Andrew
^ permalink raw reply
* [GIT] Networking
From: David Miller @ 2017-01-04 21:45 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel
1) stmmac_drv_probe() can race with stmmac_open() because we register the
netdevice too early. Fix from Florian Fainelli.
2) UFO handling in __ip6_append_data() and ip6_finish_output() use different
tests for deciding whether a frame will be fragmented or not, put them
in sync. Fix from Zheng Li.
3) The rtnetlink getstats handlers need to validate that the netlink
request is large enough, fix from Mathias Krause.
4) Use after free in mlx4 driver, from Jack Morgenstein.
5) Fix setting of garbage UID value in sockets during setattr() calls,
from Eric Biggers.
6) Packet drop_monitor doesn't format the netlink messages properly such
that nlmsg_next fails to work, fix from Reiter Wolfgang.
7) Fix handling of wildcard addresses in l2tp lookups, from Guillaume
Nault.
8) __skb_flow_dissect() can crash on pptp packets, from Ian Kumlien.
9) IGMP code doesn't reset group query timers properly, from Michal
Tesar.
10) Fix overzealous MAIN/LOCAL route table combining in ipv4, from
Alexander Duyck.
11) vxlan offload check needs to be more strict in be2net driver,
from Sabrina Dubroca.
12) Moving l3mdev to packet hooks lost RX stat counters unintentionally,
fix from David Ahern.
Please pull, thanks a lot!
The following changes since commit 8f18e4d03ed8fa5e4a300c94550533bd8ce4ff9a:
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2016-12-27 16:04:37 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
for you to fetch changes up to 71eae1ca77fd6be218d8a952d97bba827e56516d:
sh_eth: enable RX descriptor word 0 shift on SH7734 (2017-01-04 16:12:14 -0500)
----------------------------------------------------------------
Alexander Alemayhu (1):
Documentation/networking: fix typo in mpls-sysctl
Alexander Duyck (1):
ipv4: Do not allow MAIN to be alias for new LOCAL w/ custom rules
Augusto Mecking Caringi (1):
net: atm: Fix warnings in net/atm/lec.c when !CONFIG_PROC_FS
Bartosz Folta (1):
net: macb: Updated resource allocation function calls to new version of API.
Colin Ian King (1):
net: wan: slic_ds26522: fix spelling mistake: "configurated" -> "configured"
Daniel Jurgens (1):
net/mlx5: Cancel recovery work in remove flow
David Ahern (2):
net: ipv4: dst for local input routes should use l3mdev if relevant
net: vrf: Add missing Rx counters
David S. Miller (7):
Merge branch 'mlx5-fixes'
Merge branch 'mlx4-misc-fixes'
Merge branch 'l2tp-socket-lookup-fixes'
Merge tag 'mac80211-for-davem-2017-01-02' of git://git.kernel.org/.../jberg/mac80211
Merge branch 'dwmac-oxnas-leaks'
Merge branch 'systemport-padding-and-TSB-insertion'
Merge branch 'dpaa_eth-fixes'
Edward Cree (1):
sfc: don't report RX hash keys to ethtool when RSS wasn't enabled
Eli Cohen (1):
net/mlx5: Avoid shadowing numa_node
Eric Biggers (1):
net: socket: don't set sk_uid to garbage value in ->setattr()
Eugenia Emantayev (1):
net/mlx4_en: Fix bad WQE issue
Florian Fainelli (4):
net: stmmac: Fix race between stmmac_drv_probe and stmmac_open
net: stmmac: Fix error path after register_netdev move
net: systemport: Utilize skb_put_padto()
net: systemport: Pad packet before inserting TSB
Gal Pressman (2):
Revert "net/mlx5e: Expose PCIe statistics to ethtool"
Revert "net/mlx5: Add MPCNT register infrastructure"
Guillaume Nault (2):
l2tp: consider '::' as wildcard address in l2tp_ip6 socket lookup
l2tp: take remote address into account in l2tp_ip and l2tp_ip6 socket lookups
Huy Nguyen (1):
net/mlx5e: Check ets capability before initializing ets settings
Ian Kumlien (1):
flow_dissector: Update pptp handling to avoid null pointer deref.
Jack Morgenstein (2):
net/mlx4_core: Use-after-free causes a resource leak in flow-steering detach
net/mlx4_core: Fix raw qp flow steering rules under SRIOV
Johan Hovold (3):
net: stmmac: dwmac-oxnas: fix of-node leak
net: stmmac: dwmac-oxnas: fix fixed-link-phydev leaks
net: stmmac: dwmac-oxnas: use generic pm implementation
Johannes Berg (1):
mac80211: initialize fast-xmit 'info' later
Leon Romanovsky (1):
net/mlx4: Remove BUG_ON from ICM allocation routine
Madalin Bucur (1):
dpaa_eth: cleanup after init_phy() failure
Maor Gottlieb (2):
net/mlx5: Mask destination mac value in ethtool steering rules
net/mlx5: Release FTE lock in error flow
Mathias Krause (1):
rtnl: stats - add missing netlink message size checks
Michal Tesar (1):
igmp: Make igmp group member RFC 3376 compliant
Mohamad Haj Yahia (1):
net/mlx5: Prevent setting multicast macs for VFs
Nicolas Pitre (1):
LiquidIO VF: s/select/imply/ for PTP_1588_CLOCK
Noa Osherovich (1):
net/mlx5: Check FW limitations on log_max_qp before setting it
Or Gerlitz (1):
net/mlx5: Disable RoCE on the e-switch management port under switchdev mode
Paul Blakey (1):
net/sched: cls_flower: Fix missing addr_type in classify
Peter Chen (1):
net: usb: asix_devices: add .reset_resume for USB PM
Reiter Wolfgang (2):
drop_monitor: add missing call to genlmsg_end
drop_monitor: consider inserted data in genlmsg_end
Roy Pledge (1):
dpaa_eth: Initialize CGR structure before init
Sabrina Dubroca (1):
benet: stricter vxlan offloading check in be_features_check
Saeed Mahameed (2):
net/mlx5e: Don't sync netdev state when not registered
net/mlx5e: Disable netdev after close
Sergei Shtylyov (2):
sh_eth: fix branch prediction in sh_eth_interrupt()
sh_eth: enable RX descriptor word 0 shift on SH7734
Slava Shwartsman (1):
net/mlx4_en: Fix type mismatch for 32-bit systems
Varun Prakash (1):
libcxgb: fix error check for ip6_route_output()
Wei Zhang (1):
net: fix incorrect original ingress device index in PKTINFO
Zheng Li (1):
ipv6: Should use consistent conditional judgement for ip6 fragment between __ip6_append_data and ip6_finish_output
Documentation/networking/mpls-sysctl.txt | 4 ++--
drivers/infiniband/hw/mlx4/main.c | 14 ++++++++++++--
drivers/net/ethernet/broadcom/bcmsysport.c | 23 +++++++++++------------
drivers/net/ethernet/cadence/macb_pci.c | 27 ++++++++++-----------------
drivers/net/ethernet/cavium/Kconfig | 2 +-
drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c | 12 +++++-------
drivers/net/ethernet/emulex/benet/be_main.c | 4 +++-
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 6 +++++-
drivers/net/ethernet/mellanox/mlx4/en_clock.c | 8 ++------
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 8 +++++++-
drivers/net/ethernet/mellanox/mlx4/icm.c | 7 ++++++-
drivers/net/ethernet/mellanox/mlx4/main.c | 18 ++++++++++++++++++
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 28 +++++-----------------------
drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 3 +++
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 17 -----------------
drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 51 ++++++++++++++++-----------------------------------
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 32 +-------------------------------
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 11 +++++++++++
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/main.c | 15 +++++++++++----
drivers/net/ethernet/renesas/sh_eth.c | 3 ++-
drivers/net/ethernet/sfc/ef10.c | 3 ++-
drivers/net/ethernet/sfc/ethtool.c | 2 ++
drivers/net/ethernet/sfc/net_driver.h | 2 ++
drivers/net/ethernet/sfc/siena.c | 1 +
drivers/net/ethernet/stmicro/stmmac/dwmac-oxnas.c | 89 +++++++++++++++++++++++++++++++++--------------------------------------------------------
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 23 +++++++++++++----------
drivers/net/usb/asix_devices.c | 1 +
drivers/net/vrf.c | 3 +++
drivers/net/wan/slic_ds26522.c | 2 +-
include/linux/mlx4/device.h | 2 ++
include/linux/mlx5/device.h | 5 -----
include/linux/mlx5/driver.h | 1 -
include/linux/mlx5/mlx5_ifc.h | 93 ---------------------------------------------------------------------------------------------
net/atm/lec.c | 2 ++
net/core/drop_monitor.c | 39 ++++++++++++++++++++++++++++++---------
net/core/flow_dissector.c | 5 +++--
net/core/rtnetlink.c | 6 ++++++
net/ipv4/fib_frontend.c | 2 +-
net/ipv4/igmp.c | 7 ++++++-
net/ipv4/ip_sockglue.c | 8 +++++++-
net/ipv4/route.c | 3 ++-
net/ipv6/ip6_output.c | 2 +-
net/l2tp/l2tp_ip.c | 19 ++++++-------------
net/l2tp/l2tp_ip6.c | 24 ++++++++----------------
net/mac80211/tx.c | 3 ++-
net/sched/cls_flower.c | 4 ++++
net/socket.c | 2 +-
50 files changed, 273 insertions(+), 377 deletions(-)
^ permalink raw reply
* [PATCH net-next] net: dsa: b53: Utilize common helpers for u64/MAC
From: Florian Fainelli @ 2017-01-04 21:53 UTC (permalink / raw)
To: netdev; +Cc: davem, andrew, vivien.didelot, volodymyr.bendiuga,
Florian Fainelli
Utilize the two functions recently introduced: u64_to_ether() and
ether_to_u64() instead of our own versions.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/dsa/b53/b53_common.c | 2 +-
drivers/net/dsa/b53/b53_priv.h | 23 ++---------------------
2 files changed, 3 insertions(+), 22 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 947adda3397d..d5370c227043 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1137,7 +1137,7 @@ static int b53_arl_op(struct b53_device *dev, int op, int port,
int ret;
/* Convert the array into a 64-bit MAC */
- mac = b53_mac_to_u64(addr);
+ mac = ether_addr_to_u64(addr);
/* Perform a read for the given MAC and VID */
b53_write48(dev, B53_ARLIO_PAGE, B53_MAC_ADDR_IDX, mac);
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index f192a673caba..d9833540fe26 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -325,25 +325,6 @@ struct b53_arl_entry {
u8 is_static:1;
};
-static inline void b53_mac_from_u64(u64 src, u8 *dst)
-{
- unsigned int i;
-
- for (i = 0; i < ETH_ALEN; i++)
- dst[ETH_ALEN - 1 - i] = (src >> (8 * i)) & 0xff;
-}
-
-static inline u64 b53_mac_to_u64(const u8 *src)
-{
- unsigned int i;
- u64 dst = 0;
-
- for (i = 0; i < ETH_ALEN; i++)
- dst |= (u64)src[ETH_ALEN - 1 - i] << (8 * i);
-
- return dst;
-}
-
static inline void b53_arl_to_entry(struct b53_arl_entry *ent,
u64 mac_vid, u32 fwd_entry)
{
@@ -352,14 +333,14 @@ static inline void b53_arl_to_entry(struct b53_arl_entry *ent,
ent->is_valid = !!(fwd_entry & ARLTBL_VALID);
ent->is_age = !!(fwd_entry & ARLTBL_AGE);
ent->is_static = !!(fwd_entry & ARLTBL_STATIC);
- b53_mac_from_u64(mac_vid, ent->mac);
+ u64_to_ether_addr(mac_vid, ent->mac);
ent->vid = mac_vid >> ARLTBL_VID_S;
}
static inline void b53_arl_from_entry(u64 *mac_vid, u32 *fwd_entry,
const struct b53_arl_entry *ent)
{
- *mac_vid = b53_mac_to_u64(ent->mac);
+ *mac_vid = ether_addr_to_u64(ent->mac);
*mac_vid |= (u64)(ent->vid & ARLTBL_VID_MASK) << ARLTBL_VID_S;
*fwd_entry = ent->port & ARLTBL_DATA_PORT_ID_MASK;
if (ent->is_valid)
--
2.9.3
^ permalink raw reply related
* Re: [PATCH net-next] net: dsa: b53: Utilize common helpers for u64/MAC
From: Andrew Lunn @ 2017-01-04 22:07 UTC (permalink / raw)
To: Florian Fainelli; +Cc: netdev, davem, vivien.didelot, volodymyr.bendiuga
In-Reply-To: <20170104215320.19293-1-f.fainelli@gmail.com>
On Wed, Jan 04, 2017 at 01:53:20PM -0800, Florian Fainelli wrote:
> Utilize the two functions recently introduced: u64_to_ether() and
> ether_to_u64() instead of our own versions.
:-)
And i expect these have been tested on big endian systems. I never
thought to look if one of the other drivers already had them.
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Daniel Borkmann @ 2017-01-04 22:16 UTC (permalink / raw)
To: Sowmini Varadhan, linux-kselftest, netdev; +Cc: willemb, davem, shuah
In-Reply-To: <3aa068fa482f7cf5381957e9a3ea58550822d1d1.1483555162.git.sowmini.varadhan@oracle.com>
On 01/04/2017 07:45 PM, Sowmini Varadhan wrote:
> The bpf_prog used in sock_setfilter() only attempts to check for
> ip pktlen, and verifies that the contents of the 80'th packet in
> the ethernet frame is 'a' or 'b'. Thus many non-udp packets
> could incorrectly pass through this filter and cause incorrect
> test results.
>
> This commit hardens the conditions checked by the filter so
> that only UDP/IPv4 packets with the matching length and test-character
> will be permitted by the filter. The filter has been cleaned up
> to explicitly use the BPF macros to make it more readable.
>
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
> Acked-by: Willem de Bruijn <willemb@google.com>
> ---
> v2: commit comment edited based on Willem de Bruijn review
> v3: Shuah Khan nit.
>
> tools/testing/selftests/net/psock_lib.h | 29 ++++++++++++++++++++++-------
> 1 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/net/psock_lib.h b/tools/testing/selftests/net/psock_lib.h
> index 24bc7ec..9e5553a 100644
> --- a/tools/testing/selftests/net/psock_lib.h
> +++ b/tools/testing/selftests/net/psock_lib.h
> @@ -27,6 +27,7 @@
> #include <string.h>
> #include <arpa/inet.h>
> #include <unistd.h>
> +#include <netinet/udp.h>
>
> #define DATA_LEN 100
> #define DATA_CHAR 'a'
> @@ -40,14 +41,28 @@
>
> static __maybe_unused void sock_setfilter(int fd, int lvl, int optnum)
> {
> + uint16_t ip_len = DATA_LEN +
> + sizeof(struct iphdr) +
> + sizeof(struct udphdr);
> + /* the filter below checks for all of the following conditions that
> + * are based on the contents of create_payload()
> + * ether type 0x800 and
> + * ip proto udp and
> + * ip len == ip_len and
> + * udp[38] == 'a' or udp[38] == 'b'
> + */
> struct sock_filter bpf_filter[] = {
> - { 0x80, 0, 0, 0x00000000 }, /* LD pktlen */
> - { 0x35, 0, 4, DATA_LEN }, /* JGE DATA_LEN [f goto nomatch]*/
> - { 0x30, 0, 0, 0x00000050 }, /* LD ip[80] */
> - { 0x15, 1, 0, DATA_CHAR }, /* JEQ DATA_CHAR [t goto match]*/
> - { 0x15, 0, 1, DATA_CHAR_1}, /* JEQ DATA_CHAR_1 [t goto match]*/
> - { 0x06, 0, 0, 0x00000060 }, /* RET match */
> - { 0x06, 0, 0, 0x00000000 }, /* RET no match */
> + BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12), /* LD ethertype */
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETH_P_IP, 0, 8),
> + BPF_STMT(BPF_LD|BPF_B|BPF_ABS, 23), /* LD ip_proto */
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, IPPROTO_UDP, 0, 6),
> + BPF_STMT(BPF_LD|BPF_H|BPF_ABS, 16), /* LD ip_len */
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ip_len, 0, 4),
> + BPF_STMT(BPF_LD|BPF_B|BPF_ABS, 80), /* LD udp[38] */
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, DATA_CHAR, 1, 0),
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, DATA_CHAR_1, 0, 1),
> + BPF_STMT(BPF_RET | BPF_K, ~0), /* match */
> + BPF_STMT(BPF_RET | BPF_K, 0) /* no match */
Just reading up on the thread, sorry to jump in late. Can't you just
use the generated code from bpf_asm (tools/net/) and add the asm program
as a comment above? Something like we do in net/core/ptp_classifier.c +13.
As it stands it makes it a bit harder to parse / less readable with macros
actually. Rest seems fine, thanks.
> };
> struct sock_fprog bpf_prog;
>
>
^ permalink raw reply
* Re: [PATCH net-next v2] tcp: provide timestamps for partial writes
From: Willem de Bruijn @ 2017-01-04 22:22 UTC (permalink / raw)
To: Soheil Hassas Yeganeh
Cc: David Miller, Network Development, Soheil Hassas Yeganeh,
Willem de Bruijn, Eric Dumazet, Neal Cardwell, Martin KaFai Lau
In-Reply-To: <20170104161934.26849-1-soheil.kdev@gmail.com>
On Wed, Jan 4, 2017 at 11:19 AM, Soheil Hassas Yeganeh
<soheil.kdev@gmail.com> wrote:
> From: Soheil Hassas Yeganeh <soheil@google.com>
>
> For TCP sockets, TX timestamps are only captured when the user data
> is successfully and fully written to the socket. In many cases,
> however, TCP writes can be partial for which no timestamp is
> collected.
>
> Collect timestamps whenever any user data is (fully or partially)
> copied into the socket. Pass tcp_write_queue_tail to tcp_tx_timestamp
> instead of the local skb pointer since it can be set to NULL on
> the error path.
>
> Note that tcp_write_queue_tail can be NULL, even if bytes have been
> copied to the socket. This is because acknowledgements are being
> processed in tcp_sendmsg(), and by the time tcp_tx_timestamp is
> called tcp_write_queue_tail can be NULL. For such cases, this patch
> does not collect any timestamps (i.e., it is best-effort).
>
> This patch is written with suggestions from Willem de Bruijn and
> Eric Dumazet.
>
> Change-log V1 -> V2:
> - Use sockc.tsflags instead of sk->sk_tsflags.
> - Use the same code path for normal writes and errors.
>
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Martin KaFai Lau <kafai@fb.com>
Acked-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Sowmini Varadhan @ 2017-01-04 22:22 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: linux-kselftest, netdev, willemb, davem, shuah
In-Reply-To: <586D7437.1050708@iogearbox.net>
On (01/04/17 23:16), Daniel Borkmann wrote:
>
> Just reading up on the thread, sorry to jump in late. Can't you just
> use the generated code from bpf_asm (tools/net/) and add the asm program
> as a comment above? Something like we do in net/core/ptp_classifier.c +13.
I was actually using the example from the BSD bpf(4) man page,
and expanding on that one..
https://www.freebsd.org/cgi/man.cgi?query=bpf&sektion=4&manpath=FreeBSD+4.7-RELEASE
(I could not find the equivalent linux man page).
It was a lot easier to parse than the existing code .
> As it stands it makes it a bit harder to parse / less readable with macros
> actually. Rest seems fine, thanks.
You think the earlier code was readable? I had to use
gcc -E, with help from the bpf(4) page, to make sense of it.
--Sowmini
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Daniel Borkmann @ 2017-01-04 22:26 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: linux-kselftest, netdev, willemb, davem, shuah
In-Reply-To: <20170104222224.GA31756@oracle.com>
On 01/04/2017 11:22 PM, Sowmini Varadhan wrote:
> On (01/04/17 23:16), Daniel Borkmann wrote:
>>
>> Just reading up on the thread, sorry to jump in late. Can't you just
>> use the generated code from bpf_asm (tools/net/) and add the asm program
>> as a comment above? Something like we do in net/core/ptp_classifier.c +13.
>
> I was actually using the example from the BSD bpf(4) man page,
> and expanding on that one..
> https://www.freebsd.org/cgi/man.cgi?query=bpf&sektion=4&manpath=FreeBSD+4.7-RELEASE
> (I could not find the equivalent linux man page).
>
> It was a lot easier to parse than the existing code .
cBPF with its tooling is all documented here:
Documentation/networking/filter.txt
>> As it stands it makes it a bit harder to parse / less readable with macros
>> actually. Rest seems fine, thanks.
>
> You think the earlier code was readable? I had to use
> gcc -E, with help from the bpf(4) page, to make sense of it.
>
> --Sowmini
>
>
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Shuah Khan @ 2017-01-04 22:37 UTC (permalink / raw)
To: Sowmini Varadhan, linux-kselftest, netdev
Cc: daniel, willemb, davem, Shuah Khan
In-Reply-To: <3aa068fa482f7cf5381957e9a3ea58550822d1d1.1483555162.git.sowmini.varadhan@oracle.com>
On 01/04/2017 11:45 AM, Sowmini Varadhan wrote:
> The bpf_prog used in sock_setfilter() only attempts to check for
> ip pktlen, and verifies that the contents of the 80'th packet in
> the ethernet frame is 'a' or 'b'. Thus many non-udp packets
> could incorrectly pass through this filter and cause incorrect
> test results.
>
> This commit hardens the conditions checked by the filter so
> that only UDP/IPv4 packets with the matching length and test-character
> will be permitted by the filter. The filter has been cleaned up
> to explicitly use the BPF macros to make it more readable.
>
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
> Acked-by: Willem de Bruijn <willemb@google.com>
> ---
> v2: commit comment edited based on Willem de Bruijn review
> v3: Shuah Khan nit.
>
> tools/testing/selftests/net/psock_lib.h | 29 ++++++++++++++++++++++-------
> 1 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/net/psock_lib.h b/tools/testing/selftests/net/psock_lib.h
> index 24bc7ec..9e5553a 100644
> --- a/tools/testing/selftests/net/psock_lib.h
> +++ b/tools/testing/selftests/net/psock_lib.h
> @@ -27,6 +27,7 @@
> #include <string.h>
> #include <arpa/inet.h>
> #include <unistd.h>
> +#include <netinet/udp.h>
>
> #define DATA_LEN 100
> #define DATA_CHAR 'a'
> @@ -40,14 +41,28 @@
>
> static __maybe_unused void sock_setfilter(int fd, int lvl, int optnum)
> {
> + uint16_t ip_len = DATA_LEN +
> + sizeof(struct iphdr) +
> + sizeof(struct udphdr);
> + /* the filter below checks for all of the following conditions that
> + * are based on the contents of create_payload()
> + * ether type 0x800 and
> + * ip proto udp and
> + * ip len == ip_len and
> + * udp[38] == 'a' or udp[38] == 'b'
> + */
Looks like you have to do v4 anyway, please make sure your comment
block is one of the acceptable formats based on coding style:
https://marc.info/?l=linux-crypto-vger&m=146799837129319&w=2
> struct sock_filter bpf_filter[] = {
> - { 0x80, 0, 0, 0x00000000 }, /* LD pktlen */
> - { 0x35, 0, 4, DATA_LEN }, /* JGE DATA_LEN [f goto nomatch]*/
> - { 0x30, 0, 0, 0x00000050 }, /* LD ip[80] */
> - { 0x15, 1, 0, DATA_CHAR }, /* JEQ DATA_CHAR [t goto match]*/
> - { 0x15, 0, 1, DATA_CHAR_1}, /* JEQ DATA_CHAR_1 [t goto match]*/
> - { 0x06, 0, 0, 0x00000060 }, /* RET match */
> - { 0x06, 0, 0, 0x00000000 }, /* RET no match */
> + BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12), /* LD ethertype */
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETH_P_IP, 0, 8),
> + BPF_STMT(BPF_LD|BPF_B|BPF_ABS, 23), /* LD ip_proto */
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, IPPROTO_UDP, 0, 6),
> + BPF_STMT(BPF_LD|BPF_H|BPF_ABS, 16), /* LD ip_len */
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ip_len, 0, 4),
> + BPF_STMT(BPF_LD|BPF_B|BPF_ABS, 80), /* LD udp[38] */
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, DATA_CHAR, 1, 0),
> + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, DATA_CHAR_1, 0, 1),
> + BPF_STMT(BPF_RET | BPF_K, ~0), /* match */
> + BPF_STMT(BPF_RET | BPF_K, 0) /* no match */
> };
> struct sock_fprog bpf_prog;
>
>
^ permalink raw reply
* Re: [net-next PATCH 5/6] i40e: Add TX and RX support in switchdev mode.
From: Samudrala, Sridhar @ 2017-01-04 22:46 UTC (permalink / raw)
To: Or Gerlitz
Cc: Alexander Duyck, John Fastabend, Anjali Singhai Jain,
jakub.kicinski, intel-wired-lan, Linux Netdev List
In-Reply-To: <CAJ3xEMjGuKKge+_hqZPOg2fp3f50TdxGtbKFXwnCZ70rPxmuyw@mail.gmail.com>
On 1/3/2017 3:03 PM, Or Gerlitz wrote:
> On Fri, Dec 30, 2016 at 7:04 PM, Samudrala, Sridhar
> <sridhar.samudrala@intel.com> wrote:
>> On 12/30/2016 7:31 AM, Or Gerlitz wrote:
>>> Are you exposing switchdev ops for the representators? didn't see that
>>> or maybe it's in the 4th patch which didn't make it to the list?
>> Not at this time. In the future patches when we offload fdb/vlan
>> functionality, we could use switchdev ops.
> but wait, this is the switchdev mode... even before doing any
> offloading, you want (need) your representor netdevices to have the
> same HW ID marking they are all ports of the same ASIC, this you can
> do with the switchdev parent ID attribute.
OK. I will add switchdev_port_attr_get() with PORT_PARENT_ID support in v3.
Thanks
Sridhar
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Sowmini Varadhan @ 2017-01-04 22:48 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: linux-kselftest, netdev, willemb, davem, shuah
In-Reply-To: <586D7692.4000604@iogearbox.net>
On (01/04/17 23:26), Daniel Borkmann wrote:
>
> >>As it stands it makes it a bit harder to parse / less readable with macros
> >>actually. Rest seems fine, thanks.
Usually macros are there (a) as an abstraction so you
dont have to hard-code things, and, (b) to make things
more readable. (maybe that's why the 1992 VJ paper on
BPF came up with these macros?)
I think we differ on code-aesthetics (not correctness) here.
It was not immediately obvious to me that "0x15 is actually
BPF_JMP + BPF_JEQ + BPF_K" etc, when I wanted to extend
the bpf_prog to harden the checks in the existing code.
Would it be ok to leave the extremely subjective
"make this more readable" part for you to tackle later?
Or I can just drop patch1, and you can fix it to your
satisfaction later.
--Sowmini
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Sowmini Varadhan @ 2017-01-04 22:49 UTC (permalink / raw)
To: Shuah Khan; +Cc: linux-kselftest, netdev, daniel, willemb, davem
In-Reply-To: <d3e75147-475a-823c-3530-5150dd24a63d@kernel.org>
On (01/04/17 15:37), Shuah Khan wrote:
> Looks like you have to do v4 anyway, please make sure your comment
> block is one of the acceptable formats based on coding style:
I'm not sure about that. I can just keep patch 2.
thanks,
--Sowmini
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Sowmini Varadhan @ 2017-01-04 22:55 UTC (permalink / raw)
To: Shuah Khan; +Cc: linux-kselftest, netdev, daniel, willemb, davem
In-Reply-To: <d3e75147-475a-823c-3530-5150dd24a63d@kernel.org>
On (01/04/17 15:37), Shuah Khan wrote:
> > + /* the filter below checks for all of the following conditions that
> > + * are based on the contents of create_payload()
> > + * ether type 0x800 and
> > + * ip proto udp and
> > + * ip len == ip_len and
> > + * udp[38] == 'a' or udp[38] == 'b'
> > + */
>
> Looks like you have to do v4 anyway, please make sure your comment
> block is one of the acceptable formats based on coding style:
>
> https://marc.info/?l=linux-crypto-vger&m=146799837129319&w=2
BTW, the above is conformant with the comment style required for
networking:
https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
which seems to be used in psock_fanout.c and reuseport_bpf.c as well.
Thanks
--Sowmini
^ permalink raw reply
* [RFC PATCH next] ipv6: do not send RTM_DELADDR for tentative addresses
From: Mahesh Bandewar @ 2017-01-04 23:01 UTC (permalink / raw)
To: David Miller
Cc: netdev, Mahesh Bandewar, Mahesh Bandewar, Hideaki YOSHIFUJI,
Patrick McHardy, Hannes Frederic Sowa
From: Mahesh Bandewar <maheshb@google.com>
RTM_NEWADDR notification is sent when IFA_F_TENTATIVE is cleared from
the address. So if the address is added and deleted before DAD probes
completes, the RTM_DELADDR will be sent for which there was no
RTM_NEWADDR causing asymmetry in notification. However if the same
logic is used while sending RTM_DELADDR notification, this asymmetry
can be avoided.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Patrick McHardy <kaber@trash.net>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
net/ipv6/addrconf.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index c1e124bc8e1e..ac9bd5620f81 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4888,6 +4888,13 @@ static void inet6_ifa_notify(int event, struct inet6_ifaddr *ifa)
struct net *net = dev_net(ifa->idev->dev);
int err = -ENOBUFS;
+ /* Don't send DELADDR notification for TENTATIVE address,
+ * since NEWADDR notification is sent only after removing
+ * TENTATIVE flag.
+ */
+ if (ifa->flags & IFA_F_TENTATIVE && event == RTM_DELADDR)
+ return;
+
skb = nlmsg_new(inet6_ifaddr_msgsize(), GFP_ATOMIC);
if (!skb)
goto errout;
--
2.11.0.390.gc69c2f50cf-goog
^ permalink raw reply related
* Re: [PATCH 2/2] PCI: lock each enable/disable num_vfs operation in sysfs
From: Gavin Shan @ 2017-01-04 23:11 UTC (permalink / raw)
To: Tantilov, Emil S
Cc: Gavin Shan, linux-pci@vger.kernel.org,
intel-wired-lan@lists.osuosl.org, Duyck, Alexander H,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <87618083B2453E4A8714035B62D67992507430EB@FMSMSX105.amr.corp.intel.com>
On Wed, Jan 04, 2017 at 04:00:20PM +0000, Tantilov, Emil S wrote:
>>On Tue, Jan 03, 2017 at 04:48:31PM -0800, Emil Tantilov wrote:
>>>Enabling/disabling SRIOV via sysfs by echo-ing multiple values
>>>simultaneously:
>>>
>>>echo 63 > /sys/class/net/ethX/device/sriov_numvfs&
>>>echo 63 > /sys/class/net/ethX/device/sriov_numvfs
>>>
>>>sleep 5
>>>
>>>echo 0 > /sys/class/net/ethX/device/sriov_numvfs&
>>>echo 0 > /sys/class/net/ethX/device/sriov_numvfs
>>>
>>>Results in the following bug:
>>>
>>>kernel BUG at drivers/pci/iov.c:495!
>>>invalid opcode: 0000 [#1] SMP
>>>CPU: 1 PID: 8050 Comm: bash Tainted: G W 4.9.0-rc7-net-next #2092
>>>RIP: 0010:[<ffffffff813b1647>]
>>> [<ffffffff813b1647>] pci_iov_release+0x57/0x60
>>>
>>>Call Trace:
>>> [<ffffffff81391726>] pci_release_dev+0x26/0x70
>>> [<ffffffff8155be6e>] device_release+0x3e/0xb0
>>> [<ffffffff81365ee7>] kobject_cleanup+0x67/0x180
>>> [<ffffffff81365d9d>] kobject_put+0x2d/0x60
>>> [<ffffffff8155bc27>] put_device+0x17/0x20
>>> [<ffffffff8139c08a>] pci_dev_put+0x1a/0x20
>>> [<ffffffff8139cb6b>] pci_get_dev_by_id+0x5b/0x90
>>> [<ffffffff8139cca5>] pci_get_subsys+0x35/0x40
>>> [<ffffffff8139ccc8>] pci_get_device+0x18/0x20
>>> [<ffffffff8139ccfb>] pci_get_domain_bus_and_slot+0x2b/0x60
>>> [<ffffffff813b09e7>] pci_iov_remove_virtfn+0x57/0x180
>>> [<ffffffff813b0b95>] pci_disable_sriov+0x65/0x140
>>> [<ffffffffa00a1af7>] ixgbe_disable_sriov+0xc7/0x1d0 [ixgbe]
>>> [<ffffffffa00a1e9d>] ixgbe_pci_sriov_configure+0x3d/0x170 [ixgbe]
>>> [<ffffffff8139d28c>] sriov_numvfs_store+0xdc/0x130
>>>...
>>>RIP [<ffffffff813b1647>] pci_iov_release+0x57/0x60
>>>
>>>Use the existing mutex lock to protect each enable/disable operation.
>>>
>>>CC: Alexander Duyck <alexander.h.duyck@intel.com>
>>>Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
>>
>>Emil, It's going to change semantics of pci_enable_sriov() and pci_disable_sriov().
>>They can be invoked when writing to the sysfs entry, or loading PF's
>>driver. With the change applied, the lock (pf->sriov->lock) isn't acquired and released
>>in the PF's driver loading path.
>
>The enablement of SRIOV on driver load is done via deprecated module parameter.
>Perhaps we can just remove it, although there are probably still people that use it
>and may not be happy if we get rid of it.
>
Yeah, some drivers are still using the interface. So we cannot affect it until
it can be droped.
>>I think the reasonable way would be adding a flag in "struct sriov", to
>>indicate someone is accessing the IOV capability through sysfs file. With this, the
>>code returns with "-EBUSY" immediately for contenders. With it, nothing is going
>>to be changed in PF's driver loading path.
>
>Flag is what I initially had in mind, but did not want to add extra locking if we
>can make use of the existing.
>
The problem is sriov->lock wasn't introduced to protect the whole IOV capability.
Instead, it protects the allocation of virtual bus (if needed). In your patch,
it will be used to protect the whole IOV capability, ensure accessing the IOV
capability exclusively. So the usage of this lock is changed.
code extracted from pci.h:
struct pci_sriov {
:
struct mutex lock; /* lock for VF bus */
:
}
The lock was introduced by commit d1b054da8 ("PCI: initialize and release SR-IOV
capability"). If I'm correct enough, I don't think this lock is needed when
pci_enable_sriov() or pci_disable_sriov() are called in driver because of module
parameters. I don't see the usage case calling pci_disable_sriov() while previous
pci_enable_sriov() isn't finished yet. Also, it's not needed in EEH subsystem.
So I think the lock can be dropped, then it can be used to protect sysfs path.
>>Also, there are some minor comments as below and I guess most of them won't
>>be applied if you take my suggestion eventually. However, I'm trying to make
>>the comments complete.
>
>Thanks a lot for reviewing!
>
>>
>>>---
>>> drivers/pci/pci-sysfs.c | 24 +++++++++++++++++-------
>>> 1 file changed, 17 insertions(+), 7 deletions(-)
>>>
>>>diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>>>index 0666287..5b54cf5 100644
>>>--- a/drivers/pci/pci-sysfs.c
>>>+++ b/drivers/pci/pci-sysfs.c
>>>@@ -472,7 +472,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
>>> const char *buf, size_t count)
>>> {
>>> struct pci_dev *pdev = to_pci_dev(dev);
>>>+ struct pci_sriov *iov = pdev->sriov;
>>> int ret;
>>>+
>>
>>Unnecessary change.
>>
>>> u16 num_vfs;
>>>
>>> ret = kstrtou16(buf, 0, &num_vfs);
>>>@@ -482,38 +484,46 @@ static ssize_t sriov_numvfs_store(struct device
>>*dev,
>>> if (num_vfs > pci_sriov_get_totalvfs(pdev))
>>> return -ERANGE;
>>>
>>>+ mutex_lock(&iov->dev->sriov->lock);
>>>+
>>> if (num_vfs == pdev->sriov->num_VFs)
>>>- return count; /* no change */
>>>+ goto exit;
>>>
>>> /* is PF driver loaded w/callback */
>>> if (!pdev->driver || !pdev->driver->sriov_configure) {
>>> dev_info(&pdev->dev, "Driver doesn't support SRIOV
>>configuration via sysfs\n");
>>>- return -ENOSYS;
>>>+ ret = -EINVAL;
>>>+ goto exit;
>>
>>Why we need change the error code here?
>
>checkpatch was complaining about the use of the ENOSYS error code being specific
>and even though it was not my patch introducing it I had to change it to shut it up.
>
Right, it's reserved for attempt to call nonexisting syscall, but I think ENOENT
might be more indicative than EINVAL in this specific case?
>>> }
>>>
>>> if (num_vfs == 0) {
>>> /* disable VFs */
>>> ret = pdev->driver->sriov_configure(pdev, 0);
>>>- if (ret < 0)
>>>- return ret;
>>>- return count;
>>>+ goto exit;
>>> }
>>>
>>> /* enable VFs */
>>> if (pdev->sriov->num_VFs) {
>>> dev_warn(&pdev->dev, "%d VFs already enabled. Disable before enabling %d VFs\n",
>>> pdev->sriov->num_VFs, num_vfs);
>>>- return -EBUSY;
>>>+ ret = -EBUSY;
>>>+ goto exit;
>>> }
>>>
>>> ret = pdev->driver->sriov_configure(pdev, num_vfs);
>>> if (ret < 0)
>>>- return ret;
>>>+ goto exit;
>>>
>>> if (ret != num_vfs)
>>> dev_warn(&pdev->dev, "%d VFs requested; only %d enabled\n",
>>> num_vfs, ret);
>>>
>>>+exit:
>>>+ mutex_unlock(&iov->dev->sriov->lock);
>>>+
>>>+ if (ret < 0)
>>>+ return ret;
>>>+
>>> return count;
>>
>>The code might be clearer if @ret is returned here. In that case, We need
>>set it properly in error paths.
>
>I played with different ways to handle this and this seemed the least intrusive.
>
Ok, both should be fine.
Thanks,
Gavin
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Daniel Borkmann @ 2017-01-04 22:59 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: linux-kselftest, netdev, willemb, davem, shuah
In-Reply-To: <20170104224848.GB31756@oracle.com>
On 01/04/2017 11:48 PM, Sowmini Varadhan wrote:
> On (01/04/17 23:26), Daniel Borkmann wrote:
[...]
>>>> As it stands it makes it a bit harder to parse / less readable with macros
>>>> actually. Rest seems fine, thanks.
>
> Usually macros are there (a) as an abstraction so you
> dont have to hard-code things, and, (b) to make things
> more readable. (maybe that's why the 1992 VJ paper on
> BPF came up with these macros?)
>
> I think we differ on code-aesthetics (not correctness) here.
> It was not immediately obvious to me that "0x15 is actually
> BPF_JMP + BPF_JEQ + BPF_K" etc, when I wanted to extend
> the bpf_prog to harden the checks in the existing code.
>
> Would it be ok to leave the extremely subjective
> "make this more readable" part for you to tackle later?
> Or I can just drop patch1, and you can fix it to your
> satisfaction later.
I think we're talking past each other (?), my suggestion
from my original email was to use bpf_asm and paste the
(human readable) program as a comment above as done also
elsewhere. But just leave it as it is then, no big deal
either.
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Shuah Khan @ 2017-01-04 23:26 UTC (permalink / raw)
To: Sowmini Varadhan
Cc: linux-kselftest, netdev, daniel, willemb, davem, Shuah Khan
In-Reply-To: <20170104225554.GD31756@oracle.com>
On 01/04/2017 03:55 PM, Sowmini Varadhan wrote:
> On (01/04/17 15:37), Shuah Khan wrote:
>>> + /* the filter below checks for all of the following conditions that
>>> + * are based on the contents of create_payload()
>>> + * ether type 0x800 and
>>> + * ip proto udp and
>>> + * ip len == ip_len and
>>> + * udp[38] == 'a' or udp[38] == 'b'
>>> + */
>>
>> Looks like you have to do v4 anyway, please make sure your comment
>> block is one of the acceptable formats based on coding style:
>>
>> https://marc.info/?l=linux-crypto-vger&m=146799837129319&w=2
>
> BTW, the above is conformant with the comment style required for
> networking:
>
> https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
>
> which seems to be used in psock_fanout.c and reuseport_bpf.c as well.
I would like to see the comment blocks in selftest consistent with the
Kernel coding style.
>
> Thanks
> --Sowmini
>
Could you please split this patch into two. Hardening part in one and
the cleanup in a separate patch. This way I can get the hardening fix
into 4.10 in my next Kselftest update. Cleanup patch can go in later.
thanks,
-- Shuah
^ permalink raw reply
* RE: [PATCH] scm: remove use CMSG{_COMPAT}_ALIGN(sizeof(struct {compat_}cmsghdr))
From: YUAN Linyu @ 2017-01-04 23:45 UTC (permalink / raw)
To: David Miller, cugyly@163.com; +Cc: netdev@vger.kernel.org
In-Reply-To: <20170104.132452.646592152519338774.davem@davemloft.net>
Thanks
> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Thursday, January 05, 2017 2:25 AM
> To: cugyly@163.com
> Cc: netdev@vger.kernel.org; YUAN Linyu
> Subject: Re: [PATCH] scm: remove use CMSG{_COMPAT}_ALIGN(sizeof(struct
> {compat_}cmsghdr))
>
> From: yuan linyu <cugyly@163.com>
> Date: Tue, 3 Jan 2017 20:42:17 +0800
>
> > From: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
> >
> > sizeof(struct cmsghdr) and sizeof(struct compat_cmsghdr) already aligned.
> > remove use CMSG_ALIGN(sizeof(struct cmsghdr)) and
> > CMSG_COMPAT_ALIGN(sizeof(struct compat_cmsghdr)) keep code
> consistent.
> >
> > Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
>
> Applied, and I added the following commit which will make sure our
> analysis is accurate.
>
> ====================
> [PATCH] net: Assert at build time the assumptions we make about the CMSG
> header.
>
> It must always be the case that CMSG_ALIGN(sizeof(hdr)) == sizeof(hdr).
>
> Otherwise there are missing adjustments in the various calculations
> that parse and build these things.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> net/compat.c | 3 +++
> net/socket.c | 2 ++
> 2 files changed, 5 insertions(+)
>
> diff --git a/net/compat.c b/net/compat.c
> index 4e27dd1..ba3ac72 100644
> --- a/net/compat.c
> +++ b/net/compat.c
> @@ -130,6 +130,9 @@ int cmsghdr_from_user_compat_to_kern(struct
> msghdr *kmsg, struct sock *sk,
> __kernel_size_t kcmlen, tmp;
> int err = -EFAULT;
>
> + BUILD_BUG_ON(sizeof(struct compat_cmsghdr) !=
> + CMSG_COMPAT_ALIGN(sizeof(struct compat_cmsghdr)));
> +
> kcmlen = 0;
> kcmsg_base = kcmsg = (struct cmsghdr *)stackbuf;
> ucmsg = CMSG_COMPAT_FIRSTHDR(kmsg);
> diff --git a/net/socket.c b/net/socket.c
> index 8487bf1..5f3b5a2 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -1948,6 +1948,8 @@ static int ___sys_sendmsg(struct socket *sock,
> struct user_msghdr __user *msg,
> ctl_buf = msg_sys->msg_control;
> ctl_len = msg_sys->msg_controllen;
> } else if (ctl_len) {
> + BUILD_BUG_ON(sizeof(struct cmsghdr) !=
> + CMSG_ALIGN(sizeof(struct cmsghdr)));
> if (ctl_len > sizeof(ctl)) {
> ctl_buf = sock_kmalloc(sock->sk, ctl_len, GFP_KERNEL);
> if (ctl_buf == NULL)
> --
> 2.4.11
^ permalink raw reply
* [PATCH net-next] liquidio: fix wrong information about channels reported to ethtool
From: Felix Manlunas @ 2017-01-05 0:18 UTC (permalink / raw)
To: davem; +Cc: netdev, raghu.vatsavayi, derek.chickles, satananda.burla
From: Weilin Chang <weilin.chang@cavium.com>
Information reported to ethtool about channels is sometimes wrong for PF,
and always wrong for VF. Fix them by getting the information from the
right fields from the right structs.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
---
drivers/net/ethernet/cavium/liquidio/lio_ethtool.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_ethtool.c b/drivers/net/ethernet/cavium/liquidio/lio_ethtool.c
index b00c300..50384ce 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_ethtool.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_ethtool.c
@@ -296,12 +296,16 @@ lio_ethtool_get_channels(struct net_device *dev,
rx_count = CFG_GET_NUM_RXQS_NIC_IF(conf6x, lio->ifidx);
tx_count = CFG_GET_NUM_TXQS_NIC_IF(conf6x, lio->ifidx);
} else if (OCTEON_CN23XX_PF(oct)) {
- struct octeon_config *conf23 = CHIP_CONF(oct, cn23xx_pf);
- max_rx = CFG_GET_OQ_MAX_Q(conf23);
- max_tx = CFG_GET_IQ_MAX_Q(conf23);
- rx_count = CFG_GET_NUM_RXQS_NIC_IF(conf23, lio->ifidx);
- tx_count = CFG_GET_NUM_TXQS_NIC_IF(conf23, lio->ifidx);
+ max_rx = oct->sriov_info.num_pf_rings;
+ max_tx = oct->sriov_info.num_pf_rings;
+ rx_count = lio->linfo.num_rxpciq;
+ tx_count = lio->linfo.num_txpciq;
+ } else if (OCTEON_CN23XX_VF(oct)) {
+ max_tx = oct->sriov_info.rings_per_vf;
+ max_rx = oct->sriov_info.rings_per_vf;
+ rx_count = lio->linfo.num_rxpciq;
+ tx_count = lio->linfo.num_txpciq;
}
channel->max_rx = max_rx;
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox