Netdev List
* [PATCH 1/2] net: phylink: update mac_config() documentation
From: Russell King @ 2019-02-05 15:58 UTC (permalink / raw)
  To: linux-doc, netdev

The documentation for mac_config() missed a detail: the method is
expected to update the MAC to the requested settings, rather than
completely reprogram the MAC on each call. Document this behaviour.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 include/linux/phylink.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 021fc6595856..606a121629a9 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -149,6 +149,13 @@ int mac_link_state(struct net_device *ndev,
  *   configuration word. Nothing is advertised by the MAC. The MAC is
  *   responsible for reading the configuration word and configuring
  *   itself accordingly.
+ *
+ * Implementations are expected to update the MAC to reflect the
+ * requested settings - in other words, if nothing has changed between
+ * two calls, no action is expected.  If only flow control settings
+ * have changed, flow control should be updated *without* taking the
+ * link down.  This "update" behaviour is critical to avoid bouncing
+ * the link up status.
  */
 void mac_config(struct net_device *ndev, unsigned int mode,
 		const struct phylink_link_state *state);
-- 
2.7.4
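
A note on the "update" behaviour described in this patch: one way to
picture it is a MAC driver that remembers the last settings it applied
and only touches the parts that differ. What follows is a minimal
userspace sketch, not kernel code. struct link_settings, struct fake_mac
and fake_mac_config() are hypothetical stand-ins for a driver's state and
its mac_config() method, and the counters only record what a real driver
would do to the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the settings a mac_config() call carries. */
struct link_settings {
	int speed;      /* Mbps */
	int duplex;     /* 0 = half, 1 = full */
	bool pause_tx;  /* flow control, transmit direction */
	bool pause_rx;  /* flow control, receive direction */
};

/* Hypothetical MAC model: cached settings plus counters standing in
 * for register writes and link bounces.
 */
struct fake_mac {
	struct link_settings cur;
	bool configured;
	int link_bounces;   /* times the link was taken down to reconfigure */
	int pause_updates;  /* times only flow control was reprogrammed */
};

static bool same_speed_duplex(const struct link_settings *a,
			      const struct link_settings *b)
{
	return a->speed == b->speed && a->duplex == b->duplex;
}

static bool same_pause(const struct link_settings *a,
		       const struct link_settings *b)
{
	return a->pause_tx == b->pause_tx && a->pause_rx == b->pause_rx;
}

/* Apply only the difference between the cached and requested settings. */
static void fake_mac_config(struct fake_mac *mac,
			    const struct link_settings *req)
{
	assert(mac != NULL && req != NULL);

	if (!mac->configured || !same_speed_duplex(&mac->cur, req)) {
		/* First call or speed/duplex change: this model bounces the link. */
		mac->link_bounces++;
	} else if (!same_pause(&mac->cur, req)) {
		/* Only flow control changed: update it with the link left up. */
		mac->pause_updates++;
	}
	/* Nothing changed: neither counter moves, no hardware access. */

	mac->cur = *req;
	mac->configured = true;
}
```

In this model, calling fake_mac_config() twice with identical settings
leaves both counters untouched on the second call, and a change to
pause_tx alone increments pause_updates without a link bounce.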



* [PATCH 2/2] doc: add phylink documentation to the networking book
From: Russell King @ 2019-02-05 15:58 UTC (permalink / raw)
  To: linux-doc, netdev; +Cc: David S. Miller, Jonathan Corbet

Add some phylink documentation to the networking book detailing how
to convert network drivers from phylib to phylink.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
Version 2 adds the "Modes of operation" section, as it appears mvpp2 is
non-conformant (which is, unfortunately, causing problems in certain
circumstances.)

 Documentation/networking/index.rst       |   1 +
 Documentation/networking/sfp-phylink.rst | 268 +++++++++++++++++++++++++++++++
 2 files changed, 269 insertions(+)
 create mode 100644 Documentation/networking/sfp-phylink.rst

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index bd89dae8d578..ea81fc403b68 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -31,6 +31,7 @@ Linux Networking Documentation
    net_failover
    alias
    bridge
+   sfp-phylink
 
 .. only::  subproject
 
diff --git a/Documentation/networking/sfp-phylink.rst b/Documentation/networking/sfp-phylink.rst
new file mode 100644
index 000000000000..78a577c9d8a3
--- /dev/null
+++ b/Documentation/networking/sfp-phylink.rst
@@ -0,0 +1,268 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======
+phylink
+=======
+
+Overview
+========
+
+phylink is a mechanism to support hot-pluggable networking modules
+without needing to re-initialise the adapter on hot-plug events.
+
+phylink supports conventional phylib-based setups, fixed link setups
+and SFP modules at present.
+
+Modes of operation
+==================
+
+phylink has several modes of operation, which depend on the firmware
+settings.
+
+1. PHY mode
+
+   In PHY mode, we use phylib to read the current link settings from
+   the PHY, and pass them to the MAC driver.  We expect the MAC driver
+   to configure exactly the modes that are specified without any
+   negotiation being enabled on the link.
+
+2. Fixed mode
+
+   Fixed mode is the same as PHY mode as far as the MAC driver is
+   concerned.
+
+3. In-band mode
+
+   In-band mode is used with 802.3z, SGMII and similar interface modes,
+   and we are expecting to use and honor the in-band negotiation or
+   control word sent across the serdes channel.
+
+For example, this means that:
+
+.. code-block:: none
+
+  &eth {
+    phy = <&phy>;
+    phy-mode = "sgmii";
+  };
+
+does not use in-band SGMII signalling.  The MAC is expected to follow
+exactly the settings given to it in its :c:func:`mac_config` function.
+The link should be forced up or down appropriately in the
+:c:func:`mac_link_up` and :c:func:`mac_link_down` functions.
+
+.. code-block:: none
+
+  &eth {
+    managed = "in-band-status";
+    phy = <&phy>;
+    phy-mode = "sgmii";
+  };
+
+uses in-band mode, where results from the PHY's negotiation are passed
+to the MAC through the SGMII control word, and the MAC is expected to
+acknowledge the control word.  The :c:func:`mac_link_up` and
+:c:func:`mac_link_down` functions must not force the MAC side link
+up and down.
+
+Rough guide to converting a network driver to sfp/phylink
+=========================================================
+
+This guide briefly describes how to convert a network driver from
+phylib to the sfp/phylink support.  Please send patches to improve
+this documentation.
+
+1. Optionally split the network driver's phylib update function into
+   three parts dealing with link-down, link-up and reconfiguring the
+   MAC settings. This can be done as a separate preparation commit.
+
+   An example of this preparation can be found in git commit fc548b991fb0.
+
+2. Replace::
+
+	select FIXED_PHY
+	select PHYLIB
+
+   with::
+
+	select PHYLINK
+
+   in the driver's Kconfig stanza.
+
+3. Add::
+
+	#include <linux/phylink.h>
+
+   to the driver's list of header files.
+
+4. Add::
+
+	struct phylink *phylink;
+
+   to the driver's private data structure.  We shall refer to the
+   driver's private data pointer as ``priv`` below, and the driver's
+   private data structure as ``struct foo_priv``.
+
+5. Replace the following functions:
+
+   .. flat-table::
+    :header-rows: 1
+    :widths: 1 1
+    :stub-columns: 0
+
+    * - Original function
+      - Replacement function
+    * - phy_start(phydev)
+      - phylink_start(priv->phylink)
+    * - phy_stop(phydev)
+      - phylink_stop(priv->phylink)
+    * - phy_mii_ioctl(phydev, ifr, cmd)
+      - phylink_mii_ioctl(priv->phylink, ifr, cmd)
+    * - phy_ethtool_get_wol(phydev, wol)
+      - phylink_ethtool_get_wol(priv->phylink, wol)
+    * - phy_ethtool_set_wol(phydev, wol)
+      - phylink_ethtool_set_wol(priv->phylink, wol)
+    * - phy_disconnect(phydev)
+      - phylink_disconnect_phy(priv->phylink)
+
+   Please note that some of these functions must be called under the
+   rtnl lock, and will warn if not. This will normally be the case,
+   except if these are called from the driver suspend/resume paths.
+
+6. Add/replace ksettings get/set methods with:
+
+   .. code-block:: c
+
+    static int foo_ethtool_set_link_ksettings(struct net_device *dev,
+					     const struct ethtool_link_ksettings *cmd)
+    {
+	struct foo_priv *priv = netdev_priv(dev);
+
+	return phylink_ethtool_ksettings_set(priv->phylink, cmd);
+    }
+
+    static int foo_ethtool_get_link_ksettings(struct net_device *dev,
+					     struct ethtool_link_ksettings *cmd)
+    {
+	struct foo_priv *priv = netdev_priv(dev);
+
+	return phylink_ethtool_ksettings_get(priv->phylink, cmd);
+    }
+
+7. Replace the call to::
+
+	phy_dev = of_phy_connect(dev, node, link_func, flags, phy_interface);
+
+   and associated code with a call to::
+
+	err = phylink_of_phy_connect(priv->phylink, node, flags);
+
+   For the most part, ``flags`` can be zero; these flags are passed to
+   the of_phy_attach() inside this function call if a PHY is specified
+   in the DT node ``node``.
+
+   ``node`` should be the DT node which contains the network phy property,
+   fixed link properties, and will also contain the sfp property.
+
+   The setup of fixed links should also be removed; these are handled
+   natively by phylink.
+
+   of_phy_connect() was also passed a function pointer for link updates.
+   This function is replaced by a different form of MAC updates
+   described below in (8).
+
+   Manipulation of the PHY's supported/advertised link modes happens
+   within phylink based on the validate callback; see below in (8).
+
+   Note that the driver no longer needs to store the ``phy_interface``,
+   and also note that ``phy_interface`` becomes a dynamic property,
+   just like the speed, duplex etc settings.
+
+   Finally, note that the MAC driver has no direct access to the PHY
+   anymore; that is because in the phylink model, the PHY can be
+   dynamic.
+
+8. Add a :c:type:`struct phylink_mac_ops <phylink_mac_ops>` instance to
+   the driver, which is a table of function pointers, and implement
+   these functions. The old link update function for
+   :c:func:`of_phy_connect` becomes three methods: :c:func:`mac_link_up`,
+   :c:func:`mac_link_down`, and :c:func:`mac_config`. If step 1 was
+   performed, then the functionality will have been split there.
+
+   It is important that if in-band negotiation is used,
+   :c:func:`mac_link_up` and :c:func:`mac_link_down` do not prevent the
+   in-band negotiation from completing, since these functions are called
+   when the in-band link state changes - otherwise the link will never
+   come up.
+
+   The :c:func:`validate` method should mask the supplied supported mask,
+   and ``state->advertising`` with the supported ethtool link modes.
+   These are the new ethtool link modes, so bitmask operations must be
+   used. For an example, see drivers/net/ethernet/marvell/mvneta.c.
+
+   The :c:func:`mac_link_state` method is used to read the link state
+   from the MAC, and report back the settings that the MAC is currently
+   using. This is particularly important for in-band negotiation
+   methods such as 1000base-X and SGMII.
+
+   The :c:func:`mac_config` method is used to update the MAC with the
+   requested state, and must avoid unnecessarily taking the link down
+   when making changes to the MAC configuration.  This means the
+   function should modify the state and only take the link down when
+   absolutely necessary to change the MAC configuration.  An example
+   of how to do this can be found in :c:func:`mvneta_mac_config` in
+   drivers/net/ethernet/marvell/mvneta.c.
+
+   For further information on these methods, please see the inline
+   documentation in :c:type:`struct phylink_mac_ops <phylink_mac_ops>`.
+
+9. Remove calls to of_parse_phandle() for the PHY,
+   of_phy_register_fixed_link() for fixed links etc from the probe
+   function, and replace with:
+
+   .. code-block:: c
+
+	struct phylink *phylink;
+
+	phylink = phylink_create(dev, node, phy_mode, &phylink_ops);
+	if (IS_ERR(phylink)) {
+		err = PTR_ERR(phylink);
+		fail probe;
+	}
+
+	priv->phylink = phylink;
+
+   and arrange to destroy the phylink in the probe failure path as
+   appropriate and the removal path too by calling:
+
+   .. code-block:: c
+
+	phylink_destroy(priv->phylink);
+
+10. Arrange for MAC link state interrupts to be forwarded into
+    phylink, via:
+
+    .. code-block:: c
+
+	phylink_mac_change(priv->phylink, link_is_up);
+
+    where ``link_is_up`` is true if the link is currently up or false
+    otherwise.
+
+11. Verify that the driver does not call::
+
+	netif_carrier_on()
+	netif_carrier_off()
+
+    as these will interfere with phylink's tracking of the link state,
+    and cause phylink to omit calls via the :c:func:`mac_link_up` and
+    :c:func:`mac_link_down` methods.
+
+Network drivers should call phylink_stop() and phylink_start() via their
+suspend/resume paths, which ensures that the appropriate
+:c:type:`struct phylink_mac_ops <phylink_mac_ops>` methods are called
+as necessary.
+
+For information describing the SFP cage in DT, please see the binding
+documentation in the kernel source tree
+``Documentation/devicetree/bindings/net/sff,sfp.txt``.
-- 
2.7.4
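
A note on the validate method in step 8: the text says that bitmask
operations must be used on the new ethtool link modes. The reason is that
the real list of modes outgrew a single 32-bit word, so the masks are
multi-word bitmaps. The following userspace sketch shows the idea under
that assumption. The mode numbering, struct mode_mask and fake_validate()
are hypothetical and are not the kernel's definitions; a real driver
would use the kernel's bitmap helpers instead.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical link mode numbering, with one mode placed beyond bit 63
 * to show why a plain integer AND does not suffice.
 */
enum {
	MODE_10_FULL,
	MODE_100_FULL,
	MODE_1000_FULL,
	MODE_10000_FULL = 70,
	MODE_NBITS = 96,
};

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define MODE_WORDS	((MODE_NBITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct mode_mask {
	unsigned long bits[MODE_WORDS];
};

static void mode_set(struct mode_mask *m, unsigned int mode)
{
	m->bits[mode / BITS_PER_WORD] |= 1UL << (mode % BITS_PER_WORD);
}

static bool mode_test(const struct mode_mask *m, unsigned int mode)
{
	return (m->bits[mode / BITS_PER_WORD] >> (mode % BITS_PER_WORD)) & 1UL;
}

/* dst &= src, word by word across the whole bitmap. */
static void mode_and(struct mode_mask *dst, const struct mode_mask *src)
{
	size_t i;

	for (i = 0; i < MODE_WORDS; i++)
		dst->bits[i] &= src->bits[i];
}

/* Sketch of a validate step: restrict both masks to what the MAC can do. */
static void fake_validate(struct mode_mask *supported,
			  struct mode_mask *advertising,
			  const struct mode_mask *mac_modes)
{
	assert(supported && advertising && mac_modes);
	mode_and(supported, mac_modes);
	mode_and(advertising, mac_modes);
}
```

With a MAC that handles only the 10, 100 and 1000 modes, a supported mask
that also contains MODE_10000_FULL comes back with that bit cleared, and
the advertising mask is trimmed in the same way.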



* [PATCH net] net: defxx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:01 UTC (permalink / raw)
  To: netdev; +Cc: macro, davem, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called in dfx_xmt_done() once the skb
has been transmitted. This keeps drop profiling tools (dropwatch, perf)
from reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/fddi/defxx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/fddi/defxx.c b/drivers/net/fddi/defxx.c
index 38ac8ef..56b7791 100644
--- a/drivers/net/fddi/defxx.c
+++ b/drivers/net/fddi/defxx.c
@@ -3512,7 +3512,7 @@ static int dfx_xmt_done(DFX_board_t *bp)
 				 bp->descr_block_virt->xmt_data[comp].long_1,
 				 p_xmt_drv_descr->p_skb->len,
 				 DMA_TO_DEVICE);
-		dev_kfree_skb_irq(p_xmt_drv_descr->p_skb);
+		dev_consume_skb_irq(p_xmt_drv_descr->p_skb);
 
 		/*
 		 * Move to start of next packet by updating completion index
-- 
2.7.4
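
A note on why this swap matters for drop profiling: both helpers release
the skb from interrupt context, and they differ only in the reason that is
recorded, which decides whether the release is accounted as a drop or as a
normal completion. The userspace sketch below models that distinction. All
names in it (enum free_reason, struct trace_counts, the fake_* functions)
are hypothetical and only mirror the behaviour, not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the two reasons an skb can be released with. */
enum free_reason {
	REASON_CONSUMED,	/* packet was handled normally */
	REASON_DROPPED,		/* packet was discarded */
};

/* Counters standing in for what a drop profiler would observe. */
struct trace_counts {
	unsigned int consumed;
	unsigned int dropped;
};

struct fake_skb {
	size_t len;
};

/* Common release path: the reason only affects the accounting. */
static void fake_free_skb(struct trace_counts *tc, struct fake_skb *skb,
			  enum free_reason reason)
{
	assert(tc != NULL && skb != NULL);

	if (reason == REASON_CONSUMED)
		tc->consumed++;
	else
		tc->dropped++;
	skb->len = 0;	/* stands in for actually freeing the buffer */
}

/* Analogue of dev_kfree_skb_irq(): accounted as a drop. */
static void fake_kfree_skb_irq(struct trace_counts *tc, struct fake_skb *skb)
{
	fake_free_skb(tc, skb, REASON_DROPPED);
}

/* Analogue of dev_consume_skb_irq(): accounted as a normal completion. */
static void fake_consume_skb_irq(struct trace_counts *tc, struct fake_skb *skb)
{
	fake_free_skb(tc, skb, REASON_CONSUMED);
}
```

A transmit completion handler that uses the "kfree" variant therefore
inflates the dropped counter for every packet it sends successfully, which
is the noise these patches remove.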




* [PATCH net] net: tulip: de2104x: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:03 UTC (permalink / raw)
  To: netdev, linux-parisc; +Cc: davem, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called in de_tx() once the skb has been
transmitted. This keeps drop profiling tools (dropwatch, perf) from
reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/ethernet/dec/tulip/de2104x.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c b/drivers/net/ethernet/dec/tulip/de2104x.c
index 13430f7..f1a2da1 100644
--- a/drivers/net/ethernet/dec/tulip/de2104x.c
+++ b/drivers/net/ethernet/dec/tulip/de2104x.c
@@ -585,7 +585,7 @@ static void de_tx (struct de_private *de)
 				netif_dbg(de, tx_done, de->dev,
 					  "tx done, slot %d\n", tx_tail);
 			}
-			dev_kfree_skb_irq(skb);
+			dev_consume_skb_irq(skb);
 		}
 
 next:
-- 
2.7.4




* Re: [PATCH btf v2 0/3] Add BTF types deduplication algorithm
From: Daniel Borkmann @ 2019-02-05 16:04 UTC (permalink / raw)
  To: Andrii Nakryiko, netdev, ast, yhs, acme, kernel-team, kafai,
	ecree, andrii.nakryiko
In-Reply-To: <20190205012946.1590917-1-andriin@fb.com>

On 02/05/2019 02:29 AM, Andrii Nakryiko wrote:
> This patch series adds a BTF deduplication algorithm to libbpf. The algorithm
> takes BTF type information containing duplicate per-compilation-unit
> information and reduces it to an equivalent set of BTF types with no duplication
> and no loss of information. It also deduplicates strings and removes those strings that
> are not referenced from any BTF type (and line information in .BTF.ext section,
> if any).
> 
> The algorithm also resolves struct/union forward declarations into concrete BTF types
> across multiple compilation units to achieve a better deduplication ratio. If
> undesired, this resolution can be disabled by specifying the corresponding options.
> 
> When applied to BTF data emitted by pahole's DWARF->BTF converter, it reduces
> the overall size of .BTF section by about 65x, from about 112MB to 1.75MB, leaving
> only 29247 out of initial 3073497 BTF type descriptors.
> 
> Algorithm with minor differences and preliminary results before FUNC/FUNC_PROTO
> support is also described more verbosely at:
> https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html
> 
> v1->v2:
> - rebase on latest bpf-next
> - err_log/elog -> pr_debug
> - btf__dedup, btf__get_strings, btf__get_nr_types listed under 0.0.2 version
> 
> Andrii Nakryiko (3):
>   btf: extract BTF type size calculation
>   btf: add BTF types deduplication algorithm
>   selftests/btf: add initial BTF dedup tests
> 
>  tools/lib/bpf/btf.c                    | 1851 +++++++++++++++++++++++-
>  tools/lib/bpf/btf.h                    |   10 +
>  tools/lib/bpf/libbpf.map               |    3 +
>  tools/testing/selftests/bpf/test_btf.c |  535 ++++++-
>  4 files changed, 2332 insertions(+), 67 deletions(-)

Applied, thanks!


* Re: [PATCH bpf-next v2 0/4] Add RISC-V (RV64G) BPF JIT
From: Daniel Borkmann @ 2019-02-05 16:05 UTC (permalink / raw)
  To: bjorn.topel, linux-riscv, ast, netdev; +Cc: palmer, hch
In-Reply-To: <20190205124125.5553-1-bjorn.topel@gmail.com>

On 02/05/2019 01:41 PM, bjorn.topel@gmail.com wrote:
> From: Björn Töpel <bjorn.topel@gmail.com>
> 
> Hi!
> 
> This v2 series adds an RV64G BPF JIT to the kernel.
> 
> At the moment the RISC-V Linux port does not support
> CONFIG_HAVE_KPROBES (Patrick Stählin sent out an RFC last year), which
> means that CONFIG_BPF_EVENTS is not supported. Thus, no tests
> involving BPF_PROG_TYPE_TRACEPOINT, BPF_PROG_TYPE_PERF_EVENT,
> BPF_PROG_TYPE_KPROBE and BPF_PROG_TYPE_RAW_TRACEPOINT pass.
> 
> The implementation does not support "far branching" (>4KiB).
> 
> Test results:
>   # modprobe test_bpf
>   test_bpf: Summary: 378 PASSED, 0 FAILED, [366/366 JIT'ed]
> 
>   # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
>   # ./test_verifier
>   ...
>   Summary: 761 PASSED, 507 SKIPPED, 2 FAILED
> 
> Note that "test_verifier" was run with one build with
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and one without, otherwise
> many of the tests that require unaligned access were skipped.
> 
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y:
>   # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
>   # ./test_verifier | grep -c 'NOTE.*unknown align'
>   0
> 
> No CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS:
>   # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
>   # ./test_verifier | grep -c 'NOTE.*unknown align'
>   59
> 
> The two failing test_verifier tests are:
>   "ld_abs: vlan + abs, test 1"
>   "ld_abs: jump around ld_abs"
> 
> This is due to the "far branching" involved in those tests.
> All tests were done on QEMU emulator version 3.1.50
> (v3.1.0-688-g8ae951fbc106). I'll test it on real hardware, when I get
> access to it.
> 
> I'm routing this patch via bpf-next/netdev mailing list (after a
> conversation with Palmer at FOSDEM), mainly because the other JITs
> went that path.
> 
> Again, thanks for all the comments!
> 
> Cheers,
> Björn
> 
> v1 -> v2:
> * Added JMP32 support. (Daniel)
> * Add RISC-V to Documentation/sysctl/net.txt. (Daniel)
> * Fixed seen_call() asymmetry. (Daniel)
> * Fixed broken bpf_flush_icache() range. (Daniel)
> * Added alignment annotations to some selftests.
> 
> RFCv1 -> v1:
> * Cleaned up the Kconfig and net/Makefile. (Christoph)
> * Removed the entry-stub and squashed the build/config changes to be
>   part of the JIT implementation. (Christoph)
> * Simplified the register tracking code. (Daniel)
> * Removed unused macros. (Daniel)
> * Added myself as maintainer and updated documentation. (Daniel)
> * Removed HAVE_EFFICIENT_UNALIGNED_ACCESS. (Christoph, Palmer)
> * Added tail-calls and cleaned up the code.
> 
> 
> Björn Töpel (4):
>   bpf, riscv: add BPF JIT for RV64G
>   MAINTAINERS: add RISC-V BPF JIT maintainer
>   bpf, doc: add RISC-V JIT to BPF documentation
>   selftests/bpf: add "any alignment" annotation for some tests
> 
>  Documentation/networking/filter.txt           |   16 +-
>  Documentation/sysctl/net.txt                  |    1 +
>  MAINTAINERS                                   |    6 +
>  arch/riscv/Kconfig                            |    1 +
>  arch/riscv/Makefile                           |    2 +-
>  arch/riscv/net/Makefile                       |    1 +
>  arch/riscv/net/bpf_jit_comp.c                 | 1602 +++++++++++++++++
>  .../selftests/bpf/verifier/ctx_sk_msg.c       |    1 +
>  .../testing/selftests/bpf/verifier/ctx_skb.c  |    1 +
>  tools/testing/selftests/bpf/verifier/jmp32.c  |   22 +
>  tools/testing/selftests/bpf/verifier/jset.c   |    2 +
>  .../selftests/bpf/verifier/spill_fill.c       |    1 +
>  .../selftests/bpf/verifier/spin_lock.c        |    2 +
>  .../selftests/bpf/verifier/value_ptr_arith.c  |    4 +
>  14 files changed, 1654 insertions(+), 8 deletions(-)
>  create mode 100644 arch/riscv/net/Makefile
>  create mode 100644 arch/riscv/net/bpf_jit_comp.c

Applied, thanks!
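
A note on the "far branching" limit mentioned in the cover letter: RISC-V
conditional branches (the B-type format) carry a signed 13-bit byte
offset whose lowest bit is always zero, so a single branch instruction
reaches roughly 4KiB in either direction. The sketch below checks whether
an offset fits. branch_offset_fits() is a hypothetical helper written for
illustration and is not taken from the JIT in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reach of a B-type branch: signed 13-bit byte offset, bit 0 always zero. */
#define BTYPE_MIN	(-4096)
#define BTYPE_MAX	4094

/* Hypothetical helper: can one conditional branch reach this offset? */
static bool branch_offset_fits(int64_t offset)
{
	if (offset & 1)		/* branch targets are 2-byte aligned */
		return false;
	return offset >= BTYPE_MIN && offset <= BTYPE_MAX;
}
```

A JIT that only emits single conditional branches has to reject, or fail
on, any BPF jump whose translated offset makes this check return false,
which matches the two ld_abs failures reported above.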


* [PATCH net] net: dscc4: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:07 UTC (permalink / raw)
  To: netdev; +Cc: romieu, davem, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called in dscc4_tx_irq() once the skb
has been transmitted. This keeps drop profiling tools (dropwatch, perf)
from reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/wan/dscc4.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wan/dscc4.c b/drivers/net/wan/dscc4.c
index c0b0f52..27decf8 100644
--- a/drivers/net/wan/dscc4.c
+++ b/drivers/net/wan/dscc4.c
@@ -1575,7 +1575,7 @@ static void dscc4_tx_irq(struct dscc4_pci_priv *ppriv,
 					dev->stats.tx_packets++;
 					dev->stats.tx_bytes += skb->len;
 				}
-				dev_kfree_skb_irq(skb);
+				dev_consume_skb_irq(skb);
 				dpriv->tx_skbuff[cur] = NULL;
 				++dpriv->tx_dirty;
 			} else {
-- 
2.7.4




* [PATCH net] net: smsc: epic100: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, colin.king, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called in epic_tx() once the skb has
been transmitted. This keeps drop profiling tools (dropwatch, perf) from
reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/ethernet/smsc/epic100.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/smsc/epic100.c b/drivers/net/ethernet/smsc/epic100.c
index 15c62c1..be47d86 100644
--- a/drivers/net/ethernet/smsc/epic100.c
+++ b/drivers/net/ethernet/smsc/epic100.c
@@ -1037,7 +1037,7 @@ static void epic_tx(struct net_device *dev, struct epic_private *ep)
 		skb = ep->tx_skbuff[entry];
 		pci_unmap_single(ep->pci_dev, ep->tx_ring[entry].bufaddr,
 				 skb->len, PCI_DMA_TODEVICE);
-		dev_kfree_skb_irq(skb);
+		dev_consume_skb_irq(skb);
 		ep->tx_skbuff[entry] = NULL;
 	}
 
-- 
2.7.4




* [PATCH net] net: fec_mpc52xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:12 UTC (permalink / raw)
  To: netdev; +Cc: davem, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called in mpc52xx_fec_tx_interrupt()
once the skb has been transmitted. This keeps drop profiling tools
(dropwatch, perf) from reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/ethernet/freescale/fec_mpc52xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fec_mpc52xx.c b/drivers/net/ethernet/freescale/fec_mpc52xx.c
index b90bab7..c1968b3 100644
--- a/drivers/net/ethernet/freescale/fec_mpc52xx.c
+++ b/drivers/net/ethernet/freescale/fec_mpc52xx.c
@@ -369,7 +369,7 @@ static irqreturn_t mpc52xx_fec_tx_interrupt(int irq, void *dev_id)
 		dma_unmap_single(dev->dev.parent, bd->skb_pa, skb->len,
 				 DMA_TO_DEVICE);
 
-		dev_kfree_skb_irq(skb);
+		dev_consume_skb_irq(skb);
 	}
 	spin_unlock(&priv->lock);
 
-- 
2.7.4




* [PATCH net] net: fsl_ucc_hdlc: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:14 UTC (permalink / raw)
  To: netdev, linuxppc-dev; +Cc: qiang.zhao, davem, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called in hdlc_tx_done() once the skb
has been transmitted. This keeps drop profiling tools (dropwatch, perf)
from reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/wan/fsl_ucc_hdlc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
index 66d889d..a08f04c 100644
--- a/drivers/net/wan/fsl_ucc_hdlc.c
+++ b/drivers/net/wan/fsl_ucc_hdlc.c
@@ -482,7 +482,7 @@ static int hdlc_tx_done(struct ucc_hdlc_private *priv)
 		memset(priv->tx_buffer +
 		       (be32_to_cpu(bd->buf) - priv->dma_tx_addr),
 		       0, skb->len);
-		dev_kfree_skb_irq(skb);
+		dev_consume_skb_irq(skb);
 
 		priv->tx_skbuff[priv->skb_dirtytx] = NULL;
 		priv->skb_dirtytx =
-- 
2.7.4




* [PATCH net] net: sun: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, yanjun.zhu, shannon.nelson, robh, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called once the skb has been
transmitted. This keeps drop profiling tools (dropwatch, perf) from
reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/ethernet/sun/cassini.c | 2 +-
 drivers/net/ethernet/sun/sunbmac.c | 2 +-
 drivers/net/ethernet/sun/sunhme.c  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sun/cassini.c b/drivers/net/ethernet/sun/cassini.c
index 7ec4eb7..6fc05c1 100644
--- a/drivers/net/ethernet/sun/cassini.c
+++ b/drivers/net/ethernet/sun/cassini.c
@@ -1898,7 +1898,7 @@ static inline void cas_tx_ringN(struct cas *cp, int ring, int limit)
 		cp->net_stats[ring].tx_packets++;
 		cp->net_stats[ring].tx_bytes += skb->len;
 		spin_unlock(&cp->stat_lock[ring]);
-		dev_kfree_skb_irq(skb);
+		dev_consume_skb_irq(skb);
 	}
 	cp->tx_old[ring] = entry;
 
diff --git a/drivers/net/ethernet/sun/sunbmac.c b/drivers/net/ethernet/sun/sunbmac.c
index 720b7ac..e9b757b 100644
--- a/drivers/net/ethernet/sun/sunbmac.c
+++ b/drivers/net/ethernet/sun/sunbmac.c
@@ -781,7 +781,7 @@ static void bigmac_tx(struct bigmac *bp)
 
 		DTX(("skb(%p) ", skb));
 		bp->tx_skbs[elem] = NULL;
-		dev_kfree_skb_irq(skb);
+		dev_consume_skb_irq(skb);
 
 		elem = NEXT_TX(elem);
 	}
diff --git a/drivers/net/ethernet/sun/sunhme.c b/drivers/net/ethernet/sun/sunhme.c
index ff641cf..d007dfe 100644
--- a/drivers/net/ethernet/sun/sunhme.c
+++ b/drivers/net/ethernet/sun/sunhme.c
@@ -1962,7 +1962,7 @@ static void happy_meal_tx(struct happy_meal *hp)
 			this = &txbase[elem];
 		}
 
-		dev_kfree_skb_irq(skb);
+		dev_consume_skb_irq(skb);
 		dev->stats.tx_packets++;
 	}
 	hp->tx_old = elem;
-- 
2.7.4




* [PATCH net] net: tehuti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:21 UTC (permalink / raw)
  To: netdev; +Cc: andy, davem, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called in bdx_tx_cleanup() once the skb
has been transmitted. This keeps drop profiling tools (dropwatch, perf)
from reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/ethernet/tehuti/tehuti.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/tehuti/tehuti.c b/drivers/net/ethernet/tehuti/tehuti.c
index dc966ddb..b24c111 100644
--- a/drivers/net/ethernet/tehuti/tehuti.c
+++ b/drivers/net/ethernet/tehuti/tehuti.c
@@ -1739,7 +1739,7 @@ static void bdx_tx_cleanup(struct bdx_priv *priv)
 		tx_level -= db->rptr->len;	/* '-' koz len is negative */
 
 		/* now should come skb pointer - free it */
-		dev_kfree_skb_irq(db->rptr->addr.skb);
+		dev_consume_skb_irq(db->rptr->addr.skb);
 		bdx_tx_db_inc_rptr(db);
 	}
 
-- 
2.7.4




* [PATCH net] net: via-velocity: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:22 UTC (permalink / raw)
  To: netdev; +Cc: romieu, davem, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called in velocity_free_tx_buf() once
the skb has been transmitted. This keeps drop profiling tools
(dropwatch, perf) from reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/ethernet/via/via-velocity.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/via/via-velocity.c b/drivers/net/ethernet/via/via-velocity.c
index 8241269..27f6cf1 100644
--- a/drivers/net/ethernet/via/via-velocity.c
+++ b/drivers/net/ethernet/via/via-velocity.c
@@ -1740,7 +1740,7 @@ static void velocity_free_tx_buf(struct velocity_info *vptr,
 		dma_unmap_single(vptr->dev, tdinfo->skb_dma[i],
 				 le16_to_cpu(pktlen), DMA_TO_DEVICE);
 	}
-	dev_kfree_skb_irq(skb);
+	dev_consume_skb_irq(skb);
 	tdinfo->skb = NULL;
 }
 
-- 
2.7.4




* [PATCH net] net: broadcom: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-05 16:25 UTC (permalink / raw)
  To: netdev; +Cc: davem, f.fainelli, andrew, yang.wei9, albin_yang

From: Yang Wei <yang.wei9@zte.com.cn>

dev_consume_skb_irq() should be called in sbdma_tx_process() once the
skb has been transmitted. This keeps drop profiling tools (dropwatch,
perf) from reporting a completed transmit as a drop.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
 drivers/net/ethernet/broadcom/sb1250-mac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/sb1250-mac.c b/drivers/net/ethernet/broadcom/sb1250-mac.c
index 5db9f41..134ae28 100644
--- a/drivers/net/ethernet/broadcom/sb1250-mac.c
+++ b/drivers/net/ethernet/broadcom/sb1250-mac.c
@@ -1288,7 +1288,7 @@ static void sbdma_tx_process(struct sbmac_softc *sc, struct sbmacdma *d,
 		 * for transmits, we just free buffers.
 		 */
 
-		dev_kfree_skb_irq(sb);
+		dev_consume_skb_irq(sb);
 
 		/*
 		 * .. and advance to the next buffer.
-- 
2.7.4




* Re: [PATCH net-next v3] net: dsa: mv88e6xxx: Prevent suspend to RAM
From: Vivien Didelot @ 2019-02-05 16:28 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Andrew Lunn, Florian Fainelli, David S. Miller, netdev,
	linux-kernel, Thomas Petazzoni, Gregory Clement, Antoine Tenart,
	Maxime Chevallier, Nadav Haklai, Miquel Raynal
In-Reply-To: <20190205110728.11451-1-miquel.raynal@bootlin.com>

Hi Miquel,

On Tue,  5 Feb 2019 12:07:28 +0100, Miquel Raynal <miquel.raynal@bootlin.com> wrote:

> +/* There is no suspend to RAM support at DSA level yet, the switch configuration
> + * would be lost after a power cycle so prevent it to be suspended.
> + */
> +static int __maybe_unused mv88e6xxx_suspend(struct device *dev)
> +{
> +	return -EOPNOTSUPP;
> +}
> +
> +static int __maybe_unused mv88e6xxx_resume(struct device *dev)
> +{
> +	return 0;
> +}

The code looks good but my only concern is -EOPNOTSUPP. In this
context this code is specific to callbacks targeting bridge and
switchdev, while the dev_pm_ops are completely parallel to DSA.

It is intuitive but given Documentation/power/runtime_pm.txt, this
will default to being interpreted as a fatal error, while -EBUSY
seems to keep the device in an 'active' state in a saner way.

I don't understand yet how to properly tell PM core that suspend to RAM
isn't supported. If an error code different from -EAGAIN or -EBUSY
is the way to go, I'm good with it:

Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>


Thanks,

	Vivien


* Kernel panic in eth_header
From: Andrew @ 2019-02-05 16:29 UTC (permalink / raw)
  To: Netdev

Hi all.

After upgrading a PPPoE BRAS to kernel 4.9.153, I got a kernel panic
after 3 days of uptime.

Unfortunately the kernel was compiled without debug info. I rebuilt it
with debug info enabled (the function addresses are the same; I compared
the vmlinux symbol maps), and it says that the panic is in
net/ethernet/eth.c:88.

The kernel panic trace is below. igb is the vendor driver, version 5.3.5.4.
What extra info is needed?

[263565.106441] BUG: unable to handle kernel paging request at 
ffff88015a4d2dd4
[263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
[263565.119030] PGD 1e8f067 [263565.121474] PUD 0
[263565.123580]
[263565.125166] Oops: 0002 [#1] SMP
[263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 
nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp 
xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp 
nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat 
nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf 
cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe 
pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca 
parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
[263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O    
4.9.153-x86_64 #1
[263565.183996] Hardware name: System manufacturer System Product 
Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
[263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
[263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] 
eth_header+0x3b/0xc0
[263565.209225] RSP: 0018:ffff88007fa83c58  EFLAGS: 00010286
[263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: 
ffff8800682434a0
[263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: 
ffff880077aab000
[263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 
0000000000000574
[263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: 
ffff8800682434a0
[263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 
0000000000000008
[263565.250719] FS:  0000000000000000(0000) GS:ffff88007fa80000(0000) 
knlGS:0000000000000000
[263565.258894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 
00000000000006f0
[263565.271944] Stack:
[263565.274041]  ffff880077aab000 ffff880068243400 ffff88007a745000 
ffff8800682434a0
[263565.281582]  0000000000000002 ffffffff81571d09 ffff880068243400 
ffff88007fa83d00
[263565.289121]  ffff88007a745000 ffff880077aab000 ffff88007a712000 
ffffffff815a8c61
[263565.296661] Call Trace:
[263565.299193]  <IRQ> [263565.301205] [<ffffffff81571d09>] ? 
neigh_connected_output+0xa9/0x100
[263565.307740]  [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
[263565.313920]  [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
[263565.319319]  [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
[263565.324631]  [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
[263565.330030]  [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
[263565.336556]  [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
[263565.342128]  [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
[263565.348134]  [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
[263565.353361]  [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
[263565.359888]  [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
[263565.366586]  [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
[263565.373373]  [<ffffffff81567632>] ? process_backlog+0x92/0x130
[263565.379291]  [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
[263565.385124]  [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
[263565.390784]  [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
[263565.396008]  [<ffffffff81626cd6>] ? 
call_function_single_interrupt+0x96/0xa0
[263565.403141]  <EOI> [263565.405153] [<ffffffff81623eb0>] ? 
__sched_text_end+0x2/0x2
[263565.410907]  [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
[263565.416741]  [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
[263565.422314]  [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
[263565.428492]  [<ffffffff8104c261>] ? start_secondary+0x161/0x180
[263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 
bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 
85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
[263565.454534] RIP  [<ffffffff8158e48b>] eth_header+0x3b/0xc0
[263565.460124]  RSP <ffff88007fa83c58>
[263565.463696] CR2: ffff88015a4d2dd4
[263565.467104] ---[ end trace a1bcaf3618724adf ]---
[263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
[263565.478245] Kernel Offset: disabled
[263565.481818] Rebooting in 5 seconds..
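
A note on reading the dump above: the bytes at the faulting instruction in
the Code: line (<66> 89 58 0c) decode as a 16-bit store of %bx to 0xc(%rax),
and RAX + 0xc matches CR2. Offset 12 is where the protocol field sits in an
Ethernet header (after two 6-byte MAC addresses), so the fault looks like the
write of the protocol field into the freshly pushed header. The small sketch
below only redoes that arithmetic; the helper names are hypothetical.

```c
#include <stdint.h>

#define ETH_ALEN_SKETCH 6	/* bytes per MAC address */

/* Offset of the protocol field: destination MAC + source MAC. */
static unsigned int h_proto_offset(void)
{
	return 2 * ETH_ALEN_SKETCH;
}

/* Address touched by a store at `off` bytes into the header at `hdr`. */
static uint64_t fault_address(uint64_t hdr, unsigned int off)
{
	return hdr + off;
}

/* Byte swap of a 16-bit value, the effect of "rol $8,%bx". */
static uint16_t swab16_sketch(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}
```

With RAX = 0xffff88015a4d2dc8 this gives 0xffff88015a4d2dd4, the CR2 value,
and swapping RBX = 0x0008 back gives 0x0800, the IPv4 ethertype.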


^ permalink raw reply

* Re: [PATCH] net: dsa: Fix lockdep false positive splat
From: Vivien Didelot @ 2019-02-05 16:35 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: netdev, linux-kernel, Andrew Lunn, Florian Fainelli,
	David S. Miller
In-Reply-To: <20190202175329.5969-1-marc.zyngier@arm.com>

On Sat,  2 Feb 2019 17:53:29 +0000, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Creating a macvtap on a DSA-backed interface results in the following
> splat when lockdep is enabled:
> 
> [   19.638080] IPv6: ADDRCONF(NETDEV_CHANGE): lan0: link becomes ready
> [   23.041198] device lan0 entered promiscuous mode
> [   23.043445] device eth0 entered promiscuous mode
> [   23.049255]
> [   23.049557] ============================================
> [   23.055021] WARNING: possible recursive locking detected
> [   23.060490] 5.0.0-rc3-00013-g56c857a1b8d3 #118 Not tainted
> [   23.066132] --------------------------------------------
> [   23.071598] ip/2861 is trying to acquire lock:
> [   23.076171] 00000000f61990cb (_xmit_ETHER){+...}, at: dev_set_rx_mode+0x1c/0x38
> [   23.083693]
> [   23.083693] but task is already holding lock:
> [   23.089696] 00000000ecf0c3b4 (_xmit_ETHER){+...}, at: dev_uc_add+0x24/0x70
> [   23.096774]
> [   23.096774] other info that might help us debug this:
> [   23.103494]  Possible unsafe locking scenario:
> [   23.103494]
> [   23.109584]        CPU0
> [   23.112093]        ----
> [   23.114601]   lock(_xmit_ETHER);
> [   23.117917]   lock(_xmit_ETHER);
> [   23.121233]
> [   23.121233]  *** DEADLOCK ***
> [   23.121233]
> [   23.127325]  May be due to missing lock nesting notation
> [   23.127325]
> [   23.134315] 2 locks held by ip/2861:
> [   23.137987]  #0: 000000003b766c72 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x338/0x4e0
> [   23.146231]  #1: 00000000ecf0c3b4 (_xmit_ETHER){+...}, at: dev_uc_add+0x24/0x70
> [   23.153757]
> [   23.153757] stack backtrace:
> [   23.158243] CPU: 0 PID: 2861 Comm: ip Not tainted 5.0.0-rc3-00013-g56c857a1b8d3 #118
> [   23.166212] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
> [   23.172843] Call trace:
> [   23.175358]  dump_backtrace+0x0/0x188
> [   23.179116]  show_stack+0x14/0x20
> [   23.182524]  dump_stack+0xb4/0xec
> [   23.185928]  __lock_acquire+0x123c/0x1860
> [   23.190048]  lock_acquire+0xc8/0x248
> [   23.193724]  _raw_spin_lock_bh+0x40/0x58
> [   23.197755]  dev_set_rx_mode+0x1c/0x38
> [   23.201607]  dev_set_promiscuity+0x3c/0x50
> [   23.205820]  dsa_slave_change_rx_flags+0x5c/0x70
> [   23.210567]  __dev_set_promiscuity+0x148/0x1e0
> [   23.215136]  __dev_set_rx_mode+0x74/0x98
> [   23.219167]  dev_uc_add+0x54/0x70
> [   23.222575]  macvlan_open+0x170/0x1d0
> [   23.226336]  __dev_open+0xe0/0x160
> [   23.229830]  __dev_change_flags+0x16c/0x1b8
> [   23.234132]  dev_change_flags+0x20/0x60
> [   23.238074]  do_setlink+0x2d0/0xc50
> [   23.241658]  __rtnl_newlink+0x5f8/0x6e8
> [   23.245601]  rtnl_newlink+0x50/0x78
> [   23.249184]  rtnetlink_rcv_msg+0x360/0x4e0
> [   23.253397]  netlink_rcv_skb+0xe8/0x130
> [   23.257338]  rtnetlink_rcv+0x14/0x20
> [   23.261012]  netlink_unicast+0x190/0x210
> [   23.265043]  netlink_sendmsg+0x288/0x350
> [   23.269075]  sock_sendmsg+0x18/0x30
> [   23.272659]  ___sys_sendmsg+0x29c/0x2c8
> [   23.276602]  __sys_sendmsg+0x60/0xb8
> [   23.280276]  __arm64_sys_sendmsg+0x1c/0x28
> [   23.284488]  el0_svc_common+0xd8/0x138
> [   23.288340]  el0_svc_handler+0x24/0x80
> [   23.292192]  el0_svc+0x8/0xc
> 
> This looks fairly harmless (no actual deadlock occurs), and is
> fixed in a similar way to c6894dec8ea9 ("bridge: fix lockdep
> addr_list_lock false positive splat") by putting the addr_list_lock
> in its own lockdep class.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v7 0/8] devlink: Add configuration parameters support for devlink_port
From: Michal Kubecek @ 2019-02-05 16:51 UTC (permalink / raw)
  To: Vasundhara Volam
  Cc: Jakub Kicinski, Netdev, David Miller, michael.chan@broadcom.com,
	Jiri Pirko
In-Reply-To: <CAACQVJqPdrDTrFwZPt+XaGEf3-H81EWELR_SvZZG83mF+54MsQ@mail.gmail.com>

On Tue, Feb 05, 2019 at 09:53:26AM +0530, Vasundhara Volam wrote:
> On Tue, Feb 5, 2019 at 8:26 AM Jakub Kicinski
> >
> > No?  We were talking about using the soon-to-come ethtool netlink
> > API with additional indication that given configuration request is
> > supposed to be persisted.  Adding more devlink parameters is exactly
> > the opposite of what you should be doing.
> 
> Okay. So, till then can we have the devlink wake_on_lan parameter or
> you want this to be removed? Could you please clarify?
> 
> Once ethtool netlink API is available with persisted support, I can remove
> this wake_on_lan parameter from devlink. Thanks.

Once you provide an interface for userspace and applications start using
it, it's hard to get rid of it. As an extreme example, the legacy ioctl
interface used by ifconfig has been declared obsolete since kernel 2.2.0
(January 1999, i.e. 20 years ago) and we still have to maintain it.

Michal Kubecek

^ permalink raw reply

* Re: Kernel panic in eth_header
From: Eric Dumazet @ 2019-02-05 16:57 UTC (permalink / raw)
  To: Andrew, Netdev
In-Reply-To: <18c17dde-5963-4412-2e98-ba44953f0ddd@seti.kr.ua>



On 02/05/2019 08:29 AM, Andrew wrote:
> Hi all.
> 
> After upgrade on PPPoE BRAS to kernel 4.9.153 I've got an kernel panic after a 3 days of uptime.
> 
> Unfortunately kernel is compiled w/o debug info; I rebuilt kernel with debug info enabled (kernel is compiled with same function addresses - I compare vmlinux symbol maps) - it says that panic is in net/ethernet/eth.c:88
> 
> Below there is a kernel panic trace. igb is from vendor, ver. 5.3.5.4. What extra info is needed?
> 
> [263565.106441] BUG: unable to handle kernel paging request at ffff88015a4d2dd4
> [263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
> [263565.119030] PGD 1e8f067 [263565.121474] PUD 0
> [263565.123580]
> [263565.125166] Oops: 0002 [#1] SMP
> [263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
> [263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O    4.9.153-x86_64 #1
> [263565.183996] Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
> [263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
> [263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] eth_header+0x3b/0xc0
> [263565.209225] RSP: 0018:ffff88007fa83c58  EFLAGS: 00010286
> [263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: ffff8800682434a0
> [263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: ffff880077aab000
> [263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 0000000000000574
> [263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: ffff8800682434a0
> [263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 0000000000000008
> [263565.250719] FS:  0000000000000000(0000) GS:ffff88007fa80000(0000) knlGS:0000000000000000
> [263565.258894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 00000000000006f0
> [263565.271944] Stack:
> [263565.274041]  ffff880077aab000 ffff880068243400 ffff88007a745000 ffff8800682434a0
> [263565.281582]  0000000000000002 ffffffff81571d09 ffff880068243400 ffff88007fa83d00
> [263565.289121]  ffff88007a745000 ffff880077aab000 ffff88007a712000 ffffffff815a8c61
> [263565.296661] Call Trace:
> [263565.299193]  <IRQ> [263565.301205] [<ffffffff81571d09>] ? neigh_connected_output+0xa9/0x100
> [263565.307740]  [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
> [263565.313920]  [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
> [263565.319319]  [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
> [263565.324631]  [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
> [263565.330030]  [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
> [263565.336556]  [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
> [263565.342128]  [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
> [263565.348134]  [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
> [263565.353361]  [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
> [263565.359888]  [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
> [263565.366586]  [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
> [263565.373373]  [<ffffffff81567632>] ? process_backlog+0x92/0x130
> [263565.379291]  [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
> [263565.385124]  [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
> [263565.390784]  [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
> [263565.396008]  [<ffffffff81626cd6>] ? call_function_single_interrupt+0x96/0xa0
> [263565.403141]  <EOI> [263565.405153] [<ffffffff81623eb0>] ? __sched_text_end+0x2/0x2
> [263565.410907]  [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
> [263565.416741]  [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
> [263565.422314]  [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
> [263565.428492]  [<ffffffff8104c261>] ? start_secondary+0x161/0x180
> [263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
> [263565.454534] RIP  [<ffffffff8158e48b>] eth_header+0x3b/0xc0
> [263565.460124]  RSP <ffff88007fa83c58>
> [263565.463696] CR2: ffff88015a4d2dd4
> [263565.467104] ---[ end trace a1bcaf3618724adf ]---
> [263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
> [263565.478245] Kernel Offset: disabled
> [263565.481818] Rebooting in 5 seconds..
> 


This is a well-known issue; a fix should land in the stable branches shortly.

diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index f8bbd693c19c247e41839c2d0b5318ca51b23ee8..d95b32af4a0e3f552405c9e61cc372729834160c 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -425,6 +425,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
         * fragment.
         */
 
+       err = -EINVAL;
        /* Find out where to put this fragment.  */
        prev_tail = qp->q.fragments_tail;
        if (!prev_tail)
@@ -501,7 +502,6 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 
 discard_qp:
        inet_frag_kill(&qp->q);
-       err = -EINVAL;
        __IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
 err:
        kfree_skb(skb);
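
One way to read the diff above: an assignment placed under discard_qp: is
skipped by any path that jumps straight to err:, so hoisting it ahead of the
branches gives every failure path a defined error code. The userspace sketch
below assumes that reading; the function, its parameters and the label names
are hypothetical and only illustrate the pattern.

```c
#include <errno.h>

/* Hypothetical stand-in for a queueing routine with two failure labels. */
static int queue_sketch(int overlaps, int bad_position)
{
	int err;

	/* Set the error code before any branch that can fail, so every
	 * jump to either label below carries a defined negative value.
	 */
	err = -EINVAL;
	if (bad_position)
		goto out_err;
	if (overlaps)
		goto out_discard;

	return 0;	/* queued */

out_discard:
	/* the kernel code kills the queue and bumps a counter here */
out_err:
	return err;
}
```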




^ permalink raw reply related

* Re: [B.A.T.M.A.N.] [RFC v4 00/19] batman-adv: netlink restructuring, part 2
From: Simon Wunderlich @ 2019-02-05 17:04 UTC (permalink / raw)
  To: b.a.t.m.a.n; +Cc: Sven Eckelmann, netdev, Jiri Pirko
In-Reply-To: <1895931.G10psR3j26@sven-edge>

On Saturday, January 26, 2019 11:47:20 AM CET Sven Eckelmann wrote:
> Aggregated OGM is currently defined as:
> 
> 
> * according to batctl manpage:
> 
>     aggregation|ag [0|1]
>            If no parameter is given the current aggregation setting
>            is displayed. Otherwise the parameter is used to enable or
>            disable OGM packet aggregation.
> 
> * according to sysfs ABI:
> 
>     What:           /sys/class/net/<mesh_iface>/mesh/aggregated_ogms
>     Date:           May 2010
>     Contact:        Marek Lindner <mareklindner@neomailbox.ch>
>     Description:
>                     Indicates whether the batman protocol messages of the
>                     mesh <mesh_iface> shall be aggregated or not.
> 
> So sysfs is only one possible backend for the batctl command. There is
> currently nothing which I would assume to be aggregatable besides OGMs, but
> let us assume for now that there is something and some way to
> aggregate things besides OGMs in a safe and backward compatible way. Let's
> call this FOO - so we have BATADV_ATTR_AGGREGATION_OGM_ENABLED and
> BATADV_ATTR_AGGREGATION_FOO_ENABLED. Or we have BATADV_ATTR_AGGREGATION as
> a u32 and just use the second bit as marker for FOO (and of course the
> first bit as marker for OGM).
> 
> Would it now be preferable to use BATADV_ATTR_AGGREGATION_OGM_ENABLED
> as u8 (boolean) or to switch to BATADV_ATTR_AGGREGATION (u32) & assign
> single bits to packet types?

I'd prefer BATADV_ATTR_AGGREGATION_OGM_ENABLED (as we have your patchset now). 
Although it may be technically possible to aggregate other things (e.g. 
broadcasts), I don't think this will be implemented anytime soon, if at all. 
And if we do, we can just make another BATADV_ATTR_AGGREGATION_FOO_ENABLED 
flag.
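
For comparison, the two layouts discussed above could be sketched roughly as
follows. This is a plain C illustration, not the netlink attribute
definitions: the struct, the bit macros and the helpers are hypothetical, and
FOO is the placeholder packet type from the quoted mail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Option A: one boolean (u8) value per packet type. */
struct agg_per_type {
	uint8_t ogm_enabled;
	uint8_t foo_enabled;	/* hypothetical future type "FOO" */
};

/* Option B: a single u32 value, one bit per packet type. */
#define AGG_BIT_OGM (UINT32_C(1) << 0)
#define AGG_BIT_FOO (UINT32_C(1) << 1)

static bool agg_mask_has(uint32_t mask, uint32_t bit)
{
	return (mask & bit) != 0;
}

/* Translate option A into option B, to show they carry the same state. */
static uint32_t agg_mask_from_flags(const struct agg_per_type *f)
{
	uint32_t mask = 0;

	if (f->ogm_enabled)
		mask |= AGG_BIT_OGM;
	if (f->foo_enabled)
		mask |= AGG_BIT_FOO;
	return mask;
}
```

Both carry the same information; with option A, adding a new type means
adding a new attribute rather than defining a new bit.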

Cheers,
       Simon

^ permalink raw reply

* [PATCH v2] bpf: test_maps: Avoid possible out of bound access
From: Breno Leitao @ 2019-02-05 17:12 UTC (permalink / raw)
  To: netdev; +Cc: daniel, ast, davem, Breno Leitao

When compiling test_maps selftest with GCC-8, it warns that an array might
be indexed with a negative value, which could cause a negative out of bound
access, depending on parameters of the function. This is the GCC-8 warning:

	gcc -Wall -O2 -I../../../include/uapi -I../../../lib -I../../../lib/bpf -I../../../../include/generated -DHAVE_GENHDR -I../../../include    test_maps.c /home/breno/Devel/linux/tools/testing/selftests/bpf/libbpf.a -lcap -lelf -lrt -lpthread -o /home/breno/Devel/linux/tools/testing/selftests/bpf/test_maps
	In file included from test_maps.c:16:
	test_maps.c: In function ‘run_all_tests’:
	test_maps.c:1079:10: warning: array subscript -1 is below array bounds of ‘pid_t[<Ube20> + 1]’ [-Warray-bounds]
	   assert(waitpid(pid[i], &status, 0) == pid[i]);
		  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
	test_maps.c:1059:6: warning: array subscript -1 is below array bounds of ‘pid_t[<Ube20> + 1]’ [-Warray-bounds]
	   pid[i] = fork();
	   ~~~^~~

This patch makes the task variable(s) unsigned so that they can never hold
a negative value, which avoids the out-of-bounds access warning.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 tools/testing/selftests/bpf/test_maps.c | 27 +++++++++++++------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index e2b9eee37187..6e05a22b346c 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -43,7 +43,7 @@ static int map_flags;
 	}								\
 })
 
-static void test_hashmap(int task, void *data)
+static void test_hashmap(unsigned int task, void *data)
 {
 	long long key, next_key, first_key, value;
 	int fd;
@@ -133,7 +133,7 @@ static void test_hashmap(int task, void *data)
 	close(fd);
 }
 
-static void test_hashmap_sizes(int task, void *data)
+static void test_hashmap_sizes(unsigned int task, void *data)
 {
 	int fd, i, j;
 
@@ -153,7 +153,7 @@ static void test_hashmap_sizes(int task, void *data)
 		}
 }
 
-static void test_hashmap_percpu(int task, void *data)
+static void test_hashmap_percpu(unsigned int task, void *data)
 {
 	unsigned int nr_cpus = bpf_num_possible_cpus();
 	BPF_DECLARE_PERCPU(long, value);
@@ -280,7 +280,7 @@ static int helper_fill_hashmap(int max_entries)
 	return fd;
 }
 
-static void test_hashmap_walk(int task, void *data)
+static void test_hashmap_walk(unsigned int task, void *data)
 {
 	int fd, i, max_entries = 1000;
 	long long key, value, next_key;
@@ -351,7 +351,7 @@ static void test_hashmap_zero_seed(void)
 	close(second);
 }
 
-static void test_arraymap(int task, void *data)
+static void test_arraymap(unsigned int task, void *data)
 {
 	int key, next_key, fd;
 	long long value;
@@ -406,7 +406,7 @@ static void test_arraymap(int task, void *data)
 	close(fd);
 }
 
-static void test_arraymap_percpu(int task, void *data)
+static void test_arraymap_percpu(unsigned int task, void *data)
 {
 	unsigned int nr_cpus = bpf_num_possible_cpus();
 	BPF_DECLARE_PERCPU(long, values);
@@ -502,7 +502,7 @@ static void test_arraymap_percpu_many_keys(void)
 	close(fd);
 }
 
-static void test_devmap(int task, void *data)
+static void test_devmap(unsigned int task, void *data)
 {
 	int fd;
 	__u32 key, value;
@@ -517,7 +517,7 @@ static void test_devmap(int task, void *data)
 	close(fd);
 }
 
-static void test_queuemap(int task, void *data)
+static void test_queuemap(unsigned int task, void *data)
 {
 	const int MAP_SIZE = 32;
 	__u32 vals[MAP_SIZE + MAP_SIZE/2], val;
@@ -575,7 +575,7 @@ static void test_queuemap(int task, void *data)
 	close(fd);
 }
 
-static void test_stackmap(int task, void *data)
+static void test_stackmap(unsigned int task, void *data)
 {
 	const int MAP_SIZE = 32;
 	__u32 vals[MAP_SIZE + MAP_SIZE/2], val;
@@ -641,7 +641,7 @@ static void test_stackmap(int task, void *data)
 #define SOCKMAP_PARSE_PROG "./sockmap_parse_prog.o"
 #define SOCKMAP_VERDICT_PROG "./sockmap_verdict_prog.o"
 #define SOCKMAP_TCP_MSG_PROG "./sockmap_tcp_msg_prog.o"
-static void test_sockmap(int tasks, void *data)
+static void test_sockmap(unsigned int tasks, void *data)
 {
 	struct bpf_map *bpf_map_rx, *bpf_map_tx, *bpf_map_msg, *bpf_map_break;
 	int map_fd_msg = 0, map_fd_rx = 0, map_fd_tx = 0, map_fd_break;
@@ -1258,10 +1258,11 @@ static void test_map_large(void)
 }
 
 #define run_parallel(N, FN, DATA) \
-	printf("Fork %d tasks to '" #FN "'\n", N); \
+	printf("Fork %u tasks to '" #FN "'\n", N); \
 	__run_parallel(N, FN, DATA)
 
-static void __run_parallel(int tasks, void (*fn)(int task, void *data),
+static void __run_parallel(unsigned int tasks,
+			   void (*fn)(unsigned int task, void *data),
 			   void *data)
 {
 	pid_t pid[tasks];
@@ -1302,7 +1303,7 @@ static void test_map_stress(void)
 #define DO_UPDATE 1
 #define DO_DELETE 0
 
-static void test_update_delete(int fn, void *data)
+static void test_update_delete(unsigned int fn, void *data)
 {
 	int do_update = ((int *)data)[1];
 	int fd = ((int *)data)[0];
-- 
2.19.0
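
For illustration, the shape of the fork/wait helper after the change can be
sketched in userspace as below. This is not the selftest code: the function
names are hypothetical, the task body is a no-op, and unlike the original the
sketch returns a count instead of asserting.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical test body; a real one would exercise a map. */
static void noop_task(unsigned int task, void *data)
{
	(void)task;
	(void)data;
}

/* Fork `tasks` children, run fn(i, data) in each, and return how many
 * exited with status 0. The index is unsigned, so pid[i] cannot be
 * reached with a negative subscript.
 */
static unsigned int run_parallel_sketch(unsigned int tasks,
					void (*fn)(unsigned int task, void *data),
					void *data)
{
	unsigned int i, ok = 0;

	if (tasks == 0)
		return 0;	/* a zero-length VLA would be undefined */

	pid_t pid[tasks];

	for (i = 0; i < tasks; i++) {
		pid[i] = fork();
		if (pid[i] == 0) {
			fn(i, data);
			_exit(0);
		}
	}

	for (i = 0; i < tasks; i++) {
		int status;

		if (pid[i] > 0 && waitpid(pid[i], &status, 0) == pid[i] &&
		    WIFEXITED(status) && WEXITSTATUS(status) == 0)
			ok++;
	}
	return ok;
}
```

Calling run_parallel_sketch(3, noop_task, NULL) forks three children and
counts the ones that exit cleanly.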


^ permalink raw reply related

* Kernel panic in eth_header
From: Andrew @ 2019-02-05 16:09 UTC (permalink / raw)
  To: Netdev

Hi all.

After upgrading a PPPoE BRAS to kernel 4.9.153, I got a kernel panic
after 3 days of uptime.

Unfortunately the kernel was compiled without debug info. I rebuilt it
with debug info enabled (the function addresses are the same - I
compared the vmlinux symbol maps), and it says that the panic is in
net/ethernet/eth.c:88

The kernel panic trace is below. igb is from upstream, ver. 5.3.5.4.
What extra info is needed?

[263565.106441] BUG: unable to handle kernel paging request at 
ffff88015a4d2dd4
[263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
[263565.119030] PGD 1e8f067 [263565.121474] PUD 0
[263565.123580]
[263565.125166] Oops: 0002 [#1] SMP
[263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 
nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp 
xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp 
nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat 
nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf 
cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe 
pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca 
parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
[263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O    
4.9.153-x86_64 #1
[263565.183996] Hardware name: System manufacturer System Product 
Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
[263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
[263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] 
eth_header+0x3b/0xc0
[263565.209225] RSP: 0018:ffff88007fa83c58  EFLAGS: 00010286
[263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: 
ffff8800682434a0
[263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: 
ffff880077aab000
[263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 
0000000000000574
[263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: 
ffff8800682434a0
[263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 
0000000000000008
[263565.250719] FS:  0000000000000000(0000) GS:ffff88007fa80000(0000) 
knlGS:0000000000000000
[263565.258894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 
00000000000006f0
[263565.271944] Stack:
[263565.274041]  ffff880077aab000 ffff880068243400 ffff88007a745000 
ffff8800682434a0
[263565.281582]  0000000000000002 ffffffff81571d09 ffff880068243400 
ffff88007fa83d00
[263565.289121]  ffff88007a745000 ffff880077aab000 ffff88007a712000 
ffffffff815a8c61
[263565.296661] Call Trace:
[263565.299193]  <IRQ> [263565.301205] [<ffffffff81571d09>] ? 
neigh_connected_output+0xa9/0x100
[263565.307740]  [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
[263565.313920]  [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
[263565.319319]  [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
[263565.324631]  [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
[263565.330030]  [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
[263565.336556]  [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
[263565.342128]  [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
[263565.348134]  [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
[263565.353361]  [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
[263565.359888]  [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
[263565.366586]  [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
[263565.373373]  [<ffffffff81567632>] ? process_backlog+0x92/0x130
[263565.379291]  [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
[263565.385124]  [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
[263565.390784]  [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
[263565.396008]  [<ffffffff81626cd6>] ? 
call_function_single_interrupt+0x96/0xa0
[263565.403141]  <EOI> [263565.405153] [<ffffffff81623eb0>] ? 
__sched_text_end+0x2/0x2
[263565.410907]  [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
[263565.416741]  [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
[263565.422314]  [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
[263565.428492]  [<ffffffff8104c261>] ? start_secondary+0x161/0x180
[263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 
bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 
85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
[263565.454534] RIP  [<ffffffff8158e48b>] eth_header+0x3b/0xc0
[263565.460124]  RSP <ffff88007fa83c58>
[263565.463696] CR2: ffff88015a4d2dd4
[263565.467104] ---[ end trace a1bcaf3618724adf ]---
[263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
[263565.478245] Kernel Offset: disabled
[263565.481818] Rebooting in 5 seconds..


^ permalink raw reply

* [RFC bpf-next 0/7] net: flow_dissector: trigger BPF hook when called from eth_get_headlen
From: Stanislav Fomichev @ 2019-02-05 17:36 UTC (permalink / raw)
  To: netdev; +Cc: davem, ast, daniel, simon.horman, willemb, Stanislav Fomichev

Currently, when eth_get_headlen calls the flow dissector, it doesn't pass
any skb. Because we use the passed skb to look up the associated networking
namespace and find out whether a BPF program is attached, we always use the
C-based flow dissector in this case.

The goal of this patch series is to add a new networking namespace argument
to eth_get_headlen and make BPF flow dissector programs work in the
skb-less case.

The series goes like this:
1. introduce __init_skb and __init_skb_shinfo; those will be used to
   initialize temporary skb
2. introduce skb_net which can be used to get networking namespace
   associated with an skb
3. add new optional network namespace argument to __skb_flow_dissect and
   plumb through the callers
4. add new __flow_bpf_dissect which constructs temporary on-stack skb
   (using __init_skb) and calls BPF flow dissector program
5. convert flow dissector BPF_PROG_TEST_RUN to skb-less mode to show that
   it works
6. add selftest that makes sure going over the packet bounds in
   bpf_skb_load_bytes with on-stack skb doesn't cause any problems
7. add new net namespace argument to eth_get_headlen and convert the
   callers
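
As a rough userspace model of the selection logic described above (not the
kernel implementation), the sketch below picks between the C and BPF
dissectors based on a namespace taken either from the skb or from an explicit
argument. All types and names are hypothetical stand-ins, and the choice to
prefer the skb's namespace over the explicit one is an assumption of the
sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* All types here are hypothetical stand-ins, not the kernel structures. */
struct net_sketch {
	bool has_bpf_dissector;
};

struct skb_sketch {
	struct net_sketch *net;
};

enum dissector {
	DISSECT_C,
	DISSECT_BPF,
};

/* Pick the dissector. The namespace comes from the skb when there is one,
 * otherwise from the explicit net argument the series proposes to add.
 */
static enum dissector pick_dissector(const struct skb_sketch *skb,
				     const struct net_sketch *net)
{
	if (skb && skb->net)
		net = skb->net;
	if (net && net->has_bpf_dissector)
		return DISSECT_BPF;
	return DISSECT_C;
}
```

With neither an skb nor a namespace, the model always falls back to the C
dissector, which is the situation the cover letter describes for
eth_get_headlen today.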

Stanislav Fomichev (7):
  net: introduce __init_skb and __init_skb_shinfo helpers
  net: introduce skb_net helper
  net: plumb network namespace into __skb_flow_dissect
  net: flow_dissector: handle no-skb use case
  bpf: when doing BPF_PROG_TEST_RUN for flow dissector use no-skb mode
  selftests/bpf: add flow dissector bpf_skb_load_bytes helper test
  net: flow_dissector: pass net argument to the eth_get_headlen

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   2 +-
 drivers/net/ethernet/hisilicon/hns/hns_enet.c |   3 +-
 .../net/ethernet/hisilicon/hns3/hns3_enet.c   |   3 +-
 drivers/net/ethernet/intel/fm10k/fm10k_main.c |   2 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |   3 +-
 drivers/net/ethernet/intel/iavf/iavf_txrx.c   |   2 +-
 drivers/net/ethernet/intel/ice/ice_txrx.c     |   2 +-
 drivers/net/ethernet/intel/igb/igb_main.c     |   2 +-
 drivers/net/ethernet/intel/igc/igc_main.c     |   2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   2 +-
 .../net/ethernet/intel/ixgbevf/ixgbevf_main.c |   3 +-
 .../net/ethernet/mellanox/mlx5/core/en_tx.c   |   3 +-
 drivers/net/tun.c                             |   3 +-
 include/linux/etherdevice.h                   |   2 +-
 include/linux/skbuff.h                        |  23 +++-
 net/bpf/test_run.c                            |  52 +++------
 net/core/flow_dissector.c                     | 105 +++++++++++++-----
 net/core/skbuff.c                             |  78 +++++++------
 net/ethernet/eth.c                            |   8 +-
 tools/testing/selftests/bpf/test_progs.c      |  49 ++++++++
 20 files changed, 227 insertions(+), 122 deletions(-)

-- 
2.20.1.611.gfbb209baf1-goog

^ permalink raw reply

