Netdev List
 help / color / mirror / Atom feed
* Re: XDP redirect measurements, gotchas and tracepoints
From: Jesper Dangaard Brouer @ 2017-08-22  6:37 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: xdp-newbies@vger.kernel.org, John Fastabend, Daniel Borkmann,
	Andy Gospodarek, netdev@vger.kernel.org, Paweł Staszewski,
	brouer
In-Reply-To: <20170821223540.atktdricmks26c27@ast-mbp>

On Mon, 21 Aug 2017 15:35:42 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Mon, Aug 21, 2017 at 09:25:06PM +0200, Jesper Dangaard Brouer wrote:
> > 
> > Third gotcha(3): You got this far, loaded xdp on both interfaces, and
> > notice now that (with default setup) you can RX with 14Mpps but only
> > TX with 6.9Mpps (and might have 5% idle cycles).  I debugged this via
> > perf tracepoint event xdp:xdp_redirect, and found this was due to
> > overrunning the xdp TX ring-queue size.  
> 
> we should probably fix this somehow.

Gotcha-3 (quoted above) is an interesting problem.  At first it looks
like a driver tuning problem. But it is actually an inherent property
of XDP, as there is no-queue or push-back flow control with XDP, there
is no way to handle TX queue overrun.
 My proposed solution: I want to provide a facility for userspace to
load another eBPF program (at the tracepoint xdp_redirect), which can
"see" the issue occurring.  This allows a XDP/BPF developer to
implement their own reaction/mitigation flow-control (e.g. via a map
shared with the XDP program).


> Once tx-ing netdev added to devmap we can enable xdp on it automatically?

I think you are referring to Gotcha-2 here:

  Second gotcha(2): you cannot TX out a device, unless it also have a
  xdp bpf program attached. (This is an implicit dependency, as the
  driver code need to setup XDP resources before it can ndo_xdp_xmit).

Yes, we should work on improving this situation.  Auto enabling XDP
when a netdev is added to a devmap is a good solution.  Currently this
is tied to loading an XDP bpf_prog.  Do you propose loading a dummy
bpf_prog on the netdev? (then we need to handle 1. not replacing
existing bpf_prog, 2. on take-down don't remove "later" loaded
bpf_prog).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH v4 net-next] arm: eBPF JIT compiler
From: Shubham Bansal @ 2017-08-22  7:18 UTC (permalink / raw)
  To: Russell King - ARM Linux, David Miller
  Cc: Network Development, Alexei Starovoitov, Daniel Borkmann,
	linux-arm-kernel, LKML, Kees Cook, Shubham Bansal
In-Reply-To: <1503383772-5788-1-git-send-email-illusionist.neo@gmail.com>

Russell, David, Alexei, Daniel and Kees. Please check this patch and
lets finish it.

Thanks.

^ permalink raw reply

* Re: [PATCH net-next v4] arm: eBPF JIT compiler
From: Shubham Bansal @ 2017-08-22  7:29 UTC (permalink / raw)
  To: Russell King - ARM Linux, David Miller
  Cc: Network Development, Alexei Starovoitov, Daniel Borkmann,
	linux-arm-kernel, LKML, Kees Cook, Andrew Lunn, Shubham Bansal
In-Reply-To: <1503383296-5492-1-git-send-email-illusionist.neo@gmail.com>

Please ignore this mail. Sent it by mistake.

^ permalink raw reply

* Re: [PATCH v4 net-next] arm: eBPF JIT compiler
From: Shubham Bansal @ 2017-08-22  7:30 UTC (permalink / raw)
  To: Russell King - ARM Linux, David Miller
  Cc: Network Development, Alexei Starovoitov, Daniel Borkmann,
	linux-arm-kernel, LKML, Kees Cook, Andrew Lunn, Shubham Bansal
In-Reply-To: <1503383678-5740-1-git-send-email-illusionist.neo@gmail.com>

Please ignore this mail. Sent it by mistake.
Sent the correct patch later on.

^ permalink raw reply

* Re: [PATCH] ptp: make ptp_clock_info const
From: Richard Cochran @ 2017-08-22  7:50 UTC (permalink / raw)
  To: Bhumika Goyal; +Cc: Julia Lawall, netdev, Linux Kernel Mailing List
In-Reply-To: <CAOH+1jGDjGusOCYv+S4P9LGq9vbnYtMF1-NfnyjfGYiM2ugGpg@mail.gmail.com>

On Tue, Aug 22, 2017 at 12:06:23PM +0530, Bhumika Goyal wrote:
> By testing I meant that they have been checked by the compiler. I
> haven't run the code on a hardware or some test data.

Yes, compiling is good enough for this change.

Thanks,
Richard

> > Acked-by: Richard Cochran <richardcochran@gmail.com>

^ permalink raw reply

* Re: [PATCH v3 3/4] net: stmmac: register parent MDIO node for sun8i-h3-emac
From: Corentin Labbe @ 2017-08-22  7:59 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Maxime Ripard, Chen-Yu Tsai, Rob Herring, Mark Rutland,
	Russell King, Giuseppe Cavallaro, Alexandre Torgue, devicetree,
	linux-arm-kernel, linux-kernel, netdev
In-Reply-To: <20170821142321.GE1703@lunn.ch>

On Mon, Aug 21, 2017 at 04:23:21PM +0200, Andrew Lunn wrote:
> > All muxes are mostly always represented the same way afaik, or do you
> > want to simply introduce a new compatible / property?
> 
> +      	  mdio-mux {
> +		compatible = "allwinner,sun8i-h3-mdio-switch";
> +		mdio-parent-bus = <&mdio_parent>;
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +
> +		internal_mdio: mdio@1 {
> 			reg = <1>;
> -			clocks = <&ccu CLK_BUS_EPHY>;
> -			resets = <&ccu RST_BUS_EPHY>;
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +			int_mii_phy: ethernet-phy@1 {
> +				compatible = "ethernet-phy-ieee802.3-c22";
> +				reg = <1>;
> +				clocks = <&ccu CLK_BUS_EPHY>;
> +				resets = <&ccu RST_BUS_EPHY>;
> +				phy-is-integrated;
> +			};
> +		};
> +		mdio: mdio@0 {
> +			reg = <0>;
> +			#address-cells = <1>;
> +			#size-cells = <0>;
>  		};
>  		
> Hi Maxim
> 
> Anybody who knows the MDIO-mux code/binding, knows that it is a run
> time mux. You swap the mux per MDIO transaction. You can access all
> the PHY and switches on the mux'ed MDIO bus.
> 
> However here, it is effectively a boot-time MUX. You cannot change it
> on the fly. What happens when somebody has a phandle to a PHY on the
> internal and a phandle to a phy on the external? Does the driver at
> least return -EINVAL, or -EBUSY? Is there a representation which
> eliminates this possibility?
> 

Exactly you can change it on the fly, but you need to reset the MAC for enabling the new configuration.
The stmmac driver does not handle mdio-mux. It is why I have a patch which automaticly select parent MDIO node of the PHY node.

For representation we could keep the current. (With a big comment stating that it is a switch)

We can also add a mdio-switch type node, or add a mdio-switch property to mdio-mux.

Regards

^ permalink raw reply

* [PATCH] ravb: make mdiobb_ops const
From: Bhumika Goyal @ 2017-08-22  8:08 UTC (permalink / raw)
  To: julia.lawall, sergei.shtylyov, netdev, linux-renesas-soc,
	linux-kernel
  Cc: Bhumika Goyal

Make these const as they are only stored in a const field of a
mdiobb_ctrl structure.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
---
 drivers/net/ethernet/renesas/ravb_main.c | 2 +-
 drivers/net/ethernet/renesas/sh_eth.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index fdf30bf..6ffd9e4 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -172,7 +172,7 @@ static int ravb_get_mdio_data(struct mdiobb_ctrl *ctrl)
 }
 
 /* MDIO bus control struct */
-static struct mdiobb_ops bb_ops = {
+static const struct mdiobb_ops bb_ops = {
 	.owner = THIS_MODULE,
 	.set_mdc = ravb_set_mdc,
 	.set_mdio_dir = ravb_set_mdio_dir,
diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index d2e88a3..8af353f 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1119,7 +1119,7 @@ static void sh_mdc_ctrl(struct mdiobb_ctrl *ctrl, int bit)
 }
 
 /* mdio bus control struct */
-static struct mdiobb_ops bb_ops = {
+static const struct mdiobb_ops bb_ops = {
 	.owner = THIS_MODULE,
 	.set_mdc = sh_mdc_ctrl,
 	.set_mdio_dir = sh_mmd_ctrl,
-- 
1.9.1

^ permalink raw reply related

* [PATCH] net: ethernet: ax88796: make mdiobb_ops const
From: Bhumika Goyal @ 2017-08-22  8:11 UTC (permalink / raw)
  To: julia.lawall, davem, netdev, linux-kernel; +Cc: Bhumika Goyal

Make this const as it is only stored in a const field of a
mdiobb_ctrl structure.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
---
 drivers/net/ethernet/8390/ax88796.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/8390/ax88796.c b/drivers/net/ethernet/8390/ax88796.c
index 05d9d3e..2455547 100644
--- a/drivers/net/ethernet/8390/ax88796.c
+++ b/drivers/net/ethernet/8390/ax88796.c
@@ -585,7 +585,7 @@ static int ax_bb_get_data(struct mdiobb_ctrl *ctrl)
 	return reg_memr & AX_MEMR_MDI ? 1 : 0;
 }
 
-static struct mdiobb_ops bb_ops = {
+static const struct mdiobb_ops bb_ops = {
 	.owner = THIS_MODULE,
 	.set_mdc = ax_bb_mdc,
 	.set_mdio_dir = ax_bb_dir,
-- 
1.9.1

^ permalink raw reply related

* [PATCH] drivers: net: wireless: atmel: check memory allocation failure
From: Himanshu Jha @ 2017-08-22  8:11 UTC (permalink / raw)
  To: kvalo; +Cc: linux-wireless, linux-kernel, netdev, Himanshu Jha

Check memory allocation failure and return -ENOMEM if failure
occurs.

Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
---
 drivers/net/wireless/atmel/at76c50x-usb.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/atmel/at76c50x-usb.c b/drivers/net/wireless/atmel/at76c50x-usb.c
index 09defbc..73f0924 100644
--- a/drivers/net/wireless/atmel/at76c50x-usb.c
+++ b/drivers/net/wireless/atmel/at76c50x-usb.c
@@ -940,7 +940,7 @@ static void at76_dump_mib_mac_addr(struct at76_priv *priv)
 					 GFP_KERNEL);
 
 	if (!m)
-		return;
+		return -ENOMEM;
 
 	ret = at76_get_mib(priv->udev, MIB_MAC_ADDR, m,
 			   sizeof(struct mib_mac_addr));
@@ -969,7 +969,7 @@ static void at76_dump_mib_mac_wep(struct at76_priv *priv)
 	struct mib_mac_wep *m = kmalloc(sizeof(struct mib_mac_wep), GFP_KERNEL);
 
 	if (!m)
-		return;
+		return -ENOMEM;
 
 	ret = at76_get_mib(priv->udev, MIB_MAC_WEP, m,
 			   sizeof(struct mib_mac_wep));
@@ -1006,7 +1006,7 @@ static void at76_dump_mib_mac_mgmt(struct at76_priv *priv)
 					 GFP_KERNEL);
 
 	if (!m)
-		return;
+		return -ENOMEM;
 
 	ret = at76_get_mib(priv->udev, MIB_MAC_MGMT, m,
 			   sizeof(struct mib_mac_mgmt));
@@ -1043,7 +1043,7 @@ static void at76_dump_mib_mac(struct at76_priv *priv)
 	struct mib_mac *m = kmalloc(sizeof(struct mib_mac), GFP_KERNEL);
 
 	if (!m)
-		return;
+		return -ENOMEM;
 
 	ret = at76_get_mib(priv->udev, MIB_MAC, m, sizeof(struct mib_mac));
 	if (ret < 0) {
@@ -1080,7 +1080,7 @@ static void at76_dump_mib_phy(struct at76_priv *priv)
 	struct mib_phy *m = kmalloc(sizeof(struct mib_phy), GFP_KERNEL);
 
 	if (!m)
-		return;
+		return -ENOMEM;
 
 	ret = at76_get_mib(priv->udev, MIB_PHY, m, sizeof(struct mib_phy));
 	if (ret < 0) {
@@ -1113,7 +1113,7 @@ static void at76_dump_mib_local(struct at76_priv *priv)
 	struct mib_local *m = kmalloc(sizeof(*m), GFP_KERNEL);
 
 	if (!m)
-		return;
+		return -ENOMEM;
 
 	ret = at76_get_mib(priv->udev, MIB_LOCAL, m, sizeof(*m));
 	if (ret < 0) {
@@ -1138,7 +1138,7 @@ static void at76_dump_mib_mdomain(struct at76_priv *priv)
 	struct mib_mdomain *m = kmalloc(sizeof(struct mib_mdomain), GFP_KERNEL);
 
 	if (!m)
-		return;
+		return -ENOMEM;
 
 	ret = at76_get_mib(priv->udev, MIB_MDOMAIN, m,
 			   sizeof(struct mib_mdomain));
-- 
2.7.4

^ permalink raw reply related

* [PATCH] net: mdio-gpio: make mdiobb_ops const
From: Bhumika Goyal @ 2017-08-22  8:13 UTC (permalink / raw)
  To: julia.lawall, andrew, f.fainelli, netdev, linux-kernel; +Cc: Bhumika Goyal

Make this const as it is only stored in a const field of a
mdiobb_ctrl structure.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
---
 drivers/net/phy/mdio-gpio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/mdio-gpio.c b/drivers/net/phy/mdio-gpio.c
index 7faa79b..4333c6e 100644
--- a/drivers/net/phy/mdio-gpio.c
+++ b/drivers/net/phy/mdio-gpio.c
@@ -116,7 +116,7 @@ static void mdc_set(struct mdiobb_ctrl *ctrl, int what)
 	gpiod_set_value(bitbang->mdc, what);
 }
 
-static struct mdiobb_ops mdio_gpio_ops = {
+static const struct mdiobb_ops mdio_gpio_ops = {
 	.owner = THIS_MODULE,
 	.set_mdc = mdc_set,
 	.set_mdio_dir = mdio_dir,
-- 
1.9.1

^ permalink raw reply related

* [PATCH] net: ethernet: freescale: fs_enet: make mdiobb_ops const
From: Bhumika Goyal @ 2017-08-22  8:15 UTC (permalink / raw)
  To: julia.lawall, pantelis.antoniou, vbordug, linuxppc-dev, netdev,
	linux-kernel
  Cc: Bhumika Goyal

Make this const as it is only stored in a const field of a
mdiobb_ctrl structure.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
---
 drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c b/drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c
index 1f015ed..c8e5d88 100644
--- a/drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c
+++ b/drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c
@@ -100,7 +100,7 @@ static inline void mdc(struct mdiobb_ctrl *ctrl, int what)
 	in_be32(bitbang->dat);
 }
 
-static struct mdiobb_ops bb_ops = {
+static const struct mdiobb_ops bb_ops = {
 	.owner = THIS_MODULE,
 	.set_mdc = mdc,
 	.set_mdio_dir = mdio_dir,
-- 
1.9.1

^ permalink raw reply related

* [patch net] mlxsw: spectrum_switchdev: Fix mrouter flag update
From: Jiri Pirko @ 2017-08-22  8:28 UTC (permalink / raw)
  To: netdev; +Cc: davem, nogahf, idosch, mlxsw

From: Nogah Frankel <nogahf@mellanox.com>

Update the value of the mrouter flag in struct mlxsw_sp_bridge_port when
it is being changed.

Fixes: c57529e1d5d8 ("mlxsw: spectrum: Replace vPorts with Port-VLAN")
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 5eb1606..d39ffbf 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -705,6 +705,7 @@ static int mlxsw_sp_port_attr_mc_router_set(struct mlxsw_sp_port *mlxsw_sp_port,
 					    bool is_port_mc_router)
 {
 	struct mlxsw_sp_bridge_port *bridge_port;
+	int err;
 
 	if (switchdev_trans_ph_prepare(trans))
 		return 0;
@@ -715,11 +716,17 @@ static int mlxsw_sp_port_attr_mc_router_set(struct mlxsw_sp_port *mlxsw_sp_port,
 		return 0;
 
 	if (!bridge_port->bridge_device->multicast_enabled)
-		return 0;
+		goto out;
 
-	return mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, bridge_port,
-						    MLXSW_SP_FLOOD_TYPE_MC,
-						    is_port_mc_router);
+	err = mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, bridge_port,
+						   MLXSW_SP_FLOOD_TYPE_MC,
+						   is_port_mc_router);
+	if (err)
+		return err;
+
+out:
+	bridge_port->mrouter = is_port_mc_router;
+	return 0;
 }
 
 static int mlxsw_sp_port_mc_disabled_set(struct mlxsw_sp_port *mlxsw_sp_port,
-- 
2.9.3

^ permalink raw reply related

* RE: [PATCH net-next v4] openvswitch: enable NSH support
From: Jan Scheurich @ 2017-08-22  8:32 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Yang, Yi, netdev@vger.kernel.org, dev@openvswitch.org,
	blp@ovn.org, e@erig.me
In-Reply-To: <20170821135019.5c65b888@griffin>

> > If I understand correctly, this is a default definition that can be
> > overridden by drivers so that we still cannot rely on the Ethernet
> > payload always being 32-bit-aligned.
> 
> Not by drivers, by architectures. Only architectures that can efficiently
> handle unaligned access (or on which the cost of handling unaligned access
> is lower than the cost of unaligned DMA access) set NET_IP_ALIGN to 0.

Thanks, got it!

> 
> > Furthermore, there seem to be machine architectures that cannot handle
> > misaligned 32-bit access at all (not even with a performance penalty).
> 
> Those leave NET_IP_ALIGN to 2.

Dito

> 
> > Or why else does OVS user space code take so great pain to model
> > possible misalignment and provide/use safe access functions?
> 
> I don't know how the ovs user space deals with packet allocation. In the
> kernel, the network header is aligned in a way that it allows efficient 32bit
> access.

It seems that OVS has not had the same approach as Linux. There is no config parameter covering the alignment characteristics of the machine architecture. For packets buffers received from outside sources (e.g. DPDK interfaces) they make no assumptions on alignment and play safe. For packets allocated inside OVS, the Ethernet packet is typically stored so that the L3 header is 32-bit aligned, so that the misalignment precautions would be unnecessary. But I didn't check all code paths.

^ permalink raw reply

* Re: [PATCH] drivers: net: wireless: atmel: check memory allocation failure
From: Matteo Croce @ 2017-08-22  8:41 UTC (permalink / raw)
  To: Himanshu Jha, kvalo; +Cc: linux-wireless, linux-kernel, netdev
In-Reply-To: <1503389481-4988-1-git-send-email-himanshujha199640@gmail.com>

Il giorno mar, 22/08/2017 alle 13.41 +0530, Himanshu Jha ha scritto:
> Check memory allocation failure and return -ENOMEM if failure
> occurs.
> 
> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
> ---
>  drivers/net/wireless/atmel/at76c50x-usb.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/wireless/atmel/at76c50x-usb.c
> b/drivers/net/wireless/atmel/at76c50x-usb.c
> index 09defbc..73f0924 100644
> --- a/drivers/net/wireless/atmel/at76c50x-usb.c
> +++ b/drivers/net/wireless/atmel/at76c50x-usb.c
> @@ -940,7 +940,7 @@ static void at76_dump_mib_mac_addr(struct
> at76_priv *priv)
>  					 GFP_KERNEL);
>  
>  	if (!m)
> -		return;
> +		return -ENOMEM;
>  
>  	ret = at76_get_mib(priv->udev, MIB_MAC_ADDR, m,
>  			   sizeof(struct mib_mac_addr));
> @@ -969,7 +969,7 @@ static void at76_dump_mib_mac_wep(struct
> at76_priv *priv)
>  	struct mib_mac_wep *m = kmalloc(sizeof(struct mib_mac_wep),
> GFP_KERNEL);
>  
>  	if (!m)
> -		return;
> +		return -ENOMEM;
>  
>  	ret = at76_get_mib(priv->udev, MIB_MAC_WEP, m,
>  			   sizeof(struct mib_mac_wep));
> @@ -1006,7 +1006,7 @@ static void at76_dump_mib_mac_mgmt(struct
> at76_priv *priv)
>  					 GFP_KERNEL);
>  
>  	if (!m)
> -		return;
> +		return -ENOMEM;
>  
>  	ret = at76_get_mib(priv->udev, MIB_MAC_MGMT, m,
>  			   sizeof(struct mib_mac_mgmt));
> @@ -1043,7 +1043,7 @@ static void at76_dump_mib_mac(struct at76_priv
> *priv)
>  	struct mib_mac *m = kmalloc(sizeof(struct mib_mac),
> GFP_KERNEL);
>  
>  	if (!m)
> -		return;
> +		return -ENOMEM;
>  
>  	ret = at76_get_mib(priv->udev, MIB_MAC, m, sizeof(struct
> mib_mac));
>  	if (ret < 0) {
> @@ -1080,7 +1080,7 @@ static void at76_dump_mib_phy(struct at76_priv
> *priv)
>  	struct mib_phy *m = kmalloc(sizeof(struct mib_phy),
> GFP_KERNEL);
>  
>  	if (!m)
> -		return;
> +		return -ENOMEM;
>  
>  	ret = at76_get_mib(priv->udev, MIB_PHY, m, sizeof(struct
> mib_phy));
>  	if (ret < 0) {
> @@ -1113,7 +1113,7 @@ static void at76_dump_mib_local(struct
> at76_priv *priv)
>  	struct mib_local *m = kmalloc(sizeof(*m), GFP_KERNEL);
>  
>  	if (!m)
> -		return;
> +		return -ENOMEM;
>  
>  	ret = at76_get_mib(priv->udev, MIB_LOCAL, m, sizeof(*m));
>  	if (ret < 0) {
> @@ -1138,7 +1138,7 @@ static void at76_dump_mib_mdomain(struct
> at76_priv *priv)
>  	struct mib_mdomain *m = kmalloc(sizeof(struct mib_mdomain),
> GFP_KERNEL);
>  
>  	if (!m)
> -		return;
> +		return -ENOMEM;
>  
>  	ret = at76_get_mib(priv->udev, MIB_MDOMAIN, m,
>  			   sizeof(struct mib_mdomain));

Perhaps these functions should return something instead of being void.

Regards,

-- 
Matteo Croce
per aspera ad upstream

^ permalink raw reply

* [PATCH] ARM: dts: rk3228-evb: Fix the compiling error
From: David Wu @ 2017-08-22  9:24 UTC (permalink / raw)
  To: davem; +Cc: sfr, netdev, David Wu

This patch solves the following error:
arch/arm/boot/dts/rk3228-evb.dtb: ERROR (phandle_references): Reference to non-existent node or label "phy0"

Fixess db40f15b53e4 ("ARM: dts: rk3228-evb: Enable the integrated PHY for gmac")
Signed-off-by: David Wu <david.wu@rock-chips.com>
---
 arch/arm/boot/dts/rk3228-evb.dts | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/rk3228-evb.dts b/arch/arm/boot/dts/rk3228-evb.dts
index 456ddf7..1be9daa 100644
--- a/arch/arm/boot/dts/rk3228-evb.dts
+++ b/arch/arm/boot/dts/rk3228-evb.dts
@@ -76,7 +76,7 @@
 	clock_in_out = "output";
 	phy-supply = <&vcc_phy>;
 	phy-mode = "rmii";
-	phy-handle = <&phy0>;
+	phy-handle = <&phy>;
 	status = "okay";
 
 	mdio {
@@ -84,7 +84,7 @@
 		#address-cells = <1>;
 		#size-cells = <0>;
 
-		phy@0 {
+		phy: phy@0 {
 			compatible = "ethernet-phy-id1234.d400", "ethernet-phy-ieee802.3-c22";
 			reg = <0>;
 			clocks = <&cru SCLK_MAC_PHY>;
-- 
1.9.1

^ permalink raw reply related

* Re: [PATCH net-next v4] openvswitch: enable NSH support
From: Yang, Yi @ 2017-08-22  9:38 UTC (permalink / raw)
  To: Jiri Benc
  Cc: netdev@vger.kernel.org, dev@openvswitch.org, blp@ovn.org,
	e@erig.me, jan.scheurich@ericsson.com
In-Reply-To: <20170821114713.7c6ebb9a@griffin>

On Mon, Aug 21, 2017 at 05:47:13PM +0800, Jiri Benc wrote:
> 
> > I don't know how we can support this, is it a must-have thing?
> 
> What would happen if you get a GSO packet? Ports of an ovs bridge claim
> GSO support, thus they may get a GSO packet. You have to handle it one
> way or the other: either software segment the packet before pushing the
> header, or implement proper GSO support for NSH.
> 

I checked GSO support code, we have two cases to handle, one case is to
output NSH packet to VxLAN-gpe port, the other case is to output NSH packet
to Ethernet port (tap, veth, vhost, physical eth, etc.), I don't think
it is a good way to do GSO support in OVS module, because we can't know
the final output port at this point, if the final output port is
VxLAN-gpe port, it will still face GSO issue, but I didn't see vxlan
module handles this, maybe udp tunnel did this. I think a NSH packet can
be split into two packets, one has NSH header, another one doesn't, so I'm not sure how vxlan module can avoid this situation.

For the second case, we can add a nsh GSO module in net/nsh/nsh_gso.c (as MPLS did in net/mpls/mpls_gso.c), that is the best way to handle this, what do you
think about it?

^ permalink raw reply

* Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
From: Stefan Hajnoczi @ 2017-08-22  9:54 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Michal Kubecek, jasowang@redhat.com, George Zhang,
	Stephen Hemminger, Rolf Neugebauer, Marcelo Cerri, Asias He,
	Dan Carpenter, olaf@aepfle.de, Haiyang Zhang, Dave Scott,
	apw@canonical.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, gregkh@linuxfoundation.org,
	joe@perches.com, devel@linuxdriverproject.org, Vitaly Kuznetsov,
	davem@davemloft.net, Jorgen S. Hansen
In-Reply-To: <KL1P15301MB00083B1F5B70CBAB7C42BBD3BF800@KL1P15301MB0008.APCP153.PROD.OUTLOOK.COM>

On Fri, Aug 18, 2017 at 11:07:37PM +0000, Dexuan Cui wrote:
> > From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > > CID is not really used by us, because we only support guest<->host
> > communication,
> > > and don't support guest<->guest communication. The Hyper-V host
> > references
> > > every VM by VmID (which is invisible to the VM), and a VM can only talk to
> > the
> > > host via this feature.
> > 
> > Applications running inside the guest should use VMADDR_CID_HOST (2) to
> > connect to the host, even on Hyper-V.
> I have no objection, and this patch does support this usage of the
> user-space applications.
>
> > By the way, we should collaborate on a test suite and a vsock(7) man
> > page that documents the semantics of AF_VSOCK sockets.  This way our
> > transports will have the same behavior and AF_VSOCK applications will
> > work on all 3 hypervisors.
> I can't agree more. :-)
> BTW, I have been using Rolf's test suite to test my patch:
> https://github.com/rn/virtsock/tree/master/c
> Maybe this can be a good starting point.

Thanks for sharing this, I will try it with virtio-vsock.

I have a netcat-like utility here:
https://github.com/stefanha/linux/blob/vsock-extras/nc-vsock.c

> > Not all features need to be supported.  For example, VMCI supports
> > SOCK_DGRAM while Hyper-V and virtio do not.  But features that are
> > available should behave identically.
> I totally agree, though I'm afraid Hyper-V may have a little more limitations
> compared to VMware/KVM duo to the <VM_ID, ServiceID> <--> <cid, port>
> mapping.
>  
> > > Can we use the 'protocol' parameter in the socket() function:
> > > int socket(int domain, int type, int protocol)
> > >
> > > IMO currently the 'protocol' is not really used.
> > > I think we can modify __vsock_core_init() to allow multiple transport layers
> > to
> > >  be registered, and we can define different 'protocol' numbers for
> > > VMware/KVM/Hyper-V, and ask the application to explicitly specify what
> > should
> > > be used. Considering compatibility, we can use the default transport in a
> > given
> > > VM depending on the underlying hypervisor.
> > 
> > I think AF_VSOCK should hide the transport from users/applications.
> Ideally yes, but let's consider the KVM-on-KVM nested scenario: when
> an application in the Level-1 VM creates an AF_VSOCK socket and call
> connect() for it, how can we know if the app is trying to connect to
> the Level-0 host, or connect to the Level-2 VM? We can't.

We *can* by looking at the destination CID.  Please take a look at
drivers/misc/vmw_vmci/vmci_route.c:vmci_route() to see how VMCI handles
nested virt.

It boils down to something like this:

  static int vsock_stream_connect(struct socket *sock, struct sockaddr *addr,
                                  int addr_len, int flags)
  {
      ...
      if (remote_addr.svm_cid == VMADDR_CID_HOST)
          transport = host_transport;
      else
          transport = guest_transport;

It's easy for connect(2) but Jorgen mentioned it's harder for listen(2)
because the socket would need to listen on both transports.  We define
two new constants VMADDR_CID_LISTEN_FROM_GUEST and
VMADDR_CID_LISTEN_FROM_HOST for bind(2) so that applications can decide
which side to listen on.  Or the listen socket could simply listen to
both sides.

Stefan

^ permalink raw reply

* Re: [PATCH net-next 3/3] hv_sock: implements Hyper-V transport for Virtual Sockets (AF_VSOCK)
From: Stefan Hajnoczi @ 2017-08-22 10:14 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	devel@linuxdriverproject.org, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, George Zhang, Jorgen Hansen, Michal Kubecek,
	Vitaly Kuznetsov, Cathy Avery, jasowang@redhat.com,
	Rolf Neugebauer, Dave Scott, Marcelo Cerri, apw@canonical.com,
	olaf@aepfle.de, joe@perches.com, "linux
In-Reply-To: <KL1P15301MB0008C56CCD01B69186EF3A93BF800@KL1P15301MB0008.APCP153.PROD.OUTLOOK.COM>

On Fri, Aug 18, 2017 at 10:23:54PM +0000, Dexuan Cui wrote:
> > From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > Sent: Thursday, August 17, 2017 07:56
> > To: Dexuan Cui <decui@microsoft.com>
> > On Tue, Aug 15, 2017 at 10:18:41PM +0000, Dexuan Cui wrote:
> > > +static u32 hvs_get_local_cid(void)
> > > +{
> > > +	return VMADDR_CID_ANY;
> > > +}
> > 
> > Interesting concept: the guest never knows its CID.  This is nice from a
> > live migration perspective.  Currently VMCI and virtio adjust listen
> > socket local CIDs after migration.
> > 
> > > +static bool hvs_stream_allow(u32 cid, u32 port)
> > > +{
> > > +	static const u32 valid_cids[] = {
> > > +		VMADDR_CID_ANY,
> > 
> > Is this for loopback?
> 
> No, we don't support lookback in Linux VM, at least for now.
> In our Linux implementation, Linux VM can only connect to the host, and
> here when Linux VM calls connect(), I treat  VMADDR_CID_ANY 
> the same as VMADDR_CID_HOST.

VMCI and virtio-vsock do not treat connect(VMADDR_CID_ANY) the same as
connect(VMADDR_CID_HOST).  It is an error to connect to VMADDR_CID_ANY.

> > > +		VMADDR_CID_HOST,
> > > +	};
> > > +	int i;
> > > +
> > > +	/* The host's port range [MIN_HOST_EPHEMERAL_PORT, 0xFFFFFFFF)
> > is
> > > +	 * reserved as ephemeral ports, which are used as the host's ports
> > > +	 * when the host initiates connections.
> > > +	 */
> > > +	if (port > MAX_HOST_LISTEN_PORT)
> > > +		return false;
> > 
> > Without this if statement the guest will attempt to connect.  I guess
> > there will be no listen sockets above MAX_HOST_LISTEN_PORT, so the
> > connection attempt will fail.
> 
> You're correct.
> To use the vsock common infrastructure, we have to map Hyper-V's
> GUID <VM_ID, Service_ID> to int <cid, port>, and hence we must limit
> the port range we can listen() on to [0, MAX_LISTEN_PORT], i.e.
> we can only use half of the whole 32-bit port space for listen().
> This is detailed in the long comments starting at about Line 100.
>  
> > ...but hardcode this knowledge into the guest driver?
> I'd like the guest's connect() to fail immediately here.
> IMO this is better than a connect timeout. :-)

Thanks for explaining.  Perhaps the comment could be updated:

 /* The host's port range [MIN_HOST_EPHEMERAL_PORT, 0xFFFFFFFF) is
  * reserved as ephemeral ports, which are used as the host's ports when
  * the host initiates connections.
  *
  * Perform this check in the guest so an immediate error is produced
  * instead of a timeout.
  */

Stefan

^ permalink raw reply

* [PATCH net v1 0/2] tipc: topology server fixes
From: Parthasarathy Bhuvaragan @ 2017-08-22 10:28 UTC (permalink / raw)
  To: davem; +Cc: jon.maloy, netdev, tipc-discussion, ying.xue

The following commits fixes two race conditions causing general
protection faults.

Parthasarathy Bhuvaragan (1):
  tipc: remove subscription references only for pending timers

Ying Xue (1):
  tipc: fix a race condition of releasing subscriber object

 net/tipc/subscr.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

-- 
2.1.4


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

^ permalink raw reply

* [PATCH net v1 1/2] tipc: remove subscription references only for pending timers
From: Parthasarathy Bhuvaragan @ 2017-08-22 10:28 UTC (permalink / raw)
  To: davem; +Cc: jon.maloy, netdev, tipc-discussion, ying.xue
In-Reply-To: <1503397721-19682-1-git-send-email-parthasarathy.bhuvaragan@ericsson.com>

In commit, 139bb36f754a ("tipc: advance the time of deleting
subscription from subscriber->subscrp_list"), we delete the
subscription from the subscribers list and from nametable
unconditionally. This leads to the following bug if the timer
running tipc_subscrp_timeout() in another CPU accesses the
subscription list after the subscription delete request.

[39.570] general protection fault: 0000 [#1] SMP
::
[39.574] task: ffffffff81c10540 task.stack: ffffffff81c00000
[39.575] RIP: 0010:tipc_subscrp_timeout+0x32/0x80 [tipc]
[39.576] RSP: 0018:ffff88003ba03e90 EFLAGS: 00010282
[39.576] RAX: dead000000000200 RBX: ffff88003f0f3600 RCX: 0000000000000101
[39.577] RDX: dead000000000100 RSI: 0000000000000201 RDI: ffff88003f0d7948
[39.578] RBP: ffff88003ba03ea0 R08: 0000000000000001 R09: ffff88003ba03ef8
[39.579] R10: 000000000000014f R11: 0000000000000000 R12: ffff88003f0d7948
[39.580] R13: ffff88003f0f3618 R14: ffffffffa006c250 R15: ffff88003f0f3600
[39.581] FS:  0000000000000000(0000) GS:ffff88003ba00000(0000) knlGS:0000000000000000
[39.582] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[39.583] CR2: 00007f831c6e0714 CR3: 000000003d3b0000 CR4: 00000000000006f0
[39.584] Call Trace:
[39.584]  <IRQ>
[39.585]  call_timer_fn+0x3d/0x180
[39.585]  ? tipc_subscrb_rcv_cb+0x260/0x260 [tipc]
[39.586]  run_timer_softirq+0x168/0x1f0
[39.586]  ? sched_clock_cpu+0x16/0xc0
[39.587]  __do_softirq+0x9b/0x2de
[39.587]  irq_exit+0x60/0x70
[39.588]  smp_apic_timer_interrupt+0x3d/0x50
[39.588]  apic_timer_interrupt+0x86/0x90
[39.589] RIP: 0010:default_idle+0x20/0xf0
[39.589] RSP: 0018:ffffffff81c03e58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[39.590] RAX: 0000000000000000 RBX: ffffffff81c10540 RCX: 0000000000000000
[39.591] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[39.592] RBP: ffffffff81c03e68 R08: 0000000000000000 R09: 0000000000000000
[39.593] R10: ffffc90001cbbe00 R11: 0000000000000000 R12: 0000000000000000
[39.594] R13: ffffffff81c10540 R14: 0000000000000000 R15: 0000000000000000
[39.595]  </IRQ>
::
[39.603] RIP: tipc_subscrp_timeout+0x32/0x80 [tipc] RSP: ffff88003ba03e90
[39.604] ---[ end trace 79ce94b7216cb459 ]---

Fixes: 139bb36f754a ("tipc: advance the time of deleting subscription from subscriber->subscrp_list")
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
---
 net/tipc/subscr.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/net/tipc/subscr.c b/net/tipc/subscr.c
index 0bf91cd3733c..f2c81f42dfda 100644
--- a/net/tipc/subscr.c
+++ b/net/tipc/subscr.c
@@ -52,7 +52,6 @@ struct tipc_subscriber {
 	struct list_head subscrp_list;
 };
 
-static void tipc_subscrp_delete(struct tipc_subscription *sub);
 static void tipc_subscrb_put(struct tipc_subscriber *subscriber);
 
 /**
@@ -197,15 +196,19 @@ static void tipc_subscrb_subscrp_delete(struct tipc_subscriber *subscriber,
 {
 	struct list_head *subscription_list = &subscriber->subscrp_list;
 	struct tipc_subscription *sub, *temp;
+	u32 timeout;
 
 	spin_lock_bh(&subscriber->lock);
 	list_for_each_entry_safe(sub, temp, subscription_list,  subscrp_list) {
 		if (s && memcmp(s, &sub->evt.s, sizeof(struct tipc_subscr)))
 			continue;
 
-		tipc_nametbl_unsubscribe(sub);
-		list_del(&sub->subscrp_list);
-		tipc_subscrp_delete(sub);
+		timeout = htohl(sub->evt.s.timeout, sub->swap);
+		if (timeout == TIPC_WAIT_FOREVER || del_timer(&sub->timer)) {
+			tipc_nametbl_unsubscribe(sub);
+			list_del(&sub->subscrp_list);
+			tipc_subscrp_put(sub);
+		}
 
 		if (s)
 			break;
@@ -236,14 +239,6 @@ static void tipc_subscrb_delete(struct tipc_subscriber *subscriber)
 	tipc_subscrb_put(subscriber);
 }
 
-static void tipc_subscrp_delete(struct tipc_subscription *sub)
-{
-	u32 timeout = htohl(sub->evt.s.timeout, sub->swap);
-
-	if (timeout == TIPC_WAIT_FOREVER || del_timer(&sub->timer))
-		tipc_subscrp_put(sub);
-}
-
 static void tipc_subscrp_cancel(struct tipc_subscr *s,
 				struct tipc_subscriber *subscriber)
 {
-- 
2.1.4


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

^ permalink raw reply related

* [PATCH net v1 2/2] tipc: fix a race condition of releasing subscriber object
From: Parthasarathy Bhuvaragan @ 2017-08-22 10:28 UTC (permalink / raw)
  To: davem; +Cc: jon.maloy, netdev, tipc-discussion, ying.xue
In-Reply-To: <1503397721-19682-1-git-send-email-parthasarathy.bhuvaragan@ericsson.com>

From: Ying Xue <ying.xue@windriver.com>

No matter whether a request is inserted into workqueue as a work item
to cancel a subscription or to delete a subscription's subscriber
asynchronously, the work items may be executed in different workers.
As a result, it doesn't mean that one request which is raised prior to
another request is definitely handled before the latter. By contrast,
if the latter request is executed before the former request, below
error may happen:

[  656.183644] BUG: spinlock bad magic on CPU#0, kworker/u8:0/12117
[  656.184487] general protection fault: 0000 [#1] SMP
[  656.185160] Modules linked in: tipc ip6_udp_tunnel udp_tunnel 9pnet_virtio 9p 9pnet virtio_net virtio_pci virtio_ring virtio [last unloaded: ip6_udp_tunnel]
[  656.187003] CPU: 0 PID: 12117 Comm: kworker/u8:0 Not tainted 4.11.0-rc7+ #6
[  656.187920] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  656.188690] Workqueue: tipc_rcv tipc_recv_work [tipc]
[  656.189371] task: ffff88003f5cec40 task.stack: ffffc90004448000
[  656.190157] RIP: 0010:spin_bug+0xdd/0xf0
[  656.190678] RSP: 0018:ffffc9000444bcb8 EFLAGS: 00010202
[  656.191375] RAX: 0000000000000034 RBX: ffff88003f8d1388 RCX: 0000000000000000
[  656.192321] RDX: ffff88003ba13708 RSI: ffff88003ba0cd08 RDI: ffff88003ba0cd08
[  656.193265] RBP: ffffc9000444bcd0 R08: 0000000000000030 R09: 000000006b6b6b6b
[  656.194208] R10: ffff8800bde3e000 R11: 00000000000001b4 R12: 6b6b6b6b6b6b6b6b
[  656.195157] R13: ffffffff81a3ca64 R14: ffff88003f8d1388 R15: ffff88003f8d13a0
[  656.196101] FS:  0000000000000000(0000) GS:ffff88003ba00000(0000) knlGS:0000000000000000
[  656.197172] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  656.197935] CR2: 00007f0b3d2e6000 CR3: 000000003ef9e000 CR4: 00000000000006f0
[  656.198873] Call Trace:
[  656.199210]  do_raw_spin_lock+0x66/0xa0
[  656.199735]  _raw_spin_lock_bh+0x19/0x20
[  656.200258]  tipc_subscrb_subscrp_delete+0x28/0xf0 [tipc]
[  656.200990]  tipc_subscrb_rcv_cb+0x45/0x260 [tipc]
[  656.201632]  tipc_receive_from_sock+0xaf/0x100 [tipc]
[  656.202299]  tipc_recv_work+0x2b/0x60 [tipc]
[  656.202872]  process_one_work+0x157/0x420
[  656.203404]  worker_thread+0x69/0x4c0
[  656.203898]  kthread+0x138/0x170
[  656.204328]  ? process_one_work+0x420/0x420
[  656.204889]  ? kthread_create_on_node+0x40/0x40
[  656.205527]  ret_from_fork+0x29/0x40
[  656.206012] Code: 48 8b 0c 25 00 c5 00 00 48 c7 c7 f0 24 a3 81 48 81 c1 f0 05 00 00 65 8b 15 61 ef f5 7e e8 9a 4c 09 00 4d 85 e4 44 8b 4b 08 74 92 <45> 8b 84 24 40 04 00 00 49 8d 8c 24 f0 05 00 00 eb 8d 90 0f 1f
[  656.208504] RIP: spin_bug+0xdd/0xf0 RSP: ffffc9000444bcb8
[  656.209798] ---[ end trace e2a800e6eb0770be ]---

In above scenario, the request of deleting subscriber was performed
earlier than the request of canceling a subscription although the
latter was issued before the former, which means tipc_subscrb_delete()
was called before tipc_subscrp_cancel(). As a result, when
tipc_subscrb_subscrp_delete() called by tipc_subscrp_cancel() was
executed to cancel a subscription, the subscription's subscriber
refcnt had been decreased to 1. After tipc_subscrp_delete() where
the subscriber was freed because its refcnt was decremented to zero,
but the subscriber's lock had to be released, as a consequence, panic
happened.

By contrast, if we increase subscriber's refcnt before
tipc_subscrb_subscrp_delete() is called in tipc_subscrp_cancel(),
the panic issue can be avoided.

Fixes: d094c4d5f5c7 ("tipc: add subscription refcount to avoid invalid delete")
Reported-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/tipc/subscr.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/tipc/subscr.c b/net/tipc/subscr.c
index f2c81f42dfda..be3d9e3183dc 100644
--- a/net/tipc/subscr.c
+++ b/net/tipc/subscr.c
@@ -242,7 +242,9 @@ static void tipc_subscrb_delete(struct tipc_subscriber *subscriber)
 static void tipc_subscrp_cancel(struct tipc_subscr *s,
 				struct tipc_subscriber *subscriber)
 {
+	tipc_subscrb_get(subscriber);
 	tipc_subscrb_subscrp_delete(subscriber, s);
+	tipc_subscrb_put(subscriber);
 }
 
 static struct tipc_subscription *tipc_subscrp_create(struct net *net,
-- 
2.1.4


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

^ permalink raw reply related

* Re: [iproute PATCH v2 1/7] nstat: Avoid passing negative fd to fdopen()
From: Phil Sutter @ 2017-08-22 10:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20170821172323.47c335f3@xeon-e3>

On Mon, Aug 21, 2017 at 05:23:23PM -0700, Stephen Hemminger wrote:
> On Mon, 21 Aug 2017 19:08:07 +0200
> Phil Sutter <phil@nwl.cc> wrote:
> 
> > Introduce a wrapper which does the sanity checking and returns NULL
> > in case fd is invalid.
> > 
> > Signed-off-by: Phil Sutter <phil@nwl.cc>
> > ---
> >  misc/nstat.c | 15 +++++++++++----
> >  1 file changed, 11 insertions(+), 4 deletions(-)
> > 
> > diff --git a/misc/nstat.c b/misc/nstat.c
> > index 1212b1f2c8128..7cdde75a56e4e 100644
> > --- a/misc/nstat.c
> > +++ b/misc/nstat.c
> > @@ -252,9 +252,16 @@ static void load_ugly_table(FILE *fp)
> >  	}
> >  }
> >  
> > +static FILE *fdopen_null(int fd, const char *mode)
> > +{
> > +	if (fd < 0)
> > +		return NULL;
> > +	return fdopen(fd, mode);
> > +}
> > +
> >  static void load_sctp_snmp(void)
> >  {
> > -	FILE *fp = fdopen(net_sctp_snmp_open(), "r");
> > +	FILE *fp = fdopen_null(net_sctp_snmp_open(), "r");
> >  
> >  	if (fp) {
> >  		load_good_table(fp);
[...]
> 
> Why not just fix it at the source of the open.
> I.e 
> static FILE *generic_proc_open(condt char * env, const char *name)
> {
> ...
> 	return fopen(p, "r");
> }

What should it do? All the load_*() functions are supposed to be
fault-tolerant, so calling abort() or something is not an option.

Thanks, Phil

^ permalink raw reply

* Re: [RFC net-next v2] bridge lwtunnel, VPLS & NVGRE
From: Nikolay Aleksandrov @ 2017-08-22 11:01 UTC (permalink / raw)
  To: Stephen Hemminger, David Lamparter
  Cc: netdev, roopa, bridge, amine.kherbouche
In-Reply-To: <20170821170151.5b12a392@xeon-e3>

On 22/08/17 03:01, Stephen Hemminger wrote:
> On Mon, 21 Aug 2017 19:15:17 +0200
> David Lamparter <equinox@diac24.net> wrote:
> 
>> Hi all,
>>
>>
>> this is an update on the earlier "[RFC net-next] VPLS support".  Note
>> I've changed the subject lines on some of the patches to better reflect
>> what they really do (tbh the earlier subject lines were crap.)
>>
>> As previously, iproute2 / FRR patches are at:
>> - https://github.com/eqvinox/vpls-iproute2
>> - https://github.com/opensourcerouting/frr/commits/vpls
>> while this patchset is also available at:
>> - https://github.com/eqvinox/vpls-linux-kernel
>> (but please be aware that I'm amending and rebasing commits)
>>
>> The NVGRE implementation in the 3rd patch in this series is actually an
>> accident - I was just wiring up gretap as a reference;  only after I was
>> done I noticed that that sums up to NVGRE, more or less.  IMHO, it does
>> serve well to demonstrate the bridge changes are not VPLS-specific.
>>
>> To refer some notes from the first announce mail:
>>> I've tested some basic setups, the chain from LDP down into the kernel
>>> works at least in these.  FRR has some testcases around from OpenBSD
>>> VPLS support, I haven't wired that up to run against Linux / this
>>> patchset yet.  
>>
>> Same as before (API didn't change).
>>
>>> The patchset needs a lot of polishing (yes I left my TODO notes in the
>>> commit messages), for now my primary concern is overall design
>>> feedback.  Roopa has already provided a lot of input (Thanks!);  the
>>> major topic I'm expecting to get discussion on is the bridge FDB
>>> changes.  
>>
>> Got some useful input;  but still need feedback on the bridge FDB
>> changes (first 2 patches).  I don't believe it to have a significant
>> impact on existing bridge operation, and I believe a multipoint tunnel
>> driver without its own FDB (e.g. NVGRE in this set) should perform
>> better than one with its own FDB (e.g. existing VXLAN).
>>
>>> P.S.: For a little context on the bridge FDB changes - I'm hoping to
>>> find some time to extend this to the MDB to allow aggregating dst
>>> metadata and handing down a list of dst metas on TX.  This isn't
>>> specifically for VPLS but rather to give sufficient information to the
>>> 802.11 stack to allow it to optimize selecting rates (or unicasting)
>>> for multicast traffic by having the multicast subscriber list known.
>>> This is done by major commercial wifi solutions (e.g. google "dynamic
>>> multicast optimization".)  
>>
>> You can find hacks at this on:
>> https://github.com/eqvinox/vpls-linux-kernel/tree/mdb-hack
>> Please note that the patches in that branch are not at an acceptable
>> quality level, but you can see the semantic relation to 802.11.
>>
>> I would, however, like to point out that this branch has pseudo-working
>> IGMP/MLD snooping for VPLS, and it'd be 20-ish lines to add it to NVGRE
>> (I'll do that as soon as I get to it, it'll pop up on that branch too.)
>>
>> This is relevant to the discussion because it's a feature which is
>> non-obvious (to me) on how to do with the VXLAN model of having an
>> entirely separate FDB.  Meanwhile, with this architecture, the proof of
>> concept / hack is coming in at a measly cost of:
>> 8 files changed, 176 insertions(+), 15 deletions(-)
>>
>>
>> Cheers,
>>
>> -David
>>
>>
>> --- diffstat:
>> include/linux/netdevice.h      |  18 ++++++
>> include/net/dst_metadata.h     |  51 ++++++++++++++---
>> include/net/ip_tunnels.h       |   5 ++
>> include/uapi/linux/lwtunnel.h  |   8 +++
>> include/uapi/linux/neighbour.h |   2 +
>> include/uapi/linux/rtnetlink.h |   5 ++
>> net/bridge/br.c                |   2 +-
>> net/bridge/br_device.c         |   4 ++
>> net/bridge/br_fdb.c            | 119 ++++++++++++++++++++++++++++++++--------
>> net/bridge/br_input.c          |   6 +-
>> net/bridge/br_private.h        |   6 +-
>> net/core/lwtunnel.c            |   1 +
>> net/ipv4/ip_gre.c              |  40 ++++++++++++--
>> net/ipv4/ip_tunnel.c           |   1 +
>> net/ipv4/ip_tunnel_core.c      |  87 +++++++++++++++++++++++------
>> net/mpls/Kconfig               |  11 ++++
>> net/mpls/Makefile              |   1 +
>> net/mpls/af_mpls.c             | 113 ++++++++++++++++++++++++++++++++------
>> net/mpls/internal.h            |  44 +++++++++++++--
>> net/mpls/vpls.c                | 550 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 20 files changed, 990 insertions(+), 84 deletions(-)
> 
> I know the bridge is an easy target to extend L2 forwarding, but it is not
> the only option. Have you condidered building a new driver (like VXLAN does)
> which does the forwarding you want. Having all features in one driver
> makes for worse performance, and increased complexity.
> 

+1

As I said before, a separate implementation will be much cleaner and will not affect
the bridge in any way, paying both performance and complexity price for something that
the majority of users will not be using isn't worth it.  In addition this creates a
silent dependency between the bridge and the fdb metadata dst users, it would be much
more preferable to be able to run them separately.
If there is any code that will need to be re-used by VPLS (or anyone else) figure out a way
to factor it out.

^ permalink raw reply

* Re: [iproute PATCH v3 2/7] xfrm_state: Make sure alg_name is NULL-terminated
From: Phil Sutter @ 2017-08-22 11:05 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20170821172541.6eb3e60c@xeon-e3>

On Mon, Aug 21, 2017 at 05:28:20PM -0700, Stephen Hemminger wrote:
> On Mon, 21 Aug 2017 15:23:36 +0200
> Phil Sutter <phil@nwl.cc> wrote:
> 
> > Signed-off-by: Phil Sutter <phil@nwl.cc>
> > ---
> >  ip/xfrm_state.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/ip/xfrm_state.c b/ip/xfrm_state.c
> > index e11c93bf1c3b5..7c0389038986e 100644
> > --- a/ip/xfrm_state.c
> > +++ b/ip/xfrm_state.c
> > @@ -125,7 +125,8 @@ static int xfrm_algo_parse(struct xfrm_algo *alg, enum xfrm_attr_type_t type,
> >  	fprintf(stderr, "warning: ALGO-NAME/ALGO-KEYMAT values will be sent to the kernel promiscuously! (verifying them isn't implemented yet)\n");
> >  #endif
> >  
> > -	strncpy(alg->alg_name, name, sizeof(alg->alg_name));
> > +	strncpy(alg->alg_name, name, sizeof(alg->alg_name) - 1);
> > +	alg->alg_name[sizeof(alg->alg_name) - 1] = '\0';
> >  
> >  	if (slen > 2 && strncmp(key, "0x", 2) == 0) {
> >  		/* split two chars "0x" from the top */
> 
> You are fixing enough of these null terminated string issues, that maybe
> introducing strlcpy() would make sense. Either in utils (or -lbsd).

I thought about that already, but decided against it since we don't want
to truncate user chosen interface names so instead implemented
assert_valid_dev_name() and went on with manually sanitizing the other
cases.

Is adding libbsd as additional dependency acceptable? If not, I could
provide a simple strlcpy() implementation in utils.

Are you fine with a follow-up patch addressing this?

Thanks, Phil

^ permalink raw reply

* Re: [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets
From: Vallish Vaidyeshwara @ 2017-08-22 11:14 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, shuah, Linux Kernel Network Developers, LKML,
	Eduardo Valentin, anchalag
In-Reply-To: <CAM_iQpUxS91BGYU7FOwq65Ln_9bbMO1YRzA2+xJDYh399du7mQ@mail.gmail.com>

On Mon, Aug 21, 2017 at 01:10:34PM -0700, Cong Wang wrote:
> On Fri, Aug 18, 2017 at 11:44 AM, Vallish Vaidyeshwara
> <vallish@amazon.com> wrote:
> > -       *timeo_p = schedule_timeout(*timeo_p);
> > +       /* Wait using highres timer */
> > +       expires = ktime_add_ns(ktime_get(), jiffies_to_nsecs(*timeo_p));
> > +       pre_sched_time = jiffies;
> > +       if (schedule_hrtimeout(&expires, HRTIMER_MODE_ABS))
>

Hello Cong,

> Does this work with MAX_SCHEDULE_TIMEOUT too??
> 

Thanks for pointing out MAX_SCHEDULE_TIMEOUT. I have made minor change to
accommodate MAX_SCHEDULE_TIMEOUT and will send out next version of the patch
for review.

Thanks.
-Vallish

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox