Netdev List

* Re: [RFC PATCH 4/5] mlx4: add support for fast rx drop bpf program
From: Daniel Borkmann @ 2016-04-04  9:57 UTC
  To: Johannes Berg, Brenden Blanco
  Cc: davem, netdev, tom, alexei.starovoitov, ogerlitz, john.fastabend,
	brouer
In-Reply-To: <1459755310.18188.13.camel@sipsolutions.net>

On 04/04/2016 09:35 AM, Johannes Berg wrote:
> On Sat, 2016-04-02 at 23:38 -0700, Brenden Blanco wrote:
>>
>> Having a common check makes sense. The tricky thing is that the type can
>> only be checked after taking the reference, and I wanted to keep the
>> scope of the prog brief in the case of errors. I would have to move the
>> bpf_prog_get logic into dev_change_bpf_fd and pass a bpf_prog * into the
>> ndo instead. Would that API look fine to you?
>
> I can't really comment, I wasn't planning on using the API right now :)
>
> However, what else is there that the driver could possibly do with the
> FD, other than getting the bpf_prog?
>
>> A possible extension of this is just to keep the bpf_prog * in the
>> netdev itself and expose a feature flag from the driver rather than
>> an ndo. But that would mean another 8 bytes in the netdev.
>
> That also misses the signal to the driver when the program is
> set/removed, so I don't think that works. I'd argue it's not really
> desirable anyway though since I wouldn't expect a majority of drivers
> to start supporting this.

I think an ndo is probably fine for this purpose, see also my other mail.
Currently, I think the only really driver-specific code would be to store
the prog pointer somewhere and to pass the needed metadata to populate the
fake skb.

Mid-term, drivers might want to reuse this hook/signal for offloading as
well, I'm not yet sure ... how would that relate to offloading of cls_bpf?
Should these be considered two different things (although from an offloading
perspective they are not really)? _Conceptually_, XDP could also be seen
as a software offload for the facilities we support with cls_bpf et al.

Thanks,
Daniel

* Re: [PATCH] Marvell phy: add fiber status check for some components
From: Andrew Lunn @ 2016-04-04 12:22 UTC
  To: Charles-Antoine Couret; +Cc: netdev
In-Reply-To: <570229B6.4090805@nexvision.fr>

On Mon, Apr 04, 2016 at 10:45:42AM +0200, Charles-Antoine Couret wrote:
> Hi,
> 
> > Shouldn't you return to page 0, i.e. MII_M1111_COPPER, under all
> > conditions?
> 
> I return marvell_read_status() which returns 0 if it hasn't error during the process.
> In case of right conditions, my function returns 0 for COPPER part (and FIBER part too).
> 
> It doesn't change the value returned and behavior.

Hi Charles

Please read my email again. I'm talking about the phy page, not the
function return value.

	 Andrew
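
The distinction between the function's return value and the selected register page can be sketched in plain C. This is only an illustration with a mock PHY, not the kernel's phylib API: `mock_phy`, `mock_write`, `mock_read_status` and the register number `PAGE_REG` are assumptions made up for the example. The point is that every exit path, success or error, selects the copper page (page 0) again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's phylib API. */
#define PAGE_REG    22  /* assumed page-select register number */
#define PAGE_COPPER 0
#define PAGE_FIBER  1

struct mock_phy {
	int page;        /* currently selected register page */
	int fiber_link;  /* link state visible on the fiber page */
	int copper_link; /* link state visible on the copper page */
	int link;        /* link state reported to the caller */
};

/* Mock register write: only the page-select register has an effect. */
static int mock_write(struct mock_phy *phy, int reg, int val)
{
	assert(phy != NULL);
	if (reg == PAGE_REG)
		phy->page = val;
	return 0;
}

/* Mock status read: reports the link of whichever page is selected. */
static int mock_read_status(struct mock_phy *phy)
{
	assert(phy != NULL);
	phy->link = (phy->page == PAGE_FIBER) ? phy->fiber_link
					      : phy->copper_link;
	return 0;
}

/* Check fiber first, then copper; always leave the PHY on the copper page. */
static int read_fiber_then_copper(struct mock_phy *phy)
{
	int err;

	err = mock_write(phy, PAGE_REG, PAGE_FIBER);
	if (err < 0)
		goto restore;

	err = mock_read_status(phy);
	if (err < 0 || phy->link)
		goto restore;   /* error, or fiber link is up: restore page */

	err = mock_write(phy, PAGE_REG, PAGE_COPPER);
	if (err < 0)
		goto restore;

	return mock_read_status(phy);   /* already on the copper page */

restore:
	mock_write(phy, PAGE_REG, PAGE_COPPER);
	return err;
}
```

With this shape, a caller that later reads registers without selecting a page first still sees the copper page, whatever the function returned.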

* Re: [PATCH v2 09/15] wcn36xx: Parse trigger_ba response properly
From: Sergei Shtylyov @ 2016-04-04 12:24 UTC
  To: Bjorn Andersson, Eugene Krasnikov, Kalle Valo
  Cc: Pontus Fuchs, wcn36xx, linux-wireless, netdev, linux-kernel
In-Reply-To: <1459721806-11817-9-git-send-email-bjorn.andersson@linaro.org>

Hello.

On 4/4/2016 1:16 AM, Bjorn Andersson wrote:

> From: Pontus Fuchs <pontus.fuchs@gmail.com>
>
> This message does not follow the canonical format and needs it's own
> parser.
>
> Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
> ---
>   drivers/net/wireless/ath/wcn36xx/smd.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c
> index 76c6856ed932..7f315d098f52 100644
> --- a/drivers/net/wireless/ath/wcn36xx/smd.c
> +++ b/drivers/net/wireless/ath/wcn36xx/smd.c
> @@ -1968,6 +1968,17 @@ out:
>   	return ret;
>   }
>
> +static int wcn36xx_smd_trigger_ba_rsp(void *buf, int len)
> +{
> +	struct wcn36xx_hal_trigger_ba_rsp_msg *rsp;
> +
> +	if (len < sizeof(*rsp))
> +		return -EINVAL;
> +
> +	rsp = (struct wcn36xx_hal_trigger_ba_rsp_msg *) buf;

    Casts from 'void *' to other pointer types are automatic, no need for the 
explicit cast.
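
A minimal sketch of that point, in standalone C rather than the driver's code: `struct demo_rsp` and `demo_parse_rsp` are hypothetical names invented for the example. In C (unlike C++), a `void *` converts implicitly to any object pointer type on assignment, so the assignment compiles without a cast. The sketch also takes the length as `size_t`, which is an illustrative choice to keep the comparison with `sizeof` unsigned on both sides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical response layout, for illustration only. */
struct demo_rsp {
	uint32_t status;
	uint16_t candidate_cnt;
};

/* Copies the status field out of a raw buffer; returns 0 or -1. */
static int demo_parse_rsp(void *buf, size_t len, uint32_t *status)
{
	struct demo_rsp *rsp;

	assert(status != NULL);
	if (buf == NULL || len < sizeof(*rsp))
		return -1;

	rsp = buf;  /* implicit conversion from void *, no cast needed in C */
	*status = rsp->status;
	return 0;
}
```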

[...]

MBR, Sergei

* Re: [PATCH v2 10/15] wcn36xx: Copy all members in config_sta v1 conversion
From: Sergei Shtylyov @ 2016-04-04 12:25 UTC
  To: Bjorn Andersson, Eugene Krasnikov, Kalle Valo
  Cc: Pontus Fuchs, wcn36xx, linux-wireless, netdev, linux-kernel
In-Reply-To: <1459721806-11817-10-git-send-email-bjorn.andersson@linaro.org>

On 4/4/2016 1:16 AM, Bjorn Andersson wrote:

> From: Pontus Fuchs <pontus.fuchs@gmail.com>
>
> When converting to version 1 of the config_sta struct not all
> members where copied. This fixes the problem of multicast frames

    Were.

> not being delivered on an encrypted network.
>
> Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[...]

MBR, Sergei

* Re: davinci-mdio: failing to connect to PHY
From: Andrew Lunn @ 2016-04-04 12:31 UTC
  To: Petr Kulhavy; +Cc: netdev
In-Reply-To: <57022356.6010309@barix.com>

On Mon, Apr 04, 2016 at 10:18:30AM +0200, Petr Kulhavy wrote:
> Hi,
> 
> I'm experiencing a peculiar problem with PHY communication in the
> current davinci-mdio.c driver.
> After upgrading from kernel 3.17 to 4.5 my DT based AM1808 board
> started having issues with the PHY communication.
> The MAC is detected, the MDIO is detected, the PHY is detected
> (twice?!?!), however there is no data being sent/received and the
> after issuing "ifdown -a" the MDIO starts spitting out messages that
> it cannot connect to the PHY:
> 
> net eth0: could not connect to phy davinci_mdio.0:00
> davinci_mdio davinci_mdio.0: resetting idled controller
> 
> 
> I'm using a single Micrel KSZ8081 PHY connected via RMII using the
> default PHY address 0x01.
> Here is the dmesg excerpt related to mdio:
> 
> davinci_mdio davinci_mdio.0: Runtime PM disabled, clock forced on.
> davinci_mdio davinci_mdio.0: davinci mdio revision 1.5
> davinci_mdio davinci_mdio.0: detected phy mask fffffffc
> libphy: davinci_mdio.0: probed
> davinci_mdio davinci_mdio.0: phy[0]: device davinci_mdio.0:00,
> driver Micrel KSZ8081 or KSZ8091
> davinci_mdio davinci_mdio.0: phy[1]: device davinci_mdio.0:01,
> driver Micrel KSZ8081 or KSZ8091
> davinci_mdio davinci_mdio.0: resetting idled controller
> Micrel KSZ8081 or KSZ8091 davinci_mdio.0:00: failed to disable NAND
> tree mode
> Micrel KSZ8081 or KSZ8091 davinci_mdio.0:00: attached PHY driver
> [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=davinci_mdio.0:00,
> irq=-1)
> 
> 
> After a soft-reboot the MDIO uses a different PHY mask fffffffd,
> detects correctly only one PHY at address 1 (this is the default
> address) and the networking works:
 
Hi Petr

You might want to take a look at:

http://lxr.free-electrons.com/source/drivers/net/ethernet/ti/davinci_mdio.c#L137

It seems to be asking the hardware about the phy mask.

   Andrew
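
As a reading aid for the quoted dmesg output, here is a small sketch that decodes such a mask. It assumes that a clear bit N means a PHY answered at address N, which is consistent with the log (mask fffffffc with phy[0] and phy[1], mask fffffffd with a single PHY at address 1) but is an interpretation, not something stated in the thread. `phys_from_mask` is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PHY_ADDR 32

/*
 * Assumption: a clear bit N in phy_mask means a PHY responded at address N.
 * Fills addrs[] with the detected addresses and returns how many were found.
 */
static int phys_from_mask(uint32_t phy_mask, int addrs[MAX_PHY_ADDR])
{
	int count = 0;

	assert(addrs != NULL);
	for (int addr = 0; addr < MAX_PHY_ADDR; addr++) {
		if (!(phy_mask & (UINT32_C(1) << addr)))
			addrs[count++] = addr;
	}
	return count;
}
```

Under that assumption, 0xfffffffc decodes to addresses 0 and 1, and 0xfffffffd decodes to address 1 only.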

* Re: AP firmware for TI wl1251 wifi chip (wl1251-fw-ap.bin)
From: Pali Rohár @ 2016-04-04 12:39 UTC
  To: Luciano Coelho, Felipe Balbi, kev, Shahar Levi, Kalle Valo,
	Andrew F. Davis, Guy Mishol, Yaniv Machani, Arik Nemtsov,
	Gery Kahn, Felipe Balbi, Luciano Coelho
  Cc: David Woodhouse, Pavel Machek, Aaro Koskinen, Ben Hutchings,
	David Gnedt, Ivaylo Dimitrov, Sebastian Reichel, Tony Lindgren,
	Nishanth Menon, linux-wireless, netdev, linux-kernel
In-Reply-To: <201603200040.26045@pali>

On Sunday 20 March 2016 00:40:25 Pali Rohár wrote:
> Hi!
> 
> In linux-firmware repository [1] is missing AP firmware for TI wl1251 
> chip. There is only STA firmware wl1251-fw.bin which supports managed 
> and ad-hoc modes.
> 
> For other TI wilink chips there are <CHIP>-ap.bin firmware files 
> (wl1271-fw-ap.bin and wl128x-fw-ap.bin) which support AP mode. But for 
> wl1251 firmware file with guessed name "wl1251-fw-ap.bin" is missing.
> 
> Do you have any idea what happened with AP firmware for ti wilink4 
> wl1251 wifi chip? Or where can be found? Guys from TI, can you help?
> 
> I see that STA firmware was added into linux-firmware tree in year 2013 
> by this pull request [2].
> 
> [1] - https://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/tree/ti-connectivity
> 
> [2] - http://thread.gmane.org/gmane.linux.kernel/1566500/focus=1571382
> 

Hi! Does anybody have any idea about that AP firmware?

-- 
Pali Rohár
pali.rohar@gmail.com

* Re: [PATCH net-next] cxgb4/cxgb4vf: Deprecate module parameter dflt_msg_enable
From: Sergei Shtylyov @ 2016-04-04 12:43 UTC
  To: Hariprasad Shenai, davem; +Cc: netdev, leedom, nirranjan
In-Reply-To: <1459745604-8093-1-git-send-email-hariprasad@chelsio.com>

Hello.

On 4/4/2016 7:53 AM, Hariprasad Shenai wrote:

> Message level can be set through ethtool, so deprecate module parameter
> which is used to set the same.
>
> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
> ---
>   drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c     | 3 ++-
>   drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c | 3 ++-
>   2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> index d1e3f0997d6b..acefa35b7250 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> @@ -168,7 +168,8 @@ MODULE_PARM_DESC(force_init, "Forcibly become Master PF and initialize adapter,"
>   static int dflt_msg_enable = DFLT_MSG_ENABLE;
>
>   module_param(dflt_msg_enable, int, 0644);
> -MODULE_PARM_DESC(dflt_msg_enable, "Chelsio T4 default message enable bitmap");
> +MODULE_PARM_DESC(dflt_msg_enable, "Chelsio T4 default message enable bitmap,"

    Need space after the last comma...

> +		 "deprecated parameter");
>
>   /*
>    * The driver uses the best interrupt scheme available on a platform in the
> diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c b/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
> index 1cc8a7a69457..730fec73d5a6 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
> @@ -74,7 +74,8 @@ static int dflt_msg_enable = DFLT_MSG_ENABLE;
>
>   module_param(dflt_msg_enable, int, 0644);
>   MODULE_PARM_DESC(dflt_msg_enable,
> -		 "default adapter ethtool message level bitmap");
> +		 "default adapter ethtool message level bitmap, "

    ... like here.

> +		 "deprecated parameter");
>
>   /*
>    * The driver uses the best interrupt scheme available on a platform in the

MBR, Sergei
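
The reason the space matters is that adjacent string literals in C are concatenated with nothing inserted between them. A small standalone sketch (the variable and function names are made up for the example; the strings are the two descriptions from the patch):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Adjacent string literals are joined as-is; no separator is added. */
static const char desc_missing_space[] =
	"Chelsio T4 default message enable bitmap,"
	"deprecated parameter";

static const char desc_with_space[] =
	"default adapter ethtool message level bitmap, "
	"deprecated parameter";

/* Returns 1 if any comma is directly followed by a non-space character. */
static int has_glued_comma(const char *s)
{
	const char *p;

	assert(s != NULL);
	for (p = strchr(s, ','); p != NULL; p = strchr(p + 1, ',')) {
		if (p[1] != '\0' && p[1] != ' ')
			return 1;
	}
	return 0;
}
```

The first string ends up containing "bitmap,deprecated", the second "bitmap, deprecated".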

* Re: [PATCH] net: socket: return a proper error code when source address becomes nonlocal
From: Sergei Shtylyov @ 2016-04-04 12:55 UTC
  To: Liping Zhang, davem; +Cc: netdev, Liping Zhang
In-Reply-To: <1459753769-4290-1-git-send-email-zlpnobody@163.com>

Hello.

On 4/4/2016 10:09 AM, Liping Zhang wrote:

> From: Liping Zhang <liping.zhang@spreadtrum.com>
>
> 1. Socket can use bind(directly) or connect(indirectly) to bind to a local
>     ip address, and later if the network becomes down, that cause the source
>     address becomes nonlocal, then send() call will fail and return EINVAL.
>     But this error code is confusing, acctually we did not pass any invalid
>     arguments. Furthermore, send() maybe return ok at first, it now returns
>     fail just because of a temporary network problem, i.e. when the network
>     recovery, send() call will become ok. Return EADDRNOTAVAIL instead of
>     EINVAL in such situation is better.
> 2. We can use IPV6_PKTINFO to specify the ipv6 source address when call
>     sendmsg() to send packet, but if the address is not available, call will
>     fail and EINVAL is returned. This error code is not very appropriate,
>     it failed maybe just because of a temporary network problem. Also
>     RFC3542, section 6.6 describe an example returns EADDRNOTAVAIL:
>     "ipi6_ifindex specifies an interface but the address ipi6_addr is not
>     available for use on that interface.". So return EADDRNOTAVAIL instead
>     of EINVAL here.
>
> Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>

    I think this should be 2 patches as you seem to fix 2 separate problems.

[...]

MBR, Sergei
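
To illustrate why the choice of error code matters to callers, here is a sketch of an application-side policy that treats some send() failures as temporary. It is a hypothetical helper, not part of the patch, and the set of codes it treats as retryable is an assumption chosen for the example. With EINVAL the caller cannot tell a temporary network problem from a real argument error; with EADDRNOTAVAIL it can.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical policy: decide whether a failed send() is worth retrying.
 * Expects a positive errno value (the value of errno, not -errno).
 */
static bool send_error_is_retryable(int err)
{
	assert(err >= 0);
	switch (err) {
	case EADDRNOTAVAIL: /* source address is currently not local */
	case ENETDOWN:      /* network is down */
	case ENETUNREACH:   /* no route at the moment */
	case EAGAIN:        /* would block, try again */
		return true;
	default:
		return false;   /* e.g. EINVAL: the arguments are wrong */
	}
}
```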

* [PATCH v2] Marvell phy: add fiber status check for some components
From: Charles-Antoine Couret @ 2016-04-04 13:06 UTC
  To: Andrew Lunn; +Cc: netdev
In-Reply-To: <20160404122219.GD21828@lunn.ch>

[-- Attachment #1: Type: text/plain, Size: 188 bytes --]

Hi,
I took Andrew's previous remark into account and now return to the MII_M1111_COPPER page in all cases.
I also completed the patch description.

Thanks for everything.
Regards,
Charles-Antoine Couret

[-- Attachment #2: marvell.patch --]
[-- Type: text/x-patch, Size: 3038 bytes --]

From 564b767163d19355a3b5efaad195e93796570c71 Mon Sep 17 00:00:00 2001
From: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
Date: Fri, 1 Apr 2016 16:16:35 +0200
Subject: [PATCH] Marvell phy: add fiber status check for some components

Marvell PHYs can have two modes: fiber and copper. Currently, the driver
checks only the copper mode registers to get the link status, which could be
wrong.

This commit adds a handler that checks the fiber link status, then the copper one.
If the fiber link is up, the driver uses that information.
Otherwise, it uses the copper status.

This patch has not been tested with all Marvell PHYs.
The new function is activated only for the tested PHYs.

Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
---
 drivers/net/phy/marvell.c | 43 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index ab1d0fc..22552ab 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -890,6 +890,45 @@ static int marvell_read_status(struct phy_device *phydev)
 	return 0;
 }
 
+/* marvell_read_fiber_status
+ *
+ * Some Marvell's phys have two modes: fiber and copper.
+ * Both need status checked.
+ * Description:
+ *   First, check the fiber link and status.
+ *   If the fiber link is down, check the copper link and status which
+ *   will be the default value if both link are down.
+ */
+static int marvell_read_fiber_status(struct phy_device *phydev)
+{
+	int err;
+
+	/* Check the fiber mode first */
+	err = phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_FIBER);
+	if (err < 0)
+		goto error;
+
+	err = marvell_read_status(phydev);
+	if (err < 0)
+		goto error;
+
+	if (phydev->link) {
+		phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_COPPER);
+		return 0;
+	}
+
+	/* If fiber link is down, check and save copper mode state */
+	err = phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_COPPER);
+	if (err < 0)
+		goto error;
+
+	return marvell_read_status(phydev);
+
+error:
+	phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_COPPER);
+	return err;
+}
+
 static int marvell_aneg_done(struct phy_device *phydev)
 {
 	int retval = phy_read(phydev, MII_M1011_PHY_STATUS);
@@ -1122,7 +1161,7 @@ static struct phy_driver marvell_drivers[] = {
 		.probe = marvell_probe,
 		.config_init = &m88e1111_config_init,
 		.config_aneg = &marvell_config_aneg,
-		.read_status = &marvell_read_status,
+		.read_status = &marvell_read_fiber_status,
 		.ack_interrupt = &marvell_ack_interrupt,
 		.config_intr = &marvell_config_intr,
 		.resume = &genphy_resume,
@@ -1270,7 +1309,7 @@ static struct phy_driver marvell_drivers[] = {
 		.probe = marvell_probe,
 		.config_init = &marvell_config_init,
 		.config_aneg = &m88e1510_config_aneg,
-		.read_status = &marvell_read_status,
+		.read_status = &marvell_read_fiber_status,
 		.ack_interrupt = &marvell_ack_interrupt,
 		.config_intr = &marvell_config_intr,
 		.did_interrupt = &m88e1121_did_interrupt,
-- 
2.5.5


* Re: [RFC PATCH 1/5] bpf: add PHYS_DEV prog type for early driver filter
From: Jesper Dangaard Brouer @ 2016-04-04 13:07 UTC
  To: Daniel Borkmann
  Cc: Brenden Blanco, davem, netdev, tom, alexei.starovoitov, gerlitz,
	john.fastabend, brouer
In-Reply-To: <57022A85.6040002@iogearbox.net>


On Mon, 04 Apr 2016 10:49:09 +0200 Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 04/02/2016 03:21 AM, Brenden Blanco wrote:
> > Add a new bpf prog type that is intended to run in early stages of the
> > packet rx path. Only minimal packet metadata will be available, hence a new
> > context type, struct xdp_metadata, is exposed to userspace. So far only
> > expose the readable packet length, and only in read mode.
> >
> > The PHYS_DEV name is chosen to represent that the program is meant only
> > for physical adapters, rather than all netdevs.
> >
> > While the user visible struct is new, the underlying context must be
> > implemented as a minimal skb in order for the packet load_* instructions
> > to work. The skb filled in by the driver must have skb->len, skb->head,
> > and skb->data set, and skb->data_len == 0.
> >
[...]
> 
> Do you plan to support bpf_skb_load_bytes() as well? I like using
> this API especially when dealing with larger chunks (>4 bytes) to
> load into stack memory, plus content is kept in network byte order.
> 
> What about other helpers such as bpf_skb_store_bytes() et al that
> work on skbs. Do you intent to reuse them as is and thus populate
> the per cpu skb with needed fields (faking linear data), or do you
> see larger obstacles that prevent for this?

Argh... maybe the minimal pseudo/fake SKB is the wrong "signal" to send
to users of this API.

The whole idea is that an SKB is NOT allocated yet, and not needed at
this level.  If we start supporting calls to the underlying SKB functions,
then we will end up in the same place (performance-wise).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

* Re: [PATCH v2] Marvell phy: add fiber status check for some components
From: Andrew Lunn @ 2016-04-04 13:14 UTC
  To: Charles-Antoine Couret; +Cc: netdev
In-Reply-To: <570266F3.4000400@nexvision.fr>

On Mon, Apr 04, 2016 at 03:06:59PM +0200, Charles-Antoine Couret wrote:
> Hi,
> I took into account previous remark from Andrew to return in MII_M1111_COPPER page in all cases.
> I completed the description of patch.

Hi Charles

Please do not send patches as attachments.

       Andrew

* FWD: [PATCH v2] Marvell phy: add fiber status check for some components
From: Andrew Lunn @ 2016-04-04 13:25 UTC
  To: Florian Fainelli; +Cc: netdev, charles-antoine.couret

> >From 564b767163d19355a3b5efaad195e93796570c71 Mon Sep 17 00:00:00 2001
> From: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
> Date: Fri, 1 Apr 2016 16:16:35 +0200
> Subject: [PATCH] Marvell phy: add fiber status check for some components
> 
> Marvell's phy could have two modes: fiber and copper. Currently, the driver
> checks only the copper mode registers to get the status link which could be
> wrong.
> 
> This commit add a handler to check fiber then copper status link.
> If the fiber link is activated, the driver would use this information.
> Else, it would use the copper status.

Hi Florian

What do you think about this?

This works for basic status information. But what about other ethtool
options? Setting the speed and duplex, turning pause on/off, etc.

Do we actually need to stay on page 1 if fibre is in use? How do we
initially change to page 1 when the fibre link is still down?

Should we be using the old mechanism to swap between TP, BNC and AUI
to swap between copper and fibre?

   Andrew

> 
> This patch is not tested with all Marvell's phy.
> The new function is actived only for tested phys.
> 
> Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
> ---
>  drivers/net/phy/marvell.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
> index ab1d0fc..22552ab 100644
> --- a/drivers/net/phy/marvell.c
> +++ b/drivers/net/phy/marvell.c
> @@ -890,6 +890,45 @@ static int marvell_read_status(struct phy_device *phydev)
>  	return 0;
>  }
>  
> +/* marvell_read_fiber_status
> + *
> + * Some Marvell's phys have two modes: fiber and copper.
> + * Both need status checked.
> + * Description:
> + *   First, check the fiber link and status.
> + *   If the fiber link is down, check the copper link and status which
> + *   will be the default value if both link are down.
> + */
> +static int marvell_read_fiber_status(struct phy_device *phydev)
> +{
> +	int err;
> +
> +	/* Check the fiber mode first */
> +	err = phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_FIBER);
> +	if (err < 0)
> +		goto error;
> +
> +	err = marvell_read_status(phydev);
> +	if (err < 0)
> +		goto error;
> +
> +	if (phydev->link) {
> +		phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_COPPER);
> +		return 0;
> +	}
> +
> +	/* If fiber link is down, check and save copper mode state */
> +	err = phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_COPPER);
> +	if (err < 0)
> +		goto error;
> +
> +	return marvell_read_status(phydev);
> +
> +error:
> +	phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_COPPER);
> +	return err;
> +}
> +
>  static int marvell_aneg_done(struct phy_device *phydev)
>  {
>  	int retval = phy_read(phydev, MII_M1011_PHY_STATUS);
> @@ -1122,7 +1161,7 @@ static struct phy_driver marvell_drivers[] = {
>  		.probe = marvell_probe,
>  		.config_init = &m88e1111_config_init,
>  		.config_aneg = &marvell_config_aneg,
> -		.read_status = &marvell_read_status,
> +		.read_status = &marvell_read_fiber_status,
>  		.ack_interrupt = &marvell_ack_interrupt,
>  		.config_intr = &marvell_config_intr,
>  		.resume = &genphy_resume,
> @@ -1270,7 +1309,7 @@ static struct phy_driver marvell_drivers[] = {
>  		.probe = marvell_probe,
>  		.config_init = &marvell_config_init,
>  		.config_aneg = &m88e1510_config_aneg,
> -		.read_status = &marvell_read_status,
> +		.read_status = &marvell_read_fiber_status,
>  		.ack_interrupt = &marvell_ack_interrupt,
>  		.config_intr = &marvell_config_intr,
>  		.did_interrupt = &m88e1121_did_interrupt,
> -- 
> 2.5.5
> 
> 
> 
> ----- End forwarded message -----
> 

* Re: [PATCH v5 net-next] net: ipv4: Consider failed nexthops in multipath routes
From: David Ahern @ 2016-04-04 13:29 UTC
  To: Julian Anastasov; +Cc: netdev
In-Reply-To: <alpine.LFD.2.11.1604040917390.2182@ja.home.ssi.bg>

On 4/4/16 12:29 AM, Julian Anastasov wrote:
> Reviewed-by: Julian Anastasov <ja@ssi.bg>
>
> 	With one comment: the fallback strategy is simplified,
> we do not fallback to all possible reachable nexthops.
>

Right. I will send a second patch that examines other nexthops (hash <= 
nh_upper_bound).
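
For readers following along, the kind of selection being discussed can be sketched as below. This is a simplified illustration, not the code from the patch: `struct demo_nexthop` and `demo_select_nexthop` are hypothetical, and reachability is reduced to a boolean. Only nexthops whose upper bound covers the hash are considered; the first reachable one wins, and the first covering nexthop is the fallback if none of them is reachable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified nexthop entry. */
struct demo_nexthop {
	int upper_bound;  /* hash values <= upper_bound may use this nexthop */
	bool reachable;   /* stand-in for the neighbour state check */
};

/*
 * Returns the index of the chosen nexthop, or -1 if no nexthop's
 * upper_bound covers the hash (including an empty table).
 */
static int demo_select_nexthop(const struct demo_nexthop *nh, size_t n,
			       int hash)
{
	int fallback = -1;

	assert(nh != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (hash > nh[i].upper_bound)
			continue;           /* this nexthop does not cover the hash */
		if (nh[i].reachable)
			return (int)i;      /* first reachable candidate wins */
		if (fallback < 0)
			fallback = (int)i;  /* remember the first candidate */
	}
	return fallback;
}
```

Note that a reachable nexthop with an upper bound below the hash is never picked in this sketch, which mirrors the remark that the fallback does not cover all possible reachable nexthops.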

* Re: Best way to reduce system call overhead for tun device I/O?
From: ValdikSS @ 2016-04-04 13:35 UTC
  To: Guus Sliepen
  Cc: Stephen Hemminger, Willem de Bruijn, David Miller, Tom Herbert,
	netdev

I'm trying to increase OpenVPN throughput by optimizing tun manipulations, too.
Right now I have more questions than answers.

I get about 800 Mbit/s via OpenVPN with authentication and encryption disabled on a local machine, with the OpenVPN server and client running in different
network namespaces that use veth for networking, and with an MTU of 1500 on the TUN interface. This is rather limiting. Low-end devices like SOHO routers with a
560 MHz CPU can only achieve 15-20 Mbit/s via OpenVPN with encryption.
Increasing the MTU reduces the overhead. You can get > 5 Gbit/s if you set an MTU of 16000 on the TUN interface.
This is not only OpenVPN-related. None of the tunneling software I tried can achieve gigabit speeds without encryption on my machine with an MTU of 1500. I didn't test
tinc though.
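
A rough back-of-the-envelope check of these numbers, as a sketch: assuming every packet is a full MTU in size and ignoring all header overhead (both simplifications), the packet rate implied by a given throughput can be computed as follows. `packets_per_second` is a hypothetical helper written for this note.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Packets per second needed to sustain bits_per_sec when every packet
 * carries mtu_bytes bytes. Header overhead is ignored (simplification).
 */
static uint64_t packets_per_second(uint64_t bits_per_sec, uint32_t mtu_bytes)
{
	assert(mtu_bytes > 0);
	return bits_per_sec / (8ULL * mtu_bytes);
}
```

With these assumptions, 800 Mbit/s at an MTU of 1500 is about 66,666 packets per second, and 5 Gbit/s at an MTU of 16000 is about 39,062 packets per second. The two packet rates are of the same order, which fits the idea that the cost is mostly per packet (one read/write per packet) rather than per byte.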

TUN supports various offloading techniques: GSO, TSO, UFO, just like hardware NICs. From what I understand, if we use GSO/GRO for TUN, we would be able to receive
and send small packets combined into one huge packet with a single send/recv call with an MTU of 1500 on the TUN interface, and the performance should increase to what it is now
with an increased MTU. But there is very little information on how to use offloading with TUN.
I've found some old example code which creates a TUN interface with GSO support (TUN_VNET_HDR), does NAT and echoes TUN data to stdout, and a script to run two
instances of this software connected with a pipe. But it doesn't work for me: I never see any combined frames (gso_type is always 0 in the virtio_net_hdr header).
I probably did something wrong, but I'm not sure what exactly.

Here's said application: http://ovrload.ru/f/68996_tun.tar.gz

The questions are as follows:

 1. Do I understand correctly that GSO/GRO would have the same effect as increasing the MTU on the TUN interface?
 2. How are GRO/GSO different from TSO and UFO?
 3. Can we receive and send combined frames directly from/to a NIC with offloading support?
 4. How should GRO/GSO, TSO and UFO be implemented? What should the logic behind them be?

Any reply is greatly appreciated.

P.S. this could be helpful: https://ldpreload.com/p/tuntap-notes.txt

> I'm trying to reduce system call overhead when reading/writing to/from a
> tun device in userspace. For sockets, one can use sendmmsg()/recvmmsg(),
> but a tun fd is not a socket fd, so this doesn't work. I'm see several
> options to allow userspace to read/write multiple packets with one
> syscall:
>
> - Implement a TX/RX ring buffer that is mmap()ed, like with AF_PACKET
>   sockets.
>
> - Implement a ioctl() to emulate sendmmsg()/recvmmsg().
>
> - Add a flag that can be set using TUNSETIFF that makes regular
>   read()/write() calls handle multiple packets in one go.
>
> - Expose a socket fd to userspace, so regular sendmmsg()/recvmmsg() can
>   be used. There is tun_get_socket() which is used internally in the
>   kernel, but this is not exposed to userspace, and doesn't look trivial
>   to do either.
>
> What would be the right way to do this?
>
> -- 
> Met vriendelijke groet / with kind regards,
>      Guus Sliepen <guus@tinc-vpn.org>

* Re: [RFC PATCH 1/5] bpf: add PHYS_DEV prog type for early driver filter
From: Daniel Borkmann @ 2016-04-04 13:36 UTC
  To: Jesper Dangaard Brouer
  Cc: Brenden Blanco, davem, netdev, tom, alexei.starovoitov, gerlitz,
	john.fastabend
In-Reply-To: <20160404150700.1456ae80@redhat.com>

On 04/04/2016 03:07 PM, Jesper Dangaard Brouer wrote:
> On Mon, 04 Apr 2016 10:49:09 +0200 Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 04/02/2016 03:21 AM, Brenden Blanco wrote:
>>> Add a new bpf prog type that is intended to run in early stages of the
>>> packet rx path. Only minimal packet metadata will be available, hence a new
>>> context type, struct xdp_metadata, is exposed to userspace. So far only
>>> expose the readable packet length, and only in read mode.
>>>
>>> The PHYS_DEV name is chosen to represent that the program is meant only
>>> for physical adapters, rather than all netdevs.
>>>
>>> While the user visible struct is new, the underlying context must be
>>> implemented as a minimal skb in order for the packet load_* instructions
>>> to work. The skb filled in by the driver must have skb->len, skb->head,
>>> and skb->data set, and skb->data_len == 0.
>>>
> [...]
>>
>> Do you plan to support bpf_skb_load_bytes() as well? I like using
>> this API especially when dealing with larger chunks (>4 bytes) to
>> load into stack memory, plus content is kept in network byte order.
>>
>> What about other helpers such as bpf_skb_store_bytes() et al that
>> work on skbs. Do you intent to reuse them as is and thus populate
>> the per cpu skb with needed fields (faking linear data), or do you
>> see larger obstacles that prevent for this?
>
> Argh... maybe the minimal pseudo/fake SKB is the wrong "signal" to send
> to users of this API.
>
> The hole idea is that an SKB is NOT allocated yet, and not needed at
> this level.  If we start supporting calling underlying SKB functions,
> then we will end-up in the same place (performance wise).

I'm talking about the current skb-related BPF helper functions we have,
so the question is how much from that code we have we can reuse under
these constraints (obviously things like the tunnel helpers are a different
story) and if that trade-off is acceptable for us. I'm also thinking
that, for example, if you need to parse the packet data anyway for a drop
verdict, you might as well pass some meta data (that is set in the real
skb later on) for those packets that go up the stack.
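
What "faking linear data" buys a byte-loading helper can be sketched in userspace C. This is not the kernel helper: `struct demo_linear_pkt` and `demo_load_bytes` are hypothetical names, and the error code is an illustrative choice. When all packet bytes sit in one contiguous buffer (no fragments), loading a chunk reduces to a bounds check plus a copy, and the bytes stay in network byte order.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical linear packet view: all bytes live in data[0..len). */
struct demo_linear_pkt {
	const uint8_t *data;
	size_t len;
};

/*
 * Copies len bytes starting at offset into 'to'.
 * Returns 0 on success, or a negative error if the range is out of bounds.
 */
static int demo_load_bytes(const struct demo_linear_pkt *pkt, size_t offset,
			   void *to, size_t len)
{
	assert(pkt != NULL && to != NULL);
	if (offset > pkt->len || len > pkt->len - offset)
		return -EFAULT;

	memcpy(to, pkt->data + offset, len);  /* bytes stay in wire order */
	return 0;
}
```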

* Re: davinci-mdio: failing to connect to PHY
From: Petr Kulhavy @ 2016-04-04 13:50 UTC
  To: Andrew Lunn; +Cc: netdev
In-Reply-To: <20160404123145.GE21828@lunn.ch>

On 04.04.2016 14:31, Andrew Lunn wrote:
>   
> Hi Petr
>
> You might want to take a look at:
>
> http://lxr.free-electrons.com/source/drivers/net/ethernet/ti/davinci_mdio.c#L137
>
> It seems to be asking the hardware about the phy mask.
>
>     Andrew

Hi Andrew,

thanks a lot for the link. In the meantime I've understood the issue
better. It is due to the fact that the PHY is pin-strapped to address 1
and broadcast (at address 0) is enabled. The Micrel driver's
config_init() disables the broadcast and the PHY stops responding, which
causes the trouble. Kernel 3.17 didn't disable the broadcast and
therefore it worked.

I'm wondering how to solve or work around this...

Petr

* Re: davinci-mdio: failing to connect to PHY
From: Andrew Lunn @ 2016-04-04 13:58 UTC
  To: Petr Kulhavy; +Cc: netdev
In-Reply-To: <5702710A.5010804@barix.com>

On Mon, Apr 04, 2016 at 03:50:02PM +0200, Petr Kulhavy wrote:
> 
> 
> On 04.04.2016 14:31, Andrew Lunn wrote:
> >Hi Petr
> >
> >You might want to take a look at:
> >
> >http://lxr.free-electrons.com/source/drivers/net/ethernet/ti/davinci_mdio.c#L137
> >
> >It seems to be asking the hardware about the phy mask.
> >
> >    Andrew
> 
> Hi Andrew,
> 
> thanks a lot for the link. In the meantime I've understood the issue
> better. It is due to the fact that the PHY is pin-strapped to
> address 1 and broadcast (at address 0) is enabled. The Micrel
> driver's config_init() disables the broadcast and the PHY stops
> responding, which causes the trouble. Kernel 3.17 didn't
> disable the broadcast and therefore it worked.
> 
> I'm wondering how to solve or work around this...

One option is to explicitly list the phy on your mdio bus in your
device tree. Something like:

&mdio {
        status = "okay";

        ethphy0: ethernet-phy@1 {
                reg = <1>;
        };
};

This alone might be sufficient. If not, you need to reference the phy
via a phandle in the ethernet node.


&eth0 {
        status = "okay";
        phy-handle = <&ethphy0>;
};

	Andrew

^ permalink raw reply

* Re: davinci-mdio: failing to connect to PHY
From: Petr Kulhavy @ 2016-04-04 14:01 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev
In-Reply-To: <20160404135813.GA25131@lunn.ch>



On 04.04.2016 15:58, Andrew Lunn wrote:
> On Mon, Apr 04, 2016 at 03:50:02PM +0200, Petr Kulhavy wrote:
>> Hi Andrew,
>>
>> thanks a lot for the link. In the meantime I've understood the issue
>> better. It is due to the fact that the PHY is pin-strapped to
>> address 1 and broadcast (at address 0) is enabled. The Micrel
>> driver's config_init() disables the broadcast and the PHY stops
>> responding, which causes the trouble. Kernel 3.17 didn't
>> disable the broadcast and therefore it worked.
>>
>> I'm wondering how to solve or work around this...
> One option is to explicitly list the phy on your mdio bus in your
> device tree. Something like:
>
> &mdio {
>          status = "okay";
>
>          ethphy0: ethernet-phy@1 {
>                  reg = <1>;
>          };
> };
>
> This alone might be sufficient. If not, you need to reference the phy
> via a phandle in the ethernet node.
>
>
> &eth0 {
>          status = "okay";
>          phy-handle = <&ethphy0>;
> };
>
> 	Andrew
Thanks a lot, I'm going to try it out right now!

Cheers
Petr

^ permalink raw reply

* Re: [RFC PATCH 1/5] bpf: add PHYS_DEV prog type for early driver filter
From: Tom Herbert @ 2016-04-04 14:09 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jesper Dangaard Brouer, Brenden Blanco, David S. Miller,
	Linux Kernel Network Developers, Alexei Starovoitov, gerlitz,
	john fastabend
In-Reply-To: <57026DFA.3090201@iogearbox.net>

On Mon, Apr 4, 2016 at 10:36 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 04/04/2016 03:07 PM, Jesper Dangaard Brouer wrote:
>>
>> On Mon, 04 Apr 2016 10:49:09 +0200 Daniel Borkmann <daniel@iogearbox.net>
>> wrote:
>>>
>>> On 04/02/2016 03:21 AM, Brenden Blanco wrote:
>>>>
>>>> Add a new bpf prog type that is intended to run in early stages of the
>>>> packet rx path. Only minimal packet metadata will be available, hence a
>>>> new
>>>> context type, struct xdp_metadata, is exposed to userspace. So far only
>>>> expose the readable packet length, and only in read mode.
>>>>
>>>> The PHYS_DEV name is chosen to represent that the program is meant only
>>>> for physical adapters, rather than all netdevs.
>>>>
>>>> While the user visible struct is new, the underlying context must be
>>>> implemented as a minimal skb in order for the packet load_* instructions
>>>> to work. The skb filled in by the driver must have skb->len, skb->head,
>>>> and skb->data set, and skb->data_len == 0.
>>>>
>> [...]
>>>
>>>
>>> Do you plan to support bpf_skb_load_bytes() as well? I like using
>>> this API especially when dealing with larger chunks (>4 bytes) to
>>> load into stack memory, plus content is kept in network byte order.
>>>
>>> What about other helpers such as bpf_skb_store_bytes() et al that
> >>> work on skbs. Do you intend to reuse them as is and thus populate
> >>> the per cpu skb with needed fields (faking linear data), or do you
> >>> see larger obstacles that prevent this?
>>
>>
>> Argh... maybe the minimal pseudo/fake SKB is the wrong "signal" to send
>> to users of this API.
>>
> >> The whole idea is that an SKB is NOT allocated yet, and not needed at
> >> this level.  If we start supporting calling underlying SKB functions,
> >> then we will end up in the same place (performance wise).
>
>
> > I'm talking about the current skb-related BPF helper functions we have,
> > so the question is how much of that code we can reuse under
> these constraints (obviously things like the tunnel helpers are a different
> story) and if that trade-off is acceptable for us. I'm also thinking
> that, for example, if you need to parse the packet data anyway for a drop
> verdict, you might as well pass some meta data (that is set in the real
> skb later on) for those packets that go up the stack.

Right, the meta data in this case is an abstracted receive descriptor.
This would include items that we get in a device receive descriptor
(computed checksum, hash, VLAN tag). This is purposely a small
restricted data structure. I'm hoping we can minimize the size of this
to not much more than 32 bytes (including pointers to data and
linkage).
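
To make the size budget concrete, here is a rough sketch of such an
abstracted receive descriptor. It is only an illustration: the struct
name, field names, field widths and flag bits are all made up and are
not taken from the patch set or from any driver. On a typical 64-bit ABI
this layout comes out at 32 bytes, including the data pointer and the
linkage pointer.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical validity bits: which device-provided fields are usable. */
#define RX_DESC_F_CSUM (1u << 0)
#define RX_DESC_F_HASH (1u << 1)
#define RX_DESC_F_VLAN (1u << 2)

/* Hypothetical abstracted receive descriptor; every name is an assumption. */
struct rx_desc_sketch {
	void *data;                   /* start of the packet data */
	struct rx_desc_sketch *next;  /* linkage to the next descriptor */
	uint32_t len;                 /* readable packet length */
	uint32_t hash;                /* hash computed by the device */
	uint16_t csum;                /* checksum computed by the device */
	uint16_t vlan_tci;            /* VLAN tag from the device */
	uint32_t flags;               /* RX_DESC_F_* bits */
};

/* Keep the descriptor within the 32-byte budget mentioned above. */
static_assert(sizeof(struct rx_desc_sketch) <= 32,
	      "descriptor sketch exceeds the 32-byte budget");

/* Returns nonzero if the device supplied a VLAN tag for this packet. */
static inline int rx_desc_has_vlan(const struct rx_desc_sketch *d)
{
	return (d->flags & RX_DESC_F_VLAN) != 0;
}
```

Nothing in this layout refers to an skb, which is the property that
matters for the userspace case.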

How this translates to skb to maintain compatibility with BPF is an
interesting question. One other consideration is that skb's are kernel
specific; we should be able to use the same BPF filter program in
userspace over DPDK, for instance, so an skb interface as the packet
abstraction might not be the right model...

Tom

^ permalink raw reply

* Re: System hangs (unable to handle kernel paging request)
From: Bastien Philbert @ 2016-04-04 14:30 UTC (permalink / raw)
  To: Oleksii Berezhniak, netdev
In-Reply-To: <CAJHPw-M7ZLRXcEHovc4EZL1OiE_SihBte2tE+P8UZenKNya7hg@mail.gmail.com>



On 2016-04-04 03:59 AM, Oleksii Berezhniak wrote:
> Good day.
> 
> We have PPPoE server with CentOS 7 (kernel 3.10.0-327.10.1.el7.dsip.x86_64)
> 
> We applied some PPPoE related patches to this kernel:
> 
> ppp: don't override sk->sk_state in pppoe_flush_dev()
> ppp: fix pppoe_dev deletion condition in pppoe_release()
> pppoe: fix memory corruption in padt work structure
> pppoe: fix reference counting in PPPoE proxy
> 
> Also we built latest version of ixgbe driver from Intel.
> 
> Now we have crashes after approx. one week of uptime:
> 
> [545444.673270] BUG: unable to handle kernel paging request at ffff88a005040200
> [545444.673306] IP: [<ffffffff811c0e95>] kmem_cache_alloc+0x75/0x1d0
> [545444.673335] PGD 0
> [545444.673348] Oops: 0000 [#1] SMP
> [545444.673367] Modules linked in: arc4 ppp_mppe act_police cls_u32
> sch_ingress sch_tbf pptp gre pppoe pppox ppp_generic slhc 8021q garp
> stp mrp llc iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
> nf_nat iptable_filter xt_TCPMSS iptable_mangle xt_CT nf_conntrack
> iptable_raw w83793 hwmon_vid snd_hda_codec_realtek
> snd_hda_codec_generic snd_hda_intel snd_hda_codec coretemp
> snd_hda_core iTCO_wdt kvm iTCO_vendor_support snd_hwdep snd_seq
> snd_seq_device ipmi_ssif ppdev lpc_ich snd_pcm pcspkr mfd_core sg
> ipmi_si snd_timer snd i2c_i801 ipmi_msghandler ioatdma parport_pc
> parport shpchp soundcore i7core_edac tpm_infineon edac_core ip_tables
> ext4 mbcache jbd2 sd_mod crct10dif_generic crc_t10dif
> crct10dif_common syscopyarea sysfillrect firewire_ohci sysimgblt
> i2c_algo_bit drm_kms_helper ata_generic pata_acpi
> [545444.674383]  ttm firewire_core crc_itu_t serio_raw drm ata_piix
> libata crc32c_intel i2c_core ixgbe(OE) vxlan e1000e ip6_udp_tunnel
> udp_tunnel aacraid dca ptp pps_core
> [545444.674783] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G           OE
> ------------   3.10.0-327.10.1.el7.dsip.x86_64 #1
> [545444.675032] Hardware name: empty empty/S7010, BIOS 'V2.06  ' 03/31/2010
> [545444.675162] task: ffff880139c55c00 ti: ffff880139c84000 task.ti:
> ffff880139c84000
> [545444.675400] RIP: 0010:[<ffffffff811c0e95>]  [<ffffffff811c0e95>]
> kmem_cache_alloc+0x75/0x1d0
> [545444.675641] RSP: 0018:ffff88023fc23ce8  EFLAGS: 00010286
> [545444.675766] RAX: 0000000000000000 RBX: ffff8802302eab00 RCX:
> 000000010eb8edbe
> [545444.676002] RDX: 000000010eb8edbd RSI: 0000000000000020 RDI:
> ffff88013b803700
> [545444.676237] RBP: ffff88023fc23d18 R08: 00000000000175a0 R09:
> ffffffff81517e70
> [545444.676472] R10: 000000000000006b R11: 0000000000000000 R12:
> ffff88a005040200
> [545444.676706] R13: 0000000000000020 R14: ffff88013b803700 R15:
> ffff88013b803700
> [545444.676942] FS:  0000000000000000(0000) GS:ffff88023fc20000(0000)
> knlGS:0000000000000000
> [545444.677180] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [545444.677307] CR2: ffff88a005040200 CR3: 0000000237e63000 CR4:
> 00000000000007e0
> [545444.677543] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [545444.677779] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [545444.678014] Stack:
> [545444.678127]  ffff880237ea2040 ffff8802302eab00 0000000000000280
> 0000000000000280
> [545444.678370]  0000000000000006 ffff880236bb1b60 ffff88023fc23d40
> ffffffff81517e70
> [545444.678614]  0000000000000280 ffff8802302eab00 0000000000000000
> ffff88023fc23d60
> [545444.678857] Call Trace:
> [545444.678973]  <IRQ>
> 
> [545444.678982]
> [545444.679100]  [<ffffffff81517e70>] build_skb+0x30/0x1d0
> [545444.679222]  [<ffffffff8151a973>] __alloc_rx_skb+0x63/0xb0
> [545444.679349]  [<ffffffff8151a9db>] __netdev_alloc_skb+0x1b/0x40
> [545444.679492]  [<ffffffffa0104d8e>] ixgbe_clean_rx_irq+0xee/0xa50 [ixgbe]
> [545444.679624]  [<ffffffff8152862f>] ? __napi_complete+0x1f/0x30
> [545444.679756]  [<ffffffffa0106738>] ixgbe_poll+0x2d8/0x6d0 [ixgbe]
> [545444.679886]  [<ffffffff8152b092>] net_rx_action+0x152/0x240
> [545444.680015]  [<ffffffff81084aef>] __do_softirq+0xef/0x280
> [545444.680144]  [<ffffffff8164735c>] call_softirq+0x1c/0x30
> [545444.680277]  [<ffffffff81016fc5>] do_softirq+0x65/0xa0
> [545444.680402]  [<ffffffff81084e85>] irq_exit+0x115/0x120
> [545444.680529]  [<ffffffff81647ef8>] do_IRQ+0x58/0xf0
> [545444.680660]  [<ffffffff8163d1ad>] common_interrupt+0x6d/0x6d
> [545444.680786]  <EOI>
> [545444.680794]
> [545444.680914]  [<ffffffff81058e96>] ? native_safe_halt+0x6/0x10
> [545444.681041]  [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
> [545444.681168]  [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
> [545444.681297]  [<ffffffff810d62c5>] cpu_startup_entry+0x245/0x290
> [545444.681427]  [<ffffffff810475fa>] start_secondary+0x1ba/0x230
> [545444.681554] Code: ce 00 00 49 8b 50 08 4d 8b 20 49 8b 40 10 4d 85
> e4 0f 84 1f 01 00 00 48 85 c0 0f 84 16 01 00 00 49 63 46 20 48 8d 4a
> 01 4d 8b 06 <49> 8b 1c 04 4c
> 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 b9 49 63
> [545444.682056] RIP  [<ffffffff811c0e95>] kmem_cache_alloc+0x75/0x1d0
> [545444.682186]  RSP <ffff88023fc23ce8>
> [545444.682305] CR2: ffff88a005040200
> 
> 
> Every time description and call stack are the same.
> 
> What can be cause of these crashes?
> 
> Thanks.
> 
I am wondering whether your kernel has commit 32b3e08fff60494cd1d281a39b51583edfd2b18f,
as it seems to have been added to fix issues that look very similar to the trace you are seeing.
Nick

^ permalink raw reply

* Re: Best way to reduce system call overhead for tun device I/O?
From: Guus Sliepen @ 2016-04-04 14:31 UTC (permalink / raw)
  To: ValdikSS
  Cc: Stephen Hemminger, Willem de Bruijn, David Miller, Tom Herbert,
	netdev
In-Reply-To: <57026C8F.8050406@valdikss.org.ru>

On Mon, Apr 04, 2016 at 04:30:55PM +0300, ValdikSS wrote:

> I'm trying to increase OpenVPN throughput by optimizing tun manipulations, too.
> Right now I have more questions than answers.
> 
> I get about 800 Mbit/s speeds via OpenVPN with authentication and encryption disabled on a local machine with OpenVPN server and client running in different
> network namespaces, which use veth for networking, with 1500 MTU on a TUN interface. This is rather limiting. Low-end devices like SOHO routers could only
> achieve 15-20 Mbit/s via OpenVPN with encryption with a 560 MHz CPU.
> Increasing MTU reduces overhead. You can get > 5GBit/s if you set 16000 MTU on a TUN interface.
> That's not only OpenVPN related. All the tunneling software I tried can't achieve gigabit speeds without encryption on my machine with MTU 1500. Didn't test
> tinc though.

It's exactly the same issue for tinc. But tinc does path MTU discovery,
and actively limits the size of packets inside the tunnel so that the
outer packets are not bigger than the PMTU. Of course this can be
disabled, but experience has shown that transmitting large UDP packets
over the Internet is not ideal, since they will be fragmented, and the
loss of one fragment means the whole packet is dropped. In the case of
OpenVPN, I think many users use -mssfix, so they too are in effect
limiting the size of packets inside the tunnel.

Of course, tinc could fragment packets internally (it actually does so
in some circumstances), but I'd rather avoid that.

Also, GSO and GRO only deal with optimizations within one UDP packet or
one TCP stream. If you have many concurrent programs sending data, or
one program sending lots of small UDP packets, those will never be
optimized.

So I think GSO/GRO is not the way to go, but there really should be a
way to receive and send many individual packets in one system call.
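
For ordinary sockets this kind of batching already exists as
sendmmsg()/recvmmsg(); a tun file descriptor is not a socket, so those
calls cannot be used on it directly. As a rough sketch of the interface
shape I have in mind, the following uses an AF_UNIX datagram socket pair
purely as a stand-in for the device. The function name batch_roundtrip()
and the BATCH and PKT_SIZE constants are made up for this example.

```c
#define _GNU_SOURCE             /* sendmmsg()/recvmmsg() on glibc */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH    8              /* illustrative maximum batch size */
#define PKT_SIZE 64             /* illustrative packet size */

/*
 * Send `count` datagrams with a single sendmmsg() call and read them back
 * with a single recvmmsg() call. Returns the number of datagrams received
 * intact, or -1 on error.
 */
int batch_roundtrip(unsigned int count)
{
	int sv[2];
	char txbuf[BATCH][PKT_SIZE], rxbuf[BATCH][PKT_SIZE];
	struct iovec txiov[BATCH], rxiov[BATCH];
	struct mmsghdr txmsg[BATCH], rxmsg[BATCH];
	struct timeval tmo = { .tv_sec = 1, .tv_usec = 0 };
	int sent, got = -1;
	unsigned int i;

	assert(count > 0 && count <= BATCH);

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	/* Do not block forever if fewer datagrams arrive than expected. */
	setsockopt(sv[1], SOL_SOCKET, SO_RCVTIMEO, &tmo, sizeof(tmo));

	memset(txmsg, 0, sizeof(txmsg));
	memset(rxmsg, 0, sizeof(rxmsg));
	for (i = 0; i < count; i++) {
		memset(txbuf[i], 'a' + (int)i, PKT_SIZE);
		txiov[i].iov_base = txbuf[i];
		txiov[i].iov_len = PKT_SIZE;
		txmsg[i].msg_hdr.msg_iov = &txiov[i];
		txmsg[i].msg_hdr.msg_iovlen = 1;

		rxiov[i].iov_base = rxbuf[i];
		rxiov[i].iov_len = PKT_SIZE;
		rxmsg[i].msg_hdr.msg_iov = &rxiov[i];
		rxmsg[i].msg_hdr.msg_iovlen = 1;
	}

	/* One system call for the whole batch in each direction. */
	sent = sendmmsg(sv[0], txmsg, count, 0);
	if (sent > 0)
		got = recvmmsg(sv[1], rxmsg, (unsigned int)sent, 0, NULL);

	/* Check that every received packet has the expected size and content. */
	for (i = 0; got > 0 && i < (unsigned int)got; i++) {
		if (rxmsg[i].msg_len != PKT_SIZE ||
		    memcmp(rxbuf[i], txbuf[i], PKT_SIZE) != 0)
			got = -1;
	}

	close(sv[0]);
	close(sv[1]);
	return got;
}
```

Each packet stays an individual datagram with its own length in msg_len,
so nothing has to be merged or segmented; only the number of system calls
goes down.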

-- 
Met vriendelijke groet / with kind regards,
     Guus Sliepen <guus@tinc-vpn.org>

^ permalink raw reply

* Re: [RFC PATCH 1/5] bpf: add PHYS_DEV prog type for early driver filter
From: Eric Dumazet @ 2016-04-04 14:33 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Daniel Borkmann, Brenden Blanco, davem, netdev, tom,
	alexei.starovoitov, gerlitz, john.fastabend
In-Reply-To: <20160404150700.1456ae80@redhat.com>

On Mon, 2016-04-04 at 15:07 +0200, Jesper Dangaard Brouer wrote:

> Argh... maybe the minimal pseudo/fake SKB is the wrong "signal" to send
> to users of this API.
> 
> The whole idea is that an SKB is NOT allocated yet, and not needed at
> this level.  If we start supporting calling underlying SKB functions,
> then we will end up in the same place (performance wise).

A BPF program can access many skb fields.

If you plan to support BPF, your fake skb needs to be populated like a
real one. Looks like some code will be replicated in all drivers that
want this facility...

Or accept (document ?) that some BPF instructions are just not there.
(hash, queue_mapping ...)

^ permalink raw reply

* Re: Best way to reduce system call overhead for tun device I/O?
From: Guus Sliepen @ 2016-04-04 14:40 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Stephen Hemminger, David Miller, Tom Herbert, Network Development
In-Reply-To: <CAF=yD-LUaxsJMZiGXQdEDh-6UE11ApL89rjt=13oLK3FM1rerQ@mail.gmail.com>

On Sun, Apr 03, 2016 at 07:03:09PM -0400, Willem de Bruijn wrote:

> On Thu, Mar 31, 2016 at 7:39 PM, Stephen Hemminger <stephen@networkplumber.org> wrote:
>
> > Rather than bodge AF_PACKET onto TUN, why not just create a new device type
> > and control it from something modern like netlink.

Do we really want to introduce a whole new device type? The tun device
is working perfectly fine, except for the fact that there is no way to
send/receive multiple packets in one go.

> Depending on the use-case, it may be sufficient to extend AF_PACKET
> with limited tap functionality:
> 
> - add a po->xmit mode that reinjects into the kernel receive path,
>   analogous to pktgen's M_NETIF_RECEIVE mode.
> 
> - optionally drop packets in __netif_receive_skb_core and xmit_one
>   if any of the registered packet sockets accepted the packet and has
>   a new intercept feature flag enabled.
> 
> This can be applied to a dummy device, but much more interesting
> is to interpose on the flow of a normal nic. It is clearly not a drop-in
> replacement for a tap (let alone tun) device. I have some preliminary
> code.

It's not really tinc's use case, but I did try using socket(AF_PACKET)
bound to a tun interface, just to see if sendmmsg()/recvmmsg() works
then. It does, but indeed for packets that are sent to the socket, they
need to be reinjected into the kernel receive path. So I'll be happy to
test out your preliminary code.

-- 
Met vriendelijke groet / with kind regards,
     Guus Sliepen <guus@tinc-vpn.org>

^ permalink raw reply

* Re: [PATCH v5 2/4] Documentation: Bindings: Add STM32 DWMAC glue
From: Alexandre Torgue @ 2016-04-04 14:40 UTC (permalink / raw)
  To: Joachim Eastwood
  Cc: Rob Herring, Chen-Yu Tsai, Maxime Coquelin, Giuseppe Cavallaro,
	netdev, devicetree, linux-kernel, linux-arm-kernel
In-Reply-To: <CAJgp7zxfFRhfNjcE23n+DKD0ffrvH4_sowPjK_-DZAS4ecH5dg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Hi Rob,

2016-03-22 17:11 GMT+01:00 Alexandre Torgue <alexandre.torgue-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
> Hi guys,
>
> I will fix typo issues (s/vesrion/version and ethernet @).
>
> Concerning the compatible string: the "snps,dwmac-3.50a" string is
> indeed not used inside the glue driver.
> I prefer to keep it for information, but if you really want me to
> remove it I will not block ;)
>
> 2016-03-21 16:36 GMT+01:00 Joachim  Eastwood <manabian-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
>> On 21 March 2016 at 13:40, Rob Herring <robh-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>>> On Sat, Mar 19, 2016 at 12:00:22AM +0800, Chen-Yu Tsai wrote:
>>>> Hi,
>>>>
>>>> On Fri, Mar 18, 2016 at 11:37 PM, Alexandre TORGUE
>>>> <alexandre.torgue-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>> > +- clocks: Must contain a phandle for each entry in clock-names.
>>>> > +- clock-names: Should be "stmmaceth" for the host clock.
>>>
> We can remove host clock (stmmac eth) entry here and refer to
> stmmac.txt binding for common entry
>
>>> This doesn't sound like the clock input signal name...
>>>
>>>> > +              Should be "tx-clk" for the MAC TX clock.
>>>> > +              Should be "rx-clk" for the MAC RX clock.
>>>
>>> How can other DWMAC blocks not have these clocks? The glue can't really
>>> add these clocks. It could combine them into one or a new version of
>>> DWMAC could have a different number of clock inputs. So if there is
>>> variation here, then some of the bindings are probably wrong. I guess
>>> the only change I'm suggesting is possibly moving these into common
>>> binding doc.
>>
>> The LPC18xx implementation probably has these clocks as well but the
>> LPC1850 user manual only documents the main clock. Someone with access
>> to the IP block doc from Synopsys should be able to check which clocks
>> the MAC really needs.
>>
>> Rockchip bindings have two clocks named "mac_clk_rx" and "mac_clk_tx".
>> These are probably the same as stm32 needs so maybe use these names
>> and move them into the main doc and update the rockchip binding.
>>
> I think we can use the same names. But I have a doubt about moving them
> into a common binding (maybe I have not understood correctly). When you
> say "common binding file", is it the "stmmac.txt" binding? If yes, does
> it mean that we have to control them inside the stmmac driver (no longer
> in the glue)? In this case those clocks will become "required" for stm32
> and rockchip but not for other chips. Could that create confusion?

A gentle ping. Could you give me your feedback, please?
I will send the next patchset version according to your answer.

Thanks in advance

Alex

>
> Best regards
>
> Alex
>
>>
>> regards,
>> Joachim Eastwood

^ permalink raw reply
