Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
From: Edward Cree @ 2019-08-20 14:15 UTC (permalink / raw)
  To: Pablo Neira Ayuso, netfilter-devel
  Cc: davem, netdev, jakub.kicinski, jiri, vladbu
In-Reply-To: <20190820105225.13943-1-pablo@netfilter.org>

On 20/08/2019 11:52, Pablo Neira Ayuso wrote:
> The existing infrastructure needs the front-end to generate up to four
> actions (one for each 32-bit word) to mangle an IPv6 address. This patch
> allows you to mangle fields than are longer than 4-bytes with one single
> action. Drivers have been adapted to this new representation following a
> simple approach, that is, iterate over the array of words and configure
> the hardware IR to make the packet mangling. FLOW_ACTION_MANGLE_MAX_WORDS
> defines the maximum number of words from one given offset (currently 4
> words).
>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
What's the point of this?
Why do you need to be able to do this with a single action?  It doesn't
 look like this extra 70 lines of code is actually buying you anything,
 and it makes more work for any other drivers that want to implement the
 offload API.

^ permalink raw reply

* Re: linux-next: Tree for Aug 19 (drivers/net/netdevsim/dev.o)
From: Ido Schimmel @ 2019-08-20 14:13 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, netdev@vger.kernel.org
In-Reply-To: <92ef45a5-c933-0493-b2ff-50352fa8bf3f@infradead.org>

On Mon, Aug 19, 2019 at 09:16:13PM -0700, Randy Dunlap wrote:
> On 8/19/19 2:18 AM, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Changes since 20190816:
> > 
> 
> on x86_64:
> # CONFIG_INET is not set
> 
> ld: drivers/net/netdevsim/dev.o: in function `nsim_dev_trap_report_work':
> dev.c:(.text+0x52f): undefined reference to `ip_send_check'

Thanks, Randy.

There is a fix here [1], but some changes were requested. I will send a
fix later today and Cc you.

[1] https://patchwork.ozlabs.org/patch/1149229/

^ permalink raw reply

* Re: [PATCH net-next] netdevsim: Fix build error without CONFIG_INET
From: Ido Schimmel @ 2019-08-20 14:08 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: YueHaibing, davem, idosch, jiri, mcroce, linux-kernel, netdev
In-Reply-To: <20190819145900.5d9cc1f3@cakuba.netronome.com>

On Mon, Aug 19, 2019 at 02:59:00PM -0700, Jakub Kicinski wrote:
> On Mon, 19 Aug 2019 20:08:25 +0800, YueHaibing wrote:
> > If CONFIG_INET is not set, building fails:
> > 
> > drivers/net/netdevsim/dev.o: In function `nsim_dev_trap_report_work':
> > dev.c:(.text+0x67b): undefined reference to `ip_send_check'
> > 
> > Add CONFIG_INET Kconfig dependency to fix this.
> > 
> > Reported-by: Hulk Robot <hulkci@huawei.com>
> > Fixes: da58f90f11f5 ("netdevsim: Add devlink-trap support")
> > Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> 
> Hmm.. I'd rather the test module did not have hard dependencies on
> marginally important config options. We have done a pretty good job
> so far limiting the requirements though separating the code out at
> compilation object level. The more tests depend on netdevsim and the
> more bots we have running tests against randconfig - the more important
> this is.
> 
> This missing reference here is for calculating a checksum over a
> constant header.. could we perhaps just hard code the checksum?

Sure. I was AFK today, will send a patch later today when I get home.

Thanks

^ permalink raw reply

* Re: [PATCH] selftests: bpf: install files test_xdp_vlan.sh
From: Jesper Dangaard Brouer @ 2019-08-20 14:07 UTC (permalink / raw)
  To: Anders Roxell
  Cc: shuah, ast, daniel, davem, jakub.kicinski, hawk, john.fastabend,
	linux-kselftest, netdev, bpf, linux-kernel
In-Reply-To: <20190820134121.25728-1-anders.roxell@linaro.org>

On Tue, 20 Aug 2019 15:41:21 +0200
Anders Roxell <anders.roxell@linaro.org> wrote:

> When ./test_xdp_vlan_mode_generic.sh runs it complains that it can't
> find file test_xdp_vlan.sh.
> 
>  # selftests: bpf: test_xdp_vlan_mode_generic.sh
>  # ./test_xdp_vlan_mode_generic.sh: line 9: ./test_xdp_vlan.sh: No such
>  file or directory
> 
> Rework so that test_xdp_vlan.sh gets installed, added to the variable
> TEST_PROGS_EXTENDED.
> 
> Fixes: d35661fcf95d ("selftests/bpf: add wrapper scripts for test_xdp_vlan.sh")
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
> ---

Thanks for catching this!

Acked-by: Jesper Dangaard Brouer <jbrouer@redhat.com>

>  tools/testing/selftests/bpf/Makefile | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 1faad0c3c3c9..d7968e20463c 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -68,7 +68,8 @@ TEST_PROGS := test_kmod.sh \
>  TEST_PROGS_EXTENDED := with_addr.sh \
>  	with_tunnels.sh \
>  	tcp_client.py \
> -	tcp_server.py
> +	tcp_server.py \
> +	test_xdp_vlan.sh
>  
>  # Compile but not part of 'make run_tests'
>  TEST_GEN_PROGS_EXTENDED = test_libbpf_open test_sock_addr test_skb_cgroup_id_user \



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* [PATCH bpf-next v2] bpf: add BTF ids in procfs for file descriptors to BTF objects
From: Quentin Monnet @ 2019-08-20 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: bpf, netdev, oss-drivers, Quentin Monnet

Implement the show_fdinfo hook for BTF FDs file operations, and make it
print the id of the BTF object. This allows for a quick retrieval of the
BTF id from its FD; or it can help understanding what type of object
(BTF) the file descriptor points to.

v2:
- Do not expose data_size, only btf_id, in FD info.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 kernel/bpf/btf.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 5fcc7a17eb5a..6b403dc18486 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3376,6 +3376,15 @@ void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
 	btf_type_ops(t)->seq_show(btf, t, type_id, obj, 0, m);
 }
 
+#ifdef CONFIG_PROC_FS
+static void bpf_btf_show_fdinfo(struct seq_file *m, struct file *filp)
+{
+	const struct btf *btf = filp->private_data;
+
+	seq_printf(m, "btf_id:\t%u\n", btf->id);
+}
+#endif
+
 static int btf_release(struct inode *inode, struct file *filp)
 {
 	btf_put(filp->private_data);
@@ -3383,6 +3392,9 @@ static int btf_release(struct inode *inode, struct file *filp)
 }
 
 const struct file_operations btf_fops = {
+#ifdef CONFIG_PROC_FS
+	.show_fdinfo	= bpf_btf_show_fdinfo,
+#endif
 	.release	= btf_release,
 };
 
-- 
2.17.1


^ permalink raw reply related

* Re: [RFC PATCH net-next 03/11] spi: Add a PTP system timestamp to the transfer structure
From: Vladimir Oltean @ 2019-08-20 13:48 UTC (permalink / raw)
  To: Mark Brown
  Cc: Hubert Feurstein, mlichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev
In-Reply-To: <20190820125557.GB4738@sirena.co.uk>

Hi Mark,

On Tue, 20 Aug 2019 at 15:55, Mark Brown <broonie@kernel.org> wrote:
>
> On Fri, Aug 16, 2019 at 05:05:53PM +0300, Vladimir Oltean wrote:
>
> > I'm not sure how to respond to this, because I don't know anything
> > about the timing of DMA transfers.
> > Maybe snapshotting DMA transfers the same way is not possible (if at
> > all). Maybe they are not exactly adequate for this sort of application
> > anyway. Maybe it depends.
>
> DMA transfers generally proceed without any involvement from the CPU,
> this is broadly the point of DMA.  You *may* be able to split into
> multiple transactions but it's not reliable that you'd be able to do so
> on byte boundaries and there will be latency getting notified of
> completions.
>
> > In other words, from a purely performance perspective, I am against
> > limiting the API to just snapshotting the first and last byte. At this
> > level of "zoom", if I change the offset of the byte to anything other
> > than 3, the synchronization offset refuses to converge towards zero,
> > because the snapshot is incurring a constant offset that the servo
> > loop from userspace (phc2sys) can't compensate for.
>
> > Maybe the SPI master driver should just report what sort of
> > snapshotting capability it can offer, ranging from none (default
> > unless otherwise specified), to transfer-level (DMA style) or
> > byte-level.
>
> That does then have the consequence that the majority of controllers
> aren't going to be usable with the API which isn't great.
>

Can we continue this discussion on this thread:
https://www.spinics.net/lists/netdev/msg593772.html
The whole point there is that if there's nothing that the driver can
do, the SPI core will take the timestamps and record their (bad)
precision.

> > I'm afraid more actual experimentation is needed with DMA-based
> > controllers to understand what can be expected from them, and as a
> > result, how the API should map around them.
> > MDIO bus controllers are in a similar situation (with Hubert's patch)
> > but at least there the frame size is fixed and I haven't heard of an
> > MDIO controller to use DMA.
>
> I'm not 100% clear what the problem you're trying to solve is, or if
> it's a sensible problem to try to solve for that matter.

The problem can simply be summarized as: you're trying to read a clock
over SPI, but there's so much timing jitter in you doing that, that
you have a high degree of uncertainty in the actual precision of the
readout you took.
The solution has two parts:
- Make the SPI access itself more predictable in terms of latency.
This is always going to have to be dealt with on a driver-by-driver,
hardware-by-hardware basis.
- Provide a way of taking a software timestamp in the time interval
when the latency is predictable, and as close as possible to the
moment when the SPI slave will receive the request. Disabling
interrupts and preemption always helps to snapshot that critical
section. Again, the SPI core can't do that. And finding the correct
"pre" and "post" hooks that surround the hardware transfer in a
deterministic fashion is crucial. If you read the cover letter, I used
a GPIO pin to make sure the timestamps are where they should be, and
that they don't vary in width (post - pre) - there are also some
screenshots on Gdrive. Maybe something similar is not impossible for a
DMA transfer, although the problem formulation so far is too vague to
emit a more clear statement.
If you know when the SPI slave's clock was actually read, you have a
better idea of what time it was.

Regards,
-Vladimir

^ permalink raw reply

* Re: [oss-drivers] Re: [PATCH bpf-next] bpf: add BTF ids in procfs for file descriptors to BTF objects
From: Quentin Monnet @ 2019-08-20 13:48 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: bpf, netdev, oss-drivers
In-Reply-To: <fcb2e528-6750-2192-befe-dd68ca36fc62@iogearbox.net>

2019-08-20 15:36 UTC+0200 ~ Daniel Borkmann <daniel@iogearbox.net>
> On 8/20/19 11:52 AM, Quentin Monnet wrote:
>> Implement the show_fdinfo hook for BTF FDs file operations, and make it
>> print the id and the size of the BTF object. This allows for a quick
>> retrieval of the BTF id from its FD; or it can help understanding what
>> type of object (BTF) the file descriptor points to.
>>
>> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
>> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>> ---
>>   kernel/bpf/btf.c | 16 ++++++++++++++++
>>   1 file changed, 16 insertions(+)
>>
>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>> index 5fcc7a17eb5a..39e184f1b27c 100644
>> --- a/kernel/bpf/btf.c
>> +++ b/kernel/bpf/btf.c
>> @@ -3376,6 +3376,19 @@ void btf_type_seq_show(const struct btf *btf,
>> u32 type_id, void *obj,
>>       btf_type_ops(t)->seq_show(btf, t, type_id, obj, 0, m);
>>   }
>>   +#ifdef CONFIG_PROC_FS
>> +static void bpf_btf_show_fdinfo(struct seq_file *m, struct file *filp)
>> +{
>> +    const struct btf *btf = filp->private_data;
>> +
>> +    seq_printf(m,
>> +           "btf_id:\t%u\n"
>> +           "data_size:\t%u\n",
>> +           btf->id,
>> +           btf->data_size);
> 
> Looks good, exposing btf_id makes sense to me in order to correlate with
> applications.
> Do you have a concrete use case for data_size to expose it this way as
> opposed to fetch
> it via btf_get_info_by_fd()? If not, I'd say lets only add btf_id in there.

No, I don't have one for data_size to be honest. I'll respin with btf_id
only then.

Thanks,
Quentin

^ permalink raw reply

* [PATCH] selftests: bpf: add config fragment BPF_JIT
From: Anders Roxell @ 2019-08-20 13:41 UTC (permalink / raw)
  To: shuah, ast, daniel
  Cc: linux-kselftest, netdev, bpf, linux-kernel, Anders Roxell

When running test_kmod.sh the following shows up

 # sysctl cannot stat /proc/sys/net/core/bpf_jit_enable No such file or directory
 cannot: stat_/proc/sys/net/core/bpf_jit_enable #
 # sysctl cannot stat /proc/sys/net/core/bpf_jit_harden No such file or directory
 cannot: stat_/proc/sys/net/core/bpf_jit_harden #

Rework to enable CONFIG_BPF_JIT to solve "No such file or directory"

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
---
 tools/testing/selftests/bpf/config | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index f7a0744db31e..5dc109f4c097 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -34,3 +34,4 @@ CONFIG_NET_MPLS_GSO=m
 CONFIG_MPLS_ROUTING=m
 CONFIG_MPLS_IPTUNNEL=m
 CONFIG_IPV6_SIT=m
+CONFIG_BPF_JIT=y
-- 
2.20.1


^ permalink raw reply related

* [PATCH] selftests: bpf: install files test_xdp_vlan.sh
From: Anders Roxell @ 2019-08-20 13:41 UTC (permalink / raw)
  To: shuah, ast, daniel, davem, jakub.kicinski, hawk, john.fastabend
  Cc: linux-kselftest, netdev, bpf, linux-kernel, Anders Roxell

When ./test_xdp_vlan_mode_generic.sh runs it complains that it can't
find file test_xdp_vlan.sh.

 # selftests: bpf: test_xdp_vlan_mode_generic.sh
 # ./test_xdp_vlan_mode_generic.sh: line 9: ./test_xdp_vlan.sh: No such
 file or directory

Rework so that test_xdp_vlan.sh gets installed, added to the variable
TEST_PROGS_EXTENDED.

Fixes: d35661fcf95d ("selftests/bpf: add wrapper scripts for test_xdp_vlan.sh")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
---
 tools/testing/selftests/bpf/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 1faad0c3c3c9..d7968e20463c 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -68,7 +68,8 @@ TEST_PROGS := test_kmod.sh \
 TEST_PROGS_EXTENDED := with_addr.sh \
 	with_tunnels.sh \
 	tcp_client.py \
-	tcp_server.py
+	tcp_server.py \
+	test_xdp_vlan.sh
 
 # Compile but not part of 'make run_tests'
 TEST_GEN_PROGS_EXTENDED = test_libbpf_open test_sock_addr test_skb_cgroup_id_user \
-- 
2.20.1


^ permalink raw reply related

* [PATCH] selftests: net: add missing NFT_FWD_NETDEV to config
From: Anders Roxell @ 2019-08-20 13:41 UTC (permalink / raw)
  To: davem, shuah; +Cc: netdev, linux-kselftest, linux-kernel, Anders Roxell

When running xfrm_policy.sh we see the following

 # sysctl cannot stat /proc/sys/net/ipv4/conf/eth1/forwarding No such file or directory
 cannot: stat_/proc/sys/net/ipv4/conf/eth1/forwarding #
 # sysctl cannot stat /proc/sys/net/ipv4/conf/veth0/forwarding No such file or directory
 cannot: stat_/proc/sys/net/ipv4/conf/veth0/forwarding #
 # sysctl cannot stat /proc/sys/net/ipv4/conf/eth1/forwarding No such file or directory
 cannot: stat_/proc/sys/net/ipv4/conf/eth1/forwarding #
 # sysctl cannot stat /proc/sys/net/ipv4/conf/veth0/forwarding No such file or directory
 cannot: stat_/proc/sys/net/ipv4/conf/veth0/forwarding #
 # sysctl cannot stat /proc/sys/net/ipv6/conf/eth1/forwarding No such file or directory
 cannot: stat_/proc/sys/net/ipv6/conf/eth1/forwarding #
 # sysctl cannot stat /proc/sys/net/ipv6/conf/veth0/forwarding No such file or directory
 cannot: stat_/proc/sys/net/ipv6/conf/veth0/forwarding #
 # sysctl cannot stat /proc/sys/net/ipv6/conf/eth1/forwarding No such file or directory
 cannot: stat_/proc/sys/net/ipv6/conf/eth1/forwarding #
 # sysctl cannot stat /proc/sys/net/ipv6/conf/veth0/forwarding No such file or directory
 cannot: stat_/proc/sys/net/ipv6/conf/veth0/forwarding #
 # modprobe FATAL Module ip_tables not found in directory /lib/modules/5.3.0-rc5-next-20190820+
 FATAL: Module_ip_tables #
 # iptables v1.6.2 can't initialize iptables table `filter' Table does not exist (do you need to insmod?)
 v1.6.2: can't_initialize #

Rework to enable CONFIG_NF_TABLES_NETDEV and CONFIG_NFT_FWD_NETDEV.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
---
 tools/testing/selftests/net/config | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config
index b8503a8119b0..e30b0ae5d474 100644
--- a/tools/testing/selftests/net/config
+++ b/tools/testing/selftests/net/config
@@ -29,3 +29,5 @@ CONFIG_NET_SCH_FQ=m
 CONFIG_NET_SCH_ETF=m
 CONFIG_TEST_BLACKHOLE_DEV=m
 CONFIG_KALLSYMS=y
+CONFIG_NF_TABLES_NETDEV=y
+CONFIG_NFT_FWD_NETDEV=m
-- 
2.20.1


^ permalink raw reply related

* [PATCH] bonding: force enable lacp port after link state recovery for 802.3ad
From: zhangsha.zhang @ 2019-08-20 13:38 UTC (permalink / raw)
  To: j.vosburgh, vfalico, andy, davem
  Cc: netdev, linux-kernel, zhangsha.zhang, yuehaibing, hunongda,
	alex.chen

From: Sha Zhang <zhangsha.zhang@huawei.com>

After the commit 334031219a84 ("bonding/802.3ad: fix slave link
initialization transition states") merged,
the slave's link status will be changed to BOND_LINK_FAIL
from BOND_LINK_DOWN in the following scenario:
- Driver reports loss of carrier and
  bonding driver receives NETDEV_CHANGE notifier
- slave's duplex and speed is zerod and
  its port->is_enabled is cleard to 'false';
- Driver reports link recovery and
  bonding driver receives NETDEV_UP notifier;
- If speed/duplex getting failed here, the link status
  will be changed to BOND_LINK_FAIL;
- The MII monotor later recover the slave's speed/duplex
  and set link status to BOND_LINK_UP, but remains
  the 'port->is_enabled' to 'false'.

In this scenario, the lacp port will not be enabled even its speed
and duplex are valid. The bond will not send LACPDU's, and its
state is 'AD_STATE_DEFAULTED' forever. The simplest fix I think
is to force enable lacp after port slave speed check in
bond_miimon_commit. As enabled, the lacp port can run its state machine
normally after link recovery.

Signed-off-by: Sha Zhang <zhangsha.zhang@huawei.com>
---
 drivers/net/bonding/bond_main.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 931d9d9..379253a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2194,6 +2194,7 @@ static void bond_miimon_commit(struct bonding *bond)
 {
 	struct list_head *iter;
 	struct slave *slave, *primary;
+	struct port *port;
 
 	bond_for_each_slave(bond, slave, iter) {
 		switch (slave->new_link) {
@@ -2205,8 +2206,13 @@ static void bond_miimon_commit(struct bonding *bond)
 			 * link status
 			 */
 			if (BOND_MODE(bond) == BOND_MODE_8023AD &&
-			    slave->link == BOND_LINK_UP)
+			    slave->link == BOND_LINK_UP) {
 				bond_3ad_adapter_speed_duplex_changed(slave);
+				if (slave->duplex == DUPLEX_FULL) {
+					port = &(SLAVE_AD_INFO(slave)->port);
+					port->is_enabled = true;
+				}
+			}
 			continue;
 
 		case BOND_LINK_UP:
-- 
1.8.3.1


^ permalink raw reply related

* Re: [PATCH bpf-next] bpf: add BTF ids in procfs for file descriptors to BTF objects
From: Daniel Borkmann @ 2019-08-20 13:36 UTC (permalink / raw)
  To: Quentin Monnet, Alexei Starovoitov; +Cc: bpf, netdev, oss-drivers
In-Reply-To: <20190820095233.17097-1-quentin.monnet@netronome.com>

On 8/20/19 11:52 AM, Quentin Monnet wrote:
> Implement the show_fdinfo hook for BTF FDs file operations, and make it
> print the id and the size of the BTF object. This allows for a quick
> retrieval of the BTF id from its FD; or it can help understanding what
> type of object (BTF) the file descriptor points to.
> 
> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
>   kernel/bpf/btf.c | 16 ++++++++++++++++
>   1 file changed, 16 insertions(+)
> 
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 5fcc7a17eb5a..39e184f1b27c 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -3376,6 +3376,19 @@ void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
>   	btf_type_ops(t)->seq_show(btf, t, type_id, obj, 0, m);
>   }
>   
> +#ifdef CONFIG_PROC_FS
> +static void bpf_btf_show_fdinfo(struct seq_file *m, struct file *filp)
> +{
> +	const struct btf *btf = filp->private_data;
> +
> +	seq_printf(m,
> +		   "btf_id:\t%u\n"
> +		   "data_size:\t%u\n",
> +		   btf->id,
> +		   btf->data_size);

Looks good, exposing btf_id makes sense to me in order to correlate with applications.
Do you have a concrete use case for data_size to expose it this way as opposed to fetch
it via btf_get_info_by_fd()? If not, I'd say lets only add btf_id in there.

> +}
> +#endif
> +
>   static int btf_release(struct inode *inode, struct file *filp)
>   {
>   	btf_put(filp->private_data);
> @@ -3383,6 +3396,9 @@ static int btf_release(struct inode *inode, struct file *filp)
>   }
>   
>   const struct file_operations btf_fops = {
> +#ifdef CONFIG_PROC_FS
> +	.show_fdinfo	= bpf_btf_show_fdinfo,
> +#endif
>   	.release	= btf_release,
>   };
>   
> 

Thanks,
Daniel

^ permalink raw reply

* Re: [EXT] Re: [PATCH net-next 0/1] Add BASE-T1 PHY support
From: Christian Herber @ 2019-08-20 13:36 UTC (permalink / raw)
  To: Heiner Kallweit, davem@davemloft.net
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <13e65051-fe4f-5964-30b3-75285e6d2eee@gmail.com>

On 19.08.2019 21:07, Heiner Kallweit wrote:
> Caution: EXT Email
> 
> On 19.08.2019 08:32, Christian Herber wrote:
>> On 16.08.2019 22:59, Heiner Kallweit wrote:
>>> On 15.08.2019 17:32, Christian Herber wrote:
>>>> This patch adds basic support for BASE-T1 PHYs in the framework.
>>>> BASE-T1 PHYs main area of application are automotive and industrial.
>>>> BASE-T1 is standardized in IEEE 802.3, namely
>>>> - IEEE 802.3bw: 100BASE-T1
>>>> - IEEE 802.3bp 1000BASE-T1
>>>> - IEEE 802.3cg: 10BASE-T1L and 10BASE-T1S
>>>>
>>>> There are no products which contain BASE-T1 and consumer type PHYs like
>>>> 1000BASE-T. However, devices exist which combine 100BASE-T1 and 1000BASE-T1
>>>> PHYs with auto-negotiation.
>>>
>>> Is this meant in a way that *currently* there are no PHY's combining Base-T1
>>> with normal Base-T modes? Or are there reasons why this isn't possible in
>>> general? I'm asking because we have PHY's combining copper and fiber, and e.g.
>>> the mentioned Aquantia PHY that combines NBase-T with 1000Base-T2.
>>>
>>>>
>>>> The intention of this patch is to make use of the existing Clause 45 functions.
>>>> BASE-T1 adds some additional registers e.g. for aneg control, which follow a
>>>> similiar register layout as the existing devices. The bits which are used in
>>>> BASE-T1 specific registers are the same as in basic registers, thus the
>>>> existing functions can be resued, with get_aneg_ctrl() selecting the correct
>>>> register address.
>>>>
>>> If Base-T1 can't be combined with other modes then at a first glance I see no
>>> benefit in defining new registers e.g. for aneg control, and the standard ones
>>> are unused. Why not using the standard registers? Can you shed some light on that?
>>>
>>> Are the new registers internally shadowed to the standard location?
>>> That's something I've seen on other PHY's: one register appears in different
>>> places in different devices.
>>>
>>>> The current version of ethtool has been prepared for 100/1000BASE-T1 and works
>>>> with this patch. 10BASE-T1 needs to be added to ethtool.
>>>>
>>>> Christian Herber (1):
>>>>     Added BASE-T1 PHY support to PHY Subsystem
>>>>
>>>>    drivers/net/phy/phy-c45.c    | 113 +++++++++++++++++++++++++++++++----
>>>>    drivers/net/phy/phy-core.c   |   4 +-
>>>>    include/uapi/linux/ethtool.h |   2 +
>>>>    include/uapi/linux/mdio.h    |  21 +++++++
>>>>    4 files changed, 129 insertions(+), 11 deletions(-)
>>>>
>>>
>>> Heiner
>>>
>>
>> Hi Heiner,
>>
>> I do not think the Aquantia part you are describing is publicly
>> documented, so i cannot comment on that part.
> Right, datasheet isn't publicly available. All I wanted to say with
> mentioning this PHY: It's not a rare exception that a PHY combines
> standard BaseT modes with "non-consumer" modes for special purposes.
> One practical use case of this proprietary 1000Base-T2 mode is
> re-using existing 2-pair cabling in aircrafts.
> 
>> There are multiple reasons why e.g. xBASE-T1 plus 1000BASE-T is
>> unlikely. First, the is no use-case known to me, where this would be
>> required. Second, there is no way that you can do an auto-negotiation
>> between the two, as these both have their own auto-neg defined (Clause
>> 28/73 vs. Clause 98). Thirdly, if you would ever have a product with
>> both, I believe it would just include two full PHYs and a way to select
>> which flavor you want. Of course, this is the theory until proven
>> otherwise, but to me it is sufficient to use a single driver.
>>
> I'm with you if you say it's unlikely. However your statement in the
> commit message leaves the impression that there can't be such a device.
> And that's a difference.
> 
> Regarding "including two full PHYs":
> This case we have already, there are PHYs combining different IP blocks,
> each one supporting a specific mode (e.g. copper and fiber). There you
> also have the case of different autoneg methods, clause 28 vs. clause 37.
> 
>> The registers are different in the fields they include. It is just that
>> the flags which are used by the Linux driver, like restarting auto-neg,
>> are at the same position.
>>
> Good to know. Your commit description doesn't mention any specific PHY.
> I suppose you have PHYs you'd like to operate with the genphy_c45 driver.
> Could you give an example? And ideally, is a public datasheet available?
> 
>> Christian
>>
>>
> Heiner
> 

There are no public BASE-T1 devices on the market right now that use 
Clause 45 standard registers. The first such products were developed 
before the IEEE standard (BroadR-Reach) and used Clause 22 access (see 
e.g. the support in the Kernel for TJA110x).

The most convenient way to test with a BASE-T1 device would be to use an 
SFP (e.g. 
https://technica-engineering.de/produkt/1000base-t1-sfp-module/). 
Alternative source could be Goepel.

There are also a number of media-converters around where one could break 
out the MDIO and connect to a processor. Of course, in all cases it 
should be made sure that this is a Clause-45 device.

As all relevant parts are NDA-restricted, this is pretty much all the 
information I can share.

Christian


^ permalink raw reply

* Re: [PATCH net-next v3 2/4] net: mdio: add PTP offset compensation to mdiobus_write_sts
From: Andrew Lunn @ 2019-08-20 13:26 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Hubert Feurstein, netdev, linux-kernel, Richard Cochran,
	Florian Fainelli, Heiner Kallweit, Vladimir Oltean,
	David S. Miller
In-Reply-To: <20190820094903.GI891@localhost>

On Tue, Aug 20, 2019 at 11:49:03AM +0200, Miroslav Lichvar wrote:
> On Tue, Aug 20, 2019 at 10:48:31AM +0200, Hubert Feurstein wrote:
> 
> > +	/* PTP offset compensation:
> > +	 * After the MDIO access is completed (from the chip perspective), the
> > +	 * switch chip will snapshot the PHC timestamp. To make sure our system
> > +	 * timestamp corresponds to the PHC timestamp, we have to add the
> > +	 * duration of this MDIO access to sts->post_ts. Linuxptp's phc2sys
> > +	 * takes the average of pre_ts and post_ts to calculate the final
> > +	 * system timestamp. With this in mind, we have to add ptp_sts_offset
> > +	 * twice to post_ts, in order to not introduce an constant time offset.
> > +	 */
> > +	if (sts)
> > +		timespec64_add_ns(&sts->post_ts, 2 * bus->ptp_sts_offset);
> 
> This correction looks good to me.
> 
> Is the MDIO write delay constant in reality, or does it at least have
> an upper bound? That is, is it always true that the post_ts timestamp
> does not point to a time before the PHC timestamp was actually taken?

The post_ts could be before the target hardware does anything. The
write triggers an MDIO bus transaction, sending about 64 bits of data
down a wire at around 2.5Mbps. So there is a minimum delay of 25uS
just sending the bits down the wire. It is unclear to me exactly when
the post_ts is taken, has the hardware actually sent the bits, or has
it just initiated sending the bits? In this case, there is an
interrupt sometime later indicating the transaction has completed, so
my guess would be, post_ts indicates the transaction has been
initiated.

Also, how long does the device on the end of the bus actually take to
decode the bits on the wire and do what it needs to do?

> This is important to not break the estimation of maximum error in the
> measured offset. Applications using the ioctl may assume that the
> maximum error is (post_ts-pre_ts)/2 (i.e. half of the delay printed by
> phc2sys). That would not work if the delay could be occasionally 50
> microseconds for instance, i.e. the post_ts timestamp would be earlier
> than the PHC timestamp.

Given my understanding of the hardware, post_ts-pre_ts should be
constant. But what it exactly measures is not clearly defined :-(

And different hardware will have different definitions.

But the real point is, by doing these timestamps here, we are as close
as possible to where the action really happens, and we are minimising
the number of undefined things we are measuring, so in general, the
error is minimised.

    Andrew

^ permalink raw reply

* Re: [EXT] Re: [PATCH net-next 0/1] net: fec: add C45 MDIO read/write support
From: Andrew Lunn @ 2019-08-20 13:04 UTC (permalink / raw)
  To: Andy Duan
  Cc: Marco Hartmann, davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Christian Herber
In-Reply-To: <VI1PR0402MB360079EAAE7042048B2F5AC8FFAB0@VI1PR0402MB3600.eurprd04.prod.outlook.com>

On Tue, Aug 20, 2019 at 02:32:26AM +0000, Andy Duan wrote:
> From: Andrew Lunn <andrew@lunn.ch>
> > On Mon, Aug 19, 2019 at 05:11:14PM +0000, Marco Hartmann wrote:
> > > As of yet, the Fast Ethernet Controller (FEC) driver only supports
> > > Clause 22 conform MDIO transactions. IEEE 802.3ae Clause 45 defines a
> > > modified MDIO protocol that uses a two staged access model in order to
> > > increase the address space.
> > >
> > > This patch adds support for Clause 45 conform MDIO read and write
> > > operations to the FEC driver.
> > 
> > Hi Marco
> > 
> > Do all versions of the FEC hardware support C45? Or do we need to make use
> > of the quirk support in this driver to just enable it for some revisions of FEC?
> > 
> > Thanks
> >         Andrew
> 
> i.MX legacy platforms like i.MX6/7 series, they doesn't support Write & Read Increment.
> But for i.MX8MQ/MM series, it support C45 full features like Write & Read Increment.
> 
> For the patch itself, it doesn't support Write & Read Increment, so I think the patch doesn't
> need to add quirk support.

Hi Andy

So what happens with something older than a i.MX8MQ/MM when a C45
transfer is attempted? This patch adds a new write. Does that write
immediately trigger a completion interrupt? Does it never trigger an
interrupt, and we have to wait FEC_MII_TIMEOUT?

Ideally, if the hardware does not support C45, we want it to return
EOPNOTSUPP.

Thanks
	Andrew

^ permalink raw reply

* Re: [PATCH net-next v2] net: openvswitch: Set OvS recirc_id from tc chain index
From: Paul Blakey @ 2019-08-20 12:56 UTC (permalink / raw)
  To: Pravin B Shelar, netdev@vger.kernel.org, David S. Miller,
	Justin Pettit, Simon Horman, Marcelo Ricardo Leitner, Vlad Buslov
  Cc: Jiri Pirko, Roi Dayan, Yossi Kuperman, Rony Efraim, Oz Shlomo
In-Reply-To: <1566304834-22836-1-git-send-email-paulb@mellanox.com>

Hey guys, sorry for spam, I used the --in-reply-to  this time so it gets 
to the original thread ("[PATCH net-next v2] net: openvswitch: Set OvS 
recirc_id from tc chain index") ,

Ignore this thread and respond there if needed.

Thanks.


On 8/20/2019 3:40 PM, Paul Blakey wrote:
> Regarding the user_features change, I tested the above patch with this one in
> userspace that I'll send once this is accepted, togother with the rest
> of connection tracking offload patches.
>
> I also have a test for it, if anyone wants it.
>
> Patch is:
> lib/netdev-offloads-tc: Probe recirc tc sharing feature on first recirc_id rule
> [...]

^ permalink raw reply

* Re: [RFC PATCH net-next 03/11] spi: Add a PTP system timestamp to the transfer structure
From: Mark Brown @ 2019-08-20 12:55 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Hubert Feurstein, mlichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev
In-Reply-To: <CA+h21hrZbun_j+oABJFP+P+V3zHP2x0mAhv-1ocF38miCvZHew@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1860 bytes --]

On Fri, Aug 16, 2019 at 05:05:53PM +0300, Vladimir Oltean wrote:

> I'm not sure how to respond to this, because I don't know anything
> about the timing of DMA transfers.
> Maybe snapshotting DMA transfers the same way is not possible (if at
> all). Maybe they are not exactly adequate for this sort of application
> anyway. Maybe it depends.

DMA transfers generally proceed without any involvement from the CPU,
this is broadly the point of DMA.  You *may* be able to split into
multiple transactions but it's not reliable that you'd be able to do so
on byte boundaries and there will be latency getting notified of
completions.

> In other words, from a purely performance perspective, I am against
> limiting the API to just snapshotting the first and last byte. At this
> level of "zoom", if I change the offset of the byte to anything other
> than 3, the synchronization offset refuses to converge towards zero,
> because the snapshot is incurring a constant offset that the servo
> loop from userspace (phc2sys) can't compensate for.

> Maybe the SPI master driver should just report what sort of
> snapshotting capability it can offer, ranging from none (default
> unless otherwise specified), to transfer-level (DMA style) or
> byte-level.

That does then have the consequence that the majority of controllers
aren't going to be usable with the API which isn't great.

> I'm afraid more actual experimentation is needed with DMA-based
> controllers to understand what can be expected from them, and as a
> result, how the API should map around them.
> MDIO bus controllers are in a similar situation (with Hubert's patch)
> but at least there the frame size is fixed and I haven't heard of an
> MDIO controller to use DMA.

I'm not 100% clear what the problem you're trying to solve is, or if
it's a sensible problem to try to solve for that matter.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH net-next v2] net: openvswitch: Set OvS recirc_id from tc chain index
From: Paul Blakey @ 2019-08-20 12:51 UTC (permalink / raw)
  To: Pravin B Shelar, netdev, David S. Miller, Justin Pettit,
	Simon Horman, Marcelo Ricardo Leitner, Vlad Buslov, Paul Blakey
  Cc: Jiri Pirko, Roi Dayan, Yossi Kuperman, Rony Efraim, Oz Shlomo
In-Reply-To: <1566304251-15795-1-git-send-email-paulb@mellanox.com>

Regarding the user_features change, I tested the above patch with this one in
userspace that I'll send once this is accepted, togother with the rest
of connection tracking offload patches.

I also have a test for it, if anyone wants it.

Patch is:
lib/netdev-offloads-tc: Probe recirc tc sharing feature on first recirc_id rule

Signed-off-by: Paul Blakey <paulb@mellanox.com>
---
 datapath/linux/compat/include/linux/openvswitch.h |  3 ++
 lib/dpif-netdev.c                                 |  1 +
 lib/dpif-netlink.c                                | 61 +++++++++++++++++++----
 lib/dpif-provider.h                               |  2 +
 lib/dpif.c                                        |  9 ++++
 lib/dpif.h                                        |  2 +
 lib/netdev-offload-tc.c                           | 33 ++++++++++--
 lib/netdev-offload.h                              |  2 +-
 8 files changed, 100 insertions(+), 13 deletions(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
index 65a003a..921ef5b 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -143,6 +143,9 @@ struct ovs_vport_stats {
 /* Allow datapath to associate multiple Netlink PIDs to each vport */
 #define OVS_DP_F_VPORT_PIDS	(1 << 1)
 
+/* Allow tc offload recirc sharing */
+#define OVS_DP_F_TC_RECIRC_SHARING  (1 << 2)
+
 /* Fixed logical ports. */
 #define OVSP_LOCAL      ((__u32)0)
 
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index d0a1c58..fd8275f 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -7489,6 +7489,7 @@ const struct dpif_class dpif_netdev_class = {
     dpif_netdev_run,
     dpif_netdev_wait,
     dpif_netdev_get_stats,
+    NULL,                      /* set_features */
     dpif_netdev_port_add,
     dpif_netdev_port_del,
     dpif_netdev_port_set_config,
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 07a9ddd..15e6057 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -192,6 +192,7 @@ struct dpif_handler {
 struct dpif_netlink {
     struct dpif dpif;
     int dp_ifindex;
+    uint32_t user_features;
 
     /* Upcall messages. */
     struct fat_rwlock upcall_lock;
@@ -303,7 +304,6 @@ dpif_netlink_enumerate(struct sset *all_dps,
     if (error) {
         return error;
     }
-
     ofpbuf_use_stub(&buf, reply_stub, sizeof reply_stub);
     dpif_netlink_dp_dump_start(&dump);
     while (nl_dump_next(&dump, &msg, &buf)) {
@@ -333,15 +333,26 @@ dpif_netlink_open(const struct dpif_class *class OVS_UNUSED, const char *name,
 
     /* Create or look up datapath. */
     dpif_netlink_dp_init(&dp_request);
+    upcall_pid = 0;
+    dp_request.upcall_pid = &upcall_pid;
+    dp_request.name = name;
+
     if (create) {
         dp_request.cmd = OVS_DP_CMD_NEW;
-        upcall_pid = 0;
-        dp_request.upcall_pid = &upcall_pid;
     } else {
+        dp_request.cmd = OVS_DP_CMD_GET;
+
+        error = dpif_netlink_dp_transact(&dp_request, &dp, &buf);
+        if (error)  {
+            return error;
+        }
+        dp_request.user_features = dp.user_features;
+        ofpbuf_delete(buf);
+
         /* Use OVS_DP_CMD_SET to report user features */
         dp_request.cmd = OVS_DP_CMD_SET;
     }
-    dp_request.name = name;
+
     dp_request.user_features |= OVS_DP_F_UNALIGNED;
     dp_request.user_features |= OVS_DP_F_VPORT_PIDS;
     error = dpif_netlink_dp_transact(&dp_request, &dp, &buf);
@@ -367,6 +378,7 @@ open_dpif(const struct dpif_netlink_dp *dp, struct dpif **dpifp)
               dp->dp_ifindex, dp->dp_ifindex);
 
     dpif->dp_ifindex = dp->dp_ifindex;
+    dpif->user_features = dp->user_features;
     *dpifp = &dpif->dpif;
 
     return 0;
@@ -662,6 +674,31 @@ dpif_netlink_get_stats(const struct dpif *dpif_, struct dpif_dp_stats *stats)
     return error;
 }
 
+static int
+dpif_netlink_set_features(struct dpif *dpif_, uint32_t new_features)
+{
+    struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
+    struct dpif_netlink_dp request, reply;
+    struct ofpbuf *bufp;
+    int error;
+
+    dpif_netlink_dp_init(&request);
+    request.cmd = OVS_DP_CMD_SET;
+    request.dp_ifindex = dpif->dp_ifindex;
+    request.user_features = dpif->user_features | new_features;
+
+    error = dpif_netlink_dp_transact(&request, &reply, &bufp);
+    if (!error) {
+        dpif->user_features = reply.user_features;
+        ofpbuf_delete(bufp);
+        if (!(dpif->user_features & new_features)) {
+            return -EOPNOTSUPP;
+        }
+    }
+
+    return error;
+}
+
 static const char *
 get_vport_type(const struct dpif_netlink_vport *vport)
 {
@@ -1989,7 +2026,6 @@ static int
 parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)
 {
     static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
-    const struct dpif_class *dpif_class = dpif->dpif.dpif_class;
     struct match match;
     odp_port_t in_port;
     const struct nlattr *nla;
@@ -2011,7 +2047,7 @@ parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)
     }
 
     in_port = match.flow.in_port.odp_port;
-    dev = netdev_ports_get(in_port, dpif_class);
+    dev = netdev_ports_get(in_port, dpif->dpif.dpif_class);
     if (!dev) {
         return EOPNOTSUPP;
     }
@@ -2024,7 +2060,7 @@ parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)
             odp_port_t out_port;
 
             out_port = nl_attr_get_odp_port(nla);
-            outdev = netdev_ports_get(out_port, dpif_class);
+            outdev = netdev_ports_get(out_port, dpif->dpif.dpif_class);
             if (!outdev) {
                 err = EOPNOTSUPP;
                 goto out;
@@ -2040,7 +2076,7 @@ parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)
         }
     }
 
-    info.dpif_class = dpif_class;
+    info.dpif = &dpif->dpif;
     info.tp_dst_port = dst_port;
     info.tunnel_csum_on = csum_on;
     err = netdev_flow_put(dev, &match,
@@ -3394,6 +3430,7 @@ const struct dpif_class dpif_netlink_class = {
     dpif_netlink_run,
     NULL,                       /* wait */
     dpif_netlink_get_stats,
+    dpif_netlink_set_features,
     dpif_netlink_port_add,
     dpif_netlink_port_del,
     NULL,                       /* port_set_config */
@@ -3702,6 +3739,9 @@ dpif_netlink_dp_from_ofpbuf(struct dpif_netlink_dp *dp, const struct ofpbuf *buf
         [OVS_DP_ATTR_MEGAFLOW_STATS] = {
                         NL_POLICY_FOR(struct ovs_dp_megaflow_stats),
                         .optional = true },
+        [OVS_DP_ATTR_USER_FEATURES] = {
+                        .type = NL_A_U32,
+                        .optional = true },
     };
 
     dpif_netlink_dp_init(dp);
@@ -3730,6 +3770,10 @@ dpif_netlink_dp_from_ofpbuf(struct dpif_netlink_dp *dp, const struct ofpbuf *buf
         dp->megaflow_stats = nl_attr_get(a[OVS_DP_ATTR_MEGAFLOW_STATS]);
     }
 
+    if (a[OVS_DP_ATTR_USER_FEATURES]) {
+        dp->user_features = nl_attr_get_u32(a[OVS_DP_ATTR_USER_FEATURES]);
+    }
+
     return 0;
 }
 
@@ -3802,7 +3846,6 @@ dpif_netlink_dp_transact(const struct dpif_netlink_dp *request,
     dpif_netlink_dp_to_ofpbuf(request, request_buf);
     error = nl_transact(NETLINK_GENERIC, request_buf, bufp);
     ofpbuf_delete(request_buf);
-
     if (reply) {
         dpif_netlink_dp_init(reply);
         if (!error) {
diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h
index 12898b9..8c8bc77 100644
--- a/lib/dpif-provider.h
+++ b/lib/dpif-provider.h
@@ -187,6 +187,8 @@ struct dpif_class {
     /* Retrieves statistics for 'dpif' into 'stats'. */
     int (*get_stats)(const struct dpif *dpif, struct dpif_dp_stats *stats);
 
+    int (*set_features)(struct dpif *dpif, uint32_t user_features);
+
     /* Adds 'netdev' as a new port in 'dpif'.  If '*port_no' is not
      * ODPP_NONE, attempts to use that as the port's port number.
      *
diff --git a/lib/dpif.c b/lib/dpif.c
index c88b210..dc13655 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -543,6 +543,15 @@ dpif_get_dp_stats(const struct dpif *dpif, struct dpif_dp_stats *stats)
     return error;
 }
 
+int
+dpif_set_features(struct dpif *dpif, uint32_t new_features)
+{
+    int error = dpif->dpif_class->set_features(dpif, new_features);
+
+    log_operation(dpif, "set_features", error);
+    return error;
+}
+
 const char *
 dpif_port_open_type(const char *datapath_type, const char *port_type)
 {
diff --git a/lib/dpif.h b/lib/dpif.h
index 289d574..c7bb48f 100644
--- a/lib/dpif.h
+++ b/lib/dpif.h
@@ -435,6 +435,8 @@ struct dpif_dp_stats {
 };
 int dpif_get_dp_stats(const struct dpif *, struct dpif_dp_stats *);
 
+int dpif_set_features(struct dpif *, uint32_t new_features);
+
 \f
 /* Port operations. */
 
diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 60d5a42..4e85585 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -38,6 +38,7 @@
 #include "tc.h"
 #include "unaligned.h"
 #include "util.h"
+#include "dpif-provider.h"
 
 VLOG_DEFINE_THIS_MODULE(netdev_offload_tc);
 
@@ -1354,6 +1355,25 @@ flower_match_to_tun_opt(struct tc_flower *flower, const struct flow_tnl *tnl,
     flower->mask.tunnel.metadata.present.len = tnl->metadata.present.len;
 }
 
+static bool
+recirc_id_sharing_support(struct dpif *dpif)
+{
+    static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+    static bool supported = false;
+    int err;
+
+    if (ovsthread_once_start(&once)) {
+        err = dpif_set_features(dpif, OVS_DP_F_TC_RECIRC_SHARING);
+        supported = err ? supported : true;
+        if (supported) {
+            VLOG_INFO("probe tc: tc recirc id sharing with OvS datapath is supported.");
+        }
+        ovsthread_once_done(&once);
+    }
+
+    return supported;
+}
+
 static int
 netdev_tc_flow_put(struct netdev *netdev, struct match *match,
                    struct nlattr *actions, size_t actions_len,
@@ -1371,7 +1391,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
     uint32_t block_id = 0;
     struct nlattr *nla;
     struct tc_id id;
-    uint32_t chain;
+    uint32_t chain = 0;
     size_t left;
     int prio = 0;
     int ifindex;
@@ -1386,7 +1406,13 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
 
     memset(&flower, 0, sizeof flower);
 
-    chain = key->recirc_id;
+    if (key->recirc_id) {
+        if (recirc_id_sharing_support(info->dpif)) {
+            chain = key->recirc_id;
+        } else {
+            return -EOPNOTSUPP;
+        }
+    }
     mask->recirc_id = 0;
 
     if (flow_tnl_dst_is_set(&key->tunnel)) {
@@ -1634,7 +1660,8 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
         action = &flower.actions[flower.action_count];
         if (nl_attr_type(nla) == OVS_ACTION_ATTR_OUTPUT) {
             odp_port_t port = nl_attr_get_odp_port(nla);
-            struct netdev *outdev = netdev_ports_get(port, info->dpif_class);
+            struct netdev *outdev = netdev_ports_get(port,
+                                                     info->dpif->dpif_class);
 
             action->out.ifindex_out = netdev_get_ifindex(outdev);
             action->out.ingress = is_internal_port(netdev_get_type(outdev));
diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h
index 97a5006..d852fe2 100644
--- a/lib/netdev-offload.h
+++ b/lib/netdev-offload.h
@@ -62,7 +62,7 @@ struct netdev_flow_dump {
 
 /* Flow offloading. */
 struct offload_info {
-    const struct dpif_class *dpif_class;
+    struct dpif *dpif;
     ovs_be16 tp_dst_port; /* Destination port for tunnel in SET action */
     uint8_t tunnel_csum_on; /* Tunnel header with checksum */
 
-- 
1.8.3.1


^ permalink raw reply related

* Re: [PATCH net-next v2] net: openvswitch: Set OvS recirc_id from tc chain index
From: Paul Blakey @ 2019-08-20 12:40 UTC (permalink / raw)
  To: Pravin B Shelar, netdev, David S. Miller, Justin Pettit,
	Simon Horman, Marcelo Ricardo Leitner, Vlad Buslov, Paul Blakey
  Cc: Jiri Pirko, Roi Dayan, Yossi Kuperman, Rony Efraim, Oz Shlomo

Regarding the user_features change, I tested the above patch with this one in
userspace that I'll send once this is accepted, togother with the rest
of connection tracking offload patches.

I also have a test for it, if anyone wants it.

Patch is:
lib/netdev-offloads-tc: Probe recirc tc sharing feature on first recirc_id rule

Signed-off-by: Paul Blakey <paulb@mellanox.com>
---
 datapath/linux/compat/include/linux/openvswitch.h |  3 ++
 lib/dpif-netdev.c                                 |  1 +
 lib/dpif-netlink.c                                | 61 +++++++++++++++++++----
 lib/dpif-provider.h                               |  2 +
 lib/dpif.c                                        |  9 ++++
 lib/dpif.h                                        |  2 +
 lib/netdev-offload-tc.c                           | 33 ++++++++++--
 lib/netdev-offload.h                              |  2 +-
 8 files changed, 100 insertions(+), 13 deletions(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
index 65a003a..921ef5b 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -143,6 +143,9 @@ struct ovs_vport_stats {
 /* Allow datapath to associate multiple Netlink PIDs to each vport */
 #define OVS_DP_F_VPORT_PIDS	(1 << 1)
 
+/* Allow tc offload recirc sharing */
+#define OVS_DP_F_TC_RECIRC_SHARING  (1 << 2)
+
 /* Fixed logical ports. */
 #define OVSP_LOCAL      ((__u32)0)
 
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index d0a1c58..fd8275f 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -7489,6 +7489,7 @@ const struct dpif_class dpif_netdev_class = {
     dpif_netdev_run,
     dpif_netdev_wait,
     dpif_netdev_get_stats,
+    NULL,                      /* set_features */
     dpif_netdev_port_add,
     dpif_netdev_port_del,
     dpif_netdev_port_set_config,
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 07a9ddd..15e6057 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -192,6 +192,7 @@ struct dpif_handler {
 struct dpif_netlink {
     struct dpif dpif;
     int dp_ifindex;
+    uint32_t user_features;
 
     /* Upcall messages. */
     struct fat_rwlock upcall_lock;
@@ -303,7 +304,6 @@ dpif_netlink_enumerate(struct sset *all_dps,
     if (error) {
         return error;
     }
-
     ofpbuf_use_stub(&buf, reply_stub, sizeof reply_stub);
     dpif_netlink_dp_dump_start(&dump);
     while (nl_dump_next(&dump, &msg, &buf)) {
@@ -333,15 +333,26 @@ dpif_netlink_open(const struct dpif_class *class OVS_UNUSED, const char *name,
 
     /* Create or look up datapath. */
     dpif_netlink_dp_init(&dp_request);
+    upcall_pid = 0;
+    dp_request.upcall_pid = &upcall_pid;
+    dp_request.name = name;
+
     if (create) {
         dp_request.cmd = OVS_DP_CMD_NEW;
-        upcall_pid = 0;
-        dp_request.upcall_pid = &upcall_pid;
     } else {
+        dp_request.cmd = OVS_DP_CMD_GET;
+
+        error = dpif_netlink_dp_transact(&dp_request, &dp, &buf);
+        if (error)  {
+            return error;
+        }
+        dp_request.user_features = dp.user_features;
+        ofpbuf_delete(buf);
+
         /* Use OVS_DP_CMD_SET to report user features */
         dp_request.cmd = OVS_DP_CMD_SET;
     }
-    dp_request.name = name;
+
     dp_request.user_features |= OVS_DP_F_UNALIGNED;
     dp_request.user_features |= OVS_DP_F_VPORT_PIDS;
     error = dpif_netlink_dp_transact(&dp_request, &dp, &buf);
@@ -367,6 +378,7 @@ open_dpif(const struct dpif_netlink_dp *dp, struct dpif **dpifp)
               dp->dp_ifindex, dp->dp_ifindex);
 
     dpif->dp_ifindex = dp->dp_ifindex;
+    dpif->user_features = dp->user_features;
     *dpifp = &dpif->dpif;
 
     return 0;
@@ -662,6 +674,31 @@ dpif_netlink_get_stats(const struct dpif *dpif_, struct dpif_dp_stats *stats)
     return error;
 }
 
+static int
+dpif_netlink_set_features(struct dpif *dpif_, uint32_t new_features)
+{
+    struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
+    struct dpif_netlink_dp request, reply;
+    struct ofpbuf *bufp;
+    int error;
+
+    dpif_netlink_dp_init(&request);
+    request.cmd = OVS_DP_CMD_SET;
+    request.dp_ifindex = dpif->dp_ifindex;
+    request.user_features = dpif->user_features | new_features;
+
+    error = dpif_netlink_dp_transact(&request, &reply, &bufp);
+    if (!error) {
+        dpif->user_features = reply.user_features;
+        ofpbuf_delete(bufp);
+        if (!(dpif->user_features & new_features)) {
+            return -EOPNOTSUPP;
+        }
+    }
+
+    return error;
+}
+
 static const char *
 get_vport_type(const struct dpif_netlink_vport *vport)
 {
@@ -1989,7 +2026,6 @@ static int
 parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)
 {
     static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
-    const struct dpif_class *dpif_class = dpif->dpif.dpif_class;
     struct match match;
     odp_port_t in_port;
     const struct nlattr *nla;
@@ -2011,7 +2047,7 @@ parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)
     }
 
     in_port = match.flow.in_port.odp_port;
-    dev = netdev_ports_get(in_port, dpif_class);
+    dev = netdev_ports_get(in_port, dpif->dpif.dpif_class);
     if (!dev) {
         return EOPNOTSUPP;
     }
@@ -2024,7 +2060,7 @@ parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)
             odp_port_t out_port;
 
             out_port = nl_attr_get_odp_port(nla);
-            outdev = netdev_ports_get(out_port, dpif_class);
+            outdev = netdev_ports_get(out_port, dpif->dpif.dpif_class);
             if (!outdev) {
                 err = EOPNOTSUPP;
                 goto out;
@@ -2040,7 +2076,7 @@ parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)
         }
     }
 
-    info.dpif_class = dpif_class;
+    info.dpif = &dpif->dpif;
     info.tp_dst_port = dst_port;
     info.tunnel_csum_on = csum_on;
     err = netdev_flow_put(dev, &match,
@@ -3394,6 +3430,7 @@ const struct dpif_class dpif_netlink_class = {
     dpif_netlink_run,
     NULL,                       /* wait */
     dpif_netlink_get_stats,
+    dpif_netlink_set_features,
     dpif_netlink_port_add,
     dpif_netlink_port_del,
     NULL,                       /* port_set_config */
@@ -3702,6 +3739,9 @@ dpif_netlink_dp_from_ofpbuf(struct dpif_netlink_dp *dp, const struct ofpbuf *buf
         [OVS_DP_ATTR_MEGAFLOW_STATS] = {
                         NL_POLICY_FOR(struct ovs_dp_megaflow_stats),
                         .optional = true },
+        [OVS_DP_ATTR_USER_FEATURES] = {
+                        .type = NL_A_U32,
+                        .optional = true },
     };
 
     dpif_netlink_dp_init(dp);
@@ -3730,6 +3770,10 @@ dpif_netlink_dp_from_ofpbuf(struct dpif_netlink_dp *dp, const struct ofpbuf *buf
         dp->megaflow_stats = nl_attr_get(a[OVS_DP_ATTR_MEGAFLOW_STATS]);
     }
 
+    if (a[OVS_DP_ATTR_USER_FEATURES]) {
+        dp->user_features = nl_attr_get_u32(a[OVS_DP_ATTR_USER_FEATURES]);
+    }
+
     return 0;
 }
 
@@ -3802,7 +3846,6 @@ dpif_netlink_dp_transact(const struct dpif_netlink_dp *request,
     dpif_netlink_dp_to_ofpbuf(request, request_buf);
     error = nl_transact(NETLINK_GENERIC, request_buf, bufp);
     ofpbuf_delete(request_buf);
-
     if (reply) {
         dpif_netlink_dp_init(reply);
         if (!error) {
diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h
index 12898b9..8c8bc77 100644
--- a/lib/dpif-provider.h
+++ b/lib/dpif-provider.h
@@ -187,6 +187,8 @@ struct dpif_class {
     /* Retrieves statistics for 'dpif' into 'stats'. */
     int (*get_stats)(const struct dpif *dpif, struct dpif_dp_stats *stats);
 
+    int (*set_features)(struct dpif *dpif, uint32_t user_features);
+
     /* Adds 'netdev' as a new port in 'dpif'.  If '*port_no' is not
      * ODPP_NONE, attempts to use that as the port's port number.
      *
diff --git a/lib/dpif.c b/lib/dpif.c
index c88b210..dc13655 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -543,6 +543,15 @@ dpif_get_dp_stats(const struct dpif *dpif, struct dpif_dp_stats *stats)
     return error;
 }
 
+int
+dpif_set_features(struct dpif *dpif, uint32_t new_features)
+{
+    int error = dpif->dpif_class->set_features(dpif, new_features);
+
+    log_operation(dpif, "set_features", error);
+    return error;
+}
+
 const char *
 dpif_port_open_type(const char *datapath_type, const char *port_type)
 {
diff --git a/lib/dpif.h b/lib/dpif.h
index 289d574..c7bb48f 100644
--- a/lib/dpif.h
+++ b/lib/dpif.h
@@ -435,6 +435,8 @@ struct dpif_dp_stats {
 };
 int dpif_get_dp_stats(const struct dpif *, struct dpif_dp_stats *);
 
+int dpif_set_features(struct dpif *, uint32_t new_features);
+
 \f
 /* Port operations. */
 
diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 60d5a42..4e85585 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -38,6 +38,7 @@
 #include "tc.h"
 #include "unaligned.h"
 #include "util.h"
+#include "dpif-provider.h"
 
 VLOG_DEFINE_THIS_MODULE(netdev_offload_tc);
 
@@ -1354,6 +1355,25 @@ flower_match_to_tun_opt(struct tc_flower *flower, const struct flow_tnl *tnl,
     flower->mask.tunnel.metadata.present.len = tnl->metadata.present.len;
 }
 
+static bool
+recirc_id_sharing_support(struct dpif *dpif)
+{
+    static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+    static bool supported = false;
+    int err;
+
+    if (ovsthread_once_start(&once)) {
+        err = dpif_set_features(dpif, OVS_DP_F_TC_RECIRC_SHARING);
+        supported = err ? supported : true;
+        if (supported) {
+            VLOG_INFO("probe tc: tc recirc id sharing with OvS datapath is supported.");
+        }
+        ovsthread_once_done(&once);
+    }
+
+    return supported;
+}
+
 static int
 netdev_tc_flow_put(struct netdev *netdev, struct match *match,
                    struct nlattr *actions, size_t actions_len,
@@ -1371,7 +1391,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
     uint32_t block_id = 0;
     struct nlattr *nla;
     struct tc_id id;
-    uint32_t chain;
+    uint32_t chain = 0;
     size_t left;
     int prio = 0;
     int ifindex;
@@ -1386,7 +1406,13 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
 
     memset(&flower, 0, sizeof flower);
 
-    chain = key->recirc_id;
+    if (key->recirc_id) {
+        if (recirc_id_sharing_support(info->dpif)) {
+            chain = key->recirc_id;
+        } else {
+            return -EOPNOTSUPP;
+        }
+    }
     mask->recirc_id = 0;
 
     if (flow_tnl_dst_is_set(&key->tunnel)) {
@@ -1634,7 +1660,8 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
         action = &flower.actions[flower.action_count];
         if (nl_attr_type(nla) == OVS_ACTION_ATTR_OUTPUT) {
             odp_port_t port = nl_attr_get_odp_port(nla);
-            struct netdev *outdev = netdev_ports_get(port, info->dpif_class);
+            struct netdev *outdev = netdev_ports_get(port,
+                                                     info->dpif->dpif_class);
 
             action->out.ifindex_out = netdev_get_ifindex(outdev);
             action->out.ingress = is_internal_port(netdev_get_type(outdev));
diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h
index 97a5006..d852fe2 100644
--- a/lib/netdev-offload.h
+++ b/lib/netdev-offload.h
@@ -62,7 +62,7 @@ struct netdev_flow_dump {
 
 /* Flow offloading. */
 struct offload_info {
-    const struct dpif_class *dpif_class;
+    struct dpif *dpif;
     ovs_be16 tp_dst_port; /* Destination port for tunnel in SET action */
     uint8_t tunnel_csum_on; /* Tunnel header with checksum */
 
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH] vhost: Remove unnecessary variable
From: Yunsheng Lin @ 2019-08-20 12:36 UTC (permalink / raw)
  To: mst, jasowang; +Cc: kvm, virtualization, netdev, linux-kernel

It is unnecessary to use ret variable to return the error
code, just return the error code directly.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
 drivers/vhost/vhost.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 0536f85..1ac9de2 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -203,7 +203,6 @@ EXPORT_SYMBOL_GPL(vhost_poll_init);
 int vhost_poll_start(struct vhost_poll *poll, struct file *file)
 {
 	__poll_t mask;
-	int ret = 0;
 
 	if (poll->wqh)
 		return 0;
@@ -213,10 +212,10 @@ int vhost_poll_start(struct vhost_poll *poll, struct file *file)
 		vhost_poll_wakeup(&poll->wait, 0, 0, poll_to_key(mask));
 	if (mask & EPOLLERR) {
 		vhost_poll_stop(poll);
-		ret = -EINVAL;
+		return -EINVAL;
 	}
 
-	return ret;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(vhost_poll_start);
 
-- 
2.8.1


^ permalink raw reply related

* [PATCH net-next v2] net: openvswitch: Set OvS recirc_id from tc chain index
From: Paul Blakey @ 2019-08-20 12:30 UTC (permalink / raw)
  To: Pravin B Shelar, netdev, David S. Miller, Justin Pettit,
	Simon Horman, Marcelo Ricardo Leitner, Vlad Buslov, Paul Blakey
  Cc: Jiri Pirko, Roi Dayan, Yossi Kuperman, Rony Efraim, Oz Shlomo

Offloaded OvS datapath rules are translated one to one to tc rules,
for example the following simplified OvS rule:

recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2)

Will be translated to the following tc rule:

$ tc filter add dev dev1 ingress \
	    prio 1 chain 0 proto ip \
		flower tcp ct_state -trk \
		action ct pipe \
		action goto chain 2

Received packets will first travel though tc, and if they aren't stolen
by it, like in the above rule, they will continue to OvS datapath.
Since we already did some actions (action ct in this case) which might
modify the packets, and updated action stats, we would like to continue
the proccessing with the correct recirc_id in OvS (here recirc_id(2))
where we left off.

To support this, introduce a new skb extension for tc, which
will be used for translating tc chain to ovs recirc_id to
handle these miss cases. Last tc chain index will be set
by tc goto chain action and read by OvS datapath.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
V2:
	Changed user_features to return not supported for requested user_features that aren't supported
	Added static key per pravin request, it is enabled on user_features request (And will be used by userspace to probe actual kernel support)

 include/linux/skbuff.h           | 13 +++++++++++++
 include/net/sch_generic.h        |  5 ++++-
 include/uapi/linux/openvswitch.h |  3 +++
 net/core/skbuff.c                |  6 ++++++
 net/openvswitch/datapath.c       | 38 +++++++++++++++++++++++++++++++++-----
 net/openvswitch/datapath.h       |  2 ++
 net/openvswitch/flow.c           | 13 +++++++++++++
 net/sched/Kconfig                | 13 +++++++++++++
 net/sched/act_api.c              |  1 +
 net/sched/cls_api.c              | 12 ++++++++++++
 10 files changed, 100 insertions(+), 6 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 7eb28b7..29d7c5a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -279,6 +279,16 @@ struct nf_bridge_info {
 };
 #endif
 
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+/* Chain in tc_skb_ext will be used to share the tc chain with
+ * ovs recirc_id. It will be set to the current chain by tc
+ * and read by ovs to recirc_id.
+ */
+struct tc_skb_ext {
+	__u32 chain;
+};
+#endif
+
 struct sk_buff_head {
 	/* These two members must be first. */
 	struct sk_buff	*next;
@@ -4050,6 +4060,9 @@ enum skb_ext_id {
 #ifdef CONFIG_XFRM
 	SKB_EXT_SEC_PATH,
 #endif
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+	TC_SKB_EXT,
+#endif
 	SKB_EXT_NUM, /* must be last */
 };
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index d9f359a..e896e95 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -272,7 +272,10 @@ struct tcf_result {
 			unsigned long	class;
 			u32		classid;
 		};
-		const struct tcf_proto *goto_tp;
+		struct {
+			const struct tcf_proto *goto_tp;
+			u32 goto_index;
+		};
 
 		/* used in the skb_tc_reinsert function */
 		struct {
diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index f271f1e..1887a45 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -123,6 +123,9 @@ struct ovs_vport_stats {
 /* Allow datapath to associate multiple Netlink PIDs to each vport */
 #define OVS_DP_F_VPORT_PIDS	(1 << 1)
 
+/* Allow tc offload recirc sharing */
+#define OVS_DP_F_TC_RECIRC_SHARING	(1 << 2)
+
 /* Fixed logical ports. */
 #define OVSP_LOCAL      ((__u32)0)
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ea8e8d3..2b40b5a 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4087,6 +4087,9 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
 #ifdef CONFIG_XFRM
 	[SKB_EXT_SEC_PATH] = SKB_EXT_CHUNKSIZEOF(struct sec_path),
 #endif
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+	[TC_SKB_EXT] = SKB_EXT_CHUNKSIZEOF(struct tc_skb_ext),
+#endif
 };
 
 static __always_inline unsigned int skb_ext_total_length(void)
@@ -4098,6 +4101,9 @@ static __always_inline unsigned int skb_ext_total_length(void)
 #ifdef CONFIG_XFRM
 		skb_ext_type_len[SKB_EXT_SEC_PATH] +
 #endif
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+		skb_ext_type_len[TC_SKB_EXT] +
+#endif
 		0;
 }
 
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 65122bb..dde9d76 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1545,10 +1545,34 @@ static void ovs_dp_reset_user_features(struct sk_buff *skb, struct genl_info *in
 	dp->user_features = 0;
 }
 
-static void ovs_dp_change(struct datapath *dp, struct nlattr *a[])
+DEFINE_STATIC_KEY_FALSE(tc_recirc_sharing_support);
+
+static int ovs_dp_change(struct datapath *dp, struct nlattr *a[])
 {
-	if (a[OVS_DP_ATTR_USER_FEATURES])
-		dp->user_features = nla_get_u32(a[OVS_DP_ATTR_USER_FEATURES]);
+	u32 user_features = 0;
+
+	if (a[OVS_DP_ATTR_USER_FEATURES]) {
+		user_features = nla_get_u32(a[OVS_DP_ATTR_USER_FEATURES]);
+
+		if (user_features & ~(OVS_DP_F_VPORT_PIDS |
+				      OVS_DP_F_UNALIGNED |
+				      OVS_DP_F_TC_RECIRC_SHARING))
+			return -EOPNOTSUPP;
+
+#if !IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+		if (user_features & OVS_DP_F_TC_RECIRC_SHARING)
+			return -EOPNOTSUPP;
+#endif
+	}
+
+	dp->user_features = user_features;
+
+	if (dp->user_features & OVS_DP_F_TC_RECIRC_SHARING)
+		static_branch_enable(&tc_recirc_sharing_support);
+	else
+		static_branch_disable(&tc_recirc_sharing_support);
+
+	return 0;
 }
 
 static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
@@ -1610,7 +1634,9 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 	parms.port_no = OVSP_LOCAL;
 	parms.upcall_portids = a[OVS_DP_ATTR_UPCALL_PID];
 
-	ovs_dp_change(dp, a);
+	err = ovs_dp_change(dp, a);
+	if (err)
+		goto err_destroy_meters;
 
 	/* So far only local changes have been made, now need the lock. */
 	ovs_lock();
@@ -1736,7 +1762,9 @@ static int ovs_dp_cmd_set(struct sk_buff *skb, struct genl_info *info)
 	if (IS_ERR(dp))
 		goto err_unlock_free;
 
-	ovs_dp_change(dp, info->attrs);
+	err = ovs_dp_change(dp, info->attrs);
+	if (err)
+		goto err_unlock_free;
 
 	err = ovs_dp_cmd_fill_info(dp, reply, info->snd_portid,
 				   info->snd_seq, 0, OVS_DP_CMD_SET);
diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
index 751d34a..81e85dd 100644
--- a/net/openvswitch/datapath.h
+++ b/net/openvswitch/datapath.h
@@ -218,6 +218,8 @@ static inline struct datapath *get_dp(struct net *net, int dp_ifindex)
 extern struct notifier_block ovs_dp_device_notifier;
 extern struct genl_family dp_vport_genl_family;
 
+DECLARE_STATIC_KEY_FALSE(tc_recirc_sharing_support);
+
 void ovs_dp_process_packet(struct sk_buff *skb, struct sw_flow_key *key);
 void ovs_dp_detach_port(struct vport *);
 int ovs_dp_upcall(struct datapath *, struct sk_buff *,
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index bc89e16..b84929f 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -816,6 +816,9 @@ static int key_extract_mac_proto(struct sk_buff *skb)
 int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
 			 struct sk_buff *skb, struct sw_flow_key *key)
 {
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+	struct tc_skb_ext *tc_ext;
+#endif
 	int res, err;
 
 	/* Extract metadata from packet. */
@@ -848,7 +851,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
 	if (res < 0)
 		return res;
 	key->mac_proto = res;
+
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+	if (static_branch_unlikely(&tc_recirc_sharing_support)) {
+		tc_ext = skb_ext_find(skb, TC_SKB_EXT);
+		key->recirc_id = tc_ext ? tc_ext->chain : 0;
+	} else {
+		key->recirc_id = 0;
+	}
+#else
 	key->recirc_id = 0;
+#endif
 
 	err = key_extract(skb, key);
 	if (!err)
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index afd2ba1..b3faafe 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -963,6 +963,19 @@ config NET_IFE_SKBTCINDEX
         tristate "Support to encoding decoding skb tcindex on IFE action"
         depends on NET_ACT_IFE
 
+config NET_TC_SKB_EXT
+	bool "TC recirculation support"
+	depends on NET_CLS_ACT
+	default y if NET_CLS_ACT
+	select SKB_EXTENSIONS
+
+	help
+	  Say Y here to allow tc chain misses to continue in OvS datapath in
+	  the correct recirc_id, and hardware chain misses to continue in
+	  the correct chain in tc software datapath.
+
+	  Say N here if you won't be using tc<->ovs offload or tc chains offload.
+
 endif # NET_SCHED
 
 config NET_SCH_FIFO
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 3397122..c393604 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -27,6 +27,7 @@ static void tcf_action_goto_chain_exec(const struct tc_action *a,
 {
 	const struct tcf_chain *chain = rcu_dereference_bh(a->goto_chain);
 
+	res->goto_index = chain->index;
 	res->goto_tp = rcu_dereference_bh(chain->filter_chain);
 }
 
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index e0d8b45..82245cf 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1488,6 +1488,18 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 			goto reset;
 		} else if (unlikely(TC_ACT_EXT_CMP(err, TC_ACT_GOTO_CHAIN))) {
 			first_tp = res->goto_tp;
+
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+			{
+				struct tc_skb_ext *ext;
+
+				ext = skb_ext_add(skb, TC_SKB_EXT);
+				if (WARN_ON_ONCE(!ext))
+					return TC_ACT_SHOT;
+
+				ext->chain = res->goto_index;
+			}
+#endif
 			goto reset;
 		}
 #endif
-- 
1.8.3.1


^ permalink raw reply related

* Re: [PATCH net-next v3 2/4] net: mdio: add PTP offset compensation to mdiobus_write_sts
From: Hubert Feurstein @ 2019-08-20 12:29 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, lkml, Andrew Lunn, Richard Cochran, Florian Fainelli,
	Heiner Kallweit, Vladimir Oltean, David S. Miller
In-Reply-To: <20190820094903.GI891@localhost>

Hi Miroslav,

Am Di., 20. Aug. 2019 um 11:49 Uhr schrieb Miroslav Lichvar
<mlichvar@redhat.com>:
>
> On Tue, Aug 20, 2019 at 10:48:31AM +0200, Hubert Feurstein wrote:
>
> > +     /* PTP offset compensation:
> > +      * After the MDIO access is completed (from the chip perspective), the
> > +      * switch chip will snapshot the PHC timestamp. To make sure our system
> > +      * timestamp corresponds to the PHC timestamp, we have to add the
> > +      * duration of this MDIO access to sts->post_ts. Linuxptp's phc2sys
> > +      * takes the average of pre_ts and post_ts to calculate the final
> > +      * system timestamp. With this in mind, we have to add ptp_sts_offset
> > +      * twice to post_ts, in order to not introduce an constant time offset.
> > +      */
> > +     if (sts)
> > +             timespec64_add_ns(&sts->post_ts, 2 * bus->ptp_sts_offset);
>
> This correction looks good to me.
>
> Is the MDIO write delay constant in reality, or does it at least have
> an upper bound? That is, is it always true that the post_ts timestamp
> does not point to a time before the PHC timestamp was actually taken?
>
> This is important to not break the estimation of maximum error in the
> measured offset. Applications using the ioctl may assume that the
> maximum error is (post_ts-pre_ts)/2 (i.e. half of the delay printed by
> phc2sys). That would not work if the delay could be occasionally 50
> microseconds for instance, i.e. the post_ts timestamp would be earlier
> than the PHC timestamp.
>
If the timestamps are taken in the MDIO driver (imx-fec in my case), then
everything is quite deterministic (see results in the cover letter). Of course,
it still can be improved slightly, by splitting up the writel into iowmb and
write_relaxed and disable the interrupts while capturing the timestamps
(as I did in my original RFC patch). But phc2sys takes by default 5 measurements
and uses the one with the smallest delay, so this shouldn't be necessary.

Although, by adding 2 * ptp_sts_offset the system timestamp to post_ts
the timestamp is aligned with the PHC timestamp, but this also increases
the delay which is reported by phc2sys (~26us). But the maximum error
which must be expected is definitely much less (< 1us). So maybe it is better
to shift both timestamps pre_ts and post_ts by ptp_sts_offset.

Hubert

^ permalink raw reply

* Re: [PATCH v6 6/7] nfc: pn533: Add autopoll capability
From: Johan Hovold @ 2019-08-20 12:23 UTC (permalink / raw)
  To: Lars Poeschel
  Cc: Allison Randal, Steve Winslow, Greg Kroah-Hartman,
	Gustavo A. R. Silva, Kate Stewart, Thomas Gleixner,
	open list:NFC SUBSYSTEM, open list, Johan Hovold
In-Reply-To: <20190820120345.22593-6-poeschel@lemonage.de>

On Tue, Aug 20, 2019 at 02:03:43PM +0200, Lars Poeschel wrote:
> pn532 devices support an autopoll command, that lets the chip
> automatically poll for selected nfc technologies instead of manually
> looping through every single nfc technology the user is interested in.
> This is faster and less cpu and bus intensive than manually polling.
> This adds this autopoll capability to the pn533 driver.
> 
> Cc: Johan Hovold <johan@kernel.org>
> Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
> ---
> Changes in v6:
> - Rebased the patch series on v5.3-rc5

Just two drive-by comments below.

>  drivers/nfc/pn533/pn533.c | 193 +++++++++++++++++++++++++++++++++++++-
>  drivers/nfc/pn533/pn533.h |  10 +-
>  2 files changed, 197 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/nfc/pn533/pn533.c b/drivers/nfc/pn533/pn533.c
> index a8c756caa678..7e915239222b 100644
> --- a/drivers/nfc/pn533/pn533.c
> +++ b/drivers/nfc/pn533/pn533.c
> @@ -185,6 +185,32 @@ struct pn533_cmd_jump_dep_response {
>  	u8 gt[];
>  } __packed;
>  
> +struct pn532_autopoll_resp {
> +	u8 type;
> +	u8 ln;
> +	u8 tg;
> +	u8 tgdata[];
> +} __packed;

No need for __packed.

> +static int pn533_autopoll_complete(struct pn533 *dev, void *arg,
> +			       struct sk_buff *resp)
> +{
> +	u8 nbtg;
> +	int rc;
> +	struct pn532_autopoll_resp *apr;
> +	struct nfc_target nfc_tgt;
> +
> +	if (IS_ERR(resp)) {
> +		rc = PTR_ERR(resp);
> +
> +		nfc_err(dev->dev, "%s  autopoll complete error %d\n",
> +			__func__, rc);
> +
> +		if (rc == -ENOENT) {
> +			if (dev->poll_mod_count != 0)
> +				return rc;
> +			goto stop_poll;
> +		} else if (rc < 0) {
> +			nfc_err(dev->dev,
> +				"Error %d when running autopoll\n", rc);
> +			goto stop_poll;
> +		}
> +	}
> +
> +	nbtg = resp->data[0];
> +	if ((nbtg > 2) || (nbtg <= 0))
> +		return -EAGAIN;
> +
> +	apr = (struct pn532_autopoll_resp *)&resp->data[1];
> +	while (nbtg--) {
> +		memset(&nfc_tgt, 0, sizeof(struct nfc_target));
> +		switch (apr->type) {
> +		case PN532_AUTOPOLL_TYPE_ISOA:
> +			dev_dbg(dev->dev, "ISOA");

You forgot the '\n' here and elsewhere (some nfc_err as well).

Johan

^ permalink raw reply

* Re: [PATCH net-next 4/6] net: dsa: Don't program the VLAN as pvid on the upstream port
From: Vladimir Oltean @ 2019-08-20 12:09 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Vivien Didelot, Andrew Lunn, Ido Schimmel, Roopa Prabhu, nikolay,
	David S. Miller, netdev
In-Reply-To: <19610afd-298a-e434-00ea-48eb5b143c1b@gmail.com>

Hi Florian,

On Tue, 20 Aug 2019 at 06:15, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
>
>
> On 8/19/2019 5:00 PM, Vladimir Oltean wrote:
> > Commit b2f81d304cee ("net: dsa: add CPU and DSA ports as VLAN members")
> > programs the VLAN from the bridge into the specified port as well as the
> > upstream port, with the same set of flags.
> >
> > Consider the typical case of installing pvid 1 on user port 1, pvid 2 on
> > user port 2, etc. The upstream port would end up having a pvid equal to
> > the last user port whose pvid was programmed from the bridge. Less than
> > useful.
> >
> > So just don't change the pvid of the upstream port and let it be
> > whatever the driver set it internally to be.
>
> This patch should allow removing the !dsa_is_cpu_port() checks from
> b53_common.c:b53_vlan_add, about time :)
>
> It seems to me that the fundamental issue here is that because we do not
> have a user visible network device that 1:1 maps with the CPU (or DSA)
> ports for that matter (and for valid reasons, they would represent two
> ends of the same pipe), we do not have a good way to control the CPU
> port VLAN attributes.
>
> There was a prior attempt at allowing using the bridge master device to
> program the CPU port's VLAN attributes, see [1], but I did not follow up
> with that until [2] and then life caught me. If you can/want, that would
> be great (not asking for TPS reports).
>
> [1]:
> https://lists.linuxfoundation.org/pipermail/bridge/2016-November/010112.html
> [2]:
> https://lore.kernel.org/lkml/20180624153339.13572-1-f.fainelli@gmail.com/T/
>

So what was the conclusion of that discussion? Should you or should
you not add the check for vlan->flags & BRIDGE_VLAN_INFO_BRENTRY?
I don't exactly handle the meaning of 'master' and 'self' options from
a user perspective.
Right now (no patches applied) I get the following behavior in DSA
(swp2 is already member of br0):

$ echo 1 | sudo tee /sys/class/net/br0/bridge/vlan_filtering
$ sudo bridge vlan add vid 100 dev swp2
$ sudo bridge vlan add vid 101 dev swp2 self
RTNETLINK answers: Operation not supported
$ sudo bridge vlan add vid 102 dev swp2 master
$ sudo bridge vlan add vid 103 dev br0
RTNETLINK answers: Operation not supported
$ sudo bridge vlan add vid 104 dev br0 self
$ sudo bridge vlan add vid 105 dev br0 master
RTNETLINK answers: Operation not supported

$ bridge vlan
port    vlan ids
eth0     1 PVID Egress Untagged

swp5     1 PVID Egress Untagged

swp2     1 PVID Egress Untagged
         100
         102

swp3     1 PVID Egress Untagged

swp4     1 PVID Egress Untagged

br0      1 PVID Egress Untagged
         104

Who returns EOPNOTSUPP for VID 101 and why?
Why is VID 102 not installed in br0? This part I don't understand from
your patchset. Does it mean that the CPU port (br0) will have to be
explicitly configured from now on, even if I run the commands on swp2
with 'master'?


> >
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > ---
> >  net/dsa/switch.c | 16 ++++++++++++----
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> > index 84ab2336131e..02ccc53f1926 100644
> > --- a/net/dsa/switch.c
> > +++ b/net/dsa/switch.c
> > @@ -239,17 +239,21 @@ dsa_switch_vlan_prepare_bitmap(struct dsa_switch *ds,
> >                              const struct switchdev_obj_port_vlan *vlan,
> >                              const unsigned long *bitmap)
> >  {
> > +     struct switchdev_obj_port_vlan v = *vlan;
> >       int port, err;
> >
> >       if (!ds->ops->port_vlan_prepare || !ds->ops->port_vlan_add)
> >               return -EOPNOTSUPP;
> >
> >       for_each_set_bit(port, bitmap, ds->num_ports) {
> > -             err = dsa_port_vlan_check(ds, port, vlan);
> > +             if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
> > +                     v.flags &= ~BRIDGE_VLAN_INFO_PVID;
> > +
> > +             err = dsa_port_vlan_check(ds, port, &v);
> >               if (err)
> >                       return err;
> >
> > -             err = ds->ops->port_vlan_prepare(ds, port, vlan);
> > +             err = ds->ops->port_vlan_prepare(ds, port, &v);
> >               if (err)
> >                       return err;
> >       }
> > @@ -262,10 +266,14 @@ dsa_switch_vlan_add_bitmap(struct dsa_switch *ds,
> >                          const struct switchdev_obj_port_vlan *vlan,
> >                          const unsigned long *bitmap)
> >  {
> > +     struct switchdev_obj_port_vlan v = *vlan;
> >       int port;
> >
> > -     for_each_set_bit(port, bitmap, ds->num_ports)
> > -             ds->ops->port_vlan_add(ds, port, vlan);
> > +     for_each_set_bit(port, bitmap, ds->num_ports) {
> > +             if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
> > +                     v.flags &= ~BRIDGE_VLAN_INFO_PVID;
> > +             ds->ops->port_vlan_add(ds, port, &v);
> > +     }
> >  }
> >
> >  static int dsa_switch_vlan_add(struct dsa_switch *ds,
> >
>
> --
> Florian

Regards,
-Vladimir

^ permalink raw reply

* RE: [PATCH REPOST 1/2] can: flexcan: fix deadlock when using self wakeup
From: Joakim Zhang @ 2019-08-20 11:56 UTC (permalink / raw)
  To: Sean Nyekjaer, mkl@pengutronix.de, linux-can@vger.kernel.org
  Cc: wg@grandegger.com, netdev@vger.kernel.org, dl-linux-imx,
	Martin Hundebøll
In-Reply-To: <DB7PR04MB4618A1F984F2281C66959B06E6AB0@DB7PR04MB4618.eurprd04.prod.outlook.com>


> -----Original Message-----
> From: Joakim Zhang
> Sent: 2019年8月20日 19:25
> To: Sean Nyekjaer <sean@geanix.com>; mkl@pengutronix.de;
> linux-can@vger.kernel.org
> Cc: wg@grandegger.com; netdev@vger.kernel.org; dl-linux-imx
> <linux-imx@nxp.com>; Martin Hundebøll <martin@geanix.com>
> Subject: RE: [PATCH REPOST 1/2] can: flexcan: fix deadlock when using self
> wakeup
> 
> 
> > -----Original Message-----
> > From: Sean Nyekjaer <sean@geanix.com>
> > Sent: 2019年8月20日 18:25
> > To: Joakim Zhang <qiangqing.zhang@nxp.com>; mkl@pengutronix.de;
> > linux-can@vger.kernel.org
> > Cc: wg@grandegger.com; netdev@vger.kernel.org; dl-linux-imx
> > <linux-imx@nxp.com>; Martin Hundebøll <martin@geanix.com>
> > Subject: Re: [PATCH REPOST 1/2] can: flexcan: fix deadlock when using
> > self wakeup
> >
> >
> >
> > On 16/08/2019 10.20, Joakim Zhang wrote:
> > > As reproted by Sean Nyekjaer below:
> > > When suspending, when there is still can traffic on the interfaces
> > > the flexcan immediately wakes the platform again. As it should :-).
> > > But it throws this error msg:
> > > [ 3169.378661] PM: noirq suspend of devices failed
> > >
> > > On the way down to suspend the interface that throws the error
> > > message does call flexcan_suspend but fails to call flexcan_noirq_suspend.
> > > That means the flexcan_enter_stop_mode is called, but on the way out
> > > of suspend the driver only calls flexcan_resume and skips
> > > flexcan_noirq_resume, thus it doesn't call flexcan_exit_stop_mode.
> > > This leaves the flexcan in stop mode, and with the current driver it
> > > can't recover from this even with a soft reboot, it requires a hard reboot.
> > >
> > > The best way to exit stop mode is in Wake Up interrupt context, and
> > > then
> > > suspend() and resume() functions can be symmetric. However, stop
> > > mode request and ack will be controlled by SCU(System Control Unit)
> > > firmware(manage clock,power,stop mode, etc. by Cortex-M4 core) in
> > > coming i.MX8(QM/QXP). And SCU firmware interface can't be available
> > > in
> > interrupt context.
> > >
> > > For compatibillity, the wake up mechanism can't be symmetric, so we
> > > need in_stop_mode hack.
> > >
> > > Fixes: de3578c198c6 ("can: flexcan: add self wakeup support")
> > > Reported-by: Sean Nyekjaer <sean@geanix.com>
> > > Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
> > >
> >
> > Unfortunatly it's still possible to reproduce the deadlock with this patch...
> >
> > [  689.921717] flexcan: probe of 2094000.flexcan failed with error
> > -110
> >
> > My test setup:
> > PC with CAN-USB dongle connected to can0 and can1.
> >
> > PC:
> > $ while true; do cansend can0 '123#DEADBEEF'; done
> >
> > iMX6ull:
> > root@iwg26:~# systemctl suspend
> >
> >
> > [  365.858054] systemd[1]: Reached target Sleep.
> > root@iwg26:~# [  365.939826] systemd[1]: Starting Suspend...
> > [  366.115839] systemd-sleep[248]: Suspending system...
> > [  366.517949] dpm_run_callback(): platform_pm_suspend+0x0/0x5c
> > returns
> > -110 [  366.518249] PM: Device 2094000.flexcan failed to suspend:
> > error -110 [  366.518406] PM: Some devices failed to suspend, or early
> > wake event detected [  366.732162] dpm_run_callback():
> > platform_pm_suspend+0x0/0x5c returns -110 [  366.732285] PM: Device
> > 2090000.flexcan failed to suspend: error -110 [  366.732330] PM: Some
> > devices failed to suspend, or early wake event detected [  366.890637]
> > systemd-sleep[248]: System resumed.
> 
> CAN1, CAN0 suspended failed, then CAN0, CAN1 resumed back, so
> CAN0/CAN1 can work fine.
>
> > [  366.923062] systemd[1]: Started Suspend.
> > [  366.942819] systemd[1]: sleep.target: Unit not needed anymore.
> Stopping.
> > [  366.954791] systemd[1]: Stopped target Sleep.
> > [  366.962402] systemd[1]: Reached target Suspend.
> > [  366.977546] systemd-logind[135]: Operation 'sleep' finished.
> > [  366.979194] systemd[1]: suspend.target: Unit not needed anymore.
> > Stopping.
> > [  366.993831] systemd[1]: Stopped target Suspend.
> > [  367.139972] systemd-networkd[220]: usb0: Lost carrier [
> > 367.294077]
> > systemd-networkd[220]: usb0: Gained carrier
> >
> > root@iwg26:~# candump can0 | head -n 2
> >
> >    can0  123   [4]  DE AD BE EF
> >    can0  123   [4]  DE AD BE EF
> > root@iwg26:~# candump can1 | head -n 2
> >
> >    can1  123   [4]  DE AD BE EF
> >    can1  123   [4]  DE AD BE EF
> > root@iwg26:~# systemctl suspend
> >
> > root@iwg26:~# [  385.106658] systemd[1]: Reached target Sleep.
> > [  385.147602] systemd[1]: Starting Suspend...
> > [  385.246421] systemd-sleep[260]: Suspending system...
> > [  385.634733] dpm_run_callback(): platform_pm_suspend+0x0/0x5c
> > returns
> > -110 [  385.634855] PM: Device 2090000.flexcan failed to suspend:
> > error -110 [  385.634897] PM: Some devices failed to suspend, or early
> > wake event detected [  385.856251] PM: noirq suspend of devices failed
> > [  385.998364]
> > systemd-sleep[260]: System resumed.
> 
> CAN0 suspended failed, CAN1 noirq suspended failed, then CAN1, CAN0
> resumed back, so CAN0/CAN1 can work fine.

If CAN0 suspended failed, should system resumed after suspended all devices, should not enter noirq suspend, 
why it here printed "PM: noirq suspend of devices failed"?

> > [  386.023390] systemd[1]: Started Suspend.
> > [  386.031570] systemd[1]: sleep.target: Unit not needed anymore.
> Stopping.
> > [  386.055886] systemd[1]: Stopped target Sleep.
> > [  386.061430] systemd[1]: Reached target Suspend.
> > [  386.066142] systemd[1]: suspend.target: Unit not needed anymore.
> > Stopping.
> > [  386.112575] systemd-networkd[220]: usb0: Lost carrier [
> > 386.116797]
> > systemd-logind[135]: Operation 'sleep' finished.
> > [  386.146161] systemd[1]: Stopped target Suspend.
> > [  386.260866] systemd-networkd[220]: usb0: Gained carrier
> > root@iwg26:~# candump can0 | head -n 2
> >    can0  123   [4]  DE AD BE EF
> >    can0  123   [4]  DE AD BE EF
> > root@iwg26:~# candump can1 | head -n 2
> >
> >    can1  123   [4]  DE AD BE EF
> >    can1  123   [4]  DE AD BE EF
> > root@iwg26:~# systemctl suspend
> >
> > [  396.919303] systemd[1]: Reached target Sleep.
> > root@iwg26:~# [  396.964722] systemd[1]: Starting Suspend...
> > [  397.067336] systemd-sleep[268]: Suspending system...
> > [  397.574571] PM: noirq suspend of devices failed [  397.834731] PM:
> > noirq suspend of devices failed [  397.807996] systemd-networkd[220]:
> > usb0: Lost carrier [  398.156295] dpm_run_callback():
> > platform_pm_suspend+0x0/0x5c returns -110 [  398.156339] PM: Device
> 2094000.flexcan failed to suspend:
> > error -110 [  398.156509] PM: Some devices failed to suspend, or early
> > wake event detected [  398.053555] systemd-sleep[268]: Failed to write
> > /sys/power/state:
> > Device or resource busy
> 
> But the log here is very strange and chaotic, it looks like CAN0 suspended failed,
> then resumed back, so CAN0 can work fine.
> CAN1 noirq suspend failed, but have not resumed back, so CAN1 still in stop
> mode, cannot work. I think this may be other device noirq suspend failed broke
> the resume of CAN1.
> 
> Could you do more debug to help locate the issue?

More strange, why here first enter noirq suspend?

Best Regards,
Joakim Zhang
> > [  398.074751] systemd[1]: systemd-suspend.service: Main process
> > exited, code=exited, status=1/FAILURE [  398.076779] systemd[1]:
> 
> > systemd-suspend.service: Failed with result 'exit-code'.
> > [  398.109255] systemd[1]: Failed to start Suspend.
> > [  398.118704] systemd[1]: Dependency failed for Suspend.
> > [  398.136283] systemd-logind[135]: Operation 'sleep' finished.
> > [  398.137770] systemd[1]: suspend.target: Job suspend.target/start
> > failed with result 'dependency'.
> > [  398.139105] systemd[1]: sleep.target: Unit not needed anymore.
> Stopping.
> > [  398.167590] systemd[1]: Stopped target Sleep.
> > [  398.201558] systemd-networkd[220]: usb0: Gained carrier
> 
> Log here also strange.
> 
> Best Regards,
> Joakim Zhang
> > root@iwg26:~# candump can0 | head -n 2
> >    can0  123   [4]  DE AD BE EF
> >    can0  123   [4]  DE AD BE EF
> > root@iwg26:~# candump can1 | head -n 2
> >
> > nothing on can1 anymore :-(
> >
> > root@iwg26:~# rmmod flexcan
> > [  622.884746] systemd-networkd[220]: can1: Lost carrier [
> > 623.046766]
> > systemd-networkd[220]: can0: Lost carrier root@iwg26:~# insmod
> > /mnt/flexcan.ko [  628.323981] flexcan 2094000.flexcan: registering
> > netdev failed
> >
> > and can1 fails to register with:
> > [  628.347485] flexcan: probe of 2094000.flexcan failed with error
> > -110
> >
> > /Sean

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox