Netdev List
* Re: [PATCH net] be2net: Signal that the device cannot transmit during reconfiguration
From: David Miller @ 2019-07-16 19:41 UTC
  To: bpoirier
  Cc: sathya.perla, ajit.khaparde, sriharsha.basavapatna, somnath.kotur,
	fyang, saeedm, netdev
In-Reply-To: <20190716081655.7676-1-bpoirier@suse.com>

From: Benjamin Poirier <bpoirier@suse.com>
Date: Tue, 16 Jul 2019 17:16:55 +0900

> While changing the number of interrupt channels, be2net stops adapter
> operation (including netif_tx_disable()) but it doesn't signal that it
> cannot transmit. This may lead dev_watchdog() to falsely trigger during
> that time.
> 
> Add the missing call to netif_carrier_off(), following the pattern used in
> many other drivers. netif_carrier_on() is already taken care of in
> be_open().
> 
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>

Applied.
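
The reasoning in the patch description can be sketched with a small
user-space model in C. This is only an illustration under stated
assumptions: struct fake_netdev, watchdog_would_fire() and
reconfigure_begin() are hypothetical names, not kernel APIs, and the check
is a simplified picture of dev_watchdog() ignoring devices whose carrier
is off.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few net_device fields that matter here. */
struct fake_netdev {
	bool carrier_ok;              /* models netif_carrier_ok() */
	bool tx_stopped;              /* models a stopped TX queue */
	unsigned long trans_start;    /* time of the last transmit, in ticks */
	unsigned long watchdog_timeo; /* allowed TX stall, in ticks */
};

/*
 * Simplified model of the watchdog decision: a stopped queue that has
 * been idle for longer than the timeout is reported as hung, but only
 * while the device claims to have carrier.
 */
bool watchdog_would_fire(const struct fake_netdev *dev, unsigned long now)
{
	if (!dev->carrier_ok)
		return false;
	return dev->tx_stopped && (now - dev->trans_start) > dev->watchdog_timeo;
}

/*
 * Models the start of a reconfiguration. Stopping the queues alone
 * (signal_carrier_off == false) leaves the watchdog armed; also dropping
 * carrier, as the patch does with netif_carrier_off(), disarms it.
 */
void reconfigure_begin(struct fake_netdev *dev, bool signal_carrier_off)
{
	dev->tx_stopped = true;
	if (signal_carrier_off)
		dev->carrier_ok = false;
}
```

In this model, a reconfiguration that outlasts watchdog_timeo trips the
watchdog only when carrier is left on, which matches the false trigger
the patch description mentions.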


* Re: [PATCH bpf] selftests/bpf: make directory prerequisites order-only
From: Andrii Nakryiko @ 2019-07-16 19:39 UTC
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, Ilya Leoshkevich, bpf, Network Development, gor,
	Heiko Carstens
In-Reply-To: <CAADnVQKzZQ_mbaMHEU6HA-JEy=1jXvBWULg8yKQY_2zwSmU86g@mail.gmail.com>

On Tue, Jul 16, 2019 at 10:49 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Jul 15, 2019 at 3:22 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > On 7/12/19 3:56 PM, Ilya Leoshkevich wrote:
> > > When directories are used as prerequisites in Makefiles, they can cause
> > > a lot of unnecessary rebuilds, because a directory is considered changed
> > > whenever a file in this directory is added, removed or modified.
> > >
> > > If the only thing a target is interested in is the existence of the
> > > directory it depends on, which is the case for selftests/bpf, this
> > > directory should be specified as an order-only prerequisite: it would
> > > still be created in case it does not exist, but it would not trigger a
> > > rebuild of a target in case it's considered changed.
> > >
> > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> >
> > Applied, thanks!
>
> Hi Ilya,
>
> this commit breaks map_tests.

This change just exposed an existing problem with the Makefile. I have sent
out a fix.

> To reproduce:
> rm map_tests/tests.h
> make
> tests.h will not be regenerated.
> Please provide a fix asap.
> We cannot ship bpf tree with such failure.
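
The order-only semantics discussed in the quoted commit message can be
modelled in a few lines of C. This is a rough sketch and not how GNU make
is implemented: struct prereq and target_is_stale() are hypothetical names,
and timestamps are plain integers. It only shows that a prerequisite listed
after '|' is never compared against the target's timestamp.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one prerequisite of a make target. */
struct prereq {
	long mtime;      /* last modification time */
	bool order_only; /* true if listed after '|' in the rule */
};

/*
 * Returns true if the target must be rebuilt: some normal prerequisite
 * is newer than the target. Order-only prerequisites are skipped, so a
 * directory whose mtime changes does not force a rebuild.
 */
bool target_is_stale(long target_mtime, const struct prereq *prereqs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (prereqs[i].order_only)
			continue;
		if (prereqs[i].mtime > target_mtime)
			return true;
	}
	return false;
}
```

The same model hints at the failure reported above: a file that is not
listed as a prerequisite at all is never consulted, so deleting it cannot
trigger a rebuild.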


* Re: [RFC bpf-next 0/8] bpf: accelerate insn patching speed
From: Jiong Wang @ 2019-07-16 19:39 UTC
  To: Alexei Starovoitov
  Cc: Jiong Wang, Andrii Nakryiko, Daniel Borkmann, Edward Cree,
	Naveen N. Rao, Andrii Nakryiko, Jakub Kicinski, bpf, Networking,
	oss-drivers, Yonghong Song
In-Reply-To: <20190716161701.mk5ye47aj2slkdjp@ast-mbp.dhcp.thefacebook.com>


Alexei Starovoitov writes:

> On Tue, Jul 16, 2019 at 09:50:25AM +0100, Jiong Wang wrote:
>> 
>> Let me digest a little bit and do some coding, then I will come back. Some
>> issues only show up during in-depth coding. I kind of feel that handling
>> the aux reference in the verifier layer is the part that will still
>> introduce some unclean code.
>
> I'm still internalizing this discussion. I only want to point out
> that I think it's better to have a simpler algorithm that consumes more
> memory and is slower than a more complex algorithm that is more cpu/memory
> efficient. Here we're aiming at a 10x improvement anyway, so extra cpu and
> memory here and there are a good trade-off to make.
>
>> >> If there is no dead insn elimination opt, then we could just adjust
>> >> offsets. When insns are deleted, I feel the logic becomes more
>> >> complex. One subprog could be completely or partially deleted, so
>> >> I feel that just recalculating the whole subprog info as a by-product
>> >> is much simpler.
>> >
>> > What's the situation where the entirety of a subprog can be deleted?
>> 
>> Suppose you have a conditional jmp_imm where the true path calls one
>> subprog and the false path calls the other. If the insn walker later finds
>> the condition is always true, then the subprog on the false path won't be
>> marked as "seen", so it is entirely deleted.
>> 
>> I actually think that, in theory, one subprog could be deleted entirely,
>> so if we support insn deletion inside the verifier, then range info like
>> line_info/subprog_info needs to handle an entire range being deleted.
>
> I don't think dead code elim can remove subprogs.
> cfg check rejects code with dead progs.

The cfg check rejects unreachable code based on static analysis, but couldn't
a subprog that passed the cfg check still be identified as dead later, after
runtime value tracking, once check_cond_jmp_op prunes the subprog call on the
false path and makes the subprog dead?

For example:

  static subprog1()
  static subprog2()
  
  foo(int mask)
  {
    if (mask & 0x1)
      subprog1();
    else
      subprog2();
    ...
  }

foo's incoming arg is a mask, and depending on whether the LSB is set, it
calls different init functions, subprog1 or subprog2.

foo might be called with a constant mask, for example 0x8000. Then, if foo
has no other caller and subprog1 has no other caller either, subprog1 is
dead.

LLVM is smart enough to optimize out such dead functions if they are only
visible in the same compilation unit, and people might only write code in
such a shape when the functions are encapsulated in a lib. But if a case like
the above does occur, I think it is possible for one subprog to be deleted
entirely by the verifier.
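
A compilable user-space version of the example may make the shape clearer.
This is only a sketch: the call counters and run_only_caller() are
hypothetical additions used to observe which path runs, and nothing here
is verifier code.

```c
/* Hypothetical counters, used only to observe which subprog runs. */
static int subprog1_calls;
static int subprog2_calls;

static void subprog1(void)
{
	subprog1_calls++;
}

static void subprog2(void)
{
	subprog2_calls++;
}

/* Dispatches on the least significant bit of mask. */
static void foo(int mask)
{
	if (mask & 0x1)
		subprog1();
	else
		subprog2();
}

/*
 * The only caller passes a constant whose LSB is clear, so the branch
 * that calls subprog1 can never be taken from here. A purely static
 * reachability check still sees a call edge to subprog1; only value
 * tracking reveals that the edge is dead.
 */
void run_only_caller(void)
{
	foo(0x8000);
}
```

After run_only_caller(), subprog1_calls stays at 0 while subprog2_calls
is 1, which is the "dead only after value tracking" situation in question.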

> I don't think we have a test for such 'dead prog only due to verifier walk'
> situation. I wonder what happens :)



* [PATCH bpf 2/2] selftests/bpf: structure test_{progs,maps,verifier} test runners uniformly
From: Andrii Nakryiko @ 2019-07-16 19:38 UTC
  To: bpf, netdev, daniel, ast; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko
In-Reply-To: <20190716193837.2808971-1-andriin@fb.com>

It's easier to follow the logic if it's structured the same way.
There is just a slight difference between test_progs/test_maps and
test_verifier: test_verifier's verifier/*.c files are not really compilable
C files (they are more like include headers), so they can't be specified as
explicit dependencies of test_verifier.

Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/testing/selftests/bpf/Makefile | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 9bc68d8abc5f..11c9c62c3362 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -176,6 +176,7 @@ endif
 endif
 
 TEST_PROGS_CFLAGS := -I. -I$(OUTPUT)
+TEST_MAPS_CFLAGS := -I. -I$(OUTPUT)
 TEST_VERIFIER_CFLAGS := -I. -I$(OUTPUT) -Iverifier
 
 ifneq ($(SUBREG_CODEGEN),)
@@ -227,16 +228,14 @@ ifeq ($(DWARF2BTF),y)
 	$(BTF_PAHOLE) -J $@
 endif
 
-PROG_TESTS_H := $(OUTPUT)/prog_tests/tests.h
-test_progs.c: $(PROG_TESTS_H)
-$(OUTPUT)/test_progs: CFLAGS += $(TEST_PROGS_CFLAGS)
-$(OUTPUT)/test_progs: prog_tests/*.c
-
 PROG_TESTS_DIR = $(OUTPUT)/prog_tests
 $(PROG_TESTS_DIR):
 	mkdir -p $@
-
+PROG_TESTS_H := $(PROG_TESTS_DIR)/tests.h
 PROG_TESTS_FILES := $(wildcard prog_tests/*.c)
+test_progs.c: $(PROG_TESTS_H)
+$(OUTPUT)/test_progs: CFLAGS += $(TEST_PROGS_CFLAGS)
+$(OUTPUT)/test_progs: test_progs.c $(PROG_TESTS_H) $(PROG_TESTS_FILES)
 $(PROG_TESTS_H): $(PROG_TESTS_FILES) | $(PROG_TESTS_DIR)
 	$(shell ( cd prog_tests/; \
 		  echo '/* Generated header, do not edit */'; \
@@ -250,7 +249,6 @@ $(PROG_TESTS_H): $(PROG_TESTS_FILES) | $(PROG_TESTS_DIR)
 		  echo '#endif' \
 		 ) > $(PROG_TESTS_H))
 
-TEST_MAPS_CFLAGS := -I. -I$(OUTPUT)
 MAP_TESTS_DIR = $(OUTPUT)/map_tests
 $(MAP_TESTS_DIR):
 	mkdir -p $@
@@ -272,17 +270,15 @@ $(MAP_TESTS_H): $(MAP_TESTS_FILES) | $(MAP_TESTS_DIR)
 		  echo '#endif' \
 		 ) > $(MAP_TESTS_H))
 
-VERIFIER_TESTS_H := $(OUTPUT)/verifier/tests.h
-test_verifier.c: $(VERIFIER_TESTS_H)
-$(OUTPUT)/test_verifier: CFLAGS += $(TEST_VERIFIER_CFLAGS)
-$(OUTPUT)/test_verifier: test_verifier.c $(VERIFIER_TESTS_H)
-
 VERIFIER_TESTS_DIR = $(OUTPUT)/verifier
 $(VERIFIER_TESTS_DIR):
 	mkdir -p $@
-
+VERIFIER_TESTS_H := $(VERIFIER_TESTS_DIR)/tests.h
 VERIFIER_TEST_FILES := $(wildcard verifier/*.c)
-$(OUTPUT)/verifier/tests.h: $(VERIFIER_TEST_FILES) | $(VERIFIER_TESTS_DIR)
+test_verifier.c: $(VERIFIER_TESTS_H)
+$(OUTPUT)/test_verifier: CFLAGS += $(TEST_VERIFIER_CFLAGS)
+$(OUTPUT)/test_verifier: test_verifier.c $(VERIFIER_TESTS_H)
+$(VERIFIER_TESTS_H): $(VERIFIER_TEST_FILES) | $(VERIFIER_TESTS_DIR)
 	$(shell ( cd verifier/; \
 		  echo '/* Generated header, do not edit */'; \
 		  echo '#ifdef FILL_ARRAY'; \
-- 
2.17.1



* [PATCH bpf 1/2] selftests/bpf: fix test_verifier/test_maps make dependencies
From: Andrii Nakryiko @ 2019-07-16 19:38 UTC
  To: bpf, netdev, daniel, ast
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Ilya Leoshkevich,
	Stanislav Fomichev, Martin KaFai Lau

e46fc22e60a4 ("selftests/bpf: make directory prerequisites order-only")
exposed an existing problem in the Makefile for the test_verifier and
test_maps tests: their dependency on the auto-generated header file listing
all tests wasn't recorded explicitly. This patch fixes these issues.

Fixes: 51a0e301a563 ("bpf: Add BPF_MAP_TYPE_SK_STORAGE test to test_maps")
Fixes: 6b7b6995c43e ("selftests: bpf: tests.h should depend on .c files, not the output")
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/testing/selftests/bpf/Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 1296253b3422..9bc68d8abc5f 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -86,8 +86,6 @@ $(OUTPUT)/urandom_read: $(OUTPUT)/%: %.c
 $(OUTPUT)/test_stub.o: test_stub.c
 	$(CC) $(TEST_PROGS_CFLAGS) $(CFLAGS) -c -o $@ $<
 
-$(OUTPUT)/test_maps: map_tests/*.c
-
 BPFOBJ := $(OUTPUT)/libbpf.a
 
 $(TEST_GEN_PROGS): $(OUTPUT)/test_stub.o $(BPFOBJ)
@@ -257,9 +255,10 @@ MAP_TESTS_DIR = $(OUTPUT)/map_tests
 $(MAP_TESTS_DIR):
 	mkdir -p $@
 MAP_TESTS_H := $(MAP_TESTS_DIR)/tests.h
+MAP_TESTS_FILES := $(wildcard map_tests/*.c)
 test_maps.c: $(MAP_TESTS_H)
 $(OUTPUT)/test_maps: CFLAGS += $(TEST_MAPS_CFLAGS)
-MAP_TESTS_FILES := $(wildcard map_tests/*.c)
+$(OUTPUT)/test_maps: test_maps.c $(MAP_TESTS_H) $(MAP_TESTS_FILES)
 $(MAP_TESTS_H): $(MAP_TESTS_FILES) | $(MAP_TESTS_DIR)
 	$(shell ( cd map_tests/; \
 		  echo '/* Generated header, do not edit */'; \
@@ -276,6 +275,7 @@ $(MAP_TESTS_H): $(MAP_TESTS_FILES) | $(MAP_TESTS_DIR)
 VERIFIER_TESTS_H := $(OUTPUT)/verifier/tests.h
 test_verifier.c: $(VERIFIER_TESTS_H)
 $(OUTPUT)/test_verifier: CFLAGS += $(TEST_VERIFIER_CFLAGS)
+$(OUTPUT)/test_verifier: test_verifier.c $(VERIFIER_TESTS_H)
 
 VERIFIER_TESTS_DIR = $(OUTPUT)/verifier
 $(VERIFIER_TESTS_DIR):
-- 
2.17.1



* Re: [PATCH ghak90 V6 02/10] audit: add container id
From: Richard Guy Briggs @ 2019-07-16 19:38 UTC
  To: Paul Moore
  Cc: Tycho Andersen, containers, linux-api, Linux-Audit Mailing List,
	linux-fsdevel, LKML, netdev, netfilter-devel, sgrubb, omosnace,
	dhowells, simo, Eric Paris, Serge Hallyn, ebiederm, nhorman
In-Reply-To: <CAHC9VhRFeCFSCn=m6wgDK2tXBN1euc2+bw8o=CfNwptk8t=j7A@mail.gmail.com>

On 2019-07-15 16:38, Paul Moore wrote:
> On Mon, Jul 8, 2019 at 1:51 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2019-05-29 11:29, Paul Moore wrote:
> 
> ...
> 
> > > The idea is that only container orchestrators should be able to
> > > set/modify the audit container ID, and since setting the audit
> > > container ID can have a significant effect on the records captured
> > > (and their routing to multiple daemons when we get there) modifying
> > > the audit container ID is akin to modifying the audit configuration
> > > which is why it is gated by CAP_AUDIT_CONTROL.  The current thinking
> > > is that you would only change the audit container ID from one
> > > set/inherited value to another if you were nesting containers, in
> > > which case the nested container orchestrator would need to be granted
> > > CAP_AUDIT_CONTROL (which everyone to date seems to agree is a workable
> > > compromise).  We did consider allowing for a chain of nested audit
> > > container IDs, but the implications of doing so are significant
> > > (implementation mess, runtime cost, etc.) so we are leaving that out
> > > of this effort.
> >
> > We had previously discussed the idea of restricting
> > orchestrators/engines to setting the audit container identifier only
> > on their own descendants, but it was discarded.  I've added a
> > check to ensure this is now enforced.
> 
> When we weren't allowing nested orchestrators it wasn't necessary, but
> with the move to support nesting I believe this will be a requirement.
> We might also need/want to restrict audit container ID changes if a
> descendant is acting as a container orchestrator and managing one or
> more audit container IDs; although I'm less certain of the need for
> this.

I was of the opinion it was necessary before with single-layer parallel
orchestrators/engines.

> > I've also added a check to ensure that a process can't set its own audit
> > container identifier ...
> 
> What does this protect against, or what problem does this solve?
> Considering how easy it is to fork/exec, it seems like this could be
> trivially bypassed.

Well, for starters, it would remove one layer of nesting.  It would
separate the functional layers of processes.  Other than that, it seems
like a gut feeling that it is just wrong to allow it.  It seems like a
layer violation that one container orchestrator/engine could set its own
audit container identifier and then set its children as well.  It would
be its own parent.  It would make it harder to verify adherence to
descendancy and inheritance rules.

> > ... and that if the identifier is already set, then the
> > orchestrator/engine must be in a descendant user namespace from the
> > orchestrator that set the previously inherited audit container
> > identifier.
> 
> You lost me here ... although I don't like the idea of relying on X
> namespace inheritance for a hard coded policy on setting the audit
> container ID; we've worked hard to keep this independent of any
> definition of a "container" and it would sadden me greatly if we had
> to go back on that.

This would seem to be the one concession I'm reluctantly making to try
to solve this nested container orchestrator/engine challenge.

Would backing off on that descendant user namespace requirement and only
requiring that a nested audit container identifier be permitted only on a
descendant task be sufficient?  It may be for this use case, but I suspect
not for additional audit daemons (we're not there yet) and message
routing to those daemons.

The one difference here is that it does not depend on this if the audit
container identifier has not already been set.

> paul moore

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


* Re: [PATCH] net: ethernet: ti: cpsw: Add of_node_put() before return and break
From: David Miller @ 2019-07-16 19:37 UTC
  To: nishkadg.linux; +Cc: grygorii.strashko, ivan.khoronzhuk, linux-omap, netdev
In-Reply-To: <20190716054843.2957-1-nishkadg.linux@gmail.com>

From: Nishka Dasgupta <nishkadg.linux@gmail.com>
Date: Tue, 16 Jul 2019 11:18:43 +0530

> Each iteration of for_each_available_child_of_node puts the previous
> node, but in the case of a return or break from the middle of the loop,
> there is no put, thus causing a memory leak.

What an incredibly, terribly designed loop macro this
for_each_available_child_of_node() thing is.

A macro with non-trivial, invisible, side effects.  It requires
special handling of reference counting of objects if the loop is
terminated early.

This is so error prone.  Is it any wonder we have to go through the
entire tree fixing up nearly every use of this thing?

Instead of looking at the automated analysis of this and saying "great
here are all of these places where I can fix bugs", I would instead
appreciate it if the reaction was more like "this interface is
obviously impossible to use in a non-error-prone fashion, we should
fix it."

I guess I have no choice but to apply your fixes, but the larger issue
must be addressed instead.
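
The hazard is easy to reproduce in a user-space model. The sketch below is
an illustration only: struct node, node_get(), node_put(), next_child() and
for_each_child() are hypothetical stand-ins for the device tree helpers,
written to mimic the contract described above (each step puts the previous
node and returns the next one with a reference held).

```c
#include <stddef.h>

/* Hypothetical user-space stand-in for a reference-counted device node. */
struct node {
	int refcount;
	struct node *next_sibling;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Drop the reference on prev and return the next child with a reference. */
static struct node *next_child(struct node *first, struct node *prev)
{
	struct node *next = prev ? prev->next_sibling : first;

	node_put(prev);
	return node_get(next);
}

#define for_each_child(first, child) \
	for ((child) = next_child((first), NULL); (child) != NULL; \
	     (child) = next_child((first), (child)))

/* Early return without a put: the reference on the match is leaked. */
int find_index_leaky(struct node *first, const struct node *target)
{
	struct node *child;
	int i = 0;

	for_each_child(first, child) {
		if (child == target)
			return i;
		i++;
	}
	return -1;
}

/* Same search, but the reference is dropped before leaving the loop. */
int find_index_fixed(struct node *first, const struct node *target)
{
	struct node *child;
	int i = 0;

	for_each_child(first, child) {
		if (child == target) {
			node_put(child);
			return i;
		}
		i++;
	}
	return -1;
}
```

Nothing at the call site of find_index_leaky() hints that a reference is
still held, which is the invisible side effect being objected to. A loop
that runs to completion is balanced in both variants.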


* Re: [PATCH iproute2 0/2] Fix IPv6 tunnel add when dev param is used
From: Stephen Hemminger @ 2019-07-16 19:28 UTC
  To: Andrea Claudi; +Cc: netdev, dsahern
In-Reply-To: <cover.1562667648.git.aclaudi@redhat.com>

On Tue,  9 Jul 2019 15:16:49 +0200
Andrea Claudi <aclaudi@redhat.com> wrote:

> Commit ba126dcad20e6 ("ip6tunnel: fix 'ip -6 {show|change} dev
> <name>' cmds") breaks IPv6 tunnel creation when dev parameter
> is used.
> 
> This series reverts the original commit, which mistakenly uses
> dev for the tunnel name, while addressing an issue on tunnel change
> when no interface name is specified.
> 
> Andrea Claudi (2):
>   Revert "ip6tunnel: fix 'ip -6 {show|change} dev <name>' cmds"
>   ip tunnel: warn when changing IPv6 tunnel without tunnel name
> 
>  ip/ip6tunnel.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 

Both applied, thanks


* Re: [RFC PATCH 5/5] PTP: Add support for Intel PMC Timed GPIO Controller
From: Shannon Nelson @ 2019-07-16 19:14 UTC
  To: Felipe Balbi, Richard Cochran
  Cc: netdev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H . Peter Anvin, x86, linux-kernel, Christopher S . Hall
In-Reply-To: <20190716072038.8408-6-felipe.balbi@linux.intel.com>

On 7/16/19 12:20 AM, Felipe Balbi wrote:
> Add a driver supporting Intel Timed GPIO controller available as part
> of some Intel PMCs.
>
> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>

Hi Felipe, just a couple of quick comments:

There are several places where a line is continued on the next line, but 
should be indented to match the opening parenthesis on a function call 
or 'if' expression.

Shouldn't there be a kthread_stop() in intel_pmc_tgpio_remove(), or did 
I miss that somewhere?

Cheers,
sln


> ---
>   drivers/ptp/Kconfig               |   8 +
>   drivers/ptp/Makefile              |   1 +
>   drivers/ptp/ptp-intel-pmc-tgpio.c | 378 ++++++++++++++++++++++++++++++
>   3 files changed, 387 insertions(+)
>   create mode 100644 drivers/ptp/ptp-intel-pmc-tgpio.c
>
> diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
> index 9b8fee5178e8..bb0fce70a783 100644
> --- a/drivers/ptp/Kconfig
> +++ b/drivers/ptp/Kconfig
> @@ -107,6 +107,14 @@ config PTP_1588_CLOCK_PCH
>   	  To compile this driver as a module, choose M here: the module
>   	  will be called ptp_pch.
>   
> +config PTP_INTEL_PMC_TGPIO
> +	tristate "Intel PMC Timed GPIO"
> +	depends on X86
> +	depends on ACPI
> +	imply PTP_1588_CLOCK
> +	help
> +	  This driver adds support for Intel PMC Timed GPIO Controller
> +
>   config PTP_1588_CLOCK_KVM
>   	tristate "KVM virtual PTP clock"
>   	depends on PTP_1588_CLOCK
> diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile
> index 677d1d178a3e..ff89c90ace82 100644
> --- a/drivers/ptp/Makefile
> +++ b/drivers/ptp/Makefile
> @@ -7,6 +7,7 @@ ptp-y					:= ptp_clock.o ptp_chardev.o ptp_sysfs.o
>   obj-$(CONFIG_PTP_1588_CLOCK)		+= ptp.o
>   obj-$(CONFIG_PTP_1588_CLOCK_DTE)	+= ptp_dte.o
>   obj-$(CONFIG_PTP_1588_CLOCK_IXP46X)	+= ptp_ixp46x.o
> +obj-$(CONFIG_PTP_INTEL_PMC_TGPIO)	+= ptp-intel-pmc-tgpio.o
>   obj-$(CONFIG_PTP_1588_CLOCK_PCH)	+= ptp_pch.o
>   obj-$(CONFIG_PTP_1588_CLOCK_KVM)	+= ptp_kvm.o
>   obj-$(CONFIG_PTP_1588_CLOCK_QORIQ)	+= ptp-qoriq.o
> diff --git a/drivers/ptp/ptp-intel-pmc-tgpio.c b/drivers/ptp/ptp-intel-pmc-tgpio.c
> new file mode 100644
> index 000000000000..880ece34868a
> --- /dev/null
> +++ b/drivers/ptp/ptp-intel-pmc-tgpio.c
> @@ -0,0 +1,378 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Intel Timed GPIO Controller Driver
> + *
> + * Copyright (C) 2018 Intel Corporation
> + * Author: Felipe Balbi <felipe.balbi@linux.intel.com>
> + */
> +
> +#include <linux/acpi.h>
> +#include <linux/bitops.h>
> +#include <linux/gpio.h>
> +#include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/kthread.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/platform_device.h>
> +#include <linux/ptp_clock_kernel.h>
> +#include <asm/tsc.h>
> +
> +#define TGPIOCTL		0x00
> +#define TGPIOCOMPV31_0		0x10
> +#define TGPIOCOMPV63_32		0x14
> +#define TGPIOPIV31_0		0x18
> +#define TGPIOPIV63_32		0x1c
> +#define TGPIOTCV31_0		0x20
> +#define TGPIOTCV63_32		0x24
> +#define TGPIOECCV31_0		0x28
> +#define TGPIOECCV63_32		0x2c
> +#define TGPIOEC31_0		0x30
> +#define TGPIOEC63_32		0x34
> +
> +/* Control Register */
> +#define TGPIOCTL_EN		BIT(0)
> +#define TGPIOCTL_DIR		BIT(1)
> +#define TGPIOCTL_EP		GENMASK(3, 2)
> +#define TGPIOCTL_EP_RISING_EDGE	(0 << 2)
> +#define TGPIOCTL_EP_FALLING_EDGE (1 << 2)
> +#define TGPIOCTL_EP_TOGGLE_EDGE	(2 << 2)
> +#define TGPIOCTL_PM		BIT(4)
> +
> +#define NSECS_PER_SEC		1000000000
> +#define TGPIO_MAX_ADJ_TIME	999999900
> +
> +struct intel_pmc_tgpio {
> +	struct ptp_clock_info	info;
> +	struct ptp_clock	*clock;
> +
> +	struct mutex		lock;
> +	struct device		*dev;
> +	void __iomem		*base;
> +
> +	struct task_struct	*event_thread;
> +	bool			input;
> +};
> +#define to_intel_pmc_tgpio(i)	(container_of((i), struct intel_pmc_tgpio, info))
> +
> +static inline u64 to_intel_pmc_tgpio_time(struct ptp_clock_time *t)
> +{
> +	return t->sec * NSECS_PER_SEC + t->nsec;
> +}
> +
> +static inline u64 intel_pmc_tgpio_readq(void __iomem *base, u32 offset)
> +{
> +	return lo_hi_readq(base + offset);
> +}
> +
> +static inline void intel_pmc_tgpio_writeq(void __iomem *base, u32 offset, u64 v)
> +{
> +	lo_hi_writeq(v, base + offset);
> +}
> +
> +static inline u32 intel_pmc_tgpio_readl(void __iomem *base, u32 offset)
> +{
> +	return readl(base + offset);
> +}
> +
> +static inline void intel_pmc_tgpio_writel(void __iomem *base, u32 offset, u32 value)
> +{
> +	writel(value, base + offset);
> +}
> +
> +static struct ptp_pin_desc intel_pmc_tgpio_pin_config[] = {
> +	{
> +		.name	= "pin0",
> +		.index	= 0,
> +		.func	= PTP_PF_NONE,
> +		.chan	= 0,
> +	}
> +};
> +
> +static int intel_pmc_tgpio_gettime64(struct ptp_clock_info *info,
> +		struct timespec64 *ts)
> +{
> +	struct intel_pmc_tgpio	*tgpio = to_intel_pmc_tgpio(info);
> +	u64 now;
> +
> +	mutex_lock(&tgpio->lock);
> +	now = get_art_ns_now();
> +	*ts = ns_to_timespec64(now);
> +	mutex_unlock(&tgpio->lock);
> +
> +	return 0;
> +}
> +
> +static int intel_pmc_tgpio_settime64(struct ptp_clock_info *info,
> +		const struct timespec64 *ts)
> +{
> +	return -EOPNOTSUPP;
> +}
> +
> +static int intel_pmc_tgpio_event_thread(void *_tgpio)
> +{
> +	struct intel_pmc_tgpio	*tgpio = _tgpio;
> +	u64 reg;
> +
> +	while (!kthread_should_stop()) {
> +		bool input;
> +		int i;
> +
> +		mutex_lock(&tgpio->lock);
> +		input = tgpio->input;
> +		mutex_unlock(&tgpio->lock);
> +
> +		if (!input)
> +			schedule();
> +
> +		reg = intel_pmc_tgpio_readq(tgpio->base, TGPIOEC31_0);
> +
> +		for (i = 0; i < reg; i++) {
> +			struct ptp_clock_event event;
> +
> +			event.type = PTP_CLOCK_EXTTS;
> +			event.index = 0;
> +			event.timestamp = intel_pmc_tgpio_readq(tgpio->base,
> +					TGPIOTCV31_0);
> +
> +			ptp_clock_event(tgpio->clock, &event);
> +		}
> +		schedule_timeout_interruptible(10);
> +	}
> +
> +	return 0;
> +}
> +
> +static int intel_pmc_tgpio_config_input(struct intel_pmc_tgpio *tgpio,
> +		struct ptp_extts_request *extts, int on)
> +{
> +	u32			ctrl;
> +	bool			input;
> +
> +	ctrl = intel_pmc_tgpio_readl(tgpio->base, TGPIOCTL);
> +	ctrl &= ~TGPIOCTL_EN;
> +	intel_pmc_tgpio_writel(tgpio->base, TGPIOCTL, ctrl);
> +
> +	if (on) {
> +		ctrl |= TGPIOCTL_DIR;
> +
> +		if (extts->flags & PTP_RISING_EDGE &&
> +				extts->flags & PTP_FALLING_EDGE)
> +			ctrl |= TGPIOCTL_EP_TOGGLE_EDGE;
> +		else if (extts->flags & PTP_RISING_EDGE)
> +			ctrl |= TGPIOCTL_EP_RISING_EDGE;
> +		else if (extts->flags & PTP_FALLING_EDGE)
> +			ctrl |= TGPIOCTL_EP_FALLING_EDGE;
> +
> +		/* gotta program all other bits before EN bit is set */
> +		intel_pmc_tgpio_writel(tgpio->base, TGPIOCTL, ctrl);
> +		ctrl |= TGPIOCTL_EN;
> +		input = true;
> +	} else {
> +		ctrl &= ~(TGPIOCTL_DIR | TGPIOCTL_EN);
> +		input = false;
> +	}
> +
> +	intel_pmc_tgpio_writel(tgpio->base, TGPIOCTL, ctrl);
> +	tgpio->input = input;
> +
> +	if (input)
> +		wake_up_process(tgpio->event_thread);
> +
> +	return 0;
> +}
> +
> +static int intel_pmc_tgpio_config_output(struct intel_pmc_tgpio *tgpio,
> +		struct ptp_perout_request *perout, int on)
> +{
> +	u32			ctrl;
> +
> +	ctrl = intel_pmc_tgpio_readl(tgpio->base, TGPIOCTL);
> +	if (on) {
> +		struct ptp_clock_time *period = &perout->period;
> +		struct ptp_clock_time *start = &perout->start;
> +
> +		if (ctrl & TGPIOCTL_EN)
> +			return 0;
> +
> +		intel_pmc_tgpio_writeq(tgpio->base, TGPIOCOMPV31_0,
> +				to_intel_pmc_tgpio_time(start));
> +
> +		intel_pmc_tgpio_writeq(tgpio->base, TGPIOPIV31_0,
> +				to_intel_pmc_tgpio_time(period));
> +
> +		ctrl &= ~TGPIOCTL_DIR;
> +		if (perout->flags & PTP_PEROUT_ONE_SHOT)
> +			ctrl &= ~TGPIOCTL_PM;
> +		else
> +			ctrl |= TGPIOCTL_PM;
> +
> +		/* gotta program all other bits before EN bit is set */
> +		intel_pmc_tgpio_writel(tgpio->base, TGPIOCTL, ctrl);
> +
> +		ctrl |= TGPIOCTL_EN;
> +		intel_pmc_tgpio_writel(tgpio->base, TGPIOCTL, ctrl);
> +	} else {
> +		if (!(ctrl & ~TGPIOCTL_EN))
> +			return 0;
> +
> +		ctrl &= ~(TGPIOCTL_EN | TGPIOCTL_PM);
> +		intel_pmc_tgpio_writel(tgpio->base, TGPIOCTL, ctrl);
> +	}
> +
> +	return 0;
> +}
> +
> +static int intel_pmc_tgpio_enable(struct ptp_clock_info *info,
> +		struct ptp_clock_request *req, int on)
> +{
> +	struct intel_pmc_tgpio	*tgpio = to_intel_pmc_tgpio(info);
> +	int			ret = -EOPNOTSUPP;
> +
> +	mutex_lock(&tgpio->lock);
> +	switch (req->type) {
> +	case PTP_CLK_REQ_EXTTS:
> +		ret = intel_pmc_tgpio_config_input(tgpio, &req->extts, on);
> +		break;
> +	case PTP_CLK_REQ_PEROUT:
> +		ret = intel_pmc_tgpio_config_output(tgpio, &req->perout, on);
> +		break;
> +	default:
> +		break;
> +	}
> +	mutex_unlock(&tgpio->lock);
> +
> +	return ret;
> +}
> +
> +static int intel_pmc_tgpio_get_time_fn(ktime_t *device_time,
> +		struct system_counterval_t *system_counter, void *_tgpio)
> +{
> +	get_tsc_ns(system_counter, device_time);
> +	return 0;
> +}
> +
> +static int intel_pmc_tgpio_getcrosststamp(struct ptp_clock_info *info,
> +		struct system_device_crosststamp *cts)
> +{
> +	struct intel_pmc_tgpio	*tgpio = to_intel_pmc_tgpio(info);
> +
> +	return get_device_system_crosststamp(intel_pmc_tgpio_get_time_fn, tgpio,
> +			NULL, cts);
> +}
> +
> +static int intel_pmc_tgpio_counttstamp(struct ptp_clock_info *info,
> +		struct ptp_event_count_tstamp *count)
> +{
> +	struct intel_pmc_tgpio	*tgpio = to_intel_pmc_tgpio(info);
> +	u32 dt_hi_tmp;
> +	u32 dt_hi;
> +	u32 dt_lo;
> +
> +	dt_hi_tmp = intel_pmc_tgpio_readl(tgpio->base, TGPIOTCV63_32);
> +	dt_lo = intel_pmc_tgpio_readl(tgpio->base, TGPIOTCV31_0);
> +
> +	count->event_count = intel_pmc_tgpio_readl(tgpio->base, TGPIOECCV63_32);
> +	count->event_count <<= 32;
> +	count->event_count |= intel_pmc_tgpio_readl(tgpio->base, TGPIOECCV31_0);
> +
> +	dt_hi = intel_pmc_tgpio_readl(tgpio->base, TGPIOTCV63_32);
> +
> +	if (dt_hi_tmp != dt_hi && dt_lo & 0x80000000)
> +		count->device_time.sec = dt_hi_tmp;
> +	else
> +		count->device_time.sec = dt_hi;
> +
> +	count->device_time.nsec = dt_lo;
> +
> +	return 0;
> +}
> +
> +static int intel_pmc_tgpio_verify(struct ptp_clock_info *ptp, unsigned int pin,
> +		enum ptp_pin_function func, unsigned int chan)
> +{
> +	return 0;
> +}
> +
> +static const struct ptp_clock_info intel_pmc_tgpio_info = {
> +	.owner		= THIS_MODULE,
> +	.name		= "Intel PMC TGPIO",
> +	.max_adj	= 50000000,
> +	.n_pins		= 1,
> +	.n_ext_ts	= 1,
> +	.n_per_out	= 1,
> +	.pin_config	= intel_pmc_tgpio_pin_config,
> +	.gettime64	= intel_pmc_tgpio_gettime64,
> +	.settime64	= intel_pmc_tgpio_settime64,
> +	.enable		= intel_pmc_tgpio_enable,
> +	.getcrosststamp	= intel_pmc_tgpio_getcrosststamp,
> +	.counttstamp	= intel_pmc_tgpio_counttstamp,
> +	.verify		= intel_pmc_tgpio_verify,
> +};
> +
> +static int intel_pmc_tgpio_probe(struct platform_device *pdev)
> +{
> +	struct intel_pmc_tgpio	*tgpio;
> +	struct device		*dev;
> +	struct resource		*res;
> +
> +	dev = &pdev->dev;
> +	tgpio = devm_kzalloc(dev, sizeof(*tgpio), GFP_KERNEL);
> +	if (!tgpio)
> +		return -ENOMEM;
> +
> +	tgpio->dev = dev;
> +	tgpio->info = intel_pmc_tgpio_info;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	tgpio->base = devm_ioremap_resource(dev, res);
> +	if (IS_ERR(tgpio->base))
> +		return PTR_ERR(tgpio->base);
> +
> +	mutex_init(&tgpio->lock);
> +	platform_set_drvdata(pdev, tgpio);
> +
> +	tgpio->event_thread = kthread_create(intel_pmc_tgpio_event_thread,
> +			tgpio, dev_name(tgpio->dev));
> +	if (IS_ERR(tgpio->event_thread))
> +		return PTR_ERR(tgpio->event_thread);
> +
> +	tgpio->clock = ptp_clock_register(&tgpio->info, &pdev->dev);
> +	if (IS_ERR(tgpio->clock))
> +		return PTR_ERR(tgpio->clock);
> +
> +	wake_up_process(tgpio->event_thread);
> +
> +	return 0;
> +}
> +
> +static int intel_pmc_tgpio_remove(struct platform_device *pdev)
> +{
> +	struct intel_pmc_tgpio	*tgpio = platform_get_drvdata(pdev);
> +
> +	ptp_clock_unregister(tgpio->clock);
> +
> +	return 0;
> +}
> +
> +static const struct acpi_device_id intel_pmc_acpi_match[] = {
> +	/* TODO */
> +
> +	{  },
> +};
> +
> +/* MODULE_ALIAS("acpi*:TODO:*"); */
> +
> +static struct platform_driver intel_pmc_tgpio_driver = {
> +	.probe		= intel_pmc_tgpio_probe,
> +	.remove		= intel_pmc_tgpio_remove,
> +	.driver		= {
> +		.name	= "intel-pmc-tgpio",
> +		.acpi_match_table = ACPI_PTR(intel_pmc_acpi_match),
> +	},
> +};
> +
> +module_platform_driver(intel_pmc_tgpio_driver);
> +
> +MODULE_AUTHOR("Felipe Balbi <felipe.balbi@linux.intel.com>");
> +MODULE_LICENSE("GPL v2");
> +MODULE_DESCRIPTION("Intel PMC Timed GPIO Controller Driver");



* Re: [PATCH net v2] skbuff: fix compilation warnings in skb_dump()
From: Willem de Bruijn @ 2019-07-16 19:01 UTC
  To: Nathan Chancellor
  Cc: Qian Cai, David Miller, Willem de Bruijn, joe, clang-built-linux,
	Network Development, LKML
In-Reply-To: <20190716165136.GC37903@archlinux-threadripper>

On Tue, Jul 16, 2019 at 6:53 PM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> On Tue, Jul 16, 2019 at 11:43:05AM -0400, Qian Cai wrote:
> > The commit 6413139dfc64 ("skbuff: increase verbosity when dumping skb
> > data") introduced a few compilation warnings.
> >
> > net/core/skbuff.c:766:32: warning: format specifies type 'unsigned short'
> > but the argument has type 'unsigned int' [-Wformat]
> >                        level, sk->sk_family, sk->sk_type, sk->sk_protocol);
> >                                              ^~~~~~~~~~~
> > net/core/skbuff.c:766:45: warning: format specifies type 'unsigned short'
> > but the argument has type 'unsigned int' [-Wformat]
> >                        level, sk->sk_family, sk->sk_type, sk->sk_protocol);
> >                                                           ^~~~~~~~~~~~~~~
> >
> > Fix them by using the proper types.
> >
> > Fixes: 6413139dfc64 ("skbuff: increase verbosity when dumping skb data")
> > Signed-off-by: Qian Cai <cai@lca.pw>
>
> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>

Acked-by: Willem de Bruijn <willemb@google.com>

Thanks Qian.
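
For anyone reading along, here is a small user-space sketch of this class
of warning and one way to avoid it. It is an assumption-laden illustration,
not the applied patch: struct fake_sock and format_sock_ids() are
hypothetical, and the field widths are only chosen to resemble bitfields
declared as unsigned int.

```c
#include <stdio.h>

/* Hypothetical stand-in; the bitfields have type unsigned int. */
struct fake_sock {
	unsigned short family;
	unsigned int type : 16;
	unsigned int protocol : 8;
};

/*
 * Formats the three identifiers into buf. "%hu" matches the unsigned
 * short field; the bitfields are printed with "%u" and an explicit cast
 * so that the specifier and the argument type agree.
 */
int format_sock_ids(char *buf, size_t len, const struct fake_sock *sk)
{
	return snprintf(buf, len, "family=%hu type=%u protocol=%u",
			sk->family,
			(unsigned int)sk->type,
			(unsigned int)sk->protocol);
}
```

Using "%hu" for the two bitfields instead would reproduce the -Wformat
diagnostic quoted above when built with clang.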


* Re: [PATCH iproute2-rc 2/8] rdma: Add "stat qp show" support
From: Stephen Hemminger @ 2019-07-16 19:01 UTC
  To: Leon Romanovsky
  Cc: Leon Romanovsky, netdev, David Ahern, Mark Zhang,
	RDMA mailing list
In-Reply-To: <20190710072455.9125-3-leon@kernel.org>

On Wed, 10 Jul 2019 10:24:49 +0300
Leon Romanovsky <leon@kernel.org> wrote:

> From: Mark Zhang <markz@mellanox.com>
> 
> This patch presents link, id, task name, lqpn, as well as all sub
> counters of a QP counter.
> A QP counter is a dynamically allocated statistic counter that is
> bound with one or more QPs. It has several sub-counters, each is
> used for a different purpose.
> 
> Examples:
> $ rdma stat qp show
> link mlx5_2/1 cntn 5 pid 31609 comm client.1 rx_write_requests 0
> rx_read_requests 0 rx_atomic_requests 0 out_of_buffer 0 out_of_sequence 0
> duplicate_request 0 rnr_nak_retry_err 0 packet_seq_err 0
> implied_nak_seq_err 0 local_ack_timeout_err 0 resp_local_length_error 0
> resp_cqe_error 0 req_cqe_error 0 req_remote_invalid_request 0
> req_remote_access_errors 0 resp_remote_access_errors 0
> resp_cqe_flush_error 0 req_cqe_flush_error 0
>     LQPN: <178>
> $ rdma stat show link rocep1s0f5/1
> link rocep1s0f5/1 rx_write_requests 0 rx_read_requests 0 rx_atomic_requests 0 out_of_buffer 0 duplicate_request 0
> rnr_nak_retry_err 0 packet_seq_err 0 implied_nak_seq_err 0 local_ack_timeout_err 0 resp_local_length_error 0 resp_cqe_error 0
> req_cqe_error 0 req_remote_invalid_request 0 req_remote_access_errors 0 resp_remote_access_errors 0 resp_cqe_flush_error 0
> req_cqe_flush_error 0 rp_cnp_ignored 0 rp_cnp_handled 0 np_ecn_marked_roce_packets 0 np_cnp_sent 0
> $ rdma stat show link rocep1s0f5/1 -p
> link rocep1s0f5/1
>     rx_write_requests 0
>     rx_read_requests 0
>     rx_atomic_requests 0
>     out_of_buffer 0
>     duplicate_request 0
>     rnr_nak_retry_err 0
>     packet_seq_err 0
>     implied_nak_seq_err 0
>     local_ack_timeout_err 0
>     resp_local_length_error 0
>     resp_cqe_error 0
>     req_cqe_error 0
>     req_remote_invalid_request 0
>     req_remote_access_errors 0
>     resp_remote_access_errors 0
>     resp_cqe_flush_error 0
>     req_cqe_flush_error 0
>     rp_cnp_ignored 0
>     rp_cnp_handled 0
>     np_ecn_marked_roce_packets 0
>     np_cnp_sent 0
> 
> Signed-off-by: Mark Zhang <markz@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  rdma/Makefile |   2 +-
>  rdma/rdma.c   |   3 +-
>  rdma/rdma.h   |   1 +
>  rdma/stat.c   | 268 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  rdma/utils.c  |   7 ++
>  5 files changed, 279 insertions(+), 2 deletions(-)
>  create mode 100644 rdma/stat.c
> 

Headers have been merged, but this patch does not apply cleanly to current iproute2



* Re: [PATCH 2/9] rcu: Add support for consolidated-RCU reader checking (v3)
From: Paul E. McKenney @ 2019-07-16 18:53 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190716184649.GA130463@google.com>

On Tue, Jul 16, 2019 at 02:46:49PM -0400, Joel Fernandes wrote:
> On Tue, Jul 16, 2019 at 11:38:33AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 15, 2019 at 10:36:58AM -0400, Joel Fernandes (Google) wrote:
> > > This patch adds support for checking RCU reader sections in list
> > > traversal macros. Optionally, if the list macro is called under SRCU or
> > > other lock/mutex protection, then appropriate lockdep expressions can be
> > > passed to make the checks pass.
> > > 
> > > Existing list_for_each_entry_rcu() invocations don't need to pass the
> > > optional fourth argument (cond) unless they are under some non-RCU
> > > > protection and need to make the lockdep check pass.
> > > 
> > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > 
> > Now that I am on the correct version, again please fold in the checks
> > for the extra argument.  The ability to have an optional argument looks
> > quite helpful, especially when compared to growing the RCU API!
> 
> I did fold this and replied with a pull request URL based on /dev branch. But
> we can hold off on the pull requests until we decide on the below comments:
> 
> > A few more things below.
> > > ---
> > >  include/linux/rculist.h  | 28 ++++++++++++++++++++-----
> > >  include/linux/rcupdate.h |  7 +++++++
> > >  kernel/rcu/Kconfig.debug | 11 ++++++++++
> > >  kernel/rcu/update.c      | 44 ++++++++++++++++++++++++----------------
> > >  4 files changed, 67 insertions(+), 23 deletions(-)
> > > 
> > > diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> > > index e91ec9ddcd30..1048160625bb 100644
> > > --- a/include/linux/rculist.h
> > > +++ b/include/linux/rculist.h
> > > @@ -40,6 +40,20 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
> > >   */
> > >  #define list_next_rcu(list)	(*((struct list_head __rcu **)(&(list)->next)))
> > >  
> > > +/*
> > > + * Check during list traversal that we are within an RCU reader
> > > + */
> > > +
> > > +#ifdef CONFIG_PROVE_RCU_LIST
> > 
> > This new Kconfig option is OK temporarily, but unless there is reason to
> > fear malfunction that a few weeks of rcutorture, 0day, and -next won't
> > find, it would be better to just use CONFIG_PROVE_RCU.  The overall goal
> > > is to reduce the number of RCU knobs rather than grow them, much though
> > history might lead one to believe otherwise.  :-/
> 
> If you want, we can try to drop this option and just use PROVE_RCU however I
> must say there may be several warnings that need to be fixed in a short
> period of time (even a few weeks may be too short) considering the 1000+
> uses of RCU lists.

Do many people other than me build with CONFIG_PROVE_RCU?  If so, then
that would be a good reason for a temporary CONFIG_PROVE_RCU_LIST,
as in going away in a release or two once the warnings get fixed.

> But I don't mind dropping it and it may just accelerate the fixing up of all
> callers.

I will let you decide based on the above question.  But if you have
CONFIG_PROVE_RCU_LIST, as noted below, it needs to depend on RCU_EXPERT.

							Thanx, Paul

> > > +#define __list_check_rcu(dummy, cond, ...)				\
> > > +	({								\
> > > +	RCU_LOCKDEP_WARN(!cond && !rcu_read_lock_any_held(),		\
> > > +			 "RCU-list traversed in non-reader section!");	\
> > > +	 })
> > > +#else
> > > +#define __list_check_rcu(dummy, cond, ...) ({})
> > > +#endif
> > > +
> > >  /*
> > >   * Insert a new entry between two known consecutive entries.
> > >   *
> > > @@ -343,14 +357,16 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
> > >   * @pos:	the type * to use as a loop cursor.
> > >   * @head:	the head for your list.
> > >   * @member:	the name of the list_head within the struct.
> > > + * @cond:	optional lockdep expression if called from non-RCU protection.
> > >   *
> > >   * This list-traversal primitive may safely run concurrently with
> > >   * the _rcu list-mutation primitives such as list_add_rcu()
> > >   * as long as the traversal is guarded by rcu_read_lock().
> > >   */
> > > -#define list_for_each_entry_rcu(pos, head, member) \
> > > -	for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \
> > > -		&pos->member != (head); \
> > > +#define list_for_each_entry_rcu(pos, head, member, cond...)		\
> > > +	for (__list_check_rcu(dummy, ## cond, 0),			\
> > > +	     pos = list_entry_rcu((head)->next, typeof(*pos), member);	\
> > > +		&pos->member != (head);					\
> > >  		pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
> > >  
> > >  /**
> > > @@ -616,13 +632,15 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
> > >   * @pos:	the type * to use as a loop cursor.
> > >   * @head:	the head for your list.
> > >   * @member:	the name of the hlist_node within the struct.
> > > + * @cond:	optional lockdep expression if called from non-RCU protection.
> > >   *
> > >   * This list-traversal primitive may safely run concurrently with
> > >   * the _rcu list-mutation primitives such as hlist_add_head_rcu()
> > >   * as long as the traversal is guarded by rcu_read_lock().
> > >   */
> > > -#define hlist_for_each_entry_rcu(pos, head, member)			\
> > > -	for (pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\
> > > +#define hlist_for_each_entry_rcu(pos, head, member, cond...)		\
> > > +	for (__list_check_rcu(dummy, ## cond, 0),			\
> > > +	     pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\
> > >  			typeof(*(pos)), member);			\
> > >  		pos;							\
> > >  		pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index 8f7167478c1d..f3c29efdf19a 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -221,6 +221,7 @@ int debug_lockdep_rcu_enabled(void);
> > >  int rcu_read_lock_held(void);
> > >  int rcu_read_lock_bh_held(void);
> > >  int rcu_read_lock_sched_held(void);
> > > +int rcu_read_lock_any_held(void);
> > >  
> > >  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> > >  
> > > @@ -241,6 +242,12 @@ static inline int rcu_read_lock_sched_held(void)
> > >  {
> > >  	return !preemptible();
> > >  }
> > > +
> > > +static inline int rcu_read_lock_any_held(void)
> > > +{
> > > +	return !preemptible();
> > > +}
> > > +
> > >  #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> > >  
> > >  #ifdef CONFIG_PROVE_RCU
> > > diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> > > index 5ec3ea4028e2..7fbd21dbfcd0 100644
> > > --- a/kernel/rcu/Kconfig.debug
> > > +++ b/kernel/rcu/Kconfig.debug
> > > @@ -8,6 +8,17 @@ menu "RCU Debugging"
> > >  config PROVE_RCU
> > >  	def_bool PROVE_LOCKING
> > >  
> > > +config PROVE_RCU_LIST
> > > +	bool "RCU list lockdep debugging"
> > > +	depends on PROVE_RCU
> > 
> > This must also depend on RCU_EXPERT.  
> 
> Sure.
> 
> > > +	default n
> > > +	help
> > > +	  Enable RCU lockdep checking for list usages. By default it is
> > > +	  turned off since there are several list RCU users that still
> > > +	  need to be converted to pass a lockdep expression. To prevent
> > > +	  false-positive splats, we keep it default disabled but once all
> > > +	  users are converted, we can remove this config option.
> > > +
> > >  config TORTURE_TEST
> > >  	tristate
> > >  	default n
> > > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > > index 9dd5aeef6e70..b7a4e3b5fa98 100644
> > > --- a/kernel/rcu/update.c
> > > +++ b/kernel/rcu/update.c
> > > @@ -91,14 +91,18 @@ module_param(rcu_normal_after_boot, int, 0);
> > >   * Similarly, we avoid claiming an SRCU read lock held if the current
> > >   * CPU is offline.
> > >   */
> > > +#define rcu_read_lock_held_common()		\
> > > +	if (!debug_lockdep_rcu_enabled())	\
> > > +		return 1;			\
> > > +	if (!rcu_is_watching())			\
> > > +		return 0;			\
> > > +	if (!rcu_lockdep_current_cpu_online())	\
> > > +		return 0;
> > 
> > Nice abstraction of common code!
> 
> Thanks!
> 
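
The rcu_read_lock_held_common() macro quoted above hides its return statements inside the macro body. As a hedged illustration only, the same early-exit checks can also be written as an ordinary function with an out-parameter. Everything below is a user-space sketch: the demo_* names are hypothetical, the three flags merely model debug_lockdep_rcu_enabled(), rcu_is_watching() and rcu_lockdep_current_cpu_online(), and this is not claimed to be the form the series uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the three kernel predicates. */
static bool demo_lockdep_enabled = true;
static bool demo_rcu_watching = true;
static bool demo_cpu_online = true;

/* Returns true when the caller should return *ret immediately,
 * false when the caller must go on to its flavor-specific check. */
static bool demo_held_common(int *ret)
{
	if (!demo_lockdep_enabled) {
		*ret = 1;	/* checking disabled: claim the lock is held */
		return true;
	}
	if (!demo_rcu_watching || !demo_cpu_online) {
		*ret = 0;	/* idle or offline CPU: never claim it is held */
		return true;
	}
	return false;
}

/* A caller in the style of rcu_read_lock_held(); reader_depth models
 * the flavor-specific lockdep query. */
static int demo_read_lock_held(int reader_depth)
{
	int ret;

	if (demo_held_common(&ret))
		return ret;
	return reader_depth > 0;
}
```

The function form keeps control flow visible at the call site, at the cost of one extra line per caller compared with the macro.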



* Re: [PATCH 2/2] net: apply proc_net_mkdir() harder
From: Pablo Neira Ayuso @ 2019-07-16 18:52 UTC (permalink / raw)
  To: Alexey Dobriyan
  Cc: davem, netdev, netfilter-devel, linux-nfs, j.vosburgh, vfalico,
	andy, kadlec, fw, bfields, chuck.lever
In-Reply-To: <20190706165521.GB10550@avx2>

On Sat, Jul 06, 2019 at 07:55:21PM +0300, Alexey Dobriyan wrote:
> From: "Hallsmark, Per" <Per.Hallsmark@windriver.com>
> 
> proc_net_mkdir() should be used to create stuff under /proc/net,
> so that dentry revalidation kicks in.
> 
> See
> 
> 	commit 1fde6f21d90f8ba5da3cb9c54ca991ed72696c43
> 	proc: fix /proc/net/* after setns(2)
> 
> 	[added more chunks --adobriyan]

I don't find this in the tree, if you split the netfilter part in an
independent patch, I could take it into the netfilter tree.

Or just keep it like this and ask David to take it.


* Re: [PATCH v2 2/9] rcu: Add support for consolidated-RCU reader checking
From: Paul E. McKenney @ 2019-07-16 18:50 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190716183517.GA129705@google.com>

On Tue, Jul 16, 2019 at 02:35:17PM -0400, Joel Fernandes wrote:
> On Tue, Jul 16, 2019 at 11:22:37AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 12, 2019 at 01:00:17PM -0400, Joel Fernandes (Google) wrote:
> > > This patch adds support for checking RCU reader sections in list
> > > traversal macros. Optionally, if the list macro is called under SRCU or
> > > other lock/mutex protection, then appropriate lockdep expressions can be
> > > passed to make the checks pass.
> > > 
> > > Existing list_for_each_entry_rcu() invocations don't need to pass the
> > > optional fourth argument (cond) unless they are under some non-RCU
> > > protection and need to make the lockdep check pass.
> > > 
> > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > 
> > If you fold in the checks for extra parameters, I will take this
> > one and also 1/9.
> 
> I folded the checks in and also threw in the rcu-sync with Oleg's ack:
> 
> Could you pull into /dev branch?
> 
> git pull https://github.com/joelagnel/linux-kernel.git list-first-three
> (Based on your dev branch)

Given that I am going to have to rebase these a few times, please
email a v4.

							Thanx, Paul


* Re: [PATCH 0/9] Harden list_for_each_entry_rcu() and family
From: Paul E. McKenney @ 2019-07-16 18:46 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190715143705.117908-1-joel@joelfernandes.org>

On Mon, Jul 15, 2019 at 10:36:56AM -0400, Joel Fernandes (Google) wrote:
> Hi,
> This series aims to provide lockdep checking to RCU list macros for additional
> kernel hardening.
> 
> RCU has a number of primitives for "consumption" of an RCU protected pointer.
> Most of the time, these consumers make sure that such accesses are under an RCU
> reader-section (such as rcu_dereference{,sched,bh} or under a lock, such as
> with rcu_dereference_protected()).
> 
> However, there are other ways to consume RCU pointers, such as by
> list_for_each_entry_rcu or hlist_for_each_entry_rcu. Unlike the rcu_dereference
> family, these consumers do no lockdep checking at all. And with the growing
> number of RCU list uses (1000+), it is possible for bugs to creep in and go
> unnoticed which lockdep checks can catch.
> 
> Since RCU consolidation efforts last year, the different traditional RCU
> flavors (preempt, bh, sched) are all consolidated. In other words, any of these
> flavors can cause a reader section to occur and all of them must cease before
> the reader section is considered to be unlocked. Thanks to this, we can
> generically check if we are in an RCU reader. This is what patch 1 does. Note
> that the list_for_each_entry_rcu and family are different from the
> rcu_dereference family in that, there is no _bh or _sched version of this
> macro. They are used under many different RCU reader flavors, and also SRCU.
> Patch 1 adds a new internal function rcu_read_lock_any_held() which checks
> if any reader section is active at all, when these macros are called. If no
> reader section exists, then the optional fourth argument to
> list_for_each_entry_rcu() can be a lockdep expression which is evaluated
> (similar to how rcu_dereference_check() works). If no lockdep expression is
> passed, and we are not in a reader, then a splat occurs. Just take off the
> lockdep expression after applying the patches, by using the following diff and
> see what happens:
> 
> +++ b/arch/x86/pci/mmconfig-shared.c
> @@ -55,7 +55,7 @@ static void list_add_sorted(struct pci_mmcfg_region *new)
>         struct pci_mmcfg_region *cfg;
> 
>         /* keep list sorted by segment and starting bus number */
> -       list_for_each_entry_rcu(cfg, &pci_mmcfg_list, list, pci_mmcfg_lock_held()) {
> +       list_for_each_entry_rcu(cfg, &pci_mmcfg_list, list) {
> 
> 
> The optional argument trick to list_for_each_entry_rcu() can also be used in
> the future to possibly remove rcu_dereference_{,bh,sched}_protected() API and
> we can pass an optional lockdep expression to rcu_dereference() itself. Thus
> eliminating 3 more RCU APIs.
> 
> Note that some list macro wrappers already do their own lockdep checking in the
> caller side. These can be eliminated in favor of the built-in lockdep checking
> in the list macro that this series adds. For example, workqueue code has an
> assert_rcu_or_wq_mutex() function which is called in for_each_wq().  This
> series replaces that in favor of the built-in check.
> 
> Also in the future, we can extend these checks to list_entry_rcu() and other
> list macros as well, if needed.
> 
> Please note that I have kept this option default-disabled under a new config:
> CONFIG_PROVE_RCU_LIST. This is so that until all users are converted to pass
> the optional argument, we should keep the check disabled. There are about a
> 1000 or so users and it is not possible to pass in the optional lockdep
> expression in a single series since it is done on a case-by-case basis. I did
> convert a few users in this series itself.

I do like the optional argument as opposed to the traditional practice
of expanding the RCU API!  Good stuff!!!

Please resend incorporating the acks and the changes from feedback.
I will hold off on any patches not yet having their maintainer's ack,
but it is OK to include them in v4.  (I will just avoid applying them.)

The documentation patch needs a bit of wordsmithing, but I can do that.
Feel free to take another pass on it if you wish, though.

							Thanx, Paul

> v2->v3: Simplified rcu-sync logic after rebase (Paul)
> 	Added check for bh_map (Paul)
> 	Refactored out more of the common code (Joel)
> 	Added Oleg ack to rcu-sync patch.
> 
> v1->v2: Have assert_rcu_or_wq_mutex deleted (Daniel Jordan)
> 	Simplify rcu_read_lock_any_held()   (Peter Zijlstra)
> 	Simplified rcu-sync logic	    (Oleg Nesterov)
> 	Updated documentation and rculist comments.
> 	Added GregKH ack.
> 
> RFC->v1: 
> 	Simplify list checking macro (Rasmus Villemoes)
> 
> Joel Fernandes (Google) (9):
> rcu/update: Remove useless check for debug_locks (v1)
> rcu: Add support for consolidated-RCU reader checking (v3)
> rcu/sync: Remove custom check for reader-section (v2)
> ipv4: add lockdep condition to fix for_each_entry (v1)
> driver/core: Convert to use built-in RCU list checking (v1)
> workqueue: Convert for_each_wq to use built-in list check (v2)
> x86/pci: Pass lockdep condition to pcm_mmcfg_list iterator (v1)
> acpi: Use built-in RCU list checking for acpi_ioremaps list (v1)
> doc: Update documentation about list_for_each_entry_rcu (v1)
> 
> Documentation/RCU/lockdep.txt   | 15 ++++++++---
> Documentation/RCU/whatisRCU.txt |  9 ++++++-
> arch/x86/pci/mmconfig-shared.c  |  5 ++--
> drivers/acpi/osl.c              |  6 +++--
> drivers/base/base.h             |  1 +
> drivers/base/core.c             | 10 +++++++
> drivers/base/power/runtime.c    | 15 +++++++----
> include/linux/rcu_sync.h        |  4 +--
> include/linux/rculist.h         | 28 +++++++++++++++----
> include/linux/rcupdate.h        |  7 +++++
> kernel/rcu/Kconfig.debug        | 11 ++++++++
> kernel/rcu/update.c             | 48 ++++++++++++++++++---------------
> kernel/workqueue.c              | 10 ++-----
> net/ipv4/fib_frontend.c         |  3 ++-
> 14 files changed, 119 insertions(+), 53 deletions(-)
> 
> --
> 2.22.0.510.g264f2c817a-goog
> 


* Re: [PATCH 2/9] rcu: Add support for consolidated-RCU reader checking (v3)
From: Joel Fernandes @ 2019-07-16 18:46 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190716183833.GD14271@linux.ibm.com>

On Tue, Jul 16, 2019 at 11:38:33AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 15, 2019 at 10:36:58AM -0400, Joel Fernandes (Google) wrote:
> > This patch adds support for checking RCU reader sections in list
> > traversal macros. Optionally, if the list macro is called under SRCU or
> > other lock/mutex protection, then appropriate lockdep expressions can be
> > passed to make the checks pass.
> > 
> > Existing list_for_each_entry_rcu() invocations don't need to pass the
> > optional fourth argument (cond) unless they are under some non-RCU
> > protection and need to make the lockdep check pass.
> > 
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> 
> Now that I am on the correct version, again please fold in the checks
> for the extra argument.  The ability to have an optional argument looks
> quite helpful, especially when compared to growing the RCU API!

I did fold this and replied with a pull request URL based on /dev branch. But
we can hold off on the pull requests until we decide on the below comments:

> A few more things below.
> > ---
> >  include/linux/rculist.h  | 28 ++++++++++++++++++++-----
> >  include/linux/rcupdate.h |  7 +++++++
> >  kernel/rcu/Kconfig.debug | 11 ++++++++++
> >  kernel/rcu/update.c      | 44 ++++++++++++++++++++++++----------------
> >  4 files changed, 67 insertions(+), 23 deletions(-)
> > 
> > diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> > index e91ec9ddcd30..1048160625bb 100644
> > --- a/include/linux/rculist.h
> > +++ b/include/linux/rculist.h
> > @@ -40,6 +40,20 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
> >   */
> >  #define list_next_rcu(list)	(*((struct list_head __rcu **)(&(list)->next)))
> >  
> > +/*
> > + * Check during list traversal that we are within an RCU reader
> > + */
> > +
> > +#ifdef CONFIG_PROVE_RCU_LIST
> 
> This new Kconfig option is OK temporarily, but unless there is reason to
> fear malfunction that a few weeks of rcutorture, 0day, and -next won't
> find, it would be better to just use CONFIG_PROVE_RCU.  The overall goal
> is to reduce the number of RCU knobs rather than grow them, much though
> history might lead one to believe otherwise.  :-/

If you want, we can try to drop this option and just use PROVE_RCU however I
must say there may be several warnings that need to be fixed in a short
period of time (even a few weeks may be too short) considering the 1000+
uses of RCU lists.

But I don't mind dropping it and it may just accelerate the fixing up of all
callers.

> > +#define __list_check_rcu(dummy, cond, ...)				\
> > +	({								\
> > +	RCU_LOCKDEP_WARN(!cond && !rcu_read_lock_any_held(),		\
> > +			 "RCU-list traversed in non-reader section!");	\
> > +	 })
> > +#else
> > +#define __list_check_rcu(dummy, cond, ...) ({})
> > +#endif
> > +
> >  /*
> >   * Insert a new entry between two known consecutive entries.
> >   *
> > @@ -343,14 +357,16 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
> >   * @pos:	the type * to use as a loop cursor.
> >   * @head:	the head for your list.
> >   * @member:	the name of the list_head within the struct.
> > + * @cond:	optional lockdep expression if called from non-RCU protection.
> >   *
> >   * This list-traversal primitive may safely run concurrently with
> >   * the _rcu list-mutation primitives such as list_add_rcu()
> >   * as long as the traversal is guarded by rcu_read_lock().
> >   */
> > -#define list_for_each_entry_rcu(pos, head, member) \
> > -	for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \
> > -		&pos->member != (head); \
> > +#define list_for_each_entry_rcu(pos, head, member, cond...)		\
> > +	for (__list_check_rcu(dummy, ## cond, 0),			\
> > +	     pos = list_entry_rcu((head)->next, typeof(*pos), member);	\
> > +		&pos->member != (head);					\
> >  		pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
> >  
> >  /**
> > @@ -616,13 +632,15 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
> >   * @pos:	the type * to use as a loop cursor.
> >   * @head:	the head for your list.
> >   * @member:	the name of the hlist_node within the struct.
> > + * @cond:	optional lockdep expression if called from non-RCU protection.
> >   *
> >   * This list-traversal primitive may safely run concurrently with
> >   * the _rcu list-mutation primitives such as hlist_add_head_rcu()
> >   * as long as the traversal is guarded by rcu_read_lock().
> >   */
> > -#define hlist_for_each_entry_rcu(pos, head, member)			\
> > -	for (pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\
> > +#define hlist_for_each_entry_rcu(pos, head, member, cond...)		\
> > +	for (__list_check_rcu(dummy, ## cond, 0),			\
> > +	     pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\
> >  			typeof(*(pos)), member);			\
> >  		pos;							\
> >  		pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 8f7167478c1d..f3c29efdf19a 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -221,6 +221,7 @@ int debug_lockdep_rcu_enabled(void);
> >  int rcu_read_lock_held(void);
> >  int rcu_read_lock_bh_held(void);
> >  int rcu_read_lock_sched_held(void);
> > +int rcu_read_lock_any_held(void);
> >  
> >  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> >  
> > @@ -241,6 +242,12 @@ static inline int rcu_read_lock_sched_held(void)
> >  {
> >  	return !preemptible();
> >  }
> > +
> > +static inline int rcu_read_lock_any_held(void)
> > +{
> > +	return !preemptible();
> > +}
> > +
> >  #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> >  
> >  #ifdef CONFIG_PROVE_RCU
> > diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> > index 5ec3ea4028e2..7fbd21dbfcd0 100644
> > --- a/kernel/rcu/Kconfig.debug
> > +++ b/kernel/rcu/Kconfig.debug
> > @@ -8,6 +8,17 @@ menu "RCU Debugging"
> >  config PROVE_RCU
> >  	def_bool PROVE_LOCKING
> >  
> > +config PROVE_RCU_LIST
> > +	bool "RCU list lockdep debugging"
> > +	depends on PROVE_RCU
> 
> This must also depend on RCU_EXPERT.  

Sure.

> > +	default n
> > +	help
> > +	  Enable RCU lockdep checking for list usages. By default it is
> > +	  turned off since there are several list RCU users that still
> > +	  need to be converted to pass a lockdep expression. To prevent
> > +	  false-positive splats, we keep it default disabled but once all
> > +	  users are converted, we can remove this config option.
> > +
> >  config TORTURE_TEST
> >  	tristate
> >  	default n
> > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > index 9dd5aeef6e70..b7a4e3b5fa98 100644
> > --- a/kernel/rcu/update.c
> > +++ b/kernel/rcu/update.c
> > @@ -91,14 +91,18 @@ module_param(rcu_normal_after_boot, int, 0);
> >   * Similarly, we avoid claiming an SRCU read lock held if the current
> >   * CPU is offline.
> >   */
> > +#define rcu_read_lock_held_common()		\
> > +	if (!debug_lockdep_rcu_enabled())	\
> > +		return 1;			\
> > +	if (!rcu_is_watching())			\
> > +		return 0;			\
> > +	if (!rcu_lockdep_current_cpu_online())	\
> > +		return 0;
> 
> Nice abstraction of common code!

Thanks!
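
For anyone puzzling over how the optional fourth argument works, here is a minimal user-space sketch of the "dummy, ## cond, 0" trick from the quoted patch. It assumes a GNU-compatible preprocessor (gcc or clang), since named variadic parameters and comma deletion via ## are extensions. The demo_* names are hypothetical, demo_in_reader only models rcu_read_lock_any_held(), and a counter stands in for the lockdep splat. Unlike the quoted macro, the sketch parenthesizes cond so that a compound expression is negated as a whole.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_in_reader;	/* models rcu_read_lock_any_held() */
static int demo_warnings;	/* counts would-be lockdep splats */

/* The first parameter soaks up the 'dummy' token, the second is the
 * condition, and anything after it (the trailing 0 when the caller
 * supplied a condition) is discarded. */
#define demo_list_check(dummy, cond, ...)			\
	do {							\
		if (!(cond) && !demo_in_reader)			\
			demo_warnings++;			\
	} while (0)

/* 'cond...' is a GNU named variadic parameter. ', ## cond' deletes
 * the preceding comma when cond is empty, so the literal 0 slides
 * into the condition slot. */
#define demo_traverse(head, cond...)				\
	demo_list_check(dummy, ## cond, 0)
```

With no condition, demo_traverse(h) expands to demo_list_check(dummy, 0), so only the reader check can suppress the warning. With a condition, demo_traverse(h, held) expands to demo_list_check(dummy, held, 0), and the trailing 0 lands in the ignored variadic tail.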



* Re: [PATCH 8/9] acpi: Use built-in RCU list checking for acpi_ioremaps list (v1)
From: Paul E. McKenney @ 2019-07-16 18:43 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190715143705.117908-9-joel@joelfernandes.org>

On Mon, Jul 15, 2019 at 10:37:04AM -0400, Joel Fernandes (Google) wrote:
> list_for_each_entry_rcu has built-in RCU and lock checking. Make use of
> it for acpi_ioremaps list traversal.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Given that Rafael acked it, this one looks ready.

							Thanx, Paul

> ---
>  drivers/acpi/osl.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 9c0edf2fc0dd..2f9d0d20b836 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -14,6 +14,7 @@
>  #include <linux/slab.h>
>  #include <linux/mm.h>
>  #include <linux/highmem.h>
> +#include <linux/lockdep.h>
>  #include <linux/pci.h>
>  #include <linux/interrupt.h>
>  #include <linux/kmod.h>
> @@ -80,6 +81,7 @@ struct acpi_ioremap {
>  
>  static LIST_HEAD(acpi_ioremaps);
>  static DEFINE_MUTEX(acpi_ioremap_lock);
> +#define acpi_ioremap_lock_held() lock_is_held(&acpi_ioremap_lock.dep_map)
>  
>  static void __init acpi_request_region (struct acpi_generic_address *gas,
>  	unsigned int length, char *desc)
> @@ -206,7 +208,7 @@ acpi_map_lookup(acpi_physical_address phys, acpi_size size)
>  {
>  	struct acpi_ioremap *map;
>  
> -	list_for_each_entry_rcu(map, &acpi_ioremaps, list)
> +	list_for_each_entry_rcu(map, &acpi_ioremaps, list, acpi_ioremap_lock_held())
>  		if (map->phys <= phys &&
>  		    phys + size <= map->phys + map->size)
>  			return map;
> @@ -249,7 +251,7 @@ acpi_map_lookup_virt(void __iomem *virt, acpi_size size)
>  {
>  	struct acpi_ioremap *map;
>  
> -	list_for_each_entry_rcu(map, &acpi_ioremaps, list)
> +	list_for_each_entry_rcu(map, &acpi_ioremaps, list, acpi_ioremap_lock_held())
>  		if (map->virt <= virt &&
>  		    virt + size <= map->virt + map->size)
>  			return map;
> -- 
> 2.22.0.510.g264f2c817a-goog
> 


* Re: [PATCH 6/9] workqueue: Convert for_each_wq to use built-in list check (v2)
From: Paul E. McKenney @ 2019-07-16 18:41 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190715143705.117908-7-joel@joelfernandes.org>

On Mon, Jul 15, 2019 at 10:37:02AM -0400, Joel Fernandes (Google) wrote:
> list_for_each_entry_rcu now has support to check for RCU reader sections
> as well as locks. Just use that support instead of explicitly
> checking in the caller.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

We need an ack from one of the subsystem maintainers on this one.

							Thanx, Paul
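
To make the structural change in this conversion concrete, here is a
minimal userspace sketch, assuming GNU C statement expressions. The old
for_each_pwq() shape places the assertion in the loop body, so it runs
for every element; a check placed in the for-init clause runs once per
traversal. All demo_* names are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
	int val;
	struct demo_node *next;
};

/* Counts how many times the stand-in check has run. */
static int demo_checks;

/* Stand-in for the lockdep assertion; aborts if the condition is false. */
static void demo_assert_held(int held)
{
	demo_checks++;
	assert(held);
}

/* Old shape: the check is part of the loop body, so it runs per element. */
#define demo_for_each_old(pos, head, held)				\
	for (pos = (head); pos; pos = pos->next)			\
		if (({ demo_assert_held(held); 0; })) { }		\
		else

/* New shape: the check sits in the for-init clause and runs once. */
#define demo_for_each_new(pos, head, held)				\
	for (demo_assert_held(held), pos = (head); pos; pos = pos->next)

/* Walks a fixed three-element list and reports how often the check ran. */
int demo_count_checks(int use_new)
{
	struct demo_node c = { 3, NULL }, b = { 2, &c }, a = { 1, &b };
	struct demo_node *pos;
	int sum = 0;

	demo_checks = 0;
	if (use_new) {
		demo_for_each_new(pos, &a, 1)
			sum += pos->val;
	} else {
		demo_for_each_old(pos, &a, 1)
			sum += pos->val;
	}
	assert(sum == 6);
	return demo_checks;
}
```

Calling demo_count_checks(0) exercises the old shape and
demo_count_checks(1) the new one; both visit the same three elements.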

> ---
>  kernel/workqueue.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 601d61150b65..e882477ebf6e 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -364,11 +364,6 @@ static void workqueue_sysfs_unregister(struct workqueue_struct *wq);
>  			 !lockdep_is_held(&wq_pool_mutex),		\
>  			 "RCU or wq_pool_mutex should be held")
>  
> -#define assert_rcu_or_wq_mutex(wq)					\
> -	RCU_LOCKDEP_WARN(!rcu_read_lock_held() &&			\
> -			 !lockdep_is_held(&wq->mutex),			\
> -			 "RCU or wq->mutex should be held")
> -
>  #define assert_rcu_or_wq_mutex_or_pool_mutex(wq)			\
>  	RCU_LOCKDEP_WARN(!rcu_read_lock_held() &&			\
>  			 !lockdep_is_held(&wq->mutex) &&		\
> @@ -425,9 +420,8 @@ static void workqueue_sysfs_unregister(struct workqueue_struct *wq);
>   * ignored.
>   */
>  #define for_each_pwq(pwq, wq)						\
> -	list_for_each_entry_rcu((pwq), &(wq)->pwqs, pwqs_node)		\
> -		if (({ assert_rcu_or_wq_mutex(wq); false; })) { }	\
> -		else
> +	list_for_each_entry_rcu((pwq), &(wq)->pwqs, pwqs_node,		\
> +				 lock_is_held(&(wq->mutex).dep_map))
>  
>  #ifdef CONFIG_DEBUG_OBJECTS_WORK
>  
> -- 
> 2.22.0.510.g264f2c817a-goog
> 


* Re: [PATCH 7/9] x86/pci: Pass lockdep condition to pcm_mmcfg_list iterator (v1)
From: Paul E. McKenney @ 2019-07-16 18:42 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Bjorn Helgaas, linux-kernel, Alexey Kuznetsov, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190716040303.GA73383@google.com>

On Tue, Jul 16, 2019 at 12:03:03AM -0400, Joel Fernandes wrote:
> On Mon, Jul 15, 2019 at 03:02:35PM -0500, Bjorn Helgaas wrote:
> > On Mon, Jul 15, 2019 at 10:37:03AM -0400, Joel Fernandes (Google) wrote:
> > > The pci_mmcfg_list is traversed with list_for_each_entry_rcu without a
> > > reader-lock held, because the pci_mmcfg_lock is already held. Make this
> > > known to the list macro so that it fixes new lockdep warnings that
> > > trigger due to lockdep checks added to list_for_each_entry_rcu().
> > > 
> > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > 
> > Ingo takes care of most patches to this file, but FWIW,
> > 
> > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> 
> Thanks.
> 
> > I would personally prefer if you capitalized the subject to match the
> > "x86/PCI:" convention that's used fairly consistently in
> > arch/x86/pci/.
> > 
> > Also, I didn't apply this to be sure, but it looks like this might
> > make a line or two wider than 80 columns, which I would rewrap if I
> > were applying this.
> 
> Updated below is the patch with the nits corrected:

I am OK with this going either way, but it does depend on an earlier
patch.

							Thanx, Paul
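
For readers new to the pattern, the sketch below models in userspace
what passing a *_lock_held() expression to the iterator achieves: the
traversal complains only when neither the lock condition nor a reader
section protects it. The demo_* names and the plain boolean state are
assumptions made for illustration; the kernel derives this state from
lockdep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for a mutex and an RCU reader section. */
struct demo_mutex {
	bool held;
};

static struct demo_mutex demo_cfg_lock;
static bool demo_in_reader;

/* Same shape as the *_lock_held() helper macro added by the patch. */
#define demo_cfg_lock_held()	(demo_cfg_lock.held)

/* Mirrors the check: complain only if neither protection is in place. */
bool demo_traversal_warns(bool cond)
{
	return !cond && !demo_in_reader;
}

/* Evaluates the check for a given combination of lock and reader state. */
bool demo_warns_with(bool lock_held, bool in_reader)
{
	demo_cfg_lock.held = lock_held;
	demo_in_reader = in_reader;
	return demo_traversal_warns(demo_cfg_lock_held());
}

/* Self-check: only the completely unprotected combination warns. */
void demo_selfcheck(void)
{
	assert(demo_warns_with(false, false));
	assert(!demo_warns_with(true, false));
	assert(!demo_warns_with(false, true));
	assert(!demo_warns_with(true, true));
}
```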

> ---8<-----------------------
> 
> From 73fab09d7e33ca2110c24215f8ed428c12625dbe Mon Sep 17 00:00:00 2001
> From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> Date: Sat, 1 Jun 2019 15:05:49 -0400
> Subject: [PATCH] x86/PCI: Pass lockdep condition to pci_mmcfg_list iterator
>  (v1)
> 
> The pci_mmcfg_list is traversed with list_for_each_entry_rcu without a
> reader-lock held, because the pci_mmcfg_lock is already held. Make this
> known to the list macro so that it fixes new lockdep warnings that
> trigger due to lockdep checks added to list_for_each_entry_rcu().
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  arch/x86/pci/mmconfig-shared.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
> index 7389db538c30..9e3250ec5a37 100644
> --- a/arch/x86/pci/mmconfig-shared.c
> +++ b/arch/x86/pci/mmconfig-shared.c
> @@ -29,6 +29,7 @@
>  static bool pci_mmcfg_running_state;
>  static bool pci_mmcfg_arch_init_failed;
>  static DEFINE_MUTEX(pci_mmcfg_lock);
> +#define pci_mmcfg_lock_held() lock_is_held(&(pci_mmcfg_lock).dep_map)
>  
>  LIST_HEAD(pci_mmcfg_list);
>  
> @@ -54,7 +55,8 @@ static void list_add_sorted(struct pci_mmcfg_region *new)
>  	struct pci_mmcfg_region *cfg;
>  
>  	/* keep list sorted by segment and starting bus number */
> -	list_for_each_entry_rcu(cfg, &pci_mmcfg_list, list) {
> +	list_for_each_entry_rcu(cfg, &pci_mmcfg_list, list,
> +				pci_mmcfg_lock_held()) {
>  		if (cfg->segment > new->segment ||
>  		    (cfg->segment == new->segment &&
>  		     cfg->start_bus >= new->start_bus)) {
> @@ -118,7 +120,8 @@ struct pci_mmcfg_region *pci_mmconfig_lookup(int segment, int bus)
>  {
>  	struct pci_mmcfg_region *cfg;
>  
> -	list_for_each_entry_rcu(cfg, &pci_mmcfg_list, list)
> +	list_for_each_entry_rcu(cfg, &pci_mmcfg_list, list,
> +				pci_mmcfg_lock_held())
>  		if (cfg->segment == segment &&
>  		    cfg->start_bus <= bus && bus <= cfg->end_bus)
>  			return cfg;
> -- 
> 2.22.0.510.g264f2c817a-goog
> 



* Re: [PATCH 5/9] driver/core: Convert to use built-in RCU list checking (v1)
From: Paul E. McKenney @ 2019-07-16 18:40 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Greg Kroah-Hartman, Alexey Kuznetsov, Bjorn Helgaas,
	Borislav Petkov, c0d1n61at3, David S. Miller, edumazet,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190715143705.117908-6-joel@joelfernandes.org>

On Mon, Jul 15, 2019 at 10:37:01AM -0400, Joel Fernandes (Google) wrote:
> list_for_each_entry_rcu has built-in RCU and lock checking. Make use of
> it in driver core.
> 
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

This one looks ready.

							Thanx, Paul

> ---
>  drivers/base/base.h          |  1 +
>  drivers/base/core.c          | 10 ++++++++++
>  drivers/base/power/runtime.c | 15 ++++++++++-----
>  3 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index b405436ee28e..0d32544b6f91 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -165,6 +165,7 @@ static inline int devtmpfs_init(void) { return 0; }
>  /* Device links support */
>  extern int device_links_read_lock(void);
>  extern void device_links_read_unlock(int idx);
> +extern int device_links_read_lock_held(void);
>  extern int device_links_check_suppliers(struct device *dev);
>  extern void device_links_driver_bound(struct device *dev);
>  extern void device_links_driver_cleanup(struct device *dev);
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index da84a73f2ba6..85e82f38717f 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -68,6 +68,11 @@ void device_links_read_unlock(int idx)
>  {
>  	srcu_read_unlock(&device_links_srcu, idx);
>  }
> +
> +int device_links_read_lock_held(void)
> +{
> +	return srcu_read_lock_held(&device_links_srcu);
> +}
>  #else /* !CONFIG_SRCU */
>  static DECLARE_RWSEM(device_links_lock);
>  
> @@ -91,6 +96,11 @@ void device_links_read_unlock(int not_used)
>  {
>  	up_read(&device_links_lock);
>  }
> +
> +int device_links_read_lock_held(void)
> +{
> +	return lockdep_is_held(&device_links_lock);
> +}
>  #endif /* !CONFIG_SRCU */
>  
>  /**
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 952a1e7057c7..7a10e8379a70 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -287,7 +287,8 @@ static int rpm_get_suppliers(struct device *dev)
>  {
>  	struct device_link *link;
>  
> -	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
> +	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> +				device_links_read_lock_held()) {
>  		int retval;
>  
>  		if (!(link->flags & DL_FLAG_PM_RUNTIME) ||
> @@ -309,7 +310,8 @@ static void rpm_put_suppliers(struct device *dev)
>  {
>  	struct device_link *link;
>  
> -	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
> +	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> +				device_links_read_lock_held()) {
>  		if (READ_ONCE(link->status) == DL_STATE_SUPPLIER_UNBIND)
>  			continue;
>  
> @@ -1640,7 +1642,8 @@ void pm_runtime_clean_up_links(struct device *dev)
>  
>  	idx = device_links_read_lock();
>  
> -	list_for_each_entry_rcu(link, &dev->links.consumers, s_node) {
> +	list_for_each_entry_rcu(link, &dev->links.consumers, s_node,
> +				device_links_read_lock_held()) {
>  		if (link->flags & DL_FLAG_STATELESS)
>  			continue;
>  
> @@ -1662,7 +1665,8 @@ void pm_runtime_get_suppliers(struct device *dev)
>  
>  	idx = device_links_read_lock();
>  
> -	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
> +	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> +				device_links_read_lock_held())
>  		if (link->flags & DL_FLAG_PM_RUNTIME) {
>  			link->supplier_preactivated = true;
>  			refcount_inc(&link->rpm_active);
> @@ -1683,7 +1687,8 @@ void pm_runtime_put_suppliers(struct device *dev)
>  
>  	idx = device_links_read_lock();
>  
> -	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
> +	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> +				device_links_read_lock_held())
>  		if (link->supplier_preactivated) {
>  			link->supplier_preactivated = false;
>  			if (refcount_dec_not_one(&link->rpm_active))
> -- 
> 2.22.0.510.g264f2c817a-goog
> 


* Re: [PATCH 4/9] ipv4: add lockdep condition to fix for_each_entry (v1)
From: Paul E. McKenney @ 2019-07-16 18:39 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190715143705.117908-5-joel@joelfernandes.org>

On Mon, Jul 15, 2019 at 10:37:00AM -0400, Joel Fernandes (Google) wrote:
> Use the support added earlier in this series to pass a lockdep condition
> to the hlist traversal here.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

We need an ack or better from the subsystem maintainer for this one.

						Thanx, Paul

> ---
>  net/ipv4/fib_frontend.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> index 317339cd7f03..26b0fb24e2c2 100644
> --- a/net/ipv4/fib_frontend.c
> +++ b/net/ipv4/fib_frontend.c
> @@ -124,7 +124,8 @@ struct fib_table *fib_get_table(struct net *net, u32 id)
>  	h = id & (FIB_TABLE_HASHSZ - 1);
>  
>  	head = &net->ipv4.fib_table_hash[h];
> -	hlist_for_each_entry_rcu(tb, head, tb_hlist) {
> +	hlist_for_each_entry_rcu(tb, head, tb_hlist,
> +				 lockdep_rtnl_is_held()) {
>  		if (tb->tb_id == id)
>  			return tb;
>  	}
> -- 
> 2.22.0.510.g264f2c817a-goog
> 


* Re: [PATCH 2/9] rcu: Add support for consolidated-RCU reader checking (v3)
From: Paul E. McKenney @ 2019-07-16 18:38 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190715143705.117908-3-joel@joelfernandes.org>

On Mon, Jul 15, 2019 at 10:36:58AM -0400, Joel Fernandes (Google) wrote:
> This patch adds support for checking RCU reader sections in list
> traversal macros. Optionally, if the list macro is called under SRCU or
> other lock/mutex protection, then appropriate lockdep expressions can be
> passed to make the checks pass.
> 
> Existing list_for_each_entry_rcu() invocations don't need to pass the
> optional fourth argument (cond) unless they are under some non-RCU
> protection and need to make the lockdep check pass.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Now that I am on the correct version, again please fold in the checks
for the extra argument.  The ability to have an optional argument looks
quite helpful, especially when compared to growing the RCU API!
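
As an illustration of the optional-argument mechanism on its own, here
is a minimal userspace sketch, assuming GNU C (a named variadic macro
parameter plus ", ##" comma deletion, the same constructs the patch
relies on). The names demo_check, demo_traverse and demo_last_cond are
hypothetical and not part of the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the lockdep check: remembers the condition. */
static int demo_last_cond = -1;

/*
 * The second parameter is always the condition. When the caller passes
 * one, the trailing 0 lands in the variadic part and is ignored; when the
 * caller passes none, the trailing 0 itself becomes the condition.
 */
#define demo_check(dummy, cond, ...)	(demo_last_cond = !!(cond))

/* "name" plays the role of the mandatory pos/head/member arguments. */
#define demo_traverse(name, cond...)	demo_check(dummy, ## cond, 0)

/* Returns the condition the check saw when no argument was supplied. */
int demo_without_cond(void)
{
	demo_traverse(list);
	return demo_last_cond;
}

/* Returns the condition the check saw when the caller supplied one. */
int demo_with_cond(int held)
{
	demo_traverse(list, held);
	return demo_last_cond;
}

/* Self-check covering both call forms. */
void demo_selfcheck(void)
{
	assert(demo_without_cond() == 0);
	assert(demo_with_cond(1) == 1);
	assert(demo_with_cond(0) == 0);
}
```

With no condition supplied the check falls back to 0, so in the real
macro only the RCU reader test decides whether a warning is issued.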

A few more things below.

> ---
>  include/linux/rculist.h  | 28 ++++++++++++++++++++-----
>  include/linux/rcupdate.h |  7 +++++++
>  kernel/rcu/Kconfig.debug | 11 ++++++++++
>  kernel/rcu/update.c      | 44 ++++++++++++++++++++++++----------------
>  4 files changed, 67 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index e91ec9ddcd30..1048160625bb 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -40,6 +40,20 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
>   */
>  #define list_next_rcu(list)	(*((struct list_head __rcu **)(&(list)->next)))
>  
> +/*
> + * Check during list traversal that we are within an RCU reader
> + */
> +
> +#ifdef CONFIG_PROVE_RCU_LIST

This new Kconfig option is OK temporarily, but unless there is reason to
fear malfunction that a few weeks of rcutorture, 0day, and -next won't
find, it would be better to just use CONFIG_PROVE_RCU.  The overall goal
is to reduce the number of RCU knobs rather than grow them, much though
history might lead one to believe otherwise.  :-/

> +#define __list_check_rcu(dummy, cond, ...)				\
> +	({								\
> +	RCU_LOCKDEP_WARN(!(cond) && !rcu_read_lock_any_held(),		\
> +			 "RCU-list traversed in non-reader section!");	\
> +	 })
> +#else
> +#define __list_check_rcu(dummy, cond, ...) ({})
> +#endif
> +
>  /*
>   * Insert a new entry between two known consecutive entries.
>   *
> @@ -343,14 +357,16 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
>   * @pos:	the type * to use as a loop cursor.
>   * @head:	the head for your list.
>   * @member:	the name of the list_head within the struct.
> + * @cond:	optional lockdep expression if called from non-RCU protection.
>   *
>   * This list-traversal primitive may safely run concurrently with
>   * the _rcu list-mutation primitives such as list_add_rcu()
>   * as long as the traversal is guarded by rcu_read_lock().
>   */
> -#define list_for_each_entry_rcu(pos, head, member) \
> -	for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \
> -		&pos->member != (head); \
> +#define list_for_each_entry_rcu(pos, head, member, cond...)		\
> +	for (__list_check_rcu(dummy, ## cond, 0),			\
> +	     pos = list_entry_rcu((head)->next, typeof(*pos), member);	\
> +		&pos->member != (head);					\
>  		pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
>  
>  /**
> @@ -616,13 +632,15 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
>   * @pos:	the type * to use as a loop cursor.
>   * @head:	the head for your list.
>   * @member:	the name of the hlist_node within the struct.
> + * @cond:	optional lockdep expression if called from non-RCU protection.
>   *
>   * This list-traversal primitive may safely run concurrently with
>   * the _rcu list-mutation primitives such as hlist_add_head_rcu()
>   * as long as the traversal is guarded by rcu_read_lock().
>   */
> -#define hlist_for_each_entry_rcu(pos, head, member)			\
> -	for (pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\
> +#define hlist_for_each_entry_rcu(pos, head, member, cond...)		\
> +	for (__list_check_rcu(dummy, ## cond, 0),			\
> +	     pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\
>  			typeof(*(pos)), member);			\
>  		pos;							\
>  		pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 8f7167478c1d..f3c29efdf19a 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -221,6 +221,7 @@ int debug_lockdep_rcu_enabled(void);
>  int rcu_read_lock_held(void);
>  int rcu_read_lock_bh_held(void);
>  int rcu_read_lock_sched_held(void);
> +int rcu_read_lock_any_held(void);
>  
>  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
> @@ -241,6 +242,12 @@ static inline int rcu_read_lock_sched_held(void)
>  {
>  	return !preemptible();
>  }
> +
> +static inline int rcu_read_lock_any_held(void)
> +{
> +	return !preemptible();
> +}
> +
>  #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
>  #ifdef CONFIG_PROVE_RCU
> diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> index 5ec3ea4028e2..7fbd21dbfcd0 100644
> --- a/kernel/rcu/Kconfig.debug
> +++ b/kernel/rcu/Kconfig.debug
> @@ -8,6 +8,17 @@ menu "RCU Debugging"
>  config PROVE_RCU
>  	def_bool PROVE_LOCKING
>  
> +config PROVE_RCU_LIST
> +	bool "RCU list lockdep debugging"
> +	depends on PROVE_RCU

This must also depend on RCU_EXPERT.  

> +	default n
> +	help
> +	  Enable RCU lockdep checking for list usages. By default it is
> +	  turned off since there are several list RCU users that still
> +	  need to be converted to pass a lockdep expression. To prevent
> +	  false-positive splats, we keep it default disabled but once all
> +	  users are converted, we can remove this config option.
> +
>  config TORTURE_TEST
>  	tristate
>  	default n
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index 9dd5aeef6e70..b7a4e3b5fa98 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -91,14 +91,18 @@ module_param(rcu_normal_after_boot, int, 0);
>   * Similarly, we avoid claiming an SRCU read lock held if the current
>   * CPU is offline.
>   */
> +#define rcu_read_lock_held_common()		\
> +	if (!debug_lockdep_rcu_enabled())	\
> +		return 1;			\
> +	if (!rcu_is_watching())			\
> +		return 0;			\
> +	if (!rcu_lockdep_current_cpu_online())	\
> +		return 0;

Nice abstraction of common code!

							Thanx, Paul
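
For comparison, the same early exits can also be factored into a helper
function with an out-parameter instead of a macro that returns from its
caller. The following is a userspace sketch of that alternative, not
the code in this patch; the demo_* flags are hypothetical stand-ins for
debug_lockdep_rcu_enabled(), rcu_is_watching(),
rcu_lockdep_current_cpu_online() and the lock-map test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags standing in for the kernel predicates. */
static bool demo_debug_enabled = true;
static bool demo_watching = true;
static bool demo_cpu_online = true;
static bool demo_map_held;

/*
 * Returns true when the common checks already decide the answer, which is
 * then stored in *ret; returns false when the caller must look further.
 */
static bool demo_held_common(bool *ret)
{
	if (!demo_debug_enabled) {
		*ret = true;
		return true;
	}
	if (!demo_watching || !demo_cpu_online) {
		*ret = false;
		return true;
	}
	return false;
}

/* One flavor-specific predicate built on the shared helper. */
int demo_read_lock_held(void)
{
	bool ret;

	if (demo_held_common(&ret))
		return ret;
	return demo_map_held;
}

/* Test hook that sets all four stand-in flags at once. */
void demo_set_state(bool debug, bool watching, bool online, bool held)
{
	demo_debug_enabled = debug;
	demo_watching = watching;
	demo_cpu_online = online;
	demo_map_held = held;
}

/* Self-check walking through each early exit and the final test. */
void demo_selfcheck(void)
{
	demo_set_state(false, false, false, false);
	assert(demo_read_lock_held() == 1);	/* checking off: assume held */
	demo_set_state(true, false, true, true);
	assert(demo_read_lock_held() == 0);	/* not watching */
	demo_set_state(true, true, false, true);
	assert(demo_read_lock_held() == 0);	/* CPU offline */
	demo_set_state(true, true, true, true);
	assert(demo_read_lock_held() == 1);	/* map held */
	demo_set_state(true, true, true, false);
	assert(demo_read_lock_held() == 0);	/* map not held */
}
```

The trade-off is a few more lines per caller in exchange for control
flow that stays visible at the call site.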

> +
>  int rcu_read_lock_sched_held(void)
>  {
> -	if (!debug_lockdep_rcu_enabled())
> -		return 1;
> -	if (!rcu_is_watching())
> -		return 0;
> -	if (!rcu_lockdep_current_cpu_online())
> -		return 0;
> +	rcu_read_lock_held_common();
> +
>  	return lock_is_held(&rcu_sched_lock_map) || !preemptible();
>  }
>  EXPORT_SYMBOL(rcu_read_lock_sched_held);
> @@ -257,12 +261,8 @@ NOKPROBE_SYMBOL(debug_lockdep_rcu_enabled);
>   */
>  int rcu_read_lock_held(void)
>  {
> -	if (!debug_lockdep_rcu_enabled())
> -		return 1;
> -	if (!rcu_is_watching())
> -		return 0;
> -	if (!rcu_lockdep_current_cpu_online())
> -		return 0;
> +	rcu_read_lock_held_common();
> +
>  	return lock_is_held(&rcu_lock_map);
>  }
>  EXPORT_SYMBOL_GPL(rcu_read_lock_held);
> @@ -284,16 +284,24 @@ EXPORT_SYMBOL_GPL(rcu_read_lock_held);
>   */
>  int rcu_read_lock_bh_held(void)
>  {
> -	if (!debug_lockdep_rcu_enabled())
> -		return 1;
> -	if (!rcu_is_watching())
> -		return 0;
> -	if (!rcu_lockdep_current_cpu_online())
> -		return 0;
> +	rcu_read_lock_held_common();
> +
>  	return in_softirq() || irqs_disabled();
>  }
>  EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
>  
> +int rcu_read_lock_any_held(void)
> +{
> +	rcu_read_lock_held_common();
> +
> +	if (lock_is_held(&rcu_lock_map) ||
> +	    lock_is_held(&rcu_bh_lock_map) ||
> +	    lock_is_held(&rcu_sched_lock_map))
> +		return 1;
> +	return !preemptible();
> +}
> +EXPORT_SYMBOL_GPL(rcu_read_lock_any_held);
> +
>  #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
>  /**
> -- 
> 2.22.0.510.g264f2c817a-goog
> 


* Re: [PATCH 3/9] rcu/sync: Remove custom check for reader-section (v2)
From: Paul E. McKenney @ 2019-07-16 18:39 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Oleg Nesterov, Alexey Kuznetsov, Bjorn Helgaas,
	Borislav Petkov, c0d1n61at3, David S. Miller, edumazet,
	Greg Kroah-Hartman, Hideaki YOSHIFUJI, H. Peter Anvin,
	Ingo Molnar, Jonathan Corbet, Josh Triplett, keescook,
	kernel-hardening, kernel-team, Lai Jiangshan, Len Brown,
	linux-acpi, linux-doc, linux-pci, linux-pm, Mathieu Desnoyers,
	neilb, netdev, Pavel Machek, peterz, Rafael J. Wysocki,
	Rasmus Villemoes, rcu, Steven Rostedt, Tejun Heo, Thomas Gleixner,
	will, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190715143705.117908-4-joel@joelfernandes.org>

On Mon, Jul 15, 2019 at 10:36:59AM -0400, Joel Fernandes (Google) wrote:
> The rcu/sync code was doing its own check whether we are in a reader
> section. With RCU consolidating flavors and the generic helper added in
> this series, this is no longer needed. We can just use the generic helper
> and it results in a nice cleanup.
> 
> Cc: Oleg Nesterov <oleg@redhat.com>
> Acked-by: Oleg Nesterov <oleg@redhat.com>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

This one looks good!

							Thanx, Paul
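
The cleanup rests on a simple equivalence: warning when none of the
three flavors is held is the same as warning when "any held" is false.
The userspace sketch below checks that over all combinations. It is a
simplified model that treats each flavor as a plain flag and leaves out
the preemption test the real helper also performs; the demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-flavor reader state. */
struct demo_readers {
	bool rcu;
	bool bh;
	bool sched;
};

/* Old shape: the caller tests each flavor separately. */
bool demo_warn_old(const struct demo_readers *r)
{
	return !r->rcu && !r->bh && !r->sched;
}

/* Consolidated predicate: true if any flavor's reader section is held. */
bool demo_any_held(const struct demo_readers *r)
{
	return r->rcu || r->bh || r->sched;
}

/* New shape: one call replaces the three separate tests. */
bool demo_warn_new(const struct demo_readers *r)
{
	return !demo_any_held(r);
}

/* Counts the flavor combinations on which the two shapes disagree. */
int demo_mismatches(void)
{
	int bits, n = 0;

	for (bits = 0; bits < 8; bits++) {
		struct demo_readers r = {
			.rcu = bits & 1,
			.bh = bits & 2,
			.sched = bits & 4,
		};

		if (demo_warn_old(&r) != demo_warn_new(&r))
			n++;
	}
	return n;
}

/* Self-check: the two shapes agree on every combination. */
void demo_selfcheck(void)
{
	assert(demo_mismatches() == 0);
}
```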

> ---
>  include/linux/rcu_sync.h | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/include/linux/rcu_sync.h b/include/linux/rcu_sync.h
> index 9b83865d24f9..0027d4c8087c 100644
> --- a/include/linux/rcu_sync.h
> +++ b/include/linux/rcu_sync.h
> @@ -31,9 +31,7 @@ struct rcu_sync {
>   */
>  static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
>  {
> -	RCU_LOCKDEP_WARN(!rcu_read_lock_held() &&
> -			 !rcu_read_lock_bh_held() &&
> -			 !rcu_read_lock_sched_held(),
> +	RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(),
>  			 "suspicious rcu_sync_is_idle() usage");
>  	return !READ_ONCE(rsp->gp_state); /* GP_IDLE */
>  }
> -- 
> 2.22.0.510.g264f2c817a-goog
> 



* Re: [PATCH v2 2/9] rcu: Add support for consolidated-RCU reader checking
From: Joel Fernandes @ 2019-07-16 18:35 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190716182237.GA22819@linux.ibm.com>

On Tue, Jul 16, 2019 at 11:22:37AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 12, 2019 at 01:00:17PM -0400, Joel Fernandes (Google) wrote:
> > This patch adds support for checking RCU reader sections in list
> > traversal macros. Optionally, if the list macro is called under SRCU or
> > other lock/mutex protection, then appropriate lockdep expressions can be
> > passed to make the checks pass.
> > 
> > Existing list_for_each_entry_rcu() invocations don't need to pass the
> > optional fourth argument (cond) unless they are under some non-RCU
> > protection and need to make the lockdep check pass.
> > 
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> 
> If you fold in the checks for extra parameters, I will take this
> one and also 1/9.

I folded the checks in and also included the rcu/sync patch with Oleg's ack.

Could you pull the following into your /dev branch?

git pull https://github.com/joelagnel/linux-kernel.git list-first-three
(Based on your dev branch)



* Re: [PATCH v2 2/9] rcu: Add support for consolidated-RCU reader checking
From: Paul E. McKenney @ 2019-07-16 18:22 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Alexey Kuznetsov, Bjorn Helgaas, Borislav Petkov,
	c0d1n61at3, David S. Miller, edumazet, Greg Kroah-Hartman,
	Hideaki YOSHIFUJI, H. Peter Anvin, Ingo Molnar, Jonathan Corbet,
	Josh Triplett, keescook, kernel-hardening, kernel-team,
	Lai Jiangshan, Len Brown, linux-acpi, linux-doc, linux-pci,
	linux-pm, Mathieu Desnoyers, neilb, netdev, Oleg Nesterov,
	Pavel Machek, peterz, Rafael J. Wysocki, Rasmus Villemoes, rcu,
	Steven Rostedt, Tejun Heo, Thomas Gleixner, will,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)
In-Reply-To: <20190712170024.111093-3-joel@joelfernandes.org>

On Fri, Jul 12, 2019 at 01:00:17PM -0400, Joel Fernandes (Google) wrote:
> This patch adds support for checking RCU reader sections in list
> traversal macros. Optionally, if the list macro is called under SRCU or
> other lock/mutex protection, then appropriate lockdep expressions can be
> passed to make the checks pass.
> 
> Existing list_for_each_entry_rcu() invocations don't need to pass the
> optional fourth argument (cond) unless they are under some non-RCU
> protection and need to make the lockdep check pass.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

If you fold in the checks for extra parameters, I will take this
one and also 1/9.

							Thanx, Paul
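
One way a check for extra parameters could be expressed is to route the
leftover variadic arguments through a helper macro that accepts exactly
one argument, so that a call with more than one condition is rejected
by the preprocessor. The sketch below is a userspace illustration under
that assumption, using GNU C; demo_arg_count_one and the other demo_*
names are hypothetical and this is not necessarily the form that was
folded in.

```c
#include <assert.h>

/* Accepts exactly one (possibly empty) argument and expands to nothing. */
#define demo_arg_count_one(dummy)

/* Hypothetical stand-in for the lockdep check: remembers the condition. */
static int demo_last_cond = -1;

/*
 * "extra" receives whatever follows the condition. With zero or one
 * caller-supplied condition it is empty or the trailing 0; with two
 * conditions it holds two arguments and demo_arg_count_one() refuses it.
 */
#define demo_check(dummy, cond, extra...)				\
	({								\
		demo_arg_count_one(extra);				\
		demo_last_cond = !!(cond);				\
	})

/* "name" plays the role of the mandatory pos/head/member arguments. */
#define demo_traverse(name, cond...)	demo_check(dummy, ## cond, 0)

/* Runs the traversal with no condition and reports what the check saw. */
int demo_no_cond(void)
{
	demo_traverse(list);
	return demo_last_cond;
}

/* Runs the traversal with one condition and reports what the check saw. */
int demo_one_cond(int held)
{
	demo_traverse(list, held);
	return demo_last_cond;
}

/* Self-check for the two accepted call forms. */
void demo_selfcheck(void)
{
	assert(demo_no_cond() == 0);
	assert(demo_one_cond(1) == 1);
	assert(demo_one_cond(0) == 0);
}
```

A call such as demo_traverse(list, a, b) is expected to fail at
preprocessing time because demo_arg_count_one() then receives two
arguments; that case cannot be shown in a runnable test.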

> ---
>  include/linux/rculist.h  | 28 +++++++++++++++++++++++-----
>  include/linux/rcupdate.h |  7 +++++++
>  kernel/rcu/Kconfig.debug | 11 +++++++++++
>  kernel/rcu/update.c      | 14 ++++++++++++++
>  4 files changed, 55 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index e91ec9ddcd30..1048160625bb 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -40,6 +40,20 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
>   */
>  #define list_next_rcu(list)	(*((struct list_head __rcu **)(&(list)->next)))
>  
> +/*
> + * Check during list traversal that we are within an RCU reader
> + */
> +
> +#ifdef CONFIG_PROVE_RCU_LIST
> +#define __list_check_rcu(dummy, cond, ...)				\
> +	({								\
> +	RCU_LOCKDEP_WARN(!(cond) && !rcu_read_lock_any_held(),		\
> +			 "RCU-list traversed in non-reader section!");	\
> +	 })
> +#else
> +#define __list_check_rcu(dummy, cond, ...) ({})
> +#endif
> +
>  /*
>   * Insert a new entry between two known consecutive entries.
>   *
> @@ -343,14 +357,16 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
>   * @pos:	the type * to use as a loop cursor.
>   * @head:	the head for your list.
>   * @member:	the name of the list_head within the struct.
> + * @cond:	optional lockdep expression if called from non-RCU protection.
>   *
>   * This list-traversal primitive may safely run concurrently with
>   * the _rcu list-mutation primitives such as list_add_rcu()
>   * as long as the traversal is guarded by rcu_read_lock().
>   */
> -#define list_for_each_entry_rcu(pos, head, member) \
> -	for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \
> -		&pos->member != (head); \
> +#define list_for_each_entry_rcu(pos, head, member, cond...)		\
> +	for (__list_check_rcu(dummy, ## cond, 0),			\
> +	     pos = list_entry_rcu((head)->next, typeof(*pos), member);	\
> +		&pos->member != (head);					\
>  		pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
>  
>  /**
> @@ -616,13 +632,15 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
>   * @pos:	the type * to use as a loop cursor.
>   * @head:	the head for your list.
>   * @member:	the name of the hlist_node within the struct.
> + * @cond:	optional lockdep expression if called from non-RCU protection.
>   *
>   * This list-traversal primitive may safely run concurrently with
>   * the _rcu list-mutation primitives such as hlist_add_head_rcu()
>   * as long as the traversal is guarded by rcu_read_lock().
>   */
> -#define hlist_for_each_entry_rcu(pos, head, member)			\
> -	for (pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\
> +#define hlist_for_each_entry_rcu(pos, head, member, cond...)		\
> +	for (__list_check_rcu(dummy, ## cond, 0),			\
> +	     pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\
>  			typeof(*(pos)), member);			\
>  		pos;							\
>  		pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 922bb6848813..712b464ab960 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -223,6 +223,7 @@ int debug_lockdep_rcu_enabled(void);
>  int rcu_read_lock_held(void);
>  int rcu_read_lock_bh_held(void);
>  int rcu_read_lock_sched_held(void);
> +int rcu_read_lock_any_held(void);
>  
>  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
> @@ -243,6 +244,12 @@ static inline int rcu_read_lock_sched_held(void)
>  {
>  	return !preemptible();
>  }
> +
> +static inline int rcu_read_lock_any_held(void)
> +{
> +	return !preemptible();
> +}
> +
>  #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
>  #ifdef CONFIG_PROVE_RCU
> diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> index 0ec7d1d33a14..b20d0e2903d1 100644
> --- a/kernel/rcu/Kconfig.debug
> +++ b/kernel/rcu/Kconfig.debug
> @@ -7,6 +7,17 @@ menu "RCU Debugging"
>  config PROVE_RCU
>  	def_bool PROVE_LOCKING
>  
> +config PROVE_RCU_LIST
> +	bool "RCU list lockdep debugging"
> +	depends on PROVE_RCU
> +	default n
> +	help
> +	  Enable RCU lockdep checking for list usages. By default it is
> +	  turned off since there are several list RCU users that still
> +	  need to be converted to pass a lockdep expression. To prevent
> +	  false-positive splats, we keep it default disabled but once all
> +	  users are converted, we can remove this config option.
> +
>  config TORTURE_TEST
>  	tristate
>  	default n
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index bb961cd89e76..0cc7be0fb6b5 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -294,6 +294,20 @@ int rcu_read_lock_bh_held(void)
>  }
>  EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
>  
> +int rcu_read_lock_any_held(void)
> +{
> +	if (!debug_lockdep_rcu_enabled())
> +		return 1;
> +	if (!rcu_is_watching())
> +		return 0;
> +	if (!rcu_lockdep_current_cpu_online())
> +		return 0;
> +	if (lock_is_held(&rcu_lock_map) || lock_is_held(&rcu_sched_lock_map))
> +		return 1;
> +	return !preemptible();
> +}
> +EXPORT_SYMBOL_GPL(rcu_read_lock_any_held);
> +
>  #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
>  /**
> -- 
> 2.22.0.510.g264f2c817a-goog
> 


