* [PATCH] flow_dissector: work around stack frame size warning
From: Arnd Bergmann @ 2020-05-29 20:13 UTC
To: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
Jakub Kicinski, Guillaume Nault
Cc: Arnd Bergmann, Vlad Buslov, Xin Long, Pablo Neira Ayuso, netdev,
linux-kernel
The fl_flow_key structure is around 500 bytes, so having two of them
on the stack in one function now exceeds the warning limit after an
otherwise correct change:
net/sched/cls_flower.c:298:12: error: stack frame size of 1056 bytes in function 'fl_classify' [-Werror,-Wframe-larger-than=]
I suspect the fl_classify function could be reworked to only have one
of them on the stack and modify it in place, but I could not work out
how to do that.
As a somewhat hacky workaround, move one of them into an out-of-line
function to reduce its scope. This does not necessarily reduce the stack
usage of the outer function, but at least the second copy is removed
from the stack during most of it and does not add up to whatever is
called from there.
I now see 552 bytes of stack usage for fl_classify(), plus 528 bytes
for fl_mask_lookup().
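
The pattern can be sketched in user space as follows. This is only a minimal sketch, not the cls_flower code: struct big_key, masked_lookup() and classify() are made-up stand-ins, and __attribute__((noinline)) is assumed to approximate noinline_for_stack when building with GCC or Clang.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct fl_flow_key: just a large object. */
struct big_key {
	unsigned char bytes[512];
};

/* User-space approximation of the kernel's noinline_for_stack (assumption). */
#define NOINLINE_FOR_STACK __attribute__((noinline))

/*
 * The masked copy lives only in this frame, so it is off the stack
 * again before the caller goes on to call anything else.
 */
static NOINLINE_FOR_STACK int masked_lookup(const struct big_key *key,
					    const struct big_key *mask)
{
	struct big_key mkey;
	size_t i;
	int sum = 0;

	assert(key && mask);
	for (i = 0; i < sizeof(mkey.bytes); i++) {
		mkey.bytes[i] = key->bytes[i] & mask->bytes[i];
		sum += mkey.bytes[i];
	}
	return sum;	/* stands in for the real hash lookup */
}

/* The caller keeps a single big_key in its own frame. */
int classify(const struct big_key *mask, unsigned char fill)
{
	struct big_key key;

	memset(&key, fill, sizeof(key));
	return masked_lookup(&key, mask);
}
```

Whether the outer frame really shrinks depends on the compiler, so the numbers should be checked with -Wframe-larger-than= on the real code.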
Fixes: 58cff782cc55 ("flow_dissector: Parse multiple MPLS Label Stack Entries")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
net/sched/cls_flower.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 96f5999281e0..030896eadd11 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -272,14 +272,16 @@ static struct cls_fl_filter *fl_lookup_range(struct fl_flow_mask *mask,
return NULL;
}
-static struct cls_fl_filter *fl_lookup(struct fl_flow_mask *mask,
- struct fl_flow_key *mkey,
- struct fl_flow_key *key)
+static noinline_for_stack
+struct cls_fl_filter *fl_mask_lookup(struct fl_flow_mask *mask, struct fl_flow_key *key)
{
+ struct fl_flow_key mkey;
+
+ fl_set_masked_key(&mkey, key, mask);
if ((mask->flags & TCA_FLOWER_MASK_FLAGS_RANGE))
- return fl_lookup_range(mask, mkey, key);
+ return fl_lookup_range(mask, &mkey, key);
- return __fl_lookup(mask, mkey);
+ return __fl_lookup(mask, &mkey);
}
static u16 fl_ct_info_to_flower_map[] = {
@@ -299,7 +301,6 @@ static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
struct tcf_result *res)
{
struct cls_fl_head *head = rcu_dereference_bh(tp->root);
- struct fl_flow_key skb_mkey;
struct fl_flow_key skb_key;
struct fl_flow_mask *mask;
struct cls_fl_filter *f;
@@ -319,9 +320,7 @@ static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
ARRAY_SIZE(fl_ct_info_to_flower_map));
skb_flow_dissect(skb, &mask->dissector, &skb_key, 0);
- fl_set_masked_key(&skb_mkey, &skb_key, mask);
-
- f = fl_lookup(mask, &skb_mkey, &skb_key);
+ f = fl_mask_lookup(mask, &skb_key);
if (f && !tc_skip_sw(f->flags)) {
*res = f->res;
return tcf_exts_exec(skb, &f->exts, res);
--
2.26.2
* Re: [PATCH] HID: usbhid: do not sleep when opening device
From: Guenter Roeck @ 2020-05-29 20:14 UTC
To: Dmitry Torokhov
Cc: Jiri Kosina, Benjamin Tissoires, groeck, Nicolas Boichat,
linux-usb, linux-input, linux-kernel
In-Reply-To: <20200529195951.GA3767@dtor-ws>
On Fri, May 29, 2020 at 12:59:51PM -0700, Dmitry Torokhov wrote:
> usbhid tries to give the device 50 milliseconds to drain its queues
> when opening the device, but does it naively by simply sleeping in open
> handler, which slows down device probing (and thus may affect overall
> boot time).
>
> However we do not need to sleep as we can instead mark a point of time
> in the future when we should start processing the events.
>
> Reported-by: Nicolas Boichat <drinkcat@chromium.org>
> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> ---
> drivers/hid/usbhid/hid-core.c | 27 +++++++++++++++------------
> drivers/hid/usbhid/usbhid.h | 1 +
> 2 files changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
> index c7bc9db5b192..e69992e945b2 100644
> --- a/drivers/hid/usbhid/hid-core.c
> +++ b/drivers/hid/usbhid/hid-core.c
> @@ -95,6 +95,19 @@ static int hid_start_in(struct hid_device *hid)
> set_bit(HID_NO_BANDWIDTH, &usbhid->iofl);
> } else {
> clear_bit(HID_NO_BANDWIDTH, &usbhid->iofl);
> +
> + if (test_and_clear_bit(HID_RESUME_RUNNING,
> + &usbhid->iofl)) {
> + /*
> + * In case events are generated while nobody was
> + * listening, some are released when the device
> + * is re-opened. Wait 50 msec for the queue to
> + * empty before allowing events to go through
> + * hid.
> + */
> + usbhid->input_start_time = jiffies +
> + msecs_to_jiffies(50);
> + }
> }
> }
> spin_unlock_irqrestore(&usbhid->lock, flags);
> @@ -280,7 +293,8 @@ static void hid_irq_in(struct urb *urb)
> if (!test_bit(HID_OPENED, &usbhid->iofl))
> break;
> usbhid_mark_busy(usbhid);
> - if (!test_bit(HID_RESUME_RUNNING, &usbhid->iofl)) {
> + if (!test_bit(HID_RESUME_RUNNING, &usbhid->iofl) &&
> + time_after(jiffies, usbhid->input_start_time)) {
> hid_input_report(urb->context, HID_INPUT_REPORT,
> urb->transfer_buffer,
> urb->actual_length, 1);
> @@ -714,17 +728,6 @@ static int usbhid_open(struct hid_device *hid)
> }
>
> usb_autopm_put_interface(usbhid->intf);
> -
> - /*
> - * In case events are generated while nobody was listening,
> - * some are released when the device is re-opened.
> - * Wait 50 msec for the queue to empty before allowing events
> - * to go through hid.
> - */
> - if (res == 0)
> - msleep(50);
> -
Can you just set usbhid->input_start_time here?
	if (res == 0)
		usbhid->input_start_time = jiffies + msecs_to_jiffies(50);
	clear_bit(HID_RESUME_RUNNING, &usbhid->iofl);
Then you might not need the added code in hid_start_in().
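
The idea of recording a deadline instead of sleeping can be sketched in user space like this. It is a minimal sketch under assumptions: struct input_gate, gate_open() and gate_accepts() are hypothetical names, the tick counter is passed in explicitly instead of reading jiffies, and tick_after() is modelled on the wraparound-safe comparison that time_after() performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Wraparound-safe "a is after b", modelled on the kernel's time_after(). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical device state: input is dropped until start_time has passed. */
struct input_gate {
	unsigned long start_time;
};

/* Called on open instead of sleeping: record when input may start. */
void gate_open(struct input_gate *g, unsigned long now, unsigned long delay)
{
	assert(g);
	g->start_time = now + delay;	/* unsigned addition may wrap, which is fine */
}

/* Called per report: true once the quiet period is over. */
bool gate_accepts(const struct input_gate *g, unsigned long now)
{
	return tick_after(now, g->start_time);
}
```

The comparison is strict, so a report arriving exactly at start_time is still dropped, which matches the time_after(jiffies, ...) test in the patch.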
Thanks,
Guenter
> - clear_bit(HID_RESUME_RUNNING, &usbhid->iofl);
> return res;
> }
>
> diff --git a/drivers/hid/usbhid/usbhid.h b/drivers/hid/usbhid/usbhid.h
> index 8620408bd7af..805949671b96 100644
> --- a/drivers/hid/usbhid/usbhid.h
> +++ b/drivers/hid/usbhid/usbhid.h
> @@ -82,6 +82,7 @@ struct usbhid_device {
>
> spinlock_t lock; /* fifo spinlock */
> unsigned long iofl; /* I/O flags (CTRL_RUNNING, OUT_RUNNING) */
> + unsigned long input_start_time; /* When to start handling input, in jiffies */
> struct timer_list io_retry; /* Retry timer */
> unsigned long stop_retry; /* Time to give up, in jiffies */
> unsigned int retry_delay; /* Delay length in ms */
> --
> 2.27.0.rc0.183.gde8f92d652-goog
>
>
> --
> Dmitry
* + kasan-fix-clang-compilation-warning-due-to-stack-protector.patch added to -mm tree
From: akpm @ 2020-05-29 20:14 UTC
To: andreyknvl, cai, elver, mm-commits
The patch titled
Subject: kasan: fix clang compilation warning due to stack protector
has been added to the -mm tree. Its filename is
kasan-fix-clang-compilation-warning-due-to-stack-protector.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/kasan-fix-clang-compilation-warning-due-to-stack-protector.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/kasan-fix-clang-compilation-warning-due-to-stack-protector.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrey Konovalov <andreyknvl@google.com>
Subject: kasan: fix clang compilation warning due to stack protector
KASAN uses a single cc-option invocation to disable both conserve-stack
and stack-protector flags. The former flag is not present in Clang, which
causes cc-option to fail, and results in stack-protector being enabled.
Fix by using separate cc-option calls for each flag. Also collect all
flags in a variable to avoid calling cc-option multiple times for
different files.
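
Why the combined call loses both flags can be modelled in a few lines of C. This is a rough sketch, not Kbuild: cc_option() and compiler_knows() are hypothetical helpers, and the model assumes that one cc-option call hands its whole argument to a single trial compile and yields either the whole string or nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical compiler model: is this flag in the set it understands? */
static bool compiler_knows(const char *const *known, size_t n,
			   const char *flag, size_t len)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strlen(known[i]) == len && !strncmp(known[i], flag, len))
			return true;
	return false;
}

/*
 * Model of one cc-option call: the option string is returned only if
 * every flag in it is accepted, otherwise the result is empty.
 */
const char *cc_option(const char *const *known, size_t n, const char *opts)
{
	const char *p = opts;

	assert(opts);
	while (*p) {
		size_t len;

		p += strspn(p, " ");
		len = strcspn(p, " ");
		if (len && !compiler_knows(known, n, p, len))
			return "";
		p += len;
	}
	return opts;
}
```

With a compiler that only knows -fno-stack-protector, the two-flag string comes back empty, while two separate calls still keep -fno-stack-protector.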
Link: http://lkml.kernel.org/r/c2f0c8e4048852ae014f4a391d96ca42d27e3255.1590779332.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Qian Cai <cai@lca.pw>
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/kasan/Makefile | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
--- a/mm/kasan/Makefile~kasan-fix-clang-compilation-warning-due-to-stack-protector
+++ a/mm/kasan/Makefile
@@ -15,14 +15,19 @@ CFLAGS_REMOVE_tags_report.o = $(CC_FLAGS
# Function splitter causes unnecessary splits in __asan_load1/__asan_store1
# see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63533
-CFLAGS_common.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
-CFLAGS_generic.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
-CFLAGS_generic_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
-CFLAGS_init.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
-CFLAGS_quarantine.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
-CFLAGS_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
-CFLAGS_tags.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
-CFLAGS_tags_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
+CC_FLAGS_KASAN_RUNTIME := $(call cc-option, -fno-conserve-stack)
+CC_FLAGS_KASAN_RUNTIME += $(call cc-option, -fno-stack-protector)
+# Disable branch tracing to avoid recursion.
+CC_FLAGS_KASAN_RUNTIME += -DDISABLE_BRANCH_PROFILING
+
+CFLAGS_common.o := $(CC_FLAGS_KASAN_RUNTIME)
+CFLAGS_generic.o := $(CC_FLAGS_KASAN_RUNTIME)
+CFLAGS_generic_report.o := $(CC_FLAGS_KASAN_RUNTIME)
+CFLAGS_init.o := $(CC_FLAGS_KASAN_RUNTIME)
+CFLAGS_quarantine.o := $(CC_FLAGS_KASAN_RUNTIME)
+CFLAGS_report.o := $(CC_FLAGS_KASAN_RUNTIME)
+CFLAGS_tags.o := $(CC_FLAGS_KASAN_RUNTIME)
+CFLAGS_tags_report.o := $(CC_FLAGS_KASAN_RUNTIME)
obj-$(CONFIG_KASAN) := common.o init.o report.o
obj-$(CONFIG_KASAN_GENERIC) += generic.o generic_report.o quarantine.o
_
Patches currently in -mm which might be from andreyknvl@google.com are
kcov-cleanup-debug-messages.patch
kcov-fix-potential-use-after-free-in-kcov_remote_start.patch
kcov-move-t-kcov-assignments-into-kcov_start-stop.patch
kcov-move-t-kcov_sequence-assignment.patch
kcov-use-t-kcov_mode-as-enabled-indicator.patch
kcov-collect-coverage-from-interrupts.patch
usb-core-kcov-collect-coverage-from-usb-complete-callback.patch
kasan-fix-clang-compilation-warning-due-to-stack-protector.patch
kasan-move-kasan_report-into-reportc.patch
* Re: [Intel-gfx] [PATCH 1/2] drm/i915/gem: Taint all shrinkable object locks
From: Matthew Auld @ 2020-05-29 20:13 UTC
To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld
In-Reply-To: <20200529183204.16850-1-chris@chris-wilson.co.uk>
On Fri, 29 May 2020 at 19:32, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> If we declare that an object type is shrinkable (any that we can reclaim
> to recover system pages), make sure we taint the object mutex so that
> lockdep expects us to use it within fs_reclaim. lockdep will then
> complain the first time we try to allocate while holding the plain
> mutex, as doing so invites potential recursion.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
* Re: [PATCH 0/3] Couple of HMAT fixes
From: no-reply @ 2020-05-29 20:11 UTC
To: mprivozn; +Cc: jingqi.liu, tao3.xu, qemu-devel, ehabkost
In-Reply-To: <cover.1590753455.git.mprivozn@redhat.com>
Patchew URL: https://patchew.org/QEMU/cover.1590753455.git.mprivozn@redhat.com/
Hi,
This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.
=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===
TEST check-qtest-x86_64: tests/qtest/test-x86-cpuid-compat
TEST check-qtest-x86_64: tests/qtest/numa-test
**
ERROR:/tmp/qemu-test/src/tests/qtest/numa-test.c:524:pc_hmat_erange_cfg: 'qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node'," " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240," " 'level': 1, 'associativity': \"direct\", 'policy': \"write-back\"," " 'line': 8 } }"))' should be TRUE
ERROR - Bail out! ERROR:/tmp/qemu-test/src/tests/qtest/numa-test.c:524:pc_hmat_erange_cfg: 'qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node'," " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240," " 'level': 1, 'associativity': \"direct\", 'policy': \"write-back\"," " 'line': 8 } }"))' should be TRUE
/tmp/qemu-test/src/tests/qtest/libqtest.c:175: kill_qemu() detected QEMU death from signal 15 (Terminated)
make: *** [check-qtest-x86_64] Error 1
make: *** Waiting for unfinished jobs....
TEST iotest-qcow2: 176
TEST iotest-qcow2: 177
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=2b956b136ef4469f9b237fa0b3cad819', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-voh1aoeh/src/docker-src.2020-05-29-15.56.38.16826:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=2b956b136ef4469f9b237fa0b3cad819
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-voh1aoeh/src'
make: *** [docker-run-test-quick@centos7] Error 2
real 14m39.846s
user 0m8.869s
The full log is available at
http://patchew.org/logs/cover.1590753455.git.mprivozn@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
* Re: [PATCH v2 bpf-next 2/4] bpf: Introduce sleepable BPF programs
From: Alexei Starovoitov @ 2020-05-29 20:12 UTC
To: Andrii Nakryiko
Cc: David S. Miller, Daniel Borkmann, Networking, bpf, Kernel Team
In-Reply-To: <CAEf4BzZXnqLwhJaUVKX0ExVa+Sw5mnhg5FLJN-VKPX59f6EAoQ@mail.gmail.com>
On Fri, May 29, 2020 at 01:25:06AM -0700, Andrii Nakryiko wrote:
> > index 11584618e861..26b18b6a3dbc 100644
> > --- a/kernel/bpf/arraymap.c
> > +++ b/kernel/bpf/arraymap.c
> > @@ -393,6 +393,11 @@ static void array_map_free(struct bpf_map *map)
> > */
> > synchronize_rcu();
> >
> > + /* arrays could have been used by both sleepable and non-sleepable bpf
> > + * progs. Make sure to wait for both prog types to finish executing.
> > + */
> > + synchronize_srcu(&bpf_srcu);
> > +
>
> to minimize churn later on when you switch to rcu_trace, maybe extract
> synchronize_rcu() + synchronize_srcu(&bpf_srcu) into a function (e.g.,
> something like synchronize_sleepable_bpf?), exposed as an internal
> API? That way you also wouldn't need to add bpf_srcu to linux/bpf.h?
I think the opposite is actually a must-have. I think rcu operations should never
be hidden in helpers. All rcu/srcu/rcu_trace ops should always be open coded.
> > @@ -577,8 +577,8 @@ static void *__htab_map_lookup_elem(struct bpf_map *map, void *key)
> > struct htab_elem *l;
> > u32 hash, key_size;
> >
> > - /* Must be called with rcu_read_lock. */
> > - WARN_ON_ONCE(!rcu_read_lock_held());
> > + /* Must be called with s?rcu_read_lock. */
> > + WARN_ON_ONCE(!rcu_read_lock_held() && !srcu_read_lock_held(&bpf_srcu));
> >
>
> Similar to above, might be worthwhile extracting into a function?
This one I'm 50/50 on, since this pattern will be in many places.
But what kind of helper would that be?
A clear name is very hard to find.
WARN_ON_ONCE(!bpf_specific_rcu_lock_held()) ?
Moving WARN into the helper would be even worse.
When rcu_trace is available, the churn of patches to convert srcu to rcu_trace
will be a good thing. The patches will convey the difference.
For example, bpf_srcu will disappear. They will give a way to do benchmarking
before/after and will help to go back to srcu in the unlikely case there is some
obscure bug in rcu_trace. Hiding srcu vs rcu_trace details behind helpers is not
how the code should read. The trade-off between one and the other will differ
case by case. For example, synchronize_srcu() is ok, but synchronize_rcu_trace()
may be too heavy in the trampoline update code and an extra counter would be needed.
Also there will be synchronize_multi() that I plan to use as well.
> >
> > + if (prog->aux->sleepable && prog->type != BPF_PROG_TYPE_TRACING &&
> > + prog->type != BPF_PROG_TYPE_LSM) {
> > + verbose(env, "Only fentry/fexit/fmod_ret and lsm programs can be sleepable\n");
> > + return -EINVAL;
> > + }
>
>
> BPF_PROG_TYPE_TRACING also includes iterator and raw tracepoint
> programs. You mention only fentry/fexit/fmod_ret are allowed. What
> about those two? I don't see any explicit checks for iterator and
> raw_tracepoint attach types in a switch below, so just checking if
> they should be allowed to be sleepable?
Good point. tp_btf and iter don't use a trampoline, so the sleepable flag
is ignored, which is wrong. I'll add a check to get the prog rejected.
> Also seems like freplace ones are also sleepable, if they replace
> sleepable programs, right?
freplace is a different program type. So it's rejected by this code already.
Eventually I'll add support to allow sleepable freplace progs that extend
a sleepable target. But that's for the future.
> > +
> > if (prog->type == BPF_PROG_TYPE_STRUCT_OPS)
> > return check_struct_ops_btf_id(env);
> >
> > @@ -10762,8 +10801,29 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
> > if (ret)
> > verbose(env, "%s() is not modifiable\n",
> > prog->aux->attach_func_name);
> > + } else if (prog->aux->sleepable) {
> > + switch (prog->type) {
> > + case BPF_PROG_TYPE_TRACING:
> > + /* fentry/fexit progs can be sleepable only if they are
> > + * attached to ALLOW_ERROR_INJECTION or security_*() funcs.
> > + */
> > + ret = check_attach_modify_return(prog, addr);
>
> I was so confused about this piece... check_attach_modify_return()
> should probably be renamed to something else, it's not for fmod_ret
> only anymore.
why? I think the name is correct. The helper checks whether target
allows modifying its return value. It's a first white list.
When that passes the black list applies via check_sleepable_blacklist() function.
I was considering using whitelist for sleepable as well, but that's overkill.
Too much overlap with mod_ret.
Imo check whitelist + check blacklist for white list exceptions is clean enough.
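
The two-stage check reads roughly like the sketch below. It is a user-space illustration only: sleepable_attach_ok() and name_in() are hypothetical helpers that work on plain name tables, not the verifier's check_attach_modify_return() or check_sleepable_blacklist().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Linear search of a small name table. */
static bool name_in(const char *const *list, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!strcmp(list[i], name))
			return true;
	return false;
}

/*
 * Two-stage check: the target must be on the white list, and must not
 * be one of the black-listed exceptions to it.
 */
bool sleepable_attach_ok(const char *const *white, size_t n_white,
			 const char *const *black, size_t n_black,
			 const char *func)
{
	assert(func);
	if (!name_in(white, n_white, func))
		return false;
	return !name_in(black, n_black, func);
}
```

Keeping the exceptions in a separate table means the white list can stay shared with mod_ret.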
>
> > + if (!ret)
> > + ret = check_sleepable_blacklist(addr);
> > + break;
> > + case BPF_PROG_TYPE_LSM:
> > + /* LSM progs check that they are attached to bpf_lsm_*() funcs
> > + * which are sleepable too.
> > + */
> > + ret = check_sleepable_blacklist(addr);
> > + break;
* [MPTCP] Re: Crashers on netnext with apache-benchmark
From: Christoph Paasch @ 2020-05-29 20:11 UTC
To: mptcp
On 05/29/20 - 22:04, Paolo Abeni wrote:
> On Tue, 2020-05-26 at 17:28 -0700, Christoph Paasch wrote:
> > And another one:
> >
> > [ 62.586401] ==================================================================
> > [ 62.588813] BUG: KASAN: use-after-free in inet_twsk_bind_unhash+0x5f/0xe0
> > [ 62.589975] Write of size 8 at addr ffff88810f155a20 by task ksoftirqd/2/21
> > [ 62.591194]
> > [ 62.591485] CPU: 2 PID: 21 Comm: ksoftirqd/2 Kdump: loaded Not tainted 5.7.0-rc6.mptcp #36
> > [ 62.593067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> > [ 62.595268] Call Trace:
> > [ 62.595775] dump_stack+0x76/0xa0
> > [ 62.596448] print_address_description.constprop.0+0x3a/0x60
> > [ 62.600581] __kasan_report.cold+0x20/0x3b
> > [ 62.602968] kasan_report+0x38/0x50
> > [ 62.603561] inet_twsk_bind_unhash+0x5f/0xe0
> > [ 62.604282] inet_twsk_kill+0x195/0x200
> > [ 62.604945] inet_twsk_deschedule_put+0x25/0x30
> > [ 62.605731] tcp_v4_rcv+0xa79/0x15e0
> > [ 62.607139] ip_protocol_deliver_rcu+0x37/0x270
> > [ 62.607980] ip_local_deliver_finish+0xb0/0xd0
> > [ 62.608758] ip_local_deliver+0x1c9/0x1e0
> > [ 62.611162] ip_sublist_rcv_finish+0x84/0xa0
> > [ 62.611894] ip_sublist_rcv+0x22c/0x320
> > [ 62.616143] ip_list_rcv+0x1e4/0x225
> > [ 62.619427] __netif_receive_skb_list_core+0x439/0x460
> > [ 62.622771] netif_receive_skb_list_internal+0x3ea/0x570
> > [ 62.625320] gro_normal_list.part.0+0x14/0x50
> > [ 62.626088] napi_gro_receive+0x6a/0xb0
> > [ 62.626787] receive_buf+0x371/0x1d50
> > [ 62.632092] virtnet_poll+0x2be/0x5b0
> > [ 62.634099] net_rx_action+0x1ec/0x4c0
> > [ 62.636132] __do_softirq+0xfc/0x29c
> > [ 62.638180] run_ksoftirqd+0x15/0x30
> > [ 62.638787] smpboot_thread_fn+0x1fc/0x380
> > [ 62.642009] kthread+0x1f1/0x210
> > [ 62.643478] ret_from_fork+0x35/0x40
> > [ 62.644094]
> > [ 62.644371] Allocated by task 1355:
> > [ 62.644980] save_stack+0x1b/0x40
> > [ 62.645539] __kasan_kmalloc.constprop.0+0xc2/0xd0
> > [ 62.646347] kmem_cache_alloc+0xb8/0x190
> > [ 62.647006] getname_flags+0x6b/0x2b0
> > [ 62.647627] user_path_at_empty+0x1b/0x40
> > [ 62.648306] vfs_statx+0xba/0x140
> > [ 62.648875] __do_sys_newstat+0x8c/0xf0
> > [ 62.649518] do_syscall_64+0xbc/0x790
> > [ 62.650199] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [ 62.651091]
> > [ 62.651360] Freed by task 1355:
> > [ 62.651903] save_stack+0x1b/0x40
> > [ 62.652460] __kasan_slab_free+0x12f/0x180
> > [ 62.653147] kmem_cache_free+0x87/0x240
> > [ 62.653795] filename_lookup+0x183/0x250
> > [ 62.654447] vfs_statx+0xba/0x140
> > [ 62.655001] __do_sys_newstat+0x8c/0xf0
> > [ 62.655640] do_syscall_64+0xbc/0x790
> > [ 62.656246] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [ 62.657089]
> > [ 62.657351] The buggy address belongs to the object at ffff88810f155500
> > which belongs to the cache names_cache of size 4096
> > [ 62.659420] The buggy address is located 1312 bytes inside of
> > 4096-byte region [ffff88810f155500, ffff88810f156500)
> > [ 62.661358] The buggy address belongs to the page:
> > [ 62.662175] page:ffffea00043c5400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00043c5400 order:3 compound_mapcount:0 compound_pincount:0
> > [ 62.664523] flags: 0x8000000000010200(slab|head)
> > [ 62.665342] raw: 8000000000010200 0000000000000000 0000000400000001 ffff88811ac772c0
> > [ 62.666713] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
> > [ 62.667984] page dumped because: kasan: bad access detected
> > [ 62.668904]
> > [ 62.669171] Memory state around the buggy address:
> > [ 62.669975] ffff88810f155900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > [ 62.671163] ffff88810f155980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > [ 62.672363] >ffff88810f155a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > [ 62.673559] ^
> > [ 62.674349] ffff88810f155a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > [ 62.675531] ffff88810f155b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > [ 62.676723] ==================================================================
>
> Could you please try booting with "slab_nomerge" on the kernel command
> line?
>
> Possibly kasan could track better the really relevant UAF cycle that
> way.
>
> Thanks!
Sure, will try that!
Some more info from my crash-analysis: the next-pointer in the bindhash-table
is pointing to free'd memory. That free'd memory seems to be a tcp_sock in state CLOSED.
So, I guess the bind_node is not unlinked when a socket is being free'd. I'm
adding debugging code to see if that is true.
Christoph
* Re: [PATCH] libvhost-user: advertise vring features
From: Marc-André Lureau @ 2020-05-29 20:10 UTC
To: Stefan Hajnoczi
Cc: Jason Wang, Michael S. Tsirkin, qemu-devel, Raphael Norwitz
In-Reply-To: <20200529161338.456017-1-stefanha@redhat.com>
Hi
On Fri, May 29, 2020 at 6:13 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> libvhost-user implements several vring features without advertising
> them. There is no way for the vhost-user master to detect support for
> these features.
>
> Things more or less work today because QEMU assumes the vhost-user
> backend always implements certain feature bits like
> VIRTIO_RING_F_EVENT_IDX. This is not documented anywhere.
>
> This patch explicitly advertises features implemented in libvhost-user
> so that the vhost-user master does not need to make undocumented
> assumptions.
>
> Feature bits that libvhost-user now advertises can be removed from
> vhost-user-blk.c. Devices should not be responsible for advertising
> vring feature bits, that is libvhost-user's job.
>
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
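
For readers following along, the split between library-owned and device-owned feature bits can be sketched as below. This is a stand-alone sketch, not libvhost-user: lib_features(), device_features() and has_feature() are hypothetical names, the bit numbers are the ones I recall from the VIRTIO spec and the vhost headers, and OR-ing the device bits into the library bits is an assumption about how the two sets are combined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers as recalled from the VIRTIO spec and vhost headers (assumption). */
enum {
	VIRTIO_F_NOTIFY_ON_EMPTY = 24,
	VHOST_F_LOG_ALL = 26,
	VIRTIO_RING_F_INDIRECT_DESC = 28,
	VIRTIO_RING_F_EVENT_IDX = 29,
	VHOST_USER_F_PROTOCOL_FEATURES = 30,
	VIRTIO_F_VERSION_1 = 32,
};

/* Vring and vhost-user bits advertised on behalf of every device. */
uint64_t lib_features(void)
{
	return 1ULL << VIRTIO_F_NOTIFY_ON_EMPTY |
	       1ULL << VIRTIO_RING_F_INDIRECT_DESC |
	       1ULL << VIRTIO_RING_F_EVENT_IDX |
	       1ULL << VIRTIO_F_VERSION_1 |
	       1ULL << VHOST_F_LOG_ALL |
	       1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
}

/* A device contributes only its own device-specific bits. */
uint64_t device_features(uint64_t device_bits)
{
	return lib_features() | device_bits;
}

/* Test a single feature bit in a negotiated mask. */
bool has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features >> bit) & 1;
}
```

Note the 1ULL: VIRTIO_F_VERSION_1 is bit 32, so a plain int shift would not be wide enough.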
> ---
> I have tested make check and virtiofsd.
> ---
> contrib/libvhost-user/libvhost-user.c | 10 ++++++++++
> contrib/vhost-user-blk/vhost-user-blk.c | 4 +---
> 2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
> index 3bca996c62..b43874ba12 100644
> --- a/contrib/libvhost-user/libvhost-user.c
> +++ b/contrib/libvhost-user/libvhost-user.c
> @@ -495,6 +495,16 @@ static bool
> vu_get_features_exec(VuDev *dev, VhostUserMsg *vmsg)
> {
> vmsg->payload.u64 =
> + /*
> + * The following VIRTIO feature bits are supported by our virtqueue
> + * implementation:
> + */
> + 1ULL << VIRTIO_F_NOTIFY_ON_EMPTY |
> + 1ULL << VIRTIO_RING_F_INDIRECT_DESC |
> + 1ULL << VIRTIO_RING_F_EVENT_IDX |
> + 1ULL << VIRTIO_F_VERSION_1 |
> +
> + /* vhost-user feature bits */
> 1ULL << VHOST_F_LOG_ALL |
> 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
>
> diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
> index 6fd91c7e99..25eccd02b5 100644
> --- a/contrib/vhost-user-blk/vhost-user-blk.c
> +++ b/contrib/vhost-user-blk/vhost-user-blk.c
> @@ -382,9 +382,7 @@ vub_get_features(VuDev *dev)
> 1ull << VIRTIO_BLK_F_DISCARD |
> 1ull << VIRTIO_BLK_F_WRITE_ZEROES |
> #endif
> - 1ull << VIRTIO_BLK_F_CONFIG_WCE |
> - 1ull << VIRTIO_F_VERSION_1 |
> - 1ull << VHOST_USER_F_PROTOCOL_FEATURES;
> + 1ull << VIRTIO_BLK_F_CONFIG_WCE;
>
> if (vdev_blk->enable_ro) {
> features |= 1ull << VIRTIO_BLK_F_RO;
> --
> 2.25.4
>
* Re: [OE-core] [PATCH V3 3/3] u-boot: introduce UBOOT_INITIAL_ENV
From: Denys Dmytriyenko @ 2020-05-29 20:11 UTC
To: Ming Liu; +Cc: openembedded-core, stefan.agner, max.krummenacher, denys,
Ming Liu
In-Reply-To: <20200528124129.15100-4-liu.ming50@gmail.com>
On Thu, May 28, 2020 at 02:41:29PM +0200, Ming Liu wrote:
> From: Ming Liu <ming.liu@toradex.com>
>
> It defaults to ${PN}-initial-env, no functional changes with current
> implementation, but this allows it to be changed in individual u-boot
> recipes.
>
> If UBOOT_INITIAL_ENV is empty, then no initial env would be compiled/
> installed/deployed, set ALLOW_EMPTY_${PN}-env = "1".
>
> The major purpose for introducing this, is that the users might have
> some scripts on targets like:
> ```
> /sbin/fw_setenv -f /etc/u-boot-initial-env
> ```
>
> and it should be able to run against an identical path generated by
> different u-boot recipes.
>
> Signed-off-by: Ming Liu <ming.liu@toradex.com>
> ---
> meta/recipes-bsp/u-boot/u-boot.inc | 55 +++++++++++++++++++-----------
> 1 file changed, 36 insertions(+), 19 deletions(-)
>
> diff --git a/meta/recipes-bsp/u-boot/u-boot.inc b/meta/recipes-bsp/u-boot/u-boot.inc
> index be15e1760f..8e60615e5c 100644
> --- a/meta/recipes-bsp/u-boot/u-boot.inc
> +++ b/meta/recipes-bsp/u-boot/u-boot.inc
> @@ -60,6 +60,10 @@ UBOOT_ENV_BINARY ?= "${UBOOT_ENV}.${UBOOT_ENV_SUFFIX}"
> UBOOT_ENV_IMAGE ?= "${UBOOT_ENV}-${MACHINE}-${PV}-${PR}.${UBOOT_ENV_SUFFIX}"
> UBOOT_ENV_SYMLINK ?= "${UBOOT_ENV}-${MACHINE}.${UBOOT_ENV_SUFFIX}"
>
> +# Default name of u-boot initial env, but enable individual recipes to change
> +# this value.
> +UBOOT_INITIAL_ENV ?= "${PN}-initial-env"
> +
> # U-Boot EXTLINUX variables. U-Boot searches for /boot/extlinux/extlinux.conf
> # to find EXTLINUX conf file.
> UBOOT_EXTLINUX_INSTALL_DIR ?= "/boot/extlinux"
> @@ -137,8 +141,10 @@ do_compile () {
> done
>
> # Generate the uboot-initial-env
> - oe_runmake -C ${S} O=${B}/${config} u-boot-initial-env
> - cp ${B}/${config}/u-boot-initial-env ${B}/${config}/u-boot-initial-env-${type}
> + if [ -n "${UBOOT_INITIAL_ENV}" ]; then
> + oe_runmake -C ${S} O=${B}/${config} u-boot-initial-env
> + cp ${B}/${config}/u-boot-initial-env ${B}/${config}/u-boot-initial-env-${type}
> + fi
>
> unset k
> fi
> @@ -150,7 +156,9 @@ do_compile () {
> oe_runmake -C ${S} O=${B} ${UBOOT_MAKE_TARGET}
>
> # Generate the uboot-initial-env
> - oe_runmake -C ${S} O=${B} u-boot-initial-env
> + if [ -n "${UBOOT_INITIAL_ENV}" ]; then
> + oe_runmake -C ${S} O=${B} u-boot-initial-env
> + fi
> fi
> }
>
> @@ -168,10 +176,12 @@ do_install () {
> ln -sf u-boot-${type}-${PV}-${PR}.${UBOOT_SUFFIX} ${D}/boot/${UBOOT_BINARY}
>
> # Install the uboot-initial-env
> - install -D -m 644 ${B}/${config}/u-boot-initial-env-${type} ${D}/${sysconfdir}/${PN}-initial-env-${MACHINE}-${type}-${PV}-${PR}
> - ln -sf ${PN}-initial-env-${MACHINE}-${type}-${PV}-${PR} ${D}/${sysconfdir}/${PN}-initial-env-${MACHINE}-${type}
> - ln -sf ${PN}-initial-env-${MACHINE}-${type}-${PV}-${PR} ${D}/${sysconfdir}/${PN}-initial-env-${type}
> - ln -sf ${PN}-initial-env-${MACHINE}-${type}-${PV}-${PR} ${D}/${sysconfdir}/${PN}-initial-env
> + if [ -n "${UBOOT_INITIAL_ENV}" ]; then
> + install -D -m 644 ${B}/${config}/u-boot-initial-env-${type} ${D}/${sysconfdir}/${UBOOT_INITIAL_ENV}-${MACHINE}-${type}-${PV}-${PR}
> + ln -sf ${UBOOT_INITIAL_ENV}-${MACHINE}-${type}-${PV}-${PR} ${D}/${sysconfdir}/${UBOOT_INITIAL_ENV}-${MACHINE}-${type}
> + ln -sf ${UBOOT_INITIAL_ENV}-${MACHINE}-${type}-${PV}-${PR} ${D}/${sysconfdir}/${UBOOT_INITIAL_ENV}-${type}
> + ln -sf ${UBOOT_INITIAL_ENV}-${MACHINE}-${type}-${PV}-${PR} ${D}/${sysconfdir}/${UBOOT_INITIAL_ENV}
> + fi
> fi
> done
> unset j
> @@ -182,9 +192,11 @@ do_install () {
> ln -sf ${UBOOT_IMAGE} ${D}/boot/${UBOOT_BINARY}
>
> # Install the uboot-initial-env
> - install -D -m 644 ${B}/u-boot-initial-env ${D}/${sysconfdir}/${PN}-initial-env-${MACHINE}-${PV}-${PR}
> - ln -sf ${PN}-initial-env-${MACHINE}-${PV}-${PR} ${D}/${sysconfdir}/${PN}-initial-env-${MACHINE}
> - ln -sf ${PN}-initial-env-${MACHINE}-${PV}-${PR} ${D}/${sysconfdir}/${PN}-initial-env
> + if [ -n "${UBOOT_INITIAL_ENV}" ]; then
> + install -D -m 644 ${B}/u-boot-initial-env ${D}/${sysconfdir}/${UBOOT_INITIAL_ENV}-${MACHINE}-${PV}-${PR}
> + ln -sf ${UBOOT_INITIAL_ENV}-${MACHINE}-${PV}-${PR} ${D}/${sysconfdir}/${UBOOT_INITIAL_ENV}-${MACHINE}
> + ln -sf ${UBOOT_INITIAL_ENV}-${MACHINE}-${PV}-${PR} ${D}/${sysconfdir}/${UBOOT_INITIAL_ENV}
> + fi
> fi
>
> if [ -n "${UBOOT_ELF}" ]
> @@ -255,8 +267,9 @@ do_install () {
> PACKAGE_BEFORE_PN += "${PN}-env"
>
> RPROVIDES_${PN}-env += "u-boot-default-env"
> +ALLOW_EMPTY_${PN}-env = "1"
I don't think this ^ is required, as there are other files in ${PN}-env, e.g.
fw_env.config:
> FILES_${PN}-env = " \
> - ${sysconfdir}/${PN}-initial-env* \
> + ${sysconfdir}/${UBOOT_INITIAL_ENV}* \
> ${sysconfdir}/fw_env.config \
> "
So, what happens when UBOOT_INITIAL_ENV is empty? You get ${sysconfdir}/* in
there. Maybe you need a better check here?
> @@ -280,10 +293,12 @@ do_deploy () {
> ln -sf u-boot-${type}-${PV}-${PR}.${UBOOT_SUFFIX} ${UBOOT_BINARY}
>
> # Deploy the uboot-initial-env
> - install -D -m 644 ${B}/${config}/u-boot-initial-env-${type} ${DEPLOYDIR}/${PN}-initial-env-${MACHINE}-${type}-${PV}-${PR}
> - cd ${DEPLOYDIR}
> - ln -sf ${PN}-initial-env-${MACHINE}-${type}-${PV}-${PR} ${PN}-initial-env-${MACHINE}-${type}
> - ln -sf ${PN}-initial-env-${MACHINE}-${type}-${PV}-${PR} ${PN}-initial-env-${type}
> + if [ -n "${UBOOT_INITIAL_ENV}" ]; then
> + install -D -m 644 ${B}/${config}/u-boot-initial-env-${type} ${DEPLOYDIR}/${UBOOT_INITIAL_ENV}-${MACHINE}-${type}-${PV}-${PR}
> + cd ${DEPLOYDIR}
> + ln -sf ${UBOOT_INITIAL_ENV}-${MACHINE}-${type}-${PV}-${PR} ${UBOOT_INITIAL_ENV}-${MACHINE}-${type}
> + ln -sf ${UBOOT_INITIAL_ENV}-${MACHINE}-${type}-${PV}-${PR} ${UBOOT_INITIAL_ENV}-${type}
> + fi
> fi
> done
> unset j
> @@ -298,10 +313,12 @@ do_deploy () {
> ln -sf ${UBOOT_IMAGE} ${UBOOT_BINARY}
>
> # Deploy the uboot-initial-env
> - install -D -m 644 ${B}/u-boot-initial-env ${DEPLOYDIR}/${PN}-initial-env-${MACHINE}-${PV}-${PR}
> - cd ${DEPLOYDIR}
> - ln -sf ${PN}-initial-env-${MACHINE}-${PV}-${PR} ${PN}-initial-env-${MACHINE}
> - ln -sf ${PN}-initial-env-${MACHINE}-${PV}-${PR} ${PN}-initial-env
> + if [ -n "${UBOOT_INITIAL_ENV}" ]; then
> + install -D -m 644 ${B}/u-boot-initial-env ${DEPLOYDIR}/${UBOOT_INITIAL_ENV}-${MACHINE}-${PV}-${PR}
> + cd ${DEPLOYDIR}
> + ln -sf ${UBOOT_INITIAL_ENV}-${MACHINE}-${PV}-${PR} ${UBOOT_INITIAL_ENV}-${MACHINE}
> + ln -sf ${UBOOT_INITIAL_ENV}-${MACHINE}-${PV}-${PR} ${UBOOT_INITIAL_ENV}
> + fi
> fi
>
> if [ -e ${WORKDIR}/fw_env.config ] ; then
> --
> 2.26.2
>
>
^ permalink raw reply
* Re: remove kernel_setsockopt v4
From: David Miller @ 2020-05-29 20:11 UTC (permalink / raw)
To: hch
Cc: kuba, vyasevich, nhorman, marcelo.leitner, David.Laight,
linux-sctp, linux-kernel, cluster-devel, netdev
In-Reply-To: <20200529120943.101454-1-hch@lst.de>
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 29 May 2020 14:09:39 +0200
> now that only the dlm calls to sctp are left for kernel_setsockopt,
> while we haven't really made much progress with the sctp setsockopt
> refactoring, how about this small series that splits out a
> sctp_setsockopt_bindx_kernel that takes a kernel space address array
> to share more code as requested by Marcelo. This should fit in with
> whatever variant of the refator of sctp setsockopt we go with, but
> just solved the immediate problem for now.
...
Series applied, thanks.
^ permalink raw reply
* [Cluster-devel] remove kernel_setsockopt v4
From: David Miller @ 2020-05-29 20:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
In-Reply-To: <20200529120943.101454-1-hch@lst.de>
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 29 May 2020 14:09:39 +0200
> now that only the dlm calls to sctp are left for kernel_setsockopt,
> while we haven't really made much progress with the sctp setsockopt
> refactoring, how about this small series that splits out a
> sctp_setsockopt_bindx_kernel that takes a kernel space address array
> to share more code as requested by Marcelo. This should fit in with
> whatever variant of the refator of sctp setsockopt we go with, but
> just solved the immediate problem for now.
...
Series applied, thanks.
^ permalink raw reply
* [PATCH] iommu/amd: Fix event counter availability check
From: Alexander Monakov @ 2020-05-29 20:07 UTC (permalink / raw)
To: linux-kernel; +Cc: Alexander Monakov, iommu
The driver performs an extra check if the IOMMU's capabilities advertise
presence of performance counters: it verifies that counters are writable
by writing a hard-coded value to a counter and testing that reading that
counter gives back the same value.
Unfortunately it does so quite early, even before pci_enable_device is
called for the IOMMU, i.e. when accessing its MMIO space is not
guaranteed to work. On Ryzen 4500U CPU, this actually breaks the test:
the driver assumes the counters are not writable, and disables the
functionality.
Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves
the issue. This is the earliest point in amd_iommu_init_pci where the
call succeeds on my laptop.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: iommu@lists.linux-foundation.org
---
PS. I'm seeing another hiccup with IOMMU probing on my system:
pci 0000:00:00.2: can't derive routing for PCI INT A
pci 0000:00:00.2: PCI INT A: not connected
Hopefully I can figure it out, but I'd appreciate hints.
drivers/iommu/amd_iommu_init.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5b81fd16f5fa..1b7ec6b6a282 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1788,8 +1788,6 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE))
amd_iommu_np_cache = true;
- init_iommu_perf_ctr(iommu);
-
if (is_rd890_iommu(iommu->dev)) {
int i, j;
@@ -1891,8 +1889,10 @@ static int __init amd_iommu_init_pci(void)
init_device_table_dma();
- for_each_iommu(iommu)
+ for_each_iommu(iommu) {
iommu_flush_all_caches(iommu);
+ init_iommu_perf_ctr(iommu);
+ }
if (!ret)
print_iommu_info();
base-commit: 75caf310d16cc5e2f851c048cd597f5437013368
--
2.26.2
^ permalink raw reply related
* [GIT PULL] Block fixes for 5.7 final
From: Jens Axboe @ 2020-05-29 20:11 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-block@vger.kernel.org
Hi Linus,
Two small fixes:
- Revert a block change that mixed up the return values for non-mq
devices
- NVMe poll race fix
Please pull!
git://git.kernel.dk/linux-block.git tags/block-5.7-2020-05-29
----------------------------------------------------------------
Dongli Zhang (1):
nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()
Jens Axboe (2):
Merge branch 'nvme-5.7' of git://git.infradead.org/nvme into block-5.7
Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT"
block/blk-core.c | 11 ++++-------
drivers/nvme/host/pci.c | 11 +++++++----
2 files changed, 11 insertions(+), 11 deletions(-)
--
Jens Axboe
^ permalink raw reply
* [PATCH] iommu/amd: Fix event counter availability check
From: Alexander Monakov @ 2020-05-29 20:07 UTC (permalink / raw)
To: linux-kernel
Cc: Alexander Monakov, Joerg Roedel, Suravee Suthikulpanit, iommu
The driver performs an extra check if the IOMMU's capabilities advertise
presence of performance counters: it verifies that counters are writable
by writing a hard-coded value to a counter and testing that reading that
counter gives back the same value.
Unfortunately it does so quite early, even before pci_enable_device is
called for the IOMMU, i.e. when accessing its MMIO space is not
guaranteed to work. On Ryzen 4500U CPU, this actually breaks the test:
the driver assumes the counters are not writable, and disables the
functionality.
Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves
the issue. This is the earliest point in amd_iommu_init_pci where the
call succeeds on my laptop.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: iommu@lists.linux-foundation.org
---
PS. I'm seeing another hiccup with IOMMU probing on my system:
pci 0000:00:00.2: can't derive routing for PCI INT A
pci 0000:00:00.2: PCI INT A: not connected
Hopefully I can figure it out, but I'd appreciate hints.
drivers/iommu/amd_iommu_init.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5b81fd16f5fa..1b7ec6b6a282 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1788,8 +1788,6 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE))
amd_iommu_np_cache = true;
- init_iommu_perf_ctr(iommu);
-
if (is_rd890_iommu(iommu->dev)) {
int i, j;
@@ -1891,8 +1889,10 @@ static int __init amd_iommu_init_pci(void)
init_device_table_dma();
- for_each_iommu(iommu)
+ for_each_iommu(iommu) {
iommu_flush_all_caches(iommu);
+ init_iommu_perf_ctr(iommu);
+ }
if (!ret)
print_iommu_info();
base-commit: 75caf310d16cc5e2f851c048cd597f5437013368
--
2.26.2
^ permalink raw reply related
* Re: Some -serious- BPF-related litmus tests
From: Andrii Nakryiko @ 2020-05-29 20:10 UTC (permalink / raw)
To: Joel Fernandes
Cc: Boqun Feng, Andrii Nakryiko, Paul E . McKenney, Alan Stern,
Peter Zijlstra, parri.andrea, will, npiggin, dhowells, j.alglave,
luc.maranget, Akira Yokosawa, dlustig, open list, linux-arch
In-Reply-To: <20200529172301.GB196085@google.com>
On Fri, May 29, 2020 at 10:23 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Thu, May 28, 2020 at 09:38:35PM -0700, Andrii Nakryiko wrote:
> > On Thu, May 28, 2020 at 2:48 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> > >
> > > On Mon, May 25, 2020 at 11:38:23AM -0700, Andrii Nakryiko wrote:
> > > > On Mon, May 25, 2020 at 7:53 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> > > > >
> > > > > Hi Andrii,
> > > > >
> > > > > On Fri, May 22, 2020 at 12:38:21PM -0700, Andrii Nakryiko wrote:
> > > > > > On 5/22/20 10:43 AM, Paul E. McKenney wrote:
> > > > > > > On Fri, May 22, 2020 at 10:32:01AM -0400, Alan Stern wrote:
> > > > > > > > On Fri, May 22, 2020 at 11:44:07AM +0200, Peter Zijlstra wrote:
> > > > > > > > > On Thu, May 21, 2020 at 05:38:50PM -0700, Paul E. McKenney wrote:
> > > > > > > > > > Hello!
> > > > > > > > > >
> > > > > > > > > > Just wanted to call your attention to some pretty cool and pretty serious
> > > > > > > > > > litmus tests that Andrii did as part of his BPF ring-buffer work:
> > > > > > > > > >
> > > > > > > > > > https://lore.kernel.org/bpf/20200517195727.279322-3-andriin@fb.com/
> > > > > > > > > >
> > > > > > > > > > Thoughts?
> > > > > > > > >
> > > > > > > > > I find:
> > > > > > > > >
> > > > > > > > > smp_wmb()
> > > > > > > > > smp_store_release()
> > > > > > > > >
> > > > > > > > > a _very_ weird construct. What is that supposed to even do?
> > > > > > > >
> > > > > > > > Indeed, it looks like one or the other of those is redundant (depending
> > > > > > > > on the context).
> > > > > > >
> > > > > > > Probably. Peter instead asked what it was supposed to even do. ;-)
> > > > > >
> > > > > > I agree, I think smp_wmb() is redundant here. Can't remember why I thought
> > > > > > that it's necessary, this algorithm went through a bunch of iterations,
> > > > > > starting as completely lockless, also using READ_ONCE/WRITE_ONCE at some
> > > > > > point, and settling on smp_read_acquire/smp_store_release, eventually. Maybe
> > > > > > there was some reason, but might be that I was just over-cautious. See reply
> > > > > > on patch thread as well ([0]).
> > > > > >
> > > > > > [0] https://lore.kernel.org/bpf/CAEf4Bza26AbRMtWcoD5+TFhnmnU6p5YJ8zO+SoAJCDtp1jVhcQ@mail.gmail.com/
> > > > > >
> > > > >
> > > > > While we are at it, could you explain a bit on why you use
> > > > > smp_store_release() on consumer_pos? I ask because IIUC, consumer_pos is
> > > > > only updated at consumer side, and there is no other write at consumer
> > > > > side that we want to order with the write to consumer_pos. So I fail
> > > > > to find why smp_store_release() is necessary.
> > > > >
> > > > > I did the following modification on litmus tests, and I didn't see
> > > > > different results (on States) between two versions of litmus tests.
> > > > >
> > > >
> > > > This is needed to ensure that producer can reliably detect whether it
> > > > needs to trigger poll notification.
> > >
> > > Boqun's question is on the consumer side though. Are you saying that on the
> > > consumer side, the loads prior to the smp_store_release() on the consumer
> > > side should have been seen by the consumer? You are already using
> > > smp_load_acquire() so that should be satisified already because the
> > > smp_load_acquire() makes sure that the smp_load_acquire()'s happens before
> > > any future loads and stores.
> >
> > Consumer is reading two things: producer_pos and each record's length
> > header, and writes consumer_pos. I re-read this paragraph many times,
> > but I'm still a bit confused on what exactly you are trying to say.
>
> This is what I was saying in the other thread. I think you missed that
> comment. If you are adding litmus documentation, at least it should be clear
> what memory ordering is being verified. Both me and Boqun tried to remove a
> memory barrier each and the test still passes. So what exactly are you
> verifying from a memory consistency standpoint? I know you have those various
> rFail things and conditions - but I am assuming the goal here is to verify
> memory consistency as well. Or are we just throwing enough memory barriers at
> the problem to make sure the test passes, without understanding exactly what
> ordering is needed?
High-level goal was to verify that producers and consumer don't see
intermediate states they are not supposed to and overall the flow of
records is correct. It wasn't an explicit goal for me to find the
absolute minimal/weakest memory ordering that make this work. I did my
best to write invariants in such a way as to capture violations, but
I'm sure it won't catch 100% of possible problems unfortunately. E.g.,
if busy bit (len = -1 part) ordering is buggy, I didn't find a perfect
way to differentiate between consumer being stuck because record is
"busy" or because consumer (which is in no way serialized with
producers) "ran sooner" and just didn't see the record being committed
yet. But on the other hand, it did capture few subtle issues, which
made writing these litmus tests worthwhile nevertheless :)
I'm sure litmus tests can be improved and expanded, but I tried to
strike a balance between practicality and perfection.
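To make the busy-bit ambiguity above concrete, here is a minimal userspace sketch using C11 atomics. It is not the BPF ring buffer code: the names `struct record`, `REC_BUSY`, `record_commit()` and `record_try_consume()` are made up for illustration, and the model is reduced to a single record.

```c
#include <assert.h>
#include <stdatomic.h>

#define REC_BUSY (-1)	/* hypothetical marker: reserved, not yet committed */

struct record {
	atomic_int len;	/* REC_BUSY until the producer commits */
	int payload;
};

/* Producer: fill the payload, then publish the length with release semantics. */
static void record_commit(struct record *r, int value, int len)
{
	r->payload = value;
	atomic_store_explicit(&r->len, len, memory_order_release);
}

/*
 * Consumer: returns 1 and stores the payload in *out if the record is
 * committed, 0 if it is still marked busy. The acquire load pairs with
 * the release store in record_commit(), so a consumer that observes
 * len != REC_BUSY also observes the payload written before it.
 */
static int record_try_consume(struct record *r, int *out)
{
	int len = atomic_load_explicit(&r->len, memory_order_acquire);

	if (len == REC_BUSY)
		return 0;
	*out = r->payload;
	return 1;
}
```

A return value of 0 is exactly the case that is hard to classify: from the consumer's side, a record stuck in the busy state and a record that simply has not been committed yet look the same.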
>
> > Can you please specify in each case release()/acquire() of which
> > variable you are talking about?
>
> I don't want to speculate and confuse the thread more. I am afraid the burden
> of specifying what the various release/acquire orders is on the author of the
> code introducing the memory barriers ;-). That is, IMHO you should probably add
> code comments in the test about why a certain memory barrier is needed.
Sure, I'll follow up with more comments clarifying this. I was
genuinely trying to understand all those ordering implications you
were trying to describe, it's a tricky business, unfortunately.
>
> That said, I need to do more diligence and read the actual BPF ring buffer
> code to understand what you're modeling. I will try to make time to do that.
Great, thanks!
>
> thanks!
>
> - Joel
>
^ permalink raw reply
* [PATCH] dmaengine: fsl-edma: fix wrong tcd endianness for big-endian cpu
From: Angelo Dureghello @ 2020-05-29 20:15 UTC (permalink / raw)
To: vkoul; +Cc: dmaengine, peng.ma, maowenan, yibin.gong, festevam,
Angelo Dureghello
Due to recent fixes in m68k arch-specific I/O accessor macros, this
driver is not working anymore for ColdFire. Fix wrong tcd endianness
by removing additional swaps, since edma_xxx() functions should already
take care of any eventual swap when needed.
Note, I could only test the change on ColdFire mcf54415 and Vybrid
vf50 / Colibri, where I don't see any issue. So any feedback and
testing on the other SoCs involved is really appreciated.
Signed-off-by: Angelo Dureghello <angelo.dureghello@timesys.com>
---
drivers/dma/fsl-edma-common.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
index 5697c3622699..2008d0cedb66 100644
--- a/drivers/dma/fsl-edma-common.c
+++ b/drivers/dma/fsl-edma-common.c
@@ -353,25 +353,24 @@ static void fsl_edma_set_tcd_regs(struct fsl_edma_chan *fsl_chan,
* TCD parameters are stored in struct fsl_edma_hw_tcd in little
* endian format. However, we need to load the TCD registers in
* big- or little-endian obeying the eDMA engine model endian.
+ * The swap, when needed, is performed from edma_xxx() functions.
*/
edma_writew(edma, 0, &regs->tcd[ch].csr);
- edma_writel(edma, le32_to_cpu(tcd->saddr), &regs->tcd[ch].saddr);
- edma_writel(edma, le32_to_cpu(tcd->daddr), &regs->tcd[ch].daddr);
+ edma_writel(edma, tcd->saddr, &regs->tcd[ch].saddr);
+ edma_writel(edma, tcd->daddr, &regs->tcd[ch].daddr);
- edma_writew(edma, le16_to_cpu(tcd->attr), &regs->tcd[ch].attr);
- edma_writew(edma, le16_to_cpu(tcd->soff), &regs->tcd[ch].soff);
+ edma_writew(edma, tcd->attr, &regs->tcd[ch].attr);
+ edma_writew(edma, tcd->soff, &regs->tcd[ch].soff);
- edma_writel(edma, le32_to_cpu(tcd->nbytes), &regs->tcd[ch].nbytes);
- edma_writel(edma, le32_to_cpu(tcd->slast), &regs->tcd[ch].slast);
+ edma_writel(edma, tcd->nbytes, &regs->tcd[ch].nbytes);
+ edma_writel(edma, tcd->slast, &regs->tcd[ch].slast);
- edma_writew(edma, le16_to_cpu(tcd->citer), &regs->tcd[ch].citer);
- edma_writew(edma, le16_to_cpu(tcd->biter), &regs->tcd[ch].biter);
- edma_writew(edma, le16_to_cpu(tcd->doff), &regs->tcd[ch].doff);
+ edma_writew(edma, tcd->citer, &regs->tcd[ch].citer);
+ edma_writew(edma, tcd->biter, &regs->tcd[ch].biter);
+ edma_writew(edma, tcd->doff, &regs->tcd[ch].doff);
+ edma_writel(edma, tcd->dlast_sga, &regs->tcd[ch].dlast_sga);
- edma_writel(edma, le32_to_cpu(tcd->dlast_sga),
- &regs->tcd[ch].dlast_sga);
-
- edma_writew(edma, le16_to_cpu(tcd->csr), &regs->tcd[ch].csr);
+ edma_writew(edma, tcd->csr, &regs->tcd[ch].csr);
}
static inline
--
2.26.2
^ permalink raw reply related
* Re: [PATCH 1/4] lkdtm: Avoid more compiler optimizations for bad writes
From: Nick Desaulniers @ 2020-05-29 20:10 UTC (permalink / raw)
To: Kees Cook
Cc: Greg Kroah-Hartman, Prasad Sodagudi, Sami Tolvanen,
Amit Daniel Kachhap, open list:KERNEL SELFTEST FRAMEWORK,
clang-built-linux, LKML
In-Reply-To: <20200529200347.2464284-2-keescook@chromium.org>
On Fri, May 29, 2020 at 1:03 PM Kees Cook <keescook@chromium.org> wrote:
>
> It seems at least Clang is able to throw away writes it knows are
> destined for read-only memory, which makes things like the WRITE_RO test
> fail, as the write gets elided. Instead, force the variable to be
Heh, yep. I recall the exact patch in LLVM causing build breakages
for kernels and various parts of Android userspace within the past
year, for code that tried to write to variables declared const through
casts that removed the const. (Was the last patch for us to build MIPS
IIRC). Doing so is explicitly UB. I did feel that that particular
"optimization" was very specific to C/C++, and should not have been
performed in LLVM (which should be more agnostic to the front end
language's wacky rules, IMO) but rather Clang (which doesn't do much
C/C++ language specific optimizations currently, though there are
rough plans forming to change that).
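As a small illustration of what the volatile qualifier buys here, a sketch in plain C follows. `force_xor_write()` is a hypothetical helper, not something from lkdtm. Every access through a volatile-qualified lvalue counts as an observable side effect, so the compiler has to emit the load and the store rather than reasoning them away. The sketch only ever targets ordinary writable memory; modifying an object that was defined const is still undefined behaviour as far as the C standard is concerned, volatile or not.

```c
#include <assert.h>

/*
 * Hypothetical helper: XOR a mask into *ptr through a volatile-qualified
 * pointer. The read-modify-write is a volatile access, so the compiler
 * must perform it as written.
 */
static void force_xor_write(volatile unsigned long *ptr, unsigned long mask)
{
	*ptr ^= mask;
}
```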
> volatile, and make similar changes through-out other tests in an effort
> to avoid needing to repeat fixing these kinds of problems. Also includes
> pr_err() calls in failure paths so that kernel logs are more clear in
> the failure case.
>
> Reported-by: Prasad Sodagudi <psodagud@codeaurora.org>
> Suggested-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
> drivers/misc/lkdtm/bugs.c | 11 +++++------
> drivers/misc/lkdtm/perms.c | 22 +++++++++++++++-------
> drivers/misc/lkdtm/usercopy.c | 7 +++++--
> 3 files changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
> index 886459e0ddd9..e1b43f615549 100644
> --- a/drivers/misc/lkdtm/bugs.c
> +++ b/drivers/misc/lkdtm/bugs.c
> @@ -118,9 +118,8 @@ noinline void lkdtm_CORRUPT_STACK(void)
> /* Use default char array length that triggers stack protection. */
> char data[8] __aligned(sizeof(void *));
>
> - __lkdtm_CORRUPT_STACK(&data);
> -
> - pr_info("Corrupted stack containing char array ...\n");
> + pr_info("Corrupting stack containing char array ...\n");
> + __lkdtm_CORRUPT_STACK((void *)&data);
> }
>
> /* Same as above but will only get a canary with -fstack-protector-strong */
> @@ -131,9 +130,8 @@ noinline void lkdtm_CORRUPT_STACK_STRONG(void)
> unsigned long *ptr;
> } data __aligned(sizeof(void *));
>
> - __lkdtm_CORRUPT_STACK(&data);
> -
> - pr_info("Corrupted stack containing union ...\n");
> + pr_info("Corrupting stack containing union ...\n");
> + __lkdtm_CORRUPT_STACK((void *)&data);
> }
>
> void lkdtm_UNALIGNED_LOAD_STORE_WRITE(void)
> @@ -248,6 +246,7 @@ void lkdtm_ARRAY_BOUNDS(void)
>
> kfree(not_checked);
> kfree(checked);
> + pr_err("FAIL: survived array bounds overflow!\n");
> }
>
> void lkdtm_CORRUPT_LIST_ADD(void)
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 62f76d506f04..2dede2ef658f 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -57,6 +57,7 @@ static noinline void execute_location(void *dst, bool write)
> }
> pr_info("attempting bad execution at %px\n", func);
> func();
> + pr_err("FAIL: func returned\n");
> }
>
> static void execute_user_location(void *dst)
> @@ -75,20 +76,22 @@ static void execute_user_location(void *dst)
> return;
> pr_info("attempting bad execution at %px\n", func);
> func();
> + pr_err("FAIL: func returned\n");
> }
>
> void lkdtm_WRITE_RO(void)
> {
> - /* Explicitly cast away "const" for the test. */
> - unsigned long *ptr = (unsigned long *)&rodata;
> + /* Explicitly cast away "const" for the test and make volatile. */
> + volatile unsigned long *ptr = (unsigned long *)&rodata;
>
> pr_info("attempting bad rodata write at %px\n", ptr);
> *ptr ^= 0xabcd1234;
> + pr_err("FAIL: survived bad write\n");
> }
>
> void lkdtm_WRITE_RO_AFTER_INIT(void)
> {
> - unsigned long *ptr = &ro_after_init;
> + volatile unsigned long *ptr = &ro_after_init;
>
> /*
> * Verify we were written to during init. Since an Oops
> @@ -102,19 +105,21 @@ void lkdtm_WRITE_RO_AFTER_INIT(void)
>
> pr_info("attempting bad ro_after_init write at %px\n", ptr);
> *ptr ^= 0xabcd1234;
> + pr_err("FAIL: survived bad write\n");
> }
>
> void lkdtm_WRITE_KERN(void)
> {
> size_t size;
> - unsigned char *ptr;
> + volatile unsigned char *ptr;
>
> size = (unsigned long)do_overwritten - (unsigned long)do_nothing;
> ptr = (unsigned char *)do_overwritten;
>
> pr_info("attempting bad %zu byte write at %px\n", size, ptr);
> - memcpy(ptr, (unsigned char *)do_nothing, size);
> + memcpy((void *)ptr, (unsigned char *)do_nothing, size);
> flush_icache_range((unsigned long)ptr, (unsigned long)(ptr + size));
> + pr_err("FAIL: survived bad write\n");
>
> do_overwritten();
> }
> @@ -193,9 +198,11 @@ void lkdtm_ACCESS_USERSPACE(void)
> pr_info("attempting bad read at %px\n", ptr);
> tmp = *ptr;
> tmp += 0xc0dec0de;
> + pr_err("FAIL: survived bad read\n");
>
> pr_info("attempting bad write at %px\n", ptr);
> *ptr = tmp;
> + pr_err("FAIL: survived bad write\n");
>
> vm_munmap(user_addr, PAGE_SIZE);
> }
> @@ -203,19 +210,20 @@ void lkdtm_ACCESS_USERSPACE(void)
> void lkdtm_ACCESS_NULL(void)
> {
> unsigned long tmp;
> - unsigned long *ptr = (unsigned long *)NULL;
> + volatile unsigned long *ptr = (unsigned long *)NULL;
>
> pr_info("attempting bad read at %px\n", ptr);
> tmp = *ptr;
> tmp += 0xc0dec0de;
> + pr_err("FAIL: survived bad read\n");
>
> pr_info("attempting bad write at %px\n", ptr);
> *ptr = tmp;
> + pr_err("FAIL: survived bad write\n");
> }
>
> void __init lkdtm_perms_init(void)
> {
> /* Make sure we can write to __ro_after_init values during __init */
> ro_after_init |= 0xAA;
> -
> }
> diff --git a/drivers/misc/lkdtm/usercopy.c b/drivers/misc/lkdtm/usercopy.c
> index e172719dd86d..b833367a45d0 100644
> --- a/drivers/misc/lkdtm/usercopy.c
> +++ b/drivers/misc/lkdtm/usercopy.c
> @@ -304,19 +304,22 @@ void lkdtm_USERCOPY_KERNEL(void)
> return;
> }
>
> - pr_info("attempting good copy_to_user from kernel rodata\n");
> + pr_info("attempting good copy_to_user from kernel rodata: %px\n",
> + test_text);
> if (copy_to_user((void __user *)user_addr, test_text,
> unconst + sizeof(test_text))) {
> pr_warn("copy_to_user failed unexpectedly?!\n");
> goto free_user;
> }
>
> - pr_info("attempting bad copy_to_user from kernel text\n");
> + pr_info("attempting bad copy_to_user from kernel text: %px\n",
> + vm_mmap);
> if (copy_to_user((void __user *)user_addr, vm_mmap,
> unconst + PAGE_SIZE)) {
> pr_warn("copy_to_user failed, but lacked Oops\n");
> goto free_user;
> }
> + pr_err("FAIL: survived bad copy_to_user()\n");
>
> free_user:
> vm_munmap(user_addr, PAGE_SIZE);
> --
> 2.25.1
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200529200347.2464284-2-keescook%40chromium.org.
--
Thanks,
~Nick Desaulniers
^ permalink raw reply
* [MPTCP] Re: [PATCH v2 mptcp-next] mptcp: add receive buffer auto-tuning
From: Florian Westphal @ 2020-05-29 20:10 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 3011 bytes --]
Christoph Paasch <cpaasch(a)apple.com> wrote:
> > After:
> > ns4 MPTCP -> ns3 (10.0.3.2:10108 ) MPTCP (duration 5417ms) [ OK ]
> > ns4 MPTCP -> ns3 (10.0.3.2:10109 ) TCP (duration 5429ms) [ OK ]
> > ns4 TCP -> ns3 (10.0.3.2:10110 ) MPTCP (duration 5418ms) [ OK ]
> > ns4 MPTCP -> ns3 (dead:beef:3::2:10111) MPTCP (duration 5423ms) [ OK ]
> > ns4 MPTCP -> ns3 (dead:beef:3::2:10112) TCP (duration 5715ms) [ OK ]
> > ns4 TCP -> ns3 (dead:beef:3::2:10113) MPTCP (duration 5415ms) [ OK ]
> > Time: 275 seconds
> >
> > Signed-off-by: Florian Westphal <fw(a)strlen.de>
> > ---
> Need to also handle the fallback to TCP-case. Otherwise there is a
> divide-by-0:
>
> server login: [ 1998.037445] divide error: 0000 [#1] SMP KASAN PTI
> [ 1998.038321] CPU: 3 PID: 1671 Comm: http Kdump: loaded Not tainted 5.7.0-rc6.mptcp #43
> [ 1998.039599] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [ 1998.041526] RIP: 0010:mptcp_recvmsg+0x6ae/0xa40
> [ 1998.042370] Code: 44 89 e0 89 da 4c 8b 44 24 08 45 8d b4 24 40 03 00 00 c1 e0 04 48 8d 0c 50 89 d8 49 8d b8 60 04 00 00 31 d2 29 e8 48 0f af c1 <48> f7 f5 4c 8d 2c 41 e8 66 6b 6a ff 4c 8b 44 24 08 41 8b a8 64
> [ 1998.045394] RSP: 0018:ffff8881174bf968 EFLAGS: 00010206
> [ 1998.046237] RAX: 0000000140764000 RBX: 000000000000b500 RCX: 000000000001c540
> [ 1998.047397] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8291b460
> [ 1998.048564] RBP: 0000000000000000 R08: ffffffff8291b000 R09: ffffffff82f1d0c8
> [ 1998.049724] R10: ffffffff82f1d0c3 R11: fffffbfff05e3a18 R12: 00000000000005b4
> [ 1998.050872] R13: 000000000022b368 R14: 00000000000008f4 R15: ffff888118740000
> [ 1998.052059] FS: 00007f081dd1ea40(0000) GS:ffff88811b980000(0000) knlGS:0000000000000000
> [ 1998.053357] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1998.054189] CR2: 000055b5364a7000 CR3: 0000000115d5e000 CR4: 00000000000006e0
> [ 1998.055231] Call Trace:
> [ 1998.057574] inet_recvmsg+0x207/0x220
> [ 1998.058832] sock_read_iter+0x1fe/0x230
> [ 1998.060656] new_sync_read+0x33a/0x350
> [ 1998.063934] vfs_read+0xbc/0x1b0
> [ 1998.064466] ksys_read+0x11b/0x150
> [ 1998.066975] do_syscall_64+0xbc/0x790
> [ 1998.070940] entry_SYSCALL_64_after_hwframe+0x44/0xa9
WTF? How can this happen? All TCP -> MPTCP tests passed for me.
This would mean we call mptcp_recvmsg *without* going through
either mptcp_finish_connect or mptcp_accept?
> @@ -259,8 +259,10 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
> MPTCP_MIB_MPCAPABLEACTIVEFALLBACK);
> }
>
> - if (mptcp_check_fallback(sk))
> + if (mptcp_check_fallback(sk)) {
> + mptcp_rcv_space_init(mptcp_sk(parent), sk);
> return;
> + }
Oh, wait. That code is not present in net-next.
In that case I will wait until this has propagated to net-next; I assume
I would see the crash with the self test too.
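For what it's worth, the faulting bytes in the trace (<48> f7 f5) decode to a div by %rbp, and RBP is zero in the register dump, which fits a divisor that was never initialised on the fallback path. A minimal sketch of that failure mode follows. It is not the mptcp code: `struct rcv_space`, `rcv_space_init()` and `rcv_space_rate()` are hypothetical names, and the guard in `rcv_space_rate()` is only one possible way to make a skipped init harmless.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for per-socket receive-space state. */
struct rcv_space {
	uint64_t rtt_us;	/* stays 0 until rcv_space_init() runs */
	uint64_t copied;
};

/* Must run on every connection setup path, including fallback. */
static void rcv_space_init(struct rcv_space *s, uint64_t rtt_us)
{
	s->rtt_us = rtt_us ? rtt_us : 1;	/* never leave a zero divisor */
	s->copied = 0;
}

/* Returns bytes per microsecond, or 0 if the state was never initialised. */
static uint64_t rcv_space_rate(const struct rcv_space *s)
{
	if (s->rtt_us == 0)
		return 0;	/* init was skipped: avoid the divide error */
	return s->copied / s->rtt_us;
}
```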
^ permalink raw reply
* Re: mmotm 2020-05-13-20-30 uploaded (objtool warnings)
From: Al Viro @ 2020-05-29 20:08 UTC (permalink / raw)
To: Linus Torvalds
Cc: Josh Poimboeuf, Peter Zijlstra, Christoph Hellwig, Randy Dunlap,
Andrew Morton, Mark Brown, linux-fsdevel,
Linux Kernel Mailing List, Linux-MM, Linux Next Mailing List,
Michal Hocko, mm-commits, Stephen Rothwell,
the arch/x86 maintainers, Steven Rostedt
In-Reply-To: <CAHk-=wi7xda+zM=iRGXWbU9i8S7kbNaSfPhXVXR-vK6uEFNx_w@mail.gmail.com>
On Fri, May 29, 2020 at 12:31:04PM -0700, Linus Torvalds wrote:
> On Fri, May 29, 2020 at 9:50 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > From staring at the asm I think the generated code is correct, it's just
> > that the nested likelys with ftrace profiling cause GCC to converge the
> > error/success paths. But objtool doesn't do register value tracking so
> > it's not smart enough to know that it's safe.
>
> I'm surprised that gcc doesn't end up doing the obvious CSE and then
> branch following and folding it all away in the end, but your patch is
> obviously the right thing to do regardless, so ack on that.
>
> Al - I think this had best go into your uaccess cleanup branch with
> that csum-wrapper update, to avoid any unnecessary conflicts or
> dependencies.
Sure, just let me verify that other branches don't introduce anything
of that sort...
^ permalink raw reply
* Re: [PATCH 3/3] power: supply: max17040: Set rcomp value
From: Jonathan Bakker @ 2020-05-29 20:09 UTC (permalink / raw)
To: Sebastian Reichel; +Cc: linux-pm, linux-kernel, robh+dt, devicetree
In-Reply-To: <20200528170230.62c7jvmyjkhpoykj@earth.universe>
Hi Sebastian,
I'm sorry, I messed up my rebase on top of the low battery alert and it somehow
slipped through my pre-submit checklist.
Before resubmitting, do you want the rcomp changed in any manner (where the
datasheet doesn't specify if it's the full 16 bits or only 8 bits for max17040
but does for the later max17043/max77836 where it's only 8 bits)?
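For reference, the read-modify-write in the patch as posted keeps the low byte of the register and places rcomp in the high byte. A standalone sketch of just that bit manipulation follows; `rcomp_pack()` and `RCOMP_LOW_BYTE_MASK` are illustrative names (the mask has the same value as MAX17040_RCOMP_MASK in the patch), and the register contents used with it are made-up values, not taken from the datasheet.

```c
#include <assert.h>
#include <stdint.h>

#define RCOMP_LOW_BYTE_MASK 0x00FF	/* same value as MAX17040_RCOMP_MASK */

/* Keep the low byte of the current register value, put rcomp in the high byte. */
static uint16_t rcomp_pack(uint16_t reg, uint8_t rcomp)
{
	reg &= RCOMP_LOW_BYTE_MASK;
	reg |= (uint16_t)((uint16_t)rcomp << 8);
	return reg;
}
```

If the full 16 bits turn out to be the rcomp value on max17040, the mask step would go away and the value would be written as is.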
Thanks and sorry for the issues,
Jonathan
On 2020-05-28 10:02 a.m., Sebastian Reichel wrote:
> Hi,
>
> This patch does not even compile, how did you test it?
>
> -- Sebastian
>
> On Mon, May 04, 2020 at 03:13:00PM -0700, Jonathan Bakker wrote:
>> According to the datasheet (1), the rcomp parameter can
>> vary based on the typical operating temperature and the
>> battery chemistry. If provided, make sure we set it after
>> we reset the chip on boot.
>>
>> 1) https://datasheets.maximintegrated.com/en/ds/MAX17040-MAX17041.pdf
>>
>> Signed-off-by: Jonathan Bakker <xc-racer2@live.ca>
>> ---
>> drivers/power/supply/max17040_battery.c | 33 +++++++++++++++++++++----
>> 1 file changed, 28 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/power/supply/max17040_battery.c b/drivers/power/supply/max17040_battery.c
>> index 48aa44665e2f..f66e2fdc0a8a 100644
>> --- a/drivers/power/supply/max17040_battery.c
>> +++ b/drivers/power/supply/max17040_battery.c
>> @@ -10,6 +10,7 @@
>> #include <linux/init.h>
>> #include <linux/platform_device.h>
>> #include <linux/mutex.h>
>> +#include <linux/property.h>
>> #include <linux/err.h>
>> #include <linux/i2c.h>
>> #include <linux/delay.h>
>> @@ -31,6 +32,8 @@
>>
>> #define MAX17040_ATHD_MASK 0xFFC0
>> #define MAX17040_ATHD_DEFAULT_POWER_UP 4
>> +#define MAX17040_RCOMP_MASK 0xFF
>> +#define MAX17040_RCOMP_DEFAULT_POWER_UP 0x97
>>
>> struct max17040_chip {
>> struct i2c_client *client;
>> @@ -48,6 +51,8 @@ struct max17040_chip {
>> int status;
>> /* Low alert threshold from 32% to 1% of the State of Charge */
>> u32 low_soc_alert;
>> + /* Optimization for specific chemistries */
>> + u8 rcomp_value;
>> };
>>
>> static int max17040_get_property(struct power_supply *psy,
>> @@ -119,6 +124,20 @@ static int max17040_set_low_soc_alert(struct i2c_client *client, u32 level)
>> return ret;
>> }
>>
>> +static int max17040_set_rcomp(struct i2c_client *client, u32 val)
>> +{
>> + int ret;
>> + u16 data;
>> +
>> + data = max17040_read_reg(client, MAX17040_RCOMP);
>> + /* clear the rcomp val and set MSb 8 bits */
>> + data &= MAX17040_RCOMP_MASK;
>> + data |= val << 8;
>> + ret = max17040_write_reg(client, MAX17040_RCOMP, data);
>> +
>> + return ret;
>> +}
>> +
>> static void max17040_get_vcell(struct i2c_client *client)
>> {
>> struct max17040_chip *chip = i2c_get_clientdata(client);
>> @@ -190,8 +209,14 @@ static int max17040_get_of_data(struct max17040_chip *chip)
>> "maxim,alert-low-soc-level",
>> &chip->low_soc_alert);
>>
>> - if (chip->low_soc_alert <= 0 || chip->low_soc_alert >= 33)
>> + if (chip->low_soc_alert <= 0 || chip->low_soc_alert >= 33) {
>> + dev_err(&client->dev,
>> + "failed: low SOC alert OF data out of bounds\n");
>> return -EINVAL;
>> + }
>> +
>> + chip->rcomp_value = MAX17040_RCOMP_DEFAULT_POWER_UP;
>> + device_property_read_u8(dev, "maxim,rcomp-value", &chip->rcomp_value);
>>
>> return 0;
>> }
>> @@ -289,11 +314,8 @@ static int max17040_probe(struct i2c_client *client,
>> chip->client = client;
>> chip->pdata = client->dev.platform_data;
>> ret = max17040_get_of_data(chip);
>> - if (ret) {
>> - dev_err(&client->dev,
>> - "failed: low SOC alert OF data out of bounds\n");
>> + if (ret)
>> return ret;
>> - }
>>
>> i2c_set_clientdata(client, chip);
>> psy_cfg.drv_data = chip;
>> @@ -307,6 +329,7 @@ static int max17040_probe(struct i2c_client *client,
>>
>> max17040_reset(client);
>> max17040_get_version(client);
>> + max17040_set_rcomp(client, chip->rcomp_value);
>>
>> /* check interrupt */
>> if (client->irq && of_device_is_compatible(client->dev.of_node,
>> --
>> 2.20.1
^ permalink raw reply
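A minimal sketch of the register update under discussion, assuming (as the quoted patch does) that RCOMP lives in the upper 8 bits of the 16-bit register and that the lower byte must be preserved. The helper name `rcomp_merge` is hypothetical and is not part of the driver; whether the max17040 really uses only 8 bits is exactly the open question above.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of max17040_battery.c: merge an 8-bit
 * RCOMP value into the upper byte of a 16-bit register image while
 * keeping the lower byte unchanged.
 */
static uint16_t rcomp_merge(uint16_t reg, uint8_t rcomp)
{
	reg &= 0x00FF;                 /* keep the low byte, clear the old RCOMP */
	reg |= (uint16_t)rcomp << 8;   /* place the new RCOMP in the MSB */
	return reg;
}
```

If the full 16 bits turned out to be RCOMP on the max17040, the mask and shift would have to change, so the 8-bit layout here is an assumption rather than a statement about the hardware.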
* Re: Lost PCIe PME after a914ff2d78ce ("PCI/ASPM: Don't select CONFIG_PCIEASPM by default")
From: Heiner Kallweit @ 2020-05-29 20:09 UTC (permalink / raw)
To: Bjorn Helgaas, Rafael J. Wysocki
Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, linux-kernel,
linux-acpi
In-Reply-To: <2d3944ea-f46c-037b-2395-859c4240f1fb@gmail.com>
On 29.05.2020 21:40, Heiner Kallweit wrote:
> On 29.05.2020 21:21, Bjorn Helgaas wrote:
>> [+cc Rafael, linux-kernel]
>>
>> On Fri, May 29, 2020 at 08:50:46PM +0200, Heiner Kallweit wrote:
>>> On 28.05.2020 23:44, Heiner Kallweit wrote:
>>>> For whatever reason with this change (and losing ASPM control) I also
>>>> lose the PCIe PME interrupts. This prevents my network card from
>>>> resuming from runtime-suspend.
>>>> Reverting the change brings back ASPM control and the PCIe PME irq's.
>>>>
>>>> Affected system is a Zotac MiniPC with a N3450 CPU:
>>>> PCI bridge: Intel Corporation Celeron N3350/Pentium N4200/Atom E3900 Series PCI Express Port A #1 (rev fb)
>>>>
>>> I checked a little bit further and w/o ASPM control the root ports
>>> don't have the PME service bit set in their capabilities.
>>> Not sure whether this is a chipset bug or whether there's a better
>>> explanation. However more chipsets may have such a behavior.
>>
>> Hmm. Is the difference simply changing the PCIEASPM config symbol, or
>> are you booting with command-line arguments like "pcie_aspm=off"?
>>
> Only difference is the config symbol. My command line is plain and simple:
>
> Command line: initrd=\intel-ucode.img initrd=\initramfs-linux.img root=/dev/sda2 rw
>
>> What's the specific PME bit that changes in the root ports? Can you
>> collect the "sudo lspci -vvxxxx" output with and without ASPM?
>>
>> The capability bits are generally read-only as far as the PCI spec is
>> concerned, but devices have implementation-specific knobs that the
>> BIOS may use to change things. Without CONFIG_PCIEASPM, Linux will
>> not request control of LTR, and that could cause the BIOS to change
>> something. You should be able to see the LTR control difference in
>> the dmesg logging about _OSC.
>>
>>> W/o the "default y" for ASPM control we also have the situation now
>>> that the config option description says "When in doubt, say Y."
>>> but it takes the EXPERT mode to enable it. This seems to be a little
>>> bit inconsistent.
>>
>> We should probably remove the "if EXPERT" from the PCIEASPM kconfig.
>> But I would expect PME to work correctly regardless of PCIEASPM, so
>> removing "if EXPERT" doesn't solve the underlying problem.
>>
>> Rafael, does this ring any bells for you? I don't remember a
>> connection between PME and ASPM, but maybe there is one.
>>
>>> To cut a long story short:
>>> At least on some systems this change has unwanted side effects.
>
> lspci output w/ and w/o ASPM is attached incl. a diff.
> Here comes the _OSC difference.
>
> w/o ASPM
>
> [ 0.386063] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig Segments MSI HPX-Type3]
> [ 0.386918] acpi PNP0A08:00: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI]
>
> w/ ASPM
> [ 0.388141] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
> [ 0.393648] acpi PNP0A08:00: _OSC: OS now controls [PME AER PCIeCapability LTR]
>
> It's at least interesting that w/o ASPM OS doesn't control PME and AER.
>
This was the right entry point: even w/o ASPM control, the OS tells ACPI that
it requires ASPM and ClockPM. The following patch fixes the PME issue for me.
See also the _OSC part below.
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 9e235c1a7..8df1fa728 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -38,10 +38,15 @@ static int acpi_pci_root_scan_dependent(struct acpi_device *adev)
return 0;
}
+#ifdef CONFIG_PCIEASPM
#define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
| OSC_PCI_ASPM_SUPPORT \
| OSC_PCI_CLOCK_PM_SUPPORT \
| OSC_PCI_MSI_SUPPORT)
+#else
+#define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
+ | OSC_PCI_MSI_SUPPORT)
+#endif
static const struct acpi_device_id root_device_ids[] = {
{"PNP0A03", 0},
--
2.26.2
[ 0.387527] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig Segments MSI HPX-Type3]
[ 0.393033] acpi PNP0A08:00: _OSC: OS now controls [PME AER PCIeCapability]
^ permalink raw reply related
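The effect of the patch above can be sketched in plain C. This is an illustration only: it assumes the kernel requests OS control only when every bit of the required mask is present in the advertised support mask, which is what the "not requesting OS control; OS requires [...]" log line suggests. The bit values and the helper names `required_support` and `requests_os_control` are placeholders, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration; see the kernel headers for
 * the real OSC_PCI_* definitions. */
#define OSC_PCI_EXT_CONFIG_SUPPORT  0x01u
#define OSC_PCI_ASPM_SUPPORT        0x02u
#define OSC_PCI_CLOCK_PM_SUPPORT    0x04u
#define OSC_PCI_MSI_SUPPORT         0x10u

/* Hypothetical helper: the required mask with and without CONFIG_PCIEASPM,
 * mirroring the #ifdef introduced by the patch. */
static uint32_t required_support(bool aspm_enabled)
{
	uint32_t req = OSC_PCI_EXT_CONFIG_SUPPORT | OSC_PCI_MSI_SUPPORT;

	if (aspm_enabled)
		req |= OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT;
	return req;
}

/* Hypothetical helper: control is requested only if all required bits
 * are part of what the OS advertises as supported. */
static bool requests_os_control(uint32_t support, uint32_t required)
{
	return (support & required) == required;
}
```

With ASPM compiled out, the advertised mask lacks the ASPM and ClockPM bits, so the unpatched required mask can never be satisfied; with the reduced mask the check passes, matching the "OS now controls [PME AER PCIeCapability]" line in the last log.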
* Re: [net-next 09/11] net/mlx5e: kTLS, Add kTLS RX stats
From: Jakub Kicinski @ 2020-05-29 20:09 UTC (permalink / raw)
To: Saeed Mahameed; +Cc: David S. Miller, netdev, Tariq Toukan
In-Reply-To: <20200529194641.243989-10-saeedm@mellanox.com>
On Fri, 29 May 2020 12:46:39 -0700 Saeed Mahameed wrote:
> diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
> index f914e81fd3a64..44c4b19647746 100644
> --- a/Documentation/networking/tls-offload.rst
> +++ b/Documentation/networking/tls-offload.rst
> @@ -428,6 +428,14 @@ by the driver:
> which were part of a TLS stream.
> * ``rx_tls_decrypted_bytes`` - number of TLS payload bytes in RX packets
> which were successfully decrypted.
> + * ``rx_tls_ctx`` - number of TLS RX HW offload contexts added to device for
> + decryption.
> + * ``rx_tls_ooo`` - number of RX packets which were part of a TLS stream
> + but did not arrive in the expected order and triggered the resync procedure.
> + * ``rx_tls_del`` - number of TLS RX HW offload contexts deleted from device
> + (connection has finished).
> + * ``rx_tls_err`` - number of RX packets which were part of a TLS stream
> + but were not decrypted due to unexpected error in the state machine.
> * ``tx_tls_encrypted_packets`` - number of TX packets passed to the device
> for encryption of their TLS payload.
> * ``tx_tls_encrypted_bytes`` - number of TLS payload bytes in TX packets
Stack already has stats for some of these in /proc/net/tls_stat.
Does this really need to be per device?
^ permalink raw reply
* Re: [PATCH v2 3/3] selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD
From: Kees Cook @ 2020-05-29 20:09 UTC (permalink / raw)
To: Sargun Dhillon
Cc: christian.brauner, containers, cyphar, jannh, jeffv, linux-api,
linux-kernel, palmer, rsesek, tycho, Matt Denton
In-Reply-To: <20200529184606.GB11153@ircssh-2.c.rugged-nimbus-611.internal>
On Fri, May 29, 2020 at 06:46:07PM +0000, Sargun Dhillon wrote:
> On Fri, May 29, 2020 at 12:41:51AM -0700, Kees Cook wrote:
> > On Thu, May 28, 2020 at 04:08:58AM -0700, Sargun Dhillon wrote:
> > > + EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, &resp), 0);
> > > +
> > > + nextid = req.id + 1;
> > > +
> > > + /* Wait for getppid to be called for the second time */
> > > + sleep(1);
> >
> > I always rebel at finding "sleep" in tests. ;) Is this needed? IIUC,
> > userspace will immediately see EINPROGRESS after the NOTIF_SEND
> > finishes, yes?
> >
> > Otherwise, yes, this looks good.
> >
> > --
> > Kees Cook
> I'm open to better suggestions, but there's a race where if getppid
> is not called before the second SECCOMP_IOCTL_NOTIF_ADDFD is called,
> you will just get an ENOENT, since the notification ID is not found.
>
> The other approach is to "poll" the child, and wait for it to enter
> the second syscall. Calling receive beforehand doesn't work because
> it moves the state of the notification in the kernel to received,
> and then the kernel doesn't error with EINPROGRESS.
For tests, I prefer polling. How about adding a busy-loop
(with an iteration-bounded small usleep) that just calls
SECCOMP_IOCTL_NOTIF_ID_VALID until it's valid?
--
Kees Cook
^ permalink raw reply
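The polling approach suggested above could look roughly like the following. This is a sketch, not the code that ended up in the selftest: the function name `wait_for_notif_id`, the 1 ms sleep, and the choice of ETIMEDOUT are assumptions made for illustration. It relies on SECCOMP_IOCTL_NOTIF_ID_VALID returning 0 for a valid ID and failing with ENOENT otherwise.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/seccomp.h>

/*
 * Hypothetical helper: poll until the notification ID becomes valid,
 * giving up after max_iters attempts so a broken test cannot hang.
 * Returns 0 once the ID is valid, -1 with errno set otherwise.
 */
static int wait_for_notif_id(int listener, __u64 id, int max_iters)
{
	for (int i = 0; i < max_iters; i++) {
		if (ioctl(listener, SECCOMP_IOCTL_NOTIF_ID_VALID, &id) == 0)
			return 0;
		if (errno != ENOENT)
			return -1;      /* unexpected error, e.g. a bad fd */
		usleep(1000);           /* short sleep between attempts */
	}
	errno = ETIMEDOUT;
	return -1;
}
```

A test would call this with `nextid` in place of the unconditional `sleep(1)`, then issue the second SECCOMP_IOCTL_NOTIF_ADDFD once it returns 0.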
* Re: [OE-core] [PATCH V3 1/3] u-boot: support merging .cfg files for UBOOT_CONFIG
From: Denys Dmytriyenko @ 2020-05-29 20:09 UTC (permalink / raw)
To: Ming Liu; +Cc: openembedded-core, stefan.agner, max.krummenacher, denys,
Ming Liu
In-Reply-To: <20200528124129.15100-2-liu.ming50@gmail.com>
Is this change really required for UBOOT_INITIAL_ENV? I think you are merging
several patch series together?
On Thu, May 28, 2020 at 02:41:27PM +0200, Ming Liu wrote:
> From: Ming Liu <ming.liu@toradex.com>
>
> U-boot recipe supports .cfg files in SRC_URI, but they would be merged
> to .config during do_configure only when UBOOT_MACHINE is set, we
> should also support merging .cfg files for UBOOT_CONFIG.
>
> Signed-off-by: Max Krummenacher <max.krummenacher@toradex.com>
> Signed-off-by: Ming Liu <ming.liu@toradex.com>
> ---
> meta/recipes-bsp/u-boot/u-boot.inc | 21 +++++++++++++++++----
> 1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/meta/recipes-bsp/u-boot/u-boot.inc b/meta/recipes-bsp/u-boot/u-boot.inc
> index 80f828df52..8cfd25020c 100644
> --- a/meta/recipes-bsp/u-boot/u-boot.inc
> +++ b/meta/recipes-bsp/u-boot/u-boot.inc
> @@ -77,7 +77,23 @@ def find_cfgs(d):
> return sources_list
>
> do_configure () {
> - if [ -z "${UBOOT_CONFIG}" ]; then
> + if [ -n "${UBOOT_CONFIG}" ]; then
> + unset i j
> + for config in ${UBOOT_MACHINE}; do
> + i=$(expr $i + 1);
> + for type in ${UBOOT_CONFIG}; do
> + j=$(expr $j + 1);
> + if [ $j -eq $i ]; then
> + oe_runmake -C ${S} O=${B}/${config} ${config}
> + merge_config.sh -m -O ${B}/${config} ${B}/${config}/.config ${@" ".join(find_cfgs(d))}
> + oe_runmake -C ${S} O=${B}/${config} oldconfig
> + fi
> + done
> + unset j
> + done
> + unset i
> + DEVTOOL_DISABLE_MENUCONFIG=true
> + else
> if [ -n "${UBOOT_MACHINE}" ]; then
> oe_runmake -C ${S} O=${B} ${UBOOT_MACHINE}
> else
> @@ -85,8 +101,6 @@ do_configure () {
> fi
> merge_config.sh -m .config ${@" ".join(find_cfgs(d))}
> cml1_do_configure
> - else
> - DEVTOOL_DISABLE_MENUCONFIG=true
> fi
> }
>
> @@ -114,7 +128,6 @@ do_compile () {
> j=$(expr $j + 1);
> if [ $j -eq $i ]
> then
> - oe_runmake -C ${S} O=${B}/${config} ${config}
> oe_runmake -C ${S} O=${B}/${config} ${UBOOT_MAKE_TARGET}
> for binary in ${UBOOT_BINARIES}; do
> k=$(expr $k + 1);
> --
> 2.26.2
>
>
^ permalink raw reply