From: Peter Zijlstra <peterz@infradead.org>
To: "Ni, BaoleX" <baolex.ni@intel.com>
Cc: "mingo@redhat.com" <mingo@redhat.com>,
	"acme@kernel.org" <acme@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"alexander.shishkin@linux.intel.com" 
	<alexander.shishkin@linux.intel.com>,
	"Liu, Chuansheng" <chuansheng.liu@intel.com>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: hit a KASan bug related to Perf during stress test
Date: Mon, 24 Oct 2016 11:53:41 +0200
Message-ID: <20161024095341.GF3102@twins.programming.kicks-ass.net>
In-Reply-To: <318B87A793BE164187D8851D6CE09D64371C8811@shsmsx102.ccr.corp.intel.com>

On Mon, Oct 24, 2016 at 09:35:46AM +0000, Ni, BaoleX wrote:
> 
> [32736.018823] BUG: KASan: use after free in task_tgid_nr_ns+0x35/0xb0 at addr ffff8800265568c0
> [32736.028309] Read of size 8 by task dumpsys/11268
> [32736.033511] =============================================================================
> [32736.042700] BUG task_struct (Tainted: G        W  O): kasan: bad access detected

The 'W' means this wasn't the first WARN you got, so this might be the
result of prior borkage.
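
For readers decoding the "Tainted:" field, here is a minimal sketch in C
that maps the letters seen in this report to their documented meanings.
The names taint_flags, taint_meaning() and taint_has() are hypothetical
helpers for illustration, not kernel APIs, and the table covers only the
four letters that appear in this trace, not the full kernel list.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: only the taint letters that occur in this report. */
struct taint_flag {
	char letter;
	const char *meaning;
};

static const struct taint_flag taint_flags[] = {
	{ 'G', "all loaded modules are GPL-compatible" },
	{ 'B', "bad page referenced or unexpected page flags" },
	{ 'W', "a kernel warning was issued earlier" },
	{ 'O', "an out-of-tree module was loaded" },
};

/* Return the meaning of one taint letter, or NULL if it is not in the table. */
const char *taint_meaning(char letter)
{
	size_t i;

	for (i = 0; i < sizeof(taint_flags) / sizeof(taint_flags[0]); i++) {
		if (taint_flags[i].letter == letter)
			return taint_flags[i].meaning;
	}
	return NULL;
}

/* Return 1 if the text following "Tainted: " contains the given letter. */
int taint_has(const char *tainted, char letter)
{
	return letter != '\0' && strchr(tainted, letter) != NULL;
}
```

Applied to the strings in this report, taint_has("G        W  O", 'W')
is true for the first BUG line, and 'B' only shows up in the later
"CPU: 0 PID: 11268" line.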

Also, it says: "BUG task_struct", does that mean task_struct was the
object accessed after free?

> [32736.051002] -----------------------------------------------------------------------------
> [32736.051002] 
> [32736.061840] Disabling lock debugging due to kernel taint
> [32736.067830] INFO: Slab 0xffffea0000995400 objects=5 used=3 fp=0xffff880026550000 flags=0x4000000000004080
> [32736.078572] INFO: Object 0xffff880026556440 @offset=25664 fp=0x          (null)
> ...
> [32738.776936] CPU: 0 PID: 11268 Comm: dumpsys Tainted: G    B   W  O 3.14.70-x86_64-02260-g162539f #1
> [32738.787092] Hardware name: Insyde CherryTrail/T3 MRD, BIOS CHTMRD.A6.002.016 09/20/2016
> [32738.796082]  ffff880026550000 0000000000000086 0000000000000000 ffff880065e05a70
> [32738.796215]  ffffffff81fc9427 ffff880065803b40 ffff880026556440 ffff880065e05aa0
> [32738.796345]  ffffffff8123fe2d ffff880065803b40 ffffea0000995400 ffff880026556440
> [32738.796475] Call Trace:
> [32738.796510]  <NMI> 
> [32738.796585]  [<ffffffff81fc9427>] dump_stack+0x67/0x90
> [32738.802404]  [<ffffffff8123fe2d>] print_trailer+0xfd/0x170
> [32738.808603]  [<ffffffff81244f26>] object_err+0x36/0x40
> [32738.814417]  [<ffffffff812467ed>] kasan_report_error+0x1fd/0x3d0
> [32738.821193]  [<ffffffff81131b84>] ? __rcu_read_unlock+0x24/0x90
> [32738.827881]  [<ffffffff81fe0888>] ? preempt_count_sub+0x18/0xf0
> [32738.834565]  [<ffffffff811db32c>] ? perf_output_put_handle+0x5c/0x170
> [32738.841833]  [<ffffffff81246e70>] kasan_report+0x40/0x50
> [32738.847838]  [<ffffffff810d9975>] ? task_tgid_nr_ns+0x35/0xb0
> [32738.854327]  [<ffffffff81245d59>] __asan_load8+0x69/0xa0
> [32738.860333]  [<ffffffff811dba18>] ? perf_output_copy+0x88/0x120
> [32738.867020]  [<ffffffff810d9975>] task_tgid_nr_ns+0x35/0xb0

So here we did: perf_event_[pt]id(event, current);

How can _current_ not be valid anymore?
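
As a reading aid for the trace, here is a simplified userspace model of
the lookup that task_tgid_nr_ns() performs. It assumes the 3.14-era
shape of task_tgid(), which reads task->group_leader->pids[PIDTYPE_PID].pid;
the structs below are cut-down stand-ins, namespaces are ignored, and
task_tgid_nr() here is a hypothetical simplification. It only shows which
objects get dereferenced and does not by itself establish the root cause.

```c
#include <stddef.h>

/* Cut-down stand-ins; the real kernel structures carry many more fields. */
struct pid {
	int nr;
};

struct task_struct {
	struct task_struct *group_leader;
	struct pid *pid;	/* stands in for pids[PIDTYPE_PID].pid */
};

/* Same shape as the assumed task_tgid(): the lookup goes via group_leader. */
struct pid *task_tgid(const struct task_struct *task)
{
	return task->group_leader->pid;
}

/* Simplified task_tgid_nr_ns() without namespace handling. */
int task_tgid_nr(const struct task_struct *task)
{
	struct pid *pid = task_tgid(task);

	return pid ? pid->nr : 0;
}
```

Note that even when the argument is current, the load touches a second
task_struct (the group leader) in addition to current itself.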

> [32738.873319]  [<ffffffff811cd5d8>] __perf_event_header__init_id+0xb8/0x200
> [32738.880970]  [<ffffffff811d6f19>] perf_prepare_sample+0xa9/0x4a0
> [32738.887754]  [<ffffffff811d7700>] __perf_event_overflow+0x3f0/0x460
> [32738.894835]  [<ffffffff81022998>] ? x86_perf_event_set_period+0x128/0x210
> [32738.902496]  [<ffffffff811d8494>] perf_event_overflow+0x14/0x20
> [32738.909180]  [<ffffffff8102cabc>] intel_pmu_handle_irq+0x25c/0x520
> [32738.916156]  [<ffffffff81245945>] ? __asan_store8+0x15/0xa0
> [32738.922460]  [<ffffffff81fddb8b>] perf_event_nmi_handler+0x2b/0x50
> [32738.929437]  [<ffffffff81fdd4a8>] nmi_handle+0x88/0x230
> [32738.935346]  [<ffffffff81009873>] do_nmi+0x193/0x490
> [32738.940963]  [<ffffffff81fdc6d6>] end_repeat_nmi+0x1a/0x1e
> [32738.947163]  [<ffffffff81245d22>] ? __asan_load8+0x32/0xa0
> [32738.953358]  [<ffffffff81245d22>] ? __asan_load8+0x32/0xa0
> [32738.959554]  [<ffffffff81245d22>] ? __asan_load8+0x32/0xa0
> [32738.965718]  <<EOE>> 
> [32738.965787]  [<ffffffff811065a2>] ? check_preempt_wakeup+0x1a2/0x3a0
> [32738.972970]  [<ffffffff810f4618>] check_preempt_curr+0xf8/0x120
> [32738.979658]  [<ffffffff810f465d>] ttwu_do_wakeup+0x1d/0x1b0
> [32738.985953]  [<ffffffff810f4909>] ttwu_do_activate.constprop.105+0x89/0x90
> [32738.993710]  [<ffffffff810f87fe>] try_to_wake_up+0x29e/0x4e0
> [32739.000100]  [<ffffffff810f8aaf>] default_wake_function+0x2f/0x40
> [32739.006979]  [<ffffffff81114338>] autoremove_wake_function+0x18/0x50
> [32739.014149]  [<ffffffff81fe0888>] ? preempt_count_sub+0x18/0xf0
> [32739.020836]  [<ffffffff81113ab9>] __wake_up_common+0x79/0xb0
> [32739.027232]  [<ffffffff81113d69>] __wake_up+0x39/0x50
> [32739.032945]  [<ffffffff81135918>] __call_rcu_nocb_enqueue+0x158/0x160
> [32739.040207]  [<ffffffff81135a4c>] __call_rcu+0x12c/0x450

And while we just called release_task(), that call_rcu() should still be
pending at this point. Also, I don't think that can be current until
after do_task_dead(), where we schedule away from the dead task and
change current.

> [32739.046207]  [<ffffffff81135dcd>] call_rcu+0x1d/0x20
> [32739.051821]  [<ffffffff810ae2da>] release_task+0x6aa/0x8d0
> [32739.058022]  [<ffffffff8111e86f>] ? do_raw_write_unlock+0x6f/0xd0
> [32739.064900]  [<ffffffff810b1002>] do_exit+0xe52/0x1020
> [32739.070712]  [<ffffffff810b1222>] SyS_exit+0x22/0x30
> [32739.076328]  [<ffffffff81fe9063>] sysenter_dispatch+0x7/0x1f
> [32739.082725]  [<ffffffff8152f33b>] ? trace_hardirqs_on_thunk+0x3a/0x3c

Oleg, any idea?
