public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: "Ni, BaoleX" <baolex.ni@intel.com>
Cc: "mingo@redhat.com" <mingo@redhat.com>,
	"acme@kernel.org" <acme@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"alexander.shishkin@linux.intel.com" 
	<alexander.shishkin@linux.intel.com>,
	"Liu, Chuansheng" <chuansheng.liu@intel.com>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: hit a KASan bug related to Perf during stress test
Date: Mon, 24 Oct 2016 11:53:41 +0200
Message-ID: <20161024095341.GF3102@twins.programming.kicks-ass.net>
In-Reply-To: <318B87A793BE164187D8851D6CE09D64371C8811@shsmsx102.ccr.corp.intel.com>

On Mon, Oct 24, 2016 at 09:35:46AM +0000, Ni, BaoleX wrote:
> 
> [32736.018823] BUG: KASan: use after free in task_tgid_nr_ns+0x35/0xb0 at addr ffff8800265568c0
> [32736.028309] Read of size 8 by task dumpsys/11268
> [32736.033511] =============================================================================
> [32736.042700] BUG task_struct (Tainted: G        W  O): kasan: bad access detected

'W' means this wasn't the first WARN you got, so this might be the
result of prior borkage.
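
For readers unfamiliar with the "Tainted:" letters, here is a minimal
user-space sketch of how such a string could be checked for a given flag.
The helper names (taint_meaning, taint_has) and the table are hypothetical
and cover only the letters that appear in this report plus 'P'; the
meanings are paraphrased from the kernel's taint documentation, not taken
from kernel code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table: a subset of kernel taint letters. */
struct taint_flag {
	char letter;
	const char *meaning;
};

static const struct taint_flag taint_flags[] = {
	{ 'P', "proprietary module loaded" },
	{ 'G', "no proprietary module loaded" },
	{ 'B', "bad page or corrupted object reported" },
	{ 'W', "a WARN has already fired" },
	{ 'O', "out-of-tree module loaded" },
};

/* Return the description for a taint letter, or NULL if not in the table. */
const char *taint_meaning(char letter)
{
	size_t i;

	for (i = 0; i < sizeof(taint_flags) / sizeof(taint_flags[0]); i++) {
		if (taint_flags[i].letter == letter)
			return taint_flags[i].meaning;
	}
	return NULL;
}

/* Return 1 if the "Tainted:" string contains the given letter, else 0. */
int taint_has(const char *tainted, char letter)
{
	if (letter == '\0' || letter == ' ')
		return 0;
	return strchr(tainted, letter) != NULL;
}
```

With the strings from this report, taint_has("G        W  O", 'W') is
true on the first BUG line, and the later "G    B   W  O" additionally
carries 'B'.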

Also, it says: "BUG task_struct", does that mean task_struct was the
object accessed after free?

> [32736.051002] -----------------------------------------------------------------------------
> [32736.051002] 
> [32736.061840] Disabling lock debugging due to kernel taint
> [32736.067830] INFO: Slab 0xffffea0000995400 objects=5 used=3 fp=0xffff880026550000 flags=0x4000000000004080
> [32736.078572] INFO: Object 0xffff880026556440 @offset=25664 fp=0x          (null)
> ...
> [32738.776936] CPU: 0 PID: 11268 Comm: dumpsys Tainted: G    B   W  O 3.14.70-x86_64-02260-g162539f #1
> [32738.787092] Hardware name: Insyde CherryTrail/T3 MRD, BIOS CHTMRD.A6.002.016 09/20/2016
> [32738.796082]  ffff880026550000 0000000000000086 0000000000000000 ffff880065e05a70
> [32738.796215]  ffffffff81fc9427 ffff880065803b40 ffff880026556440 ffff880065e05aa0
> [32738.796345]  ffffffff8123fe2d ffff880065803b40 ffffea0000995400 ffff880026556440
> [32738.796475] Call Trace:
> [32738.796510]  <NMI> 
> [32738.796585]  [<ffffffff81fc9427>] dump_stack+0x67/0x90
> [32738.802404]  [<ffffffff8123fe2d>] print_trailer+0xfd/0x170
> [32738.808603]  [<ffffffff81244f26>] object_err+0x36/0x40
> [32738.814417]  [<ffffffff812467ed>] kasan_report_error+0x1fd/0x3d0
> [32738.821193]  [<ffffffff81131b84>] ? __rcu_read_unlock+0x24/0x90
> [32738.827881]  [<ffffffff81fe0888>] ? preempt_count_sub+0x18/0xf0
> [32738.834565]  [<ffffffff811db32c>] ? perf_output_put_handle+0x5c/0x170
> [32738.841833]  [<ffffffff81246e70>] kasan_report+0x40/0x50
> [32738.847838]  [<ffffffff810d9975>] ? task_tgid_nr_ns+0x35/0xb0
> [32738.854327]  [<ffffffff81245d59>] __asan_load8+0x69/0xa0
> [32738.860333]  [<ffffffff811dba18>] ? perf_output_copy+0x88/0x120
> [32738.867020]  [<ffffffff810d9975>] task_tgid_nr_ns+0x35/0xb0

So here we did: perf_event_[pt]id(event, current);

How can _current_ not be valid anymore?

> [32738.873319]  [<ffffffff811cd5d8>] __perf_event_header__init_id+0xb8/0x200
> [32738.880970]  [<ffffffff811d6f19>] perf_prepare_sample+0xa9/0x4a0
> [32738.887754]  [<ffffffff811d7700>] __perf_event_overflow+0x3f0/0x460
> [32738.894835]  [<ffffffff81022998>] ? x86_perf_event_set_period+0x128/0x210
> [32738.902496]  [<ffffffff811d8494>] perf_event_overflow+0x14/0x20
> [32738.909180]  [<ffffffff8102cabc>] intel_pmu_handle_irq+0x25c/0x520
> [32738.916156]  [<ffffffff81245945>] ? __asan_store8+0x15/0xa0
> [32738.922460]  [<ffffffff81fddb8b>] perf_event_nmi_handler+0x2b/0x50
> [32738.929437]  [<ffffffff81fdd4a8>] nmi_handle+0x88/0x230
> [32738.935346]  [<ffffffff81009873>] do_nmi+0x193/0x490
> [32738.940963]  [<ffffffff81fdc6d6>] end_repeat_nmi+0x1a/0x1e
> [32738.947163]  [<ffffffff81245d22>] ? __asan_load8+0x32/0xa0
> [32738.953358]  [<ffffffff81245d22>] ? __asan_load8+0x32/0xa0
> [32738.959554]  [<ffffffff81245d22>] ? __asan_load8+0x32/0xa0
> [32738.965718]  <<EOE>> 
> [32738.965787]  [<ffffffff811065a2>] ? check_preempt_wakeup+0x1a2/0x3a0
> [32738.972970]  [<ffffffff810f4618>] check_preempt_curr+0xf8/0x120
> [32738.979658]  [<ffffffff810f465d>] ttwu_do_wakeup+0x1d/0x1b0
> [32738.985953]  [<ffffffff810f4909>] ttwu_do_activate.constprop.105+0x89/0x90
> [32738.993710]  [<ffffffff810f87fe>] try_to_wake_up+0x29e/0x4e0
> [32739.000100]  [<ffffffff810f8aaf>] default_wake_function+0x2f/0x40
> [32739.006979]  [<ffffffff81114338>] autoremove_wake_function+0x18/0x50
> [32739.014149]  [<ffffffff81fe0888>] ? preempt_count_sub+0x18/0xf0
> [32739.020836]  [<ffffffff81113ab9>] __wake_up_common+0x79/0xb0
> [32739.027232]  [<ffffffff81113d69>] __wake_up+0x39/0x50
> [32739.032945]  [<ffffffff81135918>] __call_rcu_nocb_enqueue+0x158/0x160
> [32739.040207]  [<ffffffff81135a4c>] __call_rcu+0x12c/0x450

And while we just called release_task(), that call_rcu() should still be
pending at this point. Also, I don't think that can be current until
after do_task_dead() where we schedule away from the dead task and
change current.

> [32739.046207]  [<ffffffff81135dcd>] call_rcu+0x1d/0x20
> [32739.051821]  [<ffffffff810ae2da>] release_task+0x6aa/0x8d0
> [32739.058022]  [<ffffffff8111e86f>] ? do_raw_write_unlock+0x6f/0xd0
> [32739.064900]  [<ffffffff810b1002>] do_exit+0xe52/0x1020
> [32739.070712]  [<ffffffff810b1222>] SyS_exit+0x22/0x30
> [32739.076328]  [<ffffffff81fe9063>] sysenter_dispatch+0x7/0x1f
> [32739.082725]  [<ffffffff8152f33b>] ? trace_hardirqs_on_thunk+0x3a/0x3c

Oleg, any idea?
