All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ravi Bangoria <ravi.bangoria@amd.com>
To: James Clark <james.clark@arm.com>,
	linux-perf-users@vger.kernel.org, peterz@infradead.org
Cc: syzbot+697196bc0265049822bd@syzkaller.appspotmail.com,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	linux-kernel@vger.kernel.org,
	Ravi Bangoria <ravi.bangoria@amd.com>,
	Thomas Richter <tmricht@linux.ibm.com>
Subject: Re: [PATCH 1/1] perf: Fix warning from concurrent read/write of perf_event_pmu_context
Date: Mon, 30 Jan 2023 11:19:01 +0530	[thread overview]
Message-ID: <2cb6bb4b-fd04-556e-0790-524bf174cf89@amd.com> (raw)
In-Reply-To: <20230127143141.1782804-2-james.clark@arm.com>

Hi James,

On 27-Jan-23 8:01 PM, James Clark wrote:
> When running two Perf sessions, the following warning can appear:
> 
>   WARNING: CPU: 1 PID: 2245 at kernel/events/core.c:4925 put_pmu_ctx+0x1f0/0x278
>   Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack libcrc32c nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bridge stp llc coresight_stm stm_core coresight_etm4x coresight_tmc coresight_replicator coresight_funnel coresight_tpiu coresight arm_spe_pmu ip_tables x_tables ipv6 xhci_pci xhci_pci_renesas r8169
>   CPU: 1 PID: 2245 Comm: perf Not tainted 6.2.0-rc4+ #1
>   pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>   pc : put_pmu_ctx+0x1f0/0x278
>   lr : put_pmu_ctx+0x1b4/0x278
>   sp : ffff80000dfcbc20
>   x29: ffff80000dfcbca0 x28: ffff008004f00000 x27: ffff00800763a928
>   x26: ffff00800763a928 x25: 00000000000000c0 x24: 0000000000000000
>   x23: 00000000000a0003 x22: ffff00837df74088 x21: ffff80000dfcbd18
>   x20: 0000000000000000 x19: ffff00800763a6c0 x18: 0000000000000000
>   x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>   x14: 0000000000000000 x13: ffff80000dfc8000 x12: ffff80000dfcc000
>   x11: be58ab6d2939e700 x10: be58ab6d2939e700 x9 : 0000000000000000
>   x8 : 0000000000000001 x7 : 0000000000000000 x6 : 0000000000000000
>   x5 : ffff00800093c9c0 x4 : 0000000000000000 x3 : ffff80000dfcbca0
>   x2 : ffff008004f00000 x1 : ffff8000082403c4 x0 : 0000000000000000
>   Call trace:
>    put_pmu_ctx+0x1f0/0x278
>    _free_event+0x2bc/0x3d0
>    perf_event_release_kernel+0x444/0x4bc
>    perf_release+0x20/0x30
>    __fput+0xe4/0x25c
>    ____fput+0x1c/0x28
>    task_work_run+0xc4/0xe8
>    do_notify_resume+0x10c/0x164
>    el0_svc+0xb4/0xdc
>    el0t_64_sync_handler+0x84/0xf0
>    el0t_64_sync+0x190/0x194
> 
> This is because there is no locking around the access of "if
> (!epc->ctx)" in find_get_pmu_context() and when it is set to NULL in
> put_pmu_ctx().
> 
> The decrement of the reference count in put_pmu_ctx() also happens
> outside of the spinlock, leading to the possibility of this order of
> events, and the context being cleared in put_pmu_ctx(), after its
> refcount is non zero:
> 
>  CPU0                                   CPU1
>  find_get_pmu_context()
>    if (!epc->ctx) == false
>                                         put_pmu_ctx()
>                                         atomic_dec_and_test(&epc->refcount) == true
>                                         epc->refcount == 0
>      atomic_inc(&epc->refcount);
>      epc->refcount == 1
>                                         list_del_init(&epc->pmu_ctx_entry);
> 	                                      epc->ctx = NULL;
> 
> Another issue is that WARN_ON for no active PMU events in put_pmu_ctx()
> is outside of the lock. If the perf_event_pmu_context is an embedded
> one, even after clearing it, it won't be deleted and can be re-used. So
> the warning can trigger. For this reason it also needs to be moved
> inside the lock.
> 
> The above warning is very quick to trigger on Arm by running these two
> commands at the same time:
> 
>   while true; do perf record -- ls; done
>   while true; do perf record -- ls; done

These dose not trigger WARN_ON on my x86 machine, however, the C reproducer
provided by syzbot[1] does trigger it.

[1]: https://syzkaller.appspot.com/text?tag=ReproC&x=17beacbc480000

Thanks,
Ravi

  reply	other threads:[~2023-01-30  5:49 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-27 14:31 [PATCH 0/1] perf: Fix warning from concurrent read/write of perf_event_pmu_context James Clark
2023-01-27 14:31 ` [PATCH 1/1] " James Clark
2023-01-30  5:49   ` Ravi Bangoria [this message]
2023-01-31  8:25     ` Ravi Bangoria
2023-01-31 13:31   ` Peter Zijlstra
2023-02-01 10:49     ` James Clark
2023-02-01 16:10   ` [tip: perf/urgent] perf: Fix perf_event_pmu_context serialization tip-bot2 for James Clark

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2cb6bb4b-fd04-556e-0790-524bf174cf89@amd.com \
    --to=ravi.bangoria@amd.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=james.clark@arm.com \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=syzbot+697196bc0265049822bd@syzkaller.appspotmail.com \
    --cc=tmricht@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.