From: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
To: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>,
peterz@infradead.org, linux-arm-msm@vger.kernel.org,
coresight@lists.linaro.org, linux-kernel@vger.kernel.org,
swboyd@chromium.org, denik@google.com, leo.yan@linaro.org,
linux-arm-kernel@lists.infradead.org, mike.leach@linaro.org
Subject: Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
Date: Thu, 22 Oct 2020 16:37:08 +0530
Message-ID: <d05559ad020b46b55eb5b8ec305d946b@codeaurora.org>
In-Reply-To: <fa6cdf34-88a0-1050-b9ea-556d0a9438cb@arm.com>
On 2020-10-22 14:57, Suzuki Poulose wrote:
> On 10/22/20 9:02 AM, Sai Prakash Ranjan wrote:
>> On 2020-10-21 15:38, Suzuki Poulose wrote:
>>> On 10/21/20 8:29 AM, Sai Prakash Ranjan wrote:
>>>> On 2020-10-20 21:40, Sai Prakash Ranjan wrote:
>>>>> On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
>>>>>> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>>>>>>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>>>>>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>>>>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>>>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>>>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>>>>>>> reproducible when the process to monitor is something very
>>>>>>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>>>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>>>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>>>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>>>>>>>
>>>>>>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>>>>>>> ETB are the only places trying to dereference the task(owner)
>>>>>>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>>>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>>>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>>>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>>>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>>>>>>> dereference this NULL pointer and we are safe. So let's do the
>>>>>>>>>
>>>>>>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>>>>>>> not complete. Why is it safe earlier and not later? I believe we are
>>>>>>>>> simply reducing the chances of hitting the issue, by doing this
>>>>>>>>> earlier than later. I would say we better fix all instances to make
>>>>>>>>> sure that the event->owner is valid. (e.g., I can see that for kernel
>>>>>>>>> events event->owner == -1 ?)
>>>>>>>>>
>>>>>>>>>     struct task_struct *tsk = READ_ONCE(event->owner);
>>>>>>>>>
>>>>>>>>>     if (!tsk || is_kernel_event(event))
>>>>>>>>>         /* skip ? */
>>>>>>>>>
>>>>>>>>
>>>>>>>> Looking at it some more, is_kernel_event() is not exposed
>>>>>>>> outside events core and probably for good reason. Why do
>>>>>>>> we need to check for this and not just tsk?
>>>>>>>
>>>>>>> Because the event->owner could be :
>>>>>>>
>>>>>>> = NULL
>>>>>>> = -1UL // kernel event
>>>>>>> = valid.
>>>>>>>
>>>>>>
>>>>>> Yes I understood that part, but here we were trying to
>>>>>> fix the NULL pointer dereference right and hence the
>>>>>> question as to why we need to check for kernel events?
>>>>>> I am no expert in perf but I don't see anywhere in the
>>>>>> kernel checking for is_kernel_event(), so I am a bit
>>>>>> skeptical if exporting that is actually right or not.
>>>>>>
>>>>>
>>>>> I have stress tested with the original patch many times
>>>>> now, i.e., without a check for event->owner and is_kernel_event()
>>>>> and didn't observe any crash. Plus on ETR where this was already
>>>>> done, no crashes were reported till date and with ETF, the issue
>>>>> was quickly reproducible, so I am fairly confident that this
>>>>> doesn't just delay the original issue but actually fixes
>>>>> it. I will run an overnight test again to confirm this.
>>>>>
>>>>
>>>> I ran the overnight test which collected around 4G of data (see below),
>>>> with the following small change to see if the two cases
>>>> (event->owner=NULL and is_kernel_event()) are triggered
>>>> with suggested changes and it didn't trigger at all.
>>>> Do we still need those additional checks?
>>>>
>>>
>>> Yes. Please see perf_event_create_kernel_counter(), which is
>>> an exported function allowing any kernel code (including modules)
>>> to use the PMU (just like the userspace perf tool would do).
>>> Just because your use case doesn't trigger this (because
>>> you don't run something that can trigger this) doesn't mean
>>> this can't be triggered.
>>>
>>
>> Thanks for that pointer, I will add them in the next version.
>>
>
> And instead of redefining TASK_TOMBSTONE in the driver, you
> may simply use IS_ERR_OR_NULL(tsk) to cover both NULL case
> and kernel event.
>
Ugh sorry, sent out v2 exporting is_kernel_event() before seeing
this comment, I will resend.
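
For reference, a minimal userspace sketch of why IS_ERR_OR_NULL() would cover
both cases. It assumes kernel events carry owner == (void *)-1L (the
TASK_TOMBSTONE value mentioned above) and re-creates the err.h macros locally
rather than using the kernel headers. The struct task_struct stand-in and the
owner_pid_or_invalid() helper are hypothetical names used only for
illustration, not the actual driver change.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins modelled on include/linux/err.h (not the kernel headers). */
#define MAX_ERRNO	4095
#define IS_ERR_VALUE(x)	((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	/* True for NULL and for the top MAX_ERRNO values of the address space. */
	return !ptr || IS_ERR_VALUE((unsigned long)ptr);
}

/* Assumption: the perf core marks kernel events with this owner value. */
#define TASK_TOMBSTONE	((void *)-1L)

/* Hypothetical minimal stand-in for the kernel's task_struct. */
struct task_struct {
	int pid;
};

/*
 * Hypothetical helper: return the owner's pid, or -1 when the owner
 * must not be dereferenced (NULL owner or kernel event).
 */
static int owner_pid_or_invalid(const struct task_struct *owner)
{
	if (IS_ERR_OR_NULL(owner))
		return -1;

	return owner->pid;
}
```

Since (void *)-1L falls inside the error-pointer range, a single
IS_ERR_OR_NULL() check rejects both the NULL owner and the kernel-event
owner without the driver needing is_kernel_event() or its own
TASK_TOMBSTONE definition.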
Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel