From: Shivendra Pratap <shivendra.pratap@oss.qualcomm.com>
To: Bjorn Andersson <andersson@kernel.org>
Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>,
Sebastian Reichel <sre@kernel.org>, Rob Herring <robh@kernel.org>,
Sudeep Holla <sudeep.holla@arm.com>,
Souvik Chakravarty <Souvik.Chakravarty@arm.com>,
Krzysztof Kozlowski <krzk+dt@kernel.org>,
Conor Dooley <conor+dt@kernel.org>,
Andy Yan <andy.yan@rock-chips.com>,
Mark Rutland <mark.rutland@arm.com>,
Lorenzo Pieralisi <lpieralisi@kernel.org>,
Arnd Bergmann <arnd@arndb.de>,
Konrad Dybcio <konradybcio@kernel.org>,
cros-qcom-dts-watchers@chromium.org,
Vinod Koul <vkoul@kernel.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Florian Fainelli <florian.fainelli@broadcom.com>,
Moritz Fischer <moritz.fischer@ettus.com>,
John Stultz <john.stultz@linaro.org>,
Matthias Brugger <matthias.bgg@gmail.com>,
Krzysztof Kozlowski <krzk@kernel.org>,
Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>,
Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>,
Stephen Boyd <swboyd@chromium.org>,
Andre Draszik <andre.draszik@linaro.org>,
Kathiravan Thirumoorthy
<kathiravan.thirumoorthy@oss.qualcomm.com>,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-arm-msm@vger.kernel.org,
Elliot Berman <quic_eberman@quicinc.com>,
Srinivas Kandagatla <srini@kernel.org>
Subject: Re: [PATCH v16 01/14] power: reset: reboot-mode: Synchronize list traversal
Date: Wed, 29 Oct 2025 23:18:52 +0530 [thread overview]
Message-ID: <efb52d01-0430-8bdb-e0c7-86c5a2857ef6@oss.qualcomm.com> (raw)
In-Reply-To: <5l2tcjbdtikkhkuhuz64ymk5et6wtl4kwf2mc265su27oh57rt@3shmo3wfx7fb>
On 10/28/2025 9:04 AM, Bjorn Andersson wrote:
> On Wed, Oct 15, 2025 at 10:08:16AM +0530, Shivendra Pratap wrote:
>> List traversals must be synchronized to prevent race conditions
>> and data corruption. The reboot-mode list is not protected by a
>> lock currently, which can lead to concurrent access and race.
>
> Is it a theoretical future race or something that we can hit in the
> current implementation?
>
>>
>> Introduce a mutex lock to guard all operations on the reboot-mode
>> list and ensure thread-safe access. The change prevents unsafe
>> concurrent access on reboot-mode list.
>
> I was under the impression that these lists where created during boot
> and then used at some later point, which at best would bring a
> theoretical window for a race... Reviewing the code supports my
> understanding, but perhaps I'm missing something?
>
>>
>> Fixes: 4fcd504edbf7 ("power: reset: add reboot mode driver")
>> Fixes: ca3d2ea52314 ("power: reset: reboot-mode: better compatibility with DT (replace ' ,/')")
>>
>
> Skip this empty line, please.
>
>
> And given that you have fixes here, I guess this is a problem today. In
> which case, this shouldn't have been carried for 16 versions - but have
> sent and been merged on its own already.
>
> So please, if this is a real issue, start your commit message with a
> descriptive problem description, to make it clear that this needs to be
> merged yesterday - or drop the fixes.
>
>> Signed-off-by: Shivendra Pratap <shivendra.pratap@oss.qualcomm.com>
>> ---
>> drivers/power/reset/reboot-mode.c | 96 +++++++++++++++++++++------------------
>> include/linux/reboot-mode.h | 4 ++
>> 2 files changed, 57 insertions(+), 43 deletions(-)
>>
>> diff --git a/drivers/power/reset/reboot-mode.c b/drivers/power/reset/reboot-mode.c
>> index fba53f638da04655e756b5f8b7d2d666d1379535..8fc3e14638ea757c8dc3808c240ff569cbd74786 100644
>> --- a/drivers/power/reset/reboot-mode.c
>> +++ b/drivers/power/reset/reboot-mode.c
>> @@ -29,9 +29,11 @@ static unsigned int get_reboot_mode_magic(struct reboot_mode_driver *reboot,
>> if (!cmd)
>> cmd = normal;
>>
>> - list_for_each_entry(info, &reboot->head, list)
>> - if (!strcmp(info->mode, cmd))
>> - return info->magic;
>> + scoped_guard(mutex, &reboot->rb_lock) {
>> + list_for_each_entry(info, &reboot->head, list)
>> + if (!strcmp(info->mode, cmd))
>> + return info->magic;
>> + }
>>
>> /* try to match again, replacing characters impossible in DT */
>> if (strscpy(cmd_, cmd, sizeof(cmd_)) == -E2BIG)
>> @@ -41,9 +43,11 @@ static unsigned int get_reboot_mode_magic(struct reboot_mode_driver *reboot,
>> strreplace(cmd_, ',', '-');
>> strreplace(cmd_, '/', '-');
>>
>> - list_for_each_entry(info, &reboot->head, list)
>> - if (!strcmp(info->mode, cmd_))
>> - return info->magic;
>> + scoped_guard(mutex, &reboot->rb_lock) {
>> + list_for_each_entry(info, &reboot->head, list)
>> + if (!strcmp(info->mode, cmd_))
>> + return info->magic;
>> + }
>>
>> return 0;
>> }
>> @@ -78,46 +82,50 @@ int reboot_mode_register(struct reboot_mode_driver *reboot)
>>
>> INIT_LIST_HEAD(&reboot->head);
>>
>> - for_each_property_of_node(np, prop) {
>> - if (strncmp(prop->name, PREFIX, len))
>> - continue;
>> -
>> - info = devm_kzalloc(reboot->dev, sizeof(*info), GFP_KERNEL);
>> - if (!info) {
>> - ret = -ENOMEM;
>> - goto error;
>> - }
>> -
>> - if (of_property_read_u32(np, prop->name, &info->magic)) {
>> - dev_err(reboot->dev, "reboot mode %s without magic number\n",
>> - info->mode);
>> - devm_kfree(reboot->dev, info);
>> - continue;
>> - }
>> -
>> - info->mode = kstrdup_const(prop->name + len, GFP_KERNEL);
>> - if (!info->mode) {
>> - ret = -ENOMEM;
>> - goto error;
>> - } else if (info->mode[0] == '\0') {
>> - kfree_const(info->mode);
>> - ret = -EINVAL;
>> - dev_err(reboot->dev, "invalid mode name(%s): too short!\n",
>> - prop->name);
>> - goto error;
>> + mutex_init(&reboot->rb_lock);
>> +
>> + scoped_guard(mutex, &reboot->rb_lock) {
>
> I don't see how this can race with anything, reboot_mode_register() is
> supposed to be called from some probe function, with reboot_mode_driver
> being a "local" object.
>
> The guard here "protects" &reboot->head, but that is not a shared
> resources at this point.
>
>> + for_each_property_of_node(np, prop) {
>> + if (strncmp(prop->name, PREFIX, len))
>> + continue;
>> +
>> + info = devm_kzalloc(reboot->dev, sizeof(*info), GFP_KERNEL);
>> + if (!info) {
>> + ret = -ENOMEM;
>> + goto error;
>> + }
>> +
>> + if (of_property_read_u32(np, prop->name, &info->magic)) {
>> + dev_err(reboot->dev, "reboot mode %s without magic number\n",
>> + info->mode);
>> + devm_kfree(reboot->dev, info);
>> + continue;
>> + }
>> +
>> + info->mode = kstrdup_const(prop->name + len, GFP_KERNEL);
>> + if (!info->mode) {
>> + ret = -ENOMEM;
>> + goto error;
>> + } else if (info->mode[0] == '\0') {
>> + kfree_const(info->mode);
>> + ret = -EINVAL;
>> + dev_err(reboot->dev, "invalid mode name(%s): too short!\n",
>> + prop->name);
>> + goto error;
>> + }
>> +
>> + list_add_tail(&info->list, &reboot->head);
>> }
>>
>> - list_add_tail(&info->list, &reboot->head);
>> - }
>> -
>> - reboot->reboot_notifier.notifier_call = reboot_mode_notify;
>> - register_reboot_notifier(&reboot->reboot_notifier);
>> + reboot->reboot_notifier.notifier_call = reboot_mode_notify;
>> + register_reboot_notifier(&reboot->reboot_notifier);
>
> Once register_reboot_notifier() has been called, &reboot->head is
> visible outside the specific driver instance.
>
> So, there's no reason to lock in reboot_mode_register().
>
>>
>> - return 0;
>> + return 0;
>>
>> error:
>> - list_for_each_entry(info, &reboot->head, list)
>> - kfree_const(info->mode);
>> + list_for_each_entry(info, &reboot->head, list)
>> + kfree_const(info->mode);
>> + }
>>
>> return ret;
>> }
>> @@ -133,8 +141,10 @@ int reboot_mode_unregister(struct reboot_mode_driver *reboot)
>>
>> unregister_reboot_notifier(&reboot->reboot_notifier);
>>
>> - list_for_each_entry(info, &reboot->head, list)
>> - kfree_const(info->mode);
>> + scoped_guard(mutex, &reboot->rb_lock) {
>
> get_reboot_mode_magic() is only called from reboot_mode_notify(), which
> is only invoked by blocking_notifier_call_chain().
>
> blocking_notifier_call_chain() takes a read semaphore.
> unregister_reboot_notifier() take a write semaphore.
>
> So, if we're racing with a shutdown or reboot, I see two possible
> things:
>
> 1) blocking_notifier_call_chain() happens first and calls
> reboot_mode_notify(), blocking unregister_reboot_notifier(). Once it
> returns, the unregister proceeds and we enter case #2
>
> 2) unregister_reboot_notifier() happens first (or after the
> blocking_notifier_call_chain() returns). Our reboot object is removed
> from the list and blocking_notifier_call_chain() will not invoke
> reboot_mode_notify().
>
> In either case, the list has a single owner here.
>
>
> As far as I can see, the only race left is if multiple concurrent calls
> happens to blocking_notifier_call_chain(), the behavior of
> reboot->write() might be undefined. But I think that is reasonable.
>
>
> Please let me know if I'm missing something.
Thank you for the detailed review. Tried to summarize below:
The mutex lock was introduced in v13 following earlier discussions about
whether the issue was a theoretical future race or something that could
occur in the current implementation.
At the time (prior to v13), we concluded that while no race condition was
observable in the current code, there could be a potential in future
changes or usage patterns — making it a theoretical concern.
During review, there was further discussion around whether it's acceptable
to leave the list unprotected simply because no race is currently suspected.
The general consensus was that it's better practice to protect shared data
structures like lists with a mutex to ensure correctness and future-proofing.
Based on that feedback, the mutex lock introduced in v13.
Later in v15, the reboot-mode maintainer suggested that the patch should
include a Fixes tag, which was incorporated accordingly.
So both the mutex addition in v13 and the Fixes tag in v15 were made in
response to upstream review comments.
Need some guidance on how to take this forward or is it ok to protect all
list operation, as done in this patch and keep the fixes tag as suggested
in earlier reviews?
thanks,
Shivendra
next prev parent reply other threads:[~2025-10-29 17:49 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-15 4:38 [PATCH v16 00/14] Implement vendor resets for PSCI SYSTEM_RESET2 Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 01/14] power: reset: reboot-mode: Synchronize list traversal Shivendra Pratap
2025-10-15 14:32 ` Bartosz Golaszewski
2025-10-17 14:47 ` Shivendra Pratap
2025-10-28 3:34 ` Bjorn Andersson
2025-10-29 17:48 ` Shivendra Pratap [this message]
2025-11-03 8:16 ` Mukesh Ojha
2025-10-15 4:38 ` [PATCH v16 02/14] power: reset: reboot-mode: Add device tree node-based registration Shivendra Pratap
2025-10-15 14:40 ` Bartosz Golaszewski
2025-10-16 17:19 ` Shivendra Pratap
2025-10-17 9:06 ` Bartosz Golaszewski
2025-10-17 14:24 ` Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 03/14] power: reset: reboot-mode: Add support for 64 bit magic Shivendra Pratap
2025-10-15 8:40 ` Nirmesh Kumar Singh
2025-10-15 4:38 ` [PATCH v16 04/14] Documentation: ABI: Add sysfs-class-reboot-mode-reboot_modes Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 05/14] power: reset: reboot-mode: Expose sysfs for registered reboot_modes Shivendra Pratap
2025-10-15 8:45 ` Nirmesh Kumar Singh
2025-10-15 14:47 ` Bartosz Golaszewski
2025-10-17 19:40 ` Shivendra Pratap
2025-10-20 7:40 ` Bartosz Golaszewski
2025-10-22 14:21 ` Shivendra Pratap
2025-10-23 15:02 ` Bartosz Golaszewski
2025-10-23 15:54 ` Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 06/14] dt-bindings: arm: Document reboot mode magic Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 07/14] firmware: psci: Implement vendor-specific resets as reboot-mode Shivendra Pratap
2025-10-15 6:55 ` Pavan Kondeti
2025-10-15 7:47 ` Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 08/14] arm64: dts: qcom: qcm6490-idp: Add PSCI SYSTEM_RESET2 types Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 09/14] arm64: dts: qcom: qcs6490-rb3gen2: " Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 10/14] arm64: dts: qcom: lemans-ride: " Shivendra Pratap
2025-11-03 8:36 ` Mukesh Ojha
2025-10-15 4:38 ` [PATCH v16 11/14] arm64: dts: qcom: lemans-evk: " Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 12/14] arm64: dts: qcom: qcs8300-ride: " Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 13/14] arm64: dts: qcom: monaco-evk: " Shivendra Pratap
2025-10-15 4:38 ` [PATCH v16 14/14] arm64: dts: qcom: qcs615-ride: " Shivendra Pratap
2025-11-07 8:08 ` Xin Liu
2025-11-07 8:00 ` [PATCH v16 00/14] Implement vendor resets for PSCI SYSTEM_RESET2 Xin Liu
2025-11-07 8:10 ` Krzysztof Kozlowski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=efb52d01-0430-8bdb-e0c7-86c5a2857ef6@oss.qualcomm.com \
--to=shivendra.pratap@oss.qualcomm.com \
--cc=Souvik.Chakravarty@arm.com \
--cc=andersson@kernel.org \
--cc=andre.draszik@linaro.org \
--cc=andy.yan@rock-chips.com \
--cc=arnd@arndb.de \
--cc=bartosz.golaszewski@linaro.org \
--cc=catalin.marinas@arm.com \
--cc=conor+dt@kernel.org \
--cc=cros-qcom-dts-watchers@chromium.org \
--cc=devicetree@vger.kernel.org \
--cc=dmitry.baryshkov@oss.qualcomm.com \
--cc=florian.fainelli@broadcom.com \
--cc=john.stultz@linaro.org \
--cc=kathiravan.thirumoorthy@oss.qualcomm.com \
--cc=konradybcio@kernel.org \
--cc=krzk+dt@kernel.org \
--cc=krzk@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=lpieralisi@kernel.org \
--cc=mark.rutland@arm.com \
--cc=matthias.bgg@gmail.com \
--cc=moritz.fischer@ettus.com \
--cc=mukesh.ojha@oss.qualcomm.com \
--cc=quic_eberman@quicinc.com \
--cc=robh@kernel.org \
--cc=sre@kernel.org \
--cc=srini@kernel.org \
--cc=sudeep.holla@arm.com \
--cc=swboyd@chromium.org \
--cc=vkoul@kernel.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox