All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Andersson <andersson@kernel.org>
To: Jingyi Wang <jingyi.wang@oss.qualcomm.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>,
	 Bartosz Golaszewski <brgl@kernel.org>,
	aiqun.yu@oss.qualcomm.com, tingwei.zhang@oss.qualcomm.com,
	 trilok.soni@oss.qualcomm.com, yijie.yang@oss.qualcomm.com,
	linux-arm-msm@vger.kernel.org,  linux-remoteproc@vger.kernel.org,
	devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
	 Mathieu Poirier <mathieu.poirier@linaro.org>,
	Rob Herring <robh@kernel.org>,
	 Krzysztof Kozlowski <krzk+dt@kernel.org>,
	Conor Dooley <conor+dt@kernel.org>,
	 Manivannan Sadhasivam <mani@kernel.org>,
	Luca Weiss <luca.weiss@fairphone.com>,
	 Konrad Dybcio <konradybcio@kernel.org>
Subject: Re: [PATCH v4 5/7] remoteproc: core: set recovery_disabled when doing rproc_add()
Date: Mon, 6 Apr 2026 10:04:03 -0500	[thread overview]
Message-ID: <adPLDWz6_QmBa8w1@baldur> (raw)
In-Reply-To: <ef36a946-ba7d-4588-b94f-4287f3ea6105@oss.qualcomm.com>

On Thu, Mar 19, 2026 at 01:44:48PM +0800, Jingyi Wang wrote:
> 
> 
> On 3/19/2026 1:23 PM, Dmitry Baryshkov wrote:
> > On Thu, Mar 19, 2026 at 12:36:15PM +0800, Jingyi Wang wrote:
> > > 
> > > 
> > > On 3/13/2026 10:37 AM, Dmitry Baryshkov wrote:
> > > > On Wed, Mar 11, 2026 at 01:39:50AM -0700, Bartosz Golaszewski wrote:
> > > > > On Wed, 11 Mar 2026 03:11:42 +0100, Dmitry Baryshkov
> > > > > <dmitry.baryshkov@oss.qualcomm.com> said:
> > > > > > On Tue, Mar 10, 2026 at 06:50:30AM -0700, Bartosz Golaszewski wrote:
> > > > > > > 
> > > > > > > Ideally things like this would be passed to the rproc core in some kind of a
> > > > > > > config structure and only set when registration succeeds. This looks to me
> > > > > > > like papering over the real issue and I think it's still racy as there's no
> > > > > > > true synchronization.
> > > > > > > 
> > > > > > > Wouldn't it be better to take rproc->lock for the entire duration of
> > > > > > > rproc_add()? It's already initialized in rproc_alloc().
> > > > > > 
> > > > > > It would still be racy as rproc_trigger_recovery() is called outside of
> > > > > > the lock. Instead the error cleanup path (and BTW, rproc_del() path too)
> > > > > > must explicitly call cancel_work_sync() on the crash_handler work (and
> > > > > > any other work items that can be scheduled).
> > > > > > 
> > > > > 
> > > > > This looks weird TBH. For example: rproc_crash_handler_work() takes the lock,
> > > > > but releases it right before calling inspecting rproc->recovery_disabled and
> > > > > calling rproc_trigger_recovery(). It looks wrong, I think it should keep the
> > > > > lock and rptoc_trigger_recovery() should enforce it being taken before the
> > > > > call.
> > > > 
> > > > Yes. Nevertheless the driver should cancel the work too.
> > > > 
> > > 
> > > Hi Dmitry & Bartosz,
> > > 
> > > rproc_crash_handler_work() may call rproc_trigger_recovery() and
> > > rproc_add() may call rproc_boot(), both the function have already
> > > hold the lock. And the lock cannot protect resources like glink_subdev
> > > in the patch.
> > > 
> > > And there is a possible case for cancel_work, rproc_add tear down call
> > > cancel work and wait for the work finished, the reboot run successfully,
> > > and the tear down continued and the resources all released, including sysfs
> > > and glink_subdev.
> > > 
> > > Indeed recovery_disabled is kind of hacky.
> > > The root cause for this issue is that for remoteproc with RPROC_OFFLINE
> > > state, the rproc_start will be called asynchronously, but for the remoteproc
> > > with RPROC_DETACHED, the attach function is called directly, the failure
> > > in this path will cause the rproc_add() fail and the resource release.
> > > I think the current patch can be dropped, we are thinking about make rproc_attach
> > > called asynchronously to avoid this race.
> > 
> > Isn't this patch necessary for SoCCP bringup? If not, why did you
> > include it into the series?
> > 
> yes, will squash to soccp patch in next versoin.

I'm sorry, but that doesn't make sense to me.

The SoCCP patch adds support for attaching SoCCP. This change tries to
address a generic problem shared across all remoteproc drivers (that
does attach?).

I think you should interpret Dmitry's comment as "why is this part of
this series, please fix this problem in a separate series".

Regards,
Bjorn

  reply	other threads:[~2026-04-06 15:04 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-10 10:03 [PATCH v4 0/7] Add binding and driver for Kaanapali SoCCP Jingyi Wang
2026-03-10 10:03 ` [PATCH v4 1/7] dt-bindings: remoteproc: qcom: cleanup qcom,adsp.yaml Jingyi Wang
2026-03-11  6:19   ` Krzysztof Kozlowski
2026-03-10 10:03 ` [PATCH v4 2/7] dt-bindings: remoteproc: qcom: move interrupts and interrupt-names list out of pas-common Jingyi Wang
2026-03-11  6:31   ` Krzysztof Kozlowski
2026-03-13  1:38   ` Dmitry Baryshkov
2026-03-19  4:37     ` Jingyi Wang
2026-03-10 10:03 ` [PATCH v4 3/7] dt-bindings: remoteproc: qcom: Add smem properties in documents that reference to pas-common Jingyi Wang
2026-03-11  6:22   ` Krzysztof Kozlowski
2026-03-19  4:38     ` Jingyi Wang
2026-03-10 10:03 ` [PATCH v4 4/7] dt-bindings: remoteproc: qcom: Document pas for SoCCP on Kaanapali and Glymur platforms Jingyi Wang
2026-03-11  2:04   ` Dmitry Baryshkov
2026-03-11  6:26     ` Krzysztof Kozlowski
2026-03-12  4:53       ` Dmitry Baryshkov
2026-03-12 16:08         ` Krzysztof Kozlowski
2026-03-13  1:00           ` Dmitry Baryshkov
2026-03-11  6:32   ` Krzysztof Kozlowski
2026-03-10 10:03 ` [PATCH v4 5/7] remoteproc: core: set recovery_disabled when doing rproc_add() Jingyi Wang
2026-03-10 13:50   ` Bartosz Golaszewski
2026-03-11  2:11     ` Dmitry Baryshkov
2026-03-11  8:39       ` Bartosz Golaszewski
2026-03-13  2:37         ` Dmitry Baryshkov
2026-03-19  4:36           ` Jingyi Wang
2026-03-19  5:23             ` Dmitry Baryshkov
2026-03-19  5:44               ` Jingyi Wang
2026-04-06 15:04                 ` Bjorn Andersson [this message]
2026-04-06 19:29                   ` Dmitry Baryshkov
2026-04-07  2:40                   ` Jingyi Wang
2026-03-11  6:20   ` Krzysztof Kozlowski
2026-03-19  4:38     ` Jingyi Wang
2026-03-10 10:03 ` [PATCH v4 6/7] remoteproc: qcom: pas: Add late attach support for subsystems Jingyi Wang
2026-03-11  8:28   ` Stephan Gerhold
2026-03-19  4:11     ` Jingyi Wang
2026-04-06 14:59   ` Bjorn Andersson
2026-04-07  2:44     ` Jingyi Wang
2026-03-10 10:03 ` [PATCH v4 7/7] remoteproc: qcom_q6v5_pas: Add SoCCP node on Kaanapali Jingyi Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adPLDWz6_QmBa8w1@baldur \
    --to=andersson@kernel.org \
    --cc=aiqun.yu@oss.qualcomm.com \
    --cc=brgl@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=dmitry.baryshkov@oss.qualcomm.com \
    --cc=jingyi.wang@oss.qualcomm.com \
    --cc=konradybcio@kernel.org \
    --cc=krzk+dt@kernel.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-remoteproc@vger.kernel.org \
    --cc=luca.weiss@fairphone.com \
    --cc=mani@kernel.org \
    --cc=mathieu.poirier@linaro.org \
    --cc=robh@kernel.org \
    --cc=tingwei.zhang@oss.qualcomm.com \
    --cc=trilok.soni@oss.qualcomm.com \
    --cc=yijie.yang@oss.qualcomm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.