Re: [PATCH 3/7] cxl/region: Hold memdev lock during region poison injection/clear

public inbox for linux-cxl@vger.kernel.org
 help / color / mirror / Atom feed

From: Dan Williams <dan.j.williams@intel.com>
To: Li Ming <ming.li@zohomail.com>,
	Alison Schofield <alison.schofield@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Danilo Krummrich <dakr@kernel.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	"Dave Jiang" <dave.jiang@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	"Ira Weiny" <ira.weiny@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	Ben Cheatham <benjamin.cheatham@amd.com>,
	<driver-core@lists.linux.dev>, <linux-kernel@vger.kernel.org>,
	<linux-cxl@vger.kernel.org>
Subject: Re: [PATCH 3/7] cxl/region: Hold memdev lock during region poison injection/clear
Date: Mon, 16 Mar 2026 19:10:30 -0700	[thread overview]
Message-ID: <69b8b81621e16_452b100e@dwillia2-mobl4.notmuch> (raw)
In-Reply-To: <13a62481-256b-405d-b2da-dd30b8cc6689@zohomail.com>

Li Ming wrote:
> 
> 在 2026/3/11 05:57, Alison Schofield 写道:
> > On Tue, Mar 10, 2026 at 11:57:55PM +0800, Li Ming wrote:
> >> cxl_dpa_to_region() will require callers holding the given CXL memdev
> >> lock for endpoint validity checking in the following patch. To prepare
> >> it, region poison injection/clearing debugfs interfaces need to ensure
> >> the correct CXL memdev lock is held at the beginning.
> >>
> >> To keep lock sequence(cxlmd.dev -> cxl_rwsem.region -> cxl_rwsem.dpa)
> >> for avoiding deadlock. the interfaces have to find out the correct CXL
> >> memdev at first, holding lock in the sequence then checking if the DPA
> >> data has been changed before holding locks.
> > This seems complicated, that this path needs to do needless work to
> > comply with a new locking scheme that this path doesn't really need.
> >
> > One thought, if in Patch 2 we lock at the debugfs callsite for
> > by memdev poison, then here, we don't need to think about following
> > unnecessary locking protocol. I say that because I think locking
> > down w dpa and region rwsem is enough here. Then there is nothing
> > new to do in this path. Might that work?
> >
> > Along w that can cxl_dpa_to_region() go back to safety checking as
> > opposed to the new device_lock_assert?
> >
> > I haven't gone thru every case in this set, so might be way off ;)
> 
> Actually, my understanding is same as yours.
> 
> I think we don't need this patch either if we remove the 
> device_lock_assert() in cxl_dpa_to_region().(Just keep the 
> cxlmd->dev.driver checking)
> 
> Because region is created only when endpoint is ready. Detaching an 
> endpoint from a region always holds cxl_rwsem.region.
> 
> But we always hold device lock for the device driver bingding checking 
> in CXL subsystem.
> 
> I'm OK to remove this patch, Let see any inputs from other reviewers.

The question is can cxlmd->dev.driver be invalidated while userspace is
pending inside of a debugfs callback?

Yes, it can. Debugfs, unlike sysfs which uses kernfs_drain(), does not
flush active threads inside debugfs handlers. At least not as far as I
can see.

The next question is, are we protected by the fact that a region needs
to disconnect a decoder and it only does that under the rwsem held for
write?

I think the answer is "no".

If the protocol was to have a region in hand and then validate its
locked association to an endpoint decoder then, yes. In this case though
we have no idea if the cxlmd is associated with any region at all.  So I
think you can theoretically race injecting an error to an idle cxlmd
that is in the process of destroying its ->endpoint.

Now, maybe some other detail saves us, but I do not think it is worth
the mental gymnastics of just requiring that no driver detach
shenanigans are running concurrent with poison injection shenanigans.

next prev parent reply	other threads:[~2026-03-17  2:10 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-10 15:57 [PATCH 0/7] cxl: Consolidate cxlmd->endpoint accessing Li Ming
2026-03-10 15:57 ` [PATCH 1/7] driver core: Add conditional guard support for device_lock() Li Ming
2026-03-10 17:45   ` Dave Jiang
2026-03-10 18:06     ` Danilo Krummrich
2026-03-10 18:09       ` Dave Jiang
2026-03-10 18:39         ` Dan Williams
2026-03-10 19:17           ` Danilo Krummrich
2026-03-10 20:37             ` Dan Williams
2026-03-10 20:41               ` Danilo Krummrich
2026-03-12 14:35   ` Greg Kroah-Hartman
2026-03-14 15:14   ` Danilo Krummrich
2026-03-10 15:57 ` [PATCH 2/7] cxl/memdev: Hold memdev lock during memdev poison injection/clear Li Ming
2026-03-10 17:53   ` Dave Jiang
2026-03-10 19:29   ` Alison Schofield
2026-03-10 21:34   ` Alison Schofield
2026-03-11 10:53     ` Li Ming
2026-03-12  4:05       ` Alison Schofield
2026-03-12 10:45         ` Li Ming
2026-03-10 15:57 ` [PATCH 3/7] cxl/region: Hold memdev lock during region " Li Ming
2026-03-10 19:54   ` Dan Williams
2026-03-10 21:57   ` Alison Schofield
2026-03-11 11:10     ` Li Ming
2026-03-17  2:10       ` Dan Williams [this message]
2026-03-10 15:57 ` [PATCH 4/7] cxl/pci: Hold memdev lock in cxl_event_trace_record() Li Ming
2026-03-10 19:33   ` Dan Williams
2026-03-11 11:11     ` Li Ming
2026-03-10 20:52   ` Dave Jiang
2026-03-11 11:12     ` Li Ming
2026-03-10 15:57 ` [PATCH 5/7] cxl/region: Ensure endpoint is valid in cxl_dpa_to_region() Li Ming
2026-03-10 20:53   ` Dave Jiang
2026-03-10 15:57 ` [PATCH 6/7] cxl/pci: Check memdev driver binding status in cxl_reset_done() Li Ming
2026-03-10 19:31   ` Dan Williams
2026-03-10 20:50   ` Dave Jiang
2026-03-10 15:57 ` [PATCH 7/7] cxl/port: Reset cxlmd->endpoint to -ENXIO by default Li Ming
2026-03-10 19:29   ` Dan Williams
2026-03-11 12:14     ` Li Ming
2026-03-10 19:20 ` [PATCH 0/7] cxl: Consolidate cxlmd->endpoint accessing Alison Schofield
2026-03-11 10:41   ` Li Ming
2026-03-10 20:33 ` Dan Williams
2026-03-11 10:44   ` Li Ming

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=69b8b81621e16_452b100e@dwillia2-mobl4.notmuch \
    --to=dan.j.williams@intel.com \
    --cc=alison.schofield@intel.com \
    --cc=benjamin.cheatham@amd.com \
    --cc=bhelgaas@google.com \
    --cc=dakr@kernel.org \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=driver-core@lists.linux.dev \
    --cc=gregkh@linuxfoundation.org \
    --cc=ira.weiny@intel.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.li@zohomail.com \
    --cc=rafael@kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox