From: Dan Williams <dan.j.williams@intel.com>
To: "Daisuke Kobayashi (Fujitsu)" <kobayashi.da-06@fujitsu.com>,
	"'Dan Williams'" <dan.j.williams@intel.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Cc: "Yasunori Gotou (Fujitsu)" <y-goto@fujitsu.com>,
	"mj@ucw.cz" <mj@ucw.cz>,
	"jonathan.cameron@huawei.com" <jonathan.cameron@huawei.com>
Subject: RE: [PATCH v14 1/2] cxl/core/regs: Add rcd_pcie_cap initialization
Date: Wed, 10 Jul 2024 18:34:53 -0700
Message-ID: <668f36bd17043_1bc832949f@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <OSAPR01MB71829D3B273E919435A1A86BBAA42@OSAPR01MB7182.jpnprd01.prod.outlook.com>

Daisuke Kobayashi (Fujitsu) wrote:
> Dan Williams wrote:
> > Dan Williams wrote:
> > > Kobayashi,Daisuke wrote:
> > > > Add rcd_pcie_cap and its initialization to cache the offset of cxl1.1
> > > > device link status information. By caching it, avoid the walking
> > > > memory map area to find the offset when output the register value.
> > > >
> > > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > Signed-off-by: "Kobayashi,Daisuke" <kobayashi.da-06@fujitsu.com>
> > > > ---
> > > >  drivers/cxl/core/core.h |  6 ++++
> > > >  drivers/cxl/core/regs.c | 61 +++++++++++++++++++++++++++++++++++++++++
> > > >  drivers/cxl/cxl.h       |  9 ++++++
> > > >  drivers/cxl/pci.c       |  8 ++++--
> > > >  4 files changed, 82 insertions(+), 2 deletions(-)
> > > >
> > [..]
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > index 2ff361e756d6..bbc55732d6c1 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -512,11 +512,15 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> > > >  	 * is an RCH and try to extract the Component Registers from
> > > >  	 * an RCRB.
> > > >  	 */
> > > > -	if (rc && type == CXL_REGLOC_RBI_COMPONENT && is_cxl_restricted(pdev))
> > > > +	if (rc && type == CXL_REGLOC_RBI_COMPONENT && is_cxl_restricted(pdev)) {
> > > >  		rc = cxl_rcrb_get_comp_regs(pdev, map);
> > > > +		if (rc)
> > > > +			return rc;
> > > >
> > > > -	if (rc)
> > > > +		cxl_dport_map_rcd_linkcap(pdev);
> > >
> > [..]
> > > Ugh, I was going to say copy what cxl_mem_probe() does around locking
> > > endpoint_parent before attaching further ports, but that also appears to
> > > not handle the same race. I.e. I think cxl_mem_probe() needs a fix to do
> > > this as well. I will copy you on a proposed patch for that.
> > 
> > I attempted to add the proper locking to keep cxl_dport live, but that
> > runs into lockdep issues.
> > 
> > So I think a better fix is rework dport lifetime to stay alive until the
> > final put_device() of the port. In other words dport instances get added
> > dynamically to the cxl_port, but only get destroyed after all port
> > references are dropped. Then the @dport result from find_cxl_port() is
> > not ephemeral.
> > 
> > Given this is a latent bug that affects all current
> > cxl_{mem,pci}_find_port() users, the planned fix is to just make dport
> > lifetime longer, and that I will not have time to do that rework before
> > v6.11 merge window, then I am ok for this lnkcap code to introduce
> > another instance of the same bug.
> > 
> > So, just make cxl_rcrb_get_comp_regs() and cxl_dport_map_rcd_linkcap()
> > share the same port reference from one cxl_pci_find_port() call.
> 
> Thanks for checking.
> 
> I'd like to confirm my understanding of the comment. Are you suggesting that,
> due to time constraints with the current patch, cxl_rcrb_get_comp_regs() and
> cxl_dport_map_rcd_linkcap() should share the same dport reference as a temporary
> workaround for the bug regarding the dport lifetime?

What I am saying is: forget the bug for now and just trust that the
@dport result from cxl_pci_find_port() is valid until the put_device()
on the port.

> If that's what you mean, I think I can solve this problem by adding
> "struct cxl_dport *dport" to the arguments of the two functions to share the reference.

Yes, that's what I want for this patch, but to be clear this does not
fix the bug with cxl_pci_find_port(). That bug needs deeper work that
you can ignore for now. Adding another cxl_pci_find_port() user just
increases the urgency to get that bug fixed.

To be clear, it is definitely a use-after-free issue, but triggering it
requires root to be bringing ports up and down during the
"cxl_pci_find_port() -> put_device(@port->dev)" window.

I expect you could trigger a crash by a "modprobe -r cxl_acpi; modprobe
cxl_acpi" loop while accessing these sysfs files.
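
For reference, the longer dport lifetime proposed earlier in this thread
(dports are added dynamically but only destroyed on the final put of the
port) can be modeled in userspace roughly as below. This is only a toy
sketch, not the kernel implementation: every toy_* name is hypothetical,
the refcount is a plain int with no locking, and the dports_freed
counter exists purely so the behavior can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; not the kernel's cxl_port/cxl_dport. */
struct toy_dport {
	int id;
	struct toy_dport *next;
};

struct toy_port {
	int refcount;             /* plain int: single-threaded model only */
	struct toy_dport *dports; /* dports attached so far */
	int *dports_freed;        /* observation hook, counts frees at release */
};

static struct toy_port *toy_port_alloc(int *dports_freed)
{
	struct toy_port *port = calloc(1, sizeof(*port));

	if (!port)
		return NULL;
	port->refcount = 1; /* creator holds the initial reference */
	port->dports_freed = dports_freed;
	return port;
}

/* dports are attached dynamically, at any time after port creation. */
static struct toy_dport *toy_port_add_dport(struct toy_port *port, int id)
{
	struct toy_dport *dport = calloc(1, sizeof(*dport));

	if (!dport)
		return NULL;
	dport->id = id;
	dport->next = port->dports;
	port->dports = dport;
	return dport;
}

static void toy_port_get(struct toy_port *port)
{
	port->refcount++;
}

/*
 * Drop one reference. dports are torn down only here, on the final put,
 * so a dport pointer stays usable for as long as its holder also holds a
 * port reference. Returns 1 if the port was released, 0 otherwise.
 */
static int toy_port_put(struct toy_port *port)
{
	if (--port->refcount > 0)
		return 0;

	while (port->dports) {
		struct toy_dport *dport = port->dports;

		port->dports = dport->next;
		free(dport);
		(*port->dports_freed)++;
	}
	free(port);
	return 1;
}
```

In this model a caller that did toy_port_get() can keep dereferencing
its dport until its own toy_port_put(), no matter what other reference
holders do in between, which is the property the current code lacks.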

> In this implementation, I'm planning to run cxl_pci_find_port() in
> cxl_rcrb_get_comp_regs() and share the dport obtained there. You said
> that find requires a corresponding put_device(), but where is the
> correct place to run it in this case? Or is it better not to run it in
> this patch?

So the ordering I expect is something like:

	port = cxl_pci_find_port(...)
	if (!port)
		return -EPROBE_DEFER;
	rc = cxl_rcrb_get_comp_regs(dport, ...)
	if (rc)
		goto put;
	rc = cxl_dport_map_rcd_linkcap(dport, ...)
	if (rc)
		goto put;
put:
	put_device(...)
	return rc;
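
The same ordering as a compilable userspace mock, in case it helps to
see the reference handling end to end. All mock_* names and
MOCK_EPROBE_DEFER are hypothetical stand-ins (userspace errno.h has no
EPROBE_DEFER, so the kernel's value 517 is defined locally), the
fail_regs argument only exists to exercise the error path, and none of
this is the real cxl code. It assumes the lookup hands back both the
port reference and the dport, and that both helpers take that dport.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_EPROBE_DEFER 517 /* local stand-in for the kernel's EPROBE_DEFER */

/* Hypothetical stand-ins for the port and dport objects. */
struct mock_dport {
	int regs_mapped;
	int linkcap_mapped;
};

struct mock_port {
	int refcount;
	struct mock_dport dport;
};

/* Models the lookup: takes a port reference and returns the dport with it. */
static struct mock_port *mock_find_port(struct mock_port *candidate,
					struct mock_dport **dport)
{
	if (!candidate)
		return NULL;
	candidate->refcount++;
	*dport = &candidate->dport;
	return candidate;
}

/* Models put_device() on the port: drops the reference taken by the lookup. */
static void mock_put_port(struct mock_port *port)
{
	port->refcount--;
}

static int mock_get_comp_regs(struct mock_dport *dport, int fail)
{
	if (fail)
		return -ENODEV;
	dport->regs_mapped = 1;
	return 0;
}

static int mock_map_rcd_linkcap(struct mock_dport *dport)
{
	dport->linkcap_mapped = 1;
	return 0;
}

/* One lookup, both helpers share its dport, one put on every exit path. */
static int mock_setup_regs(struct mock_port *candidate, int fail_regs)
{
	struct mock_dport *dport;
	struct mock_port *port;
	int rc;

	port = mock_find_port(candidate, &dport);
	if (!port)
		return -MOCK_EPROBE_DEFER;

	rc = mock_get_comp_regs(dport, fail_regs);
	if (rc)
		goto put;

	rc = mock_map_rcd_linkcap(dport);
put:
	mock_put_port(port);
	return rc;
}
```

The point is only that @dport is never touched after the put, and that
the reference count is balanced whether or not the first helper fails.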
