From: Dan Williams <dan.j.williams@intel.com>
To: "Daisuke Kobayashi (Fujitsu)" <kobayashi.da-06@fujitsu.com>,
"'Dan Williams'" <dan.j.williams@intel.com>,
"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Cc: "Yasunori Gotou (Fujitsu)" <y-goto@fujitsu.com>,
"mj@ucw.cz" <mj@ucw.cz>,
"jonathan.cameron@huawei.com" <jonathan.cameron@huawei.com>
Subject: RE: [PATCH v14 1/2] cxl/core/regs: Add rcd_pcie_cap initialization
Date: Wed, 10 Jul 2024 18:34:53 -0700
Message-ID: <668f36bd17043_1bc832949f@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <OSAPR01MB71829D3B273E919435A1A86BBAA42@OSAPR01MB7182.jpnprd01.prod.outlook.com>
Daisuke Kobayashi (Fujitsu) wrote:
> Dan Williams wrote:
> > Dan Williams wrote:
> > > Kobayashi,Daisuke wrote:
> > > > Add rcd_pcie_cap and its initialization to cache the offset of cxl1.1
> > > > device link status information. By caching it, avoid the walking
> > > > memory map area to find the offset when output the register value.
> > > >
> > > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > Signed-off-by: "Kobayashi,Daisuke" <kobayashi.da-06@fujitsu.com>
> > > > ---
> > > > drivers/cxl/core/core.h | 6 ++++
> > > > drivers/cxl/core/regs.c | 61
> > +++++++++++++++++++++++++++++++++++++++++
> > > > drivers/cxl/cxl.h | 9 ++++++
> > > > drivers/cxl/pci.c | 8 ++++--
> > > > 4 files changed, 82 insertions(+), 2 deletions(-)
> > > >
> > [..]
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > index 2ff361e756d6..bbc55732d6c1 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -512,11 +512,15 @@ static int cxl_pci_setup_regs(struct pci_dev
> > *pdev, enum cxl_regloc_type type,
> > > > * is an RCH and try to extract the Component Registers from
> > > > * an RCRB.
> > > > */
> > > > - if (rc && type == CXL_REGLOC_RBI_COMPONENT &&
> > is_cxl_restricted(pdev))
> > > > + if (rc && type == CXL_REGLOC_RBI_COMPONENT &&
> > is_cxl_restricted(pdev)) {
> > > > rc = cxl_rcrb_get_comp_regs(pdev, map);
> > > > + if (rc)
> > > > + return rc;
> > > >
> > > > - if (rc)
> > > > + cxl_dport_map_rcd_linkcap(pdev);
> > >
> > [..]
> > > Ugh, I was going to say copy what cxl_mem_probe() does around locking
> > > endpoint_parent before attaching further ports, but that also appears to
> > > not handle the same race. I.e. I think cxl_mem_probe() needs a fix to do
> > > this as well. I will copy you on a proposed patch for that.
> >
> > I attempted to add the proper locking to keep cxl_dport live, but that
> > runs into lockdep issues.
> >
> > So I think a better fix is rework dport lifetime to stay alive until the
> > final put_device() of the port. In other words dport instances get added
> > dynamically to the cxl_port, but only get destroyed after all port
> > references are dropped. Then the @dport result from find_cxl_port() is
> > not ephemeral.
> >
> > Given this is a latent bug that affects all current
> > cxl_{mem,pci}_find_port() users, the planned fix is to just make dport
> > lifetime longer, and that I will not have time to do that rework before
> > v6.11 merge window, then I am ok for this lnkcap code to introduce
> > another instance of the same bug.
> >
> > So, just make cxl_rcrb_get_comp_regs() and cxl_dport_map_rcd_linkcap()
> > share the same port reference from one cxl_pci_find_port() call.
>
> Thanks for checking.
>
> I'd like to confirm my understanding of the comment. Are you suggesting that,
> due to time constraints with the current patch, cxl_rcrb_get_comp_regs() and
> cxl_dport_map_rcd_linkcap() should share the same dport reference as a temporary
> workaround for the bug regarding the dport lifetime?

What I am saying is: forget the bug for now, and just trust that the
@dport result from cxl_pci_find_port() is valid until the put_device()
on the port.

> If that's what you mean, I think I can solve this problem by adding
> "struct cxl_dport *dport" to the arguments of the two functions to share the reference.

Yes, that's what I want for this patch, but to be clear this does not
fix the bug with cxl_pci_find_port(). That bug needs deeper work that
you can ignore for now. Adding another cxl_pci_find_port() user just
increases the urgency to get that bug fixed.

It is definitely a use-after-free issue, but it needs root to be
bringing ports up and down during the "cxl_pci_find_port() ->
put_device(@port->dev)" window.

I expect you could trigger a crash with a "modprobe -r cxl_acpi; modprobe
cxl_acpi" loop while accessing these sysfs files.

> In this implementation, I'm planning to run cxl_pci_find_port() in
> cxl_rcrb_get_comp_regs() and share the dport obtained there. You said
> that find requires a corresponding put_device(), but where is the
> correct place to run it in this case? Or is it better not to run it in
> this patch?

So the ordering I expect is something like:

    port = cxl_pci_find_port(...)
    if (!port)
        return -EPROBE_DEFER;

    rc = cxl_rcrb_get_comp_regs(dport, ...)
    if (rc)
        goto put;

    rc = cxl_dport_map_rcd_linkcap(dport, ...)
    if (rc)
        goto put;

put:
    put_device(...)
    return rc;
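
For illustration only, the same ordering can be modeled as a small
userspace C sketch. Nothing here is kernel code: the mock_* types and
functions, the fail_* parameters, and the use of -EAGAIN in place of the
kernel-internal -EPROBE_DEFER are all assumptions made so the example
compiles on its own. It shows one lookup taking one reference, both
helpers borrowing the same dport, and a single exit label dropping the
reference on every path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct cxl_dport / cxl_port. */
struct mock_dport {
	int regs_mapped;     /* set by mock_get_comp_regs() */
	int linkcap_mapped;  /* set by mock_map_rcd_linkcap() */
};

struct mock_port {
	int refcount;            /* models the port device reference count */
	struct mock_dport dport; /* assumed valid while a reference is held */
};

/* Models cxl_pci_find_port(): takes a reference and returns the dport. */
static struct mock_port *mock_find_port(struct mock_port *port,
					struct mock_dport **dport)
{
	if (!port)
		return NULL;
	port->refcount++;
	*dport = &port->dport;
	return port;
}

/* Models put_device(): drops the reference taken by mock_find_port(). */
static void mock_put_port(struct mock_port *port)
{
	assert(port->refcount > 0);
	port->refcount--;
}

/* Models cxl_rcrb_get_comp_regs() taking the shared dport. */
static int mock_get_comp_regs(struct mock_dport *dport, int fail)
{
	if (fail)
		return -ENXIO;
	dport->regs_mapped = 1;
	return 0;
}

/* Models cxl_dport_map_rcd_linkcap() taking the same dport. */
static int mock_map_rcd_linkcap(struct mock_dport *dport, int fail)
{
	if (fail)
		return -ENODEV;
	dport->linkcap_mapped = 1;
	return 0;
}

/* One find, two users of the dport, one put on every exit path. */
int mock_setup_regs(struct mock_port *candidate, int fail_regs,
		    int fail_linkcap)
{
	struct mock_dport *dport;
	struct mock_port *port;
	int rc;

	port = mock_find_port(candidate, &dport);
	if (!port)
		return -EAGAIN; /* stand-in for -EPROBE_DEFER */

	rc = mock_get_comp_regs(dport, fail_regs);
	if (rc)
		goto put;

	rc = mock_map_rcd_linkcap(dport, fail_linkcap);
put:
	mock_put_port(port);
	return rc;
}
```

Calling mock_setup_regs() with either fail flag set returns the
helper's error with the reference count back at zero, which is the
property the single "put:" label is meant to guarantee.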