From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Davidlohr Bueso <dave@stgolabs.net>
Cc: <linux-cxl@vger.kernel.org>, Dave Jiang <dave.jiang@intel.com>,
<linuxarm@huawei.com>, Dan Williams <dan.j.williams@intel.com>,
"Alison Schofield" <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Ben Widawsky <bwidawsk@kernel.org>,
<linux-perf-users@vger.kernel.org>, Will Deacon <will@kernel.org>,
Mark Rutland <mark.rutland@arm.com>
Subject: Re: [RFC PATCH v3 3/5] cxl/pci: Find and register CXL PMU devices
Date: Fri, 21 Oct 2022 09:29:42 +0100 [thread overview]
Message-ID: <20221021092942.000050f5@huawei.com> (raw)
In-Reply-To: <20221020224424.xj4onq44rqlbxk23@offworld>
On Thu, 20 Oct 2022 15:44:24 -0700
Davidlohr Bueso <dave@stgolabs.net> wrote:
> On Tue, 18 Oct 2022, Jonathan Cameron wrote:
>
> > static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > {
> >+ struct cxl_cpmu_regs *cpmu_regs_array = NULL;
> > struct cxl_register_map map;
> > struct cxl_memdev *cxlmd;
> > struct cxl_dev_state *cxlds;
> >- int rc;
> >+ int i, rc, cpmu_count;
> >
> > /*
> > * Double check the anonymous union trickery in struct cxl_regs
> >@@ -561,11 +567,54 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > if (rc)
> > return rc;
> >
> >+ /*
> >+ * Before registering sub devices that use interrupts, the maximum
> >+ * interrupt used on the device should be established to allow the correct
> >+ * number of irq vectors to be requested. Separate the registration process
> >+ * into before / after interrupt vector request.
> >+ */
> >+ cpmu_count = cxl_count_regblock(pdev, CXL_REGLOC_RBI_CPMU);
> >+ if (cpmu_count) {
> >+ cpmu_regs_array = kmalloc_array(cpmu_count, sizeof(cpmu_regs_array),
> >+ GFP_KERNEL);
> >+ if (!cpmu_regs_array)
> >+ return -ENOMEM;
>
> I don't think failure specific to the cpmu stuff should fail the general pci probing.
> That includes memory allocation, which can then fail again somewhere else and properly
> handle it.
>
So we've had several discussions about this and differing opinions.
To restate my view...
Partial failure is extremely fragile and leads to lots of complex error
paths each of which needs to be tested. Right now about 50% of the time that
I hit an error path in testing (due to kernel or emulation bugs) the
kernel driver goes splat because of a wrong assumption around the fallout
of those errors. So far I haven't even bothered figuring most of those
out - focusing instead on the root cause.
I may have been involved in a bias sample of kernel drivers
but my experience is only time you'd normally muddle on is over strictly
debug features (debugfs etc). Otherwise you are just hiding bugs that should
have been fixed.
Anyhow, sure I'll change this for next version.
This series isn't going anywhere until we resolve the interrupt
registration question. Is the hack in here bad enough to imply that
unified registration is a bad idea or not?
My view is the below makes it clear that generic approach doesn't make sense
as it met reality which turned out to be more complex than we were
thinking it was!
Jonathan
p.s. Good to get review of other aspects in meantime though!
> Thanks,
> Davidlohr
>
> >+ }
> >+
> >+ /* Pass one - just enough to find out what interrupts are used */
> >+ cxlds->cpmu_max_vector = -1;
> >+ for (i = 0; i < cpmu_count; i++) {
> >+ int irq;
> >+
> >+ rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_CPMU, &map, i);
> >+ if (rc) {
> >+ kfree(cpmu_regs_array);
> >+ return rc;
> >+ }
> >+
> >+ rc = cxl_map_cpmu_regs(pdev, &cpmu_regs_array[i], &map);
> >+ if (rc) {
> >+ kfree(cpmu_regs_array);
> >+ return rc;
> >+ }
> >+
> >+ irq = cxl_cpmu_get_irq(&cpmu_regs_array[i]);
> >+ cxlds->cpmu_max_vector = max(cxlds->cpmu_max_vector, irq);
> >+ }
> >+
> > if (!cxl_pci_alloc_irq_vectors(cxlds)) {
> > cxlds->has_irq = true;
> >+ pci_set_master(pdev);
> > } else
> > cxlds->has_irq = false;
> >
> >+ /* Pass 2 - register the CPMU instances, currently no support without interrupts */
> >+ if (cxlds->has_irq) {
> >+ for (i = 0; i < cpmu_count; i++)
> >+ devm_cxl_cpmu_add(cxlds->dev, &cpmu_regs_array[i], i);
> >+ }
> >+ kfree(cpmu_regs_array);
> >+
> > cxlmd = devm_cxl_add_memdev(cxlds);
> > if (IS_ERR(cxlmd))
> > return PTR_ERR(cxlmd);
> >--
> >2.37.2
> >
next prev parent reply other threads:[~2022-10-21 8:53 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-18 12:13 [RFC PATCH v3 0/5] CXL 3.0 Performance Monitoring Unit support Jonathan Cameron
2022-10-18 12:13 ` [RFC PATCH v3 1/5] cxl/pci: Add generic MSI-X/MSI irq support Jonathan Cameron
2022-10-18 12:13 ` [RFC PATCH v3 2/5] cxl: Add function to count regblocks of a given type Jonathan Cameron
2022-10-18 12:13 ` [RFC PATCH v3 3/5] cxl/pci: Find and register CXL PMU devices Jonathan Cameron
2022-10-20 22:44 ` Davidlohr Bueso
2022-10-21 8:29 ` Jonathan Cameron [this message]
2022-10-18 12:13 ` [RFC PATCH v3 4/5] cxl: CXL Performance Monitoring Unit driver Jonathan Cameron
2022-10-18 12:13 ` [RFC PATCH v3 5/5] docs: perf: Minimal introduction the the CXL PMU device and driver Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221021092942.000050f5@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=bwidawsk@kernel.org \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=linuxarm@huawei.com \
--cc=mark.rutland@arm.com \
--cc=vishal.l.verma@intel.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox