All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Alistair Francis <alistair23@gmail.com>
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org,
	Jonathan.Cameron@huawei.com, lukas@wunner.de,
	alex.williamson@redhat.com, christian.koenig@amd.com,
	kch@nvidia.com, gregkh@linuxfoundation.org, logang@deltatee.com,
	linux-kernel@vger.kernel.org, chaitanyak@nvidia.com,
	rdunlap@infradead.org,
	Alistair Francis <alistair.francis@wdc.com>
Subject: Re: [PATCH v9 2/3] PCI/DOE: Expose the DOE features via sysfs
Date: Thu, 19 Oct 2023 11:58:29 -0500	[thread overview]
Message-ID: <20231019165829.GA1381099@bhelgaas> (raw)
In-Reply-To: <20231013034158.2745127-2-alistair.francis@wdc.com>

On Fri, Oct 13, 2023 at 01:41:57PM +1000, Alistair Francis wrote:
> The PCIe 6 specification added support for the Data Object Exchange (DOE).
> When DOE is supported the DOE Discovery Feature must be implemented per
> PCIe r6.1 sec 6.30.1.1. The protocol allows a requester to obtain
> information about the other DOE features supported by the device.
> ...

> +static umode_t pci_doe_sysfs_attr_is_visible(struct kobject *kobj,
> +					     struct attribute *a, int n)
> +{
> +	struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj));
> +	struct pci_doe_mb *doe_mb;
> +	unsigned long index, j;
> +	void *entry;
> +
> +	xa_for_each(&pdev->doe_mbs, index, doe_mb) {
> +		xa_for_each(&doe_mb->feats, j, entry)
> +			return a->mode;
> +	}
> +
> +	return 0;

The nested loops that don't test anything look a little weird and
maybe I'm missing something, but this looks like it returns a->mode if
any mailbox with a feature exists, and 0 otherwise.

Is that the same as this:

  if (pdev->doe_mbs)
    return a->mode;

  return 0;

since it sounds like a mailbox must support at least one feature?

> +}
> +
> +static struct attribute *pci_dev_doe_feature_attrs[] = {
> +	NULL,
> +};
> +
> +const struct attribute_group pci_dev_doe_feature_group = {
> +	.name	= "doe_features",
> +	.attrs	= pci_dev_doe_feature_attrs,
> +	.is_visible = pci_doe_sysfs_attr_is_visible,
> +};
> +
> +static ssize_t pci_doe_sysfs_feature_show(struct device *dev,
> +					  struct device_attribute *attr,
> +					  char *buf)
> +{
> +	return sysfs_emit(buf, "%s\n", attr->attr.name);
> +}
> +
> +static void pci_doe_sysfs_feature_remove(struct pci_dev *pdev,
> +					 struct pci_doe_mb *doe_mb)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct device_attribute *attrs = doe_mb->sysfs_attrs;
> +	unsigned long i;
> +	void *entry;
> +
> +	if (!attrs)
> +		return;
> +
> +	doe_mb->sysfs_attrs = NULL;
> +	xa_for_each(&doe_mb->feats, i, entry) {
> +		if (attrs[i].show)
> +			sysfs_remove_file_from_group(&dev->kobj, &attrs[i].attr,
> +						     pci_dev_doe_feature_group.name);
> +		kfree(attrs[i].attr.name);
> +	}
> +	kfree(attrs);
> +}
> +
> +static int pci_doe_sysfs_feature_populate(struct pci_dev *pdev,
> +					  struct pci_doe_mb *doe_mb)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct device_attribute *attrs;
> +	unsigned long num_features = 0;
> +	unsigned long vid, type;
> +	unsigned long i;
> +	void *entry;
> +	int ret;
> +
> +	xa_for_each(&doe_mb->feats, i, entry)
> +		num_features++;
> +
> +	attrs = kcalloc(num_features, sizeof(*attrs), GFP_KERNEL);
> +	if (!attrs)
> +		return -ENOMEM;
> +
> +	doe_mb->sysfs_attrs = attrs;
> +	xa_for_each(&doe_mb->feats, i, entry) {
> +		sysfs_attr_init(&attrs[i].attr);
> +		vid = xa_to_value(entry) >> 8;
> +		type = xa_to_value(entry) & 0xFF;
> +		attrs[i].attr.name = kasprintf(GFP_KERNEL,
> +					       "0x%04lX:%02lX", vid, type);

What's the rationale for using "0x" on the vendor ID but not on the
type?  "0x1234:10" hints that the "10" might be decimal since it lacks
"0x".

Suggest lower-case "%04lx:%02lx" either way.

FWIW, there's no "0x" prefix on the hex vendor IDs in "lspci -n"
output and dmesg messages like this:

  pci 0000:01:00.0: [10de:13b6] type 00

> +		if (!attrs[i].attr.name) {
> +			ret = -ENOMEM;
> +			goto fail;
> +		}
> +
> +		attrs[i].attr.mode = 0444;
> +		attrs[i].show = pci_doe_sysfs_feature_show;
> +
> +		ret = sysfs_add_file_to_group(&dev->kobj, &attrs[i].attr,
> +					      pci_dev_doe_feature_group.name);
> +		if (ret) {
> +			attrs[i].show = NULL;
> +			goto fail;
> +		}
> +	}
> +
> +	return 0;
> +
> +fail:
> +	pci_doe_sysfs_feature_remove(pdev, doe_mb);
> +	return ret;

Not sure we should treat this failure this seriously.  Looks like it
will prevent creation of the BAR resource files, and possibly even
abort pci_sysfs_init() early.  I think the pci_dev will still be
usable (lacking DOE sysfs) even if this fails.

> +}
> +
> +void pci_doe_sysfs_teardown(struct pci_dev *pdev)
> +{
> +	struct pci_doe_mb *doe_mb;
> +	unsigned long index;
> +
> +	xa_for_each(&pdev->doe_mbs, index, doe_mb) {
> +		pci_doe_sysfs_feature_remove(pdev, doe_mb);
> +	}
> +}
> +
> +int pci_doe_sysfs_init(struct pci_dev *pdev)
> +{
> +	struct pci_doe_mb *doe_mb;
> +	unsigned long index;
> +	int ret;
> +
> +	xa_for_each(&pdev->doe_mbs, index, doe_mb) {
> +		ret = pci_doe_sysfs_feature_populate(pdev, doe_mb);
> +		if (ret)
> +			return ret;
> +	}

I agree with Lukas that pci_create_resource_files() is not the right
place to call this.

I try hard to avoid calling *anything* from the
pci_create_sysfs_dev_files() path because it has the nasty
"sysfs_initialized" check and the associated pci_sysfs_init()
initcall.

It's so much cleaner when we can set up static attributes that are
automatically added in the device_add() path.  I don't know whether
that's possible.  I see lots of discussion with Greg KH that might be
related, but I'm not sure.

I do know that we enumerate the mailboxes and features during
pci_init_capabilities(), which happens before device_add(), so the
information about which attributes should be present is at least
*available* early enough:

  pci_host_probe
    pci_scan_root_bus_bridge
      ...
        pci_scan_single_device
          pci_device_add
            pci_init_capabilities
              pci_doe_init
                while (pci_find_next_ext_capability(PCI_EXT_CAP_ID_DOE))
                  pci_doe_create_mb
                    pci_doe_cache_features
                      pci_doe_discovery
                      xa_insert(&doe_mb->feats)   <--
            device_add
              device_add_attrs
                device_add_groups
    pci_bus_add_devices
      pci_bus_add_device
        pci_create_sysfs_dev_files
          ...
            pci_doe_sysfs_init                    <--
              xa_for_each(&pdev->doe_mbs)
                pci_doe_sysfs_feature_populate
                  xa_for_each(&doe_mb->feats)
                    sysfs_add_file_to_group(pci_dev_doe_feature_group.name)

Is it feasible to build an attribute group in pci_doe_init() and add
it to dev->groups so device_add() will automatically add them?

It looks like __power_supply_register() does something like that:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/power/supply/power_supply_core.c?id=v6.5#n1375

> --- a/include/linux/pci-doe.h
> +++ b/include/linux/pci-doe.h
> @@ -22,4 +22,6 @@ int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type,
>  	    const void *request, size_t request_sz,
>  	    void *response, size_t response_sz);
>  
> +int pci_doe_sysfs_init(struct pci_dev *pci_dev);
> +void pci_doe_sysfs_teardown(struct pci_dev *pdev);

These declarations look like they should be in drivers/pci/pci.h as
well.  I don't think anything outside the PCI core should need these.

Bjorn

  parent reply	other threads:[~2023-10-19 16:58 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-13  3:41 [PATCH v9 1/3] PCI/DOE: Rename DOE protocol to feature Alistair Francis
2023-10-13  3:41 ` [PATCH v9 2/3] PCI/DOE: Expose the DOE features via sysfs Alistair Francis
2023-10-17  8:34   ` Lukas Wunner
2023-11-03  1:27     ` Alistair Francis
2023-10-19 16:58   ` Bjorn Helgaas [this message]
2023-10-19 19:32     ` Lukas Wunner
2023-10-19 20:01       ` Bjorn Helgaas
2023-11-03  2:17     ` Alistair Francis
2024-08-04  5:55     ` Lukas Wunner
2023-10-13  3:41 ` [PATCH v9 3/3] PCI/DOE: Allow enabling DOE without CXL Alistair Francis
2023-10-18 22:24 ` [PATCH v9 1/3] PCI/DOE: Rename DOE protocol to feature Bjorn Helgaas
2023-11-03  0:18   ` Alistair Francis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231019165829.GA1381099@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=alex.williamson@redhat.com \
    --cc=alistair.francis@wdc.com \
    --cc=alistair23@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=chaitanyak@nvidia.com \
    --cc=christian.koenig@amd.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kch@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=lukas@wunner.de \
    --cc=rdunlap@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.