* [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
From: KobayashiDaisuke @ 2023-12-20 5:07 UTC
To: linux-pci; +Cc: linux-cxl, y-goto, KobayashiDaisuke

Hello.

This patch series adds a feature to lspci that displays the link status
of CXL1.1 devices.

CXL devices are extensions of PCIe. From CXL2.0 onwards, the link status
can therefore be read in the same way as for traditional PCIe devices.
A CXL1.1 device, however, needs a different method, because its link
status is not mapped in the configuration space (as per CXL 3.0
specification 8.1). Instead, the configuration space containing the
link status is mapped into the memory-mapped register region (as per
CXL 3.0 specification 8.2, Table 8-18). As a result, the current lspci
does not display the link status of a CXL1.1 device.
This series addresses that problem.

The link status is acquired in three steps: obtain the device UID,
obtain the base address from the CEDT, and then read the link status
from the memory-mapped registers. I considered adding this output to
the cxl command instead, given the scope of the CXL specification, but
devices from CXL2.0 onwards can already be displayed in the same way as
traditional PCIe devices. It therefore seems better to make the lspci
command support CXL1.1 devices as well.

I look forward to any comments you may have.

KobayashiDaisuke (3):
  Add function to display cxl1.1 device link status
  Implement a function to get cxl1.1 device uid
  Implement a function to get a RCRB Base address

 ls-caps.c | 216 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 lspci.h   |  35 +++++++++
 2 files changed, 251 insertions(+)

--
2.43.0
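
For readers following the acquisition steps in the cover letter, the last
step (finding the PCIe capability in the memory-mapped copy of the
configuration space and decoding Link Status) can be sketched roughly as
below. This is only an illustration, not the code from the series: it
works on a plain byte buffer instead of a /dev/mem mapping, and the names
rd16(), find_pcie_cap() and decode_lnksta() are made up for the example.
The register offsets and masks are the standard PCIe ones.

```c
#include <stddef.h>
#include <stdint.h>

#define CFG_CAP_LIST_PTR 0x34 /* Capabilities Pointer register */
#define CAP_ID_PCIE      0x10 /* PCI Express capability ID */
#define PCIE_LNKSTA      0x12 /* Link Status, relative to the capability */

/* Read a little-endian 16-bit value from a byte image (hypothetical helper). */
static uint16_t rd16(const uint8_t *cfg, size_t off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/*
 * Return the offset of the PCIe capability inside a config-space image of
 * `len` bytes, or 0 if it is absent or the list is malformed.
 */
static size_t find_pcie_cap(const uint8_t *cfg, size_t len)
{
    size_t off;
    int guard = 48; /* bounds the walk so a looping list still terminates */

    if (len < 0x40)
        return 0;
    off = cfg[CFG_CAP_LIST_PTR] & 0xfc;
    while (off >= 0x40 && off + 2 <= len && guard-- > 0) {
        if (cfg[off] == CAP_ID_PCIE)
            return off;
        off = cfg[off + 1] & 0xfc; /* next capability pointer */
    }
    return 0;
}

/* Decode the current link speed code and width from Link Status. */
static int decode_lnksta(const uint8_t *cfg, size_t len,
                         unsigned *speed, unsigned *width)
{
    size_t cap = find_pcie_cap(cfg, len);
    uint16_t sta;

    if (cap == 0 || cap + PCIE_LNKSTA + 2 > len)
        return -1;
    sta = rd16(cfg, cap + PCIE_LNKSTA);
    *speed = sta & 0x000f;        /* 1 = 2.5GT/s, 2 = 5GT/s, ... */
    *width = (sta & 0x03f0) >> 4; /* negotiated lane count */
    return 0;
}
```

Compared with an open-ended while (1) walk, the guard counter and the
bounds checks keep the loop finite even if the mapped region does not
contain a sane capability list.
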

* [RFC PATCH 1/3] Add function to display cxl1.1 device link status
From: KobayashiDaisuke @ 2023-12-20 5:07 UTC
To: linux-pci; +Cc: linux-cxl, y-goto, KobayashiDaisuke

This patch adds a function to output the link status of the CXL1.1 device
when it is connected.

In CXL1.1, the link status of the device is included in the RCRB mapped to
the memory mapped register area. This function accesses the address where
the device's RCRB is mapped and outputs the link status.

Signed-off-by: KobayashiDaisuke <kobayashi.da-06@fujitsu.com>
---
 ls-caps.c | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 lspci.h   |  35 ++++++++++++++++
 2 files changed, 158 insertions(+)

diff --git a/ls-caps.c b/ls-caps.c
index 1b63262..be81bb9 100644
--- a/ls-caps.c
+++ b/ls-caps.c
@@ -10,6 +10,9 @@
 
 #include <stdio.h>
 #include <string.h>
+#include <fcntl.h>
+#include <sys/mman.h>
+#include <stdlib.h>
 
 #include "lspci.h"
 
@@ -1381,6 +1384,121 @@ static void cap_express_slot2(struct device *d UNUSED, int where UNUSED)
   /* No capabilities that require this field in PCIe rev2.0 spec. */
 }
 
+#define OBJNAMELEN 1024
+static int get_device_uid(struct device *d){
+    return -1;
+}
+
+static off_t get_rcrb_base(int device_uid){
+    return -1;
+}
+
+static uint32_t read_pci_config(uint32_t* pcie_config_space, off_t offset) {
+    return *((uint32_t*)((uint8_t*)pcie_config_space + offset));
+}
+
+static void cap_express_link_rcd(struct device *d)
+{
+    /* Check whether the device is cxl 1.1 device or not */
+    int device_uid = get_device_uid(d);
+    if (device_uid < 0)
+        return;
+
+    off_t rcrb_base = get_rcrb_base(device_uid);
+    if(rcrb_base <= 0)
+        return;
+
+    int mem_fd = open("/dev/mem", O_RDONLY | O_SYNC);
+    if (mem_fd == -1)
+        return;
+
+    /* Set target address(RCD = RCH + 4K) */
+    off_t target_address = rcrb_base + 0x1000;
+    long page_size = sysconf(_SC_PAGESIZE);
+    void *mem_ptr = mmap(NULL, page_size, PROT_READ, MAP_SHARED, mem_fd, target_address);
+    if (mem_ptr == MAP_FAILED)
+    {
+        close(mem_fd);
+        return;
+    }
+
+    /* Search PCIe Capability */
+    volatile uint32_t value = read_pci_config(mem_ptr, PCI_CAPABILITY_LIST);
+    volatile off_t offset = value & 0xFF;
+
+    while (1)
+    {
+        value = read_pci_config(mem_ptr, offset);
+        /* If find PCIe Capability, display the link status info */
+        if ((value & 0xFF) == PCI_CAP_ID_EXP)
+        {
+            u32 t, aspm, cap_speed, cap_width, sta_speed, sta_width;
+            u16 w;
+            t = read_pci_config(mem_ptr, offset + PCI_EXP_LNKCAP);
+            aspm = (t & PCI_EXP_LNKCAP_ASPM) >> 10;
+            cap_speed = t & PCI_EXP_LNKCAP_SPEED;
+            cap_width = (t & PCI_EXP_LNKCAP_WIDTH) >> 4;
+            printf("\t\tLnkCap:\tPort #%d, Speed %s, Width x%d, ASPM %s",
+                t >> 24,
+                link_speed(cap_speed), cap_width,
+                aspm_support(aspm));
+            if (aspm)
+            {
+                printf(", Exit Latency ");
+                if (aspm & 1)
+                    printf("L0s %s", latency_l0s((t & PCI_EXP_LNKCAP_L0S) >> 12));
+                if (aspm & 2)
+                    printf("%sL1 %s", (aspm & 1) ? ", " : "",
+                        latency_l1((t & PCI_EXP_LNKCAP_L1) >> 15));
+            }
+            printf("\n");
+            printf("\t\t\tClockPM%c Surprise%c LLActRep%c BwNot%c ASPMOptComp%c\n",
+                FLAG(t, PCI_EXP_LNKCAP_CLOCKPM),
+                FLAG(t, PCI_EXP_LNKCAP_SURPRISE),
+                FLAG(t, PCI_EXP_LNKCAP_DLLA),
+                FLAG(t, PCI_EXP_LNKCAP_LBNC),
+                FLAG(t, PCI_EXP_LNKCAP_AOC));
+
+            t = read_pci_config(mem_ptr, offset + PCI_EXP_LNKCTL);
+            w = (uint16_t)(t & 0xffff);
+            printf("\t\tLnkCtl:\tASPM %s;", aspm_enabled(w & PCI_EXP_LNKCTL_ASPM));
+            printf(" Disabled%c CommClk%c\n\t\t\tExtSynch%c ClockPM%c AutWidDis%c BWInt%c AutBWInt%c\n",
+                FLAG(w, PCI_EXP_LNKCTL_DISABLE),
+                FLAG(w, PCI_EXP_LNKCTL_CLOCK),
+                FLAG(w, PCI_EXP_LNKCTL_XSYNCH),
+                FLAG(w, PCI_EXP_LNKCTL_CLOCKPM),
+                FLAG(w, PCI_EXP_LNKCTL_HWAUTWD),
+                FLAG(w, PCI_EXP_LNKCTL_BWMIE),
+                FLAG(w, PCI_EXP_LNKCTL_AUTBWIE));
+
+            w = (uint16_t)((t >> 16) & 0xffff);
+            sta_speed = w & PCI_EXP_LNKSTA_SPEED;
+            sta_width = (w & PCI_EXP_LNKSTA_WIDTH) >> 4;
+            printf("\t\tLnkSta:\tSpeed %s%s, Width x%d%s\n",
+                link_speed(sta_speed),
+                link_compare(PCI_EXP_TYPE_ROOT_INT_EP, sta_speed, cap_speed),
+                sta_width,
+                link_compare(PCI_EXP_TYPE_ROOT_INT_EP, sta_width, cap_width));
+            printf("\t\t\tTrErr%c Train%c SlotClk%c DLActive%c BWMgmt%c ABWMgmt%c\n",
+                FLAG(w, PCI_EXP_LNKSTA_TR_ERR),
+                FLAG(w, PCI_EXP_LNKSTA_TRAIN),
+                FLAG(w, PCI_EXP_LNKSTA_SL_CLK),
+                FLAG(w, PCI_EXP_LNKSTA_DL_ACT),
+                FLAG(w, PCI_EXP_LNKSTA_BWMGMT),
+                FLAG(w, PCI_EXP_LNKSTA_AUTBW));
+            break;
+        }else{ /* else get Next Capability Pointer, and move the pointer */
+            offset = (value >> 8) & 0xFF;
+            if (offset == 0)
+                break;
+        }
+    }
+
+    munmap(mem_ptr, page_size);
+    close(mem_fd);
+    return;
+}
+
 static int
 cap_express(struct device *d, int where, int cap)
 {
@@ -1445,6 +1563,11 @@ cap_express(struct device *d, int where, int cap)
   cap_express_dev(d, where, type);
   if (link)
     cap_express_link(d, where, type);
+  else if (type == PCI_EXP_TYPE_ROOT_INT_EP)
+  {
+    cap_express_link_rcd(d);
+  }
+
   if (slot)
     cap_express_slot(d, where);
   if (type == PCI_EXP_TYPE_ROOT_PORT || type == PCI_EXP_TYPE_ROOT_EC)
diff --git a/lspci.h b/lspci.h
index c5a9ec7..eab5a77 100644
--- a/lspci.h
+++ b/lspci.h
@@ -58,6 +58,41 @@ u32 get_conf_long(struct device *d, unsigned int pos);
 word get_conf_word(struct device *d, unsigned int pos);
 byte get_conf_byte(struct device *d, unsigned int pos);
 
+/* access to CEDT structure*/
+
+#pragma pack(1)
+#define CHBS_TYPE 0
+
+// CHBS Structure
+struct CHBS_Structure {
+    uint32_t uid;              // UID (4 bytes)
+    uint32_t cxl_version;      // CXL Version (4 bytes)
+    uint32_t reserved2;        // Reserved (4 bytes)
+    uint64_t base;             // Base (8 bytes)
+    uint64_t length;           // Length (8 bytes)
+};
+
+// CEDT Structure Header
+struct CEDT_Structure {
+    uint8_t type;              // Type (1 byte)
+    uint8_t reserved;          // Reserved (1 byte)
+    uint16_t record_length;    // Record Length (2 bytes)
+};
+
+// CEDT Header
+struct CEDT_Header {
+    uint32_t signature;        // Signature (4 bytes)
+    uint32_t length;           // Length (4 bytes)
+    uint8_t revision;          // Revision (1 byte)
+    uint8_t checksum;          // Checksum (1 byte)
+    char oem_ID[6];            // OEM ID (6 bytes)
+    char oem_tableID[8];       // OEM Table ID (8 bytes)
+    uint32_t oem_Revision;     // OEM Revision (4 bytes)
+    uint32_t creatorID;        // Creator ID (4 bytes)
+    uint32_t creator_revision; // Creator Revision (4 bytes)
+};
+#pragma pack()
+
 /* Useful macros for decoding of bits and bit fields */
 
 #define FLAG(x,y) ((x & y) ? '+' : '-')
-- 
2.43.0

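
One detail about the mmap() call in patch 1/3 may be worth illustrating:
mmap() requires its offset argument to be a multiple of the system page
size, while rcrb_base + 0x1000 may be aligned to no more than 4 KiB. A
minimal sketch of how a mapping window could be computed for systems with
larger pages follows. It is an assumption about one possible approach,
not something taken from pciutils; struct map_window and
align_for_mmap() are hypothetical names.

```c
#include <stdint.h>

/* Result of rounding a physical address range out to page boundaries. */
struct map_window {
    uint64_t map_base; /* page-aligned offset to pass to mmap() */
    uint64_t delta;    /* distance from map_base to the wanted address */
    uint64_t map_len;  /* bytes to map so that [target, target+len) fits */
};

/* page_size is assumed to be a power of two, as sysconf() reports it. */
static struct map_window align_for_mmap(uint64_t target, uint64_t len,
                                        uint64_t page_size)
{
    struct map_window w;
    uint64_t mask = page_size - 1;

    w.map_base = target & ~mask;                 /* round start down */
    w.delta = target - w.map_base;               /* offset inside the page */
    w.map_len = (w.delta + len + mask) & ~mask;  /* round end up */
    return w;
}
```

With such a helper the caller would map map_len bytes at map_base and
then read registers starting at mapped_pointer + delta.
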

* [RFC PATCH 2/3] Implement a function to get cxl1.1 device uid
From: KobayashiDaisuke @ 2023-12-20 5:07 UTC
To: linux-pci; +Cc: linux-cxl, y-goto, KobayashiDaisuke

This patch adds a function to obtain the uid of the host bridge containing
the device. In this function, the host bridge is found by exploring the
tree structure of pci, and the uid is obtained. The uid obtained here is
used to identify the CHBS structure of the device.

Signed-off-by: KobayashiDaisuke <kobayashi.da-06@fujitsu.com>
---
 ls-caps.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/ls-caps.c b/ls-caps.c
index be81bb9..077e8ea 100644
--- a/ls-caps.c
+++ b/ls-caps.c
@@ -1386,6 +1386,54 @@ static void cap_express_slot2(struct device *d UNUSED, int where UNUSED)
 
 #define OBJNAMELEN 1024
 static int get_device_uid(struct device *d){
+    char link_path[OBJNAMELEN];
+    char path[OBJNAMELEN];
+    ssize_t bytes_read;
+    int n = snprintf(path, OBJNAMELEN, "%s/devices/%04x:%02x:%02x.%d",
+        pci_get_param(d->dev->access, "sysfs.path"), d->dev->domain, d->dev->bus, d->dev->dev, d->dev->func);
+    if (n < 0 || n >= OBJNAMELEN){
+        d->dev->access->error("sysfs file name error");
+        return -1;
+    }
+
+    /* get absolute path pointed by the sym link */
+    bytes_read = readlink(path, link_path, sizeof(link_path));
+    if (bytes_read == -1)
+        return -1;
+
+    link_path[bytes_read] = '\0';
+
+    char *path_copy = strdup(link_path);
+    char *token = strtok(path_copy, "/");
+    int device_uid;
+    while (token)
+    {
+        if (strncmp(token, "pci", 3) == 0)
+        {
+            char buffer[OBJNAMELEN];
+            sprintf(buffer, "/sys/devices/%s/firmware_node/uid", token);
+            FILE *file = fopen(buffer, "r");
+            if (file == NULL)
+            {
+                free(path_copy);
+                return -1;
+            }
+
+            char line[OBJNAMELEN];
+            while (fgets(line, sizeof(line), file) != NULL)
+            {
+                if (sscanf(line, "%d", &device_uid) == 1)
+                {
+                    fclose(file);
+                    free(path_copy);
+                    return device_uid;
+                }
+            }
+            fclose(file);
+        }
+        token = strtok(NULL, "/");
+    }
+    free(path_copy);
     return -1;
 }
 
-- 
2.43.0


* [RFC PATCH 3/3] Implement a function to get a RCRB Base address
From: KobayashiDaisuke @ 2023-12-20 5:07 UTC
To: linux-pci; +Cc: linux-cxl, y-goto, KobayashiDaisuke

This patch adds a function to obtain the RCRB base address corresponding
to the uid. In the case of a CXL1.1 device, the RCRB base address is
included in the CHBS in the CEDT (cxl3.0 specification 9.17.1). In this
function, the ACPI's CEDT is explored, and the RCRB base address is
obtained from the CHBS corresponding to the uid.

Signed-off-by: KobayashiDaisuke <kobayashi.da-06@fujitsu.com>
---
 ls-caps.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/ls-caps.c b/ls-caps.c
index 077e8ea..cccd775 100644
--- a/ls-caps.c
+++ b/ls-caps.c
@@ -1438,6 +1438,51 @@ static int get_device_uid(struct device *d){
 }
 
 static off_t get_rcrb_base(int device_uid){
+    FILE *cedt_file = fopen("/sys/firmware/acpi/tables/CEDT", "rb");
+    if (cedt_file == NULL)
+        return -1;
+
+    struct CEDT_Header header;
+    fread(&header, sizeof(header), 1, cedt_file);
+
+    struct CHBS_Structure chbs;
+    chbs.base = 0;
+    size_t total_bytes_read = 0;
+    while (total_bytes_read < header.length)
+    {
+        struct CEDT_Structure cedt;
+        size_t bytes_read = fread(&cedt, sizeof(cedt), 1, cedt_file);
+        if (bytes_read != 1)
+        {
+            fclose(cedt_file);
+            return -1;
+        }
+        total_bytes_read += sizeof(cedt);
+        if (cedt.type == CHBS_TYPE)
+        {
+            bytes_read = fread(&chbs, sizeof(chbs), 1, cedt_file);
+            if(bytes_read != 1){
+                fclose(cedt_file);
+                return -1;
+            }
+            total_bytes_read += sizeof(chbs);
+            if ((int)chbs.uid == device_uid){
+                if(chbs.cxl_version == 0){
+                    fclose(cedt_file);
+                    return chbs.base;
+                }else{
+                    fclose(cedt_file);
+                    return -1;
+                }
+            }
+        }
+        else
+        {
+            fseek(cedt_file, cedt.record_length - sizeof(cedt), SEEK_SET);
+            total_bytes_read += cedt.record_length;
+        }
+    }
+    fclose(cedt_file);
     return -1;
 }
 
-- 
2.43.0

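
The CEDT lookup described in patch 3/3 can also be sketched over an
in-memory copy of the table, which makes the record stepping easy to
check. The sketch below is an illustration under stated assumptions
rather than the patch's code: cedt_find_rcrb() and the le16()/le32()/
le64() helpers are invented names, the table is assumed to have been read
into a buffer already, and each record is skipped by adding its length to
the current position (the in-memory equivalent of a seek relative to the
current file position). The field offsets follow the CHBS layout used by
the structures added to lspci.h in patch 1/3.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACPI_HEADER_LEN 36 /* standard ACPI table header size */
#define CEDT_TYPE_CHBS  0
#define CHBS_LEN        32 /* 4-byte record header + UID, version, reserved, base, length */

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t le64(const uint8_t *p)
{
    return (uint64_t)le32(p) | ((uint64_t)le32(p + 4) << 32);
}

/*
 * Search a CEDT image for the CHBS whose UID matches. Returns 0 and stores
 * the base address if that CHBS has CXL version 0 (CXL 1.1, where the base
 * is the RCRB); returns -1 if there is no match or the version differs.
 */
static int cedt_find_rcrb(const uint8_t *tbl, size_t len, uint32_t uid,
                          uint64_t *base)
{
    size_t pos = ACPI_HEADER_LEN;
    size_t end;

    if (len < ACPI_HEADER_LEN || memcmp(tbl, "CEDT", 4) != 0)
        return -1;
    end = le32(tbl + 4);          /* table length from the ACPI header */
    if (end > len)
        end = len;

    while (pos + 4 <= end) {
        uint8_t type = tbl[pos];
        uint16_t rec_len = le16(tbl + pos + 2);

        if (rec_len < 4 || rec_len > end - pos)
            return -1;            /* malformed record, stop instead of looping */
        if (type == CEDT_TYPE_CHBS && rec_len >= CHBS_LEN &&
            le32(tbl + pos + 4) == uid) {
            if (le32(tbl + pos + 8) != 0)
                return -1;        /* not a CXL 1.1 host bridge */
            *base = le64(tbl + pos + 16);
            return 0;
        }
        pos += rec_len;           /* advance relative to the current record */
    }
    return -1;
}
```

Starting the walk at the end of the 36-byte ACPI header and bounding it
by the header's length field keeps the loop condition in step with the
bytes actually consumed.
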

* Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
From: Jonathan Cameron @ 2024-01-09 15:57 UTC
To: KobayashiDaisuke; +Cc: linux-pci, linux-cxl, y-goto

On Wed, 20 Dec 2023 14:07:35 +0900
KobayashiDaisuke <kobayashi.da-06@fujitsu.com> wrote:

> Hello.
>
> This patch series adds a feature to lspci that displays the link status
> of the CXL1.1 device.
>
> CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> the link status can be output in the same way as traditional PCIe.
> However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> different method to obtain the link status from traditional PCIe.
> This is because the link status of the CXL1.1 device is not mapped
> in the configuration space (as per cxl3.0 specification 8.1).
> Instead, the configuration space containing the link status is mapped
> to the memory mapped register region (as per cxl3.0 specification 8.2,
> Table 8-18). Therefore, the current lspci has a problem where it does
> not display the link status of the CXL1.1 device.
> This patch solves these issues.
>
> The method of acquisition is in the order of obtaining the device UID,
> obtaining the base address from CEDT, and then obtaining the link
> status from memory mapped register. Considered outputting with the cxl
> command due to the scope of the CXL specification, but devices from
> CXL2.0 onwards can be output in the same way as traditional PCIe.
> Therefore, it would be better to make the lspci command compatible with
> the CXL1.1 device for compatibility reasons.
>
> I look forward to any comments you may have.

Yikes.

My gut feeling is that you shouldn't need to do this level of hackery.

If we need this information to be exposed to tooling then we should
add support to the kernel to export it somewhere in sysfs and read that
directly. Do we need it to be available in absence of the CXL driver
stack?

Jonathan

>
> KobayashiDaisuke (3):
>   Add function to display cxl1.1 device link status
>   Implement a function to get cxl1.1 device uid
>   Implement a function to get a RCRB Base address
>
>  ls-caps.c | 216 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  lspci.h   |  35 +++++++++
>  2 files changed, 251 insertions(+)
>


* Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
From: Dan Williams @ 2024-01-11 1:11 UTC
To: Jonathan Cameron, KobayashiDaisuke; +Cc: linux-pci, linux-cxl, y-goto

Jonathan Cameron wrote:
> On Wed, 20 Dec 2023 14:07:35 +0900
> KobayashiDaisuke <kobayashi.da-06@fujitsu.com> wrote:
>
> > Hello.
> >
> > This patch series adds a feature to lspci that displays the link status
> > of the CXL1.1 device.
> >
> > CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> > the link status can be output in the same way as traditional PCIe.
> > However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> > different method to obtain the link status from traditional PCIe.
> > This is because the link status of the CXL1.1 device is not mapped
> > in the configuration space (as per cxl3.0 specification 8.1).
> > Instead, the configuration space containing the link status is mapped
> > to the memory mapped register region (as per cxl3.0 specification 8.2,
> > Table 8-18). Therefore, the current lspci has a problem where it does
> > not display the link status of the CXL1.1 device.
> > This patch solves these issues.
> >
> > The method of acquisition is in the order of obtaining the device UID,
> > obtaining the base address from CEDT, and then obtaining the link
> > status from memory mapped register. Considered outputting with the cxl
> > command due to the scope of the CXL specification, but devices from
> > CXL2.0 onwards can be output in the same way as traditional PCIe.
> > Therefore, it would be better to make the lspci command compatible with
> > the CXL1.1 device for compatibility reasons.
> >
> > I look forward to any comments you may have.
> Yikes.
>
> My gut feeling is that you shouldn't need to do this level of hackery.
>
> If we need this information to be exposed to tooling then we should
> add support to the kernel to export it somewhere in sysfs and read that
> directly. Do we need it to be available in absence of the CXL driver
> stack?

I am hoping that's a non-goal if only because that makes it more
difficult for the kernel to provide some help here without polluting to
the PCI core.

To date, RCRB handling is nothing that the PCI core needs to worry
about, and I am not sure I want to open that box.

I am wondering about an approach like below is sufficient for lspci.

The idea here is that cxl_pci (or other PCI driver for Type-2 RCDs) can
opt-in to publishing these hidden registers.

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 4fd1f207c84e..ee63dff63b68 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -960,6 +960,19 @@ static const struct pci_error_handlers cxl_error_handlers = {
 	.cor_error_detected = cxl_cor_error_detected,
 };
 
+static struct attribute *cxl_rcd_attrs[] = {
+	&dev_attr_rcd_lnkcp.attr,
+	&dev_attr_rcd_lnkctl.attr,
+	NULL
+};
+
+static struct attribute_group cxl_rcd_group = {
+	.attrs = cxl_rcd_attrs,
+	.is_visible = cxl_rcd_visible,
+};
+
+__ATTRIBUTE_GROUPS(cxl_pci);
+
 static struct pci_driver cxl_pci_driver = {
 	.name = KBUILD_MODNAME,
 	.id_table = cxl_mem_pci_tbl,
@@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
 	.err_handler = &cxl_error_handlers,
 	.driver = {
 		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
+		.dev_groups = cxl_rcd_groups,
 	},
 };
 

However, the problem I believe is this will end up with:

/sys/bus/pci/devices/$pdev/rcd_lnkcap
/sys/bus/pci/devices/$pdev/rcd_lnkctl

...with valid values, but attributes like:

/sys/bus/pci/devices/$pdev/current_link_speed

...returning -EINVAL.

So I think the options are:

1/ Keep the status quo of RCRB knowledge only lives in drivers/cxl/ and
piecemeal enable specific lspci needs with RCD-specific attributes

...or:

2/ Hack pcie_capability_read_word() to internally figure out that based
on a config offset a device may have a hidden capability and switch over
to RCRB based config-cycle access for those.

Given that the CXL 1.1 RCH topology concept was immediately deprecated
in favor of VH topology in CXL 2.0, I am not inclined to pollute the
general Linux PCI core with that "aberration of history" as it were.


* Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
From: Jonathan Cameron @ 2024-01-12 11:24 UTC
To: Dan Williams; +Cc: KobayashiDaisuke, linux-pci, linux-cxl, y-goto

On Wed, 10 Jan 2024 17:11:38 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Wed, 20 Dec 2023 14:07:35 +0900
> > KobayashiDaisuke <kobayashi.da-06@fujitsu.com> wrote:
> >
> > > Hello.
> > >
> > > This patch series adds a feature to lspci that displays the link status
> > > of the CXL1.1 device.
> > >
> > > CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> > > the link status can be output in the same way as traditional PCIe.
> > > However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> > > different method to obtain the link status from traditional PCIe.
> > > This is because the link status of the CXL1.1 device is not mapped
> > > in the configuration space (as per cxl3.0 specification 8.1).
> > > Instead, the configuration space containing the link status is mapped
> > > to the memory mapped register region (as per cxl3.0 specification 8.2,
> > > Table 8-18). Therefore, the current lspci has a problem where it does
> > > not display the link status of the CXL1.1 device.
> > > This patch solves these issues.
> > >
> > > The method of acquisition is in the order of obtaining the device UID,
> > > obtaining the base address from CEDT, and then obtaining the link
> > > status from memory mapped register. Considered outputting with the cxl
> > > command due to the scope of the CXL specification, but devices from
> > > CXL2.0 onwards can be output in the same way as traditional PCIe.
> > > Therefore, it would be better to make the lspci command compatible with
> > > the CXL1.1 device for compatibility reasons.
> > >
> > > I look forward to any comments you may have.
> > Yikes.
> >
> > My gut feeling is that you shouldn't need to do this level of hackery.
> >
> > If we need this information to be exposed to tooling then we should
> > add support to the kernel to export it somewhere in sysfs and read that
> > directly. Do we need it to be available in absence of the CXL driver
> > stack?
>
> I am hoping that's a non-goal if only because that makes it more
> difficult for the kernel to provide some help here without polluting to
> the PCI core.
>
> To date, RCRB handling is nothing that the PCI core needs to worry
> about, and I am not sure I want to open that box.
>
> I am wondering about an approach like below is sufficient for lspci.
>
> The idea here is that cxl_pci (or other PCI driver for Type-2 RCDs) can
> opt-in to publishing these hidden registers.
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 4fd1f207c84e..ee63dff63b68 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -960,6 +960,19 @@ static const struct pci_error_handlers cxl_error_handlers = {
>  	.cor_error_detected = cxl_cor_error_detected,
>  };
>
> +static struct attribute *cxl_rcd_attrs[] = {
> +	&dev_attr_rcd_lnkcp.attr,
> +	&dev_attr_rcd_lnkctl.attr,
> +	NULL
> +};
> +
> +static struct attribute_group cxl_rcd_group = {
> +	.attrs = cxl_rcd_attrs,
> +	.is_visible = cxl_rcd_visible,
> +};
> +
> +__ATTRIBUTE_GROUPS(cxl_pci);
> +
>  static struct pci_driver cxl_pci_driver = {
>  	.name = KBUILD_MODNAME,
>  	.id_table = cxl_mem_pci_tbl,
> @@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
>  	.err_handler = &cxl_error_handlers,
>  	.driver = {
>  		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
> +		.dev_groups = cxl_rcd_groups,
>  	},
>  };
>
>
> However, the problem I believe is this will end up with:
>
> /sys/bus/pci/devices/$pdev/rcd_lnkcap
> /sys/bus/pci/devices/$pdev/rcd_lnkctl
>
> ...with valid values, but attributes like:
>
> /sys/bus/pci/devices/$pdev/current_link_speed
>
> ...returning -EINVAL.
>
> So I think the options are:
>
> 1/ Keep the status quo of RCRB knowledge only lives in drivers/cxl/ and
> piecemeal enable specific lspci needs with RCD-specific attributes

This one gets my vote.

>
> ...or:
>
> 2/ Hack pcie_capability_read_word() to internally figure out that based
> on a config offset a device may have a hidden capability and switch over
> to RCRB based config-cycle access for those.
>
> Given that the CXL 1.1 RCH topology concept was immediately deprecated
> in favor of VH topology in CXL 2.0, I am not inclined to pollute the
> general Linux PCI core with that "aberration of history" as it were.

Agreed.


* RE: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
From: Daisuke Kobayashi (Fujitsu) @ 2024-01-15 9:09 UTC
To: 'Jonathan Cameron', Dan Williams
Cc: linux-pci@vger.kernel.org, linux-cxl@vger.kernel.org, Yasunori Gotou (Fujitsu)

> -----Original Message-----
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Sent: Friday, January 12, 2024 8:24 PM
> To: Dan Williams <dan.j.williams@intel.com>
> Cc: Kobayashi, Daisuke/小林 大介 <kobayashi.da-06@fujitsu.com>;
> linux-pci@vger.kernel.org; linux-cxl@vger.kernel.org;
> Gotou, Yasunori/五島 康文 <y-goto@fujitsu.com>
> Subject: Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
>
> On Wed, 10 Jan 2024 17:11:38 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > Jonathan Cameron wrote:
> > > On Wed, 20 Dec 2023 14:07:35 +0900
> > > KobayashiDaisuke <kobayashi.da-06@fujitsu.com> wrote:
> > >
> > > > Hello.
> > > >
> > > > This patch series adds a feature to lspci that displays the link status
> > > > of the CXL1.1 device.
> > > >
> > > > CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> > > > the link status can be output in the same way as traditional PCIe.
> > > > However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> > > > different method to obtain the link status from traditional PCIe.
> > > > This is because the link status of the CXL1.1 device is not mapped
> > > > in the configuration space (as per cxl3.0 specification 8.1).
> > > > Instead, the configuration space containing the link status is mapped
> > > > to the memory mapped register region (as per cxl3.0 specification 8.2,
> > > > Table 8-18). Therefore, the current lspci has a problem where it does
> > > > not display the link status of the CXL1.1 device.
> > > > This patch solves these issues.
> > > >
> > > > The method of acquisition is in the order of obtaining the device UID,
> > > > obtaining the base address from CEDT, and then obtaining the link
> > > > status from memory mapped register. Considered outputting with the cxl
> > > > command due to the scope of the CXL specification, but devices from
> > > > CXL2.0 onwards can be output in the same way as traditional PCIe.
> > > > Therefore, it would be better to make the lspci command compatible with
> > > > the CXL1.1 device for compatibility reasons.
> > > >
> > > > I look forward to any comments you may have.
> > > Yikes.
> > >
> > > My gut feeling is that you shouldn't need to do this level of hackery.
> > >
> > > If we need this information to be exposed to tooling then we should
> > > add support to the kernel to export it somewhere in sysfs and read that
> > > directly. Do we need it to be available in absence of the CXL driver
> > > stack?
> >
> > I am hoping that's a non-goal if only because that makes it more
> > difficult for the kernel to provide some help here without polluting to
> > the PCI core.
> >
> > To date, RCRB handling is nothing that the PCI core needs to worry
> > about, and I am not sure I want to open that box.
> >
> > I am wondering about an approach like below is sufficient for lspci.
> >
> > The idea here is that cxl_pci (or other PCI driver for Type-2 RCDs) can
> > opt-in to publishing these hidden registers.
> >
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 4fd1f207c84e..ee63dff63b68 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -960,6 +960,19 @@ static const struct pci_error_handlers cxl_error_handlers = {
> >  	.cor_error_detected = cxl_cor_error_detected,
> >  };
> >
> > +static struct attribute *cxl_rcd_attrs[] = {
> > +	&dev_attr_rcd_lnkcp.attr,
> > +	&dev_attr_rcd_lnkctl.attr,
> > +	NULL
> > +};
> > +
> > +static struct attribute_group cxl_rcd_group = {
> > +	.attrs = cxl_rcd_attrs,
> > +	.is_visible = cxl_rcd_visible,
> > +};
> > +
> > +__ATTRIBUTE_GROUPS(cxl_pci);
> > +
> >  static struct pci_driver cxl_pci_driver = {
> >  	.name = KBUILD_MODNAME,
> >  	.id_table = cxl_mem_pci_tbl,
> > @@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
> >  	.err_handler = &cxl_error_handlers,
> >  	.driver = {
> >  		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
> > +		.dev_groups = cxl_rcd_groups,
> >  	},
> >  };
> >
> >
> > However, the problem I believe is this will end up with:
> >
> > /sys/bus/pci/devices/$pdev/rcd_lnkcap
> > /sys/bus/pci/devices/$pdev/rcd_lnkctl
> >
> > ...with valid values, but attributes like:
> >
> > /sys/bus/pci/devices/$pdev/current_link_speed
> >
> > ...returning -EINVAL.
> >
> > So I think the options are:
> >
> > 1/ Keep the status quo of RCRB knowledge only lives in drivers/cxl/ and
> > piecemeal enable specific lspci needs with RCD-specific attributes
>
> This one gets my vote.

Thank you for your feedback.
Like Dan, I also believe that implementing this feature in the kernel may
not be appropriate, as it is specifically needed for CXL1.1 devices.
Therefore, I understand that it would be better to implement
the link status of CXL1.1 devices directly in lspci.
Please tell me if my understanding is wrong.

> >
> > ...or:
> >
> > 2/ Hack pcie_capability_read_word() to internally figure out that based
> > on a config offset a device may have a hidden capability and switch over
> > to RCRB based config-cycle access for those.
> >
> > Given that the CXL 1.1 RCH topology concept was immediately deprecated
> > in favor of VH topology in CXL 2.0, I am not inclined to pollute the
> > general Linux PCI core with that "aberration of history" as it were.
> Agreed.
>

* RE: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
2024-01-15 9:09 ` Daisuke Kobayashi (Fujitsu)
@ 2024-01-16 21:29 ` Dan Williams
2024-01-17 9:23 ` Daisuke Kobayashi (Fujitsu)
0 siblings, 1 reply; 12+ messages in thread
From: Dan Williams @ 2024-01-16 21:29 UTC (permalink / raw)
To: Daisuke Kobayashi (Fujitsu), 'Jonathan Cameron', Dan Williams
Cc: linux-pci@vger.kernel.org, linux-cxl@vger.kernel.org, Yasunori Gotou (Fujitsu)

Daisuke Kobayashi (Fujitsu) wrote:
> > > I am wondering whether an approach like the one below is sufficient
> > > for lspci.
> > >
> > > The idea here is that cxl_pci (or other PCI driver for Type-2 RCDs) can
> > > opt-in to publishing these hidden registers.
> > >
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index 4fd1f207c84e..ee63dff63b68 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -960,6 +960,19 @@ static const struct pci_error_handlers cxl_error_handlers = {
> > >  	.cor_error_detected	= cxl_cor_error_detected,
> > >  };
> > >
> > > +static struct attribute *cxl_rcd_attrs[] = {
> > > +	&dev_attr_rcd_lnkcap.attr,
> > > +	&dev_attr_rcd_lnkctl.attr,
> > > +	NULL
> > > +};
> > > +
> > > +static struct attribute_group cxl_rcd_group = {
> > > +	.attrs = cxl_rcd_attrs,
> > > +	.is_visible = cxl_rcd_visible,
> > > +};
> > > +
> > > +__ATTRIBUTE_GROUPS(cxl_rcd);
> > > +
> > >  static struct pci_driver cxl_pci_driver = {
> > >  	.name			= KBUILD_MODNAME,
> > >  	.id_table		= cxl_mem_pci_tbl,
> > > @@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
> > >  	.err_handler		= &cxl_error_handlers,
> > >  	.driver	= {
> > >  		.probe_type	= PROBE_PREFER_ASYNCHRONOUS,
> > > +		.dev_groups	= cxl_rcd_groups,
> > >  	},
> > >  };
> > >
> > >
> > > However, the problem I believe is this will end up with:
> > >
> > > /sys/bus/pci/devices/$pdev/rcd_lnkcap
> > > /sys/bus/pci/devices/$pdev/rcd_lnkctl
> > >
> > > ...with valid values, but attributes like:
> > >
> > > /sys/bus/pci/devices/$pdev/current_link_speed
> > >
> > > ...returning -EINVAL.
> > >
> > > So I think the options are:
> > >
> > > 1/ Keep the status quo in which RCRB knowledge lives only in
> > > drivers/cxl/ and piecemeal enable specific lspci needs with
> > > RCD-specific attributes
> >
> > This one gets my vote.
>
> Thank you for your feedback.
> Like Dan, I also believe that implementing this feature in the kernel may
> not be appropriate, as it is specifically needed for CXL1.1 devices.
> Therefore, I understand that it would be better to implement
> the link status of CXL1.1 devices directly in lspci.
> Please tell me if my understanding is wrong.

The proposal is to do a hybrid approach. The drivers/cxl/ subsystem
already handles RCRB register access internally, so it can go further
and expose a couple of attributes ("rcd_lnkcap" and "rcd_lnkctl") that
lspci can go read. In other words, "/dev/mem" is not a reliable way to
access the RCRB, and it is too much work to make the existing sysfs
config-space access ABI understand the RCRB layout since that
complication would only be useful for one hardware generation.

An additional idea here is to allow for the CXL subsystem to take over
publishing PCIe attributes like "current_link_speed" that are currently
broken by the RCRB configuration, with a change like this:

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 2321fdfefd7d..982bbec721fd 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1613,7 +1613,7 @@ static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
 	struct device *dev = kobj_to_dev(kobj);
 	struct pci_dev *pdev = to_pci_dev(dev);
 
-	if (pci_is_pcie(pdev))
+	if (pci_is_pcie(pdev) && !is_cxl_rcd(pdev))
 		return a->mode;
 
 	return 0;

...then the CXL subsystem can produce its own attributes with the same
name, but backed by the RCRB lookup mechanism.

^ permalink raw reply related [flat|nested] 12+ messages in thread
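The user-space half of the hybrid approach described above could look roughly like the sketch below. It assumes the kernel publishes "rcd_lnkcap" as a single hexadecimal number in sysfs; the attribute name comes from the proposal, but its text format is an assumption, and the helpers parse_hex_attr(), decode_lnkcap() and read_rcd_lnkcap() are hypothetical names for illustration, not existing pciutils functions. The bit positions are those of the PCIe Link Capabilities register (bits 3:0 maximum link speed code, bits 9:4 maximum link width).

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Decoded view of a PCIe Link Capabilities register value. */
struct lnkcap_info {
	unsigned int max_speed;	/* bits 3:0, encoded speed (1 = 2.5GT/s, 2 = 5GT/s, ...) */
	unsigned int max_width;	/* bits 9:4, number of lanes */
};

/*
 * Parse the text of a sysfs attribute holding one hexadecimal number,
 * for example "0x00476105\n". Returns 0 on success, -1 on malformed input.
 * ASSUMPTION: the hypothetical rcd_lnkcap attribute uses this format.
 */
static int parse_hex_attr(const char *text, uint32_t *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(text, &end, 16);
	if (errno || end == text || v > 0xffffffffUL)
		return -1;
	/* Allow trailing newline or spaces, reject anything else. */
	while (*end == '\n' || *end == ' ')
		end++;
	if (*end != '\0')
		return -1;
	*out = (uint32_t)v;
	return 0;
}

/* Extract the speed and width fields with shifts and masks only. */
static struct lnkcap_info decode_lnkcap(uint32_t lnkcap)
{
	struct lnkcap_info info;

	info.max_speed = lnkcap & 0xf;
	info.max_width = (lnkcap >> 4) & 0x3f;
	return info;
}

/*
 * Read and decode <sysfs_dir>/rcd_lnkcap, where sysfs_dir would be
 * something like /sys/bus/pci/devices/<address>.
 * Returns 0 on success, -1 if the attribute is missing or malformed.
 */
static int read_rcd_lnkcap(const char *sysfs_dir, struct lnkcap_info *info)
{
	char path[512], buf[64];
	uint32_t val;
	FILE *f;

	snprintf(path, sizeof(path), "%s/rcd_lnkcap", sysfs_dir);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	if (parse_hex_attr(buf, &val))
		return -1;
	*info = decode_lnkcap(val);
	return 0;
}
```

With this split, lspci never maps the RCRB itself: a missing attribute simply means the device is not an RCD (or the kernel is too old), and the tool can fall back to the ordinary config-space path.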
* RE: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
2024-01-16 21:29 ` Dan Williams
@ 2024-01-17 9:23 ` Daisuke Kobayashi (Fujitsu)
0 siblings, 0 replies; 12+ messages in thread
From: Daisuke Kobayashi (Fujitsu) @ 2024-01-17 9:23 UTC (permalink / raw)
To: 'Dan Williams', 'Jonathan Cameron'
Cc: linux-pci@vger.kernel.org, linux-cxl@vger.kernel.org, Yasunori Gotou (Fujitsu)

> -----Original Message-----
> From: Dan Williams <dan.j.williams@intel.com>
> Sent: Wednesday, January 17, 2024 6:29 AM
> To: Kobayashi, Daisuke/小林 大介 <kobayashi.da-06@fujitsu.com>;
> 'Jonathan Cameron' <Jonathan.Cameron@huawei.com>; Dan Williams
> <dan.j.williams@intel.com>
> Cc: linux-pci@vger.kernel.org; linux-cxl@vger.kernel.org;
> Gotou, Yasunori/五島 康文 <y-goto@fujitsu.com>
> Subject: RE: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
>
> Daisuke Kobayashi (Fujitsu) wrote:
> > > > I am wondering whether an approach like the one below is sufficient
> > > > for lspci.
> > > >
> > > > The idea here is that cxl_pci (or other PCI driver for Type-2
> > > > RCDs) can opt-in to publishing these hidden registers.
> > > >
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > index 4fd1f207c84e..ee63dff63b68 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -960,6 +960,19 @@ static const struct pci_error_handlers cxl_error_handlers = {
> > > >  	.cor_error_detected	= cxl_cor_error_detected,
> > > >  };
> > > >
> > > > +static struct attribute *cxl_rcd_attrs[] = {
> > > > +	&dev_attr_rcd_lnkcap.attr,
> > > > +	&dev_attr_rcd_lnkctl.attr,
> > > > +	NULL
> > > > +};
> > > > +
> > > > +static struct attribute_group cxl_rcd_group = {
> > > > +	.attrs = cxl_rcd_attrs,
> > > > +	.is_visible = cxl_rcd_visible,
> > > > +};
> > > > +
> > > > +__ATTRIBUTE_GROUPS(cxl_rcd);
> > > > +
> > > >  static struct pci_driver cxl_pci_driver = {
> > > >  	.name			= KBUILD_MODNAME,
> > > >  	.id_table		= cxl_mem_pci_tbl,
> > > > @@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
> > > >  	.err_handler		= &cxl_error_handlers,
> > > >  	.driver	= {
> > > >  		.probe_type	= PROBE_PREFER_ASYNCHRONOUS,
> > > > +		.dev_groups	= cxl_rcd_groups,
> > > >  	},
> > > >  };
> > > >
> > > >
> > > > However, the problem I believe is this will end up with:
> > > >
> > > > /sys/bus/pci/devices/$pdev/rcd_lnkcap
> > > > /sys/bus/pci/devices/$pdev/rcd_lnkctl
> > > >
> > > > ...with valid values, but attributes like:
> > > >
> > > > /sys/bus/pci/devices/$pdev/current_link_speed
> > > >
> > > > ...returning -EINVAL.
> > > >
> > > > So I think the options are:
> > > >
> > > > 1/ Keep the status quo in which RCRB knowledge lives only in
> > > > drivers/cxl/ and piecemeal enable specific lspci needs with
> > > > RCD-specific attributes
> > >
> > > This one gets my vote.
> >
> > Thank you for your feedback.
> > Like Dan, I also believe that implementing this feature in the kernel
> > may not be appropriate, as it is specifically needed for CXL1.1 devices.
> > Therefore, I understand that it would be better to implement the link
> > status of CXL1.1 devices directly in lspci.
> > Please tell me if my understanding is wrong.
>
> The proposal is to do a hybrid approach. The drivers/cxl/ subsystem already
> handles RCRB register access internally, so it can go further and expose a
> couple of attributes ("rcd_lnkcap" and "rcd_lnkctl") that lspci can go read.
> In other words, "/dev/mem" is not a reliable way to access the RCRB, and it
> is too much work to make the existing sysfs config-space access ABI
> understand the RCRB layout since that complication would only be useful for
> one hardware generation.
>
> An additional idea here is to allow for the CXL subsystem to take over
> publishing PCIe attributes like "current_link_speed" that are currently
> broken by the RCRB configuration, with a change like this:
>

Thank you, it seems that my understanding was incorrect.
I will try to consider the implementation by dividing it into parts:
the hook on the pci driver, the RCRB access in the cxl driver, and the
sysfs reading in lspci.

> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 2321fdfefd7d..982bbec721fd 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1613,7 +1613,7 @@ static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
>  	struct device *dev = kobj_to_dev(kobj);
>  	struct pci_dev *pdev = to_pci_dev(dev);
>
> -	if (pci_is_pcie(pdev))
> +	if (pci_is_pcie(pdev) && !is_cxl_rcd(pdev))
>  		return a->mode;
>
>  	return 0;
>
> ...then the CXL subsystem can produce its own attributes with the same name,
> but backed by the RCRB lookup mechanism.

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
2023-12-20 5:07 [RFC PATCH 0/3] lspci: Display cxl1.1 device link status KobayashiDaisuke
` (3 preceding siblings ...)
2024-01-09 15:57 ` [RFC PATCH 0/3] lspci: Display cxl1.1 device link status Jonathan Cameron
@ 2024-01-17 12:10 ` Martin Mareš
2024-01-18 5:07 ` Daisuke Kobayashi (Fujitsu)
4 siblings, 1 reply; 12+ messages in thread
From: Martin Mareš @ 2024-01-17 12:10 UTC (permalink / raw)
To: KobayashiDaisuke; +Cc: linux-pci, linux-cxl, y-goto

Hello!

Sorry for the late reply, but these days I don't read linux-pci regularly.
Please Cc me on all patches for the pciutils.

Anyway...

I don't think this is the right approach. You poke things you shouldn't
in user space, you also make some bold assumptions on endianity of the
machine (you are using native C structs for data provided by the hardware).

This belongs to the kernel.

Have a nice fortnight
--
Martin `MJ' Mareš <mj@ucw.cz> http://mj.ucw.cz/
United Computer Wizards, Prague, Czech Republic, Europe, Earth, Universe

^ permalink raw reply [flat|nested] 12+ messages in thread
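The endianness concern raised above can be illustrated with a small sketch. Instead of overlaying a native C struct on bytes that come from the hardware, each register is assembled from individual bytes in little-endian order (the byte order PCI and PCIe registers use) and the fields are then extracted with shifts and masks, so the result does not depend on the host byte order or on compiler bit-field layout. The helper names get_le16(), lnksta_speed() and lnksta_width() are made up for this example and are not pciutils functions; the field positions are those of the PCIe Link Status register (bits 3:0 current link speed, bits 9:4 negotiated link width).

```c
#include <stdint.h>

/*
 * Assemble a 16-bit little-endian value from two consecutive bytes.
 * Gives the same result on little-endian and big-endian hosts.
 */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Current link speed code: bits 3:0 of Link Status. */
static unsigned int lnksta_speed(uint16_t lnksta)
{
	return lnksta & 0xf;
}

/* Negotiated link width in lanes: bits 9:4 of Link Status. */
static unsigned int lnksta_width(uint16_t lnksta)
{
	return (lnksta >> 4) & 0x3f;
}
```

For example, the raw bytes 0x43 0x10 decode to the register value 0x1043, that is, speed code 3 at width x4, regardless of the machine the tool runs on.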
* RE: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
2024-01-17 12:10 ` Martin Mareš
@ 2024-01-18 5:07 ` Daisuke Kobayashi (Fujitsu)
0 siblings, 0 replies; 12+ messages in thread
From: Daisuke Kobayashi (Fujitsu) @ 2024-01-18 5:07 UTC (permalink / raw)
To: 'Martin Mareš'
Cc: linux-pci@vger.kernel.org, linux-cxl@vger.kernel.org, Yasunori Gotou (Fujitsu), 'mj@ucw.cz'

> -----Original Message-----
> From: Martin Mareš <mj@ucw.cz>
> Sent: Wednesday, January 17, 2024 9:11 PM
> To: Kobayashi, Daisuke/小林 大介 <kobayashi.da-06@fujitsu.com>
> Cc: linux-pci@vger.kernel.org; linux-cxl@vger.kernel.org;
> Gotou, Yasunori/五島 康文 <y-goto@fujitsu.com>
> Subject: Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
>
> Hello!
>
> Sorry for the late reply, but these days I don't read linux-pci regularly.
> Please Cc me on all patches for the pciutils.
>

I see. I'll include you in CC.

> Anyway...
>
> I don't think this is the right approach. You poke things you shouldn't in
> user space, you also make some bold assumptions on endianity of the machine
> (you are using native C structs for data provided by the hardware).
>
> This belongs to the kernel.

Thank you, Martin.
I just want to say you made a good point.
I totally missed that, so thanks for pointing it out.

>
> Have a nice fortnight
> --
> Martin `MJ' Mareš <mj@ucw.cz> http://mj.ucw.cz/
> United Computer Wizards, Prague, Czech Republic, Europe, Earth, Universe

^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-01-18 5:08 UTC | newest]

Thread overview: 12+ messages
2023-12-20 5:07 [RFC PATCH 0/3] lspci: Display cxl1.1 device link status KobayashiDaisuke
2023-12-20 5:07 ` [RFC PATCH 1/3] Add function to display " KobayashiDaisuke
2023-12-20 5:07 ` [RFC PATCH 2/3] Implement a function to get cxl1.1 device uid KobayashiDaisuke
2023-12-20 5:07 ` [RFC PATCH 3/3] Implement a function to get a RCRB Base address KobayashiDaisuke
2024-01-09 15:57 ` [RFC PATCH 0/3] lspci: Display cxl1.1 device link status Jonathan Cameron
2024-01-11 1:11 ` Dan Williams
2024-01-12 11:24 ` Jonathan Cameron
2024-01-15 9:09 ` Daisuke Kobayashi (Fujitsu)
2024-01-16 21:29 ` Dan Williams
2024-01-17 9:23 ` Daisuke Kobayashi (Fujitsu)
2024-01-17 12:10 ` Martin Mareš
2024-01-18 5:07 ` Daisuke Kobayashi (Fujitsu)