* PCI VPD checksum ambiguity
@ 2025-04-01 20:55 Bjorn Helgaas
2025-04-01 21:51 ` Heiner Kallweit
2025-04-10 16:21 ` Bjorn Helgaas
0 siblings, 2 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2025-04-01 20:55 UTC (permalink / raw)
To: Pavan Chebbi, Michael Chan, Potnuri Bharat Teja, Heiner Kallweit
Cc: netdev, linux-pci, linux-kernel
Hi,
The PCIe spec is ambiguous about how the VPD checksum should be
computed, and resolving this ambiguity might break drivers.
PCIe r6.0 sec 6.27 says only the VPD-R list should be included in the
checksum:
One VPD-R (10h) tag is used as a header for the read-only keywords.
The VPD-R list (including tag and length) must checksum to zero.
But sec 6.27.2.2 says "all bytes in VPD ... up to the checksum byte":
RV The first byte of this item is a checksum byte. The checksum is
correct if the sum of all bytes in VPD (from VPD address 0 up
to and including this byte) is zero.
These are obviously different unless VPD-R happens to be the first
item in VPD. But sec 6.27 and 6.27.2.1 suggest that the Identifier
String item should be the first item, preceding the VPD-R list:
The first VPD tag is the Identifier String (02h) and provides the
product name of the device. [6.27]
Large resource type Identifier String (02h)
This tag is the first item in the VPD storage component. It
contains the name of the add-in card in alphanumeric characters.
[6.27.2.1, Table 6-23]
I think pci_vpd_check_csum() follows sec 6.27.2.2: it sums all the
bytes in the buffer up to and including the checksum byte of the RV
keyword. The range starts at 0, not at the beginning of the VPD-R
read-only list, so it likely includes the Identifier String.
As far as I can tell, only the broadcom/tg3 and chelsio/cxgb4/t4
drivers use pci_vpd_check_csum(). Of course, other drivers might
compute the checksum themselves.
Any thoughts on how this spec ambiguity should be resolved?
Any idea how devices in the field populate their VPD?
Can you share any VPD dumps from devices that include an RV keyword
item?
Bjorn
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: PCI VPD checksum ambiguity
2025-04-01 20:55 PCI VPD checksum ambiguity Bjorn Helgaas
@ 2025-04-01 21:51 ` Heiner Kallweit
2025-04-01 22:35 ` Bjorn Helgaas
2025-04-10 16:21 ` Bjorn Helgaas
1 sibling, 1 reply; 6+ messages in thread
From: Heiner Kallweit @ 2025-04-01 21:51 UTC (permalink / raw)
To: Bjorn Helgaas, Pavan Chebbi, Michael Chan, Potnuri Bharat Teja
Cc: netdev, linux-pci, linux-kernel
On 01.04.2025 22:55, Bjorn Helgaas wrote:
> Hi,
>
> The PCIe spec is ambiguous about how the VPD checksum should be
> computed, and resolving this ambiguity might break drivers.
>
> PCIe r6.0 sec 6.27 says only the VPD-R list should be included in the
> checksum:
>
> One VPD-R (10h) tag is used as a header for the read-only keywords.
> The VPD-R list (including tag and length) must checksum to zero.
>
This requirement I don't find in the PCI 3.0 spec, not sure with which
version it was added. Interestingly this doesn't even require the
presence of a RV keyword. Or the presence of the RV keyword is
implicitly assumed.
Maybe this part isn't meant literally. I can imagine they wanted to
clarify that checksum calculation excludes the VPD-W area.
And unfortunately they weren't precise enough, and introduced the
ambiguity you found.
> But sec 6.27.2.2 says "all bytes in VPD ... up to the checksum byte":
>
> RV The first byte of this item is a checksum byte. The checksum is
> correct if the sum of all bytes in VPD (from VPD address 0 up
> to and including this byte) is zero.
>
This one can be found identically in the PCI v3.0 spec already:
The checksum is correct if the sum of all bytes in VPD (from
VPD address 0 up to and including this byte) is zero.
I don't think they want to break backwards-compatibility, therefore
this requirement should still be valid.
> These are obviously different unless VPD-R happens to be the first
> item in VPD. But sec 6.27 and 6.27.2.1 suggest that the Identifier
> String item should be the first item, preceding the VPD-R list:
>
> The first VPD tag is the Identifier String (02h) and provides the
> product name of the device. [6.27]
>
> Large resource type Identifier String (02h)
>
> This tag is the first item in the VPD storage component. It
> contains the name of the add-in card in alphanumeric characters.
> [6.27.2.1, Table 6-23]
>
> I think pci_vpd_check_csum() follows sec 6.27.2.2: it sums all the
> bytes in the buffer up to and including the checksum byte of the RV
> keyword. The range starts at 0, not at the beginning of the VPD-R
> read-only list, so it likely includes the Identifier String.
>
> As far as I can tell, only the broadcom/tg3 and chelsio/cxgb4/t4
> drivers use pci_vpd_check_csum(). Of course, other drivers might
> compute the checksum themselves.
>
> Any thoughts on how this spec ambiguity should be resolved?
>
> Any idea how devices in the field populate their VPD?
>
> Can you share any VPD dumps from devices that include an RV keyword
> item?
>
I have only very dated devices which likely date back to before
the existence of PCIe r6.0. So their VPD dump may not really help.
IIRC there's an ongoing discussion regarding making VPD content
user-readable on mlx5 devices. Maybe check with the Mellanox/Nvidia
guys how they interpret the spec and implemented VPD checksumming.
> Bjorn
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: PCI VPD checksum ambiguity
2025-04-01 21:51 ` Heiner Kallweit
@ 2025-04-01 22:35 ` Bjorn Helgaas
2025-04-02 10:33 ` Pavan Chebbi
0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2025-04-01 22:35 UTC (permalink / raw)
To: Heiner Kallweit
Cc: Pavan Chebbi, Michael Chan, Potnuri Bharat Teja, netdev,
linux-pci, linux-kernel, Leon Romanovsky
On Tue, Apr 01, 2025 at 11:51:07PM +0200, Heiner Kallweit wrote:
> On 01.04.2025 22:55, Bjorn Helgaas wrote:
> > Hi,
> >
> > The PCIe spec is ambiguous about how the VPD checksum should be
> > computed, and resolving this ambiguity might break drivers.
> >
> > PCIe r6.0 sec 6.27 says only the VPD-R list should be included in the
> > checksum:
> >
> > One VPD-R (10h) tag is used as a header for the read-only keywords.
> > The VPD-R list (including tag and length) must checksum to zero.
>
> This requirement I don't find in the PCI 3.0 spec, not sure with which
> version it was added.
I think it's there; that same text is in PCI r3.0, Appendix I, about
five lines before I.1.
> Interestingly this doesn't even require the presence of a RV
> keyword. Or the presence of the RV keyword is implicitly assumed.
I.3.1.1 says RV is required, and I guess it has to be last in VPD-R to
cover any reserved space (as needed, I suppose to align to the VPD-W
area, which might be in a different chip).
> Maybe this part isn't meant literally. I can imagine they wanted to
> clarify that checksum calculation excludes the VPD-W area.
> And unfortunately they weren't precise enough, and introduced the
> ambiguity you found.
>
> > But sec 6.27.2.2 says "all bytes in VPD ... up to the checksum byte":
> >
> > RV The first byte of this item is a checksum byte. The checksum is
> > correct if the sum of all bytes in VPD (from VPD address 0 up
> > to and including this byte) is zero.
>
> This one can be found identically in the PCI v3.0 spec already:
>
> The checksum is correct if the sum of all bytes in VPD (from
> VPD address 0 up to and including this byte) is zero.
>
> I don't think they want to break backwards-compatibility, therefore
> this requirement should still be valid.
Yes, and I think the RV description is more specific and is what I
would have used to implement it.
> > These are obviously different unless VPD-R happens to be the first
> > item in VPD. But sec 6.27 and 6.27.2.1 suggest that the Identifier
> > String item should be the first item, preceding the VPD-R list:
> >
> > The first VPD tag is the Identifier String (02h) and provides the
> > product name of the device. [6.27]
> >
> > Large resource type Identifier String (02h)
> >
> > This tag is the first item in the VPD storage component. It
> > contains the name of the add-in card in alphanumeric characters.
> > [6.27.2.1, Table 6-23]
> >
> > I think pci_vpd_check_csum() follows sec 6.27.2.2: it sums all the
> > bytes in the buffer up to and including the checksum byte of the RV
> > keyword. The range starts at 0, not at the beginning of the VPD-R
> > read-only list, so it likely includes the Identifier String.
> >
> > As far as I can tell, only the broadcom/tg3 and chelsio/cxgb4/t4
> > drivers use pci_vpd_check_csum(). Of course, other drivers might
> > compute the checksum themselves.
> >
> > Any thoughts on how this spec ambiguity should be resolved?
> >
> > Any idea how devices in the field populate their VPD?
> >
> > Can you share any VPD dumps from devices that include an RV keyword
> > item?
>
> I have only very dated devices which likely date back to before
> the existence of PCIe r6.0. So their VPD dump may not really help.
>
> IIRC there's an ongoing discussion regarding making VPD content
> user-readable on mlx5 devices. Maybe check with the Mellanox/Nvidia
> guys how they interpret the spec and implemented VPD checksumming.
Good idea, cc'd.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: PCI VPD checksum ambiguity
2025-04-01 22:35 ` Bjorn Helgaas
@ 2025-04-02 10:33 ` Pavan Chebbi
2025-04-07 22:22 ` Bjorn Helgaas
0 siblings, 1 reply; 6+ messages in thread
From: Pavan Chebbi @ 2025-04-02 10:33 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Heiner Kallweit, Michael Chan, Potnuri Bharat Teja, netdev,
linux-pci, linux-kernel, Leon Romanovsky
[-- Attachment #1: Type: text/plain, Size: 1316 bytes --]
> > >
> > > Any idea how devices in the field populate their VPD?
> > >
I took a quick look at our manufacturing tool, and it does look like
the computation simply starts at address 0.
> > > Can you share any VPD dumps from devices that include an RV keyword
> > > item?
A couple of devices I could find: hope it helps..
000100: 822f0042 726f6164 636f6d20 4e657458 7472656d 65204769 67616269 74204574
000120: 6865726e 65742043 6f6e7472 6f6c6c65 7200904b 00504e08 42434d39 35373230
000140: 45430931 30363637 392d3135 534e0a30 31323334 35363738 394d4e04 31346534
000160: 52561d1d 00000000 00000000 00000000 000000
000100: 822f0042 726f6164 636f6d20 4e657458 7472656d 65204769 67616269 74204574
000120: 6865726e 65742043 6f6e7472 6f6c6c65 7200904b 00504e08 42434d39 35373139
000140: 45430931 30363637 392d3135 534e0a30 31323334 35363738 394d4e04 31346534
000160: 52561d15 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >
> > I have only very dated devices which likely date back to before
> > the existence of PCIe r6.0. So their VPD dump may not really help.
> >
> > IIRC there's an ongoing discussion regarding making VPD content
> > user-readable on mlx5 devices. Maybe check with the Mellanox/Nvidia
> > guys how they interpret the spec and implemented VPD checksumming.
>
> Good idea, cc'd.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: PCI VPD checksum ambiguity
2025-04-02 10:33 ` Pavan Chebbi
@ 2025-04-07 22:22 ` Bjorn Helgaas
0 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2025-04-07 22:22 UTC (permalink / raw)
To: Pavan Chebbi
Cc: Heiner Kallweit, Michael Chan, Potnuri Bharat Teja, netdev,
linux-pci, linux-kernel, Leon Romanovsky
On Wed, Apr 02, 2025 at 04:03:42PM +0530, Pavan Chebbi wrote:
> > > >
> > > > Any idea how devices in the field populate their VPD?
>
> I took a quick look at our manufacturing tool, and it does look like
> the computation simply starts at address 0.
>
> > > > Can you share any VPD dumps from devices that include an RV keyword
> > > > item?
>
> A couple of devices I could find: hope it helps..
> 000100: 822f0042 726f6164 636f6d20 4e657458 7472656d 65204769 67616269 74204574
> 000120: 6865726e 65742043 6f6e7472 6f6c6c65 7200904b 00504e08 42434d39 35373230
> 000140: 45430931 30363637 392d3135 534e0a30 31323334 35363738 394d4e04 31346534
> 000160: 52561d1d 00000000 00000000 00000000 000000
>
> 000100: 822f0042 726f6164 636f6d20 4e657458 7472656d 65204769 67616269 74204574
> 000120: 6865726e 65742043 6f6e7472 6f6c6c65 7200904b 00504e08 42434d39 35373139
> 000140: 45430931 30363637 392d3135 534e0a30 31323334 35363738 394d4e04 31346534
> 000160: 52561d15 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Thanks a lot for this!
I put each of these in a file ("vpd.txt") and computed the checksum
with this:
addr=0; sum=0; xxd -r -c 32 vpd.txt | xxd -p -g1 -c1 x.bin | while read X; do sum=$(($sum + "0x$X")); printf "addr 0x%04x: 0x%02x sum 0x%02x\n" $addr "0x$X" $(($sum % 256)); addr=$(($addr + 1)); done
In both cases the sum came out to 0x00 as it should.
So it looks like Broadcom interpreted the spec the same way Linux
pci_vpd_check_csum() does: the checksum includes all bytes from the
beginning of VPD up to and including the RV checksum byte, not just
the VPD-R list.
These dumps start at 0x100 (not 0), which seems a little weird. But
"xxd -r" assumes zeros for the 0-0xff range, so it doesn't affect the
checksum.
I manually decoded the first one, which looked like this. Nothing
surprising here:
00000100: 822f 0042 726f 6164 636f 6d20 4e65 7458 ./.Broadcom NetX
00000110: 7472 656d 6520 4769 6761 6269 7420 4574 treme Gigabit Et
00000120: 6865 726e 6574 2043 6f6e 7472 6f6c 6c65 hernet Controlle
00000130: 7200 r.
82 == large resource tag 0x02 (Identifier String), data item length 0x2f
"Broadcom NetXtreme Gigabit Ethernet Controller"
00000132: 904b 0050 4e08 4243 4d39 3537 3230 .K.PN.BCM95720
00000140: 4543 0931 3036 3637 392d 3135 534e 0a30 EC.106679-15SN.0
00000150: 3132 3334 3536 3738 394d 4e04 3134 6534 123456789MN.14e4
00000160: 5256 1d1d 0000 0000 0000 0000 0000 0000 RV..............
00000170: 0000 00 ...
90 == large resource tag 1_0000b (0x10, VPD-R), data item length 0x4b
50 4e PN keyword, length 0x08: 4243 4d39 3537 3230: "BCM95720"
45 43 EC keyword, length 0x09: 31 3036 3637 392d 3135: "106679-15"
53 4e SN keyword, length 0x0a: 30 3132 3334 3536 3738 39: "0123456789"
4d 4e MN keyword, length 0x04: 3134 6534: "14e4"
52 56 RV keyword, length 0x1d: 1d
(last byte of VPD-R is 0x162 + 0x1d == 0x17f)
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/vpd.c?id=v6.14#n520
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: PCI VPD checksum ambiguity
2025-04-01 20:55 PCI VPD checksum ambiguity Bjorn Helgaas
2025-04-01 21:51 ` Heiner Kallweit
@ 2025-04-10 16:21 ` Bjorn Helgaas
1 sibling, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2025-04-10 16:21 UTC (permalink / raw)
To: Pavan Chebbi, Michael Chan, Potnuri Bharat Teja, Heiner Kallweit,
Leon Romanovsky
Cc: netdev, linux-pci, linux-kernel
On Tue, Apr 01, 2025 at 03:55:44PM -0500, Bjorn Helgaas wrote:
> Hi,
>
> The PCIe spec is ambiguous about how the VPD checksum should be
> computed, and resolving this ambiguity might break drivers.
Any more input on this? It would be great to have more information
about how vendors compute the checksum on their devices.
There is a proposed PCIe spec change to resolve the ambiguity, and the
intent is to make a change that reflects what vendors have actually
implemented.
The only concrete data I've seen so far is from Pavan at Broadcom
(thank you very much for that), where the checksum starts at the
beginning of VPD, not at the beginning of VPD-R.
If you can collect VPD data from devices, you can use something like
this to compute the checksum from the beginning of VPD:
addr=0; sum=0; xxd -r -c 32 vpd.txt | xxd -p -g1 -c1 x.bin | while read X; do sum=$(($sum + "0x$X")); printf "addr 0x%04x: 0x%02x sum 0x%02x\n" $addr "0x$X" $(($sum % 256)); addr=$(($addr + 1)); done
(You still have to figure out manually where the RV item is so you
don't include the writable VPD-W part.)
> PCIe r6.0 sec 6.27 says only the VPD-R list should be included in the
> checksum:
>
> One VPD-R (10h) tag is used as a header for the read-only keywords.
> The VPD-R list (including tag and length) must checksum to zero.
>
> But sec 6.27.2.2 says "all bytes in VPD ... up to the checksum byte":
>
> RV The first byte of this item is a checksum byte. The checksum is
> correct if the sum of all bytes in VPD (from VPD address 0 up
> to and including this byte) is zero.
>
> These are obviously different unless VPD-R happens to be the first
> item in VPD. But sec 6.27 and 6.27.2.1 suggest that the Identifier
> String item should be the first item, preceding the VPD-R list:
>
> The first VPD tag is the Identifier String (02h) and provides the
> product name of the device. [6.27]
>
> Large resource type Identifier String (02h)
>
> This tag is the first item in the VPD storage component. It
> contains the name of the add-in card in alphanumeric characters.
> [6.27.2.1, Table 6-23]
>
> I think pci_vpd_check_csum() follows sec 6.27.2.2: it sums all the
> bytes in the buffer up to and including the checksum byte of the RV
> keyword. The range starts at 0, not at the beginning of the VPD-R
> read-only list, so it likely includes the Identifier String.
>
> As far as I can tell, only the broadcom/tg3 and chelsio/cxgb4/t4
> drivers use pci_vpd_check_csum(). Of course, other drivers might
> compute the checksum themselves.
>
> Any thoughts on how this spec ambiguity should be resolved?
>
> Any idea how devices in the field populate their VPD?
>
> Can you share any VPD dumps from devices that include an RV keyword
> item?
>
> Bjorn
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-04-10 16:21 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-01 20:55 PCI VPD checksum ambiguity Bjorn Helgaas
2025-04-01 21:51 ` Heiner Kallweit
2025-04-01 22:35 ` Bjorn Helgaas
2025-04-02 10:33 ` Pavan Chebbi
2025-04-07 22:22 ` Bjorn Helgaas
2025-04-10 16:21 ` Bjorn Helgaas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox