* 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length
@ 2016-04-08 22:24 Casey Leedom
[not found] ` <BY2PR12MB0648F0C35D907AA640D7E56CC8940@BY2PR12MB0648.namprd12.prod.outlook.com>
0 siblings, 1 reply; 12+ messages in thread
From: Casey Leedom @ 2016-04-08 22:24 UTC (permalink / raw)
To: linux-pci@vger.kernel.org; +Cc: Bjorn Helgaas, Hannes Reinecke, Hariprasad S
[[ I apologize if this is a duplicate message. I've been struggling all
afternoon to get a message successfully delivered to the linux-pci
list.
-- Casey ]]
kernel.org commit 104daa71b396 added a check to make sure that efforts to
read/write the VPD wouldn't extend past the computed length of the VPD.
Later, kernel.org commit 408641e93aa5 folded the pci_vpd_pci22 into
struct pci_vpd so things moved around a bit after that and an error return
got changed into a silent failure instead of -EINVAL.
The problem is that the previous pci_vpd_pci22_read() didn't check for a read with
a VPD Offset > VPD Length and the new pci_vpd_read() is checking that. Worse
yet, when a VPD Offset is greater than the recorded VPD Length, it simply
returns 0 rather than -EINVAL.
The problem is stemming from the fact that the Chelsio adapters actually have
two VPD structures stored in the VPD. An abbreviated on at Offset 0x0 and the
complete VPD at Offset 0x400. The abbreviated one only contains the PN, SN and
EC Keywords, while the complete VPD contains those plus various adapter
constants contained in V0, V1, etc. And it also contains the Base Ethernet MAC
Address in the "NA" Keyword which the cxgb4 driver needs when it can't contact
the adapter firmware. (We don't have the "NA" Keywork in the VPD Structure at
Offset 0x0 because that's not an allowed VPD Keyword in the PCI-E 3.0
specification.)
With the new code, the computed size of the VPD is 0x200 and so our efforts
to read the VPD at Offset 0x400 silently fails. We check the result of the
read looking for a signature 0x82 byte but we're checking against random stack
garbage.
The end result is that the cxgb4 driver now fails the PCI-E Probe.
Again, apologies if I'm not communicating this to the Linux Development
Community properly -- I'm usually not allowed out of my cage.
Casey
^ permalink raw reply [flat|nested] 12+ messages in thread[parent not found: <BY2PR12MB0648F0C35D907AA640D7E56CC8940@BY2PR12MB0648.namprd12.prod.outlook.com>]
* RE: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length [not found] ` <BY2PR12MB0648F0C35D907AA640D7E56CC8940@BY2PR12MB0648.namprd12.prod.outlook.com> @ 2016-04-11 18:16 ` Steve Wise 2016-04-12 5:37 ` Bjorn Helgaas 0 siblings, 1 reply; 12+ messages in thread From: Steve Wise @ 2016-04-11 18:16 UTC (permalink / raw) To: 'Casey Leedom'; +Cc: linux-pci, hariprasad, hare, bhelgaas > kernel.org commit 104daa71b396 added a check to make sure that efforts to > read/write the VPD wouldn't extend past the computed length of the VPD. > Later, kernel.org commit 408641e93aa5 folded the pci_vpd_pci22 into > struct pci_vpd so things moved around a bit after that and an error return > got changed into a silent failure instead of -EINVAL. > > The problem is that the previous pci_vpd_pci22_read() didn't check for a read with > a VPD Offset > VPD Length and the new pci_vpd_read() is checking that. Worse > yet, when a VPD Offset is greater than the recorded VPD Length, it simply > returns 0 rather than -EINVAL. > > The problem is stemming from the fact that the Chelsio adapters actually have > two VPD structures stored in the VPD. An abbreviated on at Offset 0x0 and the > complete VPD at Offset 0x400. The abbreviated one only contains the PN, SN and > EC Keywords, while the complete VPD contains those plus various adapter > constants contained in V0, V1, etc. And it also contains the Base Ethernet MAC > Address in the "NA" Keyword which the cxgb4 driver needs when it can't contact > the adapter firmware. (We don't have the "NA" Keywork in the VPD Structure at > Offset 0x0 because that's not an allowed VPD Keyword in the PCI-E 3.0 > specification.) > > With the new code, the computed size of the VPD is 0x200 and so our efforts > to read the VPD at Offset 0x400 silently fails. We check the result of the > read looking for a signature 0x82 byte but we're checking against random stack > garbage. > > The end result is that the cxgb4 driver now fails the PCI-E Probe. > Silently failing is wrong, in my opinion. And I even question truncating which is also done in pci_vpd_read(). To the PCI maintainers: Should the length checks just be removed? If not, what is the correct solution? Adding a different "expert" API that ignores the length checks, or somehow allowing the device driver to set the actual VPD size? Thanks, Steve. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length 2016-04-11 18:16 ` Steve Wise @ 2016-04-12 5:37 ` Bjorn Helgaas 2016-04-12 6:20 ` Hannes Reinecke 0 siblings, 1 reply; 12+ messages in thread From: Bjorn Helgaas @ 2016-04-12 5:37 UTC (permalink / raw) To: Steve Wise; +Cc: 'Casey Leedom', linux-pci, hariprasad, hare, bhelgaas On Mon, Apr 11, 2016 at 01:16:17PM -0500, Steve Wise wrote: > > kernel.org commit 104daa71b396 added a check to make sure that efforts to > > read/write the VPD wouldn't extend past the computed length of the VPD. > > Later, kernel.org commit 408641e93aa5 folded the pci_vpd_pci22 into > > struct pci_vpd so things moved around a bit after that and an error return > > got changed into a silent failure instead of -EINVAL. > > > > The problem is that the previous pci_vpd_pci22_read() didn't check for a > read with > > a VPD Offset > VPD Length and the new pci_vpd_read() is checking that. Worse > > yet, when a VPD Offset is greater than the recorded VPD Length, it simply > > returns 0 rather than -EINVAL. > > > > The problem is stemming from the fact that the Chelsio adapters actually > have > > two VPD structures stored in the VPD. An abbreviated on at Offset 0x0 and the > > complete VPD at Offset 0x400. The abbreviated one only contains the PN, SN > and > > EC Keywords, while the complete VPD contains those plus various adapter > > constants contained in V0, V1, etc. And it also contains the Base Ethernet > MAC > > Address in the "NA" Keyword which the cxgb4 driver needs when it can't contact > > the adapter firmware. (We don't have the "NA" Keywork in the VPD Structure at > > Offset 0x0 because that's not an allowed VPD Keyword in the PCI-E 3.0 > > specification.) > > > > With the new code, the computed size of the VPD is 0x200 and so our efforts > > to read the VPD at Offset 0x400 silently fails. We check the result of the > > read looking for a signature 0x82 byte but we're checking against random stack > > garbage. > > > > The end result is that the cxgb4 driver now fails the PCI-E Probe. > > > > Silently failing is wrong, in my opinion. And I even question truncating which > is also done in pci_vpd_read(). To the PCI maintainers: Should the length > checks just be removed? If not, what is the correct solution? Adding a > different "expert" API that ignores the length checks, or somehow allowing the > device driver to set the actual VPD size? I think everybody would prefer if it the kernel could just read whatever VPD region the user requested, without parsing the data or checking for length (as long as we're within the 32K space allowed by the spec). The problem is that some cards crash if you read too much: commit 104daa71b396 Author: Hannes Reinecke <hare@suse.de> Date: Mon Feb 15 09:42:01 2016 +0100 PCI: Determine actual VPD size on first access PCI-2.2 VPD entries have a maximum size of 32k, but might actually be smaller than that. To figure out the actual size one has to read the VPD area until the 'end marker' is reached. Per spec, reading outside of the VPD space is "not allowed." In practice, it may cause simple read errors or even crash the card. To make matters worse not every PCI card implements this properly, leaving us with no 'end' marker or even completely invalid data. Try to determine the size of the VPD data when it's first accessed. If no valid data can be read an I/O error will be returned when reading or writing the sysfs attribute. So if you want to get rid of the length checks, you have to propose some other mechanism to avoid these issues. The only ideas I have are to (1) parse the data as we do in 104daa71b396, (2) add quirks to prevent VPD access (as in 7c20078a8197 ("PCI: Prevent VPD access for buggy devices"), and/or (3) add quirks to allow access to more VPD than parsing says we can access. These aren't mutually exclusive -- we already have (1) and (2), and I think we could easily add (3) into the mix. (3) seems like a possible solution for Chelsio. In that case, it's the driver that needs the data, so the driver could maintain a quirk. Bjorn ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length 2016-04-12 5:37 ` Bjorn Helgaas @ 2016-04-12 6:20 ` Hannes Reinecke 2016-04-12 8:23 ` Hariprasad Shenai 0 siblings, 1 reply; 12+ messages in thread From: Hannes Reinecke @ 2016-04-12 6:20 UTC (permalink / raw) To: Bjorn Helgaas, Steve Wise Cc: 'Casey Leedom', linux-pci, hariprasad, bhelgaas On 04/12/2016 07:37 AM, Bjorn Helgaas wrote: > On Mon, Apr 11, 2016 at 01:16:17PM -0500, Steve Wise wrote: >>> kernel.org commit 104daa71b396 added a check to make sure that efforts to >>> read/write the VPD wouldn't extend past the computed length of the VPD. >>> Later, kernel.org commit 408641e93aa5 folded the pci_vpd_pci22 into >>> struct pci_vpd so things moved around a bit after that and an error return >>> got changed into a silent failure instead of -EINVAL. >>> >>> The problem is that the previous pci_vpd_pci22_read() didn't check for a >> read with >>> a VPD Offset > VPD Length and the new pci_vpd_read() is checking that. Worse >>> yet, when a VPD Offset is greater than the recorded VPD Length, it simply >>> returns 0 rather than -EINVAL. >>> >>> The problem is stemming from the fact that the Chelsio adapters actually >> have >>> two VPD structures stored in the VPD. An abbreviated on at Offset 0x0 and the >>> complete VPD at Offset 0x400. The abbreviated one only contains the PN, SN >> and >>> EC Keywords, while the complete VPD contains those plus various adapter >>> constants contained in V0, V1, etc. And it also contains the Base Ethernet >> MAC >>> Address in the "NA" Keyword which the cxgb4 driver needs when it can't contact >>> the adapter firmware. (We don't have the "NA" Keywork in the VPD Structure at >>> Offset 0x0 because that's not an allowed VPD Keyword in the PCI-E 3.0 >>> specification.) >>> >>> With the new code, the computed size of the VPD is 0x200 and so our efforts >>> to read the VPD at Offset 0x400 silently fails. We check the result of the >>> read looking for a signature 0x82 byte but we're checking against random stack >>> garbage. >>> >>> The end result is that the cxgb4 driver now fails the PCI-E Probe. >>> >> >> Silently failing is wrong, in my opinion. And I even question truncating which >> is also done in pci_vpd_read(). To the PCI maintainers: Should the length >> checks just be removed? If not, what is the correct solution? Adding a >> different "expert" API that ignores the length checks, or somehow allowing the >> device driver to set the actual VPD size? > > I think everybody would prefer if it the kernel could just read > whatever VPD region the user requested, without parsing the data or > checking for length (as long as we're within the 32K space allowed by > the spec). > > The problem is that some cards crash if you read too much: > > commit 104daa71b396 > Author: Hannes Reinecke <hare@suse.de> > Date: Mon Feb 15 09:42:01 2016 +0100 > > PCI: Determine actual VPD size on first access > > PCI-2.2 VPD entries have a maximum size of 32k, but might actually be > smaller than that. To figure out the actual size one has to read the VPD > area until the 'end marker' is reached. > > Per spec, reading outside of the VPD space is "not allowed." In practice, > it may cause simple read errors or even crash the card. To make matters > worse not every PCI card implements this properly, leaving us with no 'end' > marker or even completely invalid data. > > Try to determine the size of the VPD data when it's first accessed. If no > valid data can be read an I/O error will be returned when reading or > writing the sysfs attribute. > > So if you want to get rid of the length checks, you have to propose > some other mechanism to avoid these issues. > > The only ideas I have are to (1) parse the data as we do in > 104daa71b396, (2) add quirks to prevent VPD access (as in > 7c20078a8197 ("PCI: Prevent VPD access for buggy devices"), and/or (3) > add quirks to allow access to more VPD than parsing says we can > access. These aren't mutually exclusive -- we already have (1) and > (2), and I think we could easily add (3) into the mix. > > (3) seems like a possible solution for Chelsio. In that case, it's > the driver that needs the data, so the driver could maintain a quirk. > That's my suggestion, too. The generic code should be handling things according to the standard. If other drivers require a different handling we should be adding a quirk for them. Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.com +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length 2016-04-12 6:20 ` Hannes Reinecke @ 2016-04-12 8:23 ` Hariprasad Shenai 2016-04-12 8:46 ` Hannes Reinecke 0 siblings, 1 reply; 12+ messages in thread From: Hariprasad Shenai @ 2016-04-12 8:23 UTC (permalink / raw) To: Hannes Reinecke Cc: Bjorn Helgaas, Steve Wise, 'Casey Leedom', linux-pci, bhelgaas On Tue, Apr 12, 2016 at 08:20:45 +0200, Hannes Reinecke wrote: > On 04/12/2016 07:37 AM, Bjorn Helgaas wrote: > > On Mon, Apr 11, 2016 at 01:16:17PM -0500, Steve Wise wrote: > >>> kernel.org commit 104daa71b396 added a check to make sure that efforts to > >>> read/write the VPD wouldn't extend past the computed length of the VPD. > >>> Later, kernel.org commit 408641e93aa5 folded the pci_vpd_pci22 into > >>> struct pci_vpd so things moved around a bit after that and an error return > >>> got changed into a silent failure instead of -EINVAL. > >>> > >>> The problem is that the previous pci_vpd_pci22_read() didn't check for a > >> read with > >>> a VPD Offset > VPD Length and the new pci_vpd_read() is checking that. Worse > >>> yet, when a VPD Offset is greater than the recorded VPD Length, it simply > >>> returns 0 rather than -EINVAL. > >>> > >>> The problem is stemming from the fact that the Chelsio adapters actually > >> have > >>> two VPD structures stored in the VPD. An abbreviated on at Offset 0x0 and the > >>> complete VPD at Offset 0x400. The abbreviated one only contains the PN, SN > >> and > >>> EC Keywords, while the complete VPD contains those plus various adapter > >>> constants contained in V0, V1, etc. And it also contains the Base Ethernet > >> MAC > >>> Address in the "NA" Keyword which the cxgb4 driver needs when it can't contact > >>> the adapter firmware. (We don't have the "NA" Keywork in the VPD Structure at > >>> Offset 0x0 because that's not an allowed VPD Keyword in the PCI-E 3.0 > >>> specification.) > >>> > >>> With the new code, the computed size of the VPD is 0x200 and so our efforts > >>> to read the VPD at Offset 0x400 silently fails. We check the result of the > >>> read looking for a signature 0x82 byte but we're checking against random stack > >>> garbage. > >>> > >>> The end result is that the cxgb4 driver now fails the PCI-E Probe. > >>> > >> > >> Silently failing is wrong, in my opinion. And I even question truncating which > >> is also done in pci_vpd_read(). To the PCI maintainers: Should the length > >> checks just be removed? If not, what is the correct solution? Adding a > >> different "expert" API that ignores the length checks, or somehow allowing the > >> device driver to set the actual VPD size? > > > > I think everybody would prefer if it the kernel could just read > > whatever VPD region the user requested, without parsing the data or > > checking for length (as long as we're within the 32K space allowed by > > the spec). > > > > The problem is that some cards crash if you read too much: > > > > commit 104daa71b396 > > Author: Hannes Reinecke <hare@suse.de> > > Date: Mon Feb 15 09:42:01 2016 +0100 > > > > PCI: Determine actual VPD size on first access > > > > PCI-2.2 VPD entries have a maximum size of 32k, but might actually be > > smaller than that. To figure out the actual size one has to read the VPD > > area until the 'end marker' is reached. > > > > Per spec, reading outside of the VPD space is "not allowed." In practice, > > it may cause simple read errors or even crash the card. To make matters > > worse not every PCI card implements this properly, leaving us with no 'end' > > marker or even completely invalid data. > > > > Try to determine the size of the VPD data when it's first accessed. If no > > valid data can be read an I/O error will be returned when reading or > > writing the sysfs attribute. > > > > So if you want to get rid of the length checks, you have to propose > > some other mechanism to avoid these issues. > > > > The only ideas I have are to (1) parse the data as we do in > > 104daa71b396, (2) add quirks to prevent VPD access (as in > > 7c20078a8197 ("PCI: Prevent VPD access for buggy devices"), and/or (3) > > add quirks to allow access to more VPD than parsing says we can > > access. These aren't mutually exclusive -- we already have (1) and > > (2), and I think we could easily add (3) into the mix. > > > > (3) seems like a possible solution for Chelsio. In that case, it's > > the driver that needs the data, so the driver could maintain a quirk. > > > That's my suggestion, too. > The generic code should be handling things according to the standard. > If other drivers require a different handling we should be adding a > quirk for them. > > Cheers, > > Hannes > -- > Dr. Hannes Reinecke zSeries & Storage > hare@suse.com +49 911 74053 688 > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton > HRB 21284 (AG Nürnberg) Hi, How about adding a PCI helper function to set the actual VPD_SIZE. Some thing like below. We have tested this and works. The bnx2x and the tg3 drive may also need this, because I see them calling pci_read_vpd() with non-zero offsets. The bnx2x in particular looks like it's doing something similar to cxgb4 so it would also probably benefit from this change (once it's fixed to call the new pci_set_size_vpd() API). Thanks, Hari ==== --- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c +++ a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c @@ -2557,6 +2557,7 @@ void t4_get_regs(struct adapter *adap, void *buf, size_t buf_size) } #define EEPROM_STAT_ADDR 0x7bfc +#define VPD_SIZE 0x800 #define VPD_BASE 0x400 #define VPD_BASE_OLD 0 #define VPD_LEN 1024 @@ -2594,6 +2595,15 @@ int t4_get_raw_vpd_params(struct adapter *adapter, struct vpd_params *p) if (!vpd) return -ENOMEM; + /* We have two VPD data structures stored in the adapter VPD area. + * By default, Linux calculates the size of the VPD area by traversing + * the first VPD area at offset 0x0, so we need to tell the OS what + * our real VPD size is. + */ + ret = pci_set_size_vpd(adapter->pdev, VPD_SIZE); + if (ret < 0) + goto out; + /* Card information normally starts at VPD_BASE but early cards had * it at 0. */ --- a/drivers/pci/access.c +++ a/drivers/pci/access.c @@ -275,6 +275,19 @@ ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void } EXPORT_SYMBOL(pci_write_vpd); +/** + * pci_set_size_vpd - Set size of Vital Product Data space + * @dev: pci device struct + * @len: size of vpd space + */ +ssize_t pci_set_size_vpd(struct pci_dev *dev, size_t len) +{ + if (!dev->vpd || !dev->vpd->ops) + return -ENODEV; + return dev->vpd->ops->set_size(dev, len); +} +EXPORT_SYMBOL(pci_set_size_vpd); + #define PCI_VPD_MAX_SIZE (PCI_VPD_ADDR_MASK + 1) /** @@ -392,13 +405,8 @@ static ssize_t pci_vpd_read(struct pci_dev *dev, loff_t pos, size_t count, if (vpd->len == 0) return -EIO; - if (pos > vpd->len) - return 0; - - if (end > vpd->len) { - end = vpd->len; - count = end - pos; - } + if (end > vpd->len) + return -EINVAL; if (mutex_lock_killable(&vpd->lock)) return -EINTR; @@ -498,9 +506,23 @@ out: return ret ? ret : count; } +static ssize_t pci_vpd_set_size(struct pci_dev *dev, size_t len) +{ + struct pci_vpd *vpd = dev->vpd; + + if (len == 0 || len > PCI_VPD_MAX_SIZE) + return -EIO; + + vpd->valid = 1; + vpd->len = len; + + return 0; +} + static const struct pci_vpd_ops pci_vpd_ops = { .read = pci_vpd_read, .write = pci_vpd_write, + .set_size = pci_vpd_set_size, }; static ssize_t pci_vpd_f0_read(struct pci_dev *dev, loff_t pos, size_t count, @@ -533,9 +555,24 @@ static ssize_t pci_vpd_f0_write(struct pci_dev *dev, loff_t pos, size_t count, return ret; } +static ssize_t pci_vpd_f0_set_size(struct pci_dev *dev, size_t len) +{ + struct pci_dev *tdev = pci_get_slot(dev->bus, + PCI_DEVFN(PCI_SLOT(dev->devfn), 0)); + ssize_t ret; + + if (!tdev) + return -ENODEV; + + ret = pci_set_size_vpd(tdev, len); + pci_dev_put(tdev); + return ret; +} + static const struct pci_vpd_ops pci_vpd_f0_ops = { .read = pci_vpd_f0_read, .write = pci_vpd_f0_write, + .set_size = pci_vpd_f0_set_size, }; int pci_vpd_init(struct pci_dev *dev) --- a/drivers/pci/pci.h +++ a/drivers/pci/pci.h @@ -97,6 +97,7 @@ static inline bool pci_has_subordinate(struct pci_dev *pci_dev) struct pci_vpd_ops { ssize_t (*read)(struct pci_dev *dev, loff_t pos, size_t count, void *buf); ssize_t (*write)(struct pci_dev *dev, loff_t pos, size_t count, const void *buf); + ssize_t (*set_size)(struct pci_dev *dev, size_t len); }; struct pci_vpd { --- a/include/linux/pci.h +++ a/include/linux/pci.h @@ -1111,6 +1111,7 @@ void pci_unlock_rescan_remove(void); /* Vital product data routines */ ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf); ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void *buf); +ssize_t pci_set_size_vpd(struct pci_dev *dev, size_t len); /* Helper functions for low-level code (drivers/pci/setup-[bus,res].c) */ resource_size_t pcibios_retrieve_fw_addr(struct pci_dev *dev, int idx); ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length 2016-04-12 8:23 ` Hariprasad Shenai @ 2016-04-12 8:46 ` Hannes Reinecke 2016-04-12 17:35 ` Casey Leedom 0 siblings, 1 reply; 12+ messages in thread From: Hannes Reinecke @ 2016-04-12 8:46 UTC (permalink / raw) To: Hariprasad Shenai Cc: Bjorn Helgaas, Steve Wise, 'Casey Leedom', linux-pci, bhelgaas On 04/12/2016 10:23 AM, Hariprasad Shenai wrote: [ .. ] > > Hi, > > How about adding a PCI helper function to set the actual VPD_SIZE. > > Some thing like below. We have tested this and works. The bnx2x and the tg3 > drive may also need this, because I see them calling pci_read_vpd() > with non-zero offsets. The bnx2x in particular looks like it's doing something > similar to cxgb4 so it would also probably benefit from this change (once it's > fixed to call the new pci_set_size_vpd() API). > That indeed looks reasonable. Please find some comments inline. > Thanks, > Hari > > ==== > > --- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c > +++ a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c > @@ -2557,6 +2557,7 @@ void t4_get_regs(struct adapter *adap, void *buf, size_t buf_size) > } > > #define EEPROM_STAT_ADDR 0x7bfc > +#define VPD_SIZE 0x800 > #define VPD_BASE 0x400 > #define VPD_BASE_OLD 0 > #define VPD_LEN 1024 > @@ -2594,6 +2595,15 @@ int t4_get_raw_vpd_params(struct adapter *adapter, struct vpd_params *p) > if (!vpd) > return -ENOMEM; > > + /* We have two VPD data structures stored in the adapter VPD area. > + * By default, Linux calculates the size of the VPD area by traversing > + * the first VPD area at offset 0x0, so we need to tell the OS what > + * our real VPD size is. > + */ > + ret = pci_set_size_vpd(adapter->pdev, VPD_SIZE); > + if (ret < 0) > + goto out; > + > /* Card information normally starts at VPD_BASE but early cards had > * it at 0. > */ > --- a/drivers/pci/access.c > +++ a/drivers/pci/access.c > @@ -275,6 +275,19 @@ ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void > } > EXPORT_SYMBOL(pci_write_vpd); > > +/** > + * pci_set_size_vpd - Set size of Vital Product Data space > + * @dev: pci device struct > + * @len: size of vpd space > + */ > +ssize_t pci_set_size_vpd(struct pci_dev *dev, size_t len) > +{ > + if (!dev->vpd || !dev->vpd->ops) > + return -ENODEV; > + return dev->vpd->ops->set_size(dev, len); > +} > +EXPORT_SYMBOL(pci_set_size_vpd); > + > #define PCI_VPD_MAX_SIZE (PCI_VPD_ADDR_MASK + 1) > > /** > @@ -392,13 +405,8 @@ static ssize_t pci_vpd_read(struct pci_dev *dev, loff_t pos, size_t count, > if (vpd->len == 0) > return -EIO; > > - if (pos > vpd->len) > - return 0; > - > - if (end > vpd->len) { > - end = vpd->len; > - count = end - pos; > - } > + if (end > vpd->len) > + return -EINVAL; > > if (mutex_lock_killable(&vpd->lock)) > return -EINTR; Why do you need this change? We still would be needing to validate 'pos', don't we? I'd prefer not to have this bit in. > @@ -498,9 +506,23 @@ out: > return ret ? ret : count; > } > > +static ssize_t pci_vpd_set_size(struct pci_dev *dev, size_t len) > +{ > + struct pci_vpd *vpd = dev->vpd; > + > + if (len == 0 || len > PCI_VPD_MAX_SIZE) > + return -EIO; > + > + vpd->valid = 1; > + vpd->len = len; > + > + return 0; > +} > + > static const struct pci_vpd_ops pci_vpd_ops = { > .read = pci_vpd_read, > .write = pci_vpd_write, > + .set_size = pci_vpd_set_size, > }; > > static ssize_t pci_vpd_f0_read(struct pci_dev *dev, loff_t pos, size_t count, > @@ -533,9 +555,24 @@ static ssize_t pci_vpd_f0_write(struct pci_dev *dev, loff_t pos, size_t count, > return ret; > } > > +static ssize_t pci_vpd_f0_set_size(struct pci_dev *dev, size_t len) > +{ > + struct pci_dev *tdev = pci_get_slot(dev->bus, > + PCI_DEVFN(PCI_SLOT(dev->devfn), 0)); > + ssize_t ret; > + > + if (!tdev) > + return -ENODEV; > + > + ret = pci_set_size_vpd(tdev, len); > + pci_dev_put(tdev); > + return ret; > +} > + > static const struct pci_vpd_ops pci_vpd_f0_ops = { > .read = pci_vpd_f0_read, > .write = pci_vpd_f0_write, > + .set_size = pci_vpd_f0_set_size, > }; > > int pci_vpd_init(struct pci_dev *dev) > --- a/drivers/pci/pci.h > +++ a/drivers/pci/pci.h > @@ -97,6 +97,7 @@ static inline bool pci_has_subordinate(struct pci_dev *pci_dev) > struct pci_vpd_ops { > ssize_t (*read)(struct pci_dev *dev, loff_t pos, size_t count, void *buf); > ssize_t (*write)(struct pci_dev *dev, loff_t pos, size_t count, const void *buf); > + ssize_t (*set_size)(struct pci_dev *dev, size_t len); > }; > > struct pci_vpd { > --- a/include/linux/pci.h > +++ a/include/linux/pci.h > @@ -1111,6 +1111,7 @@ void pci_unlock_rescan_remove(void); > /* Vital product data routines */ > ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf); > ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void *buf); > +ssize_t pci_set_size_vpd(struct pci_dev *dev, size_t len); > > /* Helper functions for low-level code (drivers/pci/setup-[bus,res].c) */ > resource_size_t pcibios_retrieve_fw_addr(struct pci_dev *dev, int idx); > > Remaining bits look okay. Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.com +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length 2016-04-12 8:46 ` Hannes Reinecke @ 2016-04-12 17:35 ` Casey Leedom 2016-04-12 20:17 ` Casey Leedom ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Casey Leedom @ 2016-04-12 17:35 UTC (permalink / raw) To: Hannes Reinecke, Hariprasad S Cc: Bjorn Helgaas, SWise OGC, linux-pci@vger.kernel.org, bhelgaas@google.com | From: Hannes Reinecke <hare@suse.com> | Sent: Tuesday, April 12, 2016 1:46 AM | To: Hariprasad S | Cc: Bjorn Helgaas; SWise OGC; Casey Leedom; linux-pci@vger.kernel.org; bhelgaas@google.com | Subject: Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length | | On 04/12/2016 10:23 AM, Hariprasad Shenai wrote: | [ .. ] | > | > Hi, | > | > How about adding a PCI helper function to set the actual VPD_SIZE. | > | > Some thing like below. We have tested this and works. The bnx2x and the tg3 | > drive may also need this, because I see them calling pci_read_vpd() | > with non-zero offsets. The bnx2x in particular looks like it's doing something | > similar to cxgb4 so it would also probably benefit from this change (once it's | > fixed to call the new pci_set_size_vpd() API). | | That indeed looks reasonable. | Please find some comments inline. | | ... | | > --- a/drivers/pci/access.c | > +++ a/drivers/pci/access.c | > @@ -275,6 +275,19 @@ ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void | > } | > EXPORT_SYMBOL(pci_write_vpd); | > | > +/** | > + * pci_set_size_vpd - Set size of Vital Product Data space | > + * @dev: pci device struct | > + * @len: size of vpd space | > + */ | > +ssize_t pci_set_size_vpd(struct pci_dev *dev, size_t len) | > +{ | > + if (!dev->vpd || !dev->vpd->ops) | > + return -ENODEV; | > + return dev->vpd->ops->set_size(dev, len); | > +} | > +EXPORT_SYMBOL(pci_set_size_vpd); | > + | > #define PCI_VPD_MAX_SIZE (PCI_VPD_ADDR_MASK + 1) | > | > /** | > @@ -392,13 +405,8 @@ static ssize_t pci_vpd_read(struct pci_dev *dev, loff_t pos, size_t count, | > if (vpd->len == 0) | > return -EIO; | > | > - if (pos > vpd->len) | > - return 0; | > - | > - if (end > vpd->len) { | > - end = vpd->len; | > - count = end - pos; | > - } | > + if (end > vpd->len) | > + return -EINVAL; | > | > if (mutex_lock_killable(&vpd->lock)) | > return -EINTR; | | Why do you need this change? | We still would be needing to validate 'pos', don't we? | I'd prefer not to have this bit in. Two reasons: 1. It makes pci_vpd_read() with pci_vpd_write() which has exactly this logic. 2. More importantly, the new implementation of pci_read_vpd() silently fails to perform a VPD read and allows the caller to use random stack garbage in the read buffer without knowing that it's not really VPD contents. If any portion of the VPD read isn't going to be performed, we should signal that back to the caller. We could either return an error or we could return the number of bytes actually read. The problem with the latter is that it would require changing every single caller to check for Requested Read Length == Actual Read Length. Returning an error is the more conservative fix and allows for rapid diagnosis of problems. And that last point is important because I spent quite a bit of time digging around trying to figure out why cxgb4 suddenly wasn't working. Casey ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length 2016-04-12 17:35 ` Casey Leedom @ 2016-04-12 20:17 ` Casey Leedom 2016-04-12 21:52 ` Steve Wise 2016-04-13 6:00 ` Hannes Reinecke 2 siblings, 0 replies; 12+ messages in thread From: Casey Leedom @ 2016-04-12 20:17 UTC (permalink / raw) To: Hannes Reinecke, Hariprasad S Cc: Bjorn Helgaas, SWise OGC, linux-pci@vger.kernel.org, bhelgaas@google.com, Ariel Elior By the way, I should note that I don't think that cxgb4 is the only regression associated with kernel.org commit 104daa71b396 which added a check for accesses beyond the computed end of the VPD data structures starting at offset 0x0 in the VPD Space. Looking at other callers to pci_read_vpd() I see at least one other driver which pass in a non-zero offset which seems to be the same kind of thing: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:bnx2x_read_fwinfo(): cnt = pci_read_vpd(bp->pdev, BNX2X_VPD_LEN, block_end - BNX2X_VPD_LEN, vpd_extended_data + BNX2X_VPD_LEN); Looking more in depth, if you really want to keep the new pci_read_vpd() logic where it can return partial reads, then we need to go through all of the callers and look for cases like cxgb4 where the only check is for return values less than zero and enhance them to deal with return values less than the requested VPD read request. Casey ________________________________________ From: linux-pci-owner@vger.kernel.org <linux-pci-owner@vger.kernel.org> on behalf of Casey Leedom <leedom@chelsio.com> Sent: Tuesday, April 12, 2016 10:35 AM To: Hannes Reinecke; Hariprasad S Cc: Bjorn Helgaas; SWise OGC; linux-pci@vger.kernel.org; bhelgaas@google.com Subject: Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length | From: Hannes Reinecke <hare@suse.com> | Sent: Tuesday, April 12, 2016 1:46 AM | To: Hariprasad S | Cc: Bjorn Helgaas; SWise OGC; Casey Leedom; linux-pci@vger.kernel.org; bhelgaas@google.com | Subject: Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length | | On 04/12/2016 10:23 AM, Hariprasad Shenai wrote: | [ .. ] | > | > Hi, | > | > How about adding a PCI helper function to set the actual VPD_SIZE. | > | > Some thing like below. We have tested this and works. The bnx2x and the tg3 | > drive may also need this, because I see them calling pci_read_vpd() | > with non-zero offsets. The bnx2x in particular looks like it's doing something | > similar to cxgb4 so it would also probably benefit from this change (once it's | > fixed to call the new pci_set_size_vpd() API). | | That indeed looks reasonable. | Please find some comments inline. | | ... | | > --- a/drivers/pci/access.c | > +++ a/drivers/pci/access.c | > @@ -275,6 +275,19 @@ ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void | > } | > EXPORT_SYMBOL(pci_write_vpd); | > | > +/** | > + * pci_set_size_vpd - Set size of Vital Product Data space | > + * @dev: pci device struct | > + * @len: size of vpd space | > + */ | > +ssize_t pci_set_size_vpd(struct pci_dev *dev, size_t len) | > +{ | > + if (!dev->vpd || !dev->vpd->ops) | > + return -ENODEV; | > + return dev->vpd->ops->set_size(dev, len); | > +} | > +EXPORT_SYMBOL(pci_set_size_vpd); | > + | > #define PCI_VPD_MAX_SIZE (PCI_VPD_ADDR_MASK + 1) | > | > /** | > @@ -392,13 +405,8 @@ static ssize_t pci_vpd_read(struct pci_dev *dev, loff_t pos, size_t count, | > if (vpd->len == 0) | > return -EIO; | > | > - if (pos > vpd->len) | > - return 0; | > - | > - if (end > vpd->len) { | > - end = vpd->len; | > - count = end - pos; | > - } | > + if (end > vpd->len) | > + return -EINVAL; | > | > if (mutex_lock_killable(&vpd->lock)) | > return -EINTR; | | Why do you need this change? | We still would be needing to validate 'pos', don't we? | I'd prefer not to have this bit in. Two reasons: 1. It makes pci_vpd_read() with pci_vpd_write() which has exactly this logic. 2. More importantly, the new implementation of pci_read_vpd() silently fails to perform a VPD read and allows the caller to use random stack garbage in the read buffer without knowing that it's not really VPD contents. If any portion of the VPD read isn't going to be performed, we should signal that back to the caller. We could either return an error or we could return the number of bytes actually read. The problem with the latter is that it would require changing every single caller to check for Requested Read Length == Actual Read Length. Returning an error is the more conservative fix and allows for rapid diagnosis of problems. And that last point is important because I spent quite a bit of time digging around trying to figure out why cxgb4 suddenly wasn't working. Casey -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length 2016-04-12 17:35 ` Casey Leedom 2016-04-12 20:17 ` Casey Leedom @ 2016-04-12 21:52 ` Steve Wise 2016-04-13 6:00 ` Hannes Reinecke 2 siblings, 0 replies; 12+ messages in thread From: Steve Wise @ 2016-04-12 21:52 UTC (permalink / raw) To: Casey Leedom, Hannes Reinecke, Hariprasad S Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, bhelgaas@google.com On 4/12/2016 12:35 PM, Casey Leedom wrote: > | From: Hannes Reinecke <hare@suse.com> > | Sent: Tuesday, April 12, 2016 1:46 AM > | To: Hariprasad S > | Cc: Bjorn Helgaas; SWise OGC; Casey Leedom; linux-pci@vger.kernel.org; bhelgaas@google.com > | Subject: Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length > | > | On 04/12/2016 10:23 AM, Hariprasad Shenai wrote: > | [ .. ] > | > > | > Hi, > | > > | > How about adding a PCI helper function to set the actual VPD_SIZE. > | > > | > Some thing like below. We have tested this and works. The bnx2x and the tg3 > | > drive may also need this, because I see them calling pci_read_vpd() > | > with non-zero offsets. The bnx2x in particular looks like it's doing something > | > similar to cxgb4 so it would also probably benefit from this change (once it's > | > fixed to call the new pci_set_size_vpd() API). > | > | That indeed looks reasonable. > | Please find some comments inline. > | > | ... > | > | > --- a/drivers/pci/access.c > | > +++ a/drivers/pci/access.c > | > @@ -275,6 +275,19 @@ ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void > | > } > | > EXPORT_SYMBOL(pci_write_vpd); > | > > | > +/** > | > + * pci_set_size_vpd - Set size of Vital Product Data space > | > + * @dev: pci device struct > | > + * @len: size of vpd space > | > + */ > | > +ssize_t pci_set_size_vpd(struct pci_dev *dev, size_t len) > | > +{ > | > + if (!dev->vpd || !dev->vpd->ops) > | > + return -ENODEV; > | > + return dev->vpd->ops->set_size(dev, len); > | > +} > | > +EXPORT_SYMBOL(pci_set_size_vpd); > | > + > | > #define PCI_VPD_MAX_SIZE (PCI_VPD_ADDR_MASK + 1) > | > > | > /** > | > @@ -392,13 +405,8 @@ static ssize_t pci_vpd_read(struct pci_dev *dev, loff_t pos, size_t count, > | > if (vpd->len == 0) > | > return -EIO; > | > > | > - if (pos > vpd->len) > | > - return 0; > | > - > | > - if (end > vpd->len) { > | > - end = vpd->len; > | > - count = end - pos; > | > - } > | > + if (end > vpd->len) > | > + return -EINVAL; > | > > | > if (mutex_lock_killable(&vpd->lock)) > | > return -EINTR; > | > | Why do you need this change? > | We still would be needing to validate 'pos', don't we? > | I'd prefer not to have this bit in. > > Two reasons: > > 1. It makes pci_vpd_read() with pci_vpd_write() which has exactly this > logic. > > 2. More importantly, the new implementation of pci_read_vpd() silently > fails to perform a VPD read and allows the caller to use random stack > garbage in the read buffer without knowing that it's not really VPD > contents. If any portion of the VPD read isn't going to be performed, > we should signal that back to the caller. We could either return an > error or we could return the number of bytes actually read. The problem > with the latter is that it would require changing every single caller to > check for Requested Read Length == Actual Read Length. Returning an > error is the more conservative fix and allows for rapid diagnosis of > problems. > > And that last point is important because I spent quite a bit of time > digging around trying to figure out why cxgb4 suddenly wasn't working. > > I agree with Casey. As I said before, pci_vpd_read() returning 0 and not reading anything has to be a bug, right? IMO it needs to return a negative errno if the region defined by the offset/length being read exceeds the entire vpd region length. I don't see the use of a partial read anyway. Is there utility in that? Also, I suggest the patch Hari sent out be split into two patches: 1) add the pci_set_size_vpd() API and usage by cxgb4 - this is critical for 4.6-rc 2) fixing pci_vpd_read() - perhaps less critical, but if we converge on a consensus, it can hit 4.6-rc or 4.7 Thoughts? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length 2016-04-12 17:35 ` Casey Leedom 2016-04-12 20:17 ` Casey Leedom 2016-04-12 21:52 ` Steve Wise @ 2016-04-13 6:00 ` Hannes Reinecke 2016-04-13 16:52 ` Casey Leedom 2 siblings, 1 reply; 12+ messages in thread From: Hannes Reinecke @ 2016-04-13 6:00 UTC (permalink / raw) To: Casey Leedom, Hariprasad S Cc: Bjorn Helgaas, SWise OGC, linux-pci@vger.kernel.org, bhelgaas@google.com On 04/12/2016 07:35 PM, Casey Leedom wrote: > | From: Hannes Reinecke <hare@suse.com> > | Sent: Tuesday, April 12, 2016 1:46 AM > | To: Hariprasad S > | Cc: Bjorn Helgaas; SWise OGC; Casey Leedom; linux-pci@vger.kernel.org; bhelgaas@google.com > | Subject: Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length > | > | On 04/12/2016 10:23 AM, Hariprasad Shenai wrote: > | [ .. ] > | > > | > Hi, > | > > | > How about adding a PCI helper function to set the actual VPD_SIZE. > | > > | > Some thing like below. We have tested this and works. The bnx2x and the tg3 > | > drive may also need this, because I see them calling pci_read_vpd() > | > with non-zero offsets. The bnx2x in particular looks like it's doing something > | > similar to cxgb4 so it would also probably benefit from this change (once it's > | > fixed to call the new pci_set_size_vpd() API). > | > | That indeed looks reasonable. > | Please find some comments inline. > | > | ... > | > | > --- a/drivers/pci/access.c > | > +++ a/drivers/pci/access.c > | > @@ -275,6 +275,19 @@ ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void > | > } > | > EXPORT_SYMBOL(pci_write_vpd); > | > > | > +/** > | > + * pci_set_size_vpd - Set size of Vital Product Data space > | > + * @dev: pci device struct > | > + * @len: size of vpd space > | > + */ > | > +ssize_t pci_set_size_vpd(struct pci_dev *dev, size_t len) > | > +{ > | > + if (!dev->vpd || !dev->vpd->ops) > | > + return -ENODEV; > | > + return dev->vpd->ops->set_size(dev, len); > | > +} > | > +EXPORT_SYMBOL(pci_set_size_vpd); > | > + > | > #define PCI_VPD_MAX_SIZE (PCI_VPD_ADDR_MASK + 1) > | > > | > /** > | > @@ -392,13 +405,8 @@ static ssize_t pci_vpd_read(struct pci_dev *dev, loff_t pos, size_t count, > | > if (vpd->len == 0) > | > return -EIO; > | > > | > - if (pos > vpd->len) > | > - return 0; > | > - > | > - if (end > vpd->len) { > | > - end = vpd->len; > | > - count = end - pos; > | > - } > | > + if (end > vpd->len) > | > + return -EINVAL; > | > > | > if (mutex_lock_killable(&vpd->lock)) > | > return -EINTR; > | > | Why do you need this change? > | We still would be needing to validate 'pos', don't we? > | I'd prefer not to have this bit in. > > Two reasons: > > 1. It makes pci_vpd_read() with pci_vpd_write() which has exactly this > logic. > > 2. More importantly, the new implementation of pci_read_vpd() silently > fails to perform a VPD read and allows the caller to use random stack > garbage in the read buffer without knowing that it's not really VPD > contents. If any portion of the VPD read isn't going to be performed, > we should signal that back to the caller. We could either return an > error or we could return the number of bytes actually read. The problem > with the latter is that it would require changing every single caller to > check for Requested Read Length == Actual Read Length. Returning an > error is the more conservative fix and allows for rapid diagnosis of > problems. > > And that last point is important because I spent quite a bit of time > digging around trying to figure out why cxgb4 suddenly wasn't working. > Okay. But wouldn't we need to check for 'pos' exceeding 'vpd->len', too? Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.com +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length 2016-04-13 6:00 ` Hannes Reinecke @ 2016-04-13 16:52 ` Casey Leedom 0 siblings, 0 replies; 12+ messages in thread From: Casey Leedom @ 2016-04-13 16:52 UTC (permalink / raw) To: Hannes Reinecke, Hariprasad S Cc: Bjorn Helgaas, SWise OGC, linux-pci@vger.kernel.org, bhelgaas@google.com, Ariel Elior | From: linux-pci-owner@vger.kernel.org <linux-pci-owner@vger.kernel.org> on behalf of Hannes Reinecke <hare@suse.com> | Sent: Tuesday, April 12, 2016 11:00 PM | | On 04/12/2016 07:35 PM, Casey Leedom wrote: | > | From: Hannes Reinecke <hare@suse.com> | > | Sent: Tuesday, April 12, 2016 1:46 AM | > | ... | > | > --- a/drivers/pci/access.c | > | > +++ a/drivers/pci/access.c | > | > ... | > | > @@ -392,13 +405,8 @@ static ssize_t pci_vpd_read(struct pci_dev *dev, loff_t pos, size_t count, | > | > if (vpd->len == 0) | > | > return -EIO; | > | > | > | > - if (pos > vpd->len) | > | > - return 0; | > | > - | > | > - if (end > vpd->len) { | > | > - end = vpd->len; | > | > - count = end - pos; | > | > - } | > | > + if (end > vpd->len) | > | > + return -EINVAL; | > | > | > | > if (mutex_lock_killable(&vpd->lock)) | > | > return -EINTR; | > | | > | Why do you need this change? | > | We still would be needing to validate 'pos', don't we? | > | I'd prefer not to have this bit in. | > | > Two reasons: | > | > 1. It makes pci_vpd_read() with pci_vpd_write() which has exactly this | > logic. | > | > 2. More importantly, the new implementation of pci_read_vpd() silently | > fails to perform a VPD read and allows the caller to use random stack | > garbage in the read buffer without knowing that it's not really VPD | > contents. If any portion of the VPD read isn't going to be performed, | > we should signal that back to the caller. We could either return an | > error or we could return the number of bytes actually read. The problem | > with the latter is that it would require changing every single caller to | > check for Requested Read Length == Actual Read Length. Returning an | > error is the more conservative fix and allows for rapid diagnosis of | > problems. | > | > And that last point is important because I spent quite a bit of time | > digging around trying to figure out why cxgb4 suddenly wasn't working. | | Okay. But wouldn't we need to check for 'pos' exceeding 'vpd->len', too? Nope. "pos", "count", and "end" are unsigned: loff_t end = pos + count; If "pos" is greater than "vpd->len", "end" will be as well. This is actually exactly the same code that's already in pci_vpd_write(). Casey ^ permalink raw reply [flat|nested] 12+ messages in thread
* 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length
@ 2016-04-08 21:58 Casey Leedom
0 siblings, 0 replies; 12+ messages in thread
From: Casey Leedom @ 2016-04-08 21:58 UTC (permalink / raw)
To: linux-pci@vger.kernel.org; +Cc: Bjorn Helgaas, Hannes Reinecke, Hariprasad S
[[ I apologize if this is a duplicate message. I've been struggling all
afternoon to get a message successfully delivered to the linux-pci
list.
-- Casey ]]
kernel.org commit 104daa71b396 added a check to make sure that efforts to
read/write the VPD wouldn't extend past the computed length of the VPD.
Later, kernel.org commit 408641e93aa5 folded the pci_vpd_pci22 into
struct pci_vpd so things moved around a bit after that and an error return
got changed into a silent failure instead of -EINVAL.
The problem is that the previous pci_vpd_pci22_read() didn't check for a read with
a VPD Offset > VPD Length and the new pci_vpd_read() is checking that. Worse
yet, when a VPD Offset is greater than the recorded VPD Length, it simply
returns 0 rather than -EINVAL.
The problem is stemming from the fact that the Chelsio adapters actually have
two VPD structures stored in the VPD. An abbreviated on at Offset 0x0 and the
complete VPD at Offset 0x400. The abbreviated one only contains the PN, SN and
EC Keywords, while the complete VPD contains those plus various adapter
constants contained in V0, V1, etc. And it also contains the Base Ethernet MAC
Address in the "NA" Keyword which the cxgb4 driver needs when it can't contact
the adapter firmware. (We don't have the "NA" Keywork in the VPD Structure at
Offset 0x0 because that's not an allowed VPD Keyword in the PCI-E 3.0
specification.)
With the new code, the computed size of the VPD is 0x200 and so our efforts
to read the VPD at Offset 0x400 silently fails. We check the result of the
read looking for a signature 0x82 byte but we're checking against random stack
garbage.
The end result is that the cxgb4 driver now fails the PCI-E Probe.
Again, apologies if I'm not communicating this to the Linux Development
Community properly -- I'm usually not allowed out of my cage.
Casey
^ permalink raw reply [flat|nested] 12+ messages in threadend of thread, other threads:[~2016-04-13 17:06 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-08 22:24 4.6-rc2 regression with commit 104daa71b396: check VPD access offset against length Casey Leedom
[not found] ` <BY2PR12MB0648F0C35D907AA640D7E56CC8940@BY2PR12MB0648.namprd12.prod.outlook.com>
2016-04-11 18:16 ` Steve Wise
2016-04-12 5:37 ` Bjorn Helgaas
2016-04-12 6:20 ` Hannes Reinecke
2016-04-12 8:23 ` Hariprasad Shenai
2016-04-12 8:46 ` Hannes Reinecke
2016-04-12 17:35 ` Casey Leedom
2016-04-12 20:17 ` Casey Leedom
2016-04-12 21:52 ` Steve Wise
2016-04-13 6:00 ` Hannes Reinecke
2016-04-13 16:52 ` Casey Leedom
-- strict thread matches above, loose matches on Subject: below --
2016-04-08 21:58 Casey Leedom
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).