* [PATCH] Cover letter for (PCI/AER: only insert one element into kfifo) @ 2018-12-12 8:32 ` Yanjiang Jin 0 siblings, 0 replies; 8+ messages in thread From: Yanjiang Jin @ 2018-12-12 8:32 UTC (permalink / raw) To: bhelgaas, keith.busch Cc: yu.zheng, ruscur, sbobroff, oohall, jinyanjiang, yanjiang.jin, linuxppc-dev, linux-pci, linux-kernel Without this patch, if we have multi PCIe devices, and one of them has AER error, aer_recover_work_func() -> kfifo_get() will traverse the whole kfifo which has wrong element number(16). If one null element's uninitialized memory matches another PCIe device(0000:01:00.0), we may get the below call trace. It is unusual, but indeed happened on my board: QDF2400. # lspci 0000:00:00.0 PCI bridge: 0000:01:00.0 Ethernet controller: 0004:00:00.0 PCI bridge: 0004:01:00.0 Ethernet controller: 0005:00:00.0 PCI bridge: 0005:01:00.0 Ethernet controller: Call trace: [Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5 [Hardware Error]: It has been corrected by h/w and requires no further action [Hardware Error]: event severity: corrected [Hardware Error]: precise tstamp: 2018-11-29 09:23:16 [Hardware Error]: Error 0, type: corrected [Hardware Error]: section_type: PCIe error [Hardware Error]: port_type: 4, root port [Hardware Error]: version: 3.0 [Hardware Error]: command: 0x0407, status: 0x0010 [Hardware Error]: device_id: 0004:00:00.0 [Hardware Error]: slot: 0 [Hardware Error]: secondary_bus: 0x01 [Hardware Error]: vendor_id: 0x17cb, device_id: 0x0401 [Hardware Error]: class_code: 000406 [Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0000 AER recover: find pci_dev for 0004:00:00:0 pcieport 0004:00:00.0: aer_status: 0x00000001, aer_mask: 0x0000e000 pcieport 0004:00:00.0: [ 0] RxErr (First) pcieport 0004:00:00.0: aer_layer=Physical Layer, aer_agent=Receiver ID AER recover: Can not find pci_dev for a38f:00:18:2 AER recover: Can not find pci_dev for 0857:1c:03:5 AER recover: Can not find pci_dev for 62d2:80:19:6 AER recover: Can not find pci_dev for 0857:f8:03:4 AER recover: Can not find pci_dev for 0907:78:07:1 AER recover: Can not find pci_dev for 0000:00:00:1 AER recover: Can not find pci_dev for 0907:00:00:0 AER recover: Can not find pci_dev for 0000:00:00:1 AER recover: find pci_dev for 0000:01:00:0 Unable to handle kernel paging request at virtual address 0000000000813004 Mem abort info: ESR = 0x96000007 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000007 CM = 0, WnR = 0 user pgtable: 64k pages, 48-bit VAs, pgdp = 000000000dce9024 [0000000000813004] pgd=0000001727260003, pud=0000001727260003 pmd=0000001727290003, pte=0000000000000000 Internal error: Oops: 96000007 [#1] SMP Workqueue: events aer_recover_work_func pstate: 20400005 (nzCv daif +PAN -UAO) pc : cper_print_aer+0x4c/0x290 lr : aer_recover_work_func+0x110/0x150 sp : ffff8017ca59fca0 x29: ffff8017ca59fca0 x28: ffff8017ca841000 x27: ffff8017ca841000 x26: 0000000000000001 x25: 0000000000813000 x24: 0000000000000040 x23: 0000000000000040 x22: ffff000008d5f830 x21: ffff0000090f1f10 x20: ffff0000090f1e98 x19: 0000000000000000 x18: ffffffffffffffff x17: 0000000000000001 x16: 0000000000000007 x15: ffff000009073708 x14: ffff0000891e8faf x13: ffff0000091e8fbd x12: 2c726579614c206c x11: ffff00000909b000 x10: 0000000005f5e0ff x9 : ffff8017ca59fa10 x8 : ffff000009073978 x7 : ffff0000091e8a40 x6 : 0000000000000518 x5 : 0000000000000001 x4 : ffff8017ff9710b8 x3 : ffff8017ff9710b8 x2 : 0000000000813000 x1 : 0000000000000000 x0 : ffff000009073708 Process kworker/11:1 (pid: 232, stack limit = 0x00000000060ad7e1) Call trace: cper_print_aer+0x4c/0x290 aer_recover_work_func+0x110/0x150 process_one_work+0x1ac/0x3f0 worker_thread+0x54/0x430 kthread+0x104/0x130 ret_from_fork+0x10/0x18 Code: f9400001 f90057a1 d2800001 54000f40 (2940e334) SMP: stopping secondary CPUs Starting crashdump kernel... Bye! This email is intended only for the named addressee. It may contain information that is confidential/private, legally privileged, or copyright-protected, and you should handle it accordingly. If you are not the intended recipient, you do not have legal rights to retain, copy, or distribute this email or its contents, and should promptly delete the email and all electronic copies in your system; do not retain copies in any media. If you have received this email in error, please notify the sender promptly. Thank you. ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Cover letter for (PCI/AER: only insert one element into kfifo) @ 2018-12-12 8:32 ` Yanjiang Jin 0 siblings, 0 replies; 8+ messages in thread From: Yanjiang Jin @ 2018-12-12 8:32 UTC (permalink / raw) To: bhelgaas, keith.busch Cc: yanjiang.jin, sbobroff, linux-pci, jinyanjiang, linux-kernel, oohall, linuxppc-dev, yu.zheng Without this patch, if we have multi PCIe devices, and one of them has AER error, aer_recover_work_func() -> kfifo_get() will traverse the whole kfifo which has wrong element number(16). If one null element's uninitialized memory matches another PCIe device(0000:01:00.0), we may get the below call trace. It is unusual, but indeed happened on my board: QDF2400. # lspci 0000:00:00.0 PCI bridge: 0000:01:00.0 Ethernet controller: 0004:00:00.0 PCI bridge: 0004:01:00.0 Ethernet controller: 0005:00:00.0 PCI bridge: 0005:01:00.0 Ethernet controller: Call trace: [Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5 [Hardware Error]: It has been corrected by h/w and requires no further action [Hardware Error]: event severity: corrected [Hardware Error]: precise tstamp: 2018-11-29 09:23:16 [Hardware Error]: Error 0, type: corrected [Hardware Error]: section_type: PCIe error [Hardware Error]: port_type: 4, root port [Hardware Error]: version: 3.0 [Hardware Error]: command: 0x0407, status: 0x0010 [Hardware Error]: device_id: 0004:00:00.0 [Hardware Error]: slot: 0 [Hardware Error]: secondary_bus: 0x01 [Hardware Error]: vendor_id: 0x17cb, device_id: 0x0401 [Hardware Error]: class_code: 000406 [Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0000 AER recover: find pci_dev for 0004:00:00:0 pcieport 0004:00:00.0: aer_status: 0x00000001, aer_mask: 0x0000e000 pcieport 0004:00:00.0: [ 0] RxErr (First) pcieport 0004:00:00.0: aer_layer=Physical Layer, aer_agent=Receiver ID AER recover: Can not find pci_dev for a38f:00:18:2 AER recover: Can not find pci_dev for 0857:1c:03:5 AER recover: Can not find pci_dev for 62d2:80:19:6 AER recover: Can not find pci_dev for 0857:f8:03:4 AER recover: Can not find pci_dev for 0907:78:07:1 AER recover: Can not find pci_dev for 0000:00:00:1 AER recover: Can not find pci_dev for 0907:00:00:0 AER recover: Can not find pci_dev for 0000:00:00:1 AER recover: find pci_dev for 0000:01:00:0 Unable to handle kernel paging request at virtual address 0000000000813004 Mem abort info: ESR = 0x96000007 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000007 CM = 0, WnR = 0 user pgtable: 64k pages, 48-bit VAs, pgdp = 000000000dce9024 [0000000000813004] pgd=0000001727260003, pud=0000001727260003 pmd=0000001727290003, pte=0000000000000000 Internal error: Oops: 96000007 [#1] SMP Workqueue: events aer_recover_work_func pstate: 20400005 (nzCv daif +PAN -UAO) pc : cper_print_aer+0x4c/0x290 lr : aer_recover_work_func+0x110/0x150 sp : ffff8017ca59fca0 x29: ffff8017ca59fca0 x28: ffff8017ca841000 x27: ffff8017ca841000 x26: 0000000000000001 x25: 0000000000813000 x24: 0000000000000040 x23: 0000000000000040 x22: ffff000008d5f830 x21: ffff0000090f1f10 x20: ffff0000090f1e98 x19: 0000000000000000 x18: ffffffffffffffff x17: 0000000000000001 x16: 0000000000000007 x15: ffff000009073708 x14: ffff0000891e8faf x13: ffff0000091e8fbd x12: 2c726579614c206c x11: ffff00000909b000 x10: 0000000005f5e0ff x9 : ffff8017ca59fa10 x8 : ffff000009073978 x7 : ffff0000091e8a40 x6 : 0000000000000518 x5 : 0000000000000001 x4 : ffff8017ff9710b8 x3 : ffff8017ff9710b8 x2 : 0000000000813000 x1 : 0000000000000000 x0 : ffff000009073708 Process kworker/11:1 (pid: 232, stack limit = 0x00000000060ad7e1) Call trace: cper_print_aer+0x4c/0x290 aer_recover_work_func+0x110/0x150 process_one_work+0x1ac/0x3f0 worker_thread+0x54/0x430 kthread+0x104/0x130 ret_from_fork+0x10/0x18 Code: f9400001 f90057a1 d2800001 54000f40 (2940e334) SMP: stopping secondary CPUs Starting crashdump kernel... Bye! This email is intended only for the named addressee. It may contain information that is confidential/private, legally privileged, or copyright-protected, and you should handle it accordingly. If you are not the intended recipient, you do not have legal rights to retain, copy, or distribute this email or its contents, and should promptly delete the email and all electronic copies in your system; do not retain copies in any media. If you have received this email in error, please notify the sender promptly. Thank you. ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] PCI/AER: only insert one element into kfifo 2018-12-12 8:32 ` Yanjiang Jin @ 2018-12-12 8:32 ` Yanjiang Jin -1 siblings, 0 replies; 8+ messages in thread From: Yanjiang Jin @ 2018-12-12 8:32 UTC (permalink / raw) To: bhelgaas, keith.busch Cc: yu.zheng, ruscur, sbobroff, oohall, jinyanjiang, yanjiang.jin, linuxppc-dev, linux-pci, linux-kernel 'commit ecae65e133f2 ("PCI/AER: Use kfifo_in_spinlocked() to insert locked elements")' replace kfifo_put() with kfifo_in_spinlocked(). But as "kfifo_in(fifo, buf, n)" describes: " * @n: number of elements to be added". We want to insert only one element into kfifo, not "sizeof(entry) = 16". Without this patch, we would get 15 uninitialized elements. Signed-off-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com> --- drivers/pci/pcie/aer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index a90a919..fed29de 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -1064,7 +1064,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn, .regs = aer_regs, }; - if (kfifo_in_spinlocked(&aer_recover_ring, &entry, sizeof(entry), + if (kfifo_in_spinlocked(&aer_recover_ring, &entry, 1, &aer_recover_ring_lock)) schedule_work(&aer_recover_work); else -- 1.8.3.1 This email is intended only for the named addressee. It may contain information that is confidential/private, legally privileged, or copyright-protected, and you should handle it accordingly. If you are not the intended recipient, you do not have legal rights to retain, copy, or distribute this email or its contents, and should promptly delete the email and all electronic copies in your system; do not retain copies in any media. If you have received this email in error, please notify the sender promptly. Thank you. ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH] PCI/AER: only insert one element into kfifo @ 2018-12-12 8:32 ` Yanjiang Jin 0 siblings, 0 replies; 8+ messages in thread From: Yanjiang Jin @ 2018-12-12 8:32 UTC (permalink / raw) To: bhelgaas, keith.busch Cc: yanjiang.jin, sbobroff, linux-pci, jinyanjiang, linux-kernel, oohall, linuxppc-dev, yu.zheng 'commit ecae65e133f2 ("PCI/AER: Use kfifo_in_spinlocked() to insert locked elements")' replace kfifo_put() with kfifo_in_spinlocked(). But as "kfifo_in(fifo, buf, n)" describes: " * @n: number of elements to be added". We want to insert only one element into kfifo, not "sizeof(entry) = 16". Without this patch, we would get 15 uninitialized elements. Signed-off-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com> --- drivers/pci/pcie/aer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index a90a919..fed29de 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -1064,7 +1064,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn, .regs = aer_regs, }; - if (kfifo_in_spinlocked(&aer_recover_ring, &entry, sizeof(entry), + if (kfifo_in_spinlocked(&aer_recover_ring, &entry, 1, &aer_recover_ring_lock)) schedule_work(&aer_recover_work); else -- 1.8.3.1 This email is intended only for the named addressee. It may contain information that is confidential/private, legally privileged, or copyright-protected, and you should handle it accordingly. If you are not the intended recipient, you do not have legal rights to retain, copy, or distribute this email or its contents, and should promptly delete the email and all electronic copies in your system; do not retain copies in any media. If you have received this email in error, please notify the sender promptly. Thank you. ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] PCI/AER: only insert one element into kfifo 2018-12-12 8:32 ` Yanjiang Jin @ 2018-12-12 14:40 ` Keith Busch -1 siblings, 0 replies; 8+ messages in thread From: Keith Busch @ 2018-12-12 14:40 UTC (permalink / raw) To: Yanjiang Jin Cc: bhelgaas, yu.zheng, ruscur, sbobroff, oohall, jinyanjiang, linuxppc-dev, linux-pci, linux-kernel On Wed, Dec 12, 2018 at 04:32:30PM +0800, Yanjiang Jin wrote: > 'commit ecae65e133f2 ("PCI/AER: Use kfifo_in_spinlocked() to > insert locked elements")' replace kfifo_put() with kfifo_in_spinlocked(). > > But as "kfifo_in(fifo, buf, n)" describes: > " * @n: number of elements to be added". > > We want to insert only one element into kfifo, not "sizeof(entry) = 16". > Without this patch, we would get 15 uninitialized elements. > > Signed-off-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com> My bad. I had trouble testing the GHES path for this. Thanks for the fix. Reviewed-by: Keith Busch <keith.busch@intel.com> > --- > drivers/pci/pcie/aer.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index a90a919..fed29de 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -1064,7 +1064,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn, > .regs = aer_regs, > }; > > - if (kfifo_in_spinlocked(&aer_recover_ring, &entry, sizeof(entry), > + if (kfifo_in_spinlocked(&aer_recover_ring, &entry, 1, > &aer_recover_ring_lock)) > schedule_work(&aer_recover_work); > else > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] PCI/AER: only insert one element into kfifo @ 2018-12-12 14:40 ` Keith Busch 0 siblings, 0 replies; 8+ messages in thread From: Keith Busch @ 2018-12-12 14:40 UTC (permalink / raw) To: Yanjiang Jin Cc: sbobroff, linux-pci, jinyanjiang, linux-kernel, oohall, bhelgaas, linuxppc-dev, yu.zheng On Wed, Dec 12, 2018 at 04:32:30PM +0800, Yanjiang Jin wrote: > 'commit ecae65e133f2 ("PCI/AER: Use kfifo_in_spinlocked() to > insert locked elements")' replace kfifo_put() with kfifo_in_spinlocked(). > > But as "kfifo_in(fifo, buf, n)" describes: > " * @n: number of elements to be added". > > We want to insert only one element into kfifo, not "sizeof(entry) = 16". > Without this patch, we would get 15 uninitialized elements. > > Signed-off-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com> My bad. I had trouble testing the GHES path for this. Thanks for the fix. Reviewed-by: Keith Busch <keith.busch@intel.com> > --- > drivers/pci/pcie/aer.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index a90a919..fed29de 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -1064,7 +1064,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn, > .regs = aer_regs, > }; > > - if (kfifo_in_spinlocked(&aer_recover_ring, &entry, sizeof(entry), > + if (kfifo_in_spinlocked(&aer_recover_ring, &entry, 1, > &aer_recover_ring_lock)) > schedule_work(&aer_recover_work); > else > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] PCI/AER: only insert one element into kfifo 2018-12-12 8:32 ` Yanjiang Jin @ 2018-12-14 17:36 ` Bjorn Helgaas -1 siblings, 0 replies; 8+ messages in thread From: Bjorn Helgaas @ 2018-12-14 17:36 UTC (permalink / raw) To: Yanjiang Jin Cc: keith.busch, yu.zheng, ruscur, sbobroff, oohall, jinyanjiang, linuxppc-dev, linux-pci, linux-kernel On Wed, Dec 12, 2018 at 04:32:30PM +0800, Yanjiang Jin wrote: > 'commit ecae65e133f2 ("PCI/AER: Use kfifo_in_spinlocked() to > insert locked elements")' replace kfifo_put() with kfifo_in_spinlocked(). > > But as "kfifo_in(fifo, buf, n)" describes: > " * @n: number of elements to be added". > > We want to insert only one element into kfifo, not "sizeof(entry) = 16". > Without this patch, we would get 15 uninitialized elements. > > Signed-off-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com> Since this fixes ecae65e133f2, which was applied for v4.20, I applied this with Keith's reviewed-by to for-linus so we can get it into v4.20. For some reason the patch didn't apply, but I can't see why, so I just applied it by hand. Thanks! > --- > drivers/pci/pcie/aer.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index a90a919..fed29de 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -1064,7 +1064,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn, > .regs = aer_regs, > }; > > - if (kfifo_in_spinlocked(&aer_recover_ring, &entry, sizeof(entry), > + if (kfifo_in_spinlocked(&aer_recover_ring, &entry, 1, > &aer_recover_ring_lock)) > schedule_work(&aer_recover_work); > else > -- > 1.8.3.1 > > > > > This email is intended only for the named addressee. It may contain information that is confidential/private, legally privileged, or copyright-protected, and you should handle it accordingly. If you are not the intended recipient, you do not have legal rights to retain, copy, or distribute this email or its contents, and should promptly delete the email and all electronic copies in your system; do not retain copies in any media. If you have received this email in error, please notify the sender promptly. Thank you. > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] PCI/AER: only insert one element into kfifo @ 2018-12-14 17:36 ` Bjorn Helgaas 0 siblings, 0 replies; 8+ messages in thread From: Bjorn Helgaas @ 2018-12-14 17:36 UTC (permalink / raw) To: Yanjiang Jin Cc: sbobroff, linux-pci, jinyanjiang, linux-kernel, keith.busch, oohall, linuxppc-dev, yu.zheng On Wed, Dec 12, 2018 at 04:32:30PM +0800, Yanjiang Jin wrote: > 'commit ecae65e133f2 ("PCI/AER: Use kfifo_in_spinlocked() to > insert locked elements")' replace kfifo_put() with kfifo_in_spinlocked(). > > But as "kfifo_in(fifo, buf, n)" describes: > " * @n: number of elements to be added". > > We want to insert only one element into kfifo, not "sizeof(entry) = 16". > Without this patch, we would get 15 uninitialized elements. > > Signed-off-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com> Since this fixes ecae65e133f2, which was applied for v4.20, I applied this with Keith's reviewed-by to for-linus so we can get it into v4.20. For some reason the patch didn't apply, but I can't see why, so I just applied it by hand. Thanks! > --- > drivers/pci/pcie/aer.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index a90a919..fed29de 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -1064,7 +1064,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn, > .regs = aer_regs, > }; > > - if (kfifo_in_spinlocked(&aer_recover_ring, &entry, sizeof(entry), > + if (kfifo_in_spinlocked(&aer_recover_ring, &entry, 1, > &aer_recover_ring_lock)) > schedule_work(&aer_recover_work); > else > -- > 1.8.3.1 > > > > > This email is intended only for the named addressee. It may contain information that is confidential/private, legally privileged, or copyright-protected, and you should handle it accordingly. If you are not the intended recipient, you do not have legal rights to retain, copy, or distribute this email or its contents, and should promptly delete the email and all electronic copies in your system; do not retain copies in any media. If you have received this email in error, please notify the sender promptly. Thank you. > > ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-12-14 17:39 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-12-12 8:32 [PATCH] Cover letter for (PCI/AER: only insert one element into kfifo) Yanjiang Jin 2018-12-12 8:32 ` Yanjiang Jin 2018-12-12 8:32 ` [PATCH] PCI/AER: only insert one element into kfifo Yanjiang Jin 2018-12-12 8:32 ` Yanjiang Jin 2018-12-12 14:40 ` Keith Busch 2018-12-12 14:40 ` Keith Busch 2018-12-14 17:36 ` Bjorn Helgaas 2018-12-14 17:36 ` Bjorn Helgaas
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.