linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Kernel oops from pci_disable_msi
       [not found] ` <CAErSpo4TbUuq0wb06JV9Xchmcjsk9q3cm7+XO-dOSiJAAhXPMA@mail.gmail.com>
@ 2013-11-18 21:02   ` Bjorn Helgaas
  2013-11-18 23:11     ` Jingoo Han
  0 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2013-11-18 21:02 UTC (permalink / raw)
  To: Bjørn Erik Nilsen; +Cc: jg1.han@samsung.com, linux-pci@vger.kernel.org

[+cc linux-pci for real]

On Mon, Nov 18, 2013 at 2:01 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> [+cc linux-pci]
>
> On Mon, Nov 18, 2013 at 6:53 AM, Bjørn Erik Nilsen <ben@datarespons.no> wrote:
>> I just hit an kernel oops related to PCI (in dw_msi_teardown_irq()/clear_irq() (pcie-designware))
>>
>> Linux version 3.12.0-next-20131105 (bnilsen@bnilsen) (gcc version 4.7.2 (GCC) )
>>
>> Problem seem to be dereferencing a null pointer returned from irq_desc_get_msi_desc(desc) (see attached backtrace).
>
> Included oops inline for ease of viewing/searching.  Jingooo, I assume
> you'll investigate this.  Let me know if otherwise.
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000020
> pgd = 80004000
> [00000020] *pgd=00000000
> Internal error: Oops: 17 [#1] SMP ARM
> Modules linked in: sxdma(O)
> CPU: 1 PID: 569 Comm: i2cipc.B3 Tainted: G           O 3.12.0-next-20131105 #8
> task: 9efcb600 ti: 9ec8c000 task.ti: 9ec8c000
> PC is at dw_msi_teardown_irq+0x40/0x118
> LR is at trace_hardirqs_on_caller+0xf4/0x1c0
> pc : [<802c405c>]    lr : [<80078370>]    psr: 20000013
> sp : 9ec8dca8  ip : 9ec8dbe0  fp : 9ec8dccc
> r10: 9fa58800  r9 : 00000000  r8 : 00000001
> r7 : 9f9fecc0  r6 : 00000188  r5 : 9ecdb580  r4 : 00000000
> r3 : 9efcb600  r2 : 00000000  r1 : 00000001  r0 : 00000012
> Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 10c5387d  Table: 2ed1804a  DAC: 00000015
> Process i2cipc.B3 (pid: 569, stack limit = 0x9ec8c240)
> Stack: (0x9ec8dca8 to 0x9ec8e000)
> dca0:                   8005d1a4 802844c8 00000188 9ecdb580 00000004 9fa58c74
> dcc0: 9ec8dce4 9ec8dcd0 802c1844 802c4028 00000000 00000002 9ec8dd0c 9ec8dce8
> dce0: 802c1a08 802c1814 00000000 00000004 9fa58c60 00000004 9fa58c74 00000001
> dd00: 9ec8dd1c 9ec8dd10 802c1a34 802c19ac 9ec8dd44 9ec8dd20 802c1ad0 802c1a30
> dd20: 9fa58800 8084a410 00047c8c 0000016c 0000016c 00000008 9ec8dd5c 9ec8dd48
> dd40: 802c2570 802c1a44 00000000 9d580000 9ec8dd74 9ec8dd60 7f0057d4 802c2534
> dd60: 00000000 9d580000 9ec8dd9c 9ec8dd78 7f000a38 7f00579c 7f0007b8 9f3fa200
> dd80: 9f5a2bd0 9f0c9cd0 9f383d50 9f383d50 9ec8dddc 9ec8dda0 800d4d40 7f0007c4
> dda0: 00000000 00000000 800cf478 9f3fa208 9f05b9c0 9efcb8f8 80891d40 9e3aab00
> ddc0: 9efcb600 9ec8c000 9ec8c000 9d450254 9ec8ddec 9ec8dde0 800d4f18 800d4cb4
> dde0: 9ec8de0c 9ec8ddf0 8003d7cc 800d4f14 9efcb908 00000001 9efcb600 9d450200
> de00: 9ec8de44 9ec8de10 80025464 8003d72c 9ec8de3c 00000001 80078370 8007814c
> de20: 9e0b2604 9ec8decc 9ec8c000 9e0b2604 9ec8de4c 9e0b2604 9ec8de64 9ec8de48
> de40: 80025b9c 800251cc 9ec8c000 9ec8decc 9ec8c000 9e0b2604 9ec8deb4 9ec8de68
> de60: 80031490 80025b64 80073404 80073330 9ec8de94 9ec8de80 805faf2c 9ec8deb8
> de80: 418004fc 9d46e680 9ec8debc 9ec8dfb0 9ec8c000 fffffdfc 9ec8deb8 76d533d0
> dea0: 9ec8c000 76d533d4 9ec8df7c 9ec8deb8 8001153c 80031278 80078370 8007814c
> dec0: 40000013 9f072728 8083ffc0 00000009 00000000 00000000 00000000 00000000
> dee0: 9ec8df04 9ec8def0 805faf1c 80078448 9f072400 81215fc0 9ec8df2c 9ec8df08
> df00: 8004c6dc 805faef0 00000000 9f072400 00000000 6deff4c8 000003e2 00000000
> df20: 9ec8df8c 9ec8c000 80011ab8 9efcb600 00000001 9ec8dfb0 9ec8c000 00000000
> df40: 9ec8df6c 9ec8df50 80078370 8007814c 9ec8c000 9ec8c000 9ec8c000 9ec8c000
> df60: 8000ea64 9ec8dfb0 9ec8c000 00000000 9ec8dfac 9ec8df80 80011ad4 800112fc
> df80: 00000000 00000001 8000e8ac 00000000 0341cc18 00000001 000000a8 8000ea64
> dfa0: 00000000 9ec8dfb0 8000e8f8 80011a58 74ffeaa0 00000002 ffffffff 012f7314
> dfc0: 00000000 0341cc18 00000001 000000a8 74ffeb98 0341cd80 0341bcf0 74ffece4
> dfe0: 00000000 74ffea70 76d533c4 76d533d0 80000010 74ffeaa0 00000000 00000000
> Backtrace:
> [<802c401c>] (dw_msi_teardown_irq+0x0/0x118) from [<802c1844>]
> (arch_teardown_msi_irq+0x3c/0x40)
>  r7:9fa58c74 r6:00000004 r5:9ecdb580 r4:00000188
> [<802c1808>] (arch_teardown_msi_irq+0x0/0x40) from [<802c1a08>]
> (default_teardown_msi_irqs+0x68/0x84)
>  r4:00000002 r3:00000000
> [<802c19a0>] (default_teardown_msi_irqs+0x0/0x84) from [<802c1a34>]
> (arch_teardown_msi_irqs+0x10/0x14)
>  r8:00000001 r7:9fa58c74 r6:00000004 r5:9fa58c60 r4:00000004
> r3:00000000
> [<802c1a24>] (arch_teardown_msi_irqs+0x0/0x14) from [<802c1ad0>]
> (free_msi_irqs+0x98/0x144)
> [<802c1a38>] (free_msi_irqs+0x0/0x144) from [<802c2570>]
> (pci_disable_msi+0x48/0x60)
> [<802c2528>] (pci_disable_msi+0x0/0x60) from [<7f0057d4>]
> (sxdma_irq_free+0x44/0x48 [sxdma])
>  r4:9d580000 r3:00000000
> [<7f005790>] (sxdma_irq_free+0x0/0x48 [sxdma]) from [<7f000a38>]
> (sxdma_dev_release+0x280/0x2b0 [sxdma])
>  r4:9d580000 r3:00000000
> [<7f0007b8>] (sxdma_dev_release+0x0/0x2b0 [sxdma]) from [<800d4d40>]
> (__fput+0x98/0x214)
>  r8:9f383d50 r7:9f383d50 r6:9f0c9cd0 r5:9f5a2bd0 r4:9f3fa200
> r3:7f0007b8
> [<800d4ca8>] (__fput+0x0/0x214) from [<800d4f18>] (____fput+0x10/0x14)
> [<800d4f08>] (____fput+0x0/0x14) from [<8003d7cc>] (task_work_run+0xac/0xe0)
> [<8003d720>] (task_work_run+0x0/0xe0) from [<80025464>] (do_exit+0x2a4/0x8c8)
>  r7:9d450200 r6:9efcb600 r5:00000001 r4:9efcb908
> [<800251c0>] (do_exit+0x0/0x8c8) from [<80025b9c>] (do_group_exit+0x44/0xb8)
>  r7:9e0b2604
> [<80025b58>] (do_group_exit+0x0/0xb8) from [<80031490>]
> (get_signal_to_deliver+0x224/0x5b4)
>  r7:9e0b2604 r6:9ec8c000 r5:9ec8decc r4:9ec8c000
> [<8003126c>] (get_signal_to_deliver+0x0/0x5b4) from [<8001153c>]
> (do_signal+0x24c/0x41c)
> [<800112f0>] (do_signal+0x0/0x41c) from [<80011ad4>] (do_work_pending+0x88/0xc8)
> [<80011a4c>] (do_work_pending+0x0/0xc8) from [<8000e8f8>]
> (work_pending+0xc/0x20)
>  r8:8000ea64 r7:000000a8 r6:00000001 r5:0341cc18 r4:00000000
> Code: 0a000032 e5944024 e3540000 0a00002c (e5943020)
> ---[ end trace bb8d943ca1a10994 ]---
> Fixing recursive fault but reboot is needed!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Kernel oops from pci_disable_msi
  2013-11-18 21:02   ` Kernel oops from pci_disable_msi Bjorn Helgaas
@ 2013-11-18 23:11     ` Jingoo Han
  2013-11-19 11:24       ` Marek Vasut
  0 siblings, 1 reply; 28+ messages in thread
From: Jingoo Han @ 2013-11-18 23:11 UTC (permalink / raw)
  To: 'Bjorn Helgaas', 'Bjørn Erik Nilsen'
  Cc: linux-pci, 'Pratyush Anand',
	'Kishon Vijay Abraham I', 'Marek Vasut',
	'Mohit KUMAR DCG', 'Ajay KHANDELWAL',
	'Tim Harvey', 'Jingoo Han'

On Tuesday, November 19, 2013 6:03 AM, Bjorn Helgaas wrote:
> On Mon, Nov 18, 2013 at 2:01 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Mon, Nov 18, 2013 at 6:53 AM, Bjørn Erik Nilsen <ben@datarespons.no> wrote:
>>> I just hit an kernel oops related to PCI (in dw_msi_teardown_irq()/clear_irq()
>>> (pcie-designware))
>>>
>>> Linux version 3.12.0-next-20131105 (bnilsen@bnilsen) (gcc version 4.7.2 (GCC) )
>>>
>>> Problem seem to be dereferencing a null pointer returned from
>>> irq_desc_get_msi_desc(desc) (see attached backtrace).
>>
>> Included oops inline for ease of viewing/searching.  Jingooo, I assume
>> you'll investigate this.  Let me know if otherwise.

(+cc Marek Vasut, Pratyush Anand, Kishon Vijay Abraham I,
       Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey)

Sorry, I will not investigate this.

Bjørn Erik Nilsen,

Would you let us know the ARM platform and LAN card?
If you let us know them, one of these pcie-designware related
people would reproduce and look at the issue.

Best regards,
Jingoo Han

>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000020
>> pgd = 80004000
>> [00000020] *pgd=00000000
>> Internal error: Oops: 17 [#1] SMP ARM
>> Modules linked in: sxdma(O)
>> CPU: 1 PID: 569 Comm: i2cipc.B3 Tainted: G           O 3.12.0-next-20131105 #8
>> task: 9efcb600 ti: 9ec8c000 task.ti: 9ec8c000
>> PC is at dw_msi_teardown_irq+0x40/0x118
>> LR is at trace_hardirqs_on_caller+0xf4/0x1c0
>> pc : [<802c405c>]    lr : [<80078370>]    psr: 20000013
>> sp : 9ec8dca8  ip : 9ec8dbe0  fp : 9ec8dccc
>> r10: 9fa58800  r9 : 00000000  r8 : 00000001
>> r7 : 9f9fecc0  r6 : 00000188  r5 : 9ecdb580  r4 : 00000000
>> r3 : 9efcb600  r2 : 00000000  r1 : 00000001  r0 : 00000012
>> Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
>> Control: 10c5387d  Table: 2ed1804a  DAC: 00000015
>> Process i2cipc.B3 (pid: 569, stack limit = 0x9ec8c240)
>> Stack: (0x9ec8dca8 to 0x9ec8e000)
>> dca0:                   8005d1a4 802844c8 00000188 9ecdb580 00000004 9fa58c74
>> dcc0: 9ec8dce4 9ec8dcd0 802c1844 802c4028 00000000 00000002 9ec8dd0c 9ec8dce8
>> dce0: 802c1a08 802c1814 00000000 00000004 9fa58c60 00000004 9fa58c74 00000001
>> dd00: 9ec8dd1c 9ec8dd10 802c1a34 802c19ac 9ec8dd44 9ec8dd20 802c1ad0 802c1a30
>> dd20: 9fa58800 8084a410 00047c8c 0000016c 0000016c 00000008 9ec8dd5c 9ec8dd48
>> dd40: 802c2570 802c1a44 00000000 9d580000 9ec8dd74 9ec8dd60 7f0057d4 802c2534
>> dd60: 00000000 9d580000 9ec8dd9c 9ec8dd78 7f000a38 7f00579c 7f0007b8 9f3fa200
>> dd80: 9f5a2bd0 9f0c9cd0 9f383d50 9f383d50 9ec8dddc 9ec8dda0 800d4d40 7f0007c4
>> dda0: 00000000 00000000 800cf478 9f3fa208 9f05b9c0 9efcb8f8 80891d40 9e3aab00
>> ddc0: 9efcb600 9ec8c000 9ec8c000 9d450254 9ec8ddec 9ec8dde0 800d4f18 800d4cb4
>> dde0: 9ec8de0c 9ec8ddf0 8003d7cc 800d4f14 9efcb908 00000001 9efcb600 9d450200
>> de00: 9ec8de44 9ec8de10 80025464 8003d72c 9ec8de3c 00000001 80078370 8007814c
>> de20: 9e0b2604 9ec8decc 9ec8c000 9e0b2604 9ec8de4c 9e0b2604 9ec8de64 9ec8de48
>> de40: 80025b9c 800251cc 9ec8c000 9ec8decc 9ec8c000 9e0b2604 9ec8deb4 9ec8de68
>> de60: 80031490 80025b64 80073404 80073330 9ec8de94 9ec8de80 805faf2c 9ec8deb8
>> de80: 418004fc 9d46e680 9ec8debc 9ec8dfb0 9ec8c000 fffffdfc 9ec8deb8 76d533d0
>> dea0: 9ec8c000 76d533d4 9ec8df7c 9ec8deb8 8001153c 80031278 80078370 8007814c
>> dec0: 40000013 9f072728 8083ffc0 00000009 00000000 00000000 00000000 00000000
>> dee0: 9ec8df04 9ec8def0 805faf1c 80078448 9f072400 81215fc0 9ec8df2c 9ec8df08
>> df00: 8004c6dc 805faef0 00000000 9f072400 00000000 6deff4c8 000003e2 00000000
>> df20: 9ec8df8c 9ec8c000 80011ab8 9efcb600 00000001 9ec8dfb0 9ec8c000 00000000
>> df40: 9ec8df6c 9ec8df50 80078370 8007814c 9ec8c000 9ec8c000 9ec8c000 9ec8c000
>> df60: 8000ea64 9ec8dfb0 9ec8c000 00000000 9ec8dfac 9ec8df80 80011ad4 800112fc
>> df80: 00000000 00000001 8000e8ac 00000000 0341cc18 00000001 000000a8 8000ea64
>> dfa0: 00000000 9ec8dfb0 8000e8f8 80011a58 74ffeaa0 00000002 ffffffff 012f7314
>> dfc0: 00000000 0341cc18 00000001 000000a8 74ffeb98 0341cd80 0341bcf0 74ffece4
>> dfe0: 00000000 74ffea70 76d533c4 76d533d0 80000010 74ffeaa0 00000000 00000000
>> Backtrace:
>> [<802c401c>] (dw_msi_teardown_irq+0x0/0x118) from [<802c1844>]
>> (arch_teardown_msi_irq+0x3c/0x40)
>>  r7:9fa58c74 r6:00000004 r5:9ecdb580 r4:00000188
>> [<802c1808>] (arch_teardown_msi_irq+0x0/0x40) from [<802c1a08>]
>> (default_teardown_msi_irqs+0x68/0x84)
>>  r4:00000002 r3:00000000
>> [<802c19a0>] (default_teardown_msi_irqs+0x0/0x84) from [<802c1a34>]
>> (arch_teardown_msi_irqs+0x10/0x14)
>>  r8:00000001 r7:9fa58c74 r6:00000004 r5:9fa58c60 r4:00000004
>> r3:00000000
>> [<802c1a24>] (arch_teardown_msi_irqs+0x0/0x14) from [<802c1ad0>]
>> (free_msi_irqs+0x98/0x144)
>> [<802c1a38>] (free_msi_irqs+0x0/0x144) from [<802c2570>]
>> (pci_disable_msi+0x48/0x60)
>> [<802c2528>] (pci_disable_msi+0x0/0x60) from [<7f0057d4>]
>> (sxdma_irq_free+0x44/0x48 [sxdma])
>>  r4:9d580000 r3:00000000
>> [<7f005790>] (sxdma_irq_free+0x0/0x48 [sxdma]) from [<7f000a38>]
>> (sxdma_dev_release+0x280/0x2b0 [sxdma])
>>  r4:9d580000 r3:00000000
>> [<7f0007b8>] (sxdma_dev_release+0x0/0x2b0 [sxdma]) from [<800d4d40>]
>> (__fput+0x98/0x214)
>>  r8:9f383d50 r7:9f383d50 r6:9f0c9cd0 r5:9f5a2bd0 r4:9f3fa200
>> r3:7f0007b8
>> [<800d4ca8>] (__fput+0x0/0x214) from [<800d4f18>] (____fput+0x10/0x14)
>> [<800d4f08>] (____fput+0x0/0x14) from [<8003d7cc>] (task_work_run+0xac/0xe0)
>> [<8003d720>] (task_work_run+0x0/0xe0) from [<80025464>] (do_exit+0x2a4/0x8c8)
>>  r7:9d450200 r6:9efcb600 r5:00000001 r4:9efcb908
>> [<800251c0>] (do_exit+0x0/0x8c8) from [<80025b9c>] (do_group_exit+0x44/0xb8)
>>  r7:9e0b2604
>> [<80025b58>] (do_group_exit+0x0/0xb8) from [<80031490>]
>> (get_signal_to_deliver+0x224/0x5b4)
>>  r7:9e0b2604 r6:9ec8c000 r5:9ec8decc r4:9ec8c000
>> [<8003126c>] (get_signal_to_deliver+0x0/0x5b4) from [<8001153c>]
>> (do_signal+0x24c/0x41c)
>> [<800112f0>] (do_signal+0x0/0x41c) from [<80011ad4>] (do_work_pending+0x88/0xc8)
>> [<80011a4c>] (do_work_pending+0x0/0xc8) from [<8000e8f8>]
>> (work_pending+0xc/0x20)
>>  r8:8000ea64 r7:000000a8 r6:00000001 r5:0341cc18 r4:00000000
>> Code: 0a000032 e5944024 e3540000 0a00002c (e5943020)
>> ---[ end trace bb8d943ca1a10994 ]---
>> Fixing recursive fault but reboot is needed!


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Kernel oops from pci_disable_msi
  2013-11-18 23:11     ` Jingoo Han
@ 2013-11-19 11:24       ` Marek Vasut
  0 siblings, 0 replies; 28+ messages in thread
From: Marek Vasut @ 2013-11-19 11:24 UTC (permalink / raw)
  To: Jingoo Han
  Cc: 'Bjorn Helgaas',  'Bjørn Erik Nilsen',
	linux-pci, 'Pratyush Anand',
	'Kishon Vijay Abraham I', 'Mohit KUMAR DCG',
	'Ajay KHANDELWAL', 'Tim Harvey'

Dear Jingoo Han,

> On Tuesday, November 19, 2013 6:03 AM, Bjorn Helgaas wrote:
> > On Mon, Nov 18, 2013 at 2:01 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >> On Mon, Nov 18, 2013 at 6:53 AM, Bjørn Erik Nilsen <ben@datarespons.no> 
wrote:
> >>> I just hit an kernel oops related to PCI (in
> >>> dw_msi_teardown_irq()/clear_irq() (pcie-designware))
> >>> 
> >>> Linux version 3.12.0-next-20131105 (bnilsen@bnilsen) (gcc version 4.7.2
> >>> (GCC) )
> >>> 
> >>> Problem seem to be dereferencing a null pointer returned from
> >>> irq_desc_get_msi_desc(desc) (see attached backtrace).
> >> 
> >> Included oops inline for ease of viewing/searching.  Jingooo, I assume
> >> you'll investigate this.  Let me know if otherwise.
> 
> (+cc Marek Vasut, Pratyush Anand, Kishon Vijay Abraham I,
>        Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey)
> 
> Sorry, I will not investigate this.
> 
> Bjørn Erik Nilsen,
> 
> Would you let us know the ARM platform and LAN card?
> If you let us know them, one of these pcie-designware related
> people would reproduce and look at the issue.
> 
> Best regards,
> Jingoo Han
> 
> >> Unable to handle kernel NULL pointer dereference at virtual address
> >> 00000020 pgd = 80004000
> >> [00000020] *pgd=00000000
> >> Internal error: Oops: 17 [#1] SMP ARM
> >> Modules linked in: sxdma(O)
> >> CPU: 1 PID: 569 Comm: i2cipc.B3 Tainted: G           O
> >> 3.12.0-next-20131105 #8 task: 9efcb600 ti: 9ec8c000 task.ti: 9ec8c000
> >> PC is at dw_msi_teardown_irq+0x40/0x118

see drivers/pci/host/pcie-designware.c :

336 static void dw_msi_teardown_irq(struct msi_chip *chip, unsigned int irq)
337 {
338         clear_irq(irq);
339 }

So, add such a print before the clear_irq() call:

pr_err("%i %i\n", chip != NULL, irq);

And let us know the result please.

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Kernel oops from pci_disable_msi
       [not found]   ` <cmu-lmtpd-19155-1384861370-13@frontend1.mail.m-online.net>
@ 2013-11-19 22:01     ` Marek Vasut
       [not found]       ` <cmu-lmtpd-1612-1384936883-21@frontend1.mail.m-online.net>
  0 siblings, 1 reply; 28+ messages in thread
From: Marek Vasut @ 2013-11-19 22:01 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: Jingoo Han, 'Bjorn Helgaas', linux-pci@vger.kernel.org,
	'Pratyush Anand', 'Kishon Vijay Abraham I',
	'Mohit KUMAR DCG', 'Ajay KHANDELWAL',
	'Tim Harvey', Eric Nelson, Troy Kisky

Dear Bjørn Erik Nilsen,

> On Tue, 2013-11-19 at 12:39 +0100, Bjørn Erik Nilsen wrote:
> > On Tue, 2013-11-19 at 12:24 +0100, Marek Vasut wrote:
> > > Dear Jingoo Han,
> > > 
> > > > On Tuesday, November 19, 2013 6:03 AM, Bjorn Helgaas wrote:
> > > > > On Mon, Nov 18, 2013 at 2:01 PM, Bjorn Helgaas <bhelgaas@google.com> 
wrote:
> > > > >> On Mon, Nov 18, 2013 at 6:53 AM, Bjørn Erik Nilsen
> > > > >> <ben@datarespons.no>
> > > 
> > > wrote:
> > > > >>> I just hit an kernel oops related to PCI (in
> > > > >>> dw_msi_teardown_irq()/clear_irq() (pcie-designware))
> > > > >>> 
> > > > >>> Linux version 3.12.0-next-20131105 (bnilsen@bnilsen) (gcc version
> > > > >>> 4.7.2 (GCC) )
> > > > >>> 
> > > > >>> Problem seem to be dereferencing a null pointer returned from
> > > > >>> irq_desc_get_msi_desc(desc) (see attached backtrace).
> > > > >> 
> > > > >> Included oops inline for ease of viewing/searching.  Jingooo, I
> > > > >> assume you'll investigate this.  Let me know if otherwise.
> > > > 
> > > > (+cc Marek Vasut, Pratyush Anand, Kishon Vijay Abraham I,
> > > > 
> > > >        Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey)
> > > > 
> > > > Sorry, I will not investigate this.
> > > > 
> > > > Bjørn Erik Nilsen,
> > > > 
> > > > Would you let us know the ARM platform and LAN card?
> > > > If you let us know them, one of these pcie-designware related
> > > > people would reproduce and look at the issue.
> > > > 
> > > > Best regards,
> > > > Jingoo Han
> > > > 
> > > > >> Unable to handle kernel NULL pointer dereference at virtual
> > > > >> address 00000020 pgd = 80004000
> > > > >> [00000020] *pgd=00000000
> > > > >> Internal error: Oops: 17 [#1] SMP ARM
> > > > >> Modules linked in: sxdma(O)
> > > > >> CPU: 1 PID: 569 Comm: i2cipc.B3 Tainted: G           O
> > > > >> 3.12.0-next-20131105 #8 task: 9efcb600 ti: 9ec8c000 task.ti:
> > > > >> 9ec8c000 PC is at dw_msi_teardown_irq+0x40/0x118
> > > 
> > > see drivers/pci/host/pcie-designware.c :
> > > 
> > > 336 static void dw_msi_teardown_irq(struct msi_chip *chip, unsigned int
> > > irq) 337 {
> > > 338         clear_irq(irq);
> > > 339 }
> > > 
> > > So, add such a print before the clear_irq() call:
> > > 
> > > pr_err("%i %i\n", chip != NULL, irq);
> > > 
> > > And let us know the result please.
> > 
> > Here's what I get:
> > 
> > 1 391
> > 1 392
> 
> Also worth to mention is that I trigger this behavior by removing the
> device:
> 
> echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove

Just for completeness, is this pure next or something else, like the 
boundarydevices's kernel ?

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Kernel oops from pci_disable_msi
       [not found]       ` <cmu-lmtpd-1612-1384936883-21@frontend1.mail.m-online.net>
@ 2013-11-20 10:30         ` Marek Vasut
       [not found]           ` <cmu-lmtpd-21237-1384948548-21@frontend1.mail.m-online.net>
  0 siblings, 1 reply; 28+ messages in thread
From: Marek Vasut @ 2013-11-20 10:30 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: Jingoo Han, 'Bjorn Helgaas', linux-pci@vger.kernel.org,
	'Pratyush Anand', 'Kishon Vijay Abraham I',
	'Mohit KUMAR DCG', 'Ajay KHANDELWAL',
	'Tim Harvey', Eric Nelson, Troy Kisky

Hi Bjørn Erik Nilsen,

> On Tue, 2013-11-19 at 23:01 +0100, Marek Vasut wrote:
> > Dear Bjørn Erik Nilsen,
> > 
> > > On Tue, 2013-11-19 at 12:39 +0100, Bjørn Erik Nilsen wrote:
> > > > On Tue, 2013-11-19 at 12:24 +0100, Marek Vasut wrote:
> > > > > Dear Jingoo Han,
> > > > > 
> > > > > > On Tuesday, November 19, 2013 6:03 AM, Bjorn Helgaas wrote:
> > > > > > > On Mon, Nov 18, 2013 at 2:01 PM, Bjorn Helgaas
> > > > > > > <bhelgaas@google.com>
> > 
> > wrote:
> > > > > > >> On Mon, Nov 18, 2013 at 6:53 AM, Bjørn Erik Nilsen
> > > > > > >> <ben@datarespons.no>
> > > > > 
> > > > > wrote:
> > > > > > >>> I just hit an kernel oops related to PCI (in
> > > > > > >>> dw_msi_teardown_irq()/clear_irq() (pcie-designware))
> > > > > > >>> 
> > > > > > >>> Linux version 3.12.0-next-20131105 (bnilsen@bnilsen) (gcc
> > > > > > >>> version 4.7.2 (GCC) )
> > > > > > >>> 
> > > > > > >>> Problem seem to be dereferencing a null pointer returned from
> > > > > > >>> irq_desc_get_msi_desc(desc) (see attached backtrace).
> > > > > > >> 
> > > > > > >> Included oops inline for ease of viewing/searching.  Jingooo,
> > > > > > >> I assume you'll investigate this.  Let me know if otherwise.
> > > > > > 
> > > > > > (+cc Marek Vasut, Pratyush Anand, Kishon Vijay Abraham I,
> > > > > > 
> > > > > >        Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey)
> > > > > > 
> > > > > > Sorry, I will not investigate this.
> > > > > > 
> > > > > > Bjørn Erik Nilsen,
> > > > > > 
> > > > > > Would you let us know the ARM platform and LAN card?
> > > > > > If you let us know them, one of these pcie-designware related
> > > > > > people would reproduce and look at the issue.
> > > > > > 
> > > > > > Best regards,
> > > > > > Jingoo Han
> > > > > > 
> > > > > > >> Unable to handle kernel NULL pointer dereference at virtual
> > > > > > >> address 00000020 pgd = 80004000
> > > > > > >> [00000020] *pgd=00000000
> > > > > > >> Internal error: Oops: 17 [#1] SMP ARM
> > > > > > >> Modules linked in: sxdma(O)
> > > > > > >> CPU: 1 PID: 569 Comm: i2cipc.B3 Tainted: G           O
> > > > > > >> 3.12.0-next-20131105 #8 task: 9efcb600 ti: 9ec8c000 task.ti:
> > > > > > >> 9ec8c000 PC is at dw_msi_teardown_irq+0x40/0x118
> > > > > 
> > > > > see drivers/pci/host/pcie-designware.c :
> > > > > 
> > > > > 336 static void dw_msi_teardown_irq(struct msi_chip *chip, unsigned
> > > > > int irq) 337 {
> > > > > 338         clear_irq(irq);
> > > > > 339 }
> > > > > 
> > > > > So, add such a print before the clear_irq() call:
> > > > > 
> > > > > pr_err("%i %i\n", chip != NULL, irq);
> > > > > 
> > > > > And let us know the result please.
> > > > 
> > > > Here's what I get:
> > > > 
> > > > 1 391
> > > > 1 392
> > > 
> > > Also worth to mention is that I trigger this behavior by removing the
> > > device:
> > > 
> > > echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
> > 
> > Just for completeness, is this pure next or something else, like the
> > boundarydevices's kernel ?
> 
> It's a boundary device kernel (boundary-imx_3.12.0):
> 
> https://github.com/boundarydevices/linux-imx6/tree/boundary-imx_3.12.0

OK, thanks. Jingoo, can you please try if this also happens on Exynos?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Kernel oops from pci_disable_msi
       [not found]           ` <cmu-lmtpd-21237-1384948548-21@frontend1.mail.m-online.net>
@ 2013-11-20 12:02             ` Marek Vasut
       [not found]               ` <cmu-lmtpd-23243-1384949805-3@frontend1.mail.m-online.net>
  0 siblings, 1 reply; 28+ messages in thread
From: Marek Vasut @ 2013-11-20 12:02 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: Jingoo Han, 'Bjorn Helgaas', linux-pci@vger.kernel.org,
	'Pratyush Anand', 'Kishon Vijay Abraham I',
	'Mohit KUMAR DCG', 'Ajay KHANDELWAL',
	'Tim Harvey', Eric Nelson, Troy Kisky

Dear Bjørn Erik Nilsen,

> On Wed, 2013-11-20 at 11:30 +0100, Marek Vasut wrote:
> > Hi Bjørn Erik Nilsen,
> > 
> > > On Tue, 2013-11-19 at 23:01 +0100, Marek Vasut wrote:
> > > > Dear Bjørn Erik Nilsen,
> > > > 
> > > > > On Tue, 2013-11-19 at 12:39 +0100, Bjørn Erik Nilsen wrote:
> > > > > > On Tue, 2013-11-19 at 12:24 +0100, Marek Vasut wrote:
> > > > > > > Dear Jingoo Han,
> > > > > > > 
> > > > > > > > On Tuesday, November 19, 2013 6:03 AM, Bjorn Helgaas wrote:
> > > > > > > > > On Mon, Nov 18, 2013 at 2:01 PM, Bjorn Helgaas
> > > > > > > > > <bhelgaas@google.com>
> > > > 
> > > > wrote:
> > > > > > > > >> On Mon, Nov 18, 2013 at 6:53 AM, Bjørn Erik Nilsen
> > > > > > > > >> <ben@datarespons.no>
> > > > > > > 
> > > > > > > wrote:
> > > > > > > > >>> I just hit an kernel oops related to PCI (in
> > > > > > > > >>> dw_msi_teardown_irq()/clear_irq() (pcie-designware))
> > > > > > > > >>> 
> > > > > > > > >>> Linux version 3.12.0-next-20131105 (bnilsen@bnilsen) (gcc
> > > > > > > > >>> version 4.7.2 (GCC) )
> > > > > > > > >>> 
> > > > > > > > >>> Problem seem to be dereferencing a null pointer returned
> > > > > > > > >>> from irq_desc_get_msi_desc(desc) (see attached
> > > > > > > > >>> backtrace).
> > > > > > > > >> 
> > > > > > > > >> Included oops inline for ease of viewing/searching. 
> > > > > > > > >> Jingooo, I assume you'll investigate this.  Let me know
> > > > > > > > >> if otherwise.
> > > > > > > > 
> > > > > > > > (+cc Marek Vasut, Pratyush Anand, Kishon Vijay Abraham I,
> > > > > > > > 
> > > > > > > >        Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey)
> > > > > > > > 
> > > > > > > > Sorry, I will not investigate this.
> > > > > > > > 
> > > > > > > > Bjørn Erik Nilsen,
> > > > > > > > 
> > > > > > > > Would you let us know the ARM platform and LAN card?
> > > > > > > > If you let us know them, one of these pcie-designware related
> > > > > > > > people would reproduce and look at the issue.
> > > > > > > > 
> > > > > > > > Best regards,
> > > > > > > > Jingoo Han
> > > > > > > > 
> > > > > > > > >> Unable to handle kernel NULL pointer dereference at
> > > > > > > > >> virtual address 00000020 pgd = 80004000
> > > > > > > > >> [00000020] *pgd=00000000
> > > > > > > > >> Internal error: Oops: 17 [#1] SMP ARM
> > > > > > > > >> Modules linked in: sxdma(O)
> > > > > > > > >> CPU: 1 PID: 569 Comm: i2cipc.B3 Tainted: G           O
> > > > > > > > >> 3.12.0-next-20131105 #8 task: 9efcb600 ti: 9ec8c000
> > > > > > > > >> task.ti: 9ec8c000 PC is at dw_msi_teardown_irq+0x40/0x118
> > > > > > > 
> > > > > > > see drivers/pci/host/pcie-designware.c :
> > > > > > > 
> > > > > > > 336 static void dw_msi_teardown_irq(struct msi_chip *chip,
> > > > > > > unsigned int irq) 337 {
> > > > > > > 338         clear_irq(irq);
> > > > > > > 339 }
> > > > > > > 
> > > > > > > So, add such a print before the clear_irq() call:
> > > > > > > 
> > > > > > > pr_err("%i %i\n", chip != NULL, irq);
> > > > > > > 
> > > > > > > And let us know the result please.
> > > > > > 
> > > > > > Here's what I get:
> > > > > > 
> > > > > > 1 391
> > > > > > 1 392
> > > > > 
> > > > > Also worth to mention is that I trigger this behavior by removing
> > > > > the device:
> > > > > 
> > > > > echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
> > > > 
> > > > Just for completeness, is this pure next or something else, like the
> > > > boundarydevices's kernel ?
> > > 
> > > It's a boundary device kernel (boundary-imx_3.12.0):
> > > 
> > > https://github.com/boundarydevices/linux-imx6/tree/boundary-imx_3.12.0
> > 
> > OK, thanks. Jingoo, can you please try if this also happens on Exynos?
> 
> Sorry, I need to clarify the steps leading to the oops. It's actually
> not removing the device itself which is the trigger point. The kernel
> module (sxdma) registers a PCI driver and creates a char device
> (/dev/sxdma). When this device is opened pci_enable_msi is called, and
> when it is closed sxdma_dev_release is called (which in turn calls
> pci_disable_msi as shown in the bt).

Uh, what's this 'sxdma' ?

> I can therefore reproduce the issue by simply opening and
> closing /dev/sxdma.
> 
> Hope this clarifies and did not cause too much confusion.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Kernel oops from pci_disable_msi
       [not found]               ` <cmu-lmtpd-23243-1384949805-3@frontend1.mail.m-online.net>
@ 2013-11-20 13:57                 ` Marek Vasut
       [not found]                   ` <1385036087.3945.28.camel@bnilsen-HP>
  0 siblings, 1 reply; 28+ messages in thread
From: Marek Vasut @ 2013-11-20 13:57 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: Jingoo Han, 'Bjorn Helgaas', linux-pci@vger.kernel.org,
	'Pratyush Anand', 'Kishon Vijay Abraham I',
	'Mohit KUMAR DCG', 'Ajay KHANDELWAL',
	'Tim Harvey', Eric Nelson, Troy Kisky

Dear Bjørn Erik Nilsen,

> On Wed, 2013-11-20 at 13:02 +0100, Marek Vasut wrote:
> > Dear Bjørn Erik Nilsen,
> > 
> > > On Wed, 2013-11-20 at 11:30 +0100, Marek Vasut wrote:
> > > > Hi Bjørn Erik Nilsen,
> > > > 
> > > > > On Tue, 2013-11-19 at 23:01 +0100, Marek Vasut wrote:
> > > > > > Dear Bjørn Erik Nilsen,
> > > > > > 
> > > > > > > On Tue, 2013-11-19 at 12:39 +0100, Bjørn Erik Nilsen wrote:
> > > > > > > > On Tue, 2013-11-19 at 12:24 +0100, Marek Vasut wrote:
> > > > > > > > > Dear Jingoo Han,
> > > > > > > > > 
> > > > > > > > > > On Tuesday, November 19, 2013 6:03 AM, Bjorn Helgaas wrote:
> > > > > > > > > > > On Mon, Nov 18, 2013 at 2:01 PM, Bjorn Helgaas
> > > > > > > > > > > <bhelgaas@google.com>
> > > > > > 
> > > > > > wrote:
> > > > > > > > > > >> On Mon, Nov 18, 2013 at 6:53 AM, Bjørn Erik Nilsen
> > > > > > > > > > >> <ben@datarespons.no>
> > > > > > > > > 
> > > > > > > > > wrote:
> > > > > > > > > > >>> I just hit an kernel oops related to PCI (in
> > > > > > > > > > >>> dw_msi_teardown_irq()/clear_irq() (pcie-designware))
> > > > > > > > > > >>> 
> > > > > > > > > > >>> Linux version 3.12.0-next-20131105 (bnilsen@bnilsen)
> > > > > > > > > > >>> (gcc version 4.7.2 (GCC) )
> > > > > > > > > > >>> 
> > > > > > > > > > >>> Problem seem to be dereferencing a null pointer
> > > > > > > > > > >>> returned from irq_desc_get_msi_desc(desc) (see
> > > > > > > > > > >>> attached backtrace).
> > > > > > > > > > >> 
> > > > > > > > > > >> Included oops inline for ease of viewing/searching.
> > > > > > > > > > >> Jingooo, I assume you'll investigate this.  Let me
> > > > > > > > > > >> know if otherwise.
> > > > > > > > > > 
> > > > > > > > > > (+cc Marek Vasut, Pratyush Anand, Kishon Vijay Abraham I,
> > > > > > > > > > 
> > > > > > > > > >        Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey)
> > > > > > > > > > 
> > > > > > > > > > Sorry, I will not investigate this.
> > > > > > > > > > 
> > > > > > > > > > Bjørn Erik Nilsen,
> > > > > > > > > > 
> > > > > > > > > > Would you let us know the ARM platform and LAN card?
> > > > > > > > > > If you let us know them, one of these pcie-designware
> > > > > > > > > > related people would reproduce and look at the issue.
> > > > > > > > > > 
> > > > > > > > > > Best regards,
> > > > > > > > > > Jingoo Han
> > > > > > > > > > 
> > > > > > > > > > >> Unable to handle kernel NULL pointer dereference at
> > > > > > > > > > >> virtual address 00000020 pgd = 80004000
> > > > > > > > > > >> [00000020] *pgd=00000000
> > > > > > > > > > >> Internal error: Oops: 17 [#1] SMP ARM
> > > > > > > > > > >> Modules linked in: sxdma(O)
> > > > > > > > > > >> CPU: 1 PID: 569 Comm: i2cipc.B3 Tainted: G           O
> > > > > > > > > > >> 3.12.0-next-20131105 #8 task: 9efcb600 ti: 9ec8c000
> > > > > > > > > > >> task.ti: 9ec8c000 PC is at
> > > > > > > > > > >> dw_msi_teardown_irq+0x40/0x118
> > > > > > > > > 
> > > > > > > > > see drivers/pci/host/pcie-designware.c :
> > > > > > > > > 
> > > > > > > > > 336 static void dw_msi_teardown_irq(struct msi_chip *chip,
> > > > > > > > > unsigned int irq) 337 {
> > > > > > > > > 338         clear_irq(irq);
> > > > > > > > > 339 }
> > > > > > > > > 
> > > > > > > > > So, add such a print before the clear_irq() call:
> > > > > > > > > 
> > > > > > > > > pr_err("%i %i\n", chip != NULL, irq);
> > > > > > > > > 
> > > > > > > > > And let us know the result please.
> > > > > > > > 
> > > > > > > > Here's what I get:
> > > > > > > > 
> > > > > > > > 1 391
> > > > > > > > 1 392
> > > > > > > 
> > > > > > > Also worth to mention is that I trigger this behavior by
> > > > > > > removing the device:
> > > > > > > 
> > > > > > > echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
> > > > > > 
> > > > > > Just for completeness, is this pure next or something else, like
> > > > > > the boundarydevices's kernel ?
> > > > > 
> > > > > It's a boundary device kernel (boundary-imx_3.12.0):
> > > > > 
> > > > > https://github.com/boundarydevices/linux-imx6/tree/boundary-imx_3.1
> > > > > 2.0
> > > > 
> > > > OK, thanks. Jingoo, can you please try if this also happens on
> > > > Exynos?
> > > 
> > > Sorry, I need to clarify the steps leading to the oops. It's actually
> > > not removing the device itself which is the trigger point. The kernel
> > > module (sxdma) registers a PCI driver and creates a char device
> > > (/dev/sxdma). When this device is opened pci_enable_msi is called, and
> > > when it is closed sxdma_dev_release is called (which in turn calls
> > > pci_disable_msi as shown in the bt).
> > 
> > Uh, what's this 'sxdma' ?
> 
> It's the driver for the PCI device:
> 
> 01:00.0 Memory controller: Barco Graphics NV Device ba01 (rev 01)
> 	Flags: bus master, fast devsel, latency 0, IRQ 155
> 	Memory at 01000000 (32-bit, non-prefetchable) [size=2M]
> 	Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
> 	Capabilities: [78] Power Management version 3
> 	Capabilities: [80] Express Endpoint, MSI 00
> 	Capabilities: [100] Virtual Channel
> 	Capabilities: [200] Vendor Specific Information: ID=1172 Rev=0 Len=044
> <?>
> 	Kernel driver in use: sxdma

Sure, I just cannot find such driver anywhere in the kernel tree for some 
reason.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Kernel oops from pci_disable_msi
       [not found]                       ` <16.79.22145.6305E825@epmailin9.samsung.com>
@ 2013-11-22  8:48                         ` Jingoo Han
       [not found]                           ` <1385118399.3944.32.camel@bnilsen-HP>
  0 siblings, 1 reply; 28+ messages in thread
From: Jingoo Han @ 2013-11-22  8:48 UTC (permalink / raw)
  To: 'Bjørn Erik Nilsen', 'Marek Vasut',
	'Pratyush Anand'
  Cc: 'Bjorn Helgaas', linux-pci,
	'Kishon Vijay Abraham I', 'Mohit KUMAR DCG',
	'Ajay KHANDELWAL', 'Tim Harvey',
	'Eric Nelson', 'Troy Kisky', 'Jingoo Han'

On Friday, November 22, 2013 3:25 AM, Bjørn Erik Nilsen wrote:
> On Thu, 2013-11-21 at 14:19 +0100, Bjørn Erik Nilsen wrote:
> > On Thu, 2013-11-21 at 13:14 +0100, Bjørn Erik Nilsen wrote:
> > > On Wed, 2013-11-20 at 14:57 +0100, Marek Vasut wrote:
> > > > > On Wed, 2013-11-20 at 13:02 +0100, Marek Vasut wrote:
> > > > > > > On Wed, 2013-11-20 at 11:30 +0100, Marek Vasut wrote:
> > > > > > > > > On Tue, 2013-11-19 at 23:01 +0100, Marek Vasut wrote:
> > > > > > > > > > > On Tue, 2013-11-19 at 12:39 +0100, Bjørn Erik Nilsen wrote:
> > > > > > > > > > > > On Tue, 2013-11-19 at 12:24 +0100, Marek Vasut wrote:
> > > > > > > > > > > > > > On Tuesday, November 19, 2013 6:03 AM, Bjorn Helgaas wrote:
> > > > > > > > > > > > > > > On Mon, Nov 18, 2013 at 2:01 PM, Bjorn Helgaas
> > > > > > > > > > > > > > > <bhelgaas@google.com>
> > > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > > > > > >> On Mon, Nov 18, 2013 at 6:53 AM, Bjørn Erik Nilsen
> > > > > > > > > > > > > > >> <ben@datarespons.no>
> > > > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >>> I just hit an kernel oops related to PCI (in
> > > > > > > > > > > > > > >>> dw_msi_teardown_irq()/clear_irq() (pcie-designware))
> > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > >>> Linux version 3.12.0-next-20131105 (bnilsen@bnilsen)
> > > > > > > > > > > > > > >>> (gcc version 4.7.2 (GCC) )
> > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > >>> Problem seem to be dereferencing a null pointer
> > > > > > > > > > > > > > >>> returned from irq_desc_get_msi_desc(desc) (see
> > > > > > > > > > > > > > >>> attached backtrace).
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> Included oops inline for ease of viewing/searching.
> > > > > > > > > > > > > > >> Jingooo, I assume you'll investigate this.  Let me
> > > > > > > > > > > > > > >> know if otherwise.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (+cc Marek Vasut, Pratyush Anand, Kishon Vijay Abraham I,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >        Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sorry, I will not investigate this.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Bjørn Erik Nilsen,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Would you let us know the ARM platform and LAN card?
> > > > > > > > > > > > > > If you let us know them, one of these pcie-designware
> > > > > > > > > > > > > > related people would reproduce and look at the issue.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > > Jingoo Han
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >> Unable to handle kernel NULL pointer dereference at
> > > > > > > > > > > > > > >> virtual address 00000020 pgd = 80004000
> > > > > > > > > > > > > > >> [00000020] *pgd=00000000
> > > > > > > > > > > > > > >> Internal error: Oops: 17 [#1] SMP ARM
> > > > > > > > > > > > > > >> Modules linked in: sxdma(O)
> > > > > > > > > > > > > > >> CPU: 1 PID: 569 Comm: i2cipc.B3 Tainted: G           O
> > > > > > > > > > > > > > >> 3.12.0-next-20131105 #8 task: 9efcb600 ti: 9ec8c000
> > > > > > > > > > > > > > >> task.ti: 9ec8c000 PC is at
> > > > > > > > > > > > > > >> dw_msi_teardown_irq+0x40/0x118
> > > > > > > > > > > > >
> > > > > > > > > > > > > see drivers/pci/host/pcie-designware.c :
> > > > > > > > > > > > >
> > > > > > > > > > > > > 336 static void dw_msi_teardown_irq(struct msi_chip *chip,
> > > > > > > > > > > > > unsigned int irq) 337 {
> > > > > > > > > > > > > 338         clear_irq(irq);
> > > > > > > > > > > > > 339 }
> > > > > > > > > > > > >
> > > > > > > > > > > > > So, add such a print before the clear_irq() call:
> > > > > > > > > > > > >
> > > > > > > > > > > > > pr_err("%i %i\n", chip != NULL, irq);
> > > > > > > > > > > > >
> > > > > > > > > > > > > And let us know the result please.
> > > > > > > > > > > >
> > > > > > > > > > > > Here's what I get:
> > > > > > > > > > > >
> > > > > > > > > > > > 1 391
> > > > > > > > > > > > 1 392
> > > > > > > > > > >
> > > > > > > > > > > Also worth to mention is that I trigger this behavior by
> > > > > > > > > > > removing the device:
> > > > > > > > > > >
> > > > > > > > > > > echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
> > > > > > > > > >
> > > > > > > > > > Just for completeness, is this pure next or something else, like
> > > > > > > > > > the boundarydevices's kernel ?
> > > > > > > > >
> > > > > > > > > It's a boundary device kernel (boundary-imx_3.12.0):
> > > > > > > > >
> > > > > > > > > https://github.com/boundarydevices/linux-imx6/tree/boundary-imx_3.1
> > > > > > > > > 2.0
> > > > > > > >
> > > > > > > > OK, thanks. Jingoo, can you please try if this also happens on
> > > > > > > > Exynos?
> > > > > > >
> > > > > > > Sorry, I need to clarify the steps leading to the oops. It's actually
> > > > > > > not removing the device itself which is the trigger point. The kernel
> > > > > > > module (sxdma) registers a PCI driver and creates a char device
> > > > > > > (/dev/sxdma). When this device is opened pci_enable_msi is called, and
> > > > > > > when it is closed sxdma_dev_release is called (which in turn calls
> > > > > > > pci_disable_msi as shown in the bt).
> > > > > >
> > > > > > Uh, what's this 'sxdma' ?
> > > > >
> > > > > It's the driver for the PCI device:
> > > > >
> > > > > 01:00.0 Memory controller: Barco Graphics NV Device ba01 (rev 01)
> > > > > 	Flags: bus master, fast devsel, latency 0, IRQ 155
> > > > > 	Memory at 01000000 (32-bit, non-prefetchable) [size=2M]
> > > > > 	Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
> > > > > 	Capabilities: [78] Power Management version 3
> > > > > 	Capabilities: [80] Express Endpoint, MSI 00
> > > > > 	Capabilities: [100] Virtual Channel
> > > > > 	Capabilities: [200] Vendor Specific Information: ID=1172 Rev=0 Len=044
> > > > > <?>
> > > > > 	Kernel driver in use: sxdma
> > > >
> > > > Sure, I just cannot find such driver anywhere in the kernel tree for some
> > > > reason.
> > >
> > > The sxdma kernel module is developed outside the kernel tree, however I
> > > can give you the source (GPL V2) if you are interested or think it is
> > > important for this issue. Please let me know.
> > >
> > > The background is that a successful call to pci_enable_msi followed by
> > > pci_disable_msi triggers the crash. It used to work just fine before the
> > > pci-imx6/pcie-designware code was upstreamed.
> > >
> > > I guess you already figured this out, but to clarify my findings so far:
> > >
> > > As mentioned earlier, the real hot spot is that irq_desc_get_msi_desc
> > > (called from clear_irq in pcie-designware.c) returns NULL and its return
> > > value is unconditionally dereferenced.
> > >
> > > Judging from the code it looks to me that it should never return NULL in
> > > that case (because there is no check against NULL before dereferencing
> > > it). So the nut to crack is to find out why it returns NULL in this
> > > particular case, and why it used to work before.
> >
> > As far as I can tell, irq_set_msi_desc(irq, desc) is never called which
> > is why irq_desc_get_msi_desc returns NULL. The following patch fixes the
> > problem (I also simplified the retrieval of the msi_desc):
> 
> I jumped on the conclusion a bit too fast with my previous patch as it
> only removes the artifact in one particular case rather than fixing the
> real problem.
> 
> I did more extensive testing and here's my new findings:
> 
> irq_set_msi_desc() is indeed called from dw_msi_setup_irq/assign_irq,
> however when I looked at the implementation of irq_set_msi_desc() I
> noticed how it actually updates the entry's irq. So after four
> consecutive calls to the function, the irq points to the last irq:
> 
> irq_set_msi_desc(388, desc);
>    -> desc->irq = 388
> ...
> irq_set_msi_desc(391, desc);
>    -> desc->irq = 391
> 
> This is an important detail, because from msi.c teardown_msi_irq() the
> desc->irq is used as the base address for the teardown. I.e.:
> 
> for (int i = 0; i < number of irqs; ++i)
>     teardown_msi_irq(desc->irq + i);
> 
> teardown(391)
> teardown(392)
> teardown(393)
> teardown(394)
> 
> Bummer.
> 
> And now you probably understand why my previous patch fixed an artifact,
> because what it really did was updating the desc->irq to point to 388
> (instead of 391):
> 
> irq_set_msi_desc(388, desc);
>     -> desc->irq = 388
> 
> We therefore got somewhat closer to a clean teardown. The next problem
> is that each allocated irq depend on each other as they all share the
> same msi_desc pointer, which means they cannot be freed independently
> (at least so I think). My idea was therefore to allocate all irqs at
> once from setup() and free them all at once from teardown(). My new
> patch makes my system stable again and fixes all the issues (not only
> one particular issue):

Hi Bjørn Erik Nilsen,

Thank you for your effort!
I reproduced this kernel panic on Exynos platform with LAN card.
And then, I tested your patch and checked this kernel panic is
resolved.

Marek Vasut,
Will you test Bjørn Erik Nilsen's patch with your i.MX platform?

Pratyush Anand,
Would you confirm Bjørn Erik Nilsen's patch?

I will do more extensive testing.
Thank you.

Best regards,
Jingoo Han

> 
> --- pcie-designware.c.orig	2013-11-21 14:02:03.656007695 +0100
> +++ pcie-designware.c	2013-11-21 18:24:09.436048776 +0100
> @@ -242,11 +242,12 @@ static int assign_irq(int no_irqs, struc
>  	if (!irq)
>  		goto no_valid_irq;
> 
> +	irq_alloc_descs(irq, irq, no_irqs, 0);
> +
>  	i = 0;
>  	while (i < no_irqs) {
>  		set_bit(pos0 + i, pp->msi_irq_in_use);
> -		irq_alloc_descs((irq + i), (irq + i), 1, 0);
> -		irq_set_msi_desc(irq + i, desc);
> +		irq_set_msi_desc_off(irq + i, i, desc);
>  		/*Enable corresponding interrupt in MSI interrupt controller */
>  		res = ((pos0 + i) / 32) * 12;
>  		bit = (pos0 + i) % 32;
> @@ -267,14 +268,12 @@ no_valid_irq:
>  static void clear_irq(unsigned int irq)
>  {
>  	int res, bit, val, pos;
> -	struct irq_desc *desc;
>  	struct msi_desc *msi;
>  	struct pcie_port *pp;
>  	struct irq_data *data = irq_get_irq_data(irq);
> 
>  	/* get the port structure */
> -	desc = irq_to_desc(irq);
> -	msi = irq_desc_get_msi_desc(desc);
> +	msi = data->msi_desc;
>  	pp = sys_to_pcie(msi->dev->bus->sysdata);
>  	if (!pp) {
>  		BUG();
> @@ -283,7 +282,15 @@ static void clear_irq(unsigned int irq)
> 
>  	pos = data->hwirq;
> 
> -	irq_free_desc(irq);
> +	if (irq == msi->irq) {
> +		/*
> +		 * This is the irq returned from dw_msi_setup_irq where
> +		 * 'nvec' number of desc where allocated. Now it is time
> +		 * to free them all.
> +		 */
> +		unsigned int nvec = 1 << msi->msi_attrib.multiple;
> +		irq_free_descs(irq, nvec);
> +	}
> 
>  	clear_bit(pos, pp->msi_irq_in_use);
> 
> 
> 
> Best regards,
> 
> Bjørn Erik Nilsen
> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] Kernel oops from pci_disable_msi
       [not found]                             ` <7D.78.31634.B838F825@epmailin2.samsung.com>
@ 2013-11-26 11:21                               ` Jingoo Han
  0 siblings, 0 replies; 28+ messages in thread
From: Jingoo Han @ 2013-11-26 11:21 UTC (permalink / raw)
  To: 'Bjørn Erik Nilsen', 'Marek Vasut',
	'Pratyush Anand', 'Kishon Vijay Abraham I'
  Cc: 'Bjorn Helgaas', linux-pci, 'Mohit KUMAR DCG',
	'Ajay KHANDELWAL', 'Tim Harvey',
	'Eric Nelson', 'Troy Kisky', 'Jingoo Han'

On Saturday, November 23, 2013 1:17 AM, Bjørn Erik Nilsen wrote:
> On Fri, 2013-11-22 at 12:06 +0100, Bjørn Erik Nilsen wrote:

[....]

> > The patch does indeed fix the crash, but there are more subtle issues
> > lurking around. I noticed how irq numbers were constantly increasing and
> > I found at least one stupid mistake that I made.
> >
> > > -             irq_set_msi_desc(irq + i, desc);
> > > +             irq_set_msi_desc_off(irq + i, i, desc);
> >
> > That should be 'irq_set_msi_desc_off(irq, i, desc)'
> >
> > (I'm really puzzled why this didn't cause other oops ...)
> >
> > I also realized that teardown() is only called for the irq returned from
> > setup(), which kind of makes sense since the others are destroyed() in
> > the first call to teardown. This means we need to iterate all irqs in
> > the same fashion as we allocate in setup/assign_irq. Basically unrolling
> > what was done there.
> >
> > I'm hacking on this solution now but it doesn't quite take me to where I
> > want at the moment, so it would be nice if someone with a better
> > understanding of the code could pitch in.
> 
> I gave it another shot and now it starts to look like something. At
> least I get consistent irq numbers and my system is very stable in
> general.
> 
> My new patch does exactly the opposite in teardown() of what is done in
> setup(), which in itself is a good sign.

Hi Bjørn Erik Nilsen,

I tested this patch with Exynos platform.
I checked that the kernel panic is resolved.

By the way, would you change your mail to text mode?
Your mails have not been submitted to PCI-mailing list.

Marek Vasut, Pratyush Anand, Kishon Vijay Abraham I
Would you review Bjørn Erik Nilsen's new patch?

> --- pcie-designware.c.orig      2013-11-21 14:02:03.656007695 +0100
> +++ pcie-designware.c   2013-11-22 16:32:30.360954591 +0100
> @@ -242,11 +242,15 @@ static int assign_irq(int no_irqs, struc
>         if (!irq)
>                 goto no_valid_irq;
> 
> +       /*
> +        * irq_domain_add_linear (called from dw_pcie_host_init) pre-allocates
> +        * descs so there is no need to call irq_alloc_descs here.
> +        */
> +
>         i = 0;
>         while (i < no_irqs) {
>                 set_bit(pos0 + i, pp->msi_irq_in_use);
> -               irq_alloc_descs((irq + i), (irq + i), 1, 0);
> -               irq_set_msi_desc(irq + i, desc);
> +               irq_set_msi_desc_off(irq, i, desc);
>                 /*Enable corresponding interrupt in MSI interrupt controller */
>                 res = ((pos0 + i) / 32) * 12;
>                 bit = (pos0 + i) % 32;
> @@ -266,7 +270,7 @@ no_valid_irq:
> 
>  static void clear_irq(unsigned int irq)
>  {
> -       int res, bit, val, pos;

The build errors happens.
Thus, the following lines are necessary

+       int res, bit, pos, nvec, i;
+       u32 val;

Best regards,
Jingoo Han

>         struct irq_desc *desc;
>         struct msi_desc *msi;
>         struct pcie_port *pp;
> @@ -281,18 +285,28 @@ static void clear_irq(unsigned int irq)
>                 return;
>         }
> 
> +       /* undo what was done in assign_irq */
>         pos = data->hwirq;
> +       nvec = 1 << msi->msi_attrib.multiple;
> 
> -       irq_free_desc(irq);
> -
> -       clear_bit(pos, pp->msi_irq_in_use);
> +       i = 0;
> +       while (i < nvec) {
> +               clear_bit(pos + i, pp->msi_irq_in_use);
> +               irq_set_msi_desc_off(irq, i, NULL);
> +               /* Disable corresponding interrupt on MSI interrupt controller */
> +               res = ((pos + i) / 32) * 12;
> +               bit = (pos + i) % 32;
> +               dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
> +               val &= ~(1 << bit);
> +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
> +               ++i;
> +       }
> 
> -       /* Disable corresponding interrupt on MSI interrupt controller */
> -       res = (pos / 32) * 12;
> -       bit = pos % 32;
> -       dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
> -       val &= ~(1 << bit);
> -       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
> +       msi->msg.address_lo = 0;
> +       msi->msg.address_hi = 0;
> +       msi->msg.data = 0;
> +       msi->irq = 0;
> +       msi->msi_attrib.multiple = 0;
>  }
> 
>  static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
> @@ -320,15 +334,14 @@ static int dw_msi_setup_irq(struct msi_c
>         if (irq < 0)
>                 return irq;
> 
> -       msg_ctr &= ~PCI_MSI_FLAGS_QSIZE;
> -       msg_ctr |= msgvec << 4;
> -       pci_write_config_word(pdev, desc->msi_attrib.pos + PCI_MSI_FLAGS,
> -                               msg_ctr);
>         desc->msi_attrib.multiple = msgvec;
> -
>         msg.address_lo = virt_to_phys((void *)pp->msi_data);
>         msg.address_hi = 0x0;
>         msg.data = pos;
> +       /*
> +        * write_msi_msg() will update PCI_MSI_FLAGS so there
> +        * is no need to explicitly call pci_write_config here
> +        */
>         write_msi_msg(irq, &msg);
> 
>         return 0;
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] Kernel oops from pci_disable_msi
       [not found]                             ` <cmu-lmtpd-32538-1385137032-0@frontend1.mail.m-online.net>
@ 2013-11-26 21:19                               ` Marek Vasut
       [not found]                                 ` <cmu-lmtpd-17794-1385506605-2@frontend1.mail.m-online.net>
  0 siblings, 1 reply; 28+ messages in thread
From: Marek Vasut @ 2013-11-26 21:19 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: Jingoo Han, 'Pratyush Anand', 'Bjorn Helgaas',
	linux-pci@vger.kernel.org, 'Kishon Vijay Abraham I',
	'Mohit KUMAR DCG', 'Ajay KHANDELWAL',
	'Tim Harvey', 'Eric Nelson', 'Troy Kisky'

Dear Bjørn Erik Nilsen,
[...]

> > > Thank you for your effort!
> > > I reproduced this kernel panic on Exynos platform with LAN card.
> > > And then, I tested your patch and checked this kernel panic is
> > > resolved.
> > > 
> > > Marek Vasut,
> > > Will you test Bjørn Erik Nilsen's patch with your i.MX platform?
> > > 
> > > Pratyush Anand,
> > > Would you confirm Bjørn Erik Nilsen's patch?
> > > 
> > > I will do more extensive testing.
> > > Thank you.
> > 
> > The patch does indeed fix the crash, but there are more subtle issues
> > lurking around. I noticed how irq numbers were constantly increasing and
> > I found at least one stupid mistake that I made.
> > 
> > > -             irq_set_msi_desc(irq + i, desc);
> > > +             irq_set_msi_desc_off(irq + i, i, desc);
> > 
> > That should be 'irq_set_msi_desc_off(irq, i, desc)'
> > 
> > (I'm really puzzled why this didn't cause other oops ...)
> > 
> > I also realized that teardown() is only called for the irq returned from
> > setup(), which kind of makes sense since the others are destroyed() in
> > the first call to teardown. This means we need to iterate all irqs in
> > the same fashion as we allocate in setup/assign_irq. Basically unrolling
> > what was done there.
> > 
> > I'm hacking on this solution now but it doesn't quite take me to where I
> > want at the moment, so it would be nice if someone with a better
> > understanding of the code could pitch in.
> 
> I gave it another shot and now it starts to look like something. At
> least I get consistent irq numbers and my system is very stable in
> general.
> 
> My new patch does exactly the opposite in teardown() of what is done in
> setup(), which in itself is a good sign.
> 
> --- pcie-designware.c.orig      2013-11-21 14:02:03.656007695 +0100
> +++ pcie-designware.c   2013-11-22 16:32:30.360954591 +0100
> @@ -242,11 +242,15 @@ static int assign_irq(int no_irqs, struc
>         if (!irq)
>                 goto no_valid_irq;
> 
> +       /*
> +        * irq_domain_add_linear (called from dw_pcie_host_init)
> pre-allocates +        * descs so there is no need to call irq_alloc_descs
> here. +        */
> +
>         i = 0;
>         while (i < no_irqs) {
>                 set_bit(pos0 + i, pp->msi_irq_in_use);
> -               irq_alloc_descs((irq + i), (irq + i), 1, 0);
> -               irq_set_msi_desc(irq + i, desc);
> +               irq_set_msi_desc_off(irq, i, desc);

Why do you not allocate the descs anymore ?

>                 /*Enable corresponding interrupt in MSI interrupt
> controller */ res = ((pos0 + i) / 32) * 12;
>                 bit = (pos0 + i) % 32;
> @@ -266,7 +270,7 @@ no_valid_irq:
> 
>  static void clear_irq(unsigned int irq)
>  {
> -       int res, bit, val, pos;
>         struct irq_desc *desc;
>         struct msi_desc *msi;
>         struct pcie_port *pp;
> @@ -281,18 +285,28 @@ static void clear_irq(unsigned int irq)
>                 return;
>         }
> 
> +       /* undo what was done in assign_irq */
>         pos = data->hwirq;
> +       nvec = 1 << msi->msi_attrib.multiple;
> 
> -       irq_free_desc(irq);
> -
> -       clear_bit(pos, pp->msi_irq_in_use);
> +       i = 0;
> +       while (i < nvec) {
> +               clear_bit(pos + i, pp->msi_irq_in_use);
> +               irq_set_msi_desc_off(irq, i, NULL);
> +               /* Disable corresponding interrupt on MSI interrupt
> controller */ +               res = ((pos + i) / 32) * 12;
> +               bit = (pos + i) % 32;
> +               dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4,
> &val); +               val &= ~(1 << bit);
> +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4,
> val); +               ++i;
> +       }
> 
> -       /* Disable corresponding interrupt on MSI interrupt controller */
> -       res = (pos / 32) * 12;
> -       bit = pos % 32;
> -       dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
> -       val &= ~(1 << bit);
> -       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
> +       msi->msg.address_lo = 0;
> +       msi->msg.address_hi = 0;
> +       msi->msg.data = 0;
> +       msi->irq = 0;
> +       msi->msi_attrib.multiple = 0;
>  }
> 
>  static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
> @@ -320,15 +334,14 @@ static int dw_msi_setup_irq(struct msi_c
>         if (irq < 0)
>                 return irq;
> 
> -       msg_ctr &= ~PCI_MSI_FLAGS_QSIZE;
> -       msg_ctr |= msgvec << 4;
> -       pci_write_config_word(pdev, desc->msi_attrib.pos + PCI_MSI_FLAGS,
> -                               msg_ctr);
>         desc->msi_attrib.multiple = msgvec;
> -
>         msg.address_lo = virt_to_phys((void *)pp->msi_data);
>         msg.address_hi = 0x0;
>         msg.data = pos;
> +       /*
> +        * write_msi_msg() will update PCI_MSI_FLAGS so there
> +        * is no need to explicitly call pci_write_config here
> +        */
>         write_msi_msg(irq, &msg);
> 
>         return 0;
> 
> 
> Best regards,
> 
> Bjørn Erik Nilsen

Can you try using 'git send-email' for submitting subsequent patches please?

The patch works back and forth, I can 'remove' the card with:

echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove

and then 'rescan' the card with:

echo 1 > /sys/bus/pci/devices/0000:00:00.0/rescan

Tested-by: Marek Vasut <marex@denx.de>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Kernel oops from pci_disable_msi
       [not found]                                 ` <cmu-lmtpd-17794-1385506605-2@frontend1.mail.m-online.net>
@ 2013-11-26 23:05                                   ` Marek Vasut
  2013-11-27  9:46                                   ` Marek Vasut
  1 sibling, 0 replies; 28+ messages in thread
From: Marek Vasut @ 2013-11-26 23:05 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: Jingoo Han, Pratyush Anand, Bjorn Helgaas,
	linux-pci@vger.kernel.org, Kishon Vijay Abraham I,
	Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey, Eric Nelson,
	Troy Kisky

Dear Bjørn Erik Nilsen,

> 26. nov. 2013 kl. 22:19 skrev Marek Vasut <marex@denx.de>:
> > Dear Bjørn Erik Nilsen,
> > [...]
> > 
> >>>> Thank you for your effort!
> >>>> I reproduced this kernel panic on Exynos platform with LAN card.
> >>>> And then, I tested your patch and checked this kernel panic is
> >>>> resolved.
> >>>> 
> >>>> Marek Vasut,
> >>>> Will you test Bjørn Erik Nilsen's patch with your i.MX platform?
> >>>> 
> >>>> Pratyush Anand,
> >>>> Would you confirm Bjørn Erik Nilsen's patch?
> >>>> 
> >>>> I will do more extensive testing.
> >>>> Thank you.
> >>> 
> >>> The patch does indeed fix the crash, but there are more subtle issues
> >>> lurking around. I noticed how irq numbers were constantly increasing
> >>> and I found at least one stupid mistake that I made.
> >>> 
> >>>> -             irq_set_msi_desc(irq + i, desc);
> >>>> +             irq_set_msi_desc_off(irq + i, i, desc);
> >>> 
> >>> That should be 'irq_set_msi_desc_off(irq, i, desc)'
> >>> 
> >>> (I'm really puzzled why this didn't cause other oops ...)
> >>> 
> >>> I also realized that teardown() is only called for the irq returned
> >>> from setup(), which kind of makes sense since the others are
> >>> destroyed() in the first call to teardown. This means we need to
> >>> iterate all irqs in the same fashion as we allocate in
> >>> setup/assign_irq. Basically unrolling what was done there.
> >>> 
> >>> I'm hacking on this solution now but it doesn't quite take me to where
> >>> I want at the moment, so it would be nice if someone with a better
> >>> understanding of the code could pitch in.
> >> 
> >> I gave it another shot and now it starts to look like something. At
> >> least I get consistent irq numbers and my system is very stable in
> >> general.
> >> 
> >> My new patch does exactly the opposite in teardown() of what is done in
> >> setup(), which in itself is a good sign.
> >> 
> >> --- pcie-designware.c.orig      2013-11-21 14:02:03.656007695 +0100
> >> +++ pcie-designware.c   2013-11-22 16:32:30.360954591 +0100
> >> @@ -242,11 +242,15 @@ static int assign_irq(int no_irqs, struc
> >> 
> >>        if (!irq)
> >>        
> >>                goto no_valid_irq;
> >> 
> >> +       /*
> >> +        * irq_domain_add_linear (called from dw_pcie_host_init)
> >> pre-allocates +        * descs so there is no need to call
> >> irq_alloc_descs here. +        */
> >> +
> >> 
> >>        i = 0;
> >>        while (i < no_irqs) {
> >>        
> >>                set_bit(pos0 + i, pp->msi_irq_in_use);
> >> 
> >> -               irq_alloc_descs((irq + i), (irq + i), 1, 0);
> >> -               irq_set_msi_desc(irq + i, desc);
> >> +               irq_set_msi_desc_off(irq, i, desc);
> > 
> > Why do you not allocate the descs anymore ?
> 
> Descs are allocated in irq_create_mapping, added in commit 904d0e788993
> 
> Oh, and I realize now that my comment in the patch regarding this is wrong.
> 
> […]
> 
> > Can you try using 'git send-email' for submitting subsequent patches
> > please?
> 
> I will eventually try using git send-mail for my next patch, which will
> correct the comment.

That'd be really helpful, thanks for spotting this btw ;-)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Kernel oops from pci_disable_msi
       [not found]                                 ` <cmu-lmtpd-17794-1385506605-2@frontend1.mail.m-online.net>
  2013-11-26 23:05                                   ` Marek Vasut
@ 2013-11-27  9:46                                   ` Marek Vasut
       [not found]                                     ` <cmu-lmtpd-20073-1385566822-1@frontend1.mail.m-online.net>
       [not found]                                     ` <C7.DF.00504.D1216925@epmailin4.samsung.com>
  1 sibling, 2 replies; 28+ messages in thread
From: Marek Vasut @ 2013-11-27  9:46 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: Jingoo Han, Pratyush Anand, Bjorn Helgaas,
	linux-pci@vger.kernel.org, Kishon Vijay Abraham I,
	Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey, Eric Nelson,
	Troy Kisky

Dear Bjørn Erik Nilsen,

> 26. nov. 2013 kl. 22:19 skrev Marek Vasut <marex@denx.de>:
> > Dear Bjørn Erik Nilsen,
> > [...]
> > 
> >>>> Thank you for your effort!
> >>>> I reproduced this kernel panic on Exynos platform with LAN card.
> >>>> And then, I tested your patch and checked this kernel panic is
> >>>> resolved.
> >>>> 
> >>>> Marek Vasut,
> >>>> Will you test Bjørn Erik Nilsen's patch with your i.MX platform?
> >>>> 
> >>>> Pratyush Anand,
> >>>> Would you confirm Bjørn Erik Nilsen's patch?
> >>>> 
> >>>> I will do more extensive testing.
> >>>> Thank you.
> >>> 
> >>> The patch does indeed fix the crash, but there are more subtle issues
> >>> lurking around. I noticed how irq numbers were constantly increasing
> >>> and I found at least one stupid mistake that I made.
> >>> 
> >>>> -             irq_set_msi_desc(irq + i, desc);
> >>>> +             irq_set_msi_desc_off(irq + i, i, desc);
> >>> 
> >>> That should be 'irq_set_msi_desc_off(irq, i, desc)'
> >>> 
> >>> (I'm really puzzled why this didn't cause other oops ...)
> >>> 
> >>> I also realized that teardown() is only called for the irq returned
> >>> from setup(), which kind of makes sense since the others are
> >>> destroyed() in the first call to teardown. This means we need to
> >>> iterate all irqs in the same fashion as we allocate in
> >>> setup/assign_irq. Basically unrolling what was done there.
> >>> 
> >>> I'm hacking on this solution now but it doesn't quite take me to where
> >>> I want at the moment, so it would be nice if someone with a better
> >>> understanding of the code could pitch in.
> >> 
> >> I gave it another shot and now it starts to look like something. At
> >> least I get consistent irq numbers and my system is very stable in
> >> general.
> >> 
> >> My new patch does exactly the opposite in teardown() of what is done in
> >> setup(), which in itself is a good sign.
> >> 
> >> --- pcie-designware.c.orig      2013-11-21 14:02:03.656007695 +0100
> >> +++ pcie-designware.c   2013-11-22 16:32:30.360954591 +0100
> >> @@ -242,11 +242,15 @@ static int assign_irq(int no_irqs, struc
> >> 
> >>        if (!irq)
> >>        
> >>                goto no_valid_irq;
> >> 
> >> +       /*
> >> +        * irq_domain_add_linear (called from dw_pcie_host_init)
> >> pre-allocates +        * descs so there is no need to call
> >> irq_alloc_descs here. +        */
> >> +
> >> 
> >>        i = 0;
> >>        while (i < no_irqs) {
> >>        
> >>                set_bit(pos0 + i, pp->msi_irq_in_use);
> >> 
> >> -               irq_alloc_descs((irq + i), (irq + i), 1, 0);
> >> -               irq_set_msi_desc(irq + i, desc);
> >> +               irq_set_msi_desc_off(irq, i, desc);

btw you are not checking the return value of irq_set_msi_desc{,_off}(), are you 
sure these functions do return 0 always ? Please fix this so you also check the 
return value.

Best regards,

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4] Kernel oops from pci_disable_msi
       [not found]                                     ` <cmu-lmtpd-20073-1385566822-1@frontend1.mail.m-online.net>
@ 2013-11-27 19:05                                       ` Marek Vasut
  0 siblings, 0 replies; 28+ messages in thread
From: Marek Vasut @ 2013-11-27 19:05 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: Jingoo Han, Pratyush Anand, Bjorn Helgaas,
	linux-pci@vger.kernel.org, Kishon Vijay Abraham I,
	Mohit KUMAR DCG, Ajay KHANDELWAL, Tim Harvey, Eric Nelson,
	Troy Kisky

Dear Bjørn Erik Nilsen,

> Commit 904d0e7889933fb48d921c998fd1cabb3a9d6635 added
> irq_create_mapping which pre-allocates irq descs.
> Problem was that in assign_irq these descs were explicitly
> allocated and hence also freed, resulting in a crash.
> 
> Signed-off-by: Bjørn Erik Nilsen <ben@datarespons.no>
> ---
>  drivers/pci/host/pcie-designware.c | 50
> +++++++++++++++++++++++++------------- 1 file changed, 33 insertions(+),
> 17 deletions(-)
> 
> diff --git a/drivers/pci/host/pcie-designware.c
> b/drivers/pci/host/pcie-designware.c index e33b68b..6c61abd 100644
> --- a/drivers/pci/host/pcie-designware.c
> +++ b/drivers/pci/host/pcie-designware.c
> @@ -242,11 +242,18 @@ static int assign_irq(int no_irqs, struct msi_desc
> *desc, int *pos) if (!irq)
>  		goto no_valid_irq;
> 
> +	/*
> +	 * irq_create_mapping (called from dw_pcie_host_init) pre-allocates
> +	 * descs so there is no need to allocate descs here. We can therefore
> +	 * assume that if irq_find_mapping above returns non-zero, then the
> +	 * descs are also successfully allocated. Hence no need to check the
> +	 * return value of irq_set_msi_desc_off below.
> +	 */
> +
>  	i = 0;
>  	while (i < no_irqs) {
>  		set_bit(pos0 + i, pp->msi_irq_in_use);
> -		irq_alloc_descs((irq + i), (irq + i), 1, 0);
> -		irq_set_msi_desc(irq + i, desc);
> +		irq_set_msi_desc_off(irq, i, desc);

This function here returns an return value , you _must_ check it and handle the 
possible error accordingly .

>  		/*Enable corresponding interrupt in MSI interrupt controller */
>  		res = ((pos0 + i) / 32) * 12;
>  		bit = (pos0 + i) % 32;
> @@ -266,7 +273,7 @@ no_valid_irq:
> 
>  static void clear_irq(unsigned int irq)
>  {
> -	int res, bit, val, pos;
> +	int res, bit, val, pos, i, nvec;
>  	struct irq_desc *desc;
>  	struct msi_desc *msi;
>  	struct pcie_port *pp;
> @@ -281,18 +288,28 @@ static void clear_irq(unsigned int irq)
>  		return;
>  	}
> 
> +	/* undo what was done in assign_irq */
>  	pos = data->hwirq;
> +	nvec = 1 << msi->msi_attrib.multiple;
> 
> -	irq_free_desc(irq);
> -
> -	clear_bit(pos, pp->msi_irq_in_use);
> +	i = 0;
> +	while (i < nvec) {
> +		clear_bit(pos + i, pp->msi_irq_in_use);
> +		irq_set_msi_desc_off(irq, i, NULL);

DTTO here.

> +		/* Disable corresponding interrupt on MSI interrupt controller 
*/
> +		res = ((pos + i) / 32) * 12;
> +		bit = (pos + i) % 32;
> +		dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
> +		val &= ~(1 << bit);
> +		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
> +		++i;
> +	}
> 
> -	/* Disable corresponding interrupt on MSI interrupt controller */
> -	res = (pos / 32) * 12;
> -	bit = pos % 32;
> -	dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
> -	val &= ~(1 << bit);
> -	dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
> +	msi->msg.address_lo = 0;
> +	msi->msg.address_hi = 0;
> +	msi->msg.data = 0;
> +	msi->irq = 0;
> +	msi->msi_attrib.multiple = 0;

This piece of new code could use a comment please.

>  }
> 
>  static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
> @@ -320,12 +337,11 @@ static int dw_msi_setup_irq(struct msi_chip *chip,
> struct pci_dev *pdev, if (irq < 0)
>  		return irq;
> 
> -	msg_ctr &= ~PCI_MSI_FLAGS_QSIZE;
> -	msg_ctr |= msgvec << 4;
> -	pci_write_config_word(pdev, desc->msi_attrib.pos + PCI_MSI_FLAGS,
> -				msg_ctr);
> +	/*
> +	 * write_msi_msg() will update PCI_MSI_FLAGS so there
> +	 * is no need to explicitly call pci_write_config here
> +	 */
>  	desc->msi_attrib.multiple = msgvec;
> -
>  	msg.address_lo = virt_to_phys((void *)pp->msi_data);
>  	msg.address_hi = 0x0;

Won't this overwrite your new code above ?

>  	msg.data = pos;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4] Kernel oops from pci_disable_msi
       [not found]                                     ` <C7.DF.00504.D1216925@epmailin4.samsung.com>
@ 2013-11-29  7:37                                       ` Jingoo Han
  2013-11-29 13:35                                         ` [PATCH v5 0/2] " Bjørn Erik Nilsen
  0 siblings, 1 reply; 28+ messages in thread
From: Jingoo Han @ 2013-11-29  7:37 UTC (permalink / raw)
  To: 'Bjørn Erik Nilsen', 'Marek Vasut'
  Cc: 'Pratyush Anand', 'Bjorn Helgaas', linux-pci,
	'Kishon Vijay Abraham I', 'Mohit KUMAR DCG',
	'Ajay KHANDELWAL', 'Tim Harvey',
	'Eric Nelson', 'Troy Kisky', 'Jingoo Han'

On Thursday, November 28, 2013 12:39 AM, Bjørn Erik Nilsen wrote:
> 
> Commit 904d0e7889933fb48d921c998fd1cabb3a9d6635 added

Please replace it with the following.

Commit 904d0e7889933fb48d921c998fd1cabb3a9d6635
"PCI: designware: Add irq_create_mapping()" added

Also, the subject of the commit is not good.
Usually, 'verb + objective + etc.' looks good.

Ex. PCI: designware: Remove irq_alloc_descs()

I tested this patch on Exynos platform; it works properly.
I will wait for v5 patch which addresses Marek Vasut's comments.
Thank you.

Best regards,
Jingoo Han

> irq_create_mapping which pre-allocates irq descs.
> Problem was that in assign_irq these descs were explicitly
> allocated and hence also freed, resulting in a crash.
> 
> Signed-off-by: Bjørn Erik Nilsen <ben@datarespons.no>
> ---
>  drivers/pci/host/pcie-designware.c | 50 +++++++++++++++++++++++++-------------
>  1 file changed, 33 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c
> index e33b68b..6c61abd 100644
> --- a/drivers/pci/host/pcie-designware.c
> +++ b/drivers/pci/host/pcie-designware.c
> @@ -242,11 +242,18 @@ static int assign_irq(int no_irqs, struct msi_desc *desc, int *pos)
>  	if (!irq)
>  		goto no_valid_irq;
> 
> +	/*
> +	 * irq_create_mapping (called from dw_pcie_host_init) pre-allocates
> +	 * descs so there is no need to allocate descs here. We can therefore
> +	 * assume that if irq_find_mapping above returns non-zero, then the
> +	 * descs are also successfully allocated. Hence no need to check the
> +	 * return value of irq_set_msi_desc_off below.
> +	 */
> +
>  	i = 0;
>  	while (i < no_irqs) {
>  		set_bit(pos0 + i, pp->msi_irq_in_use);
> -		irq_alloc_descs((irq + i), (irq + i), 1, 0);
> -		irq_set_msi_desc(irq + i, desc);
> +		irq_set_msi_desc_off(irq, i, desc);
>  		/*Enable corresponding interrupt in MSI interrupt controller */
>  		res = ((pos0 + i) / 32) * 12;
>  		bit = (pos0 + i) % 32;
> @@ -266,7 +273,7 @@ no_valid_irq:
> 
>  static void clear_irq(unsigned int irq)
>  {
> -	int res, bit, val, pos;
> +	int res, bit, val, pos, i, nvec;
>  	struct irq_desc *desc;
>  	struct msi_desc *msi;
>  	struct pcie_port *pp;
> @@ -281,18 +288,28 @@ static void clear_irq(unsigned int irq)
>  		return;
>  	}
> 
> +	/* undo what was done in assign_irq */
>  	pos = data->hwirq;
> +	nvec = 1 << msi->msi_attrib.multiple;
> 
> -	irq_free_desc(irq);
> -
> -	clear_bit(pos, pp->msi_irq_in_use);
> +	i = 0;
> +	while (i < nvec) {
> +		clear_bit(pos + i, pp->msi_irq_in_use);
> +		irq_set_msi_desc_off(irq, i, NULL);
> +		/* Disable corresponding interrupt on MSI interrupt controller */
> +		res = ((pos + i) / 32) * 12;
> +		bit = (pos + i) % 32;
> +		dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
> +		val &= ~(1 << bit);
> +		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
> +		++i;
> +	}
> 
> -	/* Disable corresponding interrupt on MSI interrupt controller */
> -	res = (pos / 32) * 12;
> -	bit = pos % 32;
> -	dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
> -	val &= ~(1 << bit);
> -	dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
> +	msi->msg.address_lo = 0;
> +	msi->msg.address_hi = 0;
> +	msi->msg.data = 0;
> +	msi->irq = 0;
> +	msi->msi_attrib.multiple = 0;
>  }
> 
>  static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
> @@ -320,12 +337,11 @@ static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
>  	if (irq < 0)
>  		return irq;
> 
> -	msg_ctr &= ~PCI_MSI_FLAGS_QSIZE;
> -	msg_ctr |= msgvec << 4;
> -	pci_write_config_word(pdev, desc->msi_attrib.pos + PCI_MSI_FLAGS,
> -				msg_ctr);
> +	/*
> +	 * write_msi_msg() will update PCI_MSI_FLAGS so there
> +	 * is no need to explicitly call pci_write_config here
> +	 */
>  	desc->msi_attrib.multiple = msgvec;
> -
>  	msg.address_lo = virt_to_phys((void *)pp->msi_data);
>  	msg.address_hi = 0x0;
>  	msg.data = pos;
> --
> 1.8.4.4
> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 0/2] Kernel oops from pci_disable_msi
  2013-11-29  7:37                                       ` Jingoo Han
@ 2013-11-29 13:35                                         ` Bjørn Erik Nilsen
  2013-11-29 13:35                                           ` [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq Bjørn Erik Nilsen
                                                             ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Bjørn Erik Nilsen @ 2013-11-29 13:35 UTC (permalink / raw)
  To: jg1.han
  Cc: marex, pratyush.anand, bhelgaas, linux-pci, kishon, Mohit.KUMAR,
	ajay.khandelwal, tharvey, Eric.Nelson, troy.kisky,
	Bjørn Erik Nilsen

This patchset address Marek Vasut's comments on the previous
patches. I decided to split into two separate commits as they are
unrelated.

Bjørn Erik Nilsen (2):
  PCI: designware: Fix crash in dw_msi_teardown_irq
  PCI: designware: Remove redundant call to pci_write_config

 drivers/pci/host/pcie-designware.c | 57 +++++++++++++++++++++++++++-----------
 1 file changed, 41 insertions(+), 16 deletions(-)

-- 
1.8.4.4


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq
  2013-11-29 13:35                                         ` [PATCH v5 0/2] " Bjørn Erik Nilsen
@ 2013-11-29 13:35                                           ` Bjørn Erik Nilsen
  2013-11-29 14:32                                             ` Marek Vasut
  2013-11-29 13:35                                           ` [PATCH v5 2/2] PCI: designware: Remove redundant call to pci_write_config Bjørn Erik Nilsen
                                                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 28+ messages in thread
From: Bjørn Erik Nilsen @ 2013-11-29 13:35 UTC (permalink / raw)
  To: jg1.han
  Cc: marex, pratyush.anand, bhelgaas, linux-pci, kishon, Mohit.KUMAR,
	ajay.khandelwal, tharvey, Eric.Nelson, troy.kisky,
	Bjørn Erik Nilsen

"PCI: designware: Add irq_create_mapping()" resulted
in pre-allocated irq descs. Problem was that in assign_irq()
these descs were explicitly allocated and hence also freed,
resulting in a crash. We also need to clear the entire irq
range in teardown. With this commit the teardown basically
does exactly the opposite of what was done in setup.

Signed-off-by: Bjørn Erik Nilsen <ben@datarespons.no>
---
 drivers/pci/host/pcie-designware.c | 49 ++++++++++++++++++++++++++++----------
 1 file changed, 37 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c
index e33b68b..61345a1 100644
--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -209,6 +209,25 @@ static int find_valid_pos0(struct pcie_port *pp, int msgvec, int pos, int *pos0)
 	return 0;
 }
 
+static void clear_irq_range(struct pcie_port *pp, unsigned int irq_base,
+                            unsigned int nvec, unsigned int pos)
+{
+	unsigned int i, res, bit, val;
+
+	i = 0;
+	while (i < nvec) {
+		irq_set_msi_desc_off(irq_base, i, NULL);
+		clear_bit(pos + i, pp->msi_irq_in_use);
+		/* Disable corresponding interrupt on MSI interrupt controller */
+		res = ((pos + i) / 32) * 12;
+		bit = (pos + i) % 32;
+		dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
+		val &= ~(1 << bit);
+		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
+		++i;
+	}
+}
+
 static int assign_irq(int no_irqs, struct msi_desc *desc, int *pos)
 {
 	int res, bit, irq, pos0, pos1, i;
@@ -242,11 +261,20 @@ static int assign_irq(int no_irqs, struct msi_desc *desc, int *pos)
 	if (!irq)
 		goto no_valid_irq;
 
+	/*
+	 * irq_create_mapping (called from dw_pcie_host_init) pre-allocates
+	 * descs so there is no need to allocate descs here. We can therefore
+	 * assume that if irq_find_mapping above returns non-zero, then the
+	 * descs are also successfully allocated.
+	 */
+
 	i = 0;
 	while (i < no_irqs) {
+		if (irq_set_msi_desc_off(irq, i, desc) != 0) {
+			clear_irq_range(pp, irq, i, pos0);
+			goto no_valid_irq;
+		}
 		set_bit(pos0 + i, pp->msi_irq_in_use);
-		irq_alloc_descs((irq + i), (irq + i), 1, 0);
-		irq_set_msi_desc(irq + i, desc);
 		/*Enable corresponding interrupt in MSI interrupt controller */
 		res = ((pos0 + i) / 32) * 12;
 		bit = (pos0 + i) % 32;
@@ -266,7 +294,7 @@ no_valid_irq:
 
 static void clear_irq(unsigned int irq)
 {
-	int res, bit, val, pos;
+	unsigned int pos, nvec;
 	struct irq_desc *desc;
 	struct msi_desc *msi;
 	struct pcie_port *pp;
@@ -281,18 +309,15 @@ static void clear_irq(unsigned int irq)
 		return;
 	}
 
+	/* undo what was done in assign_irq */
 	pos = data->hwirq;
+	nvec = 1 << msi->msi_attrib.multiple;
 
-	irq_free_desc(irq);
-
-	clear_bit(pos, pp->msi_irq_in_use);
+	clear_irq_range(pp, irq, nvec, pos);
 
-	/* Disable corresponding interrupt on MSI interrupt controller */
-	res = (pos / 32) * 12;
-	bit = pos % 32;
-	dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
-	val &= ~(1 << bit);
-	dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
+	/* all irqs cleared; reset attributes */
+	msi->irq = 0;
+	msi->msi_attrib.multiple = 0;
 }
 
 static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
-- 
1.8.4.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v5 2/2] PCI: designware: Remove redundant call to pci_write_config
  2013-11-29 13:35                                         ` [PATCH v5 0/2] " Bjørn Erik Nilsen
  2013-11-29 13:35                                           ` [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq Bjørn Erik Nilsen
@ 2013-11-29 13:35                                           ` Bjørn Erik Nilsen
  2013-12-05  1:52                                           ` [PATCH v5 0/2] Kernel oops from pci_disable_msi Jingoo Han
  2013-12-09 20:43                                           ` Bjorn Helgaas
  3 siblings, 0 replies; 28+ messages in thread
From: Bjørn Erik Nilsen @ 2013-11-29 13:35 UTC (permalink / raw)
  To: jg1.han
  Cc: marex, pratyush.anand, bhelgaas, linux-pci, kishon, Mohit.KUMAR,
	ajay.khandelwal, tharvey, Eric.Nelson, troy.kisky,
	Bjørn Erik Nilsen

write_msi_msg() does exactly the same so there is no need
to explicitly call pci_write_config and do the same twice.

Signed-off-by: Bjørn Erik Nilsen <ben@datarespons.no>
---
 drivers/pci/host/pcie-designware.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c
index 61345a1..212b8b6 100644
--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -345,10 +345,10 @@ static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
 	if (irq < 0)
 		return irq;
 
-	msg_ctr &= ~PCI_MSI_FLAGS_QSIZE;
-	msg_ctr |= msgvec << 4;
-	pci_write_config_word(pdev, desc->msi_attrib.pos + PCI_MSI_FLAGS,
-				msg_ctr);
+	/*
+	 * write_msi_msg() will update PCI_MSI_FLAGS so there is
+	 * no need to explicitly call pci_write_config.
+	 */
 	desc->msi_attrib.multiple = msgvec;
 
 	msg.address_lo = virt_to_phys((void *)pp->msi_data);
-- 
1.8.4.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq
  2013-11-29 13:35                                           ` [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq Bjørn Erik Nilsen
@ 2013-11-29 14:32                                             ` Marek Vasut
       [not found]                                               ` <cmu-lmtpd-25244-1385737839-23@frontend1.mail.m-online.net>
  0 siblings, 1 reply; 28+ messages in thread
From: Marek Vasut @ 2013-11-29 14:32 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: jg1.han, pratyush.anand, bhelgaas, linux-pci, kishon, Mohit.KUMAR,
	ajay.khandelwal, tharvey, Eric.Nelson, troy.kisky

Dear Bjørn Erik Nilsen,

I will be a nitpicking bastard below, sorry ;-/

> "PCI: designware: Add irq_create_mapping()" resulted
> in pre-allocated irq descs. Problem was that in assign_irq()
> these descs were explicitly allocated and hence also freed,
> resulting in a crash. We also need to clear the entire irq
> range in teardown. With this commit the teardown basically
> does exactly the opposite of what was done in setup.
> 
> Signed-off-by: Bjørn Erik Nilsen <ben@datarespons.no>

You can stuff your CC list here, like this, if it's more convenient for you then 
to type all these people on the command line:

Cc: Exa Mple <user@example.com>

The git send-email will automatically send email to this address.

> ---
>  drivers/pci/host/pcie-designware.c | 49
> ++++++++++++++++++++++++++++---------- 1 file changed, 37 insertions(+),
> 12 deletions(-)
> 
> diff --git a/drivers/pci/host/pcie-designware.c
> b/drivers/pci/host/pcie-designware.c index e33b68b..61345a1 100644
> --- a/drivers/pci/host/pcie-designware.c
> +++ b/drivers/pci/host/pcie-designware.c
> @@ -209,6 +209,25 @@ static int find_valid_pos0(struct pcie_port *pp, int
> msgvec, int pos, int *pos0) return 0;
>  }
> 
> +static void clear_irq_range(struct pcie_port *pp, unsigned int irq_base,
> +                            unsigned int nvec, unsigned int pos)
> +{
> +	unsigned int i, res, bit, val;
> +
> +	i = 0;
> +	while (i < nvec) {

Won't simple

for (i = 0; i < nvec; i++) {

do here?

> +		irq_set_msi_desc_off(irq_base, i, NULL);

This one returns a return value, please handle it.
ret = irq_set(...);
if (ret)
 goto fail;


> +		clear_bit(pos + i, pp->msi_irq_in_use);
> +		/* Disable corresponding interrupt on MSI interrupt controller 
*/
> +		res = ((pos + i) / 32) * 12;
> +		bit = (pos + i) % 32;
> +		dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
> +		val &= ~(1 << bit);
> +		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
> +		++i;
> +	}

fail:
 for (--i; i >=0; i--)
   irq...
return ret;

> +}
> +
>  static int assign_irq(int no_irqs, struct msi_desc *desc, int *pos)
>  {
>  	int res, bit, irq, pos0, pos1, i;
> @@ -242,11 +261,20 @@ static int assign_irq(int no_irqs, struct msi_desc
> *desc, int *pos) if (!irq)
>  		goto no_valid_irq;
> 
> +	/*
> +	 * irq_create_mapping (called from dw_pcie_host_init) pre-allocates
> +	 * descs so there is no need to allocate descs here. We can therefore
> +	 * assume that if irq_find_mapping above returns non-zero, then the
> +	 * descs are also successfully allocated.
> +	 */
> +
>  	i = 0;
>  	while (i < no_irqs) {
> +		if (irq_set_msi_desc_off(irq, i, desc) != 0) {
> +			clear_irq_range(pp, irq, i, pos0);
> +			goto no_valid_irq;
> +		}
>  		set_bit(pos0 + i, pp->msi_irq_in_use);
> -		irq_alloc_descs((irq + i), (irq + i), 1, 0);
> -		irq_set_msi_desc(irq + i, desc);
>  		/*Enable corresponding interrupt in MSI interrupt controller */
>  		res = ((pos0 + i) / 32) * 12;
>  		bit = (pos0 + i) % 32;
> @@ -266,7 +294,7 @@ no_valid_irq:
> 
>  static void clear_irq(unsigned int irq)
>  {
> -	int res, bit, val, pos;
> +	unsigned int pos, nvec;
>  	struct irq_desc *desc;
>  	struct msi_desc *msi;
>  	struct pcie_port *pp;
> @@ -281,18 +309,15 @@ static void clear_irq(unsigned int irq)
>  		return;
>  	}
> 
> +	/* undo what was done in assign_irq */
>  	pos = data->hwirq;
> +	nvec = 1 << msi->msi_attrib.multiple;
> 
> -	irq_free_desc(irq);
> -
> -	clear_bit(pos, pp->msi_irq_in_use);
> +	clear_irq_range(pp, irq, nvec, pos);
> 
> -	/* Disable corresponding interrupt on MSI interrupt controller */
> -	res = (pos / 32) * 12;
> -	bit = pos % 32;
> -	dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
> -	val &= ~(1 << bit);
> -	dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
> +	/* all irqs cleared; reset attributes */
> +	msi->irq = 0;
> +	msi->msi_attrib.multiple = 0;
>  }
> 
>  static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq
       [not found]                                               ` <cmu-lmtpd-25244-1385737839-23@frontend1.mail.m-online.net>
@ 2013-11-29 15:36                                                 ` Marek Vasut
       [not found]                                                   ` <cmu-lmtpd-32360-1385742114-0@frontend1.mail.m-online.net>
  0 siblings, 1 reply; 28+ messages in thread
From: Marek Vasut @ 2013-11-29 15:36 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: jg1.han@samsung.com, pratyush.anand@gmail.com,
	bhelgaas@google.com, linux-pci@vger.kernel.org, kishon@ti.com,
	Mohit.KUMAR@st.com, ajay.khandelwal@st.com, tharvey@gateworks.com,
	Eric.Nelson@boundarydevices.com, troy.kisky@boundarydevices.com

Dear Bjørn Erik Nilsen,
[...]
> > Won't simple
> > 
> > for (i = 0; i < nvec; i++) {
> > 
> > do here?
> 
> Yes.
> 
> The very same syntax ('while i < nvec') is used in assign_irq. That is
> why I wanted to keep it like that to avoid adding too much confusion, or
> at least make it easy to recognize the same pattern.
> 
> You still want me to change it?

Your loop does exactly what a for() loop would do, there's no need to emulate it 
with a while() loop. If you can fix the other one, fix the other one as well 
please.

> As for the other nitpick, I don't agree with you.
> 
> In fact, dw_msi_teardown_irq has no return value itself. Moreover, if
> setting the msi desc to NULL fails, then it simply means there is no irq
> desc and there is nothing to unwind. Also, skipping the other unwind
> operations just because that single operation failed would leave the
> driver in a much worse state. At least in my opinion.
> 
> What's your opinion on that?

I see this:

$ git grep irq_set_msi_desc_off include/
include/linux/irq.h:extern int irq_set_msi_desc_off(unsigned int irq_base, 
unsigned int irq_offset,

So it has a return value which needs to be handled. Sorry if I wasn't clear on 
the first try.

> Now, if you want me to make another patch do you prefer a standalone
> patch on top of the other patches, or a completely new patchset?

Let's keep iterating :) But please wait for more feedback first.

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq
       [not found]                                                   ` <cmu-lmtpd-32360-1385742114-0@frontend1.mail.m-online.net>
@ 2013-11-29 17:02                                                     ` Marek Vasut
       [not found]                                                       ` <cmu-lmtpd-8418-1385750273-5@frontend1.mail.m-online.net>
  0 siblings, 1 reply; 28+ messages in thread
From: Marek Vasut @ 2013-11-29 17:02 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: jg1.han@samsung.com, pratyush.anand@gmail.com,
	bhelgaas@google.com, linux-pci@vger.kernel.org, kishon@ti.com,
	Mohit.KUMAR@st.com, ajay.khandelwal@st.com, tharvey@gateworks.com,
	Eric.Nelson@boundarydevices.com, troy.kisky@boundarydevices.com

Dear Bjørn Erik Nilsen,

> Dear Marek Vasut,
> 
> On Fri, 2013-11-29 at 16:36 +0100, Marek Vasut wrote:
> > Dear Bjørn Erik Nilsen,
> > [...]
> > 
> > > > Won't simple
> > > > 
> > > > for (i = 0; i < nvec; i++) {
> > > > 
> > > > do here?
> > > 
> > > Yes.
> > > 
> > > The very same syntax ('while i < nvec') is used in assign_irq. That is
> > > why I wanted to keep it like that to avoid adding too much confusion,
> > > or at least make it easy to recognize the same pattern.
> > > 
> > > You still want me to change it?
> > 
> > Your loop does exactly what a for() loop would do, there's no need to
> > emulate it with a while() loop. If you can fix the other one, fix the
> > other one as well please.
> 
> Yes, it is syntax sugar indeed and I did it purely for readability as
> explained. You don't seem to agree with me on the readability argument,
> so I can make the new code use a for-loop. And then in an unrelated
> commit (which would look awfully silly) I can change the other
> while-loop too.

Either way is fine with me, I'm just more inclined to the for loop ;-) Let's 
wait for what the others have to say, they can judge this argument.

> > > As for the other nitpick, I don't agree with you.
> > > 
> > > In fact, dw_msi_teardown_irq has no return value itself. Moreover, if
> > > setting the msi desc to NULL fails, then it simply means there is no
> > > irq desc and there is nothing to unwind. Also, skipping the other
> > > unwind operations just because that single operation failed would
> > > leave the driver in a much worse state. At least in my opinion.
> > > 
> > > What's your opinion on that?
> > 
> > I see this:
> > 
> > $ git grep irq_set_msi_desc_off include/
> > include/linux/irq.h:extern int irq_set_msi_desc_off(unsigned int
> > irq_base, unsigned int irq_offset,
> > 
> > So it has a return value which needs to be handled. Sorry if I wasn't
> > clear on the first try.
> 
> I agree that return values should be checked.
> 
> However based on my reasoning there is absolutely nothing to do with the
> return value in this particular case, and in fact bailing out will leave
> the driver in a much worse state (as other unwind operations will be
> skipped).
> 
> That is what I kindly asked you to comment on.

How come you cannot unwind what you did thus far and bail out ? That means the 
design here is seriously broken, don't you think ?

> > > Now, if you want me to make another patch do you prefer a standalone
> > > patch on top of the other patches, or a completely new patchset?
> > 
> > Let's keep iterating :) But please wait for more feedback first.
> 
> OK. I will wait for more feedback then.

ACK, thanks ;-)

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq
       [not found]                                                       ` <cmu-lmtpd-8418-1385750273-5@frontend1.mail.m-online.net>
@ 2013-12-02  8:10                                                         ` Marek Vasut
  0 siblings, 0 replies; 28+ messages in thread
From: Marek Vasut @ 2013-12-02  8:10 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: jg1.han@samsung.com, pratyush.anand@gmail.com,
	bhelgaas@google.com, linux-pci@vger.kernel.org, kishon@ti.com,
	Mohit.KUMAR@st.com, ajay.khandelwal@st.com, tharvey@gateworks.com,
	Eric.Nelson@boundarydevices.com, troy.kisky@boundarydevices.com

Dear Bjørn Erik Nilsen,

> 29. nov. 2013 kl. 18:02 skrev Marek Vasut <marex@denx.de>:
> > Dear Bjørn Erik Nilsen,
> > 
> >> Dear Marek Vasut,
> >> 
> >> On Fri, 2013-11-29 at 16:36 +0100, Marek Vasut wrote:
> >>> Dear Bjørn Erik Nilsen,
> >>> [...]
> >>> 
> >>>>> Won't simple
> >>>>> 
> >>>>> for (i = 0; i < nvec; i++) {
> >>>>> 
> >>>>> do here?
> >>>> 
> >>>> Yes.
> >>>> 
> >>>> The very same syntax ('while i < nvec') is used in assign_irq. That is
> >>>> why I wanted to keep it like that to avoid adding too much confusion,
> >>>> or at least make it easy to recognize the same pattern.
> >>>> 
> >>>> You still want me to change it?
> >>> 
> >>> Your loop does exactly what a for() loop would do, there's no need to
> >>> emulate it with a while() loop. If you can fix the other one, fix the
> >>> other one as well please.
> >> 
> >> Yes, it is syntax sugar indeed and I did it purely for readability as
> >> explained. You don't seem to agree with me on the readability argument,
> >> so I can make the new code use a for-loop. And then in an unrelated
> >> commit (which would look awfully silly) I can change the other
> >> while-loop too.
> > 
> > Either way is fine with me, I'm just more inclined to the for loop ;-)
> > Let's wait for what the others have to say, they can judge this
> > argument.
> > 
> >>>> As for the other nitpick, I don't agree with you.
> >>>> 
> >>>> In fact, dw_msi_teardown_irq has no return value itself. Moreover, if
> >>>> setting the msi desc to NULL fails, then it simply means there is no
> >>>> irq desc and there is nothing to unwind. Also, skipping the other
> >>>> unwind operations just because that single operation failed would
> >>>> leave the driver in a much worse state. At least in my opinion.
> >>>> 
> >>>> What's your opinion on that?
> >>> 
> >>> I see this:
> >>> 
> >>> $ git grep irq_set_msi_desc_off include/
> >>> include/linux/irq.h:extern int irq_set_msi_desc_off(unsigned int
> >>> irq_base, unsigned int irq_offset,
> >>> 
> >>> So it has a return value which needs to be handled. Sorry if I wasn't
> >>> clear on the first try.
> >> 
> >> I agree that return values should be checked.
> >> 
> >> However based on my reasoning there is absolutely nothing to do with the
> >> return value in this particular case, and in fact bailing out will leave
> >> the driver in a much worse state (as other unwind operations will be
> >> skipped).
> >> 
> >> That is what I kindly asked you to comment on.
> > 
> > How come you cannot unwind what you did thus far and bail out ? That
> > means the design here is seriously broken, don't you think ?
> 
> Maybe you need to take a closer look at the patch because the code in
> question is *the* unwinding code. So bailing out from there  doesn't make
> much sense.
> 
> Also note that I do check the return value of the same function call in
> setup, but the one you are complaining is for unwinding stuff that has
> already been setup.
> 
> I hope this helps :-)

Please consider me stupid and blind, sorry. I think maybe a WARN_ON(ret) might 
be a good sign things have gone severely awry here at least. What do you think?

Best regards,

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 0/2] Kernel oops from pci_disable_msi
  2013-11-29 13:35                                         ` [PATCH v5 0/2] " Bjørn Erik Nilsen
  2013-11-29 13:35                                           ` [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq Bjørn Erik Nilsen
  2013-11-29 13:35                                           ` [PATCH v5 2/2] PCI: designware: Remove redundant call to pci_write_config Bjørn Erik Nilsen
@ 2013-12-05  1:52                                           ` Jingoo Han
  2013-12-05  2:18                                             ` Marek Vasut
  2013-12-09 20:43                                           ` Bjorn Helgaas
  3 siblings, 1 reply; 28+ messages in thread
From: Jingoo Han @ 2013-12-05  1:52 UTC (permalink / raw)
  To: 'Bjørn Erik Nilsen', 'Marek Vasut'
  Cc: 'Pratyush Anand', 'Bjorn Helgaas', linux-pci,
	'Kishon Vijay Abraham I', 'Mohit KUMAR',
	'Ajay KHANDELWAL', 'Tim Harvey',
	'Eric Nelson', 'Troy Kisky', 'Jingoo Han'

On Friday, November 29, 2013 10:35 PM, Bjørn Erik Nilsen wrote:
> 
> This patchset address Marek Vasut's comments on the previous
> patches. I decided to split into two separate commits as they are
> unrelated.
> 
> Bjørn Erik Nilsen (2):
>   PCI: designware: Fix crash in dw_msi_teardown_irq
>   PCI: designware: Remove redundant call to pci_write_config
> 
>  drivers/pci/host/pcie-designware.c | 57 +++++++++++++++++++++++++++-----------
>  1 file changed, 41 insertions(+), 16 deletions(-)

I tested these patches on Exynos platform, and checked that
the kernel panic was resolved.

Marek Vasut, Bjørn Erik Nilsen,
Is there any issue for accepting these patches?

Best regards,
Jingoo Han


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 0/2] Kernel oops from pci_disable_msi
  2013-12-05  1:52                                           ` [PATCH v5 0/2] Kernel oops from pci_disable_msi Jingoo Han
@ 2013-12-05  2:18                                             ` Marek Vasut
  2013-12-05  2:24                                               ` Jingoo Han
  0 siblings, 1 reply; 28+ messages in thread
From: Marek Vasut @ 2013-12-05  2:18 UTC (permalink / raw)
  To: Jingoo Han
  Cc: 'Bjørn Erik Nilsen', 'Pratyush Anand',
	'Bjorn Helgaas', linux-pci,
	'Kishon Vijay Abraham I', 'Mohit KUMAR',
	'Ajay KHANDELWAL', 'Tim Harvey',
	'Eric Nelson', 'Troy Kisky'

On Thursday, December 05, 2013 at 02:52:38 AM, Jingoo Han wrote:
> On Friday, November 29, 2013 10:35 PM, Bjørn Erik Nilsen wrote:
> > This patchset address Marek Vasut's comments on the previous
> > patches. I decided to split into two separate commits as they are
> > unrelated.
> > 
> > Bjørn Erik Nilsen (2):
> >   PCI: designware: Fix crash in dw_msi_teardown_irq
> >   PCI: designware: Remove redundant call to pci_write_config
> >  
> >  drivers/pci/host/pcie-designware.c | 57
> >  +++++++++++++++++++++++++++----------- 1 file changed, 41
> >  insertions(+), 16 deletions(-)
> 
> I tested these patches on Exynos platform, and checked that
> the kernel panic was resolved.
> 
> Marek Vasut, Bjørn Erik Nilsen,
> Is there any issue for accepting these patches?

No issue on my end.

Acked-by: Marek Vasut <marex@denx.de>

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 0/2] Kernel oops from pci_disable_msi
  2013-12-05  2:18                                             ` Marek Vasut
@ 2013-12-05  2:24                                               ` Jingoo Han
  2013-12-05  4:07                                                 ` Mohit KUMAR DCG
  0 siblings, 1 reply; 28+ messages in thread
From: Jingoo Han @ 2013-12-05  2:24 UTC (permalink / raw)
  To: 'Bjorn Helgaas'
  Cc: 'Marek Vasut', 'Bjørn Erik Nilsen',
	'Pratyush Anand', linux-pci,
	'Kishon Vijay Abraham I', 'Mohit KUMAR',
	'Ajay KHANDELWAL', 'Tim Harvey',
	'Eric Nelson', 'Troy Kisky', 'Jingoo Han'

On Thursday, December 05, 2013 11:19 AM, Marek Vasut wrote:
> On Thursday, December 05, 2013 at 02:52:38 AM, Jingoo Han wrote:
> > On Friday, November 29, 2013 10:35 PM, Bjørn Erik Nilsen wrote:
> > > This patchset address Marek Vasut's comments on the previous
> > > patches. I decided to split into two separate commits as they are
> > > unrelated.
> > >
> > > Bjørn Erik Nilsen (2):
> > >   PCI: designware: Fix crash in dw_msi_teardown_irq
> > >   PCI: designware: Remove redundant call to pci_write_config
> > >
> > >  drivers/pci/host/pcie-designware.c | 57
> > >  +++++++++++++++++++++++++++----------- 1 file changed, 41
> > >  insertions(+), 16 deletions(-)
> >
> > I tested these patches on Exynos platform, and checked that
> > the kernel panic was resolved.
> >
> > Marek Vasut, Bjørn Erik Nilsen,
> > Is there any issue for accepting these patches?
> 
> No issue on my end.
> 
> Acked-by: Marek Vasut <marex@denx.de>

Acked-by: Jingoo Han <jg1.han@samsung.com>

Best regards,
Jingoo Han


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v5 0/2] Kernel oops from pci_disable_msi
  2013-12-05  2:24                                               ` Jingoo Han
@ 2013-12-05  4:07                                                 ` Mohit KUMAR DCG
  0 siblings, 0 replies; 28+ messages in thread
From: Mohit KUMAR DCG @ 2013-12-05  4:07 UTC (permalink / raw)
  To: Jingoo Han, 'Bjorn Helgaas'
  Cc: 'Marek Vasut', 'Bjørn Erik Nilsen',
	'Pratyush Anand', linux-pci@vger.kernel.org,
	'Kishon Vijay Abraham I', Ajay KHANDELWAL,
	'Tim Harvey', 'Eric Nelson', 'Troy Kisky'

Hello Bjørn Erik Nilsen,

> -----Original Message-----
> From: Jingoo Han [mailto:jg1.han@samsung.com]
> Sent: Thursday, December 05, 2013 7:55 AM
> To: 'Bjorn Helgaas'
> Cc: 'Marek Vasut'; 'Bjørn Erik Nilsen'; 'Pratyush Anand'; linux-
> pci@vger.kernel.org; 'Kishon Vijay Abraham I'; Mohit KUMAR DCG; Ajay
> KHANDELWAL; 'Tim Harvey'; 'Eric Nelson'; 'Troy Kisky'; 'Jingoo Han'
> Subject: Re: [PATCH v5 0/2] Kernel oops from pci_disable_msi
> 
> On Thursday, December 05, 2013 11:19 AM, Marek Vasut wrote:
> > On Thursday, December 05, 2013 at 02:52:38 AM, Jingoo Han wrote:
> > > On Friday, November 29, 2013 10:35 PM, Bjørn Erik Nilsen wrote:
> > > > This patchset address Marek Vasut's comments on the previous
> > > > patches. I decided to split into two separate commits as they are
> > > > unrelated.
> > > >
> > > > Bjørn Erik Nilsen (2):
> > > >   PCI: designware: Fix crash in dw_msi_teardown_irq
> > > >   PCI: designware: Remove redundant call to pci_write_config
> > > >
> > > >  drivers/pci/host/pcie-designware.c | 57
> > > >  +++++++++++++++++++++++++++----------- 1 file changed, 41
> > > > insertions(+), 16 deletions(-)
> > >
> > > I tested these patches on Exynos platform, and checked that the
> > > kernel panic was resolved.
> > >
> > > Marek Vasut, Bjørn Erik Nilsen,
> > > Is there any issue for accepting these patches?
> >
> > No issue on my end.
> >
> > Acked-by: Marek Vasut <marex@denx.de>
> 
> Acked-by: Jingoo Han <jg1.han@samsung.com>
> 
I tested it on SPEAr1310 platform and no kernel panic is found.

Tested-by: Mohit Kumar <mohit.kumar@st.com>

As 'write_msi_msg()' redundant  call is removed so the following comment messages are not required here:
	/*
	 * write_msi_msg() will update PCI_MSI_FLAGS so there is
	 * no need to explicitly call pci_write_config.
	 */
Instead move it to the commit log. Same for the other patch for dw_msi_teardown_irq().

Thanks
Mohit

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 0/2] Kernel oops from pci_disable_msi
  2013-11-29 13:35                                         ` [PATCH v5 0/2] " Bjørn Erik Nilsen
                                                             ` (2 preceding siblings ...)
  2013-12-05  1:52                                           ` [PATCH v5 0/2] Kernel oops from pci_disable_msi Jingoo Han
@ 2013-12-09 20:43                                           ` Bjorn Helgaas
       [not found]                                             ` <52a632f0.e42c980a.3d86.ffff8faeSMTPIN_ADDED_MISSING@mx.google.com>
  3 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2013-12-09 20:43 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: jg1.han, marex, pratyush.anand, linux-pci, kishon, Mohit.KUMAR,
	ajay.khandelwal, tharvey, Eric.Nelson, troy.kisky

On Fri, Nov 29, 2013 at 02:35:23PM +0100, Bjørn Erik Nilsen wrote:
> This patchset address Marek Vasut's comments on the previous
> patches. I decided to split into two separate commits as they are
> unrelated.
> 
> Bjørn Erik Nilsen (2):
>   PCI: designware: Fix crash in dw_msi_teardown_irq
>   PCI: designware: Remove redundant call to pci_write_config

Hi Bjørn,

There's been a fair amount of discussion about the first patch, so I'm
ignoring these for now.  Please repost these (again) when that all settles
down.

Bjorn

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 0/2] Kernel oops from pci_disable_msi
       [not found]                                             ` <52a632f0.e42c980a.3d86.ffff8faeSMTPIN_ADDED_MISSING@mx.google.com>
@ 2013-12-09 21:21                                               ` Bjorn Helgaas
       [not found]                                                 ` <52a63678.4902980a.6fd7.ffffa2b9SMTPIN_ADDED_MISSING@mx.google.com>
  0 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2013-12-09 21:21 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: jg1.han@samsung.com, marex@denx.de, pratyush.anand@gmail.com,
	linux-pci@vger.kernel.org, kishon@ti.com, Mohit.KUMAR@st.com,
	ajay.khandelwal@st.com, tharvey@gateworks.com,
	Eric.Nelson@boundarydevices.com, troy.kisky@boundarydevices.com

On Mon, Dec 9, 2013 at 2:15 PM, Bjørn Erik Nilsen <ben@datarespons.no> wrote:
> Hi Bjorn,
>
> 9. des. 2013 kl. 21:43 skrev Bjorn Helgaas <bhelgaas@google.com>:
>
>> On Fri, Nov 29, 2013 at 02:35:23PM +0100, Bjørn Erik Nilsen wrote:
>>> This patchset address Marek Vasut's comments on the previous
>>> patches. I decided to split into two separate commits as they are
>>> unrelated.
>>>
>>> Bjørn Erik Nilsen (2):
>>>  PCI: designware: Fix crash in dw_msi_teardown_irq
>>>  PCI: designware: Remove redundant call to pci_write_config
>>
>> Hi Bjørn,
>>
>> There's been a fair amount of discussion about the first patch, so I'm
>> ignoring these for now.  Please repost these (again) when that all settles
>> down.
>
> I'm under the impression that all concerns have been addressed and that the discussion has settled down.
>
> What are the outstanding issues to be solved regarding these patches?
>
> I'm asking because these patches are acked-by Marek Vasut and Jingoo Han as well as tested-by Mohit Kumar (and obviously myself).
>
> Jingoo Han confirms he can reproduce the crash on Exynos and that the first patch fixes the problem.

I didn't trace through the discussion closely enough to follow
complaints and their resolution or retraction.  Can you post the
current version including the acks, just so I make sure I get the
latest stuff?

Thanks,
  Bjorn

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 0/2] Kernel oops from pci_disable_msi
       [not found]                                                 ` <52a63678.4902980a.6fd7.ffffa2b9SMTPIN_ADDED_MISSING@mx.google.com>
@ 2013-12-09 22:27                                                   ` Bjorn Helgaas
  0 siblings, 0 replies; 28+ messages in thread
From: Bjorn Helgaas @ 2013-12-09 22:27 UTC (permalink / raw)
  To: Bjørn Erik Nilsen
  Cc: jg1.han@samsung.com, marex@denx.de, pratyush.anand@gmail.com,
	linux-pci@vger.kernel.org, kishon@ti.com, Mohit.KUMAR@st.com,
	ajay.khandelwal@st.com, tharvey@gateworks.com,
	Eric.Nelson@boundarydevices.com, troy.kisky@boundarydevices.com

On Mon, Dec 09, 2013 at 10:30:17PM +0100, Bjørn Erik Nilsen wrote:
> 
> 9. des. 2013 kl. 22:21 skrev Bjorn Helgaas <bhelgaas@google.com>:
> 
> > On Mon, Dec 9, 2013 at 2:15 PM, Bjørn Erik Nilsen <ben@datarespons.no> wrote:
> >> Hi Bjorn,
> >> 
> >> 9. des. 2013 kl. 21:43 skrev Bjorn Helgaas <bhelgaas@google.com>:
> >> 
> >>> On Fri, Nov 29, 2013 at 02:35:23PM +0100, Bjørn Erik Nilsen wrote:
> >>>> This patchset address Marek Vasut's comments on the previous
> >>>> patches. I decided to split into two separate commits as they are
> >>>> unrelated.
> >>>> 
> >>>> Bjørn Erik Nilsen (2):
> >>>> PCI: designware: Fix crash in dw_msi_teardown_irq
> >>>> PCI: designware: Remove redundant call to pci_write_config
> >>> 
> >>> Hi Bjørn,
> >>> 
> >>> There's been a fair amount of discussion about the first patch, so I'm
> >>> ignoring these for now.  Please repost these (again) when that all settles
> >>> down.
> >> 
> >> I'm under the impression that all concerns have been addressed and that the discussion has settled down.
> >> 
> >> What are the outstanding issues to be solved regarding these patches?
> >> 
> >> I'm asking because these patches are acked-by Marek Vasut and Jingoo Han as well as tested-by Mohit Kumar (and obviously myself).
> >> 
> >> Jingoo Han confirms he can reproduce the crash on Exynos and that the first patch fixes the problem.
> > 
> > I didn't trace through the discussion closely enough to follow
> > complaints and their resolution or retraction.  Can you post the
> > current version including the acks, just so I make sure I get the
> > latest stuff?
> 
> I'm not able to repost right now, but I can do it tomorrow morning if needed.
> 
> v5 (as posted) is the latest however, and the amendment is:
> 
> Acked-by: Marek Vasut <marex@denx.de>
> Acked-by: Jingoo Han <jg1.han@samsung.com>
> Tested-by: Mohit Kumar <mohit.kumar@st.com>

OK, I added these and applied to a pci/host-designware branch for v3.14.  I
also added the following cleanup patch as suggested by Marek:


commit 0b8cfb6aa3aabc96177b1e68ef13d2eb5c686606
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Mon Dec 9 15:11:25 2013 -0700

    PCI: designware: Use typical "for" loop idiom
    
    It's conventional to use "for" rather than "while" for simple iteration.
    No functional change.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c
index 5274085ecd4b..1c92833a4ed3 100644
--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -214,8 +214,7 @@ static void clear_irq_range(struct pcie_port *pp, unsigned int irq_base,
 {
 	unsigned int i, res, bit, val;
 
-	i = 0;
-	while (i < nvec) {
+	for (i = 0; i < nvec; i++) {
 		irq_set_msi_desc_off(irq_base, i, NULL);
 		clear_bit(pos + i, pp->msi_irq_in_use);
 		/* Disable corresponding interrupt on MSI interrupt controller */
@@ -224,7 +223,6 @@ static void clear_irq_range(struct pcie_port *pp, unsigned int irq_base,
 		dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
 		val &= ~(1 << bit);
 		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
-		++i;
 	}
 }
 
@@ -268,8 +266,7 @@ static int assign_irq(int no_irqs, struct msi_desc *desc, int *pos)
 	 * descs are also successfully allocated.
 	 */
 
-	i = 0;
-	while (i < no_irqs) {
+	for (i = 0; i < no_irqs; i++) {
 		if (irq_set_msi_desc_off(irq, i, desc) != 0) {
 			clear_irq_range(pp, irq, i, pos0);
 			goto no_valid_irq;
@@ -281,7 +278,6 @@ static int assign_irq(int no_irqs, struct msi_desc *desc, int *pos)
 		dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, &val);
 		val |= 1 << bit;
 		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
-		i++;
 	}
 
 	*pos = pos0;

^ permalink raw reply related	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2013-12-09 22:28 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <528a1bb6.6a88700a.28c9.ffff824aSMTPIN_ADDED_MISSING@mx.google.com>
     [not found] ` <CAErSpo4TbUuq0wb06JV9Xchmcjsk9q3cm7+XO-dOSiJAAhXPMA@mail.gmail.com>
2013-11-18 21:02   ` Kernel oops from pci_disable_msi Bjorn Helgaas
2013-11-18 23:11     ` Jingoo Han
2013-11-19 11:24       ` Marek Vasut
     [not found] ` <1384861142.3682.1.camel@bnilsen-HP-EliteBook-8760w>
     [not found]   ` <cmu-lmtpd-19155-1384861370-13@frontend1.mail.m-online.net>
2013-11-19 22:01     ` Marek Vasut
     [not found]       ` <cmu-lmtpd-1612-1384936883-21@frontend1.mail.m-online.net>
2013-11-20 10:30         ` Marek Vasut
     [not found]           ` <cmu-lmtpd-21237-1384948548-21@frontend1.mail.m-online.net>
2013-11-20 12:02             ` Marek Vasut
     [not found]               ` <cmu-lmtpd-23243-1384949805-3@frontend1.mail.m-online.net>
2013-11-20 13:57                 ` Marek Vasut
     [not found]                   ` <1385036087.3945.28.camel@bnilsen-HP>
     [not found]                     ` <1385039987.6020.5.camel@bnilsen-HP>
     [not found]                       ` <16.79.22145.6305E825@epmailin9.samsung.com>
2013-11-22  8:48                         ` Jingoo Han
     [not found]                           ` <1385118399.3944.32.camel@bnilsen-HP>
     [not found]                             ` <7D.78.31634.B838F825@epmailin2.samsung.com>
2013-11-26 11:21                               ` [PATCH] " Jingoo Han
     [not found]                             ` <cmu-lmtpd-32538-1385137032-0@frontend1.mail.m-online.net>
2013-11-26 21:19                               ` Marek Vasut
     [not found]                                 ` <cmu-lmtpd-17794-1385506605-2@frontend1.mail.m-online.net>
2013-11-26 23:05                                   ` Marek Vasut
2013-11-27  9:46                                   ` Marek Vasut
     [not found]                                     ` <cmu-lmtpd-20073-1385566822-1@frontend1.mail.m-online.net>
2013-11-27 19:05                                       ` [PATCH v4] " Marek Vasut
     [not found]                                     ` <C7.DF.00504.D1216925@epmailin4.samsung.com>
2013-11-29  7:37                                       ` Jingoo Han
2013-11-29 13:35                                         ` [PATCH v5 0/2] " Bjørn Erik Nilsen
2013-11-29 13:35                                           ` [PATCH v5 1/2] PCI: designware: Fix crash in dw_msi_teardown_irq Bjørn Erik Nilsen
2013-11-29 14:32                                             ` Marek Vasut
     [not found]                                               ` <cmu-lmtpd-25244-1385737839-23@frontend1.mail.m-online.net>
2013-11-29 15:36                                                 ` Marek Vasut
     [not found]                                                   ` <cmu-lmtpd-32360-1385742114-0@frontend1.mail.m-online.net>
2013-11-29 17:02                                                     ` Marek Vasut
     [not found]                                                       ` <cmu-lmtpd-8418-1385750273-5@frontend1.mail.m-online.net>
2013-12-02  8:10                                                         ` Marek Vasut
2013-11-29 13:35                                           ` [PATCH v5 2/2] PCI: designware: Remove redundant call to pci_write_config Bjørn Erik Nilsen
2013-12-05  1:52                                           ` [PATCH v5 0/2] Kernel oops from pci_disable_msi Jingoo Han
2013-12-05  2:18                                             ` Marek Vasut
2013-12-05  2:24                                               ` Jingoo Han
2013-12-05  4:07                                                 ` Mohit KUMAR DCG
2013-12-09 20:43                                           ` Bjorn Helgaas
     [not found]                                             ` <52a632f0.e42c980a.3d86.ffff8faeSMTPIN_ADDED_MISSING@mx.google.com>
2013-12-09 21:21                                               ` Bjorn Helgaas
     [not found]                                                 ` <52a63678.4902980a.6fd7.ffffa2b9SMTPIN_ADDED_MISSING@mx.google.com>
2013-12-09 22:27                                                   ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).