From: Takao Indoh <indou.takao@jp.fujitsu.com>
To: kaneshige.kenji@jp.fujitsu.com
Cc: vgoyal@redhat.com, kexec@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	bhelgaas@google.com, hbabu@us.ibm.com,
	ishii.hironobu@jp.fujitsu.com, martin.wilck@ts.fujitsu.com
Subject: Re: [RFC][PATCH] Reset PCIe devices to address DMA problem on kdump with iommu
Date: Mon, 10 Sep 2012 15:35:20 +0900
Message-ID: <504D8A28.9090105@jp.fujitsu.com>
In-Reply-To: <4A338DB2991D2A44B9A44B8718AECF650A4AA6BF@G01JPEXMBYT03>

(2012/09/10 11:34), Kaneshige, Kenji wrote:
>> -----Original Message-----
>> From: linux-pci-owner@vger.kernel.org
>> [mailto:linux-pci-owner@vger.kernel.org] On Behalf Of Takao Indoh
>> Sent: Wednesday, September 05, 2012 8:10 PM
>> To: vgoyal@redhat.com
>> Cc: kexec@lists.infradead.org; linux-kernel@vger.kernel.org;
>> linux-pci@vger.kernel.org; bhelgaas@google.com; hbabu@us.ibm.com; Ishii,
>> Hironobu/石井 宏延; martin.wilck@ts.fujitsu.com
>> Subject: Re: [RFC][PATCH] Reset PCIe devices to address DMA problem on kdump
>> with iommu
>>
>> (2012/08/07 5:39), Vivek Goyal wrote:
>>> On Mon, Aug 06, 2012 at 01:30:47PM +0900, Takao Indoh wrote:
>>>> Hi Vivek,
>>>>
>>>> (2012/08/03 20:46), Vivek Goyal wrote:
>>>>> On Fri, Aug 03, 2012 at 08:24:31PM +0900, Takao Indoh wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> This patch adds a kernel parameter, "reset_pcie_devices", which resets
>>>>>> PCIe devices at boot time to address the DMA problem on kdump with
>>>>>> iommu. When this parameter is specified, a hot reset is triggered on
>>>>>> each PCIe root port and downstream port to reset its downstream
>>>>>> endpoint.
>>>>>
>>>>> Hi Takao,
>>>>>
>>>>> Why not use existing "reset_devices" parameter instead of introducing
>>>>> a new one?
>>>>
>>>> "reset_devices" is used for each driver to reset their own device, and
>>>> this patch resets all devices forcibly, so I thought they were different
>>>> things.
>>>
>>> Yes reset_devices currently is used for driver to reset its device. I
>>> thought one could very well extend its reach to reset pci express devices
>>> at bus level.
>>>
>>> Having them separate is not going to be very useful from a kdump
>>> perspective. We will end up passing both reset_devices and
>>> reset_pcie_devices to the second kernel, which will lead to a bus level
>>> reset as well as a device level reset.
>>>
>>> The ideal situation would be to somehow detect that a bus level reset
>>> has been done and skip the device level reset (assuming a bus level reset
>>> obviates the need for a device level reset; please correct me if that's
>>> not the case).
>>>
>>> After the PCIe reset, can we store the state in a variable so that
>>> drivers can check whether a PCIe level reset was done? If it was, they
>>> skip the device level reset (assuming the driver knows that the device
>>> is on a PCIe slot).
>>>
>>> In that case we will not have to introduce new kernel command line, and
>>> also avoid double reset?
>>
>> I found a problem when testing my patch on some machines.
>>
>> Originally there were two problems in the kdump kernel when iommu is
>> enabled: DMAR errors and PCI SERR. I thought both were fixed by my patch,
>> but I noticed that PCI SERR is still detected after applying the patch.
>> It seems that something happens when Interrupt Remapping is initialized
>> in the kdump kernel.
> 
> I'm not sure, but I guess the PCI SERR might be caused as follows.
> 
> - The 1st kernel enables interrupt remapping. The MSI(-X) address and
>    data registers of PCI devices are programmed in remappable format.
> 
> - At the beginning of the 2nd kernel, interrupt remapping is still active.
>    It is then disabled by the enable_IR() function for initialization.
> 
> - A PCI device generates an interrupt. At this moment, interrupt remapping
>    is not enabled yet. However, the MSI(-X) address and data of this
>    interrupt are in remappable format because they were programmed by the
>    1st kernel. I guess this might be the cause of the PCI SERR.
> 
> I guess clearing the command register or disabling MSI before interrupt
> remapping initialization in the 2nd kernel might solve the PCI SERR problem.

Thank you for your comment. That makes sense.

Though I think clearing the bus master bit is enough, do you think I need to
clear all command register bits, not only bus master?
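
To make the question concrete, here is a minimal sketch of the bit
manipulation involved. The bit values are the ones defined by the PCI
specification (and mirrored in the kernel's pci_regs.h). The helper names
are hypothetical, and the actual config space read/write is left out, so
this only shows which bits each option would clear.

```c
#include <stdint.h>

/* Bit values from the PCI specification (same values as pci_regs.h). */
#define PCI_COMMAND_MASTER    0x0004 /* Command register: Bus Master Enable */
#define PCI_MSI_FLAGS_ENABLE  0x0001 /* MSI Message Control: MSI Enable */
#define PCI_MSIX_FLAGS_ENABLE 0x8000 /* MSI-X Message Control: MSI-X Enable */

/* Hypothetical helper: clear only the bus master bit, keep everything else. */
static inline uint16_t cmd_clear_bus_master(uint16_t cmd)
{
	return cmd & (uint16_t)~PCI_COMMAND_MASTER;
}

/* Hypothetical helper: clear the whole command register. */
static inline uint16_t cmd_clear_all(uint16_t cmd)
{
	(void)cmd; /* the old value is irrelevant when clearing every bit */
	return 0;
}

/* Hypothetical helper: clear the MSI Enable bit in MSI Message Control. */
static inline uint16_t msi_flags_disable(uint16_t flags)
{
	return flags & (uint16_t)~PCI_MSI_FLAGS_ENABLE;
}

/* Hypothetical helper: clear the MSI-X Enable bit in MSI-X Message Control. */
static inline uint16_t msix_flags_disable(uint16_t flags)
{
	return flags & (uint16_t)~PCI_MSIX_FLAGS_ENABLE;
}
```

With bus master cleared, the device should not be able to issue memory
writes, which includes MSI(-X) messages, so the first helper already covers
the interrupt case described above. Clearing the whole register additionally
turns off I/O and memory decoding and SERR# reporting.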

Thanks,
Takao Indoh


> Regards,
> Kenji Kaneshige
> 
> 
>>
>> Therefore resetting devices has to be done before enable_IR() is
>> called. I have three ideas for it.
>>
>>    (i) Resetting devices in the 1st kernel (panic kernel)
>>    We can reset devices before jumping into the 2nd kernel. Of course, it
>>    may be dangerous to scan the PCI device tree and call PCI functions in
>>    a panicked kernel. We would need to collect device information
>>    beforehand so that only minimal code runs on panic.
>>
>>    (ii) Resetting devices in purgatory
>>    This seems to be an appropriate place to do it, but I'm not sure
>>    where I can save/restore PCI config when resetting devices in
>>    purgatory.
>>
>>    (iii) Resetting devices in the 2nd kernel (kdump kernel)
>>    The important point is to do the reset before enable_IR() is called, as
>>    I wrote above. I think I should add a new function that does the reset
>>    to arch/x86/pci/early.c and call it in setup_arch, like
>>    early_dump_pci_devices() or early_quirks().
>>
>> Any comments?
>>
>> Thanks,
>> Takao Indoh
>>
> 
> 
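
For reference, idea (iii) in the quoted mail would rely on early config
space access, before the normal PCI subsystem is up. On x86 this is
typically done through PCI configuration mechanism #1: an address is
written to I/O port 0xCF8 and the data is transferred through port 0xCFC.
Below is a minimal sketch of how that address is formed. The macro and
function names are hypothetical, the port I/O itself is left out, and the
offset is rounded down to a dword boundary as an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical name: bit 31 enables the configuration cycle. */
#define PCI_CONF1_ENABLE 0x80000000u

/*
 * Hypothetical helper: build the value written to port 0xCF8 to select
 * a config space register of bus/slot/func.
 *   bits 23:16 bus, bits 15:11 slot (device), bits 10:8 function,
 *   bits 7:2 dword-aligned register offset.
 */
static inline uint32_t pci_conf1_address(uint8_t bus, uint8_t slot,
					 uint8_t func, uint8_t offset)
{
	return PCI_CONF1_ENABLE |
	       ((uint32_t)bus << 16) |
	       ((uint32_t)(slot & 0x1f) << 11) |
	       ((uint32_t)(func & 0x07) << 8) |
	       ((uint32_t)offset & 0xfcu);
}
```

An early reset routine would walk bus/slot/func with addresses like this,
find the root ports and downstream ports, and toggle their bridge control
registers, all before enable_IR() runs.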

