From: Takao Indoh <indou.takao@jp.fujitsu.com>
To: dwmw2@infradead.org
Cc: joro@8bytes.org, kexec@lists.infradead.org,
linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org
Subject: Re: [PATCH] intel-iommu: Synchronize gcmd value with global command register
Date: Thu, 04 Apr 2013 14:48:25 +0900
Message-ID: <515D1429.40707@jp.fujitsu.com>
In-Reply-To: <1364977479.28127.15.camel@i7.infradead.org>
(2013/04/03 17:24), David Woodhouse wrote:
> On Wed, 2013-04-03 at 16:11 +0900, Takao Indoh wrote:
>> (2013/04/02 23:05), Joerg Roedel wrote:
>>> On Mon, Apr 01, 2013 at 02:45:18PM +0900, Takao Indoh wrote:
>>>> <Current flow on kdump boot>
>>>> enable_IR
>>>> intel_enable_irq_remapping
>>>> iommu_disable_irq_remapping <== IRES/QIES/TES disabled here
>>>> dmar_disable_qi <== do nothing
>>>> dmar_enable_qi <== QIES enabled
>>>> intel_setup_irq_remapping <== IRES enabled
>>>
>>> But what we want to do here in the kdump case is to disable translation
>>> too, right? Because the former kernel might have translation and
>>> irq-remapping enabled and the kdump kernel might be compiled without
>>> support for dma-remapping. So if we don't disable translation here too
>>> the kdump kernel is unable to do DMA.
>>
>> Yeah, you are right. I forgot such a case.
>
> If you disable translation and there's some device still doing DMA, it's
> going to scribble over random areas of memory. You really want to have
> translation enabled and all the page tables *cleared*, during kexec. I
> think it's fair to insist that the secondary kernel should use the IOMMU
> if the first one did.
>
>> To be honest, I also expected the side effect of this patch. As I wrote
>> in the previous mail, I'm working on kdump problem with iommu, that is,
>> ongoing DMA causes DMAR fault in 2nd kernel and sometimes kdump fails
>> due to this fault.
>
> Here you've lost me. The DMAR fault is caught and reported, and how does
> this lead to a kdump failure? Are you using dodgy hardware that just
> keeps *trying* after an abort, and floods the system with a storm of
> DMAR faults? We've occasionally spoken about working around such a
> problem by setting a bit to make subsequent faults *silent*. Would that
> work?

There are several cases.

- DMAR fault messages flood the console and the second kernel does not
  boot. I recently saw a similar report: https://lkml.org/lkml/2013/3/8/120
- The igb driver detects an error on linkup and kdump via network fails.
- On a certain platform, though kdump itself works, a PCIe error such as
  Unexpected Completion is detected and the hardware gets degraded.

Thanks,
Takao Indoh

>
>> What we have to do is stopping DMA transaction
>> before DMA-remapping is disabled in 2nd kernel.
>
> The IOMMU is there to stop DMA transactions. That is its *job*. :)
>