From: Takao Indoh <indou.takao@jp.fujitsu.com>
To: dwmw2@infradead.org
Cc: alex.williamson@redhat.com, iommu@lists.linux-foundation.org,
kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
joro@8bytes.org
Subject: Re: [PATCH] intel-iommu: Quiesce devices before disabling IOMMU
Date: Tue, 10 Sep 2013 14:43:13 +0900
Message-ID: <522EB171.6000909@jp.fujitsu.com>
In-Reply-To: <1378717669.2627.239.camel@shinybook.infradead.org>
(2013/09/09 18:07), David Woodhouse wrote:
> On Wed, 2013-08-21 at 16:15 +0900, Takao Indoh wrote:
>>
>> This causes problem on kdump. Devices are working in first kernel, and
>> after switching to second kernel and initializing IOMMU, many DMAR faults
>> occur and it causes problems like driver error or PCI SERR, at last
>> kdump fails. This patch fixes this problem.
>
> I'm not sure I'd call this a fix.
>
> If the driver is so broken that it cannot get the device working again
> after a fault, surely the driver needs to be fixed?
Yes, this problem may be solved by fixing the driver. In fact, the megaraid sas
driver was recently fixed for this problem. (See commit 6431f5d7.)
But I think the root cause of this problem is initializing the IOMMU while DMA
is still active, and I want to solve the root cause rather than
handle it in each driver; otherwise we have to fix a driver each time we
find this kind of problem.
>
> If the system is suffering an IRQ storm because device doesn't give up
> after the first few faults, then we should switch off the fault
> *reporting* for that device so that its faults get ignored (until it
> next actually sets up a DMA mapping, or something).
In such a case, yes, limiting the messages is enough.
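As a rough illustration of the suggestion quoted above, here is a userspace
toy model of per-device fault-report suppression. It is only a sketch, not
kernel or intel-iommu code: the struct, the function names, and the threshold
FAULT_REPORT_LIMIT are all hypothetical. Reports are allowed for the first few
faults, then muted until the device next sets up a DMA mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold: how many faults are reported before muting. */
#define FAULT_REPORT_LIMIT 3

/* Hypothetical per-device bookkeeping, not a real kernel structure. */
struct dev_fault_state {
	unsigned int faults_seen;  /* faults reported since the last mapping setup */
	bool reporting_muted;      /* true once the limit has been reached */
};

/* Called for each fault; returns true if the fault should be reported. */
bool fault_should_report(struct dev_fault_state *s)
{
	assert(s != NULL);

	if (s->reporting_muted)
		return false;

	s->faults_seen++;
	if (s->faults_seen >= FAULT_REPORT_LIMIT)
		s->reporting_muted = true;  /* this report is the last one */

	return true;
}

/* Called when the device sets up a new DMA mapping; re-arms reporting. */
void dma_mapping_set_up(struct dev_fault_state *s)
{
	assert(s != NULL);

	s->faults_seen = 0;
	s->reporting_muted = false;
}
```

With the limit set to 3, the first three faults from a device are reported,
later ones are ignored, and reporting resumes after dma_mapping_set_up().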
>
> For the IOMMU code to reset individual devices, just because they still
> have an active DMA mapping even if they're not *doing* DMA, seems wrong.
> You'll even end up resetting devices just because they have an RMRR,
> won't you? (Although I wouldn't lose any sleep over that, I suppose. In
> fact it might be a *feature*... :)
Right, the current code resets devices which *may* be doing DMA. The
ideal way would be to find the devices which are actually doing DMA and
reset only them, but I don't know how we can do this. Still, I think the
current code is sufficient.
Thanks,
Takao Indoh