From: "Li, ZhenHua" <zhen-hual-VXdhtT5mjnY@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: Baoquan He <bhe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
"Vaden,
Tom L (HP Server OS Architecture)"
<tom.vaden-VXdhtT5mjnY@public.gmane.org>,
"linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"kexec-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org"
<kexec-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org>,
"linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"open list:INTEL IOMMU (VT-d)"
<iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
doug.hatch-VXdhtT5mjnY@public.gmane.org,
"ishii.hironobu-+CUm20s59erQFUHtdCDX3A@public.gmane.org"
<ishii.hironobu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>,
Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
zhenhua-VXdhtT5mjnY@public.gmane.org,
David Woodhouse <dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Subject: Re: [PATCH 0/8] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO
Date: Wed, 22 Oct 2014 11:08:36 +0800
Message-ID: <54471FB4.4030602@hp.com>
In-Reply-To: <87mw8on7lx.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
I need more time to read and think about these mails. I just want to
clarify one thing: Bill has left HP, and I have inherited his work.
That is why I sent an update of his patch:
https://lkml.org/lkml/2014/10/21/134
On 10/22/2014 10:47 AM, Eric W. Biederman wrote:
> Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> writes:
>
>> [-cc Bill, +cc Zhen-Hua, Eric, Tom, Jerry]
>>
>> Hi Joerg,
>>
>> I was looking at Zhen-Hua's recent patches, trying to figure out if I
>> need to do anything with them. Resetting devices in the old kernel
>> seems like a non-starter. Resetting devices in the new kernel, ...,
>> well, maybe. It seems ugly, and it seems like the sort of problem
>> that IOMMUs are designed to solve. Anyway, I found this old
>> discussion that I didn't quite understand:
>
> For context, here is the kexec on panic design, and what I know from
> previous rounds of similar conversations.
>
> The way kexec on panic aka kdump is designed to work is that the
> recovery kernel lives in a piece of memory reserved at boot time and
> known not to be in use by any driver (because we never ever use it for
> DMA). If DMAs continue from any source, the old kernel may be a little
> more corrupted, but our currently running kernel should not be.
>
> Device drivers that we use in the recovery kernel are required to be
> able to initialize their devices from an arbitrary state or fail to
> initialize their devices.
>
> We have discussed things on various occasions, but IOMMUs all have their
> own individual idiosyncrasies and came late to the party, so it
> is hard to generalize.
>
> The reserved region is generally low enough in memory that simply
> not using IOMMUs works.
>
> The major challenge with initializing an IOMMU would be that there are
> potentially devices whose driver is not loaded in the recovery kernel
> with on-going DMA sessions (perhaps a NIC in response to a network
> packet).
>
> Which essentially means that if you are going to use an IOMMU slot in a
> recovery kernel you have to either know that IOMMU slot was reserved for
> the recovery kernel (what has always felt like the easiest way to me),
> or you have to know everything that could target that IOMMU slot has
> been reset or has its driver loaded.
>
> I have always thought the simplest and easiest solution would be to
> reserve a few IOMMU slots for the kexec on panic kernel. But if folks
> can find other ways to guarantee that an on-going DMA isn't targeting
> an IOMMU slot (such as resetting everything downstream from that
> IOMMU slot) more power to you.
>
>> On Wed, Jul 2, 2014 at 7:32 AM, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org> wrote:
>>> On Wed, Apr 30, 2014 at 11:49:33AM +0100, David Woodhouse wrote:
>>
>>>> After the last round of this patchset, we discussed a potential
>>>> improvement where you point every virtual bus address at the *same*
>>>> physical scratch page.
>>>
>>> That is a solution to prevent the in-flight DMA failures. But what
>>> happens when there is some in-flight DMA to a disk to write some inodes
>>> or a new superblock? Then this scratch address-space may cause
>>> filesystem corruption at worst.
>>
>> This in-flight DMA is from a device programmed by the old kernel, and
>> it would be reading data from the old kernel's buffers. I think
>> you're suggesting that we might want that DMA read to complete so the
>> device can update filesystem metadata?
>>
>> I don't really understand that argument. Don't we usually want to
>> stop any data from escaping the machine after a crash, on the theory
>> that the old kernel is crashing because something is catastrophically
>> wrong and we may have already corrupted things in memory? If so,
>> allowing this old DMA to complete is just as likely to make things
>> worse as to make them better.
>>
>> Without kdump, we likely would reboot through the BIOS and the device
>> would get reset and the DMA would never happen at all. So if we made
>> the dump kernel program the IOMMU to prevent the DMA, that seems like
>> a similar situation.
>>
>>> So with this in mind I would prefer initially taking over the
>>> page-tables from the old kernel before the device drivers re-initialize
>>> the devices.
>>
>> This makes the dump kernel more dependent on data from the old kernel,
>> which we obviously want to avoid when possible.
>>
>> I didn't find the previous discussion where pointing every virtual bus
>> address at the same physical scratch page was proposed. Why was that
>> better than programming the IOMMU to reject every DMA?
>>
>> Bjorn
>
> Eric
>
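For reference, the three options discussed in the quoted thread (reject
every DMA, point every bus address at one scratch page, or take over the
old kernel's page tables) can be compared with a toy model. This is only
a sketch under stated assumptions: the names Policy, translate, FAULT and
SCRATCH are hypothetical, the dictionary stands in for real VT-d page
tables, and nothing here reflects the actual patch set.

```python
from enum import Enum

FAULT = None  # hypothetical marker: the IOMMU rejected the DMA


class Policy(Enum):
    REJECT_ALL = "reject"      # dump kernel blocks every in-flight DMA
    SCRATCH_PAGE = "scratch"   # every bus address maps to one scratch page
    INHERIT_OLD = "inherit"    # dump kernel reuses the old kernel's mappings


def translate(old_table, policy, bus_page, scratch_page):
    """Return the physical page an in-flight DMA would hit, or FAULT."""
    if policy is Policy.REJECT_ALL:
        return FAULT
    if policy is Policy.SCRATCH_PAGE:
        # Reads and writes land on the scratch page, never on real data.
        return scratch_page
    # INHERIT_OLD: only addresses the old kernel mapped still translate.
    return old_table.get(bus_page, FAULT)


# Toy mappings left behind by the crashed kernel (bus page -> physical page).
old = {0x10: 0x8000, 0x11: 0x8001}
SCRATCH = 0x100

for policy in Policy:
    results = [translate(old, policy, bus, SCRATCH) for bus in (0x10, 0x11, 0x99)]
    print(policy.name, results)
```

In this model only INHERIT_OLD depends on data from the old kernel, which
is the trade-off Bjorn raises above; the other two differ only in whether
the device sees a fault or a harmless target.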