From mboxrd@z Thu Jan 1 00:00:00 1970
From: Don Dutile
Subject: Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
Date: Tue, 07 Apr 2015 10:12:43 -0400
Message-ID: <5523E5DB.2090001@redhat.com>
References: <1426743388-26908-1-git-send-email-zhen-hual@hp.com>
 <20150403084031.GF22579@dhcp-128-53.nay.redhat.com>
 <551E56F6.60503@hp.com>
 <20150403092111.GG22579@dhcp-128-53.nay.redhat.com>
 <20150405015453.GB1562@dhcp-17-102.nay.redhat.com>
 <20150407034622.GB7213@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20150407034622.GB7213-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
Sender: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: Dave Young, Baoquan He
Cc: tom.vaden-VXdhtT5mjnY@public.gmane.org,
 rwright-VXdhtT5mjnY@public.gmane.org,
 linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 kexec-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
 linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 lisa.mitchell-VXdhtT5mjnY@public.gmane.org,
 iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
 "Li, ZhenHua",
 doug.hatch-VXdhtT5mjnY@public.gmane.org,
 ishii.hironobu-+CUm20s59erQFUHtdCDX3A@public.gmane.org,
 bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
 billsumnerlinux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
 li.zhang6-VXdhtT5mjnY@public.gmane.org,
 dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org,
 vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
List-Id: iommu@lists.linux-foundation.org

On 04/06/2015 11:46 PM, Dave Young wrote:
> On 04/05/15 at 09:54am, Baoquan He wrote:
>> On 04/03/15 at 05:21pm, Dave Young wrote:
>>> On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
>>>> Hi Dave,
>>>>
>>>> There is some possibility that the old iommu data is corrupted by
>>>> some other module. Currently we do not have a better solution for the
>>>> dmar faults.
>>>>
>>>> But I think when this happens, we need to fix the module that corrupted
>>>> the old iommu data. I once met a similar problem in the normal kernel:
>>>> the queue used by the qi_* functions was overwritten by another module.
>>>> The fix was in that module, not in the iommu module.
>>>
>>> It is too late, there will be no chance to save the vmcore then.
>>>
>>> Also, if using the old iommu tables makes it possible to keep corrupting
>>> other areas of oldmem, then it will cause more problems.
>>>
>>> So I think the tables at least need some verification before being used.
>>>
>>
>> Yes, it's good to think about this, and verification is also an
>> interesting idea. kexec/kdump does a sha256 calculation on the loaded
>> kernel and then verifies it again in purgatory when a panic happens. This
>> checks whether any code stomped into the region reserved for the kexec
>> kernel and corrupted the loaded kernel.
>>
>> If we decide to do this, it should be an enhancement to the current
>> patchset, not a change of approach. Since this patchset is very close
>> to what the maintainers expected, maybe it can be merged first, and then
>> we can think about the enhancement. After all, without this patchset
>> vt-d often raised error messages and hung.
>
> It does not convince me; we should do it right from the beginning instead
> of introducing something wrong.
>
> I wonder why the old dma cannot be remapped to a specific page in the kdump
> kernel so that it will not corrupt more memory. But I may have missed
> something; I will look for the old threads and catch up.
>
> Thanks
> Dave
>
Corruption is not the (only) issue. Once the iommu is re-configured in the
kexec kernel, even to a single/simple paging scheme, the old dma engines
that have not been stopped yet will keep using their old iova's. Those
iova's will generate dmar faults, and dmar fault reporting is enabled when
the iommu is re-configured.
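
For illustration only, the checksum-then-verify idea Baoquan mentions above (a sha256 taken at load time and checked again later) could look roughly like the sketch below. It is Python rather than kernel code, and the helper names (digest_regions, verify_regions) and the two 4 KiB buffers standing in for tables are made up for the example; nothing here is taken from the patchset or from kexec itself.

```python
import hashlib


def digest_regions(regions):
    """Return one SHA-256 hex digest computed over a sequence of byte regions."""
    h = hashlib.sha256()
    for region in regions:
        h.update(region)
    return h.hexdigest()


def verify_regions(regions, expected):
    """Recompute the digest and compare it with the one saved earlier."""
    return digest_regions(regions) == expected


# "Load time": two hypothetical 4 KiB tables, digest saved somewhere safe.
root_table = bytearray(4096)
context_table = bytearray(4096)
saved_digest = digest_regions([root_table, context_table])

# "Crash time", nothing touched the tables: verification passes.
intact_ok = verify_regions([root_table, context_table], saved_digest)

# A stray write from some other module flips one byte: verification fails.
context_table[128] ^= 0xFF
corrupted_ok = verify_regions([root_table, context_table], saved_digest)

print(intact_ok, corrupted_ok)
```

Note that a digest saved once only works for data that stays fixed after it is loaded, as the kexec image does. Live iommu tables change whenever drivers map and unmap buffers, so a scheme like this would need some other notion of "known good" for them; the sketch does not address that.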