From: Jiang Liu <jiang.liu-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
To: Yijing Wang <wangyijing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
Linda Knippers <linda.knippers-VXdhtT5mjnY@public.gmane.org>,
Alex Williamson
<alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: huaxiuxiu-hv44wF8Li93QT0dZR+AlfA@public.gmane.org,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
David Woodhouse <dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Subject: Re: [PATCH] iommu/vt-d: Fix broken device issue when using iommu=pt
Date: Tue, 12 Aug 2014 10:34:30 +0800
Message-ID: <53E97D36.8010403@linux.intel.com>
In-Reply-To: <53E96FE0.7080600-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
On 2014/8/12 9:37, Yijing Wang wrote:
> On 2014/8/11 22:59, Linda Knippers wrote:
>> On 8/11/2014 12:43 AM, Alex Williamson wrote:
>>> On Mon, 2014-08-11 at 10:54 +0800, Yijing Wang wrote:
>>>> We found some strange devices in HP C7000 and Huawei Server. These devices
>>>> can not be enumerated by OS, but they still did DMA read/write without OS
>>>> management. Because iommu will not create the DMA mapping for these devices,
>>>> the DMA read/write will be blocked by iommu hardware.
>>>>
>>>> Eg.
>>>> \-[0000:00]-+-00.0 Intel Corporation Xeon E5/Core i7 DMI2
>>>>             +-01.0-[11]--
>>>>             +-01.1-[02]--
>>>>             +-02.0-[04]--+-00.0 Emulex Corporation OneConnect 10Gb NIC (be3)
>>>>             |            +-00.1 Emulex Corporation OneConnect 10Gb NIC (be3)
>>>>             |            +-00.2 Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>>>>             |            \-00.3 Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>>>>             +-02.1-[12]--
>>>> Kernel only found four devices in bus 0x04, but we found following DMA errors in dmesg.
>>>>
>>>> [ 1438.477262] DRHD: handling fault status reg 402
>>>> [ 1438.498278] DMAR:[DMA Write] Request device [04:00.4] fault addr bdf70000
>>>> [ 1438.498280] DMAR:[fault reason 02] Present bit in context entry is clear
>>>> [ 1438.566458] DMAR:[DMA Write] Request device [04:00.5] fault addr bdf70000
>>>> [ 1438.566460] DMAR:[fault reason 02] Present bit in context entry is clear
>>>> [ 1438.635211] DMAR:[DMA Write] Request device [04:00.6] fault addr bdf70000
>>>> [ 1438.635213] DMAR:[fault reason 02] Present bit in context entry is clear
>>>> [ 1438.703849] DMAR:[DMA Write] Request device [04:00.7] fault addr bdf70000
>>>> [ 1438.703851] DMAR:[fault reason 02] Present bit in context entry is clear
>>>>
>>>> Signed-off-by: Yijing Wang <wangyijing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
>>>> ---
>>>> arch/x86/include/asm/iommu.h | 2 ++
>>>> arch/x86/kernel/pci-dma.c | 8 ++++++++
>>>> drivers/iommu/intel-iommu.c | 41 +++++++++++++++++++++++++++++++++++++++++
>>>> 3 files changed, 51 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
>>>> index 345c99c..5e3a2d8 100644
>>>> --- a/arch/x86/include/asm/iommu.h
>>>> +++ b/arch/x86/include/asm/iommu.h
>>>> @@ -5,6 +5,8 @@ extern struct dma_map_ops nommu_dma_ops;
>>>> extern int force_iommu, no_iommu;
>>>> extern int iommu_detected;
>>>> extern int iommu_pass_through;
>>>> +extern int iommu_pt_force_bus;
>>>> +extern int iommu_pt_force_domain;
>>>>
>>>> /* 10 seconds */
>>>> #define DMAR_OPERATION_TIMEOUT ((cycles_t) tsc_khz*10*1000)
>>>> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
>>>> index a25e202..bf21d97 100644
>>>> --- a/arch/x86/kernel/pci-dma.c
>>>> +++ b/arch/x86/kernel/pci-dma.c
>>>> @@ -44,6 +44,8 @@ int iommu_detected __read_mostly = 0;
>>>> * guests and not for driver dma translation.
>>>> */
>>>> int iommu_pass_through __read_mostly;
>>>> +int iommu_pt_force_bus = -1;
>>>> +int iommu_pt_force_domain = -1;
>>>>
>>>> extern struct iommu_table_entry __iommu_table[], __iommu_table_end[];
>>>>
>>>> @@ -146,6 +148,7 @@ void dma_generic_free_coherent(struct device *dev, size_t size, void *vaddr,
>>>> */
>>>> static __init int iommu_setup(char *p)
>>>> {
>>>> + char *end;
>>>> iommu_merge = 1;
>>>>
>>>> if (!p)
>>>> @@ -192,6 +195,11 @@ static __init int iommu_setup(char *p)
>>>> #endif
>>>> if (!strncmp(p, "pt", 2))
>>>> iommu_pass_through = 1;
>>>> + if (!strncmp(p, "pt_force=", 9)) {
>>>> + iommu_pass_through = 1;
>>>> + iommu_pt_force_domain = simple_strtol(p+9, &end, 0);
>>>> + iommu_pt_force_bus = simple_strtol(end+1, NULL, 0);
>>>
>>> Documentation/kernel-parameters.txt?
>>>
>>>> + }
>>>>
>>>> gart_parse_options(p);
>>>>
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index d1f5caa..49757f1 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -2705,6 +2705,47 @@ static int __init iommu_prepare_static_identity_mapping(int hw)
>>>> return ret;
>>>> }
>>>>
>>>> + /* We found some strange devices in HP c7000 and other platforms that
>>>> + * can not be enumerated by OS, but they did DMA read/write without
>>>> + * driver management, so we should create the pt mapping for these
>>>> + * devices to avoid DMA errors. Add iommu=pt_force=segment:busnum to
>>>> + * force to do pt context mapping in the bus number.
>>>> + */
>>>
>>> So best case with this patch is that the user needs to discover that
>>> this option exists, figure out the undocumented parameters, be running
>>> on VT-d, permanently add a kernel commandline option, and never have any
>>> intention of assigning the device to userspace or a VM...
>>>
>>> Can't we handle this with the DMA alias quirks that are now in 3.17? Or
>>> can the vendor fix this with a firmware update? This device behavior is
>>> really quite broken for this kind of server class product.
>>
>> Yeah, something doesn't sound right here.
>>
>> I would like to hear more about this configuration, off list if you prefer.
>> What servers? What firmware revisions?
>
> Hi Linda, we found this issue in HP C7000 server. I attached the dmesg and lspci info,
> because the machine is in product department, so I don't know the firmware revision.
>
> Thanks!
> Yijing.
Hi Yijing,
I still suspect the problem is with ARI support rather than
with Phantom Functions.
According to the lspci output:
1) Root port 00:02.0 has ARIFwd enabled in DevCtl2.
2) Functions 04:00.[0-3] all have the Alternative Routing-ID
   Interpretation capability.
So could you please try clearing the ARIFwd bit in DevCtl2 when
enumerating root port 00:02.0?
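
The experiment requested above amounts to masking one bit out of the root port's Device Control 2 register. A minimal user-space sketch of that bit manipulation follows. The two constants match the definitions in the Linux header include/uapi/linux/pci_regs.h; the helper names devctl2_clear_arifwd() and devctl2_arifwd_enabled() are hypothetical and not part of any kernel API. Inside the kernel, the equivalent step would probably be a call such as pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2, PCI_EXP_DEVCTL2_ARI) on the root port, but where to place that call during enumeration is left open here.

```c
#include <stdint.h>

/* Values as defined in Linux include/uapi/linux/pci_regs.h. */
#define PCI_EXP_DEVCTL2     0x28   /* Device Control 2, offset inside the PCIe capability */
#define PCI_EXP_DEVCTL2_ARI 0x0020 /* ARI Forwarding Enable, printed as ARIFwd by lspci */

/* Hypothetical helper: report whether ARI forwarding is enabled. */
static int devctl2_arifwd_enabled(uint16_t devctl2)
{
	return (devctl2 & PCI_EXP_DEVCTL2_ARI) != 0;
}

/* Hypothetical helper: return the register value with ARIFwd cleared.
 * All other DevCtl2 bits are preserved. */
static uint16_t devctl2_clear_arifwd(uint16_t devctl2)
{
	return (uint16_t)(devctl2 & ~PCI_EXP_DEVCTL2_ARI);
}
```

For a one-off test without rebuilding the kernel, the same read-modify-write could likely be done from user space with setpci on 00:02.0, although the kernel may re-enable the bit if the port is rescanned.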
BTW, do functions 04:00.[0-3] show any problems other than the
IOMMU warnings?
Thanks!
Gerry
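
For background on why ARI matters here, the sketch below shows how the same 16-bit Requester ID splits under the conventional and the ARI interpretation. The layout (bus in the upper 8 bits, then either a 5-bit device plus 3-bit function, or an 8-bit function with an implied device 0) follows the PCIe definition of ARI; the struct and function names are hypothetical and used only for illustration.

```c
#include <stdint.h>

/* Hypothetical container for a decoded Requester ID. */
struct rid_fields {
	uint8_t bus;
	uint8_t dev;
	uint8_t fn;
};

/* Decode a Requester ID.
 * ari == 0: conventional split, devfn = 5-bit device + 3-bit function.
 * ari != 0: ARI split, device is implicitly 0 and function uses all 8 bits. */
static struct rid_fields rid_decode(uint16_t rid, int ari)
{
	struct rid_fields f;
	uint8_t devfn = (uint8_t)(rid & 0xff);

	f.bus = (uint8_t)(rid >> 8);
	if (ari) {
		f.dev = 0;
		f.fn = devfn;
	} else {
		f.dev = (uint8_t)(devfn >> 3);
		f.fn = (uint8_t)(devfn & 0x7);
	}
	return f;
}
```

The faulting IDs in the log, 04:00.4 through 04:00.7, correspond to devfn values 0x04 to 0x07 and decode the same way under both schemes. The two interpretations only diverge from devfn 0x08 upward, where the conventional split yields device 1, function 0 and the ARI split yields function 8.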
>
>
>>>
>>>> + if (iommu_pt_force_domain >= 0 && iommu_pt_force_bus >= 0) {
>>>> + int found = 0;
>>>> +
>>>> + iommu = NULL;
>>>> + for_each_active_iommu(iommu, drhd) {
>>>> + if (iommu_pt_force_domain != drhd->segment)
>>>> + continue;
>>>> +
>>>> + for_each_active_dev_scope(drhd->devices, drhd->devices_cnt, i, dev) {
>>>> + if (!dev_is_pci(dev))
>>>> + continue;
>>>> +
>>>> + pdev = to_pci_dev(dev);
>>>> + if (pdev->bus->number == iommu_pt_force_bus ||
>>>> + (pdev->subordinate
>>>> + && pdev->subordinate->number <= iommu_pt_force_bus
>>>> + && pdev->subordinate->busn_res.end >= iommu_pt_force_bus)) {
>>>> + found = 1;
>>>> + break;
>>>> + }
>>>> + }
>>>> +
>>>> + if (drhd->include_all) {
>>>> + found = 1;
>>>> + break;
>>>> + }
>>>> + }
>>>> +
>>>> + if (found && iommu)
>>>> + for (i = 0; i < 256; i++)
>>>> + domain_context_mapping_one(si_domain, iommu, iommu_pt_force_bus,
>>>> + i, hw ? CONTEXT_TT_PASS_THROUGH :
>>>> + CONTEXT_TT_MULTI_LEVEL);
>>>> + }
>>>> +
>>>> return 0;
>>>> }
>>>>