iommu.lists.linux-foundation.org archive mirror
From: Jiang Liu <jiang.liu-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
To: Yijing Wang <wangyijing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
	Linda Knippers <linda.knippers-VXdhtT5mjnY@public.gmane.org>,
	Alex Williamson
	<alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: huaxiuxiu-hv44wF8Li93QT0dZR+AlfA@public.gmane.org,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	David Woodhouse <dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Subject: Re: [PATCH] iommu/vt-d: Fix broken device issue when using iommu=pt
Date: Tue, 12 Aug 2014 10:34:30 +0800
Message-ID: <53E97D36.8010403@linux.intel.com>
In-Reply-To: <53E96FE0.7080600-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>



On 2014/8/12 9:37, Yijing Wang wrote:
> On 2014/8/11 22:59, Linda Knippers wrote:
>> On 8/11/2014 12:43 AM, Alex Williamson wrote:
>>> On Mon, 2014-08-11 at 10:54 +0800, Yijing Wang wrote:
>>>> We found some strange devices on HP C7000 and Huawei servers. The OS cannot
>>>> enumerate these devices, yet they still issue DMA reads and writes without
>>>> any OS management. Because the IOMMU creates no DMA mappings for them,
>>>> the IOMMU hardware blocks their DMA reads and writes.
>>>>
>>>> E.g.:
>>>>  \-[0000:00]-+-00.0  Intel Corporation Xeon E5/Core i7 DMI2
>>>>              +-01.0-[11]--
>>>>              +-01.1-[02]--
>>>>              +-02.0-[04]--+-00.0  Emulex Corporation OneConnect 10Gb NIC (be3)
>>>>              |            +-00.1  Emulex Corporation OneConnect 10Gb NIC (be3)
>>>>              |            +-00.2  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>>>>              |            \-00.3  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>>>>              +-02.1-[12]--
>>>> The kernel found only four devices on bus 0x04, but dmesg shows the following DMA errors:
>>>>
>>>> [ 1438.477262] DRHD: handling fault status reg 402
>>>> [ 1438.498278] DMAR:[DMA Write] Request device [04:00.4] fault addr bdf70000 
>>>> [ 1438.498280] DMAR:[fault reason 02] Present bit in context entry is clear
>>>> [ 1438.566458] DMAR:[DMA Write] Request device [04:00.5] fault addr bdf70000 
>>>> [ 1438.566460] DMAR:[fault reason 02] Present bit in context entry is clear
>>>> [ 1438.635211] DMAR:[DMA Write] Request device [04:00.6] fault addr bdf70000 
>>>> [ 1438.635213] DMAR:[fault reason 02] Present bit in context entry is clear
>>>> [ 1438.703849] DMAR:[DMA Write] Request device [04:00.7] fault addr bdf70000 
>>>> [ 1438.703851] DMAR:[fault reason 02] Present bit in context entry is clear
>>>>
>>>> Signed-off-by: Yijing Wang <wangyijing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
>>>> ---
>>>>  arch/x86/include/asm/iommu.h |    2 ++
>>>>  arch/x86/kernel/pci-dma.c    |    8 ++++++++
>>>>  drivers/iommu/intel-iommu.c  |   41 +++++++++++++++++++++++++++++++++++++++++
>>>>  3 files changed, 51 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
>>>> index 345c99c..5e3a2d8 100644
>>>> --- a/arch/x86/include/asm/iommu.h
>>>> +++ b/arch/x86/include/asm/iommu.h
>>>> @@ -5,6 +5,8 @@ extern struct dma_map_ops nommu_dma_ops;
>>>>  extern int force_iommu, no_iommu;
>>>>  extern int iommu_detected;
>>>>  extern int iommu_pass_through;
>>>> +extern int iommu_pt_force_bus;
>>>> +extern int iommu_pt_force_domain;
>>>>  
>>>>  /* 10 seconds */
>>>>  #define DMAR_OPERATION_TIMEOUT ((cycles_t) tsc_khz*10*1000)
>>>> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
>>>> index a25e202..bf21d97 100644
>>>> --- a/arch/x86/kernel/pci-dma.c
>>>> +++ b/arch/x86/kernel/pci-dma.c
>>>> @@ -44,6 +44,8 @@ int iommu_detected __read_mostly = 0;
>>>>   * guests and not for driver dma translation.
>>>>   */
>>>>  int iommu_pass_through __read_mostly;
>>>> +int iommu_pt_force_bus = -1;
>>>> +int iommu_pt_force_domain = -1;
>>>>  
>>>>  extern struct iommu_table_entry __iommu_table[], __iommu_table_end[];
>>>>  
>>>> @@ -146,6 +148,7 @@ void dma_generic_free_coherent(struct device *dev, size_t size, void *vaddr,
>>>>   */
>>>>  static __init int iommu_setup(char *p)
>>>>  {
>>>> +	char *end;
>>>>  	iommu_merge = 1;
>>>>  
>>>>  	if (!p)
>>>> @@ -192,6 +195,11 @@ static __init int iommu_setup(char *p)
>>>>  #endif
>>>>  		if (!strncmp(p, "pt", 2))
>>>>  			iommu_pass_through = 1;
>>>> +		if (!strncmp(p, "pt_force=", 9)) {
>>>> +			iommu_pass_through = 1;
>>>> +			iommu_pt_force_domain = simple_strtol(p+9, &end, 0);
>>>> +			iommu_pt_force_bus = simple_strtol(end+1, NULL, 0);
>>>
>>> Documentation/kernel-parameters.txt?
>>>
>>>> +		}
>>>>  
>>>>  		gart_parse_options(p);
>>>>  
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index d1f5caa..49757f1 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -2705,6 +2705,47 @@ static int __init iommu_prepare_static_identity_mapping(int hw)
>>>>  				return ret;
>>>>  		}
>>>>  
>>>> +	/* Some devices on the HP c7000 and other platforms cannot be
>>>> +	 * enumerated by the OS, yet they issue DMA reads and writes without
>>>> +	 * driver management, so create pt mappings for them to avoid DMA
>>>> +	 * errors. Booting with iommu=pt_force=segment:busnum forces pt
>>>> +	 * context mappings for every devfn on the given bus number.
>>>> +	 */
>>>
>>> So best case with this patch is that the user needs to discover that
>>> this option exists, figure out the undocumented parameters, be running
>>> on VT-d, permanently add a kernel commandline option, and never have any
>>> intention of assigning the device to userspace or a VM...
>>>
>>> Can't we handle this with the DMA alias quirks that are now in 3.17?  Or
>>> can the vendor fix this with a firmware update?  This device behavior is
>>> really quite broken for this kind of server class product.  
>>
>> Yeah, something doesn't sound right here.
>>
>> I would like to hear more about this configuration, off list if you prefer.
>> What servers?  What firmware revisions?
> 
> Hi Linda, we found this issue on an HP C7000 server. I have attached the dmesg and lspci
> output. The machine belongs to the product department, so I don't know the firmware revision.
> 
> Thanks!
> Yijing.
Hi Yijing,
	I still suspect that something is wrong with ARI support
rather than with Phantom Functions.
	According to the lspci output:
1) Root port 00:02.0 has ARIFwd enabled in DevCtl2.
2) Functions 04:00.[0-3] all have the Alternative Routing-ID
   Interpretation (ARI) capability.
So could you please try clearing the ARIFwd bit in DevCtl2 when
enumerating root port 00:02.0?
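
A rough sketch of what that experiment touches, not a tested patch. In the
kernel the likely equivalent is a call such as
pcie_capability_clear_word(bridge, PCI_EXP_DEVCTL2, PCI_EXP_DEVCTL2_ARI) on
the root port. The standalone C below only models the register arithmetic and
how the 8-bit devfn is read with and without ARI. DEVCTL2_ARI_FWD and all
helper names are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 5 of PCIe Device Control 2 is "ARI Forwarding Enable". Linux calls
 * the mask PCI_EXP_DEVCTL2_ARI; it is redefined here under an assumed name
 * so the sketch builds outside the kernel. */
#define DEVCTL2_ARI_FWD 0x0020u

/* Return the DevCtl2 value with ARI forwarding switched off. */
static inline uint16_t devctl2_clear_ari(uint16_t devctl2)
{
	return (uint16_t)(devctl2 & ~DEVCTL2_ARI_FWD);
}

/* Without ARI, devfn holds a 5-bit device number and a 3-bit function. */
static inline unsigned int devfn_slot(uint8_t devfn) { return devfn >> 3; }
static inline unsigned int devfn_func(uint8_t devfn) { return devfn & 0x07; }

/* With ARI, the device number is implicitly 0 and all 8 bits of devfn
 * form the function number. */
static inline unsigned int ari_func(uint8_t devfn) { return devfn; }

/* Hypothetical self-check using the faulting requester 04:00.4 (devfn 0x04). */
static inline void ari_self_check(void)
{
	assert(devctl2_clear_ari(0x0020) == 0x0000);
	assert(devfn_slot(0x04) == 0 && devfn_func(0x04) == 4);
	assert(ari_func(0x04) == 4);
}
```

For device 0 both readings of devfn agree, so the faulting requester IDs
04:00.[4-7] alone do not tell whether ARI is involved. Hence the request to
test with ARIFwd cleared.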

BTW, do functions 04:00.[0-3] show any other problems besides the
IOMMU fault messages?

Thanks!
Gerry

> 
> 
>>>
>>>> +	if (iommu_pt_force_domain >= 0 && iommu_pt_force_bus >= 0) {
>>>> +		int found = 0;
>>>> +
>>>> +		iommu = NULL;
>>>> +		for_each_active_iommu(iommu, drhd) {
>>>> +			if (iommu_pt_force_domain != drhd->segment)
>>>> +				continue;
>>>> +
>>>> +			for_each_active_dev_scope(drhd->devices, drhd->devices_cnt, i, dev) {
>>>> +				if (!dev_is_pci(dev))
>>>> +					continue;
>>>> +
>>>> +				pdev = to_pci_dev(dev);
>>>> +				if (pdev->bus->number == iommu_pt_force_bus ||
>>>> +						(pdev->subordinate
>>>> +						 && pdev->subordinate->number <= iommu_pt_force_bus
>>>> +						 && pdev->subordinate->busn_res.end >= iommu_pt_force_bus)) {
>>>> +					found = 1;
>>>> +					break;
>>>> +				}
>>>> +			}
>>>> +
>>>> +			if (drhd->include_all) {
>>>> +				found = 1;
>>>> +				break;
>>>> +			}
>>>> +		}
>>>> +
>>>> +		if (found && iommu)
>>>> +			for (i = 0; i < 256; i++)
>>>> +				domain_context_mapping_one(si_domain, iommu, iommu_pt_force_bus,
>>>> +						i,  hw ? CONTEXT_TT_PASS_THROUGH :
>>>> +						CONTEXT_TT_MULTI_LEVEL);
>>>> +	}
>>>> +
>>>>  	return 0;
>>>>  }
>>>>  

