public inbox for iommu@lists.linux-foundation.org
 help / color / mirror / Atom feed
From: Lu Baolu <baolu.lu@linux.intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: kevin.tian@intel.com, ashok.raj@intel.com,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	jacob.jun.pan@intel.com, David Woodhouse <dwmw2@infradead.org>
Subject: Re: [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication
Date: Sat, 20 Jul 2019 09:15:58 +0800	[thread overview]
Message-ID: <13294960-c040-c501-c279-aa61d780d25e@linux.intel.com> (raw)
In-Reply-To: <20190719091952.58255c47@x1.home>

Hi Alex,

On 7/19/19 11:19 PM, Alex Williamson wrote:
> On Fri, 19 Jul 2019 16:27:04 +0800
> Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
>> Hi Alex,
>>
>> On 7/19/19 7:16 AM, Alex Williamson wrote:
>>> On Wed, 12 Jun 2019 08:28:51 +0800
>>> Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>>    
>>>> The domain_init() and md_domain_init() do almost the same job.
>>>> Consolidate them to avoid duplication.
>>>>
>>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>>> ---
>>>>    drivers/iommu/intel-iommu.c | 123 +++++++++++-------------------------
>>>>    1 file changed, 36 insertions(+), 87 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 5215dcd535a1..b8c6cf1d5f90 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -1825,63 +1825,6 @@ static inline int guestwidth_to_adjustwidth(int gaw)
>>>>    	return agaw;
>>>>    }
>>>>    
>>>> -static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>>>> -		       int guest_width)
>>>> -{
>>>> -	int adjust_width, agaw;
>>>> -	unsigned long sagaw;
>>>> -	int err;
>>>> -
>>>> -	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
>>>> -
>>>> -	err = init_iova_flush_queue(&domain->iovad,
>>>> -				    iommu_flush_iova, iova_entry_free);
>>>> -	if (err)
>>>> -		return err;
>>>> -
>>>> -	domain_reserve_special_ranges(domain);
>>>> -
>>>> -	/* calculate AGAW */
>>>> -	if (guest_width > cap_mgaw(iommu->cap))
>>>> -		guest_width = cap_mgaw(iommu->cap);
>>>> -	domain->gaw = guest_width;
>>>> -	adjust_width = guestwidth_to_adjustwidth(guest_width);
>>>> -	agaw = width_to_agaw(adjust_width);
>>>> -	sagaw = cap_sagaw(iommu->cap);
>>>> -	if (!test_bit(agaw, &sagaw)) {
>>>> -		/* hardware doesn't support it, choose a bigger one */
>>>> -		pr_debug("Hardware doesn't support agaw %d\n", agaw);
>>>> -		agaw = find_next_bit(&sagaw, 5, agaw);
>>>> -		if (agaw >= 5)
>>>> -			return -ENODEV;
>>>> -	}
>>>> -	domain->agaw = agaw;
>>>> -
>>>> -	if (ecap_coherent(iommu->ecap))
>>>> -		domain->iommu_coherency = 1;
>>>> -	else
>>>> -		domain->iommu_coherency = 0;
>>>> -
>>>> -	if (ecap_sc_support(iommu->ecap))
>>>> -		domain->iommu_snooping = 1;
>>>> -	else
>>>> -		domain->iommu_snooping = 0;
>>>> -
>>>> -	if (intel_iommu_superpage)
>>>> -		domain->iommu_superpage = fls(cap_super_page_val(iommu->cap));
>>>> -	else
>>>> -		domain->iommu_superpage = 0;
>>>> -
>>>> -	domain->nid = iommu->node;
>>>> -
>>>> -	/* always allocate the top pgd */
>>>> -	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>>>> -	if (!domain->pgd)
>>>> -		return -ENOMEM;
>>>> -	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
>>>> -	return 0;
>>>> -}
>>>> -
>>>>    static void domain_exit(struct dmar_domain *domain)
>>>>    {
>>>>    	struct page *freelist;
>>>> @@ -2563,6 +2506,31 @@ static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>>>>    	return 0;
>>>>    }
>>>>    
>>>> +static int domain_init(struct dmar_domain *domain, int guest_width)
>>>> +{
>>>> +	int adjust_width;
>>>> +
>>>> +	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
>>>> +	domain_reserve_special_ranges(domain);
>>>> +
>>>> +	/* calculate AGAW */
>>>> +	domain->gaw = guest_width;
>>>> +	adjust_width = guestwidth_to_adjustwidth(guest_width);
>>>> +	domain->agaw = width_to_agaw(adjust_width);
>>>
>>>
>>> How do we justify that domain->agaw is nothing like it was previously
>>> here?  I spent some more time working on the failure to boot that I
>>> thought was caused by 4ec066c7b147, but there are so many breakages and
>>> fixes in Joerg's x86/vt-d branch that I think my bisect zero'd in on
>>> the wrong one.  Instead I cherry-picked every commit from Joerg's tree
>>> and matched Fixes patches to their original commit, which led me to
>>> this patch, mainline commit 123b2ffc376e.  The issue I'm seeing is that
>>> we call domain_context_mapping_one() and we are in this section:
>>>
>>>           struct dma_pte *pgd = domain->pgd;
>>>           int agaw;
>>>
>>>           context_set_domain_id(context, did);
>>>
>>>           if (translation != CONTEXT_TT_PASS_THROUGH) {
>>>                   /*
>>>                    * Skip top levels of page tables for iommu which has
>>>                    * less agaw than default. Unnecessary for PT mode.
>>>                    */
>>>                   for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
>>>                           ret = -ENOMEM;
>>>                           pgd = phys_to_virt(dma_pte_addr(pgd));
>>>                           if (!dma_pte_present(pgd))
>>>                                   goto out_unlock;
>>>                   }
>>>
>>> Prior to this commit, we had domain->agaw=1 and iommu->agaw=1, so we
>>> don't enter the loop.  With this commit, we have domain->agaw=3,
>>> iommu->agaw=1 with pgd->val=0!
>>>    
>>
>> iommu->agaw presents the level of page table which is by default
>> supported by the @iommu. domain->agaw presents the level of page
>> table which is used by the @domain.
>>
>> agaw = 1: 3-level page table
>> agaw = 2: 4-level page table
>> agaw = 3: 5-level page table
>>
>> The case here is that @iommu only supports 3-level page table, but the
>> @domain was set to use 5-level page table. So we must skip level 4 and 5
>> page tables of the @domain.
>>
>> This code in the loop looks odd to me. It will always goto to unlock and
>> leave pgd->val==0. How about below change? (not tested yet!)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 412f18aba501..98d6878cd29d 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -2020,11 +2020,15 @@ static int domain_context_mapping_one(struct
>> dmar_domain *domain,
>>                            * Skip top levels of page tables for iommu
>> which has
>>                            * less agaw than default. Unnecessary for PT mode.
>>                            */
>> -                       for (agaw = domain->agaw; agaw > iommu->agaw;
>> agaw--) {
>> -                               ret = -ENOMEM;
>> -                               pgd = phys_to_virt(dma_pte_addr(pgd));
>> -                               if (!dma_pte_present(pgd))
>> -                                       goto out_unlock;
>> +                       while (iommu->agaw < domain->agaw) {
>> +                               struct dma_pte *pte;
>> +
>> +                               pte = domain->pgd;
>> +                               if (dma_pte_present(pte)) {
>> +                                       domain->pgd =
>> phys_to_virt(dma_pte_addr(pte));
>> +                                       free_pgtable_page(pte);
>> +                               }
>> +                               domain->agaw--;
>>                           }
>>
>>                           info = iommu_support_dev_iotlb(domain, iommu,
>> bus, devfn);
> 
> 
> The system still fails to boot, now with DMAR faults indicating invalid
> context entry.

Ah! It demands more investigation. Thank you for testing this.

>   
>>> I don't really follow how the setting of these fields above is
>>> equivalent to what they were previously or if they're supposed to be
>>> updated lazily, but the current behavior is non-functional.  Commit
>>> 123b2ffc376e can be reverted with only a bit of offset, which brings
>>> Linus' tree back into working operation for me.  Should we revert or is
>>> there an obvious fix here?  Thanks,
>>
>> This commit is not the root cause of this issue as far as I can see, it
>> only triggers above loop to get entered.
> 
> The system fails to boot with this commit, it does boot without this
> commit.  Whether this commit exposed a latent issue or introduced a bug
> itself is academic.  Thanks,

Okay. I agree wit you. Let's revert this commit first.

Best regards,
Baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  reply	other threads:[~2019-07-20  1:16 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
2019-06-12  0:28 ` [PATCH v2 1/7] iommu/vt-d: Don't return error when device gets right domain Lu Baolu
2019-06-12  0:28 ` [PATCH v2 2/7] iommu/vt-d: Set domain type for a private domain Lu Baolu
2019-06-12  0:28 ` [PATCH v2 3/7] iommu/vt-d: Don't enable iommu's which have been ignored Lu Baolu
2019-06-12  0:28 ` [PATCH v2 4/7] iommu/vt-d: Allow DMA domain attaching to rmrr locked device Lu Baolu
2019-06-12  0:28 ` [PATCH v2 5/7] iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices() Lu Baolu
2022-06-29 13:03   ` Robin Murphy
2022-07-01  7:19     ` Baolu Lu
2022-07-01  8:18       ` Robin Murphy
2019-06-12  0:28 ` [PATCH v2 6/7] iommu/vt-d: Cleanup after delegating DMA domain to generic iommu Lu Baolu
2019-06-12  0:28 ` [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication Lu Baolu
2019-07-18 23:16   ` Alex Williamson
2019-07-19  8:27     ` Lu Baolu
2019-07-19 15:19       ` Alex Williamson
2019-07-20  1:15         ` Lu Baolu [this message]
2019-07-22 14:22           ` Joerg Roedel
2019-07-25  1:41             ` Lu Baolu
2019-06-12  8:37 ` [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Joerg Roedel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=13294960-c040-c501-c279-aa61d780d25e@linux.intel.com \
    --to=baolu.lu@linux.intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=dwmw2@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox