Re: [PATCH 3/4] Intel pci: Limit dmar_init_reserved_ranges

From: Mike Habeck <habeck@sgi.com>
To: Mike Travis <travis@sgi.com>
Cc: Chris Wright <chrisw@sous-sol.org>,
	David Woodhouse <dwmw2@infradead.org>,
	Jesse Barnes <jbarnes@virtuousgeek.org>,
	iommu@lists.linux-foundation.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] Intel pci: Limit dmar_init_reserved_ranges
Date: Thu, 31 Mar 2011 18:40:57 -0500
Message-ID: <4D951109.1040707@sgi.com>
In-Reply-To: <4D950D52.8080100@sgi.com>

On 03/31/2011 06:25 PM, Mike Travis wrote:
> I'll probably need help from our Hardware PCI Engineer to help explain
> this further, though here's a pointer to an earlier email thread:
>
> http://marc.info/?l=linux-kernel&m=129259816925973&w=2
>
> I'll also dig out the specs you're asking for.
>
> Thanks,
> Mike
>
> Chris Wright wrote:
>> * Mike Travis (travis@sgi.com) wrote:
>>> Chris - did you have any comment on this patch?
>>
>> It doesn't actually look right to me. It means that particular range
>> is no longer reserved. But perhaps I've misunderstood something.
>>
>>> Mike Travis wrote:
>>>> dmar_init_reserved_ranges() reserves the card's MMIO ranges to
>>>> prevent handing out a DMA map that would overlap with the MMIO range.
>>>> The problem is that while the Nvidia GPU has 64bit BARs and is capable
>>>> of receiving > 40bit PIOs, it can't generate > 40bit DMAs.
>>
>> I don't understand what you mean here.

What Mike is getting at is that there is no reason to reserve the MMIO
range if it's greater than the dma_mask, given that the MMIO range is
outside of what the IOVA code will ever hand back to the IOMMU
code.  In this case the Nvidia card has a 64bit BAR and is assigned
the MMIO range [0xf8200000000 - 0xf820fffffff].  But the Nvidia
card can only generate a 40bit DMA (thus has a 40bit dma_mask).  If
the IOVA code honors the limit_pfn (i.e., dma_mask) passed in, it
will never hand a >40bit address back to the IOMMU code.  Thus
there is no reason to reserve the card's MMIO range if it is greater
than the dma_mask.  (And that is what the patch is doing.)
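To make the comparison concrete, here is a minimal sketch of the check
described above.  It is not the patch itself: dma_bit_mask() and
mmio_needs_reservation() are hypothetical names made up for
illustration, and only the addresses and the 40bit mask come from this
thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a kernel API): build an n-bit DMA mask. */
static inline uint64_t dma_bit_mask(unsigned int bits)
{
	return bits >= 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * Hypothetical helper: an MMIO range only needs to be reserved in the
 * IOVA space if it starts at or below the highest address the device
 * can generate.  A range that starts above dma_mask can never overlap
 * an IOVA, provided the allocator honors the limit.
 */
static inline bool mmio_needs_reservation(uint64_t start, uint64_t dma_mask)
{
	return start <= dma_mask;
}
```

With a 40bit mask, Region 1 at 0xf8200000000 falls outside the
reachable space, while Region 0 at 0x92000000 is still inside it and
must stay reserved.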

More below...

>>
>>>> So when the iommu code reserves these MMIO ranges a > 40bit
>>>> entry ends up getting in the rbtree. On a UV test system with
>>>> the Nvidia cards, the BARs are:
>>>>
>>>> 0001:36:00.0 VGA compatible controller: nVidia Corporation GT200GL
>>>> Region 0: Memory at 92000000 (32-bit, non-prefetchable) [size=16M]
>>>> Region 1: Memory at f8200000000 (64-bit, prefetchable) [size=256M]
>>>> Region 3: Memory at 90000000 (64-bit, non-prefetchable) [size=32M]
>>>>
>>>> So this 44bit MMIO address 0xf8200000000 ends up in the rbtree. As DMA
>>>> maps get added and deleted from the rbtree we can end up getting a cached
>>>> entry to this 0xf8200000000 entry... this is what results in the code
>>>> handing out the invalid DMA map of 0xf81fffff000:
>>>>
>>>> [ (0xf8200000000 - 1) >> PAGE_SHIFT << PAGE_SHIFT ]
>>>>
>>>> The IOVA code needs to better honor the "limit_pfn" when allocating
>>>> these maps.
>>
>> This means we could get the MMIO address range (it's no longer reserved).

Not true.  The MMIO address is greater than the dma_mask (i.e., the
limit_pfn passed into alloc_iova()), thus the IOVA code will never
hand back that address range.

>> It seems to me the DMA transaction would then become a peer to peer
>> transaction if ACS is not enabled, which could show up as a random
>> register write in that GPU's 256M BAR (i.e. broken).
>>
>> The iova allocation should not hand out an address bigger than the
>> dma_mask. What is the device's dma_mask?

Agree.  But there is a bug.  The IOVA code doesn't validate the
limit_pfn if it uses the cached entry.  One could argue that it should
validate the limit_pfn, but then again an entry outside the limit_pfn
should never have gotten into the rbtree...  (it got in due to the
IOMMU's dmar_init_reserved_ranges() adding it).
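
The arithmetic behind the bad map can be sketched as below.  This is a
toy model under stated assumptions, not the kernel's allocator: 4K
pages are assumed, and pfn_below() and alloc_start_pfn() are
hypothetical names.  It only shows where a top-down search would start
when it trusts a cached reserved entry, with and without checking
limit_pfn.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4K pages */

/* Topmost page frame just below a reserved range starting at 'start'. */
static inline uint64_t pfn_below(uint64_t start)
{
	return (start - 1) >> PAGE_SHIFT;
}

/*
 * Toy model: pick the pfn where a top-down search begins when the
 * cached entry is a reserved range starting at 'cached_start'.
 * With validate == 0 the cached entry is trusted blindly; with
 * validate != 0 the result is clamped to limit_pfn.
 */
static inline uint64_t alloc_start_pfn(uint64_t cached_start,
				       uint64_t limit_pfn, int validate)
{
	uint64_t pfn = pfn_below(cached_start);

	if (validate && pfn > limit_pfn)
		pfn = limit_pfn;
	return pfn;
}
```

For the BAR at 0xf8200000000 and a 40bit mask (limit_pfn 0xfffffff),
the unvalidated path starts at pfn 0xf81fffff, i.e. address
0xf81fffff000, which is the invalid DMA map seen on the UV system.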

-mike

>>
>> thanks,
>> -chris

Thread overview: 27+ messages
2011-03-29 23:36 [PATCH 0/4] pci: Speed up processing of IOMMU related functions Mike Travis
2011-03-29 23:36 ` [PATCH 1/4] Intel pci: Remove Host Bridge devices from identity mapping Mike Travis
2011-03-30 17:51   ` Chris Wright
2011-03-30 18:30     ` Mike Travis
2011-03-30 19:15       ` Chris Wright
2011-03-30 19:25         ` Mike Travis
2011-03-30 19:57           ` Chris Wright
2011-03-29 23:36 ` [PATCH 2/4] Intel iommu: Speed up processing of the identity_mapping function Mike Travis
2011-03-30 19:19   ` Chris Wright
2011-03-30 19:29     ` Mike Travis
2011-03-31  0:33     ` [PATCH 2/4] Intel iommu: Speed up processing of the identity_mapping function v2 Mike Travis
2011-03-29 23:36 ` [PATCH 3/4] Intel pci: Limit dmar_init_reserved_ranges Mike Travis
2011-03-31 22:11   ` Mike Travis
2011-03-31 22:53     ` Chris Wright
2011-03-31 23:25       ` Mike Travis
2011-03-31 23:40         ` Mike Habeck [this message]
2011-03-31 23:56           ` Chris Wright
2011-04-01  1:05             ` Mike Habeck
2011-04-02  0:32               ` [PATCH 3/4 v2] intel-iommu: don't cache iova above 32bit caching boundary Chris Wright
2011-04-06  0:39                 ` [PATCH 3/4 v3] " Chris Wright
2011-03-31 23:39       ` [PATCH 3/4] Intel pci: Limit dmar_init_reserved_ranges Chris Wright
2011-03-29 23:36 ` [PATCH 4/4] Intel pci: Use coherent DMA mask when requested Mike Travis
2011-03-30 18:02   ` Chris Wright
2011-04-01  2:57     ` FUJITA Tomonori
2011-04-07 19:47 ` [PATCH 1/4] Intel pci: Remove Host Bridge devices from identity mapping Mike Travis
2011-04-07 19:51 ` [PATCH 2/4] Intel iommu: Speed up processing of the identity_mapping function Mike Travis
2011-04-07 19:52 ` [PATCH 4/4] Intel pci: Use coherent DMA mask when requested Mike Travis
