Re: [RFC PATCH v2] Utilize the PCI API in the TTM framework.


From: Thomas Hellstrom <thomas@shipmail.org>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: dri-devel@lists.freedesktop.org, airlied@linux.ie,
	linux-kernel@vger.kernel.org, konrad@darnok.org
Subject: Re: [RFC PATCH v2] Utilize the PCI API in the TTM framework.
Date: Mon, 10 Jan 2011 21:50:03 +0100
Message-ID: <4D2B70FB.3000504@shipmail.org>
In-Reply-To: <20110110164519.GA27066@dumpdata.com>

On 01/10/2011 05:45 PM, Konrad Rzeszutek Wilk wrote:
> .. snip ..
>    
>>>> 2) What about accounting? In a *non-Xen* environment, will the
>>>> number of coherent pages be less than the number of DMA32 pages, or
>>>> will dma_alloc_coherent just translate into a alloc_page(GFP_DMA32)?
>>>>          
>>> The code in the IOMMUs ends up calling __get_free_pages, which ends up
>>> in alloc_pages. So the call does end up in alloc_page(flags).
>>>
>>>
>>> native SWIOTLB (so no IOMMU): GFP_DMA32
>>> GART (AMD's old IOMMU): GFP_DMA32
>>>
>>> For the hardware IOMMUs:
>>>
>>> AMD VI: if it is in Passthrough mode, it calls it with GFP_DMA32.
>>>     If it is in DMA translation mode (normal mode) it allocates a page
>>>     with GFP_ZERO | ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32) and immediately
>>>     translates the bus address.
>>>
>>> The flags change a bit:
>>> VT-d: if there is no identity mapping, and the PCI device is not one of the
>>>     special ones (GFX, Azalia), then it will pass it with GFP_DMA32.
>>>     If it is in identity mapping state, and the device is a GFX or Azalia sound
>>>     card, then it will apply ~(__GFP_DMA | GFP_DMA32) and immediately translate
>>>     the bus address.
>>>
>>> However, the interesting thing is that I've passed in the 'NULL' as
>>> the struct device (not intentionally - did not want to add more changes
>>> to the API) so all of the IOMMUs end up doing GFP_DMA32.
>>>
>>> But it does mess up the accounting with AMD-Vi and VT-d, as they strip
>>> the __GFP_DMA32 flag off. That is a big problem, I presume?
>>>        
>> Actually, I don't think it's a big problem. TTM allows a small
>> discrepancy between allocated pages and accounted pages to be able
>> to account on actual allocation result. IIRC, this means that a
>> DMA32 page will always be accounted as such, or at least we can make
>> it behave that way. As long as the device can always handle the
>> page, we should be fine.
>>      
> Excellent.
>    
>>      
>>>> 3) Same as above, but in a Xen environment, what will stop multiple
>>>> guests from exhausting the coherent pages? It seems that the TTM
>>>> accounting mechanisms will no longer be valid unless the number of
>>>> available coherent pages is split across the guests?
>>>>          
>>> Say I pass in four ATI Radeon cards (wherein each is a 32-bit card) to
>>> four guests. Let's also assume that we are doing heavy operations in all
>>> of the guests.  Since there is no communication between the TTM
>>> accounting in each guest, you could end up eating all of the 4GB physical
>>> memory that is available to each guest. It could end up that the first
>>> guest gets the lion's share of the 4GB memory, while the other ones get
>>> less.
>>>
>>> And if one was to do that on baremetal, with four ATI Radeon cards, the
>>> TTM accounting mechanism would realize it is nearing the watermark
>>> and do.. something, right? What would it do actually?
>>>
>>> I think the error path would be the same in both cases?
>>>        
>> Not really. The really dangerous situation is if TTM is allowed to
>> exhaust all GFP_KERNEL memory. Then any application or kernel task
>>      
> Ok, since GFP_KERNEL does not contain the GFP_DMA32 flag then
> this should be OK?
>    

No. Unless I'm missing something, on a machine with 4GB or less, GFP_DMA32
and GFP_KERNEL are allocated from the same pool of pages?
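
To make the point concrete, here is a toy model (not kernel code) of how
RAM is split between DMA32-capable pages and the rest. It assumes RAM is
contiguous from physical address 0, that ZONE_DMA32 covers everything
below 4GB, and it ignores the small ZONE_DMA and any PCI hole remapping.
The names zone_sizes and gfp_kernel_shares_dma32_pool are made up for
this sketch.

```python
# Toy model of the x86-64 zone split; a simplification, not kernel code.
GIB = 1 << 30
DMA32_LIMIT = 4 * GIB  # assumption: ZONE_DMA32 covers addresses below 4GB


def zone_sizes(ram_bytes):
    """Return (dma32_bytes, normal_bytes) for RAM contiguous from address 0."""
    dma32 = min(ram_bytes, DMA32_LIMIT)
    normal = ram_bytes - dma32
    return dma32, normal


def gfp_kernel_shares_dma32_pool(ram_bytes):
    """True when there is no memory above 4GB, so a GFP_KERNEL allocation
    can only be satisfied from the same pages a GFP_DMA32 allocation uses."""
    _, normal = zone_sizes(ram_bytes)
    return normal == 0
```

Under these assumptions a 4GB machine has nothing above the DMA32 limit,
so exhausting DMA32 pages also exhausts what GFP_KERNEL can use. Only
with more than 4GB does a separate pool exist.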

>
>> What *might* be possible, however, is that the GFP_KERNEL memory on
>> the host gets exhausted due to extensive TTM allocations in the
>> guest, but I guess that's a problem for XEN to resolve, not TTM.
>>      
> Hmm. I think I am missing something here. The GFP_KERNEL is any memory
> and the GFP_DMA32 is memory from the ZONE_DMA32. When we do start
> using the PCI-API, what happens underneath (so under Linux) is that
> "real PFNs" (Machine Frame Numbers) which are above the 0x100000 mark
> get swizzled in for the guest's PFNs (this is for the PCI devices
> that have the dma_mask set to 32bit). However, that is a Xen MMU
> accounting issue.
>    


So I was under the impression that when you allocate coherent memory in 
the guest, the physical page comes from DMA32 memory in the host. On a 
4GB machine or less, that would be the same as kernel memory. Now, if 4 
guests think they can allocate 2GB of coherent memory each, you might 
run out of kernel memory on the host?
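
The arithmetic behind that worry can be sketched as follows. This is a
toy model only: it assumes every guest's coherent allocation is backed
one-to-one by the same host DMA32 pool, that each guest enforces only
its own local limit, and that guests are served first come, first
served. The function allocate_greedy is hypothetical and does not
reflect how Xen or TTM actually arbitrate memory.

```python
# Toy model: independent per-guest limits against one shared host pool.
GIB = 1 << 30


def allocate_greedy(host_pool_bytes, guest_limits):
    """Each guest, in order, takes up to its own limit from the host pool.

    Returns (granted_per_guest, remaining_in_pool).
    """
    granted = []
    remaining = host_pool_bytes
    for limit in guest_limits:
        got = min(limit, remaining)  # a guest cannot get more than is left
        granted.append(got)
        remaining -= got
    return granted, remaining


# Four guests that each believe 2GB is available, on a host with 4GB.
granted, remaining = allocate_greedy(4 * GIB, [2 * GIB] * 4)
```

In this model the first two guests consume the whole pool and the last
two get nothing, even though no guest exceeded its own accounting limit.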


Another thing that I was thinking of is what happens if you have a huge
GART and allocate a lot of coherent memory. Could that potentially
exhaust IOMMU resources?

>> /Thomas
>>
>> *) I think gem's flink still is vulnerable to this, though, so it
>>      
> Is there a good test-case for this?
>    


Not in code form. What you can do (for example in an OpenGL app) is to
write some code that tries to flink with a guessed bo name until it
succeeds. Then, repeatedly from within the app, try to flink the same
name until something crashes. I don't think the Linux OOM killer can
handle that situation. Should be fairly easy to put together.

/Thomas
