Re: [PATCH] cleanup: Add 'struct dev' in the TTM layer to be passed in for DMA API calls.

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Thomas Hellstrom <thomas@shipmail.org>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: linux-kernel@vger.kernel.org, Dave Airlie <airlied@redhat.com>,
	dri-devel@lists.freedesktop.org,
	Alex Deucher <alexdeucher@gmail.com>,
	Jerome Glisse <j.glisse@gmail.com>,
	Konrad Rzeszutek Wilk <konrad@kernel.org>
Subject: Re: [PATCH] cleanup: Add 'struct dev' in the TTM layer to be passed in for DMA API calls.
Date: Wed, 23 Mar 2011 14:17:18 +0100	[thread overview]
Message-ID: <4D89F2DE.7020209@shipmail.org> (raw)
In-Reply-To: <20110323125105.GA6599@dumpdata.com>

On 03/23/2011 01:51 PM, Konrad Rzeszutek Wilk wrote:
>>> I was thinking about this a bit after I found that the PowerPC requires
>>> the 'struct dev'. But I got a question first, what do you with pages
>>> that were allocated to a device that can do 64-bit DMA and then
>>> move it to a device than can 32-bit DMA? Obviously the 32-bit card would
>>> set the TTM_PAGE_FLAG_DMA32 flag, but the 64-bit would not. What is the
>>> process then? Allocate a new page from the 32-bit device and then copy over the
>>> page from the 64-bit TTM and put the 64-bit TTM page?
>>>        
>> Yes, in certain situations we need to copy, and if it's necessary in
>> some cases to use coherent memory with a struct device assoicated
>> with it, I agree it may be reasonable to do a copy in that case as
>> well. I'm against, however, to make that the default case when
>> running on bare metal.
>>      
> This situation could occur on native/baremetal. When you say 'default
> case' you mean for every type of page without consulting whether it
> had the TTM_PAGE_FLAG_DMA32?
>    

No, Basically I mean a device that runs perfectly fine with
alloc_pages(DMA32) on bare metal shouldn't need to be using
dma_alloc_coherent() on bare metal, because that would mean we'd need
to take the copy path above.

>> However, I've looked a bit deeper into all this, and it looks like
>> we already have other problems that need to be addressed, and that
>> exists with the code already in git:
>>
>> Consider a situation where you allocate a cached DMA32 page from the
>> ttm page allocator. You'll end up with a coherent page. Then you
>> make it uncached and finally you return it to the ttm page
>> allocator. Since it's uncached, it will not be freed by the dma api,
>> but kept in the uncached pool, and later the incorrect page free
>> function will be called.
>>      
> Let me look in details in the code, but I thought it would check the
> TTM_PAGE_FLAG_DMA32 and direct the page to the correct pool?
>
> We could piggyback on the idea of the struct I had and have these
> values:
>
> struct ttm_page {
> 	struct page *page;
> 	dma_addr_t	*bus_addr;
> 	struct *ttm_pool	*origin;
> }
>
> And the origin would point to the correct pool so that on de-allocate
> it would divert it to the original one. Naturally there would
> be some thinking to be done on the de-alloc path so that
> the *origin isn't pointing to something htat has already been free-d.
>    

The idea with these pools is to keep a cache of write-combined pages, so 
the pool
is determined by the caching status of the page when it is returned to 
the page
allocator.
If we were to associate a page with a device, we'd basically need one 
such pool per
device.


>    
>> I think we might need to take a few steps back and rethink this whole idea:
>>
>> 1) How does this work in the AGP case? Let's say you allocate
>> write-combined DMA32 pages from the ttm page pool (in this case you
>> won't get coherent memory) and then use them in an AGP gart? Why is
>> it that we don't need coherent pages then in the Xen case?
>>      
> Hehe.. So I had posted a set of patches to carry the 'dma_addr_t' through
> the AGP API and then to its backends to program that. And also the frontends
> (so DRM, TTM) Here is the
> patchset I posted some time ago:
>
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2010-12/msg02382.html
> and the discussion:
>
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2010-12/msg02411.html
>
> Dave recommended I skip AGP and just concentrate on PCIe since not to many
> folks use AGP anymore. Thought I realized that server boards use PCI
> cards (ATI ES1000), which do utilize the AGP API. So there is breakage there
> and I have a set of patches for this that I was in process of rebasing
> on 2.6.39-rcX.
>    

I see. Well, both in the AGP case and sometimes in the PCIe case,
(some devices can use a non-cache-coherent PCIe mode for speed)
it's a not too uncommon use case of TTM to alloc cached pages and
transition them to write-combined (see impact below).

>    
>> 2) http://www.mjmwired.net/kernel/Documentation/DMA-API.txt, line 33
>> makes me scared.
>> We should identify what platforms may have problems with this.
>>      
> Right.. I think nobody much thought about this in context of TTM since
> that was only used on X86. I can take a look at the DMA API's of the
> other two major platforms: IA64 and PPC and see what lurks there.
>
>    
>> 3) When hacking on the unichrome DMA engine it wasn't that hard to
>> use the synchronization functions of the DMA api correctly:
>>
>>   When binding a TTM, the backend calls dma_map_page() on pages, When
>> unbinding, the backend calls dma_unmap_page(), If we need cpu access
>> when bound, we need to call dma_sync_single_for_[cpu|device]. If
>> this is done, it will be harder to implement user-space
>> sub-allocation, but possible. There will be a performance loss on
>> some platforms, though.
>>      
> Yup. That was my other suggestion about this. But I had no idea
> where to sprinkle those 'dma_sync_single_[*]' calls, as they would
> have been done in the drivers. Probably on its DMA paths, right before
> telling the GPU to process the CP, and when receiving an interrupt
> when the CP has been completed.
>    

In this case dma_sync_single_for_device() would be needed for a bo when 
it was validated for a
PCI dma engine. At the same time, all user-space virtual mappings of the 
buffer would need to be killed.
A dma_sync_single_for_cpu() would be needed as part of making a bo idle.


/Thomas

next prev parent reply	other threads:[~2011-03-23 13:18 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-08 15:39 [PATCH] cleanup: Add 'struct dev' in the TTM layer to be passed in for DMA API calls Konrad Rzeszutek Wilk
2011-03-08 15:39 ` [PATCH 1/2] ttm: Include the 'struct dev' when using the DMA API Konrad Rzeszutek Wilk
2011-03-08 15:39 ` [PATCH 2/2] ttm: Pass in 'struct device' to TTM so it can do DMA API on behalf of device Konrad Rzeszutek Wilk
2011-03-08 20:52 ` [PATCH] cleanup: Add 'struct dev' in the TTM layer to be passed in for DMA API calls Thomas Hellstrom
2011-03-09  0:47   ` Konrad Rzeszutek Wilk
2011-03-22 14:31   ` Konrad Rzeszutek Wilk
2011-03-23  8:13     ` Thomas Hellstrom
2011-03-23 12:51       ` Konrad Rzeszutek Wilk
2011-03-23 13:17         ` Thomas Hellstrom [this message]
2011-03-23 14:52           ` Konrad Rzeszutek Wilk
2011-03-24  7:52             ` Thomas Hellstrom
2011-03-24 14:25               ` Konrad Rzeszutek Wilk
2011-03-24 16:06                 ` Jerome Glisse
2011-03-24 16:21                   ` Konrad Rzeszutek Wilk
2011-03-25 20:00                     ` Thomas Hellstrom
2011-03-31 15:49                       ` Konrad Rzeszutek Wilk
2011-04-08 14:57                         ` Thomas Hellstrom
2011-04-08 14:58                           ` Thomas Hellstrom
2011-04-08 15:12                           ` Konrad Rzeszutek Wilk
2011-04-08 15:29                             ` Thomas Hellstrom
2011-03-23 16:19         ` Alex Deucher
2011-03-22 17:39   ` Paul Mundt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D89F2DE.7020209@shipmail.org \
    --to=thomas@shipmail.org \
    --cc=airlied@redhat.com \
    --cc=alexdeucher@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=j.glisse@gmail.com \
    --cc=konrad.wilk@oracle.com \
    --cc=konrad@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).