public inbox for linux-kernel@vger.kernel.org
* [PATCH 1/2] x86: Export kmap_atomic_prot() needed for TTM.
@ 2009-07-24  7:57 Thomas Hellstrom
  2009-07-24  7:57 ` [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes Thomas Hellstrom
From: Thomas Hellstrom @ 2009-07-24  7:57 UTC
  To: linux-kernel, dri-devel; +Cc: Thomas Hellstrom

This export is needed to kmap_atomic() highmem pages that already have,
or are about to get, other mappings with non-standard caching
attributes.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 arch/x86/mm/highmem_32.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index 58f621e..2112ed5 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -103,6 +103,7 @@ EXPORT_SYMBOL(kmap);
 EXPORT_SYMBOL(kunmap);
 EXPORT_SYMBOL(kmap_atomic);
 EXPORT_SYMBOL(kunmap_atomic);
+EXPORT_SYMBOL(kmap_atomic_prot);
 
 void __init set_highmem_pages_init(void)
 {
-- 
1.6.1.3



* [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes.
  2009-07-24  7:57 [PATCH 1/2] x86: Export kmap_atomic_prot() needed for TTM Thomas Hellstrom
@ 2009-07-24  7:57 ` Thomas Hellstrom
  2009-07-30 16:00   ` Pekka Paalanen
From: Thomas Hellstrom @ 2009-07-24  7:57 UTC
  To: linux-kernel, dri-devel; +Cc: Thomas Hellstrom

For x86 this affected highmem pages only, since they were always kmapped
cache-coherent, and this is fixed using kmap_atomic_prot().

For other architectures that may not modify the linear kernel map we
resort to vmap() for now, since kmap_atomic_prot() generally uses the
linear kernel map for lowmem pages. This of course comes with a
performance impact and should be optimized when possible.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c |   63 ++++++++++++++++++++++++++++++------
 1 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 3e5d0c4..ce2e6f3 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -136,7 +136,8 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned long page)
 }
 
 static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
-				unsigned long page)
+				unsigned long page,
+				pgprot_t prot)
 {
 	struct page *d = ttm_tt_get_page(ttm, page);
 	void *dst;
@@ -145,17 +146,35 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
 		return -ENOMEM;
 
 	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
-	dst = kmap(d);
+
+#ifdef CONFIG_X86
+	dst = kmap_atomic_prot(d, KM_USER0, prot);
+#else
+	if (prot != PAGE_KERNEL)
+		dst = vmap(&d, 1, 0, prot);
+	else
+		dst = kmap(d);
+#endif
 	if (!dst)
 		return -ENOMEM;
 
 	memcpy_fromio(dst, src, PAGE_SIZE);
-	kunmap(d);
+
+#ifdef CONFIG_X86
+	kunmap_atomic(dst, KM_USER0);
+#else
+	if (prot != PAGE_KERNEL)
+		vunmap(dst);
+	else
+		kunmap(d);
+#endif
+
 	return 0;
 }
 
 static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
-				unsigned long page)
+				unsigned long page,
+				pgprot_t prot)
 {
 	struct page *s = ttm_tt_get_page(ttm, page);
 	void *src;
@@ -164,12 +183,28 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
 		return -ENOMEM;
 
 	dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT));
-	src = kmap(s);
+#ifdef CONFIG_X86
+	src = kmap_atomic_prot(s, KM_USER0, prot);
+#else
+	if (prot != PAGE_KERNEL)
+		src = vmap(&s, 1, 0, prot);
+	else
+		src = kmap(s);
+#endif
 	if (!src)
 		return -ENOMEM;
 
 	memcpy_toio(dst, src, PAGE_SIZE);
-	kunmap(s);
+
+#ifdef CONFIG_X86
+	kunmap_atomic(src, KM_USER0);
+#else
+	if (prot != PAGE_KERNEL)
+		vunmap(src);
+	else
+		kunmap(s);
+#endif
+
 	return 0;
 }
 
@@ -214,11 +249,17 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
 
 	for (i = 0; i < new_mem->num_pages; ++i) {
 		page = i * dir + add;
-		if (old_iomap == NULL)
-			ret = ttm_copy_ttm_io_page(ttm, new_iomap, page);
-		else if (new_iomap == NULL)
-			ret = ttm_copy_io_ttm_page(ttm, old_iomap, page);
-		else
+		if (old_iomap == NULL) {
+			pgprot_t prot = ttm_io_prot(old_mem->placement,
+						    PAGE_KERNEL);
+			ret = ttm_copy_ttm_io_page(ttm, new_iomap, page,
+						   prot);
+		} else if (new_iomap == NULL) {
+			pgprot_t prot = ttm_io_prot(new_mem->placement,
+						    PAGE_KERNEL);
+			ret = ttm_copy_io_ttm_page(ttm, old_iomap, page,
+						   prot);
+		} else
 			ret = ttm_copy_io_page(new_iomap, old_iomap, page);
 		if (ret)
 			goto out1;
-- 
1.6.1.3
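
The mapping choice described in the commit message above can be
summarized as a small decision table. What follows is a user-space
sketch in C, not kernel code: the enum, both function names, and the
boolean parameters are hypothetical and exist only to illustrate which
map/unmap pair the patch selects in each case.

```c
#include <stdbool.h>

/* Hypothetical names: a user-space model of the patch, not a kernel API. */
enum map_path {
	MAP_KMAP_ATOMIC_PROT,	/* x86: kmap_atomic_prot(page, KM_USER0, prot) */
	MAP_VMAP,		/* other archs, prot != PAGE_KERNEL: vmap(&page, 1, 0, prot) */
	MAP_KMAP,		/* other archs, prot == PAGE_KERNEL: kmap(page) */
};

/* Pick the mapping call the same way the #ifdef / if ladder in the patch does. */
enum map_path choose_map_path(bool is_x86, bool prot_is_page_kernel)
{
	if (is_x86)
		return MAP_KMAP_ATOMIC_PROT;
	return prot_is_page_kernel ? MAP_KMAP : MAP_VMAP;
}

/* Each mapping call must be paired with its own unmap call. */
const char *unmap_call(enum map_path path)
{
	switch (path) {
	case MAP_KMAP_ATOMIC_PROT:
		return "kunmap_atomic";
	case MAP_VMAP:
		return "vunmap";
	default:
		return "kunmap";
	}
}
```

In the patch itself the x86 branch is selected at compile time with
#ifdef CONFIG_X86 rather than by a runtime flag; the flag here only
keeps the sketch self-contained.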



* Re: [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes.
  2009-07-24  7:57 ` [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes Thomas Hellstrom
@ 2009-07-30 16:00   ` Pekka Paalanen
  2009-07-31  8:59     ` Thomas Hellström
From: Pekka Paalanen @ 2009-07-30 16:00 UTC
  To: Thomas Hellstrom; +Cc: linux-kernel, dri-devel

Hi,

since I see this patch in Linus' tree, and I likely have to patch
TTM in Nouveau's compat-branch to compile with older kernels,
I have a question below.

(The Nouveau kernel tree's compat branch offers drm.ko, ttm.ko and
nouveau.ko to be built against kernels 2.6.28 and later.)

On Fri, 24 Jul 2009 09:57:34 +0200
Thomas Hellstrom <thellstrom@vmware.com> wrote:

> For x86 this affected highmem pages only, since they were always kmapped
> cache-coherent, and this is fixed using kmap_atomic_prot().
> 
> For other architectures that may not modify the linear kernel map we
> resort to vmap() for now, since kmap_atomic_prot() generally uses the
> linear kernel map for lowmem pages. This of course comes with a
> performance impact and should be optimized when possible.
> 
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo_util.c |   63 ++++++++++++++++++++++++++++++------
>  1 files changed, 52 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 3e5d0c4..ce2e6f3 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -136,7 +136,8 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned long page)
>  }
>  
>  static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
> -				unsigned long page)
> +				unsigned long page,
> +				pgprot_t prot)
>  {
>  	struct page *d = ttm_tt_get_page(ttm, page);
>  	void *dst;
> @@ -145,17 +146,35 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
>  		return -ENOMEM;
>  
>  	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
> -	dst = kmap(d);
> +
> +#ifdef CONFIG_X86
> +	dst = kmap_atomic_prot(d, KM_USER0, prot);
> +#else
> +	if (prot != PAGE_KERNEL)
> +		dst = vmap(&d, 1, 0, prot);
> +	else
> +		dst = kmap(d);
> +#endif

What are the implications of choosing the non-CONFIG_X86 path
even on x86?

Is kmap_atomic_prot() simply an optimization allowed by the x86
arch, and the alternate way also works, although it uses the
precious vmalloc address space?

Since kmap_atomic_prot() is not exported on earlier kernels,
I'm tempted to just do the non-CONFIG_X86 path.

>  	if (!dst)
>  		return -ENOMEM;
>  
>  	memcpy_fromio(dst, src, PAGE_SIZE);
> -	kunmap(d);
> +
> +#ifdef CONFIG_X86
> +	kunmap_atomic(dst, KM_USER0);
> +#else
> +	if (prot != PAGE_KERNEL)
> +		vunmap(dst);
> +	else
> +		kunmap(d);
> +#endif
> +
>  	return 0;
>  }
>  
>  static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
> -				unsigned long page)
> +				unsigned long page,
> +				pgprot_t prot)
>  {
>  	struct page *s = ttm_tt_get_page(ttm, page);
>  	void *src;
> @@ -164,12 +183,28 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
>  		return -ENOMEM;
>  
>  	dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT));
> -	src = kmap(s);
> +#ifdef CONFIG_X86
> +	src = kmap_atomic_prot(s, KM_USER0, prot);
> +#else
> +	if (prot != PAGE_KERNEL)
> +		src = vmap(&s, 1, 0, prot);
> +	else
> +		src = kmap(s);
> +#endif
>  	if (!src)
>  		return -ENOMEM;
>  
>  	memcpy_toio(dst, src, PAGE_SIZE);
> -	kunmap(s);
> +
> +#ifdef CONFIG_X86
> +	kunmap_atomic(src, KM_USER0);
> +#else
> +	if (prot != PAGE_KERNEL)
> +		vunmap(src);
> +	else
> +		kunmap(s);
> +#endif
> +
>  	return 0;
>  }
>  
> @@ -214,11 +249,17 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
>  
>  	for (i = 0; i < new_mem->num_pages; ++i) {
>  		page = i * dir + add;
> -		if (old_iomap == NULL)
> -			ret = ttm_copy_ttm_io_page(ttm, new_iomap, page);
> -		else if (new_iomap == NULL)
> -			ret = ttm_copy_io_ttm_page(ttm, old_iomap, page);
> -		else
> +		if (old_iomap == NULL) {
> +			pgprot_t prot = ttm_io_prot(old_mem->placement,
> +						    PAGE_KERNEL);
> +			ret = ttm_copy_ttm_io_page(ttm, new_iomap, page,
> +						   prot);
> +		} else if (new_iomap == NULL) {
> +			pgprot_t prot = ttm_io_prot(new_mem->placement,
> +						    PAGE_KERNEL);
> +			ret = ttm_copy_io_ttm_page(ttm, old_iomap, page,
> +						   prot);
> +		} else
>  			ret = ttm_copy_io_page(new_iomap, old_iomap, page);
>  		if (ret)
>  			goto out1;
> -- 
> 1.6.1.3


Thanks.

-- 
Pekka Paalanen
http://www.iki.fi/pq/



* Re: [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes.
  2009-07-30 16:00   ` Pekka Paalanen
@ 2009-07-31  8:59     ` Thomas Hellström
  2009-07-31  9:32       ` Pekka Paalanen
From: Thomas Hellström @ 2009-07-31  8:59 UTC
  To: Pekka Paalanen; +Cc: Thomas Hellstrom, dri-devel, linux-kernel

Pekka Paalanen wrote:
> Hi,
>
> since I see this patch in Linus' tree, and I likely have to patch
> TTM in Nouveau's compat-branch to compile with older kernels,
> I have a question below.
>
> (The Nouveau kernel tree's compat branch offers drm.ko, ttm.ko and
> nouveau.ko to be built against kernels 2.6.28 and later.)
>
> On Fri, 24 Jul 2009 09:57:34 +0200
> Thomas Hellstrom <thellstrom@vmware.com> wrote:
>
>   
>> For x86 this affected highmem pages only, since they were always kmapped
>> cache-coherent, and this is fixed using kmap_atomic_prot().
>>
>> For other architectures that may not modify the linear kernel map we
>> resort to vmap() for now, since kmap_atomic_prot() generally uses the
>> linear kernel map for lowmem pages. This of course comes with a
>> performance impact and should be optimized when possible.
>>
>> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
>> ---
>>  drivers/gpu/drm/ttm/ttm_bo_util.c |   63 ++++++++++++++++++++++++++++++------
>>  1 files changed, 52 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index 3e5d0c4..ce2e6f3 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -136,7 +136,8 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned long page)
>>  }
>>  
>>  static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
>> -				unsigned long page)
>> +				unsigned long page,
>> +				pgprot_t prot)
>>  {
>>  	struct page *d = ttm_tt_get_page(ttm, page);
>>  	void *dst;
>> @@ -145,17 +146,35 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
>>  		return -ENOMEM;
>>  
>>  	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
>> -	dst = kmap(d);
>> +
>> +#ifdef CONFIG_X86
>> +	dst = kmap_atomic_prot(d, KM_USER0, prot);
>> +#else
>> +	if (prot != PAGE_KERNEL)
>> +		dst = vmap(&d, 1, 0, prot);
>> +	else
>> +		dst = kmap(d);
>> +#endif
>>     
>
> What are the implications of choosing the non-CONFIG_X86 path
> even on x86?
>   

The only implication is a slowdown when dealing with highmem pages or
pages with a non-standard caching policy. Also, you need the patch I just
posted to dri-devel / lkml to make it compile.
I should've done more thorough testing of the non-x86 path.

> Is kmap_atomic_prot() simply an optimization allowed by the x86
> arch, and the alternate way also works, although it uses the
> precious vmalloc address space?
>   

Exactly, although it only uses one page of vmalloc space, and only for
the time it takes to copy a page to / from io.

> Since kmap_atomic_prot() is not exported on earlier kernels,
> I'm tempted to just do the non-CONFIG_X86 path.
>   
For compat I think that should be fine. If your driver is using
accelerated copy to / from VRAM, you shouldn't even hit this path.

/Thomas
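
The reply above mentions a follow-up patch needed to make the non-x86
path compile, but does not say what it changes. One plausible cause,
stated here purely as an assumption, is the "prot != PAGE_KERNEL" test:
on architectures where pgprot_t is a struct, C does not allow comparing
two values with "!=", and the raw values have to be compared instead.
The user-space sketch below redefines pgprot_t, pgprot_val() and
__pgprot() locally so that it compiles on its own; pgprot_differs() is a
hypothetical helper, not a kernel function.

```c
/*
 * User-space model only. The typedef and macros mimic the struct-based
 * page-protection type; they are redefined here so the file stands alone.
 */
typedef struct { unsigned long pgprot; } pgprot_t;

#define pgprot_val(x)	((x).pgprot)
#define __pgprot(x)	((pgprot_t) { (x) })

/* Hypothetical helper: compare two protections by their raw value. */
int pgprot_differs(pgprot_t a, pgprot_t b)
{
	/* Writing "a != b" here would be a compile error for struct operands. */
	return pgprot_val(a) != pgprot_val(b);
}
```

Under this assumption, the non-x86 branch would compare
pgprot_val(prot) with pgprot_val(PAGE_KERNEL) before choosing between
vmap() and kmap().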







* Re: [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes.
  2009-07-31  8:59     ` Thomas Hellström
@ 2009-07-31  9:32       ` Pekka Paalanen
From: Pekka Paalanen @ 2009-07-31  9:32 UTC
  To: Thomas Hellström; +Cc: Thomas Hellstrom, dri-devel, linux-kernel

On Fri, 31 Jul 2009 10:59:57 +0200
Thomas Hellström <thomas@shipmail.org> wrote:

> Pekka Paalanen wrote:
> >> @@ -145,17 +146,35 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
> >>  		return -ENOMEM;
> >>  
> >>  	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
> >> -	dst = kmap(d);
> >> +
> >> +#ifdef CONFIG_X86
> >> +	dst = kmap_atomic_prot(d, KM_USER0, prot);
> >> +#else
> >> +	if (prot != PAGE_KERNEL)
> >> +		dst = vmap(&d, 1, 0, prot);
> >> +	else
> >> +		dst = kmap(d);
> >> +#endif
> >>     
> >
> > What are the implications of choosing the non-CONFIG_X86 path
> > even on x86?
> >   
> 
> The only implication is a slowdown when dealing with highmem pages or
> pages with a non-standard caching policy. Also, you need the patch I just
> posted to dri-devel / lkml to make it compile.
> I should've done more thorough testing of the non-x86 path.
> 
> > Is kmap_atomic_prot() simply an optimization allowed by the x86
> > arch, and the alternate way also works, although it uses the
> > precious vmalloc address space?
> >   
> 
> Exactly, although it only uses one page of vmalloc space, and only for
> the time it takes to copy a page to / from io.
> 
> > Since kmap_atomic_prot() is not exported on earlier kernels,
> > I'm tempted to just do the non-CONFIG_X86 path.
> >   
> For compat I think that should be fine. If your driver is using
> accelerated copy to / from VRAM, you shouldn't even hit this path.

Okay, thank you very much.

-- 
Pekka Paalanen
http://www.iki.fi/pq/


