* [PATCH 1/2] x86: Export kmap_atomic_prot() needed for TTM.
From: Thomas Hellstrom @ 2009-07-24  7:57 UTC
To: linux-kernel, dri-devel; +Cc: Thomas Hellstrom

This functionality is needed to kmap_atomic() highmem pages that may
potentially have or are about to set up other mappings with non-standard
caching attributes.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 arch/x86/mm/highmem_32.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index 58f621e..2112ed5 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -103,6 +103,7 @@ EXPORT_SYMBOL(kmap);
 EXPORT_SYMBOL(kunmap);
 EXPORT_SYMBOL(kmap_atomic);
 EXPORT_SYMBOL(kunmap_atomic);
+EXPORT_SYMBOL(kmap_atomic_prot);
 
 void __init set_highmem_pages_init(void)
 {
-- 
1.6.1.3

* [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes.
From: Thomas Hellstrom @ 2009-07-24  7:57 UTC
To: linux-kernel, dri-devel; +Cc: Thomas Hellstrom

For x86 this affected highmem pages only, since they were always kmapped
cache-coherent, and this is fixed using kmap_atomic_prot().

For other architectures that may not modify the linear kernel map we
resort to vmap() for now, since kmap_atomic_prot() generally uses the
linear kernel map for lowmem pages. This of course comes with a
performance impact and should be optimized when possible.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c |   63 ++++++++++++++++++++++++++++++------
 1 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 3e5d0c4..ce2e6f3 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -136,7 +136,8 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned long page)
 }
 
 static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
-				unsigned long page)
+				unsigned long page,
+				pgprot_t prot)
 {
 	struct page *d = ttm_tt_get_page(ttm, page);
 	void *dst;
@@ -145,17 +146,35 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
 		return -ENOMEM;
 
 	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
-	dst = kmap(d);
+
+#ifdef CONFIG_X86
+	dst = kmap_atomic_prot(d, KM_USER0, prot);
+#else
+	if (prot != PAGE_KERNEL)
+		dst = vmap(&d, 1, 0, prot);
+	else
+		dst = kmap(d);
+#endif
 	if (!dst)
 		return -ENOMEM;
 
 	memcpy_fromio(dst, src, PAGE_SIZE);
-	kunmap(d);
+
+#ifdef CONFIG_X86
+	kunmap_atomic(dst, KM_USER0);
+#else
+	if (prot != PAGE_KERNEL)
+		vunmap(dst);
+	else
+		kunmap(d);
+#endif
+
 	return 0;
 }
 
 static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
-				unsigned long page)
+				unsigned long page,
+				pgprot_t prot)
 {
 	struct page *s = ttm_tt_get_page(ttm, page);
 	void *src;
@@ -164,12 +183,28 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
 		return -ENOMEM;
 
 	dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT));
-	src = kmap(s);
+#ifdef CONFIG_X86
+	src = kmap_atomic_prot(s, KM_USER0, prot);
+#else
+	if (prot != PAGE_KERNEL)
+		src = vmap(&s, 1, 0, prot);
+	else
+		src = kmap(s);
+#endif
 	if (!src)
 		return -ENOMEM;
 
 	memcpy_toio(dst, src, PAGE_SIZE);
-	kunmap(s);
+
+#ifdef CONFIG_X86
+	kunmap_atomic(src, KM_USER0);
+#else
+	if (prot != PAGE_KERNEL)
+		vunmap(src);
+	else
+		kunmap(s);
+#endif
+
 	return 0;
 }
 
@@ -214,11 +249,17 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
 
 	for (i = 0; i < new_mem->num_pages; ++i) {
 		page = i * dir + add;
-		if (old_iomap == NULL)
-			ret = ttm_copy_ttm_io_page(ttm, new_iomap, page);
-		else if (new_iomap == NULL)
-			ret = ttm_copy_io_ttm_page(ttm, old_iomap, page);
-		else
+		if (old_iomap == NULL) {
+			pgprot_t prot = ttm_io_prot(old_mem->placement,
+						    PAGE_KERNEL);
+			ret = ttm_copy_ttm_io_page(ttm, new_iomap, page,
+						   prot);
+		} else if (new_iomap == NULL) {
+			pgprot_t prot = ttm_io_prot(new_mem->placement,
+						    PAGE_KERNEL);
+			ret = ttm_copy_io_ttm_page(ttm, old_iomap, page,
+						   prot);
+		} else
 			ret = ttm_copy_io_page(new_iomap, old_iomap, page);
 		if (ret)
 			goto out1;
-- 
1.6.1.3

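The choice of mapping call in the patch above can be restated as a small userspace model. This is only a sketch for reading the #ifdef branches, not kernel code: the enum map_method, choose_map_method() and unmap_call() names are hypothetical, and the protection values are plain integers here. In the kernel they are pgprot_t values, which are a struct type on some architectures and are then compared through pgprot_val().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three mapping calls used in the patch. */
enum map_method {
	MAP_KMAP_ATOMIC_PROT,	/* kmap_atomic_prot(page, KM_USER0, prot) */
	MAP_VMAP,		/* vmap(&page, 1, 0, prot) */
	MAP_KMAP		/* kmap(page) */
};

/*
 * Userspace model of the "#ifdef CONFIG_X86 / #else" branches.
 * config_x86 stands in for the compile-time CONFIG_X86 test;
 * prot and kernel_prot stand in for prot and PAGE_KERNEL.
 */
enum map_method choose_map_method(bool config_x86,
				  unsigned long prot,
				  unsigned long kernel_prot)
{
	if (config_x86)
		return MAP_KMAP_ATOMIC_PROT;
	if (prot != kernel_prot)
		return MAP_VMAP;
	return MAP_KMAP;
}

/* Name of the unmap call that pairs with each mapping method. */
const char *unmap_call(enum map_method m)
{
	switch (m) {
	case MAP_KMAP_ATOMIC_PROT:
		return "kunmap_atomic";
	case MAP_VMAP:
		return "vunmap";
	default:
		assert(m == MAP_KMAP);
		return "kunmap";
	}
}
```

The point the model makes is that the map and unmap sides must take the same branch: a page mapped with vmap() has to be released with vunmap(), which is why the patch repeats the same condition before and after the memcpy.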
* Re: [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes.
From: Pekka Paalanen @ 2009-07-30 16:00 UTC
To: Thomas Hellstrom; +Cc: linux-kernel, dri-devel

Hi,

since I see this patch in Linus' tree, and I likely have to patch
TTM in Nouveau's compat-branch to compile with older kernels,
I have a question below.

(The Nouveau kernel tree's compat branch offers drm.ko, ttm.ko and
nouveau.ko to be built against kernels 2.6.28 and later.)

On Fri, 24 Jul 2009 09:57:34 +0200
Thomas Hellstrom <thellstrom@vmware.com> wrote:

> For x86 this affected highmem pages only, since they were always kmapped
> cache-coherent, and this is fixed using kmap_atomic_prot().
>
> For other architectures that may not modify the linear kernel map we
> resort to vmap() for now, since kmap_atomic_prot() generally uses the
> linear kernel map for lowmem pages. This of course comes with a
> performance impact and should be optimized when possible.
>
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo_util.c |   63 ++++++++++++++++++++++++++++++------
>  1 files changed, 52 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 3e5d0c4..ce2e6f3 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -136,7 +136,8 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned long page)
>  }
>
>  static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
> -				unsigned long page)
> +				unsigned long page,
> +				pgprot_t prot)
>  {
>  	struct page *d = ttm_tt_get_page(ttm, page);
>  	void *dst;
> @@ -145,17 +146,35 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
>  		return -ENOMEM;
>
>  	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
> -	dst = kmap(d);
> +
> +#ifdef CONFIG_X86
> +	dst = kmap_atomic_prot(d, KM_USER0, prot);
> +#else
> +	if (prot != PAGE_KERNEL)
> +		dst = vmap(&d, 1, 0, prot);
> +	else
> +		dst = kmap(d);
> +#endif

What are the implications of choosing the non-CONFIG_X86 path
even on x86?

Is kmap_atomic_prot() simply an optimization allowed by the x86
arch, and the alternate way also works, although it uses the
precious vmalloc address space?

Since kmap_atomic_prot() is not exported on earlier kernels,
I'm tempted to just do the non-CONFIG_X86 path.

>  	if (!dst)
>  		return -ENOMEM;
>
>  	memcpy_fromio(dst, src, PAGE_SIZE);
> -	kunmap(d);
> +
> +#ifdef CONFIG_X86
> +	kunmap_atomic(dst, KM_USER0);
> +#else
> +	if (prot != PAGE_KERNEL)
> +		vunmap(dst);
> +	else
> +		kunmap(d);
> +#endif
> +
>  	return 0;
>  }
>
>  static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
> -				unsigned long page)
> +				unsigned long page,
> +				pgprot_t prot)
>  {
>  	struct page *s = ttm_tt_get_page(ttm, page);
>  	void *src;
> @@ -164,12 +183,28 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
>  		return -ENOMEM;
>
>  	dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT));
> -	src = kmap(s);
> +#ifdef CONFIG_X86
> +	src = kmap_atomic_prot(s, KM_USER0, prot);
> +#else
> +	if (prot != PAGE_KERNEL)
> +		src = vmap(&s, 1, 0, prot);
> +	else
> +		src = kmap(s);
> +#endif
>  	if (!src)
>  		return -ENOMEM;
>
>  	memcpy_toio(dst, src, PAGE_SIZE);
> -	kunmap(s);
> +
> +#ifdef CONFIG_X86
> +	kunmap_atomic(src, KM_USER0);
> +#else
> +	if (prot != PAGE_KERNEL)
> +		vunmap(src);
> +	else
> +		kunmap(s);
> +#endif
> +
>  	return 0;
>  }
>
> @@ -214,11 +249,17 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
>
>  	for (i = 0; i < new_mem->num_pages; ++i) {
>  		page = i * dir + add;
> -		if (old_iomap == NULL)
> -			ret = ttm_copy_ttm_io_page(ttm, new_iomap, page);
> -		else if (new_iomap == NULL)
> -			ret = ttm_copy_io_ttm_page(ttm, old_iomap, page);
> -		else
> +		if (old_iomap == NULL) {
> +			pgprot_t prot = ttm_io_prot(old_mem->placement,
> +						    PAGE_KERNEL);
> +			ret = ttm_copy_ttm_io_page(ttm, new_iomap, page,
> +						   prot);
> +		} else if (new_iomap == NULL) {
> +			pgprot_t prot = ttm_io_prot(new_mem->placement,
> +						    PAGE_KERNEL);
> +			ret = ttm_copy_io_ttm_page(ttm, old_iomap, page,
> +						   prot);
> +		} else
>  			ret = ttm_copy_io_page(new_iomap, old_iomap, page);
>  		if (ret)
>  			goto out1;
> --
> 1.6.1.3

Thanks.

-- 
Pekka Paalanen
http://www.iki.fi/pq/

* Re: [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes.
From: Thomas Hellström @ 2009-07-31  8:59 UTC
To: Pekka Paalanen; +Cc: Thomas Hellstrom, dri-devel, linux-kernel

Pekka Paalanen wrote:
> Hi,
>
> since I see this patch in Linus' tree, and I likely have to patch
> TTM in Nouveau's compat-branch to compile with older kernels,
> I have a question below.
>
> (The Nouveau kernel tree's compat branch offers drm.ko, ttm.ko and
> nouveau.ko to be built against kernels 2.6.28 and later.)
>
> On Fri, 24 Jul 2009 09:57:34 +0200
> Thomas Hellstrom <thellstrom@vmware.com> wrote:
>
>> For x86 this affected highmem pages only, since they were always kmapped
>> cache-coherent, and this is fixed using kmap_atomic_prot().
>>
>> For other architectures that may not modify the linear kernel map we
>> resort to vmap() for now, since kmap_atomic_prot() generally uses the
>> linear kernel map for lowmem pages. This of course comes with a
>> performance impact and should be optimized when possible.
>>
>> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
>> ---
>>  drivers/gpu/drm/ttm/ttm_bo_util.c |   63 ++++++++++++++++++++++++++++++------
>>  1 files changed, 52 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index 3e5d0c4..ce2e6f3 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -136,7 +136,8 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned long page)
>>  }
>>
>>  static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
>> -				unsigned long page)
>> +				unsigned long page,
>> +				pgprot_t prot)
>>  {
>>  	struct page *d = ttm_tt_get_page(ttm, page);
>>  	void *dst;
>> @@ -145,17 +146,35 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
>>  		return -ENOMEM;
>>
>>  	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
>> -	dst = kmap(d);
>> +
>> +#ifdef CONFIG_X86
>> +	dst = kmap_atomic_prot(d, KM_USER0, prot);
>> +#else
>> +	if (prot != PAGE_KERNEL)
>> +		dst = vmap(&d, 1, 0, prot);
>> +	else
>> +		dst = kmap(d);
>> +#endif
>>
>
> What are the implications of choosing the non-CONFIG_X86 path
> even on x86?
>

The only implication is a slowdown if dealing with highmem pages or
pages with a non standard caching policy. Also you need the patch I
just posted to dri-devel / lkml to make it compile.
I should've done more thorough testing of the non-x86 path.

> Is kmap_atomic_prot() simply an optimization allowed by the x86
> arch, and the alternate way also works, although it uses the
> precious vmalloc address space?
>

Exactly, although it's only using one page out of vmalloc space and for
the time it takes to copy a page to / from io.

> Since kmap_atomic_prot() is not exported on earlier kernels,
> I'm tempted to just do the non-CONFIG_X86 path.
>

For compat I think that should be fine. If your driver is using
accelerated copy to / from VRAM, you shouldn't even hit this path.

/Thomas

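The remark that the fallback holds only one page of vmalloc space at a time can be illustrated with a userspace model of the per-page map, copy, unmap loop. This is a sketch under stated assumptions, not the TTM code: mock_map_page() and mock_unmap_page() are hypothetical stand-ins for vmap(&page, 1, 0, prot) and vunmap(), struct map_stats is invented bookkeeping, and MODEL_PAGE_SIZE replaces PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096	/* stand-in for PAGE_SIZE */

/* Bookkeeping for the mock mapping calls (hypothetical, userspace only). */
struct map_stats {
	int live;	/* mappings currently held */
	int peak;	/* highest value live ever reached */
};

/* Stand-in for vmap(&page, 1, 0, prot): "maps" one page and counts it. */
void *mock_map_page(struct map_stats *st, void *page)
{
	st->live++;
	if (st->live > st->peak)
		st->peak = st->live;
	return page;
}

/* Stand-in for vunmap(): releases the single-page mapping. */
void mock_unmap_page(struct map_stats *st)
{
	assert(st->live > 0);
	st->live--;
}

/* Copy num_pages pages, mapping one destination page at a time. */
void copy_pages(struct map_stats *st, unsigned char *dst,
		const unsigned char *src, size_t num_pages)
{
	size_t i;

	for (i = 0; i < num_pages; ++i) {
		void *d = mock_map_page(st, dst + i * MODEL_PAGE_SIZE);

		memcpy(d, src + i * MODEL_PAGE_SIZE, MODEL_PAGE_SIZE);
		mock_unmap_page(st);
	}
}
```

Because each mapping is released before the next page is mapped, the peak count in the model stays at one regardless of how many pages are copied. The cost discussed in the thread is the repeated map and unmap work per page, not address-space pressure.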
* Re: [PATCH 2/2] ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes.
From: Pekka Paalanen @ 2009-07-31  9:32 UTC
To: Thomas Hellström; +Cc: Thomas Hellstrom, dri-devel, linux-kernel

On Fri, 31 Jul 2009 10:59:57 +0200
Thomas Hellström <thomas@shipmail.org> wrote:

> Pekka Paalanen wrote:
> >> @@ -145,17 +146,35 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
> >>  		return -ENOMEM;
> >>
> >>  	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
> >> -	dst = kmap(d);
> >> +
> >> +#ifdef CONFIG_X86
> >> +	dst = kmap_atomic_prot(d, KM_USER0, prot);
> >> +#else
> >> +	if (prot != PAGE_KERNEL)
> >> +		dst = vmap(&d, 1, 0, prot);
> >> +	else
> >> +		dst = kmap(d);
> >> +#endif
> >>
> >
> > What are the implications of choosing the non-CONFIG_X86 path
> > even on x86?
> >
>
> The only implication is a slowdown if dealing with highmem pages or
> pages with
> a non standard caching policy. Also you need the patch I just posted to
> dri-devel / lkml to make it compile.
> I should've done more thorough testing of the non-x86 path.
>
> > Is kmap_atomic_prot() simply an optimization allowed by the x86
> > arch, and the alternate way also works, although it uses the
> > precious vmalloc address space?
> >
>
> Exactly, although it's only using one page out of vmalloc space and for
> the time it
> takes to copy a page to / from io.
>
> > Since kmap_atomic_prot() is not exported on earlier kernels,
> > I'm tempted to just do the non-CONFIG_X86 path.
> >
> For compat I think that should be fine. If your driver is using
> accelerated copy to / from
> VRAM, you shouldn't even hit this path.

Okay, thank you very much.

-- 
Pekka Paalanen
http://www.iki.fi/pq/