* RE: memory with __get_free_pages and disabling caching
@ 2006-03-24 19:13 Kallol Biswas
2006-03-24 19:29 ` Kumar Gala
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Kallol Biswas @ 2006-03-24 19:13 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
We have a little-endian device on a PPC 440GX-based system.
The descriptors need to be byte-swapped. With the E bit turned on, we can save the swapping time.
Maybe all the pages from __get_free_pages() are already mapped with a large TLB entry.
How about making a window (PTEs) like consistent memory?
* Re: memory with __get_free_pages and disabling caching
2006-03-24 19:13 memory with __get_free_pages and disabling caching Kallol Biswas
@ 2006-03-24 19:29 ` Kumar Gala
2006-03-24 22:17 ` Paul Mackerras
2006-03-24 22:30 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 12+ messages in thread
From: Kumar Gala @ 2006-03-24 19:29 UTC (permalink / raw)
To: Kallol Biswas; +Cc: linuxppc-dev
On Mar 24, 2006, at 1:13 PM, Kallol Biswas wrote:
>
> We have a little endian device on a PPC 440GX based system.
> The descriptors need to be swapped. With E bit turned on we can
> save swapping time.
What's the device and what bus is it on? Are you writing a standard
kernel driver for it?
* RE: memory with __get_free_pages and disabling caching
2006-03-24 19:13 memory with __get_free_pages and disabling caching Kallol Biswas
2006-03-24 19:29 ` Kumar Gala
@ 2006-03-24 22:17 ` Paul Mackerras
2006-03-24 22:30 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 12+ messages in thread
From: Paul Mackerras @ 2006-03-24 22:17 UTC (permalink / raw)
To: Kallol Biswas; +Cc: linuxppc-dev
Kallol Biswas writes:
> We have a little endian device on a PPC 440GX based system.
> The descriptors need to be swapped. With E bit turned on we can save
> swapping time.
Writing the descriptors with stwbrx should be just as fast as using
stw, and eliminates the need for a special mapping.
If you need it to be cache-inhibited, you should be using
dma_alloc_coherent() or pci_alloc_consistent().
Paul.
* RE: memory with __get_free_pages and disabling caching
2006-03-24 19:13 memory with __get_free_pages and disabling caching Kallol Biswas
2006-03-24 19:29 ` Kumar Gala
2006-03-24 22:17 ` Paul Mackerras
@ 2006-03-24 22:30 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2006-03-24 22:30 UTC (permalink / raw)
To: Kallol Biswas; +Cc: linuxppc-dev
On Fri, 2006-03-24 at 11:13 -0800, Kallol Biswas wrote:
> We have a little endian device on a PPC 440GX based system.
> The descriptors need to be swapped. With E bit turned on we can save swapping time.
>
> May be all the pages with _get_free_page already are mapped with large tlb entry.
>
> How about making a window (ptes) like consistent memory?
If you allocate with the consistent allocator on 4xx, you should be able to
hack the PTEs to set the E bit, but I don't think it's necessary. We've
been swapping descriptors for ages without any noticeable performance
loss, since pretty much all network devices have little-endian descriptor
rings :) Look into using the {ld,st}_le{16,32} inlines; they use the
native byte-swapped load/store instructions of the CPU to store things in
little-endian format. They shouldn't cost more, or at least not
significantly more, than normal loads/stores.
Ben.
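As a rough illustration of the byte-swapping approach suggested in this thread, here is a minimal, portable C sketch of little-endian loads and stores. The sketch_* names and the 8-byte descriptor layout are hypothetical, made up for this example; they are not the kernel's {ld,st}_le{16,32} inlines, which on PowerPC use the byte-reversed load/store instructions (such as stwbrx) rather than byte-by-byte accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order, whatever the host order. */
void sketch_st_le32(volatile uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit little-endian value. */
uint32_t sketch_ld_le32(const volatile uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store a 16-bit value in little-endian byte order. */
void sketch_st_le16(volatile uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
}

/* Load a 16-bit little-endian value. */
uint16_t sketch_ld_le16(const volatile uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Fill a hypothetical 8-byte device descriptor:
 *   bytes 0-3: buffer address, bytes 4-5: length, bytes 6-7: flags.
 * Every field is written in little-endian order for the device.
 */
void sketch_fill_desc(volatile uint8_t *desc, uint32_t buf_addr,
                      uint16_t len, uint16_t flags)
{
    assert(desc != NULL);
    sketch_st_le32(desc, buf_addr);
    sketch_st_le16(desc + 4, len);
    sketch_st_le16(desc + 6, flags);
}
```

With helpers like these, the descriptor memory keeps its normal mapping and only the individual accesses are swapped, so no special E-bit mapping is needed.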
* RE: memory with __get_free_pages and disabling caching
@ 2006-03-24 23:44 Kallol Biswas
2006-03-25 0:20 ` Benjamin Herrenschmidt
2006-03-25 0:27 ` Matt Porter
0 siblings, 2 replies; 12+ messages in thread
From: Kallol Biswas @ 2006-03-24 23:44 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
Thank you.
I wonder how consistent PTEs are used if all kernel memory is mapped with large TLB entries.
In the __dma_alloc_coherent() routine, pages are allocated with alloc_pages(), a new virtual address is created in the consistent region, and then the consistent PTEs are populated. It looks like the routine creates a new virtual mapping, and the memory is addressed through the new address.
Do we have two mappings in the TLB for the same physical address?
When I find out, I will post the answer.
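The idea of one physical page being reachable through two virtual addresses can be tried out in user space. The sketch below is only an analogy, not the kernel's __dma_alloc_coherent(): it maps one page of a temporary file twice with mmap() and checks that a write through the first mapping is visible through the second. The function name alias_demo is hypothetical. Unlike the kernel case asked about here, both mappings have the same cache attributes, so the cacheable versus non-cacheable hazard does not arise.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the same backing page at two virtual addresses and check that a
 * write through one alias is visible through the other.
 * Returns 0 on success, -1 on any failure.
 */
int alias_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    FILE *f = tmpfile();                 /* unnamed temporary backing file */
    if (f == NULL)
        return -1;

    int fd = fileno(f);
    if (ftruncate(fd, (off_t)page) != 0) {
        fclose(f);
        return -1;
    }

    /* Two independent shared mappings of the same file page. */
    char *a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    char *b = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);

    int rc = -1;
    if (a != MAP_FAILED && b != MAP_FAILED) {
        assert(a != b);                  /* distinct virtual addresses */
        strcpy(a, "descriptor");         /* write through the first alias */
        rc = (strcmp(b, "descriptor") == 0) ? 0 : -1;  /* read via second */
    }

    if (a != MAP_FAILED)
        munmap(a, (size_t)page);
    if (b != MAP_FAILED)
        munmap(b, (size_t)page);
    fclose(f);
    return rc;
}
```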
* RE: memory with __get_free_pages and disabling caching
2006-03-24 23:44 Kallol Biswas
@ 2006-03-25 0:20 ` Benjamin Herrenschmidt
2006-03-25 0:29 ` Matt Porter
2006-03-25 0:27 ` Matt Porter
1 sibling, 1 reply; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2006-03-25 0:20 UTC (permalink / raw)
To: Kallol Biswas; +Cc: linuxppc-dev
On Fri, 2006-03-24 at 15:44 -0800, Kallol Biswas wrote:
> Thank you.
>
> I wonder how consistent ptes are used if all kernel memory is mapped with large tlb.
> In the __dma_alloc_coherent() routine pages are allocated with alloc_pages(), new virtual address is created in consistent region,
> then consistent ptes are populated. Looks like that the routine creates a new virtual mapping. The memory is addressed with the new address.
>
> Do we have two mappings in the TLB for the same physical address?
Yes, it seems like we do... the consistent DMA stuff assumes that this is
safe to do, which is not the case on 6xx CPUs but might be on 4xx.
* Re: memory with __get_free_pages and disabling caching
2006-03-25 0:20 ` Benjamin Herrenschmidt
@ 2006-03-25 0:29 ` Matt Porter
0 siblings, 0 replies; 12+ messages in thread
From: Matt Porter @ 2006-03-25 0:29 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Kallol Biswas, linuxppc-dev
On Sat, Mar 25, 2006 at 11:20:34AM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2006-03-24 at 15:44 -0800, Kallol Biswas wrote:
> > Thank you.
> >
> > I wonder how consistent ptes are used if all kernel memory is mapped with large tlb.
> > In the __dma_alloc_coherent() routine pages are allocated with alloc_pages(), new virtual address is created in consistent region,
> > then consistent ptes are populated. Looks like that the routine creates a new virtual mapping. The memory is addressed with the new address.
> >
> > Do we have two mappings in the TLB for the same physical address?
>
> Yes, it seems like we do... the consistent DMA stuff assumes that is
> safe to do, which is not the case on 6xx CPUs but might be on 4xx.
It is safe on 4xx. The doomsday scenario on 4xx is two mappings in the
TLB for the same virtual address range. Operation at that point is
boundedly undefined. :)
-Matt
* Re: memory with __get_free_pages and disabling caching
2006-03-24 23:44 Kallol Biswas
2006-03-25 0:20 ` Benjamin Herrenschmidt
@ 2006-03-25 0:27 ` Matt Porter
2006-03-25 1:02 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 12+ messages in thread
From: Matt Porter @ 2006-03-25 0:27 UTC (permalink / raw)
To: Kallol Biswas; +Cc: linuxppc-dev
On Fri, Mar 24, 2006 at 03:44:42PM -0800, Kallol Biswas wrote:
> Thank you.
>
> I wonder how consistent ptes are used if all kernel memory is mapped with large tlb.
> In the __dma_alloc_coherent() routine pages are allocated with alloc_pages(), new virtual address is created in consistent region, then consistent ptes are populated. Looks like that the routine creates a new virtual mapping. The memory is addressed with the new address.
>
> Do we have two mappings in the TLB for the same physical address?
Yes, that's how it works. After being allocated by the dma api
routines, the direct map is never accessed. Accessing the same
physical address via the cached direct map would cause serious
problems but you aren't allowed to touch address space like
that unless it's been allocated through a kernel allocator for
your use.
-Matt
* Re: memory with __get_free_pages and disabling caching
2006-03-25 0:27 ` Matt Porter
@ 2006-03-25 1:02 ` Benjamin Herrenschmidt
2006-03-25 1:25 ` Matt Porter
0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2006-03-25 1:02 UTC (permalink / raw)
To: Matt Porter; +Cc: Kallol Biswas, linuxppc-dev
> Yes, that's how it works. After being allocated by the dma api
> routines, the direct map is never accessed. Accessing the same
> physical address via the cached direct map would cause serious
> problems but you aren't allowed to touch address space like
> that unless it's been allocated through a kernel allocator for
> your use.
That is still broken for at least 6xx CPUs ... they may well prefetch it
and you die...
For example, page A is a normal page allocated for kernel use, page B
just after A is used by the DMA allocator for uncacheable accesses
(and is thus mapped twice). If something does a loop going through an
array in page A, you have no guarantee that some smart prefetcher &
speculative accesses will not bring bits of page B into the cache since
it's mapped and cacheable...
Ben.
* Re: memory with __get_free_pages and disabling caching
2006-03-25 1:02 ` Benjamin Herrenschmidt
@ 2006-03-25 1:25 ` Matt Porter
0 siblings, 0 replies; 12+ messages in thread
From: Matt Porter @ 2006-03-25 1:25 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Kallol Biswas, linuxppc-dev
On Sat, Mar 25, 2006 at 12:02:51PM +1100, Benjamin Herrenschmidt wrote:
>
> > Yes, that's how it works. After being allocated by the dma api
> > routines, the direct map is never accessed. Accessing the same
> > physical address via the cached direct map would cause serious
> > problems but you aren't allowed to touch address space like
> > that unless it's been allocated through a kernel allocator for
> > your use.
>
> That is still broken for at least 6xx CPUs ... they may well prefetch it
> and you die...
Right.
> For example, page A is a normal page allocated for kernel use, page B
> just after A is used by the DMA allocator for uncacheable accesses
> (and is thus mapped twice). If something does a loop going through an
> array in page A, you have no guarantee that some smart prefetcher &
> speculative accesses will not bring bits of page B into the cache since
> it's mapped and cacheable...
We had a similar scenario on 4xx with the smart prefetching that is
used in copy_tofrom_user. It had to be modified not to prefetch
across pages for 4xx. Eugene first saw corruption due to adjacent
page prefetches into a dmaable page.
-Matt
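The fix described above, not prefetching across a page boundary, can be sketched in plain C. This is an illustration under stated assumptions, not the actual copy_tofrom_user code: the 4096-byte page size, the 64-byte prefetch distance, and the sketch_* names are all hypothetical, and __builtin_prefetch is a GCC/Clang builtin standing in for the CPU's cache-touch instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */
#define SKETCH_AHEAD     64u     /* assumed prefetch distance in bytes */

/* Nonzero if addr and addr + ahead lie in the same page. */
int sketch_same_page(uintptr_t addr, size_t ahead)
{
    uintptr_t mask = ~((uintptr_t)SKETCH_PAGE_SIZE - 1);

    assert(ahead < SKETCH_PAGE_SIZE);
    return (addr & mask) == ((addr + ahead) & mask);
}

/*
 * Byte copy that issues a prefetch hint SKETCH_AHEAD bytes ahead of the
 * source pointer, but only when the hinted address stays in the page that
 * is currently being read. A neighbouring page, which might be in use for
 * uncached DMA, is therefore never touched by the hint.
 */
void sketch_copy(unsigned char *dst, const unsigned char *src, size_t n)
{
    uintptr_t base = (uintptr_t)src;

    for (size_t i = 0; i < n; i++) {
        if ((i % SKETCH_AHEAD) == 0 &&
            sketch_same_page(base + i, SKETCH_AHEAD))
            __builtin_prefetch((const void *)(base + i + SKETCH_AHEAD));
        dst[i] = src[i];
    }
}
```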
* memory with __get_free_pages and disabling caching
@ 2006-03-24 2:15 Kallol Biswas
2006-03-24 3:05 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 12+ messages in thread
From: Kallol Biswas @ 2006-03-24 2:15 UTC (permalink / raw)
To: linuxppc-dev
Hello,
Is there an easy way to set page table attributes for the memory returned
by __get_free_pages()?
I need to be able to turn off caching and turn on the E bit for these pages.
I tried to walk through the page table data structures to get the PTE, but it seems
that the PMD is not present for the pages. If anyone has investigated
this before, please send me a reply.
Thanks,
Kallol
* Re: memory with __get_free_pages and disabling caching
2006-03-24 2:15 Kallol Biswas
@ 2006-03-24 3:05 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2006-03-24 3:05 UTC (permalink / raw)
To: Kallol Biswas; +Cc: linuxppc-dev
On Thu, 2006-03-23 at 18:15 -0800, Kallol Biswas wrote:
> Hello,
> Is there an easy way to set page table attributes for the
> memory returned
> by __get_free_pages()?
>
> I need to be able to turn off caching and turn on E bit for these
> pages.
The Evil bit? Heh! What are you trying to do here? You can always
create a virtual mapping to those pages with different attributes, but
that's not recommended, as some processors will choke pretty badly if you
end up with both cacheable and non-cacheable mappings for the same page.
However, it's not always possible to unmap the initial mapping, since
it's common to use things like large pages, BATs, large TLB entries,
etc. to map kernel memory.
> I tried to walk through the page tables data structures to get the
> pte, but it seems
> that the pmd is not present for the pages. If someone has done
> investigation
> on this before please send me a reply.
>
Kernel linear memory isn't necessarily mapped by the page tables. What
are you trying to do and with what processor ?
Ben.