* mprotect pgprot handling weirdness
@ 2010-04-06 5:09 Benjamin Herrenschmidt
2010-04-06 5:32 ` Benjamin Herrenschmidt
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-06 5:09 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel@vger.kernel.org
Hi folks !
While looking at untangling a bit of the mess with vm_flags and
pgprot (*), I noticed a few things I can't quite explain... they may
or may not be bugs, but I thought it was worth mentioning:
- In mprotect_fixup() :
    /*
     * vm_flags and vm_page_prot are protected by the mmap_sem
     * held in write mode.
     */
    vma->vm_flags = newflags;
    vma->vm_page_prot = pgprot_modify(vma->vm_page_prot,
                                      vm_get_page_prot(newflags));

    if (vma_wants_writenotify(vma)) {
        vma->vm_page_prot = vm_get_page_prot(newflags & ~VM_SHARED);
        dirty_accountable = 1;
    }
So as you can see above, we take great care (using pgprot_modify) to avoid
blasting away some PAT related flags on x86 (no other arch implements
pgprot_modify() today).... but if we hit vma_wants_writenotify(), then
we unconditionally override the entire vma->vm_page_prot field with some
new prot bits born of the new vm_flags. That sounds odd...
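To make the concern concrete, here is a minimal userspace sketch (plain C, not kernel code). The bit values, MODEL_PRESERVE_MASK and the model_* helpers are all assumptions made up for illustration, and the merge rule is only loosely modeled on what pgprot_modify() is described as doing above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit layout only; not any architecture's real PTE format. */
#define MODEL_PROT_PRESENT  0x001u
#define MODEL_PROT_RW       0x002u
#define MODEL_PROT_CACHE_A  0x008u  /* stand-in for a PAT-style cache bit */
#define MODEL_PROT_CACHE_B  0x010u  /* stand-in for a PAT-style cache bit */
#define MODEL_PRESERVE_MASK (MODEL_PROT_CACHE_A | MODEL_PROT_CACHE_B)

/* Simplified model of pgprot_modify(): carry the preserved bits of the
 * old protection over, and add whatever the new protection brings. */
static inline uint32_t model_pgprot_modify(uint32_t oldprot, uint32_t newprot)
{
    return (oldprot & MODEL_PRESERVE_MASK) | newprot;
}

/* Simplified model of the tail of mprotect_fixup(): the first assignment
 * is careful, the second one (taken when write-notify is wanted) replaces
 * the whole value, so any preserved bits are dropped. */
static inline uint32_t model_fixup(uint32_t oldprot,
                                   uint32_t prot_from_flags,
                                   uint32_t prot_from_flags_noshared,
                                   int wants_writenotify)
{
    uint32_t prot = model_pgprot_modify(oldprot, prot_from_flags);

    if (wants_writenotify)
        prot = prot_from_flags_noshared;  /* wholesale overwrite */
    return prot;
}
```

In this model, an old protection carrying MODEL_PROT_CACHE_B keeps that bit on the normal path and loses it on the write-notify path.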
- in sys_mprotect:
newflags = vm_flags | (vma->vm_flags & ~(VM_READ | VM_WRITE | VM_EXEC));
Do I read correctly that this means we cannot -remove- any flag other than
VM_READ, VM_WRITE or VM_EXEC ? That means that we cannot remove PROT_SAO
which gets turned into VM_SAO on powerpc ... Yet another reason to take
those arch specific mapping attributes out of the vm_flags.
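The effect of that expression can be tried out in userspace. This is only a sketch: MODEL_VM_SAO is a hypothetical stand-in for an arch specific bit (its value is an assumption), and model_newflags() simply repeats the arithmetic of the line quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define VM_READ      0x1u
#define VM_WRITE     0x2u
#define VM_EXEC      0x4u
#define MODEL_VM_SAO 0x20000000u  /* hypothetical arch specific bit */

/* Same arithmetic as the sys_mprotect() line: the requested bits are
 * OR-ed in, and only READ/WRITE/EXEC are cleared from the old flags. */
static inline uint32_t model_newflags(uint32_t requested, uint32_t current_flags)
{
    return requested | (current_flags & ~(VM_READ | VM_WRITE | VM_EXEC));
}
```

With current flags of VM_READ | VM_WRITE | MODEL_VM_SAO, asking for VM_READ alone yields VM_READ | MODEL_VM_SAO, and asking for nothing at all still leaves MODEL_VM_SAO set. A bit outside READ/WRITE/EXEC can be added this way but never dropped.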
(*) Right now it's near impossible to add arch specific PROT_* bits to
mmap/mprotect for fancy things like cacheability attributes, or other
nifty things like reverse-endian mappings that we have on some embedded
platforms. I'm investigating ways to better separate vm_page_prot from
vm_flags so some PROT_* bits can go straight to the former without
having to be mirrored in some way in the latter.
Cheers,
Ben.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: mprotect pgprot handling weirdness
2010-04-06 5:09 mprotect pgprot handling weirdness Benjamin Herrenschmidt
@ 2010-04-06 5:32 ` Benjamin Herrenschmidt
2010-04-06 5:43 ` Benjamin Herrenschmidt
2010-04-06 5:52 ` KOSAKI Motohiro
2 siblings, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-06 5:32 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel@vger.kernel.org
On Tue, 2010-04-06 at 15:09 +1000, Benjamin Herrenschmidt wrote:
> Hi folks !
>
> While looking at untangling a bit some of the mess with vm_flags and
> pgprot (*), I notices a few things I can't quite explain... they may ..
> or may not be bugs, but I though it was worth mentioning:
And another one:
- vma_wants_writenotify():
    /* The open routine did something to the protections already? */
    if (pgprot_val(vma->vm_page_prot) !=
        pgprot_val(vm_get_page_prot(vm_flags)))
        return 0;
That's going to blow if any -other- prot bit is used here.
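A tiny userspace model of that comparison, with made-up bit values (MODEL_PROT_BASE and MODEL_PROT_ATTR are assumptions, not real protection bits), shows why any extra bit in vm_page_prot makes the strict equality test bail out:

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PROT_BASE 0x003u  /* what the flags alone would produce, in this model */
#define MODEL_PROT_ATTR 0x100u  /* hypothetical extra attribute bit */

/* Mirrors the shape of the check above: returns 0 (bail out) as soon as
 * the stored protection differs in any bit from the one derived from
 * the flags, and 1 if the later checks would still be reached. */
static inline int model_passes_prot_check(uint32_t vm_page_prot,
                                          uint32_t prot_from_flags)
{
    if (vm_page_prot != prot_from_flags)
        return 0;
    return 1;
}
```

Here model_passes_prot_check(MODEL_PROT_BASE | MODEL_PROT_ATTR, MODEL_PROT_BASE) returns 0, so a mapping carrying only an extra attribute bit is treated as if the open routine had changed its protections.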
Cheers,
Ben.
* Re: mprotect pgprot handling weirdness
2010-04-06 5:09 mprotect pgprot handling weirdness Benjamin Herrenschmidt
2010-04-06 5:32 ` Benjamin Herrenschmidt
@ 2010-04-06 5:43 ` Benjamin Herrenschmidt
2010-04-06 5:52 ` KOSAKI Motohiro
2 siblings, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-06 5:43 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel@vger.kernel.org
On Tue, 2010-04-06 at 15:09 +1000, Benjamin Herrenschmidt wrote:
> (*) Right now it's near impossible to add arch specific PROT_* bits to
> mmap/mprotect for fancy things like cachability attributes, or other
> nifty things like reverse-endian mappings that we have on some embedded
> platforms, I'm investigating ways to better separate vm_page_prot from
> vm_flags so some PROT_* bits can go straight to the former without
> having to be mirrored in some way in the later.
The other (easier) option is to make the vm flags always 64-bit and
reserve a range of bits here for the arch to use but I suppose there's
going to be unhappiness about that one :-)
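As a rough sketch of what that could look like (userspace C; the type name, the choice of the upper 32 bits as the arch range, and the two example flags are all assumptions for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Flags are always 64-bit in this model, even on a 32-bit build. */
typedef uint64_t model_vm_flags_t;

/* Hypothetical split: low 32 bits generic, high 32 bits reserved for the arch. */
#define MODEL_VM_ARCH_SHIFT  32
#define MODEL_VM_ARCH_MASK   (UINT64_C(0xffffffff) << MODEL_VM_ARCH_SHIFT)
#define MODEL_VM_ARCH_BIT(n) (UINT64_C(1) << (MODEL_VM_ARCH_SHIFT + (n)))

/* Example arch flags an architecture might define in its own range. */
#define MODEL_VM_SAO MODEL_VM_ARCH_BIT(0)
#define MODEL_VM_LE  MODEL_VM_ARCH_BIT(1)

/* Bits owned by generic code. */
static inline model_vm_flags_t model_generic_part(model_vm_flags_t flags)
{
    return flags & ~MODEL_VM_ARCH_MASK;
}

/* Bits owned by the architecture. */
static inline model_vm_flags_t model_arch_part(model_vm_flags_t flags)
{
    return flags & MODEL_VM_ARCH_MASK;
}
```

The appeal of this layout is that anything comparing whole flag words keeps working unchanged, since the arch bits are just part of the same value; the cost is a wider field on 32-bit builds.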
Cheers,
Ben.
* Re: mprotect pgprot handling weirdness
2010-04-06 5:09 mprotect pgprot handling weirdness Benjamin Herrenschmidt
2010-04-06 5:32 ` Benjamin Herrenschmidt
2010-04-06 5:43 ` Benjamin Herrenschmidt
@ 2010-04-06 5:52 ` KOSAKI Motohiro
2010-04-06 6:07 ` Arch specific mmap attributes (Was: mprotect pgprot handling weirdness) Benjamin Herrenschmidt
2 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2010-04-06 5:52 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: kosaki.motohiro, linux-mm, linux-kernel@vger.kernel.org
> Hi folks !
>
> While looking at untangling a bit some of the mess with vm_flags and
> pgprot (*), I notices a few things I can't quite explain... they may ..
> or may not be bugs, but I though it was worth mentioning:
>
> - In mprotect_fixup() :
>
> /*
> * vm_flags and vm_page_prot are protected by the mmap_sem
> * held in write mode.
> */
> vma->vm_flags = newflags;
> vma->vm_page_prot = pgprot_modify(vma->vm_page_prot,
> vm_get_page_prot(newflags));
>
> if (vma_wants_writenotify(vma)) {
> vma->vm_page_prot = vm_get_page_prot(newflags & ~VM_SHARED);
> dirty_accountable = 1;
> }
>
> So as you can see above, we take great care (using pgprot_modify) to avoid
> blasting away some PAT related flags on x86 (no other arch implements
> pgprot_modify() today).... but if we hit vma_wants_writenotify(), then
> we unconditionally override the entire vma->vm_page_prot field with some
> new prot bits born of the new vm_flags. That sounds odd...
>
> - in sys_mprotect:
>
> newflags = vm_flags | (vma->vm_flags & ~(VM_READ | VM_WRITE | VM_EXEC));
>
> Do I read correctly that this means we cannot -remove- any flag than
> VM_READ, VM_WRITE or VM_EXEC ? That means that we cannot remove PROT_SAO
> which gets turned into VM_SAO on powerpc ... Yet another reason to take
> those arch specific mapping attributes out of the vm_flags.
>
> (*) Right now it's near impossible to add arch specific PROT_* bits to
> mmap/mprotect for fancy things like cachability attributes, or other
> nifty things like reverse-endian mappings that we have on some embedded
> platforms, I'm investigating ways to better separate vm_page_prot from
> vm_flags so some PROT_* bits can go straight to the former without
> having to be mirrored in some way in the later.
This check was introduced by the following commit. Yes, right now we don't
consider arch specific PROT_xx flags, but I don't think that is odd.
Yeah, I can imagine that at least embedded people certainly need arch
specific PROT_xx flags and hope to be able to change them. But I don't
think mprotect() fits your usage. I mean, mprotect() is widely
used internally by glibc, so if mprotect() could change such flags,
glibc might turn them off implicitly.
So, why can't we add a proper new syscall? It has no regression risk.
==========================================================
commit d5e066ae3c39b4036b5f5021c352af0b73c85568
Author: torvalds <torvalds>
Date: Fri Sep 5 19:05:07 2003 +0000
Fix mprotect() to do proper PROT_xxx -> VM_xxx translation.
This also fixes the bug with MAP_SEM being potentially
interpreted as VM_SHARED.
BKrev: 3f58de63gvzz-PsxwnRPnXTpz7EOeg
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 2c01579..699962e 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -224,7 +224,7 @@ fail:
asmlinkage long
sys_mprotect(unsigned long start, size_t len, unsigned long prot)
{
- unsigned long nstart, end, tmp;
+ unsigned long vm_flags, nstart, end, tmp;
struct vm_area_struct * vma, * next, * prev;
int error = -EINVAL;
@@ -239,6 +239,8 @@ sys_mprotect(unsigned long start, size_t len, unsigned long prot)
if (end == start)
return 0;
+ vm_flags = calc_vm_prot_bits(prot);
+
down_write(&current->mm->mmap_sem);
vma = find_vma_prev(current->mm, start, &prev);
@@ -257,7 +259,8 @@ sys_mprotect(unsigned long start, size_t len, unsigned long prot)
goto out;
}
- newflags = prot | (vma->vm_flags & ~(PROT_READ | PROT_WRITE | PROT_EXEC));
+ newflags = vm_flags | (vma->vm_flags & ~(VM_READ | VM_WRITE | VM_EXEC));
+
if ((newflags & ~(newflags >> 4)) & 0xf) {
error = -EACCES;
goto out;
* Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
2010-04-06 5:52 ` KOSAKI Motohiro
@ 2010-04-06 6:07 ` Benjamin Herrenschmidt
2010-04-06 6:24 ` KOSAKI Motohiro
0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-06 6:07 UTC (permalink / raw)
To: KOSAKI Motohiro; +Cc: linux-mm, linux-kernel@vger.kernel.org, linux-arch
On Tue, 2010-04-06 at 14:52 +0900, KOSAKI Motohiro wrote:
(Adding linux-arch)
> This check was introduced the following commit. yes now we don't
> consider arch specific PROT_xx flags. but I don't think it is odd.
>
> Yeah, I can imagine at least embedded people certenary need arch
> specific PROT_xx flags and they hope to change it. but I don't
> think mprotect() fit for your usage. I mean mprotect() is widely
> used glibc internally. then, If mprotec can change which flags,
> glibc might turn off such flags implictly.
>
> So, Why can't we proper new syscall? It has no regression risk.
I don't care much personally whether we use mprotect() or a new syscall,
but at this stage we already have PROT_SAO going that way for powerpc so
that would be an ABI change.
However, the main issue isn't really there. The main issue is that right
now, everything we do in mmap.c, mprotect.c, ... revolves around having
everything translated into the single vm_flags field. VMA merging
decisions, construction of vm_page_prot, etc... everything is there.
However, this is a 32-bit field on 32-bit archs, and we already use all
possible bits in there. It's also a field entirely defined in generic
code with no provision for arch specific bits.
The question here thus boils down to what direction do we want to go to
if we want to untangle that and provide the ability to expose mapping
"attributes" basically. In fact, I suspect even x86 might have good use
of that to create things like relaxed ordering mappings no ?
This boils down, so far to a few facts/questions to be resolved:
- Do we want to use the existing PROT_ argument to mmap, mprotect,... ?
There's plenty of bit space, and we already have at least one example of
an arch adding something to it (powerpc with PROT_SAO - aka Strong
Access Ordering - aka Make It Look Like An x86 :-)
- If not, while a separate syscall would be fine with me for setting
attributes after the fact, it makes it harder to pass them via mmap, is
that a big deal ? I.e. it means one -always- has to call it after mmap
to change the attributes. That means for example that mmap will
potentially create a VMA merged with another one, just to be re-split
due to the attribute change. A bit gross...
- Do we want to keep the current "Funnel everything into vm_flags"
approach ? That leaves no option that I can see but to extend it into a
u64 so it grows on 32-bit archs.
- If not, I see two approaches here: Either having a separate / new
"attribute" field in the VMA or going straight for the vm_page_prot (ie.
the pgprot). In both cases, things like vma_merge() need to grow a new
argument since obviously we can't merge things with different
attributes.
- ... Unless we just replace VM_SAO with VM_CANT_MERGE and set that
whenever a VMA has a non-0 attributes. Sad but simpler
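The two merge-related options in the list above can be sketched side by side in userspace C. Everything here is hypothetical: struct model_vma, its attrs field, MODEL_VM_CANT_MERGE and both predicates are illustrative stand-ins, not the kernel's vma_merge() logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy VMA: an address range, the usual flags, and a hypothetical
 * separate attribute field. */
struct model_vma {
    unsigned long start, end;
    uint32_t flags;
    uint32_t attrs;
};

#define MODEL_VM_CANT_MERGE 0x80000000u  /* hypothetical flag bit */

/* Option A: the merge check grows an extra comparison on attributes. */
static inline bool model_can_merge(const struct model_vma *a,
                                   const struct model_vma *b)
{
    return a->end == b->start &&
           a->flags == b->flags &&
           a->attrs == b->attrs;
}

/* Option B: any VMA with non-zero attributes is flagged as unmergeable,
 * so the merge check never needs to look at the attributes at all. */
static inline bool model_can_merge_coarse(const struct model_vma *a,
                                          const struct model_vma *b)
{
    if ((a->flags | b->flags) & MODEL_VM_CANT_MERGE)
        return false;
    return a->end == b->start && a->flags == b->flags;
}
```

Under option A, two adjacent mappings with identical attributes still merge. Under option B they never do, even when their attributes match, which is the "sad but simpler" trade-off.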
Any other / better idea ?
Cheers,
Ben.
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
2010-04-06 6:07 ` Arch specific mmap attributes (Was: mprotect pgprot handling weirdness) Benjamin Herrenschmidt
@ 2010-04-06 6:24 ` KOSAKI Motohiro
2010-04-06 7:30 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2010-04-06 6:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: kosaki.motohiro, linux-mm, linux-kernel@vger.kernel.org,
linux-arch
> On Tue, 2010-04-06 at 14:52 +0900, KOSAKI Motohiro wrote:
>
> (Adding linux-arch)
>
> > This check was introduced the following commit. yes now we don't
> > consider arch specific PROT_xx flags. but I don't think it is odd.
> >
> > Yeah, I can imagine at least embedded people certenary need arch
> > specific PROT_xx flags and they hope to change it. but I don't
> > think mprotect() fit for your usage. I mean mprotect() is widely
> > used glibc internally. then, If mprotec can change which flags,
> > glibc might turn off such flags implictly.
> >
> > So, Why can't we proper new syscall? It has no regression risk.
>
> I don't care much personally whether we use mprotect() or a new syscall,
> but at this stage we already have PROT_SAO going that way for powerpc so
> that would be an ABI change.
>
> However, the main issue isn't really there. The main issue is that right
> now, everything we do in mmap.c, mprotect.c, ... revolves around having
> everything translated into the single vm_flags field. VMA merging
> decisions, construction of vm_page_prot, etc... everything is there.
>
> However, this is a 32-bit field on 32-bit archs, and we already use all
> possible bits in there. It's also a field entirely defined in generic
> code with no provision for arch specific bits.
>
> The question here thus boils down to what direction do we want to go to
> if we want to untangle that and provide the ability to expose mapping
> "attributes" basically. In fact, I suspect even x86 might have good use
> of that to create things like relaxed ordering mappings no ?
>
> This boils down, so far to a few facts/questions to be resolved:
>
> - Do we want to use the existing PROT_ argument to mmap, mprotect,... ?
> There's plenty of bit space, and we already have at least one example of
> an arch adding something to it (powerpc with PROT_SAO - aka Strong
> Access Ordering - aka Make It Look Like An x86 :-)
>
> - If not, while a separate syscall would be fine with me for setting
> attributes after the fact, it makes it harder to pass them via mmap, is
> that a big deal ? IE. Ie it means one -always- has to call it after mmap
> to change the attributes. That means for example that mmap will
> potentially create a VMA merged with another one, just to be re-split
> due to the attribute change. A bit gross...
>
> - Do we want to keep the current "Funnel everything into vm_flags"
> approach ? That leaves no option that I can see but to extend it into a
> u64 so it grows on 32-bit archs.
>
> - If not, I see two approaches here: Either having a separate / new
> "attribute" field in the VMA or going straight for the vm_page_prot (ie.
> the pgprot). In both cases, things like vma_merge() need to grow a new
> argument since obviously we can't merge things with different
> attributes.
>
> - ... Unless we just replace VM_SAO with VM_CANT_MERGE and set that
> whenever a VMA has a non-0 attributes. Sad but simpler
>
> Any other / better idea ?
I guess you haven't caught my intention. I didn't say we have to remove
PROT_SAO and VM_SAO.
I mean mmap(PROT_SAO) is ok; it only appends a new flag and doesn't change
the meaning of existing flags. I'm only against mprotect(PROT_NONE) turning
off PROT_SAO implicitly.
IOW I recommend we use three syscalls:
  mmap():       create new mappings
  mprotect():   change the protection of a mapping (as the name says)
  mattribute(): (or a similar name)
                change an attribute of a mapping (e.g. PROT_SAO or
                other arch specific flags)
I'm not against changing mm/mprotect.c for PROT_SAO.
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
2010-04-06 6:24 ` KOSAKI Motohiro
@ 2010-04-06 7:30 ` Benjamin Herrenschmidt
2010-04-06 10:26 ` KOSAKI Motohiro
0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-06 7:30 UTC (permalink / raw)
To: KOSAKI Motohiro; +Cc: linux-mm, linux-kernel@vger.kernel.org, linux-arch
On Tue, 2010-04-06 at 15:24 +0900, KOSAKI Motohiro wrote:
> I guess you haven't catch my intention. I didn't say we have to remove
> PROT_SAO and VM_SAO.
> I mean mmap(PROT_SAO) is ok, it's only append new flag, not change exiting
> flags meanings. I'm only against mprotect(PROT_NONE) turn off PROT_SAO
> implicitely.
>
> IOW I recommend we use three syscall
> mmap() create new mappings
> mprotect() change a protection of mapping (as a name)
> mattribute(): (or similar name)
> change an attribute of mapping (e.g. PROT_SAO or
> another arch specific flags)
>
> I'm not against changing mm/protect.c for PROT_SAO.
Ok, I see. No biggie. The main deal remains how we want to do that
inside the kernel :-) I think the less horrible options here are
to either extend vm_flags to always be 64-bit, or add a separate
vm_map_attributes flag, and add the necessary bits and pieces to
prevent merge across different attribute vma's.
The more I try to hack it into vm_page_prot, the more I hate that
option.
Cheers
Ben.
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
2010-04-06 7:30 ` Benjamin Herrenschmidt
@ 2010-04-06 10:26 ` KOSAKI Motohiro
2010-04-06 22:15 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2010-04-06 10:26 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: kosaki.motohiro, linux-mm, linux-kernel@vger.kernel.org,
linux-arch
> Ok, I see. No biggie. The main deal remains how we want to do that
> inside the kernel :-) I think the less horrible options here are
> to either extend vm_flags to always be 64-bit, or add a separate
> vm_map_attributes flag, and add the necessary bits and pieces to
> prevent merge accross different attribute vma's.
vma->vm_flags already has VM_SAO. Why do we need more flags?
At least, I dislike adding a separate flags member to the vma.
It might introduce unnecessary mess into the vma merge code.
> The more I try to hack it into vm_page_prot, the more I hate that
> option.
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
2010-04-06 10:26 ` KOSAKI Motohiro
@ 2010-04-06 22:15 ` Benjamin Herrenschmidt
2010-04-07 6:03 ` KOSAKI Motohiro
0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-06 22:15 UTC (permalink / raw)
To: KOSAKI Motohiro; +Cc: linux-mm, linux-kernel@vger.kernel.org, linux-arch
On Tue, 2010-04-06 at 19:26 +0900, KOSAKI Motohiro wrote:
> > Ok, I see. No biggie. The main deal remains how we want to do that
> > inside the kernel :-) I think the less horrible options here are
> > to either extend vm_flags to always be 64-bit, or add a separate
> > vm_map_attributes flag, and add the necessary bits and pieces to
> > prevent merge accross different attribute vma's.
>
> vma->vm_flags already have VM_SAO. Why do we need more flags?
> At least, I dislike to add separate flags member into vma.
> It might introduce unnecessary messy into vma merge thing.
Well, we did shove SAO in there, and used up the very last vm_flag for
it a while back. Now I need another one, for little endian mappings. So
I'm stuck.
But the problem goes further I believe. Archs do nowadays have quite an
interesting set of MMU attributes that it would be useful to expose to
some extent.
Some powerpc's also provide storage keys for example and I think ARM
have something along those lines. There's interesting cachability
attributes too, on x86 as well. Being able to use such attributes to
request for example a relaxed ordering mapping on x86 might be useful.
I think it basically boils down to either extend vm_flags to always be
64-bit, which seems to be Nick's preferred approach, or introduce a
vm_attributes with all the necessary changes to the merge code to take
it into account (not -that- hard tho, there's only half a page of
results in grep for these things :-)
Cheers,
Ben.
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
2010-04-06 22:15 ` Benjamin Herrenschmidt
@ 2010-04-07 6:03 ` KOSAKI Motohiro
2010-04-07 7:03 ` Arch specific mmap attributes David Miller
2010-04-07 8:56 ` Arch specific mmap attributes (Was: mprotect pgprot handling weirdness) Benjamin Herrenschmidt
0 siblings, 2 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2010-04-07 6:03 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: kosaki.motohiro, linux-mm, linux-kernel@vger.kernel.org,
linux-arch
> On Tue, 2010-04-06 at 19:26 +0900, KOSAKI Motohiro wrote:
> > > Ok, I see. No biggie. The main deal remains how we want to do that
> > > inside the kernel :-) I think the less horrible options here are
> > > to either extend vm_flags to always be 64-bit, or add a separate
> > > vm_map_attributes flag, and add the necessary bits and pieces to
> > > prevent merge accross different attribute vma's.
> >
> > vma->vm_flags already have VM_SAO. Why do we need more flags?
> > At least, I dislike to add separate flags member into vma.
> > It might introduce unnecessary messy into vma merge thing.
>
> Well, we did shove SAO in there, and used up the very last vm_flag for
> it a while back. Now I need another one, for little endian mappings. So
> I'm stuck.
>
> But the problem goes further I believe. Archs do nowadays have quite an
> interesting set of MMU attributes that it would be useful to expose to
> some extent.
Generally speaking, this doesn't seem like a good idea. The desktop and
server world isn't interested in arch specific mmu attribute crap, because
many open source and ISV libraries don't care about it. I know high-end HPC
and embedded have a different ecosystem; they might want to use such strange
mmu features. I recommend you focus on the powerpc ecosystem.
I'm not against changing kernel internals. I only disagree that mmu
attributes will become widely used.
>
> Some powerpc's also provide storage keys for example and I think ARM
> have something along those lines. There's interesting cachability
> attributes too, on x86 as well. Being able to use such attributes to
> request for example a relaxed ordering mapping on x86 might be useful.
>
> I think it basically boils down to either extend vm_flags to always be
> 64-bit, which seems to be Nick preferred approach, or introduct a
> vm_attributes with all the necessary changes to the merge code to take
> it into account (not -that- hard tho, there's only half a page of
> results in grep for these things :-)
* Re: Arch specific mmap attributes
2010-04-07 6:03 ` KOSAKI Motohiro
@ 2010-04-07 7:03 ` David Miller
2010-04-07 7:14 ` KOSAKI Motohiro
2010-04-07 8:58 ` Benjamin Herrenschmidt
2010-04-07 8:56 ` Arch specific mmap attributes (Was: mprotect pgprot handling weirdness) Benjamin Herrenschmidt
1 sibling, 2 replies; 16+ messages in thread
From: David Miller @ 2010-04-07 7:03 UTC (permalink / raw)
To: kosaki.motohiro; +Cc: benh, linux-mm, linux-kernel, linux-arch
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Wed, 7 Apr 2010 15:03:45 +0900 (JST)
> I'm not against changing kernel internal. I only disagree mmu
> attribute fashion will be become used widely.
Desktop already uses similar features via PCI mmap
attributes and such, not to mention MSR settings on
x86.
So I disagree with your assessment that this is some
HPC/embedded issue.
* Re: Arch specific mmap attributes
2010-04-07 7:03 ` Arch specific mmap attributes David Miller
@ 2010-04-07 7:14 ` KOSAKI Motohiro
2010-04-07 7:18 ` David Miller
2010-04-07 9:00 ` Benjamin Herrenschmidt
2010-04-07 8:58 ` Benjamin Herrenschmidt
1 sibling, 2 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2010-04-07 7:14 UTC (permalink / raw)
To: David Miller; +Cc: kosaki.motohiro, benh, linux-mm, linux-kernel, linux-arch
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Wed, 7 Apr 2010 15:03:45 +0900 (JST)
>
> > I'm not against changing kernel internal. I only disagree mmu
> > attribute fashion will be become used widely.
>
> Desktop already uses similar features via PCI mmap
> attributes and such, not to mention MSR settings on
> x86.
Probably I haven't understood your point. Why would a userland process
need to change PCI mmap attributes via mmap(2)? It seems like a kernel issue.
> So I disagree with your assesment that this is some
> HPC/embedded issue.
* Re: Arch specific mmap attributes
2010-04-07 7:14 ` KOSAKI Motohiro
@ 2010-04-07 7:18 ` David Miller
2010-04-07 9:00 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 16+ messages in thread
From: David Miller @ 2010-04-07 7:18 UTC (permalink / raw)
To: kosaki.motohiro; +Cc: benh, linux-mm, linux-kernel, linux-arch
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Wed, 7 Apr 2010 16:14:29 +0900 (JST)
>> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>> Date: Wed, 7 Apr 2010 15:03:45 +0900 (JST)
>>
>> > I'm not against changing kernel internal. I only disagree mmu
>> > attribute fashion will be become used widely.
>>
>> Desktop already uses similar features via PCI mmap
>> attributes and such, not to mention MSR settings on
>> x86.
>
> Probably I haven't catch your mention. Why userland process
> need to change PCI mmap attribute by mmap(2)? It seems kernel issue.
It uses PCI specific fd ioctls to change the attributes.
It's the same thing as extending the mmap() attribute space, but in a
device specific way.
I think device and platform specific mmap() attributes are basically
inevitable, at any level, embedded or desktop or whatever. The
fact that we've hacked around the issue with device specific
interfaces like the PCI device ioctls, is no excuse to not
tackle the issue directly and come up with something usable.
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
2010-04-07 6:03 ` KOSAKI Motohiro
2010-04-07 7:03 ` Arch specific mmap attributes David Miller
@ 2010-04-07 8:56 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-07 8:56 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: linux-mm, linux-kernel@vger.kernel.org, linux-arch, Nick Piggin,
Hugh Dickins
On Wed, 2010-04-07 at 15:03 +0900, KOSAKI Motohiro wrote:
> Generally speaking, It seems no good idea. desktop and server world don't
> interest arch specific mmu attribute crap.
So you are saying that because your desktop and servers don't care Linux
shouldn't support the possibility ? IE. Embedded doesn't matter or some
sort of similar statement ? :-) Come on ...
Anyways, this is just not true. Take SAO, this is a server feature (used
among others for x86 emulation). Little Endian mappings is indeed more
of an "embedded" feature to some extent, at least the way we plan to use
it, but is still very relevant.
Caching attributes control and storage keys can be useful in a lot of
other areas that really have nothing to do with HPC :-) Databases come
to mind, there's more too.
In any case, I don't know why you argue. We have features that a lot of
the CPUs out there provide, that at least some people out there would
like to exploit, and you are saying that Linux should not provide
support for these because your vision of a desktop/server only world is
all that matters ?
Anyways, let's go back to -how- to implement that properly rather than
that sort of reasonably useless argument.
> because many many opensource
> and ISV library don't care it. I know highend hpc and embedded have
> differenct eco-system. they might want to use such strange mmu feature.
> I recommend to you are focusing popwerpc eco-system.
Thank you for your recommendation :-)
> I'm not against changing kernel internal. I only disagree mmu attribute
> fashion will be become used widely.
So how do you propose we proceed ? Extend vm_flags to be a u64 instead ?
I don't really care much which method is used, though from a -technical-
perspective, the mmu attributes one seems to be nicer in the long run,
but my immediate needs would be well served by just adding 2 or 3 flags
in there :-)
In any case, I'd be curious to have Hugh's and Nick's opinions here on the
technicalities.
Cheers,
Ben.
> > Some powerpc's also provide storage keys for example and I think ARM
> > have something along those lines. There's interesting cachability
> > attributes too, on x86 as well. Being able to use such attributes to
> > request for example a relaxed ordering mapping on x86 might be useful.
> >
> > I think it basically boils down to either extend vm_flags to always be
> > 64-bit, which seems to be Nick preferred approach, or introduct a
> > vm_attributes with all the necessary changes to the merge code to take
> > it into account (not -that- hard tho, there's only half a page of
> > results in grep for these things :-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: Arch specific mmap attributes
2010-04-07 7:03 ` Arch specific mmap attributes David Miller
2010-04-07 7:14 ` KOSAKI Motohiro
@ 2010-04-07 8:58 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-07 8:58 UTC (permalink / raw)
To: David Miller; +Cc: kosaki.motohiro, linux-mm, linux-kernel, linux-arch
On Wed, 2010-04-07 at 00:03 -0700, David Miller wrote:
> > I'm not against changing kernel internals. I only disagree that the MMU
> > attribute approach will become widely used.
>
> Desktop already uses similar features via PCI mmap
> attributes and such, not to mention MSR settings on
> x86.
This is a very good point. We've had all sorts of trouble hacking that in
for PCI mmap: getting write combining in, which we got on /proc via a
tweak that I think never made it over to sysfs, and the ability to
control cachability, for which we used to have O_SYNC hacks in /dev/mem.
I think there is room for a nice and clean set of attributes here.
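For reference, a rough sketch of what the /proc route looks like from userspace today. It assumes a Linux system whose headers provide `PCIIOC_MMAP_IS_MEM` and `PCIIOC_WRITE_COMBINE` in <linux/pci.h>, an architecture that accepts write combining on this interface, and that the mmap offset is the BAR's bus address (this varies by architecture). `map_bar_wc` and its parameters are made-up names for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/pci.h>   /* PCIIOC_MMAP_IS_MEM, PCIIOC_WRITE_COMBINE */

/*
 * Map `len` bytes of a PCI memory BAR through a /proc/bus/pci device
 * node, asking for write combining first.
 *
 * proc_node: path such as /proc/bus/pci/<bus>/<dev>.<fn>
 * bar_addr:  BAR address used as the mmap offset (ASSUMPTION: the
 *            meaning of this offset is architecture dependent)
 *
 * Returns MAP_FAILED if any step fails.
 */
void *map_bar_wc(const char *proc_node, off_t bar_addr, size_t len)
{
    assert(len > 0);

    int fd = open(proc_node, O_RDWR);
    if (fd < 0)
        return MAP_FAILED;

    void *p = MAP_FAILED;

    /* Select memory space, then request write combining for this fd. */
    if (ioctl(fd, PCIIOC_MMAP_IS_MEM, 0) == 0 &&
        ioctl(fd, PCIIOC_WRITE_COMBINE, 1) == 0)
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                 fd, bar_addr);

    close(fd);  /* an established mapping stays valid after close */
    return p;
}
```

Note how the attribute travels through a per-file ioctl rather than through mmap() itself, which is exactly the kind of side channel a clean set of mapping attributes could replace.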
Cheers,
Ben.
* Re: Arch specific mmap attributes
2010-04-07 7:14 ` KOSAKI Motohiro
2010-04-07 7:18 ` David Miller
@ 2010-04-07 9:00 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-07 9:00 UTC (permalink / raw)
To: KOSAKI Motohiro; +Cc: David Miller, linux-mm, linux-kernel, linux-arch
On Wed, 2010-04-07 at 16:14 +0900, KOSAKI Motohiro wrote:
> > Desktop already uses similar features via PCI mmap
> > attributes and such, not to mention MSR settings on
> > x86.
>
> I probably haven't caught your point. Why would a userland process
> need to change PCI mmap attributes via mmap(2)? It seems like a kernel issue.
There are cases where the userspace based driver needs to control
attributes such as write combining, or even cachability when mapping PCI
devices directly into userspace.
It's not -that- common, though X still does it on a number of platforms,
and there are people still trying to run PCI drivers in userspace ;-)
But regardless, I don't see why HPC or embedded would have to be
qualified as "crap" and not warrant our full attention in devising
something sane and clean anyway.
Cheers,
Ben.
end of thread [~2010-04-07 9:00 UTC]
Thread overview: 16+ messages
2010-04-06 5:09 mprotect pgprot handling weirdness Benjamin Herrenschmidt
2010-04-06 5:32 ` Benjamin Herrenschmidt
2010-04-06 5:43 ` Benjamin Herrenschmidt
2010-04-06 5:52 ` KOSAKI Motohiro
2010-04-06 6:07 ` Arch specific mmap attributes (Was: mprotect pgprot handling weirdness) Benjamin Herrenschmidt
2010-04-06 6:24 ` KOSAKI Motohiro
2010-04-06 7:30 ` Benjamin Herrenschmidt
2010-04-06 10:26 ` KOSAKI Motohiro
2010-04-06 22:15 ` Benjamin Herrenschmidt
2010-04-07 6:03 ` KOSAKI Motohiro
2010-04-07 7:03 ` Arch specific mmap attributes David Miller
2010-04-07 7:14 ` KOSAKI Motohiro
2010-04-07 7:18 ` David Miller
2010-04-07 9:00 ` Benjamin Herrenschmidt
2010-04-07 8:58 ` Benjamin Herrenschmidt
2010-04-07 8:56 ` Arch specific mmap attributes (Was: mprotect pgprot handling weirdness) Benjamin Herrenschmidt