* mprotect pgprot handling weirdness
From: Benjamin Herrenschmidt @ 2010-04-06  5:09 UTC
To: linux-mm; +Cc: linux-kernel@vger.kernel.org

Hi folks !

While looking at untangling a bit some of the mess with vm_flags and
pgprot (*), I noticed a few things I can't quite explain... they may ..
or may not be bugs, but I thought it was worth mentioning:

 - In mprotect_fixup() :

    /*
     * vm_flags and vm_page_prot are protected by the mmap_sem
     * held in write mode.
     */
    vma->vm_flags = newflags;
    vma->vm_page_prot = pgprot_modify(vma->vm_page_prot,
                                      vm_get_page_prot(newflags));

    if (vma_wants_writenotify(vma)) {
        vma->vm_page_prot = vm_get_page_prot(newflags & ~VM_SHARED);
        dirty_accountable = 1;
    }

So as you can see above, we take great care (using pgprot_modify) to avoid
blasting away some PAT related flags on x86 (no other arch implements
pgprot_modify() today).... but if we hit vma_wants_writenotify(), then
we unconditionally override the entire vma->vm_page_prot field with some
new prot bits born of the new vm_flags. That sounds odd...

 - in sys_mprotect:

    newflags = vm_flags | (vma->vm_flags & ~(VM_READ | VM_WRITE | VM_EXEC));

Do I read correctly that this means we cannot -remove- any flag other than
VM_READ, VM_WRITE or VM_EXEC ? That means that we cannot remove PROT_SAO
which gets turned into VM_SAO on powerpc ... Yet another reason to take
those arch specific mapping attributes out of the vm_flags.

(*) Right now it's near impossible to add arch specific PROT_* bits to
mmap/mprotect for fancy things like cachability attributes, or other
nifty things like reverse-endian mappings that we have on some embedded
platforms, I'm investigating ways to better separate vm_page_prot from
vm_flags so some PROT_* bits can go straight to the former without
having to be mirrored in some way in the latter.

Cheers,
Ben.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
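
The two observations in the message above can be modelled in a few lines of userspace C. This is only a sketch under loud assumptions: protections are plain integers, `sketch_get_page_prot()` is a toy stand-in for vm_get_page_prot(), `SKETCH_PROT_CACHE_BIT` stands in for a PAT-style bit that pgprot_modify() would preserve, and `SKETCH_VM_SAO` is a placeholder value. None of the `sketch_*` names exist in the kernel.

```c
#include <stdbool.h>

#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_EXEC   0x4UL
#define VM_SHARED 0x8UL

/* Hypothetical values, for illustration only. */
#define SKETCH_VM_SAO         0x20000000UL
#define SKETCH_PROT_CACHE_BIT 0x100UL

/* Toy stand-in for vm_get_page_prot(): only the low four flag bits matter. */
unsigned long sketch_get_page_prot(unsigned long vm_flags)
{
    return vm_flags & (VM_READ | VM_WRITE | VM_EXEC | VM_SHARED);
}

/* Toy stand-in for pgprot_modify(): keep the cache bit, take the rest anew. */
unsigned long sketch_pgprot_modify(unsigned long oldprot, unsigned long newprot)
{
    return (oldprot & SKETCH_PROT_CACHE_BIT) | newprot;
}

/* Mirrors the order of operations quoted from mprotect_fixup(). */
unsigned long sketch_fixup_prot(unsigned long oldprot, unsigned long newflags,
                                bool wants_writenotify)
{
    unsigned long prot = sketch_pgprot_modify(oldprot,
                                              sketch_get_page_prot(newflags));

    if (wants_writenotify)
        /* Unconditional overwrite: the preserved bit is dropped here. */
        prot = sketch_get_page_prot(newflags & ~VM_SHARED);
    return prot;
}

/* Mirrors the newflags expression quoted from sys_mprotect(). */
unsigned long sketch_mprotect_newflags(unsigned long old_flags,
                                       unsigned long requested)
{
    return requested | (old_flags & ~(VM_READ | VM_WRITE | VM_EXEC));
}
```

In this model, a mapping whose old protection carries `SKETCH_PROT_CACHE_BIT` keeps it when `wants_writenotify` is false and loses it when it is true, and `sketch_mprotect_newflags()` can clear VM_WRITE but never `SKETCH_VM_SAO`, which is the asymmetry the message asks about.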
* Re: mprotect pgprot handling weirdness
From: Benjamin Herrenschmidt @ 2010-04-06  5:32 UTC
To: linux-mm; +Cc: linux-kernel@vger.kernel.org

On Tue, 2010-04-06 at 15:09 +1000, Benjamin Herrenschmidt wrote:
> Hi folks !
>
> While looking at untangling a bit some of the mess with vm_flags and
> pgprot (*), I noticed a few things I can't quite explain... they may ..
> or may not be bugs, but I thought it was worth mentioning:

And another one:

 - vma_wants_writenotify():

    /* The open routine did something to the protections already? */
    if (pgprot_val(vma->vm_page_prot) !=
        pgprot_val(vm_get_page_prot(vm_flags)))
        return 0;

That's going to blow if any -other- prot bit is used here.

Cheers,
Ben.
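
A small sketch of why the equality test above is fragile, assuming protections are modelled as plain integers and `SKETCH_PROT_ARCH_BIT` is a hypothetical extra attribute bit. The masked variant is only one conceivable way to make the comparison tolerant; it is an illustration, not something proposed in the thread.

```c
#include <stdbool.h>

#define SKETCH_PROT_BASE_MASK 0xfUL    /* bits derived from vm_flags (toy) */
#define SKETCH_PROT_ARCH_BIT  0x100UL  /* hypothetical arch attribute bit  */

/* Toy stand-in for vm_get_page_prot(). */
unsigned long sketch_get_page_prot(unsigned long vm_flags)
{
    return vm_flags & SKETCH_PROT_BASE_MASK;
}

/* Same shape as the quoted check: any extra bit makes the values differ. */
bool sketch_prot_untouched(unsigned long vm_page_prot, unsigned long vm_flags)
{
    return vm_page_prot == sketch_get_page_prot(vm_flags);
}

/* Hypothetical tolerant variant: ignore the arch bit before comparing. */
bool sketch_prot_untouched_masked(unsigned long vm_page_prot,
                                  unsigned long vm_flags)
{
    return (vm_page_prot & ~SKETCH_PROT_ARCH_BIT) ==
           sketch_get_page_prot(vm_flags);
}
```

With an arch bit set in `vm_page_prot`, the first predicate reports a mismatch even though no driver open routine touched the protections, so write-notify would be skipped in the real function.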
* Re: mprotect pgprot handling weirdness
From: Benjamin Herrenschmidt @ 2010-04-06  5:43 UTC
To: linux-mm; +Cc: linux-kernel@vger.kernel.org

On Tue, 2010-04-06 at 15:09 +1000, Benjamin Herrenschmidt wrote:
> (*) Right now it's near impossible to add arch specific PROT_* bits to
> mmap/mprotect for fancy things like cachability attributes, or other
> nifty things like reverse-endian mappings that we have on some embedded
> platforms, I'm investigating ways to better separate vm_page_prot from
> vm_flags so some PROT_* bits can go straight to the former without
> having to be mirrored in some way in the latter.

The other (easier) option is to make the vm flags always 64-bit and
reserve a range of bits here for the arch to use but I suppose there's
going to be unhappiness about that one :-)

Cheers,
Ben.
* Re: mprotect pgprot handling weirdness
From: KOSAKI Motohiro @ 2010-04-06  5:52 UTC
To: Benjamin Herrenschmidt
Cc: kosaki.motohiro, linux-mm, linux-kernel@vger.kernel.org

> Hi folks !
>
> While looking at untangling a bit some of the mess with vm_flags and
> pgprot (*), I noticed a few things I can't quite explain... they may ..
> or may not be bugs, but I thought it was worth mentioning:
>
>  - In mprotect_fixup() :
>
>     /*
>      * vm_flags and vm_page_prot are protected by the mmap_sem
>      * held in write mode.
>      */
>     vma->vm_flags = newflags;
>     vma->vm_page_prot = pgprot_modify(vma->vm_page_prot,
>                                       vm_get_page_prot(newflags));
>
>     if (vma_wants_writenotify(vma)) {
>         vma->vm_page_prot = vm_get_page_prot(newflags & ~VM_SHARED);
>         dirty_accountable = 1;
>     }
>
> So as you can see above, we take great care (using pgprot_modify) to avoid
> blasting away some PAT related flags on x86 (no other arch implements
> pgprot_modify() today).... but if we hit vma_wants_writenotify(), then
> we unconditionally override the entire vma->vm_page_prot field with some
> new prot bits born of the new vm_flags. That sounds odd...
>
>  - in sys_mprotect:
>
>     newflags = vm_flags | (vma->vm_flags & ~(VM_READ | VM_WRITE | VM_EXEC));
>
> Do I read correctly that this means we cannot -remove- any flag other than
> VM_READ, VM_WRITE or VM_EXEC ? That means that we cannot remove PROT_SAO
> which gets turned into VM_SAO on powerpc ... Yet another reason to take
> those arch specific mapping attributes out of the vm_flags.
>
> (*) Right now it's near impossible to add arch specific PROT_* bits to
> mmap/mprotect for fancy things like cachability attributes, or other
> nifty things like reverse-endian mappings that we have on some embedded
> platforms, I'm investigating ways to better separate vm_page_prot from
> vm_flags so some PROT_* bits can go straight to the former without
> having to be mirrored in some way in the latter.

This check was introduced by the following commit. Yes, right now we
don't consider arch specific PROT_xx flags, but I don't think that is odd.

Yeah, I can imagine that at least embedded people certainly need arch
specific PROT_xx flags and hope to change it, but I don't think
mprotect() fits your usage. I mean, mprotect() is widely used internally
by glibc; if mprotect() could change such flags, glibc might turn them
off implicitly.

So, why can't we add a proper new syscall? It has no regression risk.

==========================================================
commit d5e066ae3c39b4036b5f5021c352af0b73c85568
Author: torvalds <torvalds>
Date:   Fri Sep 5 19:05:07 2003 +0000

    Fix mprotect() to do proper PROT_xxx -> VM_xxx translation.

    This also fixes the bug with MAP_SEM being potentially interpreted
    as VM_SHARED.

    BKrev: 3f58de63gvzz-PsxwnRPnXTpz7EOeg

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 2c01579..699962e 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -224,7 +224,7 @@ fail:
 asmlinkage long
 sys_mprotect(unsigned long start, size_t len, unsigned long prot)
 {
-       unsigned long nstart, end, tmp;
+       unsigned long vm_flags, nstart, end, tmp;
        struct vm_area_struct * vma, * next, * prev;
        int error = -EINVAL;

@@ -239,6 +239,8 @@ sys_mprotect(unsigned long start, size_t len, unsigned long prot)
        if (end == start)
                return 0;

+       vm_flags = calc_vm_prot_bits(prot);
+
        down_write(&current->mm->mmap_sem);

        vma = find_vma_prev(current->mm, start, &prev);
@@ -257,7 +259,8 @@ sys_mprotect(unsigned long start, size_t len, unsigned long prot)
                        goto out;
                }

-               newflags = prot | (vma->vm_flags & ~(PROT_READ | PROT_WRITE | PROT_EXEC));
+               newflags = vm_flags | (vma->vm_flags & ~(VM_READ | VM_WRITE | VM_EXEC));
+
                if ((newflags & ~(newflags >> 4)) & 0xf) {
                        error = -EACCES;
                        goto out;
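
The trailing context of the diff, `(newflags & ~(newflags >> 4)) & 0xf`, depends on the VM_MAY* bits sitting exactly four bits above their VM_* counterparts. A minimal sketch of that test, assuming the conventional flag values shown below; `sketch_access_denied()` is a hypothetical helper name, not a kernel function.

```c
#include <stdbool.h>

#define VM_READ     0x01UL
#define VM_WRITE    0x02UL
#define VM_EXEC     0x04UL
#define VM_SHARED   0x08UL
#define VM_MAYREAD  0x10UL
#define VM_MAYWRITE 0x20UL
#define VM_MAYEXEC  0x40UL
#define VM_MAYSHARE 0x80UL

/*
 * Shifting right by four lines each VM_MAYxxx bit up with its VM_xxx bit.
 * Any requested bit whose "may" bit is clear survives the mask, which is
 * the case sys_mprotect() rejects with -EACCES.
 */
bool sketch_access_denied(unsigned long newflags)
{
    return ((newflags & ~(newflags >> 4)) & 0xf) != 0;
}
```

For example, asking for VM_READ | VM_WRITE on a mapping that only has VM_MAYREAD is denied, while VM_READ with VM_MAYREAD | VM_MAYWRITE passes.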
* Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
From: Benjamin Herrenschmidt @ 2010-04-06  6:07 UTC
To: KOSAKI Motohiro; +Cc: linux-mm, linux-kernel@vger.kernel.org, linux-arch

On Tue, 2010-04-06 at 14:52 +0900, KOSAKI Motohiro wrote:

(Adding linux-arch)

> This check was introduced by the following commit. Yes, right now we
> don't consider arch specific PROT_xx flags, but I don't think that is odd.
>
> Yeah, I can imagine that at least embedded people certainly need arch
> specific PROT_xx flags and hope to change it, but I don't think
> mprotect() fits your usage. I mean, mprotect() is widely used internally
> by glibc; if mprotect() could change such flags, glibc might turn them
> off implicitly.
>
> So, why can't we add a proper new syscall? It has no regression risk.

I don't care much personally whether we use mprotect() or a new syscall,
but at this stage we already have PROT_SAO going that way for powerpc so
that would be an ABI change.

However, the main issue isn't really there. The main issue is that right
now, everything we do in mmap.c, mprotect.c, ... revolves around having
everything translated into the single vm_flags field. VMA merging
decisions, construction of vm_page_prot, etc... everything is there.

However, this is a 32-bit field on 32-bit archs, and we already use all
possible bits in there. It's also a field entirely defined in generic
code with no provision for arch specific bits.

The question here thus boils down to what direction do we want to go to
if we want to untangle that and provide the ability to expose mapping
"attributes" basically. In fact, I suspect even x86 might have good use
of that to create things like relaxed ordering mappings no ?

This boils down, so far to a few facts/questions to be resolved:

 - Do we want to use the existing PROT_ argument to mmap, mprotect,... ?
There's plenty of bit space, and we already have at least one example of
an arch adding something to it (powerpc with PROT_SAO - aka Strong
Access Ordering - aka Make It Look Like An x86 :-)

 - If not, while a separate syscall would be fine with me for setting
attributes after the fact, it makes it harder to pass them via mmap, is
that a big deal ? I.e. it means one -always- has to call it after mmap
to change the attributes. That means for example that mmap will
potentially create a VMA merged with another one, just to be re-split
due to the attribute change. A bit gross...

 - Do we want to keep the current "Funnel everything into vm_flags"
approach ? That leaves no option that I can see but to extend it into a
u64 so it grows on 32-bit archs.

 - If not, I see two approaches here: Either having a separate / new
"attribute" field in the VMA or going straight for the vm_page_prot (ie.
the pgprot). In both cases, things like vma_merge() need to grow a new
argument since obviously we can't merge things with different
attributes.

 - ... Unless we just replace VM_SAO with VM_CANT_MERGE and set that
whenever a VMA has a non-0 attributes. Sad but simpler

Any other / better idea ?

Cheers,
Ben.
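
The last two options in the list above can be contrasted with a toy merge predicate. Everything here is hypothetical: `struct sketch_vma`, its `attrs` field and the `SKETCH_VM_CANT_MERGE` value are invented for illustration, and the real vma_merge() checks far more than this (file, offset, anon_vma, policy and so on).

```c
#include <stdbool.h>

#define SKETCH_VM_CANT_MERGE 0x40000000UL  /* hypothetical flag value */

/* Toy VMA: an address range, generic flags, and a separate attribute word. */
struct sketch_vma {
    unsigned long start;
    unsigned long end;
    unsigned long vm_flags;
    unsigned long attrs;
};

/* Option A: separate attribute field, compared as an extra merge criterion. */
bool sketch_can_merge_attrs(const struct sketch_vma *a,
                            const struct sketch_vma *b)
{
    return a->end == b->start &&
           a->vm_flags == b->vm_flags &&
           a->attrs == b->attrs;
}

/* Option B: one flag that forbids merging whenever attributes are in use. */
bool sketch_can_merge_flag(const struct sketch_vma *a,
                           const struct sketch_vma *b)
{
    if ((a->vm_flags | b->vm_flags) & SKETCH_VM_CANT_MERGE)
        return false;
    return a->end == b->start && a->vm_flags == b->vm_flags;
}
```

Option A still merges two adjacent VMAs that carry identical attributes; option B never merges a VMA with attributes, which is the "sad but simpler" trade-off.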
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
From: KOSAKI Motohiro @ 2010-04-06  6:24 UTC
To: Benjamin Herrenschmidt
Cc: kosaki.motohiro, linux-mm, linux-kernel@vger.kernel.org, linux-arch

> On Tue, 2010-04-06 at 14:52 +0900, KOSAKI Motohiro wrote:
>
> (Adding linux-arch)
>
> > This check was introduced by the following commit. Yes, right now we
> > don't consider arch specific PROT_xx flags, but I don't think that is odd.
> >
> > Yeah, I can imagine that at least embedded people certainly need arch
> > specific PROT_xx flags and hope to change it, but I don't think
> > mprotect() fits your usage. I mean, mprotect() is widely used internally
> > by glibc; if mprotect() could change such flags, glibc might turn them
> > off implicitly.
> >
> > So, why can't we add a proper new syscall? It has no regression risk.
>
> I don't care much personally whether we use mprotect() or a new syscall,
> but at this stage we already have PROT_SAO going that way for powerpc so
> that would be an ABI change.
>
> However, the main issue isn't really there. The main issue is that right
> now, everything we do in mmap.c, mprotect.c, ... revolves around having
> everything translated into the single vm_flags field. VMA merging
> decisions, construction of vm_page_prot, etc... everything is there.
>
> However, this is a 32-bit field on 32-bit archs, and we already use all
> possible bits in there. It's also a field entirely defined in generic
> code with no provision for arch specific bits.
>
> The question here thus boils down to what direction do we want to go to
> if we want to untangle that and provide the ability to expose mapping
> "attributes" basically. In fact, I suspect even x86 might have good use
> of that to create things like relaxed ordering mappings no ?
>
> This boils down, so far to a few facts/questions to be resolved:
>
>  - Do we want to use the existing PROT_ argument to mmap, mprotect,... ?
> There's plenty of bit space, and we already have at least one example of
> an arch adding something to it (powerpc with PROT_SAO - aka Strong
> Access Ordering - aka Make It Look Like An x86 :-)
>
>  - If not, while a separate syscall would be fine with me for setting
> attributes after the fact, it makes it harder to pass them via mmap, is
> that a big deal ? I.e. it means one -always- has to call it after mmap
> to change the attributes. That means for example that mmap will
> potentially create a VMA merged with another one, just to be re-split
> due to the attribute change. A bit gross...
>
>  - Do we want to keep the current "Funnel everything into vm_flags"
> approach ? That leaves no option that I can see but to extend it into a
> u64 so it grows on 32-bit archs.
>
>  - If not, I see two approaches here: Either having a separate / new
> "attribute" field in the VMA or going straight for the vm_page_prot (ie.
> the pgprot). In both cases, things like vma_merge() need to grow a new
> argument since obviously we can't merge things with different
> attributes.
>
>  - ... Unless we just replace VM_SAO with VM_CANT_MERGE and set that
> whenever a VMA has a non-0 attributes. Sad but simpler
>
> Any other / better idea ?

I guess you haven't caught my intention. I didn't say we have to remove
PROT_SAO and VM_SAO. I mean mmap(PROT_SAO) is ok: it only appends a new
flag, it doesn't change the meaning of existing flags. I'm only against
mprotect(PROT_NONE) turning off PROT_SAO implicitly.

IOW I recommend we use three syscalls:

    mmap()         create new mappings
    mprotect()     change the protection of a mapping (as the name says)
    mattribute()   (or similar name)
                   change an attribute of a mapping (e.g. PROT_SAO or
                   other arch specific flags)

I'm not against changing mm/mprotect.c for PROT_SAO.
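
The split proposed above can be expressed as two independent updates on a flag word. This is a sketch only: `mattribute()` does not exist as a syscall, the `sketch_*` helpers and `SKETCH_ATTR_*` values are invented for illustration, and the real interfaces operate on address ranges rather than on a bare integer.

```c
#define VM_READ  0x1UL
#define VM_WRITE 0x2UL
#define VM_EXEC  0x4UL
#define SKETCH_PROT_MASK (VM_READ | VM_WRITE | VM_EXEC)

/* Hypothetical attribute bits, kept disjoint from the protection bits. */
#define SKETCH_ATTR_SAO  0x20000000UL
#define SKETCH_ATTR_MASK SKETCH_ATTR_SAO

/* mprotect()-like update: only R/W/X change, attributes are left alone. */
unsigned long sketch_mprotect(unsigned long flags, unsigned long prot)
{
    return (flags & ~SKETCH_PROT_MASK) | (prot & SKETCH_PROT_MASK);
}

/* mattribute()-like update: only attributes change, R/W/X are left alone. */
unsigned long sketch_mattribute(unsigned long flags, unsigned long attrs)
{
    return (flags & ~SKETCH_ATTR_MASK) | (attrs & SKETCH_ATTR_MASK);
}
```

Under this model, a PROT_NONE-style call (`prot == 0`) leaves `SKETCH_ATTR_SAO` set, and only the attribute call can clear it, which is the behaviour being argued for.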
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
From: Benjamin Herrenschmidt @ 2010-04-06  7:30 UTC
To: KOSAKI Motohiro; +Cc: linux-mm, linux-kernel@vger.kernel.org, linux-arch

On Tue, 2010-04-06 at 15:24 +0900, KOSAKI Motohiro wrote:

> I guess you haven't caught my intention. I didn't say we have to remove
> PROT_SAO and VM_SAO. I mean mmap(PROT_SAO) is ok: it only appends a new
> flag, it doesn't change the meaning of existing flags. I'm only against
> mprotect(PROT_NONE) turning off PROT_SAO implicitly.
>
> IOW I recommend we use three syscalls:
>     mmap()         create new mappings
>     mprotect()     change the protection of a mapping (as the name says)
>     mattribute()   (or similar name)
>                    change an attribute of a mapping (e.g. PROT_SAO or
>                    other arch specific flags)
>
> I'm not against changing mm/mprotect.c for PROT_SAO.

Ok, I see. No biggie. The main deal remains how we want to do that
inside the kernel :-) I think the less horrible options here are
to either extend vm_flags to always be 64-bit, or add a separate
vm_map_attributes flag, and add the necessary bits and pieces to
prevent merge across different attribute vma's.

The more I try to hack it into vm_page_prot, the more I hate that
option.

Cheers
Ben.
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
From: KOSAKI Motohiro @ 2010-04-06 10:26 UTC
To: Benjamin Herrenschmidt
Cc: kosaki.motohiro, linux-mm, linux-kernel@vger.kernel.org, linux-arch

> Ok, I see. No biggie. The main deal remains how we want to do that
> inside the kernel :-) I think the less horrible options here are
> to either extend vm_flags to always be 64-bit, or add a separate
> vm_map_attributes flag, and add the necessary bits and pieces to
> prevent merge across different attribute vma's.

vma->vm_flags already has VM_SAO. Why do we need more flags?
At least, I dislike adding a separate flags member to the vma.
It might introduce unnecessary mess into the vma merge code.

> The more I try to hack it into vm_page_prot, the more I hate that
> option.
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
From: Benjamin Herrenschmidt @ 2010-04-06 22:15 UTC
To: KOSAKI Motohiro; +Cc: linux-mm, linux-kernel@vger.kernel.org, linux-arch

On Tue, 2010-04-06 at 19:26 +0900, KOSAKI Motohiro wrote:
> > Ok, I see. No biggie. The main deal remains how we want to do that
> > inside the kernel :-) I think the less horrible options here are
> > to either extend vm_flags to always be 64-bit, or add a separate
> > vm_map_attributes flag, and add the necessary bits and pieces to
> > prevent merge across different attribute vma's.
>
> vma->vm_flags already has VM_SAO. Why do we need more flags?
> At least, I dislike adding a separate flags member to the vma.
> It might introduce unnecessary mess into the vma merge code.

Well, we did shove SAO in there, and used up the very last vm_flag for
it a while back. Now I need another one, for little endian mappings. So
I'm stuck.

But the problem goes further I believe. Archs do nowadays have quite an
interesting set of MMU attributes that it would be useful to expose to
some extent.

Some powerpc's also provide storage keys for example and I think ARM
has something along those lines. There's interesting cachability
attributes too, on x86 as well. Being able to use such attributes to
request for example a relaxed ordering mapping on x86 might be useful.

I think it basically boils down to either extend vm_flags to always be
64-bit, which seems to be Nick's preferred approach, or introduce a
vm_attributes with all the necessary changes to the merge code to take
it into account (not -that- hard tho, there's only half a page of
results in grep for these things :-)

Cheers,
Ben.
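
The first option, always-64-bit vm_flags with a range reserved for the architecture, might look roughly like the sketch below. The type name, the choice of the upper 32 bits as the arch range, and the two `SKETCH_VM_ARCH_*` bits are all assumptions made for illustration; the thread does not settle on a layout.

```c
#include <stdint.h>

/* Hypothetical: flags are 64-bit on every arch, including 32-bit ones. */
typedef uint64_t sketch_vm_flags_t;

/* Hypothetical layout: low 32 bits generic, high 32 bits arch-owned. */
#define SKETCH_VM_ARCH_SHIFT 32
#define SKETCH_VM_ARCH_MASK  (UINT64_C(0xffffffff) << SKETCH_VM_ARCH_SHIFT)

#define SKETCH_VM_ARCH_SAO   (UINT64_C(1) << (SKETCH_VM_ARCH_SHIFT + 0))
#define SKETCH_VM_ARCH_LE    (UINT64_C(1) << (SKETCH_VM_ARCH_SHIFT + 1))

/* Bits the architecture owns. */
sketch_vm_flags_t sketch_arch_bits(sketch_vm_flags_t flags)
{
    return flags & SKETCH_VM_ARCH_MASK;
}

/* Bits generic mm code owns. */
sketch_vm_flags_t sketch_generic_bits(sketch_vm_flags_t flags)
{
    return flags & ~SKETCH_VM_ARCH_MASK;
}
```

Since the arch bits live in the same word, an existing "flags must be equal" merge test would cover them without a new vma_merge() argument, at the cost of growing the field on 32-bit architectures.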
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
From: KOSAKI Motohiro @ 2010-04-07  6:03 UTC
To: Benjamin Herrenschmidt
Cc: kosaki.motohiro, linux-mm, linux-kernel@vger.kernel.org, linux-arch

> On Tue, 2010-04-06 at 19:26 +0900, KOSAKI Motohiro wrote:
> > > Ok, I see. No biggie. The main deal remains how we want to do that
> > > inside the kernel :-) I think the less horrible options here are
> > > to either extend vm_flags to always be 64-bit, or add a separate
> > > vm_map_attributes flag, and add the necessary bits and pieces to
> > > prevent merge across different attribute vma's.
> >
> > vma->vm_flags already has VM_SAO. Why do we need more flags?
> > At least, I dislike adding a separate flags member to the vma.
> > It might introduce unnecessary mess into the vma merge code.
>
> Well, we did shove SAO in there, and used up the very last vm_flag for
> it a while back. Now I need another one, for little endian mappings. So
> I'm stuck.
>
> But the problem goes further I believe. Archs do nowadays have quite an
> interesting set of MMU attributes that it would be useful to expose to
> some extent.

Generally speaking, it doesn't seem a good idea. The desktop and server
world has no interest in arch specific mmu attribute crap, because many,
many opensource and ISV libraries don't care about it. I know high-end
HPC and embedded have different ecosystems; they might want to use such
strange mmu features. I recommend you focus on the powerpc ecosystem.

I'm not against changing kernel internals. I only disagree that the mmu
attribute fashion will become widely used.

> Some powerpc's also provide storage keys for example and I think ARM
> has something along those lines. There's interesting cachability
> attributes too, on x86 as well. Being able to use such attributes to
> request for example a relaxed ordering mapping on x86 might be useful.
>
> I think it basically boils down to either extend vm_flags to always be
> 64-bit, which seems to be Nick's preferred approach, or introduce a
> vm_attributes with all the necessary changes to the merge code to take
> it into account (not -that- hard tho, there's only half a page of
> results in grep for these things :-)
* Re: Arch specific mmap attributes
From: David Miller @ 2010-04-07  7:03 UTC
To: kosaki.motohiro; +Cc: benh, linux-mm, linux-kernel, linux-arch

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Wed, 7 Apr 2010 15:03:45 +0900 (JST)

> I'm not against changing kernel internals. I only disagree that the mmu
> attribute fashion will become widely used.

Desktop already uses similar features via PCI mmap
attributes and such, not to mention MSR settings on
x86.

So I disagree with your assessment that this is some
HPC/embedded issue.
* Re: Arch specific mmap attributes
From: KOSAKI Motohiro @ 2010-04-07  7:14 UTC
To: David Miller; +Cc: kosaki.motohiro, benh, linux-mm, linux-kernel, linux-arch

> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Wed, 7 Apr 2010 15:03:45 +0900 (JST)
>
> > I'm not against changing kernel internals. I only disagree that the mmu
> > attribute fashion will become widely used.
>
> Desktop already uses similar features via PCI mmap
> attributes and such, not to mention MSR settings on
> x86.

Probably I haven't caught your point. Why does a userland process
need to change PCI mmap attributes via mmap(2)? It seems a kernel issue.

> So I disagree with your assessment that this is some
> HPC/embedded issue.
* Re: Arch specific mmap attributes
From: David Miller @ 2010-04-07  7:18 UTC
To: kosaki.motohiro; +Cc: benh, linux-mm, linux-kernel, linux-arch

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Wed, 7 Apr 2010 16:14:29 +0900 (JST)

>> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>> Date: Wed, 7 Apr 2010 15:03:45 +0900 (JST)
>>
>> > I'm not against changing kernel internals. I only disagree that the mmu
>> > attribute fashion will become widely used.
>>
>> Desktop already uses similar features via PCI mmap
>> attributes and such, not to mention MSR settings on
>> x86.
>
> Probably I haven't caught your point. Why does a userland process
> need to change PCI mmap attributes via mmap(2)? It seems a kernel issue.

It uses PCI specific fd ioctls to change the attributes.

It's the same thing as extending the mmap() attribute space, but in a
device specific way.

I think device and platform specific mmap() attributes are basically
inevitable, at any level, embedded or desktop or whatever.

The fact that we've hacked around the issue with device specific
interfaces like the PCI device ioctls, is no excuse to not tackle the
issue directly and come up with something usable.
* Re: Arch specific mmap attributes
From: Benjamin Herrenschmidt @ 2010-04-07  9:00 UTC
To: KOSAKI Motohiro; +Cc: David Miller, linux-mm, linux-kernel, linux-arch

On Wed, 2010-04-07 at 16:14 +0900, KOSAKI Motohiro wrote:
> > Desktop already uses similar features via PCI mmap
> > attributes and such, not to mention MSR settings on
> > x86.
>
> Probably I haven't caught your point. Why does a userland process
> need to change PCI mmap attributes via mmap(2)? It seems a kernel issue.

There are cases where the userspace based driver needs to control
attributes such as write combining, or even cachability when mapping PCI
devices directly into userspace.

It's not -that- common, though X still does it on a number of platforms,
and there are people still trying to run PCI drivers in userspace ;-)

But regardless. I don't see why HPC or Embedded would have to be
qualified as "crap" and not warrant our full attention into devising
something sane and clean anyways.

Cheers,
Ben.
* Re: Arch specific mmap attributes
From: Benjamin Herrenschmidt @ 2010-04-07  8:58 UTC
To: David Miller; +Cc: kosaki.motohiro, linux-mm, linux-kernel, linux-arch

On Wed, 2010-04-07 at 00:03 -0700, David Miller wrote:
> > I'm not against changing kernel internals. I only disagree that the mmu
> > attribute fashion will become widely used.
>
> Desktop already uses similar features via PCI mmap
> attributes and such, not to mention MSR settings on
> x86.

This is a very good point, we've had all sort of trouble hacking that in
for PCI mmap, between trying to get write combine in, which we got on
/proc via a tweak I think we never got over to sysfs, and the ability to
control cachability, for which we used to have O_SYNC hacks in /dev/mem,
I think there is room for some nice and clean set of attributes here.

Cheers,
Ben.
* Re: Arch specific mmap attributes (Was: mprotect pgprot handling weirdness)
From: Benjamin Herrenschmidt @ 2010-04-07 8:56 UTC
To: KOSAKI Motohiro
Cc: linux-mm, linux-kernel@vger.kernel.org, linux-arch, Nick Piggin, Hugh Dickins

On Wed, 2010-04-07 at 15:03 +0900, KOSAKI Motohiro wrote:

> Generally speaking, It seems no good idea. desktop and server world don't
> interest arch specific mmu attribute crap.

So you are saying that because your desktops and servers don't care, Linux shouldn't support the possibility? I.e. embedded doesn't matter, or some similar statement? :-) Come on ...

Anyway, this is just not true. Take SAO: this is a server feature (used, among other things, for x86 emulation). Little-endian mappings are indeed more of an "embedded" feature to some extent, at least the way we plan to use them, but they are still very relevant. Caching attribute control and storage keys can be useful in a lot of other areas that really have nothing to do with HPC :-) Databases come to mind, and there's more too.

In any case, I don't know why you argue. We have features that a lot of the CPUs out there provide, that at least some people out there would like to exploit, and you are saying that Linux should not provide support for these because your vision of a desktop/server-only world is all that matters?

Anyway, let's go back to -how- to implement that properly rather than that sort of reasonably useless argument.

> because many many opensource
> and ISV library don't care it. I know highend hpc and embedded have
> different eco-system. they might want to use such strange mmu feature.
> I recommend to you are focusing powerpc eco-system.

Thank you for your recommendation :-)

> I'm not against changing kernel internal. I only disagree mmu attribute
> fashion will be become used widely.

So how do you propose we proceed? Extend vm_flags to be a u64 instead? I don't really care much which method is used, though from a -technical- perspective the mmu attributes one seems nicer in the long run. My immediate needs, however, would be well served by just adding 2 or 3 flags in there :-)

In any case, I'd be curious to have Hugh's and Nick's opinions here on the technicalities.

Cheers,
Ben.

> > Some powerpc's also provide storage keys for example and I think ARM
> > have something along those lines. There's interesting cacheability
> > attributes too, on x86 as well. Being able to use such attributes to
> > request for example a relaxed ordering mapping on x86 might be useful.
> >
> > I think it basically boils down to either extend vm_flags to always be
> > 64-bit, which seems to be Nick preferred approach, or introduce a
> > vm_attributes with all the necessary changes to the merge code to take
> > it into account (not -that- hard tho, there's only half a page of
> > results in grep for these things :-)
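[Editorial illustration, not part of the original message.] The second option discussed in the message above, a separate vm_attributes field that the VMA merge code has to take into account, could look roughly like the user-space sketch below. Everything in it is an assumption made for illustration: struct toy_vma, the TOY_* constants and toy_can_merge() are hypothetical names, not kernel code, and the real merge logic checks considerably more than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the kernel's VMA. */
struct toy_vma {
    unsigned long start;      /* first address covered by the mapping */
    unsigned long end;        /* first address past the mapping */
    unsigned long vm_flags;   /* generic flags (read/write/exec and so on) */
    uint64_t vm_attributes;   /* hypothetical arch-specific attribute bits */
};

/* Illustrative bit values only; they do not match any kernel definitions. */
#define TOY_VM_READ   0x1UL
#define TOY_VM_WRITE  0x2UL
#define TOY_ATTR_SAO  (1ULL << 0)   /* stand-in for an arch attribute such as SAO */

/*
 * Two adjacent mappings may be merged only if they agree on the generic
 * flags AND on the arch-specific attributes. The second comparison is the
 * kind of change the merge code would need under this approach.
 */
bool toy_can_merge(const struct toy_vma *prev, const struct toy_vma *next)
{
    return prev->end == next->start &&
           prev->vm_flags == next->vm_flags &&
           prev->vm_attributes == next->vm_attributes;
}
```

Under this sketch, two adjacent read-only mappings merge when their attributes match and stay separate as soon as one of them carries TOY_ATTR_SAO. The alternative raised in the thread, widening vm_flags to 64 bits, would keep a single comparison at the cost of mirroring every arch attribute as a flag bit.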
end of thread, other threads: [~2010-04-07 9:00 UTC]

Thread overview: 16+ messages
2010-04-06 5:09 mprotect pgprot handling weirdness Benjamin Herrenschmidt
2010-04-06 5:32 ` Benjamin Herrenschmidt
2010-04-06 5:43 ` Benjamin Herrenschmidt
2010-04-06 5:52 ` KOSAKI Motohiro
2010-04-06 6:07 ` Arch specific mmap attributes (Was: mprotect pgprot handling weirdness) Benjamin Herrenschmidt
2010-04-06 6:24 ` KOSAKI Motohiro
2010-04-06 7:30 ` Benjamin Herrenschmidt
2010-04-06 10:26 ` KOSAKI Motohiro
2010-04-06 22:15 ` Benjamin Herrenschmidt
2010-04-07 6:03 ` KOSAKI Motohiro
2010-04-07 7:03 ` Arch specific mmap attributes David Miller
2010-04-07 7:14 ` KOSAKI Motohiro
2010-04-07 7:18 ` David Miller
2010-04-07 9:00 ` Benjamin Herrenschmidt
2010-04-07 8:58 ` Benjamin Herrenschmidt
2010-04-07 8:56 ` Arch specific mmap attributes (Was: mprotect pgprot handling weirdness) Benjamin Herrenschmidt