* [PATCH 0/2] man-pages: clarify MAP_LOCKED semantic
From: Michal Hocko @ 2015-05-13 14:38 UTC
To: Michael Kerrisk
Cc: Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm

Hi,
during the previous discussion
http://marc.info/?l=linux-mm&m=143022313618001&w=2 it was made clear that
making mmap(MAP_LOCKED) semantic really have mlock() semantic is too
dangerous. Even though we can try to reduce the failure space the mmap
man page should make it really clear about the subtle distinctions
between the two. This is what that first patch does. The second patch is
a small clarification for MAP_POPULATE based on David Rientjes feedback.
* [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic
From: Michal Hocko @ 2015-05-13 14:38 UTC
To: Michael Kerrisk
Cc: Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko

From: Michal Hocko <mhocko@suse.cz>

MAP_LOCKED had a subtly different semantic from mmap(2)+mlock(2) since
it has been introduced.
mlock(2) fails if the memory range cannot get populated to guarantee
that no future major faults will happen on the range. mmap(MAP_LOCKED) on
the other hand silently succeeds even if the range was populated only
partially.

Fixing this subtle difference in the kernel is rather awkward because
the memory population happens after mm locks have been dropped and so
the cleanup before returning failure (munlock) could operate on something
else than the originally mapped area.

E.g. speculative userspace page fault handler catching SEGV and doing
mmap(fault_addr, MAP_FIXED|MAP_LOCKED) might discard portion of a racing
mmap and lead to lost data. Although it is not clear whether such a
usage would be valid, mmap page doesn't explicitly describe requirements
for threaded applications so we cannot exclude this possibility.

This patch makes the semantic of MAP_LOCKED explicit and suggest using
mmap + mlock as the only way to guarantee no later major page faults.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 man2/mmap.2 | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/man2/mmap.2 b/man2/mmap.2
index 54d68cf87e9e..1486be2e96b3 100644
--- a/man2/mmap.2
+++ b/man2/mmap.2
@@ -235,8 +235,19 @@ See the Linux kernel source file
 for further information.
 .TP
 .BR MAP_LOCKED " (since Linux 2.5.37)"
-Lock the pages of the mapped region into memory in the manner of
+Mark the mmaped region to be locked in the same way as
 .BR mlock (2).
+This implementation will try to populate (prefault) the whole range but
+the mmap call doesn't fail with
+.B ENOMEM
+if this fails. Therefore major faults might happen later on. So the semantic
+is not as strong as
+.BR mlock (2).
+.BR mmap (2)
++
+.BR mlock (2)
+should be used when major faults are not acceptable after the initialization
+of the mapping.
 This flag is ignored in older kernels.
 .\" If set, the mapped pages will not be swapped out.
 .TP
-- 
2.1.4
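
As a rough illustration of the mmap(2) + mlock(2) pattern that the patch recommends over MAP_LOCKED, here is a minimal sketch in C. The helper name map_and_lock and the choice of an anonymous private mapping are assumptions made for the example; they do not come from the patch or the thread.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the thread): map `len` bytes of anonymous
 * memory, then lock it with a separate mlock(2) call.
 * Returns the mapping, or NULL with errno set if either step fails.
 */
static void *map_and_lock(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /*
     * Unlike mmap(MAP_LOCKED), mlock() returns an error if the range
     * cannot be populated and locked, so the caller learns about it here.
     */
    if (mlock(p, len) != 0) {
        int saved = errno;      /* munmap() may overwrite errno */
        munmap(p, len);
        errno = saved;
        return NULL;
    }
    return p;
}
```

A caller would treat a NULL return as "no locked memory available" and inspect errno; mlock(2) documents ENOMEM, EAGAIN and EPERM among its failure codes. The mapping is released with munmap(2), which also drops the lock.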
* Re: [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic
From: Eric B Munson @ 2015-05-13 14:45 UTC
To: Michal Hocko
Cc: Michael Kerrisk, Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko

On Wed, 13 May 2015, Michal Hocko wrote:

> From: Michal Hocko <mhocko@suse.cz>
>
> MAP_LOCKED had a subtly different semantic from mmap(2)+mlock(2) since
> it has been introduced.
> mlock(2) fails if the memory range cannot get populated to guarantee
> that no future major faults will happen on the range. mmap(MAP_LOCKED) on
> the other hand silently succeeds even if the range was populated only
> partially.
>
> Fixing this subtle difference in the kernel is rather awkward because
> the memory population happens after mm locks have been dropped and so
> the cleanup before returning failure (munlock) could operate on something
> else than the originally mapped area.
>
> E.g. speculative userspace page fault handler catching SEGV and doing
> mmap(fault_addr, MAP_FIXED|MAP_LOCKED) might discard portion of a racing
> mmap and lead to lost data. Although it is not clear whether such a
> usage would be valid, mmap page doesn't explicitly describe requirements
> for threaded applications so we cannot exclude this possibility.
>
> This patch makes the semantic of MAP_LOCKED explicit and suggest using
> mmap + mlock as the only way to guarantee no later major page faults.
>
> Signed-off-by: Michal Hocko <mhocko@suse.cz>

Does the problem still happen when MAP_POPULATE | MAP_LOCKED is used
(AFAICT MAP_POPULATE will cause the mmap to fail if all the pages cannot
be made present).

Either way this is a good catch.

Acked-by: Eric B Munson <emunson@akamai.com>
* Re: [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic
From: Eric B Munson @ 2015-05-13 14:48 UTC
To: Michal Hocko
Cc: Michael Kerrisk, Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko

On Wed, 13 May 2015, Eric B Munson wrote:

> On Wed, 13 May 2015, Michal Hocko wrote:
>
> > From: Michal Hocko <mhocko@suse.cz>
> >
> > MAP_LOCKED had a subtly different semantic from mmap(2)+mlock(2) since
> > it has been introduced.
> > mlock(2) fails if the memory range cannot get populated to guarantee
> > that no future major faults will happen on the range. mmap(MAP_LOCKED) on
> > the other hand silently succeeds even if the range was populated only
> > partially.
> >
> > Fixing this subtle difference in the kernel is rather awkward because
> > the memory population happens after mm locks have been dropped and so
> > the cleanup before returning failure (munlock) could operate on something
> > else than the originally mapped area.
> >
> > E.g. speculative userspace page fault handler catching SEGV and doing
> > mmap(fault_addr, MAP_FIXED|MAP_LOCKED) might discard portion of a racing
> > mmap and lead to lost data. Although it is not clear whether such a
> > usage would be valid, mmap page doesn't explicitly describe requirements
> > for threaded applications so we cannot exclude this possibility.
> >
> > This patch makes the semantic of MAP_LOCKED explicit and suggest using
> > mmap + mlock as the only way to guarantee no later major page faults.
> >
> > Signed-off-by: Michal Hocko <mhocko@suse.cz>
>
> Does the problem still happen when MAP_POPULATE | MAP_LOCKED is used
> (AFAICT MAP_POPULATE will cause the mmap to fail if all the pages cannot
> be made present).
>
> Either way this is a good catch.
>
> Acked-by: Eric B Munson <emunson@akamai.com>
>

Sorry for the noise, this should have been a
Reviewed-by: Eric B Munson <emunson@akamai.com>
* Re: [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic
From: Michal Hocko @ 2015-05-14 8:01 UTC
To: Eric B Munson
Cc: Michael Kerrisk, Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm

On Wed 13-05-15 10:45:06, Eric B Munson wrote:
> On Wed, 13 May 2015, Michal Hocko wrote:
>
> > From: Michal Hocko <mhocko@suse.cz>
> >
> > MAP_LOCKED had a subtly different semantic from mmap(2)+mlock(2) since
> > it has been introduced.
> > mlock(2) fails if the memory range cannot get populated to guarantee
> > that no future major faults will happen on the range. mmap(MAP_LOCKED) on
> > the other hand silently succeeds even if the range was populated only
> > partially.
> >
> > Fixing this subtle difference in the kernel is rather awkward because
> > the memory population happens after mm locks have been dropped and so
> > the cleanup before returning failure (munlock) could operate on something
> > else than the originally mapped area.
> >
> > E.g. speculative userspace page fault handler catching SEGV and doing
> > mmap(fault_addr, MAP_FIXED|MAP_LOCKED) might discard portion of a racing
> > mmap and lead to lost data. Although it is not clear whether such a
> > usage would be valid, mmap page doesn't explicitly describe requirements
> > for threaded applications so we cannot exclude this possibility.
> >
> > This patch makes the semantic of MAP_LOCKED explicit and suggest using
> > mmap + mlock as the only way to guarantee no later major page faults.
> >
> > Signed-off-by: Michal Hocko <mhocko@suse.cz>
>
> Does the problem still happen when MAP_POPULATE | MAP_LOCKED is used
> (AFAICT MAP_POPULATE will cause the mmap to fail if all the pages cannot
> be made present).

No, there is no difference because MAP_POPULATE is implicit when
MAP_LOCKED is used and as pointed in the cover, we cannot fail after the
vma is created and locks dropped. The second patch tries to clarify that
MAP_POPULATE is just a best effort.

> Either way this is a good catch.
>
> Acked-by: Eric B Munson <emunson@akamai.com>

Thanks!
-- 
Michal Hocko
SUSE Labs
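
Since population under MAP_POPULATE (and therefore MAP_LOCKED) is described above as best effort, one way userspace could inspect what actually got populated is mincore(2). The following is only a sketch under stated assumptions: the helper name count_resident_pages is hypothetical, and a residency snapshot says nothing about later faults, so it is not a substitute for checking the return value of mlock(2).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for mincore() and MAP_POPULATE */
#endif
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): count how many pages of
 * [addr, addr + len) are resident in memory at the time of the call.
 * addr must be page-aligned. Returns -1 on error.
 */
static long count_resident_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;

    /* mincore() fills one byte per page of the range. */
    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;

    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }

    long resident = 0;
    for (size_t i = 0; i < npages; i++)
        resident += vec[i] & 1; /* low bit set: page is resident */

    free(vec);
    return resident;
}
```

Comparing the result against the number of pages in the mapping right after mmap(MAP_POPULATE) or mmap(MAP_LOCKED) gives a hint whether prefaulting was only partial.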
* Re: [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic
From: Michael Kerrisk (man-pages) @ 2015-05-14 13:36 UTC
To: Michal Hocko
Cc: mtk.manpages, Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko, Eric B Munson

On 05/13/2015 04:38 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.cz>
>
> MAP_LOCKED had a subtly different semantic from mmap(2)+mlock(2) since
> it has been introduced.
> mlock(2) fails if the memory range cannot get populated to guarantee
> that no future major faults will happen on the range. mmap(MAP_LOCKED) on
> the other hand silently succeeds even if the range was populated only
> partially.
>
> Fixing this subtle difference in the kernel is rather awkward because
> the memory population happens after mm locks have been dropped and so
> the cleanup before returning failure (munlock) could operate on something
> else than the originally mapped area.
>
> E.g. speculative userspace page fault handler catching SEGV and doing
> mmap(fault_addr, MAP_FIXED|MAP_LOCKED) might discard portion of a racing
> mmap and lead to lost data. Although it is not clear whether such a
> usage would be valid, mmap page doesn't explicitly describe requirements
> for threaded applications so we cannot exclude this possibility.
>
> This patch makes the semantic of MAP_LOCKED explicit and suggest using
> mmap + mlock as the only way to guarantee no later major page faults.

Thanks, Michal. Applied, with Reviewed-by: from Eric added.

Cheers,

Michael

> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  man2/mmap.2 | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index 54d68cf87e9e..1486be2e96b3 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -235,8 +235,19 @@ See the Linux kernel source file
>  for further information.
>  .TP
>  .BR MAP_LOCKED " (since Linux 2.5.37)"
> -Lock the pages of the mapped region into memory in the manner of
> +Mark the mmaped region to be locked in the same way as
>  .BR mlock (2).
> +This implementation will try to populate (prefault) the whole range but
> +the mmap call doesn't fail with
> +.B ENOMEM
> +if this fails. Therefore major faults might happen later on. So the semantic
> +is not as strong as
> +.BR mlock (2).
> +.BR mmap (2)
> ++
> +.BR mlock (2)
> +should be used when major faults are not acceptable after the initialization
> +of the mapping.
>  This flag is ignored in older kernels.
>  .\" If set, the mapped pages will not be swapped out.
>  .TP
>

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
* Re: [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic
From: Peter Zijlstra @ 2016-05-11 11:07 UTC
To: Michal Hocko, Michael Kerrisk
Cc: Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko

On 05/13/2015 04:38 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.cz>
>
> MAP_LOCKED had a subtly different semantic from mmap(2)+mlock(2) since
> it has been introduced.
> mlock(2) fails if the memory range cannot get populated to guarantee
> that no future major faults will happen on the range. mmap(MAP_LOCKED) on
> the other hand silently succeeds even if the range was populated only
> partially.
>
> Fixing this subtle difference in the kernel is rather awkward because
> the memory population happens after mm locks have been dropped and so
> the cleanup before returning failure (munlock) could operate on something
> else than the originally mapped area.
>
> E.g. speculative userspace page fault handler catching SEGV and doing
> mmap(fault_addr, MAP_FIXED|MAP_LOCKED) might discard portion of a racing
> mmap and lead to lost data. Although it is not clear whether such a
> usage would be valid, mmap page doesn't explicitly describe requirements
> for threaded applications so we cannot exclude this possibility.
>
> This patch makes the semantic of MAP_LOCKED explicit and suggest using
> mmap + mlock as the only way to guarantee no later major page faults.
>

URGH, this really blows chunks. It basically means MAP_LOCKED is
pointless cruft and we might as well remove it.

Why not fix it proper?
* Re: [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic @ 2016-05-11 11:07 ` Peter Zijlstra 0 siblings, 0 replies; 32+ messages in thread From: Peter Zijlstra @ 2016-05-11 11:07 UTC (permalink / raw) To: Michal Hocko, Michael Kerrisk Cc: Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko On 05/13/2015 04:38 PM, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.cz> > > MAP_LOCKED had a subtly different semantic from mmap(2)+mlock(2) since > it has been introduced. > mlock(2) fails if the memory range cannot get populated to guarantee > that no future major faults will happen on the range. mmap(MAP_LOCKED) on > the other hand silently succeeds even if the range was populated only > partially. > > Fixing this subtle difference in the kernel is rather awkward because > the memory population happens after mm locks have been dropped and so > the cleanup before returning failure (munlock) could operate on something > else than the originally mapped area. > > E.g. speculative userspace page fault handler catching SEGV and doing > mmap(fault_addr, MAP_FIXED|MAP_LOCKED) might discard portion of a racing > mmap and lead to lost data. Although it is not clear whether such a > usage would be valid, mmap page doesn't explicitly describe requirements > for threaded applications so we cannot exclude this possibility. > > This patch makes the semantic of MAP_LOCKED explicit and suggest using > mmap + mlock as the only way to guarantee no later major page faults. > URGH, this really blows chunks. It basically means MAP_LOCKED is pointless cruft and we might as well remove it. Why not fix it proper? ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic @ 2016-05-11 11:18 ` Peter Zijlstra 0 siblings, 0 replies; 32+ messages in thread From: Peter Zijlstra @ 2016-05-11 11:18 UTC (permalink / raw) To: Michal Hocko, Michael Kerrisk Cc: Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko On 05/11/2016 01:07 PM, Peter Zijlstra wrote: > On 05/13/2015 04:38 PM, Michal Hocko wrote: >> >> This patch makes the semantic of MAP_LOCKED explicit and suggest using >> mmap + mlock as the only way to guarantee no later major page faults. >> > > URGH, this really blows chunks. It basically means MAP_LOCKED is > pointless cruft and we might as well remove it. > > Why not fix it proper? OK; after having been pointed at this discussion, it seems I reacted rather too hasty in that I didn't read all the previous threads. From that it appears fixing this proper is indeed rather hard, and we should indeed consider MAP_LOCKED broken. At which point I would've worded the manpage update stronger, but alas. Sorry for the noise. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic @ 2016-05-11 11:32 ` Michal Hocko 0 siblings, 0 replies; 32+ messages in thread From: Michal Hocko @ 2016-05-11 11:32 UTC (permalink / raw) To: Peter Zijlstra Cc: Michael Kerrisk, Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm On Wed 11-05-16 13:07:33, Peter Zijlstra wrote: > > > On 05/13/2015 04:38 PM, Michal Hocko wrote: > > From: Michal Hocko <mhocko@suse.cz> > > > > MAP_LOCKED had a subtly different semantic from mmap(2)+mlock(2) since > > it has been introduced. > > mlock(2) fails if the memory range cannot get populated to guarantee > > that no future major faults will happen on the range. mmap(MAP_LOCKED) on > > the other hand silently succeeds even if the range was populated only > > partially. > > > > Fixing this subtle difference in the kernel is rather awkward because > > the memory population happens after mm locks have been dropped and so > > the cleanup before returning failure (munlock) could operate on something > > else than the originally mapped area. > > > > E.g. speculative userspace page fault handler catching SEGV and doing > > mmap(fault_addr, MAP_FIXED|MAP_LOCKED) might discard portion of a racing > > mmap and lead to lost data. Although it is not clear whether such a > > usage would be valid, mmap page doesn't explicitly describe requirements > > for threaded applications so we cannot exclude this possibility. > > > > This patch makes the semantic of MAP_LOCKED explicit and suggest using > > mmap + mlock as the only way to guarantee no later major page faults. > > > > URGH, this really blows chunks. It basically means MAP_LOCKED is pointless > cruft and we might as well remove it. Yeah, the usefulness of MAP_LOCKED is somehow reduced. Everybody who wants the full semantic really have to use mlock(2). > Why not fix it proper? 
I tried, but it turned out to be a problem: we drop mmap_sem after we have initialized the VMA and, as Linus pointed out, there are multithreaded applications which do opportunistic memory management [1]. So we would have to hold mmap_sem for write during the whole VMA setup + population, and that doesn't seem to be worth all the trouble when we are not even sure whether anybody relies on MAP_LOCKED having the hard mlock semantic. --- [1] http://lkml.kernel.org/r/CA+55aFydkG-BgZzry5DrTzueVh9VvEcVJdLV8iOyUphQk=0vpw@mail.gmail.com -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 32+ messages in thread
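The recommendation in this subthread (use mmap + mlock when the hard guarantee is needed) could be sketched in userspace roughly as follows. This is only an illustration and not code from the patch; the helper name map_locked_strict() is made up for the example, and it assumes an anonymous private mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of the patch): map `len` bytes of anonymous
 * memory and lock it with an explicit mlock(2) call, so that a failure to
 * lock or populate the range is reported to the caller instead of being
 * silently ignored as it can be with mmap(MAP_LOCKED).
 * Returns the mapping, or NULL with errno set.
 */
static void *map_locked_strict(size_t len)
{
    assert(len > 0);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* mlock() returns -1 (e.g. ENOMEM, EAGAIN, EPERM) if it cannot lock. */
    if (mlock(p, len) != 0) {
        int saved = errno;   /* munmap() may overwrite errno */
        munmap(p, len);
        errno = saved;
        return NULL;
    }
    return p;
}
```

A caller would check for NULL and inspect errno; RLIMIT_MEMLOCK is one common reason for the mlock() step to fail for unprivileged processes.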
* [PATCH 2/2] mmap2: clarify MAP_POPULATE @ 2015-05-13 14:38 ` Michal Hocko 0 siblings, 0 replies; 32+ messages in thread From: Michal Hocko @ 2015-05-13 14:38 UTC (permalink / raw) To: Michael Kerrisk Cc: Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko From: Michal Hocko <mhocko@suse.cz> David Rientjes has noticed that MAP_POPULATE wording might promise much more than the kernel actually provides and intend to provide. The primary usage of the flag is to pre-fault the range. There is no guarantee that no major faults will happen later on. The pages might have been reclaimed by the time the process tries to access them. Signed-off-by: Michal Hocko <mhocko@suse.cz> --- man2/mmap.2 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/man2/mmap.2 b/man2/mmap.2 index 1486be2e96b3..dcf306f2f730 100644 --- a/man2/mmap.2 +++ b/man2/mmap.2 @@ -284,7 +284,7 @@ private writable mappings. .BR MAP_POPULATE " (since Linux 2.5.46)" Populate (prefault) page tables for a mapping. For a file mapping, this causes read-ahead on the file. -Later accesses to the mapping will not be blocked by page faults. +This will help to reduce blocking on page faults later. .BR MAP_POPULATE is supported for private mappings only since Linux 2.6.23. .TP -- 2.1.4 ^ permalink raw reply related [flat|nested] 32+ messages in thread
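To illustrate the point that MAP_POPULATE only pre-faults the range and makes no later promise, one possible check is to ask mincore(2) how many pages are resident at a given moment. The sketch below is an assumption-laden example, not something from the patch: the helper names map_populated() and resident_pages() are invented here, and the result is only a snapshot that reclaim can invalidate at any time.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: anonymous private mapping with MAP_POPULATE. */
static void *map_populated(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: count the pages of [addr, addr + len) that mincore(2)
 * reports as resident right now. `addr` must be page-aligned.
 * Returns -1 on error.
 */
static long resident_pages(void *addr, size_t len)
{
    assert(len > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;

    long count = -1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;   /* bit 0 set: page is resident */
    }
    free(vec);
    return count;
}
```

A resident count lower than the number of mapped pages is not an error with MAP_POPULATE; it just means some pages would fault on access.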
* Re: [PATCH 2/2] mmap2: clarify MAP_POPULATE @ 2015-05-13 14:47 ` Eric B Munson 0 siblings, 0 replies; 32+ messages in thread From: Eric B Munson @ 2015-05-13 14:47 UTC (permalink / raw) To: Michal Hocko Cc: Michael Kerrisk, Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko [-- Attachment #1: Type: text/plain, Size: 543 bytes --] On Wed, 13 May 2015, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.cz> > > David Rientjes has noticed that MAP_POPULATE wording might promise much > more than the kernel actually provides and intend to provide. The > primary usage of the flag is to pre-fault the range. There is no > guarantee that no major faults will happen later on. The pages might > have been reclaimed by the time the process tries to access them. > > Signed-off-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Eric B Munson <emunson@akamai.com> [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2] mmap2: clarify MAP_POPULATE @ 2015-05-14 13:36 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 32+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-05-14 13:36 UTC (permalink / raw) To: Michal Hocko Cc: mtk.manpages, Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm, Michal Hocko On 05/13/2015 04:38 PM, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.cz> > > David Rientjes has noticed that MAP_POPULATE wording might promise much > more than the kernel actually provides and intend to provide. The > primary usage of the flag is to pre-fault the range. There is no > guarantee that no major faults will happen later on. The pages might > have been reclaimed by the time the process tries to access them. Yes, thanks, Michal -- that's a good point to make clearer. Applied, with Reviewed-by: from Eric added. Cheers, Michael > Signed-off-by: Michal Hocko <mhocko@suse.cz> > --- > man2/mmap.2 | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/man2/mmap.2 b/man2/mmap.2 > index 1486be2e96b3..dcf306f2f730 100644 > --- a/man2/mmap.2 > +++ b/man2/mmap.2 > @@ -284,7 +284,7 @@ private writable mappings. > .BR MAP_POPULATE " (since Linux 2.5.46)" > Populate (prefault) page tables for a mapping. > For a file mapping, this causes read-ahead on the file. > -Later accesses to the mapping will not be blocked by page faults. > +This will help to reduce blocking on page faults later. > .BR MAP_POPULATE > is supported for private mappings only since Linux 2.6.23. > .TP > -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2] mmap2: clarify MAP_POPULATE @ 2015-05-15 0:13 ` David Rientjes 0 siblings, 0 replies; 32+ messages in thread From: David Rientjes @ 2015-05-15 0:13 UTC (permalink / raw) To: Michal Hocko Cc: Michael Kerrisk, Andrew Morton, Linus Torvalds, LKML, Linux API, linux-mm, Michal Hocko On Wed, 13 May 2015, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.cz> > > David Rientjes has noticed that MAP_POPULATE wording might promise much > more than the kernel actually provides and intend to provide. The > primary usage of the flag is to pre-fault the range. There is no > guarantee that no major faults will happen later on. The pages might > have been reclaimed by the time the process tries to access them. > > Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Thanks for following up! ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/2] man-pages: clarify MAP_LOCKED semantic @ 2015-05-18 9:12 ` Michal Hocko 0 siblings, 0 replies; 32+ messages in thread From: Michal Hocko @ 2015-05-18 9:12 UTC (permalink / raw) To: Michael Kerrisk Cc: Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, linux-mm On Wed 13-05-15 16:38:10, Michal Hocko wrote: > Hi, > during the previous discussion http://marc.info/?l=linux-mm&m=143022313618001&w=2 > it was made clear that making mmap(MAP_LOCKED) semantic really have > mlock() semantic is too dangerous. Even though we can try to reduce the > failure space the mmap man page should make it really clear about the > subtle distinctions between the two. This is what that first patch does. > The second patch is a small clarification for MAP_POPULATE based on > David Rientjes feedback. I had completely forgotten about the in-kernel doc. --- From 9d1478ccd036f84e50da906e39cd1e7bcb94cecd Mon Sep 17 00:00:00 2001 From: Michal Hocko <mhocko@suse.cz> Date: Mon, 18 May 2015 11:07:00 +0200 Subject: [PATCH] Documentation/vm/unevictable-lru.txt: clarify MAP_LOCKED behavior There is a very subtle difference between the mmap()+mlock() and mmap(MAP_LOCKED) semantics. The former fails if the population of the area fails, while the latter doesn't. This basically means that mmap(MAP_LOCKED) areas might see major faults after the mmap syscall returns, which is not the case for mlock. The mmap man page has already been altered, but Documentation/vm/unevictable-lru.txt deserves a clarification as well. 
Reported-by: David Rientjes <rientjes@google.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> --- Documentation/vm/unevictable-lru.txt | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/Documentation/vm/unevictable-lru.txt b/Documentation/vm/unevictable-lru.txt index 3be0bfc4738d..32ee3a67dba2 100644 --- a/Documentation/vm/unevictable-lru.txt +++ b/Documentation/vm/unevictable-lru.txt @@ -467,7 +467,13 @@ mmap(MAP_LOCKED) SYSTEM CALL HANDLING In addition the mlock()/mlockall() system calls, an application can request that a region of memory be mlocked supplying the MAP_LOCKED flag to the mmap() -call. Furthermore, any mmap() call or brk() call that expands the heap by a +call. There is one important and subtle difference here, though. mmap() + mlock() +will fail if the range cannot be faulted in (e.g. because mm_populate fails) +and returns with ENOMEM while mmap(MAP_LOCKED) will not fail. The mmaped +area will still have properties of the locked area - aka. pages will not get +swapped out - but major page faults to fault memory in might still happen. + +Furthermore, any mmap() call or brk() call that expands the heap by a task that has previously called mlockall() with the MCL_FUTURE flag will result in the newly mapped memory being mlocked. Before the unevictable/mlock changes, the kernel simply called make_pages_present() to allocate pages and -- 2.1.4 -- Michal Hocko SUSE Labs ^ permalink raw reply related [flat|nested] 32+ messages in thread
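One way an application might observe the difference described in this documentation change is to count major faults while touching a range, using the ru_majflt counter from getrusage(2). The following is a rough sketch under that assumption; the helper name major_faults_on_touch() is hypothetical, and because RUSAGE_SELF covers the whole process, faults taken by other threads in the same window would be counted too.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: read one byte from every page of [buf, buf + len)
 * and return how many major faults the process took while doing so.
 * Returns -1 if the page size or the counters cannot be obtained.
 */
static long major_faults_on_touch(volatile unsigned char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    struct rusage before, after;
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || getrusage(RUSAGE_SELF, &before) != 0)
        return -1;

    /* The volatile read forces an actual access to each page. */
    for (size_t off = 0; off < len; off += (size_t)page)
        (void)buf[off];

    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;

    return after.ru_majflt - before.ru_majflt;
}
```

For a range that went through a successful mlock(), the expectation from the text above is a result of 0; for a range created with only mmap(MAP_LOCKED), a non-zero result would be consistent with the documented behavior.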
Thread overview: 32+ messages (newest: 2016-05-11 11:32 UTC)
2015-05-13 14:38 [PATCH 0/2] man-pages: clarify MAP_LOCKED semantic Michal Hocko
2015-05-13 14:38 ` [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic Michal Hocko
2015-05-13 14:45 ` Eric B Munson
2015-05-13 14:48 ` Eric B Munson
2015-05-14 8:01 ` Michal Hocko
2015-05-14 13:36 ` Michael Kerrisk (man-pages)
2016-05-11 11:07 ` Peter Zijlstra
2016-05-11 11:18 ` Peter Zijlstra
2016-05-11 11:32 ` Michal Hocko
2015-05-13 14:38 ` [PATCH 2/2] mmap2: clarify MAP_POPULATE Michal Hocko
2015-05-13 14:47 ` Eric B Munson
2015-05-14 13:36 ` Michael Kerrisk (man-pages)
2015-05-15 0:13 ` David Rientjes
2015-05-18 9:12 ` [PATCH 0/2] man-pages: clarify MAP_LOCKED semantic Michal Hocko