linux-api.vger.kernel.org archive mirror
* MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found] ` <54CFF8AC.6010102@intel.com>
@ 2015-02-03  8:19   ` Vlastimil Babka
       [not found]     ` <54D08483.40209-AlSwsSmVLrQ@public.gmane.org>
  2015-02-03 11:16     ` Mel Gorman
  0 siblings, 2 replies; 20+ messages in thread
From: Vlastimil Babka @ 2015-02-03  8:19 UTC (permalink / raw)
  To: Dave Hansen, Mel Gorman, linux-mm
  Cc: Minchan Kim, Andrew Morton, linux-kernel, linux-api, mtk.manpages,
	linux-man

[CC linux-api, man pages]

On 02/02/2015 11:22 PM, Dave Hansen wrote:
> On 02/02/2015 08:55 AM, Mel Gorman wrote:
>> This patch identifies when a thread is frequently calling MADV_DONTNEED
>> on the same region of memory and starts ignoring the hint. On an 8-core
>> single-socket machine this was the impact on ebizzy using glibc 2.19.
> 
> The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
> called:
> 
>>      MADV_DONTNEED
>>               Do  not  expect  access in the near future.  (For the time being, the application is finished with the given range, so the kernel can free resources
>>               associated with it.)  Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents  from  the
>>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
> 
> So if we have anything depending on the behavior that it's _always_
> zero-filled after an MADV_DONTNEED, this will break it.

OK, so that's a third person (including me) who understood it as a zero-fill
guarantee. I think the man page should be clarified (if it's indeed not
guaranteed), or we have a bug.

The implementation actually skips MADV_DONTNEED for
VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.

I'm not sure about VM_PFNMAP, these are probably special enough. For mlock, one
could expect that mlocking and MADV_DONTNEED would be in some opposition, but
it's not documented in the manpage AFAIK. Neither is the hugetlb case, which
could be really unexpected by the user.

Next, what the man page says about guarantees:

"The kernel is free to ignore the advice."

- that would suggest that nothing is guaranteed

"This call does not influence the semantics of the application (except in the
case of MADV_DONTNEED)"

- that depends on whether the reader understands it as "does influence by
MADV_DONTNEED" or "may influence by MADV_DONTNEED"

- btw, isn't MADV_DONTFORK another exception that does influence the semantics?
And since it's mentioned as a workaround for some hardware, is it OK to ignore
this advice?

And the part you already cited:

"Subsequent accesses of pages in this range will succeed, but will result either
in reloading of the memory contents from the underlying mapped file (see
mmap(2)) or zero-fill on-demand pages for mappings without an underlying file."

- The words "will result" did sound like a guarantee, at least to me. So here it
could be changed to "may result (unless the advice is ignored)"?

And if we agree that there is indeed no guarantee, what's the actual semantic
difference from MADV_FREE? I guess none? So there's only a possible performance
difference?

Vlastimil

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]     ` <54D08483.40209-AlSwsSmVLrQ@public.gmane.org>
@ 2015-02-03 10:53       ` Kirill A. Shutemov
  2015-02-03 11:42         ` Vlastimil Babka
  0 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2015-02-03 10:53 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Dave Hansen, Mel Gorman, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	Minchan Kim, Andrew Morton, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-man-u79uwXL29TY76Z2rM5mHXA

On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
> [CC linux-api, man pages]
> 
> On 02/02/2015 11:22 PM, Dave Hansen wrote:
> > On 02/02/2015 08:55 AM, Mel Gorman wrote:
> >> This patch identifies when a thread is frequently calling MADV_DONTNEED
> >> on the same region of memory and starts ignoring the hint. On an 8-core
> >> single-socket machine this was the impact on ebizzy using glibc 2.19.
> > 
> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
> > called:
> > 
> >>      MADV_DONTNEED
> >>               Do  not  expect  access in the near future.  (For the time being, the application is finished with the given range, so the kernel can free resources
> >>               associated with it.)  Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents  from  the
> >>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
> > 
> > So if we have anything depending on the behavior that it's _always_
> > zero-filled after an MADV_DONTNEED, this will break it.
> 
> OK, so that's a third person (including me) who understood it as a zero-fill
> guarantee. I think the man page should be clarified (if it's indeed not
> guaranteed), or we have a bug.
> 
> The implementation actually skips MADV_DONTNEED for
> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.

It doesn't skip. It fails with -EINVAL. Or am I missing something?
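For illustration, the -EINVAL behaviour for locked memory can be checked from userspace. This is a minimal sketch, not code from the thread: the helper name dontneed_on_locked_page() is hypothetical, and it assumes a Linux system where the process is allowed to mlock() at least one page.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page, mlock() it, then ask the
 * kernel to drop it with MADV_DONTNEED.
 * Returns 0 if madvise() succeeded, the errno value if madvise() failed,
 * or -1 if the setup (mmap or mlock) could not be done.
 */
int dontneed_on_locked_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int result;
    if (mlock(p, len) != 0) {
        result = -1;        /* setup failed, e.g. RLIMIT_MEMLOCK too small */
    } else if (madvise(p, len, MADV_DONTNEED) == 0) {
        result = 0;         /* the advice was accepted */
    } else {
        result = errno;     /* EINVAL is what the thread describes here */
    }

    munmap(p, len);         /* unmapping also drops the lock */
    return result;
}
```

A caller would compare the return value against EINVAL; a return of -1 only means the experiment could not be set up, not that the advice was ignored.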

> - The word "will result" did sound as a guarantee at least to me. So here it
> could be changed to "may result (unless the advice is ignored)"?

It's too late to fix the documentation. Applications already depend on the
behaviour.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-03  8:19   ` MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints Vlastimil Babka
       [not found]     ` <54D08483.40209-AlSwsSmVLrQ@public.gmane.org>
@ 2015-02-03 11:16     ` Mel Gorman
       [not found]       ` <20150203111600.GR2395-l3A5Bk7waGM@public.gmane.org>
  1 sibling, 1 reply; 20+ messages in thread
From: Mel Gorman @ 2015-02-03 11:16 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Dave Hansen, linux-mm, Minchan Kim, Andrew Morton, linux-kernel,
	linux-api, mtk.manpages, linux-man

On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
> [CC linux-api, man pages]
> 
> On 02/02/2015 11:22 PM, Dave Hansen wrote:
> > On 02/02/2015 08:55 AM, Mel Gorman wrote:
> >> This patch identifies when a thread is frequently calling MADV_DONTNEED
> >> on the same region of memory and starts ignoring the hint. On an 8-core
> >> single-socket machine this was the impact on ebizzy using glibc 2.19.
> > 
> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
> > called:
> > 
> >>      MADV_DONTNEED
> >>               Do  not  expect  access in the near future.  (For the time being, the application is finished with the given range, so the kernel can free resources
> >>               associated with it.)  Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents  from  the
> >>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
> > 
> > So if we have anything depending on the behavior that it's _always_
> > zero-filled after an MADV_DONTNEED, this will break it.
> 
> OK, so that's a third person (including me) who understood it as a zero-fill
> guarantee. I think the man page should be clarified (if it's indeed not
> guaranteed), or we have a bug.
> 
> The implementation actually skips MADV_DONTNEED for
> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.
> 

This was the first reason why I did not consider the zero-filling to be a
guarantee. That said, at this point I'm also not considering pushing this
patch towards the kernel. I agree that this is a glibc bug so I've dropped
a line to some glibc people to see what they think the approach should be.

> I'm not sure about VM_PFNMAP, these are probably special enough. For mlock, one
> could expect that mlocking and MADV_DONTNEED would be in some opposition, but
> it's not documented in the manpage AFAIK. Neither is the hugetlb case, which
> could be really unexpected by the user.
> 

The equivalent POSIX page also lacks details on how exactly this flag
should behave. hugetlb is sort of special in that it's always backed by
a ram-based file where the contents can be refaulted. It gets hairy when
the mapping has been created to look anonymous but is not anonymous
really. The semantics of hugetlb have always been fuzzy.

> Next, what the man page says about guarantees:
> 
> "The kernel is free to ignore the advice."
> 
> - that would suggest that nothing is guaranteed
> 

Yep, another reason why I did not clear the page when ignoring the hint.

> "This call does not influence the semantics of the application (except in the
> case of MADV_DONTNEED)"
> 
> - that depends if the reader understands it as "does influence by MADV_DONTNEED"
> or "may influence by MADV_DONTNEED"
> 
> - btw, isn't MADV_DONTFORK another exception that does influence the semantics?
> And since it's mentioned as a workaround for some hardware, is it OK to ignore
> this advice?
> 

MADV_DONTFORK is also a Linux-specific extension. It happens to be one
that if it gets ignored then the application will be very surprised.

> And the part you already cited:
> 
> "Subsequent accesses of pages in this range will succeed, but will result either
> in reloading of the memory contents from the underlying mapped file (see
> mmap(2)) or zero-fill on-demand pages for mappings without an underlying file."
> 
> - The word "will result" did sound as a guarantee at least to me. So here it
> could be changed to "may result (unless the advice is ignored)"?
> 

The wording should be "may result" as there are circumstances where it
gets ignored even without this prototype patch.

> And if we agree that there is indeed no guarantee, what's the actual semantic
> difference from MADV_FREE? I guess none? So there's only a possible performance
> difference?
> 

Timing. MADV_DONTNEED, if it has an effect, is immediate, is a heavier
operation, and RSS is reduced. MADV_FREE only has an impact in the future
if there is memory pressure.
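The "immediate" part can be observed with mincore(2), which reports whether a page is currently resident. This is only a sketch under stated assumptions: the helper names page_resident() and residency_around_dontneed() are hypothetical, and the result depends on the kernel honouring the advice and reporting residency accurately for anonymous memory.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the first page at addr is resident, 0 if not, -1 on error. */
static int page_resident(void *addr, size_t len)
{
    unsigned char vec = 0;          /* one status byte per page; len is one page */
    if (mincore(addr, len, &vec) != 0)
        return -1;
    return vec & 1;
}

/*
 * Hypothetical helper: touch one anonymous page, drop it with MADV_DONTNEED,
 * and report residency before and after the call.
 * Returns 0 on success, -1 if any step failed.
 */
int residency_around_dontneed(int *before, int *after)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);                       /* fault the page in */
    *before = page_resident(p, len);

    int rc = madvise(p, len, MADV_DONTNEED);    /* takes effect right away, if at all */
    *after = page_resident(p, len);

    munmap(p, len);
    return (rc == 0 && *before >= 0 && *after >= 0) ? 0 : -1;
}
```

If the advice takes effect, one would expect the page to be resident before the call and no longer resident straight after it, without waiting for memory pressure.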

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-03 10:53       ` Kirill A. Shutemov
@ 2015-02-03 11:42         ` Vlastimil Babka
  2015-02-03 16:20           ` Michael Kerrisk (man-pages)
       [not found]           ` <54D0B43D.8000209-AlSwsSmVLrQ@public.gmane.org>
  0 siblings, 2 replies; 20+ messages in thread
From: Vlastimil Babka @ 2015-02-03 11:42 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton,
	linux-kernel, linux-api, mtk.manpages, linux-man

On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote:
> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
>> [CC linux-api, man pages]
>> 
>> On 02/02/2015 11:22 PM, Dave Hansen wrote:
>> > On 02/02/2015 08:55 AM, Mel Gorman wrote:
>> >> This patch identifies when a thread is frequently calling MADV_DONTNEED
>> >> on the same region of memory and starts ignoring the hint. On an 8-core
>> >> single-socket machine this was the impact on ebizzy using glibc 2.19.
>> > 
>> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
>> > called:
>> > 
>> >>      MADV_DONTNEED
>> >>               Do  not  expect  access in the near future.  (For the time being, the application is finished with the given range, so the kernel can free resources
>> >>               associated with it.)  Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents  from  the
>> >>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
>> > 
>> > So if we have anything depending on the behavior that it's _always_
>> > zero-filled after an MADV_DONTNEED, this will break it.
>> 
>> OK, so that's a third person (including me) who understood it as a zero-fill
>> guarantee. I think the man page should be clarified (if it's indeed not
>> guaranteed), or we have a bug.
>> 
>> The implementation actually skips MADV_DONTNEED for
>> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.
> 
> It doesn't skip. It fails with -EINVAL. Or I miss something.

No, I missed that. Thanks for pointing that out. The manpage also explains EINVAL in
this case:

*  The application is attempting to release locked or shared pages (with
MADV_DONTNEED).

- that covers mlocking OK, not sure if the rest fits the "shared pages" case
though. I don't see any check for other kinds of shared pages in the code.

>> - The word "will result" did sound as a guarantee at least to me. So here it
>> could be changed to "may result (unless the advice is ignored)"?
> 
> It's too late to fix the documentation. Applications already depend on the
> behaviour.

Right, so as long as they check for EINVAL, it should be safe. It appears that
jemalloc does.
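To make the "check the return value" point concrete, here is a minimal sketch of how an allocator might purge a range. It is not jemalloc's actual code: the name purge_or_zero() is hypothetical, and it assumes addr is page-aligned and belongs to a private anonymous mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical allocator helper. After it returns, the range reads back as
 * zeros either way. The return value says whether the kernel dropped the
 * pages (true) or the advice was rejected and we zeroed by hand (false).
 */
bool purge_or_zero(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);

    if (madvise(addr, len, MADV_DONTNEED) == 0)
        return true;        /* pages dropped; they refault zero-filled */

    /* Advice rejected (e.g. EINVAL on locked memory): contents are intact. */
    memset(addr, 0, len);   /* zeroes the data but frees no memory */
    return false;
}
```

The design choice is simply that the caller never assumes zeroed memory unless madvise() reported success; the memset() fallback keeps the contents predictable at the cost of not releasing anything.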

I still wouldn't be sure just by reading the man page that the clearing is
guaranteed whenever I don't get an error return value, though.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]       ` <20150203111600.GR2395-l3A5Bk7waGM@public.gmane.org>
@ 2015-02-03 15:21         ` Michal Hocko
       [not found]           ` <20150203152121.GC8914-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Michal Hocko @ 2015-02-03 15:21 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Vlastimil Babka, Dave Hansen, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	Minchan Kim, Andrew Morton, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-man-u79uwXL29TY76Z2rM5mHXA

On Tue 03-02-15 11:16:00, Mel Gorman wrote:
> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
[...]
> > And if we agree that there is indeed no guarantee, what's the actual semantic
> > difference from MADV_FREE? I guess none? So there's only a possible performance
> > difference?
> > 
> 
> Timing. MADV_DONTNEED, if it has an effect, is immediate, is a heavier
> operation, and RSS is reduced. MADV_FREE only has an impact in the future
> if there is memory pressure.

JFTR, the man page for MADV_FREE has been proposed already
(https://lkml.org/lkml/2014/12/5/63 should be the last version AFAIR). I
do not see it in the man-pages git tree but the patch was not in time
for 3.19 so I guess it will only appear in 3.20.
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-03 11:42         ` Vlastimil Babka
@ 2015-02-03 16:20           ` Michael Kerrisk (man-pages)
       [not found]             ` <54D0F56A.9050003-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
       [not found]           ` <54D0B43D.8000209-AlSwsSmVLrQ@public.gmane.org>
  1 sibling, 1 reply; 20+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-02-03 16:20 UTC (permalink / raw)
  To: Vlastimil Babka, Kirill A. Shutemov
  Cc: mtk.manpages, Dave Hansen, Mel Gorman, linux-mm, Minchan Kim,
	Andrew Morton, linux-kernel, linux-api, linux-man, Hugh Dickins

Hello Vlastimil

Thanks for CCing me into this thread.

On 02/03/2015 12:42 PM, Vlastimil Babka wrote:
> On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote:
>> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
>>> [CC linux-api, man pages]
>>>
>>> On 02/02/2015 11:22 PM, Dave Hansen wrote:
>>>> On 02/02/2015 08:55 AM, Mel Gorman wrote:
>>>>> This patch identifies when a thread is frequently calling MADV_DONTNEED
>>>>> on the same region of memory and starts ignoring the hint. On an 8-core
>>>>> single-socket machine this was the impact on ebizzy using glibc 2.19.
>>>>
>>>> The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
>>>> called:
>>>>
>>>>>      MADV_DONTNEED
>>>>>               Do  not  expect  access in the near future.  (For the time being, the application is finished with the given range, so the kernel can free resources
>>>>>               associated with it.)  Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents  from  the
>>>>>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
>>>>
>>>> So if we have anything depending on the behavior that it's _always_
>>>> zero-filled after an MADV_DONTNEED, this will break it.
>>>
>>> OK, so that's a third person (including me) who understood it as a zero-fill
>>> guarantee. I think the man page should be clarified (if it's indeed not
>>> guaranteed), or we have a bug.
>>>
>>> The implementation actually skips MADV_DONTNEED for
>>> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.
>>
>> It doesn't skip. It fails with -EINVAL. Or I miss something.
> 
> No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in
> this case:
> 
> *  The application is attempting to release locked or shared pages (with
> MADV_DONTNEED).

Yes, there is that. But the page could be more explicit when discussing
MADV_DONTNEED in the main text. I've done that.

> - that covers mlocking ok, not sure if the rest fits the "shared pages" case
> though. I don't see any check for other kinds of shared pages in the code.

Agreed. "shared" here seems confused. I've removed it. And I've
added mention of "Huge TLB pages" for this error.

>>> - The word "will result" did sound as a guarantee at least to me. So here it
>>> could be changed to "may result (unless the advice is ignored)"?
>>
>> It's too late to fix the documentation. Applications already depend on the
>> behaviour.
> 
> Right, so as long as they check for EINVAL, it should be safe. It appears that
> jemalloc does.

So, first a brief question: in the cases where the call does not error out,
are we agreed that in the current implementation, MADV_DONTNEED will
always result in zero-filled pages when the region is faulted back in
(when we consider pages that are not backed by a file)?
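The behaviour being asked about can be probed with a small test. This is a sketch only: the helper name dontneed_zero_fills() is hypothetical, and it covers just the private anonymous case on a kernel that accepts the advice.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: dirty one private anonymous page, call
 * madvise(MADV_DONTNEED), then read the page back.
 * Returns 1 if every byte reads as zero, 0 if stale data is still visible,
 * or -1 if mmap() or madvise() failed.
 */
int dontneed_zero_fills(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);                   /* dirty the page */

    int result = 1;
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        result = -1;                        /* an error means nothing was dropped */
    } else {
        for (size_t i = 0; i < len; i++) {  /* reading faults the page back in */
            if (p[i] != 0) {
                result = 0;
                break;
            }
        }
    }

    munmap(p, len);
    return result;
}
```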

> I still wouldn't be sure just by reading the man page that the clearing is
> guaranteed whenever I don't get an error return value, though.

I'm not quite sure what you want here. I mean: if there's an error,
then the DONTNEED action didn't occur, right? Therefore, there won't
be zero-filled pages. But, for what it's worth, I added "If the
operation succeeds" at the start of that sentence beginning "Subsequent
accesses...".

Now, some history, explaining why the page is a bit of a mess,
and for that matter why I could really use more help on it from MM
folk (especially in the form of actual patches [1], rather than notes
about deficiencies in the documentation), because:

    ***I simply cannot keep up with all of the details***.

Once upon a time (Linux 2.4), there was madvise() with just 5 flags:

       MADV_NORMAL
       MADV_RANDOM
       MADV_SEQUENTIAL
       MADV_WILLNEED
       MADV_DONTNEED

And already a dozen years ago, *I* added the text about MADV_DONTNEED.
Back then, I believe it was true. I'm not sure if it's still true now,
but I assume for the moment that it is, and await feedback. And the 
text saying that the call does not affect the semantics of memory 
access dates back even further (and was then true, MADV_DONTNEED aside).

Those 5 flags have analogs in POSIX's posix_madvise() (albeit, there
is a semantic mismatch between the destructive MADV_DONTNEED and
POSIX's nondestructive POSIX_MADV_DONTNEED). They also appear
on most other implementations.
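The mismatch can be seen by giving the POSIX advice to a dirty anonymous page. This is a sketch under assumptions: the helper name posix_dontneed_preserves_data() is hypothetical, and it relies on the C library treating POSIX_MADV_DONTNEED as nondestructive, as POSIX requires, rather than passing it straight through to madvise().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: dirty one private anonymous page, issue
 * posix_madvise(POSIX_MADV_DONTNEED), then check the contents.
 * Returns 1 if the data survived, 0 if it was lost, -1 on error.
 */
int posix_dontneed_preserves_data(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0x5A, len);

    int result = 1;
    /* posix_madvise() returns an error number directly; it does not set errno. */
    if (posix_madvise(p, len, POSIX_MADV_DONTNEED) != 0) {
        result = -1;
    } else {
        for (size_t i = 0; i < len; i++) {
            if (p[i] != 0x5A) {
                result = 0;
                break;
            }
        }
    }

    munmap(p, len);
    return result;
}
```

Swapping the posix_madvise() call for madvise(p, len, MADV_DONTNEED) is the destructive Linux variant discussed in the rest of the thread.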

Since the original implementation, numerous pieces of cruft^W^W^W
excellent new flags have been overloaded into this one system call.
Some of those certainly violated the "does not change the semantics
of the application" statement, but, sadly, the kernel developers who
implemented MADV_REMOVE or MADV_DONTFORK did not think to send a
patch to the man page for those new flags, one that might have noted
that the semantics of the application are changed by such flags. Equally
sadly, I neglected to scan the rest of the page when *I* added
documentation of these flags to the page, otherwise I might have
caught that detail.

So, just to repeat, I could really use more help on it from MM
folk in the form of actual patches to the man page.

Thanks,

Michael

[1] https://www.kernel.org/doc/man-pages/patches.html

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]           ` <20150203152121.GC8914-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2015-02-03 16:25             ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 20+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-02-03 16:25 UTC (permalink / raw)
  To: Michal Hocko, Mel Gorman
  Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, minchan Kim, Dave Hansen,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg

On 02/03/2015 04:21 PM, Michal Hocko wrote:
> On Tue 03-02-15 11:16:00, Mel Gorman wrote:
>> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
> [...]
>>> And if we agree that there is indeed no guarantee, what's the actual semantic
>>> difference from MADV_FREE? I guess none? So there's only a possible performance
>>> difference?
>>>
>>
>> Timing. MADV_DONTNEED, if it has an effect, is immediate, is a heavier
>> operation, and RSS is reduced. MADV_FREE only has an impact in the future
>> if there is memory pressure.
> 
> JFTR. the man page for MADV_FREE has been proposed already
> (https://lkml.org/lkml/2014/12/5/63 should be the last version AFAIR). I
> do not see it in the man-pages git tree but the patch was not in time
> for 3.19 so I guess it will only appear in 3.20.
> 

Yikes! That patch was buried in the bottom of a locked filing cabinet
in a disused lavatory. I unfortunately don't read every thread that comes
my way, especially if it doesn't look like a man-pages patch (i.e., falls
in the middle of an LKML thread that starts on another topic, and doesn't 
see linux-man@). I'll respond to that patch soon. (There are some problems
that mean I could not accept it, AFAICT.)

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]           ` <54D0B43D.8000209-AlSwsSmVLrQ@public.gmane.org>
@ 2015-02-04  0:09             ` Minchan Kim
  0 siblings, 0 replies; 20+ messages in thread
From: Minchan Kim @ 2015-02-04  0:09 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kirill A. Shutemov, Dave Hansen, Mel Gorman,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrew Morton,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Rik van Riel

On Tue, Feb 03, 2015 at 12:42:53PM +0100, Vlastimil Babka wrote:
> On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote:
> > On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
> >> [CC linux-api, man pages]
> >> 
> >> On 02/02/2015 11:22 PM, Dave Hansen wrote:
> >> > On 02/02/2015 08:55 AM, Mel Gorman wrote:
> >> >> This patch identifies when a thread is frequently calling MADV_DONTNEED
> >> >> on the same region of memory and starts ignoring the hint. On an 8-core
> >> >> single-socket machine this was the impact on ebizzy using glibc 2.19.
> >> > 
> >> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
> >> > called:
> >> > 
> >> >>      MADV_DONTNEED
> >> >>               Do  not  expect  access in the near future.  (For the time being, the application is finished with the given range, so the kernel can free resources
> >> >>               associated with it.)  Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents  from  the
> >> >>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
> >> > 
> >> > So if we have anything depending on the behavior that it's _always_
> >> > zero-filled after an MADV_DONTNEED, this will break it.
> >> 
> >> OK, so that's a third person (including me) who understood it as a zero-fill
> >> guarantee. I think the man page should be clarified (if it's indeed not
> >> guaranteed), or we have a bug.
> >> 
> >> The implementation actually skips MADV_DONTNEED for
> >> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.
> > 
> > It doesn't skip. It fails with -EINVAL. Or I miss something.
> 
> No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in
> this case:
> 
> *  The application is attempting to release locked or shared pages (with
> MADV_DONTNEED).
> 
> - that covers mlocking ok, not sure if the rest fits the "shared pages" case
> though. I don't see any check for other kinds of shared pages in the code.
> 
> >> - The word "will result" did sound as a guarantee at least to me. So here it
> >> could be changed to "may result (unless the advice is ignored)"?
> > 
> > It's too late to fix the documentation. Applications already depend on the
> > behaviour.
> 
> Right, so as long as they check for EINVAL, it should be safe. It appears that
> jemalloc does.
> 
> I still wouldn't be sure just by reading the man page that the clearing is
> guaranteed whenever I don't get an error return value, though.
> 

IMHO,

The man page says
"MADV_DONTNEED: Subsequent accesses of pages in this range will succeed,
 but will result either in reloading of  the memory contents from the
 underlying mapped file (see mmap(2)) or  zero-fill-on-demand pages
 for mappings without an underlying file."

The heap allocated by malloc(3) consists of anonymous pages, so it is a mapping
without an underlying file and userspace can expect zero-fill.

The man page says
"EINVAL: The application is attempting to release locked or
shared pages (with MADV_DONTNEED)"

So the user can expect that the call on an area allocated by malloc(3)
will always succeed, as long as mlock has not been called on it.

The man page says
"madvise: This call does not influence the semantics of the application
(except in the case of MADV_DONTNEED)"

So we shouldn't break MADV_DONTNEED's semantics of freeing pages
instantly. These are long-standing semantics, and they were one of the
contentious issues with MADV_FREE when Rik tried, a long time ago, to
replace MADV_DONTNEED with MADV_FREE.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]             ` <54D0F56A.9050003-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-02-04 13:46               ` Vlastimil Babka
       [not found]                 ` <54D22298.3040504-AlSwsSmVLrQ@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2015-02-04 13:46 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages), Kirill A. Shutemov
  Cc: Dave Hansen, Mel Gorman, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	Minchan Kim, Andrew Morton, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Hugh Dickins

On 02/03/2015 05:20 PM, Michael Kerrisk (man-pages) wrote:
> Hello Vlastimil
>
> Thanks for CCing me into this thread.

NP

> On 02/03/2015 12:42 PM, Vlastimil Babka wrote:
>> On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote:
>>> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
>>>
>>> It doesn't skip. It fails with -EINVAL. Or I miss something.
>>
>> No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in
>> this case:
>>
>> *  The application is attempting to release locked or shared pages (with
>> MADV_DONTNEED).
>
> Yes, there is that. But the page could be more explicit when discussing
> MADV_DONTNEED in the main text. I've done that.
>
>> - that covers mlocking ok, not sure if the rest fits the "shared pages" case
>> though. I don't see any check for other kinds of shared pages in the code.
>
> Agreed. "shared" here seems confused. I've removed it. And I've
> added mention of "Huge TLB pages" for this error.
>

Thanks.

>>>> - The word "will result" did sound as a guarantee at least to me. So here it
>>>> could be changed to "may result (unless the advice is ignored)"?
>>>
>>> It's too late to fix the documentation. Applications already depend on the
>>> behaviour.
>>
>> Right, so as long as they check for EINVAL, it should be safe. It appears that
>> jemalloc does.
>
> So, first a brief question: in the cases where the call does not error out,
> are we agreed that in the current implementation, MADV_DONTNEED will
> always result in zero-filled pages when the region is faulted back in
> (when we consider pages that are not backed by a file)?

I'd agree at this point.
Also we should probably mention anonymously shared pages (shmem). I 
think they behave the same as file here.
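A minimal sketch of the shmem case follows. The helper name shared_anon_survives_dontneed() is hypothetical, and the expectation that the data survives is an assumption about how a MAP_SHARED | MAP_ANONYMOUS mapping is backed, namely by shmem pages that stay in the page cache when the page-table entries are dropped.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: dirty one shared anonymous page, call
 * madvise(MADV_DONTNEED), then read the page back.
 * Returns 1 if the old contents are still there (file-like behaviour),
 * 0 if the page changed, or -1 if mmap() or madvise() failed.
 */
int shared_anon_survives_dontneed(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xC3, len);

    int result = 1;
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        result = -1;
    } else {
        for (size_t i = 0; i < len; i++) {  /* refault from the backing object */
            if (p[i] != 0xC3) {
                result = 0;
                break;
            }
        }
    }

    munmap(p, len);
    return result;
}
```

Changing MAP_SHARED to MAP_PRIVATE in the mmap() call gives the zero-fill case instead, which is the distinction the man page text would need to spell out.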

>> I still wouldn't be sure just by reading the man page that the clearing is
>> guaranteed whenever I don't get an error return value, though.
>
> I'm not quite sure what you want here. I mean: if there's an error,

I was just reiterating that the guarantee is not clear if you
consider all the statements in the man page.

> then the DONTNEED action didn't occur, right? Therefore, there won't
> be zero-filled pages. But, for what it's worth, I added "If the
> operation succeeds" at the start of that sentence beginning "Subsequent
> accesses...".

Yes, that should clarify it. Thanks!

> Now, some history, explaining why the page is a bit of a mess,
> and for that matter why I could really use more help on it from MM
> folk (especially in the form of actual patches [1], rather than notes
> about deficiencies in the documentation), because:
>
>      ***I simply cannot keep up with all of the details***.

I see, and I expected it would be like this. I would just send a patch if
the situation was clear, but here we should agree first, and I thought
you should be involved from the beginning.

> Once upon a time (Linux 2.4), there was madvise() with just 5 flags:
>
>         MADV_NORMAL
>         MADV_RANDOM
>         MADV_SEQUENTIAL
>         MADV_WILLNEED
>         MADV_DONTNEED
>
> And already a dozen years ago, *I* added the text about MADV_DONTNEED.
> Back then, I believe it was true. I'm not sure if it's still true now,
> but I assume for the moment that it is, and await feedback. And the
> text saying that the call does not affect the semantics of memory
> access dates back even further (and was then true, MADV_DONTNEED aside).
>
> Those 5 flags have analogs in POSIX's posix_madvise() (albeit, there
> is a semantic mismatch between the destructive MADV_DONTNEED and
> POSIX's nondestructive POSIX_MADV_DONTNEED). They also appear
> on most other implementations.
>
> Since the original implementation, numerous pieces of cruft^W^W^W
> excellent new flags have been overloaded into this one system call.
> Some of those certainly violated the "does not change the semantics
> of the application" statement, but, sadly, the kernel developers who
> implemented MADV_REMOVE or MADV_DONTFORK did not think to send a
> patch to the man page for those new flags, one that might have noted
> that the semantics of the application are changed by such flags. Equally
> sadly, I did overlook to scan the bigger page when *I* added
> documentation of these flags to those pages, otherwise I might have
> caught that detail.
>
> So, just to repeat, I  could really use more help on it from MM
> folk in the form of actual patches to the man page.

Thanks for the background. I'll try to remember to check the man-pages 
part when I review an API-changing patch.

> Thanks,
>
> Michael
>
> [1] https://www.kernel.org/doc/man-pages/patches.html
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]                 ` <54D22298.3040504-AlSwsSmVLrQ@public.gmane.org>
@ 2015-02-04 14:00                   ` Michael Kerrisk (man-pages)
  2015-02-04 17:02                     ` Vlastimil Babka
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-02-04 14:00 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kirill A. Shutemov, Dave Hansen, Mel Gorman,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Minchan Kim,
	Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins

Hello Vlastimil,

On 4 February 2015 at 14:46, Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> wrote:
> On 02/03/2015 05:20 PM, Michael Kerrisk (man-pages) wrote:
>>
>> On 02/03/2015 12:42 PM, Vlastimil Babka wrote:
>>>
>>> On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote:
>>>>
>>>> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
>>>>
>>>> It doesn't skip. It fails with -EINVAL. Or I miss something.
>>>
>>>
>>> No, I missed that. Thanks for pointing out. The manpage also explains
>>> EINVAL in
>>> this case:
>>>
>>> *  The application is attempting to release locked or shared pages (with
>>> MADV_DONTNEED).
>>
>> Yes, there is that. But the page could be more explicit when discussing
>> MADV_DONTNEED in the main text. I've done that.
>>
>>> - that covers mlocking ok, not sure if the rest fits the "shared pages"
>>> case
>>> though. I don't see any check for other kinds of shared pages in the code.
>>
>> Agreed. "shared" here seems confused. I've removed it. And I've
>> added mention of "Huge TLB pages" for this error.
>
> Thanks.

I also added those cases for MADV_REMOVE, BTW.

>>>>> - The word "will result" did sound as a guarantee at least to me. So
>>>>> here it
>>>>> could be changed to "may result (unless the advice is ignored)"?
>>>>
>>>> It's too late to fix documentation. Applications already depend on the
>>>> behaviour.
>>>
>>> Right, so as long as they check for EINVAL, it should be safe. It appears
>>> that
>>> jemalloc does.
>>
>>
>> So, first a brief question: in the cases where the call does not error
>> out,
>> are we agreed that in the current implementation, MADV_DONTNEED will
>> always result in zero-filled pages when the region is faulted back in
>> (when we consider pages that are not backed by a file)?
>
>
> I'd agree at this point.

Thanks for the confirmation.

> Also we should probably mention anonymously shared pages (shmem). I think
> they behave the same as file here.

You mean tmpfs here, right? (I don't keep all of the synonyms straight.)

>>> I still wouldn't be sure just by reading the man page that the clearing is
>>> guaranteed whenever I don't get an error return value, though,
>>
>> I'm not quite sure what you want here. I mean: if there's an error,
>
> I was just reiterating that the guarantee is not clear if you consider
> all the statements in the man page.
>
>> then the DONTNEED action didn't occur, right? Therefore, there won't
>> be zero-filled pages. But, for what it's worth, I added "If the
>> operation succeeds" at the start of that sentence beginning "Subsequent
>> accesses...".
>
> Yes, that should clarify it. Thanks!

Okay.

>> Now, some history, explaining why the page is a bit of a mess,
>> and for that matter why I could really use more help on it from MM
>> folk (especially in the form of actual patches [1], rather than notes
>> about deficiencies in the documentation), because:
>>
>>      ***I simply cannot keep up with all of the details***.
>
> I see, and I expected it would be like this. I would just send a patch if the
> situation were clear, but here we should agree first, and I thought you
> should be involved from the beginning.

Sorry -- I should have made it clearer, this statement was not
targeted at you personally, or even necessarily at this particular
thread. It was a general comment, that came up sharply to me as I
looked at how much cruft there is in the madvise() page.

>> Once upon a time (Linux 2.4), there was madvise() with just 5 flags:
>>
>>         MADV_NORMAL
>>         MADV_RANDOM
>>         MADV_SEQUENTIAL
>>         MADV_WILLNEED
>>         MADV_DONTNEED
>>
>> And already a dozen years ago, *I* added the text about MADV_DONTNEED.
>> Back then, I believe it was true. I'm not sure if it's still true now,
>> but I assume for the moment that it is, and await feedback. And the
>> text saying that the call does not affect the semantics of memory
>> access dates back even further (and was then true, MADV_DONTNEED aside).
>>
>> Those 5 flags have analogs in POSIX's posix_madvise() (albeit, there
>> is a semantic mismatch between the destructive MADV_DONTNEED and
>> POSIX's nondestructive POSIX_MADV_DONTNEED). They also appear
>> on most other implementations.
>>
>> Since the original implementation, numerous pieces of cruft^W^W^W
>> excellent new flags have been overloaded into this one system call.
>> Some of those certainly violated the "does not change the semantics
>> of the application" statement, but, sadly, the kernel developers who
>> implemented MADV_REMOVE or MADV_DONTFORK did not think to send a
>> patch to the man page for those new flags, one that might have noted
>> that the semantics of the application are changed by such flags. Equally
>> sadly, I did overlook to scan the bigger page when *I* added
>> documentation of these flags to those pages, otherwise I might have
>> caught that detail.
>>
>> So, just to repeat, I  could really use more help on it from MM
>> folk in the form of actual patches to the man page.
>
> Thanks for the background. I'll try to remember to check the man-pages part
> when I review an API-changing patch.

That would be great.

Thanks,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-04 14:00                   ` Michael Kerrisk (man-pages)
@ 2015-02-04 17:02                     ` Vlastimil Babka
       [not found]                       ` <54D2508A.9030804-AlSwsSmVLrQ@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2015-02-04 17:02 UTC (permalink / raw)
  To: mtk.manpages
  Cc: Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm@kvack.org,
	Minchan Kim, Andrew Morton, lkml, Linux API, linux-man,
	Hugh Dickins

On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote:
> Hello Vlastimil,
>
> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages"
>>>> case
>>>> though. I don't see any check for other kinds of shared pages in the code.
>>>
>>> Agreed. "shared" here seems confused. I've removed it. And I've
>>> added mention of "Huge TLB pages" for this error.
>>
>> Thanks.
>
> I also added those cases for MADV_REMOVE, BTW.

Right. There's also the following for MADV_REMOVE that needs updating:

"Currently, only shmfs/tmpfs supports this; other filesystems return 
with the error ENOSYS."

- it's not just shmem/tmpfs anymore. It would be best to refer to the 
FALLOC_FL_PUNCH_HOLE option in fallocate(2), which seems to be (more) up to 
date.

- AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error 
code is listed in the ERRORS section.

>>>>>> - The word "will result" did sound as a guarantee at least to me. So
>>>>>> here it
>>>>>> could be changed to "may result (unless the advice is ignored)"?
>>>>>
>>>>> It's too late to fix documentation. Applications already depend on the
>>>>> behaviour.
>>>>
>>>> Right, so as long as they check for EINVAL, it should be safe. It appears
>>>> that
>>>> jemalloc does.
>>>
>>>
>>> So, first a brief question: in the cases where the call does not error
>>> out,
>>> are we agreed that in the current implementation, MADV_DONTNEED will
>>> always result in zero-filled pages when the region is faulted back in
>>> (when we consider pages that are not backed by a file)?
>>
>>
>> I'd agree at this point.
>
> Thanks for the confirmation.
>
>> Also we should probably mention anonymously shared pages (shmem). I think
>> they behave the same as file here.
>
> You mean tmpfs here, right? (I don't keep all of the synonyms straight.)

shmem covers tmpfs (that by itself would fit under "files" just fine), but 
also System V segments created by shmget(2) and mappings created by 
mmap with MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single 
manpage to refer to the full list.
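
The shmem case can be checked from userspace. What follows is only a
minimal sketch, assuming Linux and Python 3.8 or later (whose mmap
module wraps madvise(2)); the helper name dontneed_roundtrip is made up
for illustration. A MAP_SHARED | MAP_ANONYMOUS mapping is shmem-backed,
so after MADV_DONTNEED the data should come back from the backing
object, as it would for a file, rather than being zero-filled:

```python
import mmap

def dontneed_roundtrip(flags, payload=b"hello"):
    """Write payload, call MADV_DONTNEED, and return what is read back."""
    # fileno=-1 asks for an anonymous mapping (no named file behind it).
    with mmap.mmap(-1, mmap.PAGESIZE, flags=flags | mmap.MAP_ANONYMOUS) as m:
        m[:len(payload)] = payload
        # Raises OSError (e.g. EINVAL) if the kernel rejects the call.
        m.madvise(mmap.MADV_DONTNEED)
        return m[:len(payload)]

# Shared anonymous mapping: contents are expected to survive.
print(dontneed_roundtrip(mmap.MAP_SHARED))
```

Because madvise() raises on failure, reaching the final read implies
that the operation itself succeeded.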

Thanks,
Vlastimil

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]                       ` <54D2508A.9030804-AlSwsSmVLrQ@public.gmane.org>
@ 2015-02-04 19:24                         ` Michael Kerrisk (man-pages)
  2015-02-05  1:07                           ` Minchan Kim
  2015-02-05 15:41                           ` Michal Hocko
  0 siblings, 2 replies; 20+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-02-04 19:24 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kirill A. Shutemov, Dave Hansen, Mel Gorman,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Minchan Kim,
	Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins

On 4 February 2015 at 18:02, Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> wrote:
> On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote:
>>
>> Hello Vlastimil,
>>
>> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> wrote:
>>>>>
>>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages"
>>>>> case
>>>>> though. I don't see any check for other kinds of shared pages in the
>>>>> code.
>>>>
>>>>
>>>> Agreed. "shared" here seems confused. I've removed it. And I've
>>>> added mention of "Huge TLB pages" for this error.
>>>
>>>
>>> Thanks.
>>
>>
>> I also added those cases for MADV_REMOVE, BTW.
>
>
> Right. There's also the following for MADV_REMOVE that needs updating:
>
> "Currently, only shmfs/tmpfs supports this; other filesystems return with
> the error ENOSYS."
>
> - it's not just shmem/tmpfs anymore. It should be best to refer to
> fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to
> date.
>
> - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is
> listed in the ERRORS section.

Yup, I recently added that as well, based on a patch from Jan Chaloupka.

>>>>>>> - The word "will result" did sound as a guarantee at least to me. So
>>>>>>> here it
>>>>>>> could be changed to "may result (unless the advice is ignored)"?
>>>>>>
>>>>>> It's too late to fix documentation. Applications already depend on
>>>>>> the
>>>>>> behaviour.
>>>>>
>>>>> Right, so as long as they check for EINVAL, it should be safe. It
>>>>> appears
>>>>> that
>>>>> jemalloc does.
>>>>
>>>> So, first a brief question: in the cases where the call does not error
>>>> out,
>>>> are we agreed that in the current implementation, MADV_DONTNEED will
>>>> always result in zero-filled pages when the region is faulted back in
>>>> (when we consider pages that are not backed by a file)?
>>>
>>> I'd agree at this point.
>>
>> Thanks for the confirmation.
>>
>>> Also we should probably mention anonymously shared pages (shmem). I think
>>> they behave the same as file here.
>>
>> You mean tmpfs here, right? (I don't keep all of the synonyms straight.)
>
> shmem is tmpfs (that by itself would fit under "files" just fine), but also
> sys V segments created by shmget(2) and also mappings created by mmap with
> MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to
> refer to the full list.

So, how about this text:

              After a successful MADV_DONTNEED operation, the seman‐
              tics  of  memory  access  in  the specified region are
              changed: subsequent accesses of  pages  in  the  range
              will  succeed,  but will result in either reloading of
              the memory contents from the  underlying  mapped  file
              (for  shared file mappings, shared anonymous mappings,
              and shmem-based techniques such  as  System  V  shared
              memory  segments)  or  zero-fill-on-demand  pages  for
              anonymous private mappings.
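
The zero-fill half of that text can be illustrated with a small sketch.
It assumes Linux and Python 3.8 or later (mmap.madvise() wraps
madvise(2)); the function name private_anon_after_dontneed is
hypothetical and not taken from any man page:

```python
import mmap

def private_anon_after_dontneed(payload=b"hello"):
    """Return the mapping contents before and after MADV_DONTNEED."""
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    with mmap.mmap(-1, mmap.PAGESIZE, flags=flags) as m:
        m[:len(payload)] = payload
        before = m[:len(payload)]
        # Raises OSError if the call fails, so the read below only runs
        # after a successful MADV_DONTNEED.
        m.madvise(mmap.MADV_DONTNEED)
        # This access faults in a fresh zero-fill-on-demand page.
        after = m[:len(payload)]
    return before, after

before, after = private_anon_after_dontneed()
print(before, after)
```

For an anonymous private mapping the second value is expected to be all
zero bytes, which is the behaviour applications already depend on.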

Thanks,

Michael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-04 19:24                         ` Michael Kerrisk (man-pages)
@ 2015-02-05  1:07                           ` Minchan Kim
  2015-02-06 15:41                             ` Michael Kerrisk (man-pages)
  2015-02-05 15:41                           ` Michal Hocko
  1 sibling, 1 reply; 20+ messages in thread
From: Minchan Kim @ 2015-02-05  1:07 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman,
	linux-mm@kvack.org, Andrew Morton, lkml, Linux API, linux-man,
	Hugh Dickins

Hello,

On Wed, Feb 04, 2015 at 08:24:27PM +0100, Michael Kerrisk (man-pages) wrote:
> On 4 February 2015 at 18:02, Vlastimil Babka <vbabka@suse.cz> wrote:
> > On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote:
> >>
> >> Hello Vlastimil,
> >>
> >> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@suse.cz> wrote:
> >>>>>
> >>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages"
> >>>>> case
> >>>>> though. I don't see any check for other kinds of shared pages in the
> >>>>> code.
> >>>>
> >>>>
> >>>> Agreed. "shared" here seems confused. I've removed it. And I've
> >>>> added mention of "Huge TLB pages" for this error.
> >>>
> >>>
> >>> Thanks.
> >>
> >>
> >> I also added those cases for MADV_REMOVE, BTW.
> >
> >
> > Right. There's also the following for MADV_REMOVE that needs updating:
> >
> > "Currently, only shmfs/tmpfs supports this; other filesystems return with
> > the error ENOSYS."
> >
> > - it's not just shmem/tmpfs anymore. It should be best to refer to
> > fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to
> > date.
> >
> > - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is
> > listed in the ERRORS section.
> 
> Yup, I recently added that as well, based on a patch from Jan Chaloupka.
> 
> >>>>>>> - The word "will result" did sound as a guarantee at least to me. So
> >>>>>>> here it
> >>>>>>> could be changed to "may result (unless the advice is ignored)"?
> >>>>>>
> >>>>>> It's too late to fix documentation. Applications already depend on
> >>>>>> the
> >>>>>> behaviour.
> >>>>>
> >>>>> Right, so as long as they check for EINVAL, it should be safe. It
> >>>>> appears
> >>>>> that
> >>>>> jemalloc does.
> >>>>
> >>>> So, first a brief question: in the cases where the call does not error
> >>>> out,
> >>>> are we agreed that in the current implementation, MADV_DONTNEED will
> >>>> always result in zero-filled pages when the region is faulted back in
> >>>> (when we consider pages that are not backed by a file)?
> >>>
> >>> I'd agree at this point.
> >>
> >> Thanks for the confirmation.
> >>
> >>> Also we should probably mention anonymously shared pages (shmem). I think
> >>> they behave the same as file here.
> >>
> >> You mean tmpfs here, right? (I don't keep all of the synonyms straight.)
> >
> > shmem is tmpfs (that by itself would fit under "files" just fine), but also
> > sys V segments created by shmget(2) and also mappings created by mmap with
> > MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to
> > refer to the full list.
> 
> So, how about this text:
> 
>               After a successful MADV_DONTNEED operation, the seman‐
>               tics  of  memory  access  in  the specified region are
>               changed: subsequent accesses of  pages  in  the  range
>               will  succeed,  but will result in either reloading of
>               the memory contents from the  underlying  mapped  file
>               (for  shared file mappings, shared anonymous mappings,
>               and shmem-based techniques such  as  System  V  shared
>               memory  segments)  or  zero-fill-on-demand  pages  for
>               anonymous private mappings.

Hmm, I'd like to clarify.

Whether it was intended or not, some userspace developers assume that,
when the syscall returns without error, the pages are dropped instantly,
so that they will see more free pages (i.e., the RSS of the process will
decrease) while the VMA is kept. Can we rely on that?

And we should fix the ERRORS section, too.
"locked" covers mlock(2), and you said you will add hugetlb. What about
VM_PFNMAP? The call fails in that case as well. How should we describe
VM_PFNMAP? As a special mapping used by some drivers?

One more thing: "The kernel is free to ignore the advice".
That conflicts with "This call does not influence the semantics of the
application (except in the case of MADV_DONTNEED)", so
is it okay to read it as "The kernel is free to ignore the advice,
except for MADV_DONTNEED"?

Thanks.
-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-04 19:24                         ` Michael Kerrisk (man-pages)
  2015-02-05  1:07                           ` Minchan Kim
@ 2015-02-05 15:41                           ` Michal Hocko
  2015-02-06 15:57                             ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 20+ messages in thread
From: Michal Hocko @ 2015-02-05 15:41 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman,
	linux-mm@kvack.org, Minchan Kim, Andrew Morton, lkml, Linux API,
	linux-man, Hugh Dickins

On Wed 04-02-15 20:24:27, Michael Kerrisk wrote:
[...]
> So, how about this text:
> 
>               After a successful MADV_DONTNEED operation, the seman‐
>               tics  of  memory  access  in  the specified region are
>               changed: subsequent accesses of  pages  in  the  range
>               will  succeed,  but will result in either reloading of
>               the memory contents from the  underlying  mapped  file

"
result in either providing the up-to-date contents of the underlying
mapped file
"

Would be more precise IMO because reload might be interpreted as a major
fault which is not necessarily the case (see below).

>               (for  shared file mappings, shared anonymous mappings,
>               and shmem-based techniques such  as  System  V  shared
>               memory  segments)  or  zero-fill-on-demand  pages  for
>               anonymous private mappings.

Yes, this wording is better because many users are not aware of
MAP_ANON|MAP_SHARED being file backed in fact and mmap man page doesn't
mention that.

I am just wondering whether it makes sense to mention that MADV_DONTNEED
on shared mappings might be surprising: it does not free the backing
pages, so no memory is really freed until there is memory
pressure. But maybe this is too implementation specific for a man
page. What about the following wording on top of yours?
"
Please note that the MADV_DONTNEED hint on shared mappings might not
lead to immediate freeing of pages in the range. The kernel is free to
delay this until an appropriate moment. RSS of the calling process will
be reduced however.
"
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-05  1:07                           ` Minchan Kim
@ 2015-02-06 15:41                             ` Michael Kerrisk (man-pages)
       [not found]                               ` <54D4E098.8050004-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-02-06 15:41 UTC (permalink / raw)
  To: Minchan Kim
  Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, Vlastimil Babka,
	Kirill A. Shutemov, Dave Hansen, Mel Gorman,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Andrew Morton,
	lkml, Linux API, linux-man, Hugh Dickins

On 02/05/2015 02:07 AM, Minchan Kim wrote:
> Hello,
> 
> On Wed, Feb 04, 2015 at 08:24:27PM +0100, Michael Kerrisk (man-pages) wrote:
>> On 4 February 2015 at 18:02, Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> wrote:
>>> On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote:
>>>>
>>>> Hello Vlastimil,
>>>>
>>>> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org> wrote:
>>>>>>>
>>>>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages"
>>>>>>> case
>>>>>>> though. I don't see any check for other kinds of shared pages in the
>>>>>>> code.
>>>>>>
>>>>>>
>>>>>> Agreed. "shared" here seems confused. I've removed it. And I've
>>>>>> added mention of "Huge TLB pages" for this error.
>>>>>
>>>>>
>>>>> Thanks.
>>>>
>>>>
>>>> I also added those cases for MADV_REMOVE, BTW.
>>>
>>>
>>> Right. There's also the following for MADV_REMOVE that needs updating:
>>>
>>> "Currently, only shmfs/tmpfs supports this; other filesystems return with
>>> the error ENOSYS."
>>>
>>> - it's not just shmem/tmpfs anymore. It should be best to refer to
>>> fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to
>>> date.
>>>
>>> - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is
>>> listed in the ERRORS section.
>>
>> Yup, I recently added that as well, based on a patch from Jan Chaloupka.
>>
>>>>>>>>> - The word "will result" did sound as a guarantee at least to me. So
>>>>>>>>> here it
>>>>>>>>> could be changed to "may result (unless the advice is ignored)"?
>>>>>>>>
>>>>>>>> It's too late to fix documentation. Applications already depend on
>>>>>>>> the
>>>>>>>> behaviour.
>>>>>>>
>>>>>>> Right, so as long as they check for EINVAL, it should be safe. It
>>>>>>> appears
>>>>>>> that
>>>>>>> jemalloc does.
>>>>>>
>>>>>> So, first a brief question: in the cases where the call does not error
>>>>>> out,
>>>>>> are we agreed that in the current implementation, MADV_DONTNEED will
>>>>>> always result in zero-filled pages when the region is faulted back in
>>>>>> (when we consider pages that are not backed by a file)?
>>>>>
>>>>> I'd agree at this point.
>>>>
>>>> Thanks for the confirmation.
>>>>
>>>>> Also we should probably mention anonymously shared pages (shmem). I think
>>>>> they behave the same as file here.
>>>>
>>>> You mean tmpfs here, right? (I don't keep all of the synonyms straight.)
>>>
>>> shmem is tmpfs (that by itself would fit under "files" just fine), but also
>>> sys V segments created by shmget(2) and also mappings created by mmap with
>>> MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to
>>> refer to the full list.
>>
>> So, how about this text:
>>
>>               After a successful MADV_DONTNEED operation, the seman‐
>>               tics  of  memory  access  in  the specified region are
>>               changed: subsequent accesses of  pages  in  the  range
>>               will  succeed,  but will result in either reloading of
>>               the memory contents from the  underlying  mapped  file
>>               (for  shared file mappings, shared anonymous mappings,
>>               and shmem-based techniques such  as  System  V  shared
>>               memory  segments)  or  zero-fill-on-demand  pages  for
>>               anonymous private mappings.
> 
> Hmm, I'd like to clarify.
> 
> Whether it was intended or not, some userspace developers assume that,
> when the syscall returns without error, the pages are dropped instantly,
> so that they will see more free pages (i.e., the RSS of the process will
> decrease) while the VMA is kept. Can we rely on that?

I do not know. Michael?

> And we should fix the ERRORS section, too.
> "locked" covers mlock(2), and you said you will add hugetlb. What about
> VM_PFNMAP? The call fails in that case as well. How should we describe
> VM_PFNMAP? As a special mapping used by some drivers?

I'm open for offers on what to add.
 
> One more thing: "The kernel is free to ignore the advice".
> That conflicts with "This call does not influence the semantics of the
> application (except in the case of MADV_DONTNEED)", so
> is it okay to read it as "The kernel is free to ignore the advice,
> except for MADV_DONTNEED"?

I decided to just drop the sentence

     The kernel is free to ignore the advice.

It creates misunderstandings, and does not really add information.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-05 15:41                           ` Michal Hocko
@ 2015-02-06 15:57                             ` Michael Kerrisk (man-pages)
       [not found]                               ` <54D4E47E.4020509-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-02-06 15:57 UTC (permalink / raw)
  To: Michal Hocko
  Cc: mtk.manpages, Vlastimil Babka, Kirill A. Shutemov, Dave Hansen,
	Mel Gorman, linux-mm@kvack.org, Minchan Kim, Andrew Morton, lkml,
	Linux API, linux-man, Hugh Dickins

Hi Michal

On 02/05/2015 04:41 PM, Michal Hocko wrote:
> On Wed 04-02-15 20:24:27, Michael Kerrisk wrote:
> [...]
>> So, how about this text:
>>
>>               After a successful MADV_DONTNEED operation, the seman‐
>>               tics  of  memory  access  in  the specified region are
>>               changed: subsequent accesses of  pages  in  the  range
>>               will  succeed,  but will result in either reloading of
>>               the memory contents from the  underlying  mapped  file
> 
> "
> result in either providing the up-to-date contents of the underlying
> mapped file
> "

Thanks! I did something like that. See below.

> Would be more precise IMO because reload might be interpreted as a major
> fault which is not necessarily the case (see below).
> 
>>               (for  shared file mappings, shared anonymous mappings,
>>               and shmem-based techniques such  as  System  V  shared
>>               memory  segments)  or  zero-fill-on-demand  pages  for
>>               anonymous private mappings.
> 
> Yes, this wording is better because many users are not aware of
> MAP_ANON|MAP_SHARED being file backed in fact and mmap man page doesn't
> mention that.

(Michal, would you have a text to propose to add to the mmap(2) page?
Maybe it would be useful to add something there.)

> 
> I am just wondering whether it makes sense to mention that MADV_DONTNEED
> on shared mappings might be surprising: it does not free the backing
> pages, so no memory is really freed until there is memory
> pressure. But maybe this is too implementation specific for a man
> page. What about the following wording on top of yours?
> "
> Please note that the MADV_DONTNEED hint on shared mappings might not
> lead to immediate freeing of pages in the range. The kernel is free to
> delay this until an appropriate moment. RSS of the calling process will
> be reduced however.
> "

Thanks! I added this, but dropped in the word "immediately" in the last 
sentence, since I assume that was implied. So now we have:

              After  a  successful MADV_DONTNEED operation, the seman‐
              tics of  memory  access  in  the  specified  region  are
              changed:  subsequent accesses of pages in the range will
              succeed, but will result in either repopulating the mem‐
              ory  contents from the up-to-date contents of the under‐
              lying mapped file  (for  shared  file  mappings,  shared
              anonymous  mappings,  and shmem-based techniques such as
              System V shared memory segments) or  zero-fill-on-demand
              pages for anonymous private mappings.

              Note  that,  when applied to shared mappings, MADV_DONT‐
              NEED might not lead to immediate freeing of the pages in
              the  range.   The  kernel  is  free to delay freeing the
              pages until an appropriate  moment.   The  resident  set
              size  (RSS)  of  the calling process will be immediately
              reduced however.
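
The note about shared mappings can be observed roughly as follows. This
is only a sketch under stated assumptions: Linux with /proc mounted,
Python 3.8 or later, and RSS taken from the second field of
/proc/self/statm (in pages), which the kernel may report with a small
lag. The helper names rss_pages and shared_dontneed_rss, and the 16 MiB
size, are chosen purely for illustration:

```python
import mmap

def rss_pages():
    """Resident set size of this process, in pages (field 2 of statm)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1])

def shared_dontneed_rss(size=16 * 1024 * 1024):
    """Return (pages dropped from RSS, whether the data survived)."""
    offsets = range(0, size, mmap.PAGESIZE)
    flags = mmap.MAP_SHARED | mmap.MAP_ANONYMOUS
    with mmap.mmap(-1, size, flags=flags) as m:
        # Touch every page so each one is faulted in and counted in RSS.
        for off in offsets:
            m[off] = 1
        touched = rss_pages()
        m.madvise(mmap.MADV_DONTNEED)
        dropped = rss_pages()
        # Reading the pages back maps them again; the backing shmem
        # pages were never freed, so the old contents are still there.
        intact = all(m[off] == 1 for off in offsets)
    return touched - dropped, intact

freed, intact = shared_dontneed_rss()
print(freed, intact)
```

On a typical kernel the first value should be close to the number of
pages in the mapping while the second stays True: the RSS drops at once
even though the pages themselves remain until the kernel reclaims them.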

The current draft of the page can be found in a branch,
http://git.kernel.org/cgit/docs/man-pages/man-pages.git/log/?h=draft_madvise

Thanks,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]                               ` <54D4E47E.4020509-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-02-06 20:45                                 ` Michal Hocko
  2015-02-09  6:50                                 ` Minchan Kim
  1 sibling, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2015-02-06 20:45 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman,
	linux-mm@kvack.org, Minchan Kim,
	Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins

On Fri 06-02-15 16:57:50, Michael Kerrisk wrote:
[...]
> > Yes, this wording is better because many users are not aware of
> > MAP_ANON|MAP_SHARED being file backed in fact and mmap man page doesn't
> > mention that.
> 
> (Michal, would you have a text to propose to add to the mmap(2) page?
> Maybe it would be useful to add something there.)

I am halfway into my vacation, but I can cook up a patch once I am back
in a week.
 
> > I am just wondering whether it makes sense to mention that MADV_DONTNEED
> > for shared mappings might be surprising and not freeing the backing
> > pages thus not really freeing memory until there is a memory
> > pressure. But maybe this is too implementation specific for a man
> > page. What about the following wording on top of yours?
> > "
> > Please note that the MADV_DONTNEED hint on shared mappings might not
> > lead to immediate freeing of pages in the range. The kernel is free to
> > delay this until an appropriate moment. RSS of the calling process will
> > be reduced however.
> > "
> 
> Thanks! I added this, but dropped in the word "immediately" in the last 
> sentence, since I assume that was implied. So now we have:
> 
>               After  a  successful MADV_DONTNEED operation, the seman‐
>               tics of  memory  access  in  the  specified  region  are
>               changed:  subsequent accesses of pages in the range will
>               succeed, but will result in either repopulating the mem‐
>               ory  contents from the up-to-date contents of the under‐
>               lying mapped file  (for  shared  file  mappings,  shared
>               anonymous  mappings,  and shmem-based techniques such as
>               System V shared memory segments) or  zero-fill-on-demand
>               pages for anonymous private mappings.
> 
>               Note  that,  when applied to shared mappings, MADV_DONT‐
>               NEED might not lead to immediate freeing of the pages in
>               the  range.   The  kernel  is  free to delay freeing the
>               pages until an appropriate  moment.   The  resident  set
>               size  (RSS)  of  the calling process will be immediately
>               reduced however.

This sounds good to me and it is definitely much better than the current
state. Thanks!
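
The file-backed part of this wording can be tried from user space as well. A minimal sketch, assuming Linux and Python 3.8 or later; the helper name file_mapping_after_dontneed is hypothetical. The MAP_PRIVATE file case is not spelled out in the draft text above, so the behaviour noted for it is an assumption about current Linux rather than something the draft states:

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE


def file_mapping_after_dontneed(flags):
    """Modify a file through a mapping, apply MADV_DONTNEED, read it back."""
    with tempfile.TemporaryFile() as f:
        f.write(b"F" * PAGE)          # file contents: all "F"
        f.flush()                     # make sure the file really is PAGE bytes long
        with mmap.mmap(f.fileno(), PAGE, flags=flags) as m:
            m[:4] = b"MMMM"           # modify through the mapping
            m.madvise(mmap.MADV_DONTNEED)
            return bytes(m[:4])       # faults the page back in


if __name__ == "__main__":
    # Shared: the store went to the file's page cache, so it is still visible.
    print("shared: ", file_mapping_after_dontneed(mmap.MAP_SHARED))
    # Private: the copy-on-write page is dropped, so the file contents return.
    print("private:", file_mapping_after_dontneed(mmap.MAP_PRIVATE))
```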

> The current draft of the page can be found in a branch,
> http://git.kernel.org/cgit/docs/man-pages/man-pages.git/log/?h=draft_madvise
> 
> Thanks,
> 
> Michael
> 
> 
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

-- 
Michal Hocko
SUSE Labs

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]                               ` <54D4E098.8050004-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-02-09  6:46                                 ` Minchan Kim
  2015-02-09  9:13                                   ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 20+ messages in thread
From: Minchan Kim @ 2015-02-09  6:46 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman,
	linux-mm@kvack.org, Andrew Morton,
	lkml, Linux API, linux-man, Hugh Dickins

Hello, Michael

On Fri, Feb 06, 2015 at 04:41:12PM +0100, Michael Kerrisk (man-pages) wrote:
> On 02/05/2015 02:07 AM, Minchan Kim wrote:
> > Hello,
> > 
> > On Wed, Feb 04, 2015 at 08:24:27PM +0100, Michael Kerrisk (man-pages) wrote:
> >> On 4 February 2015 at 18:02, Vlastimil Babka <vbabka@suse.cz> wrote:
> >>> On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote:
> >>>>
> >>>> Hello Vlastimil,
> >>>>
> >>>> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@suse.cz> wrote:
> >>>>>>>
> >>>>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages"
> >>>>>>> case
> >>>>>>> though. I dont see any check for other kinds of shared pages in the
> >>>>>>> code.
> >>>>>>
> >>>>>>
> >>>>>> Agreed. "shared" here seems confused. I've removed it. And I've
> >>>>>> added mention of "Huge TLB pages" for this error.
> >>>>>
> >>>>>
> >>>>> Thanks.
> >>>>
> >>>>
> >>>> I also added those cases for MADV_REMOVE, BTW.
> >>>
> >>>
> >>> Right. There's also the following for MADV_REMOVE that needs updating:
> >>>
> >>> "Currently, only shmfs/tmpfs supports this; other filesystems return with
> >>> the error ENOSYS."
> >>>
> >>> - it's not just shmem/tmpfs anymore. It should be best to refer to
> >>> fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to
> >>> date.
> >>>
> >>> - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is
> >>> listed in the ERRORS section.
> >>
> >> Yup, I recently added that as well, based on a patch from Jan Chaloupka.
> >>
> >>>>>>>>> - The word "will result" did sound as a guarantee at least to me. So
> >>>>>>>>> here it
> >>>>>>>>> could be changed to "may result (unless the advice is ignored)"?
> >>>>>>>>
> >>>>>>>> It's too late to fix the documentation. Applications already depend
> >>>>>>>> on the behaviour.
> >>>>>>>
> >>>>>>> Right, so as long as they check for EINVAL, it should be safe. It
> >>>>>>> appears
> >>>>>>> that
> >>>>>>> jemalloc does.
> >>>>>>
> >>>>>> So, first a brief question: in the cases where the call does not error
> >>>>>> out,
> >>>>>> are we agreed that in the current implementation, MADV_DONTNEED will
> >>>>>> always result in zero-filled pages when the region is faulted back in
> >>>>>> (when we consider pages that are not backed by a file)?
> >>>>>
> >>>>> I'd agree at this point.
> >>>>
> >>>> Thanks for the confirmation.
> >>>>
> >>>>> Also we should probably mention anonymously shared pages (shmem). I think
> >>>>> they behave the same as file here.
> >>>>
> >>>> You mean tmpfs here, right? (I don't keep all of the synonyms straight.)
> >>>
> >>> shmem is tmpfs (that by itself would fit under "files" just fine), but also
> >>> sys V segments created by shmget(2) and also mappings created by mmap with
> >>> MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to
> >>> refer to the full list.
> >>
> >> So, how about this text:
> >>
> >>               After a successful MADV_DONTNEED operation, the seman‐
> >>               tics  of  memory  access  in  the specified region are
> >>               changed: subsequent accesses of  pages  in  the  range
> >>               will  succeed,  but will result in either reloading of
> >>               the memory contents from the  underlying  mapped  file
> >>               (for  shared file mappings, shared anonymous mappings,
> >>               and shmem-based techniques such  as  System  V  shared
> >>               memory  segments)  or  zero-fill-on-demand  pages  for
> >>               anonymous private mappings.
> > 
> > Hmm, I'd like to clarify.
> > 
> > Whether it was intended or not, some userspace developers assumed that
> > the syscall drops pages instantly on a no-error return, so that they
> > will see more free pages (i.e., the RSS of the process will decrease)
> > while keeping the VMA. Can we rely on it?
> 
> I do not know. Michael?

It's important to identify the difference between MADV_DONTNEED and
MADV_FREE, so it would be better to clear this up while we have the chance.

> 
> > And we should update the ERRORS section, too.
> > "locked" covers mlock(2) and you said you will add hugetlb. What about
> > VM_PFNMAP? In that case, it fails too. How should we describe VM_PFNMAP?
> > A special mapping for some drivers?
> 
> I'm open for offers on what to add.

I suggest quoting LWN, http://lwn.net/Articles/162860/:
"*special mapping* which is not made up of "normal" pages.
It is usually created by device drivers which map special memory areas
into user space"

>  
> > One more thing: "The kernel is free to ignore the advice".
> > It conflicts with "This call does not influence the semantics of the
> > application (except in the case of MADV_DONTNEED)", so
> > is it okay to read it as "The kernel is free to ignore the advice
> > except for MADV_DONTNEED"?
> 
> I decided to just drop the sentence
> 
>      The kernel is free to ignore the advice.
> 
> It creates misunderstandings, and does not really add information.

Sounds good.

> 
> Cheers,
> 
> Michael
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

-- 
Kind regards,
Minchan Kim

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
       [not found]                               ` <54D4E47E.4020509-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2015-02-06 20:45                                 ` Michal Hocko
@ 2015-02-09  6:50                                 ` Minchan Kim
  1 sibling, 0 replies; 20+ messages in thread
From: Minchan Kim @ 2015-02-09  6:50 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Michal Hocko, Vlastimil Babka, Kirill A. Shutemov, Dave Hansen,
	Mel Gorman, linux-mm@kvack.org,
	Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins

On Fri, Feb 06, 2015 at 04:57:50PM +0100, Michael Kerrisk (man-pages) wrote:
> Hi Michael
> 
> On 02/05/2015 04:41 PM, Michal Hocko wrote:
> > On Wed 04-02-15 20:24:27, Michael Kerrisk wrote:
> > [...]
> >> So, how about this text:
> >>
> >>               After a successful MADV_DONTNEED operation, the seman‐
> >>               tics  of  memory  access  in  the specified region are
> >>               changed: subsequent accesses of  pages  in  the  range
> >>               will  succeed,  but will result in either reloading of
> >>               the memory contents from the  underlying  mapped  file
> > 
> > "
> > result in either providing the up-to-date contents of the underlying
> > mapped file
> > "
> 
> Thanks! I did something like that. See below.
> 
> > Would be more precise IMO because reload might be interpreted as a major
> > fault which is not necessarily the case (see below).
> > 
> >>               (for  shared file mappings, shared anonymous mappings,
> >>               and shmem-based techniques such  as  System  V  shared
> >>               memory  segments)  or  zero-fill-on-demand  pages  for
> >>               anonymous private mappings.
> > 
> > Yes, this wording is better because many users are not aware of
> > MAP_ANON|MAP_SHARED being file backed in fact and mmap man page doesn't
> > mention that.
> 
> (Michal, would you have a text to propose to add to the mmap(2) page?
> Maybe it would be useful to add something there.)
> 
> > 
> > I am just wondering whether it makes sense to mention that MADV_DONTNEED
> > for shared mappings might be surprising and not freeing the backing
> > pages thus not really freeing memory until there is a memory
> > pressure. But maybe this is too implementation specific for a man
> > page. What about the following wording on top of yours?
> > "
> > Please note that the MADV_DONTNEED hint on shared mappings might not
> > lead to immediate freeing of pages in the range. The kernel is free to
> > delay this until an appropriate moment. RSS of the calling process will
> > be reduced however.
> > "
> 
> Thanks! I added this, but dropped in the word "immediately" in the last 
> sentence, since I assume that was implied. So now we have:
> 
>               After  a  successful MADV_DONTNEED operation, the seman‐
>               tics of  memory  access  in  the  specified  region  are
>               changed:  subsequent accesses of pages in the range will
>               succeed, but will result in either repopulating the mem‐
>               ory  contents from the up-to-date contents of the under‐
>               lying mapped file  (for  shared  file  mappings,  shared
>               anonymous  mappings,  and shmem-based techniques such as
>               System V shared memory segments) or  zero-fill-on-demand
>               pages for anonymous private mappings.
> 
>               Note  that,  when applied to shared mappings, MADV_DONT‐
>               NEED might not lead to immediate freeing of the pages in
>               the  range.   The  kernel  is  free to delay freeing the
>               pages until an appropriate  moment.   The  resident  set
>               size  (RSS)  of  the calling process will be immediately
>               reduced however.

Looks good. So, I read it as saying that for anonymous private mappings
the pages in the range are freed immediately, so it's clearly different
from MADV_FREE.
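
The RSS effect for anonymous private mappings can be observed through /proc/self/statm. A rough sketch, assuming Linux and Python 3.8 or later; the helper names and the 64 MiB size are arbitrary choices for illustration, and the exact numbers will vary from system to system:

```python
import mmap

PAGE = mmap.PAGESIZE
SIZE = 64 * 1024 * 1024  # 64 MiB, an arbitrary size for the demonstration


def rss_bytes():
    """Resident set size of this process (second field of /proc/self/statm)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1]) * PAGE


def rss_around_dontneed():
    """Return (RSS before, RSS after) a MADV_DONTNEED on a touched mapping."""
    with mmap.mmap(-1, SIZE, flags=mmap.MAP_PRIVATE) as m:
        for off in range(0, SIZE, PAGE):  # touch each page so it becomes resident
            m[off] = 1
        before = rss_bytes()
        m.madvise(mmap.MADV_DONTNEED)
        after = rss_bytes()
    return before, after


if __name__ == "__main__":
    before, after = rss_around_dontneed()
    print("RSS dropped by about", (before - after) // (1024 * 1024), "MiB")
```

The drop should show up right after the call returns, without waiting for memory pressure, which is the point of contrast with MADV_FREE.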

> 
> The current draft of the page can be found in a branch,
> http://git.kernel.org/cgit/docs/man-pages/man-pages.git/log/?h=draft_madvise
> 
> Thanks,
> 
> Michael
> 
> 
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-09  6:46                                 ` Minchan Kim
@ 2015-02-09  9:13                                   ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 20+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-02-09  9:13 UTC (permalink / raw)
  To: Minchan Kim
  Cc: mtk.manpages, Vlastimil Babka, Kirill A. Shutemov, Dave Hansen,
	Mel Gorman, linux-mm@kvack.org, Andrew Morton, lkml, Linux API,
	linux-man, Hugh Dickins

Hello Minchan

On 02/09/2015 07:46 AM, Minchan Kim wrote:
> Hello, Michael
> 
> On Fri, Feb 06, 2015 at 04:41:12PM +0100, Michael Kerrisk (man-pages) wrote:
>> On 02/05/2015 02:07 AM, Minchan Kim wrote:
>>> Hello,
>>>
>>> On Wed, Feb 04, 2015 at 08:24:27PM +0100, Michael Kerrisk (man-pages) wrote:
>>>> On 4 February 2015 at 18:02, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>>> On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote:

[...]

>>> And we should update the ERRORS section, too.
>>> "locked" covers mlock(2) and you said you will add hugetlb. What about
>>> VM_PFNMAP? In that case, it fails too. How should we describe VM_PFNMAP?
>>> A special mapping for some drivers?
>>
>> I'm open for offers on what to add.
> 
> I suggest quoting LWN, http://lwn.net/Articles/162860/:
> "*special mapping* which is not made up of "normal" pages.
> It is usually created by device drivers which map special memory areas
> into user space"

Thanks. I've added mention of VM_PFNMAP in the discussion of both 
MADV_DONTNEED and MADV_REMOVE, and noted that both of those
operations will give an error when applied to VM_PFNMAP pages.
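
The error case for locked pages can be reproduced from user space. A minimal sketch, assuming Linux and Python 3.8 or later; mlock(2) is reached through ctypes, and the helper name dontneed_errno_on_locked_page is hypothetical. The VM_PFNMAP case needs a driver-provided mapping and is not shown:

```python
import ctypes
import errno
import mmap

PAGE = mmap.PAGESIZE


def dontneed_errno_on_locked_page():
    """Return the errno from MADV_DONTNEED on an mlock()ed page, 0 if the call
    succeeded, or None if mlock() itself was not permitted."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int
    m = mmap.mmap(-1, PAGE, flags=mmap.MAP_PRIVATE)
    buf = ctypes.c_char.from_buffer(m)  # only used to learn the mapping address
    try:
        if libc.mlock(ctypes.addressof(buf), PAGE) != 0:
            return None                 # e.g. RLIMIT_MEMLOCK is too low
        try:
            m.madvise(mmap.MADV_DONTNEED)
        except OSError as e:
            return e.errno
        return 0
    finally:
        del buf                         # release the buffer export before closing
        m.close()                       # unmapping also drops the lock


if __name__ == "__main__":
    err = dontneed_errno_on_locked_page()
    print("mlock not permitted" if err is None else errno.errorcode.get(err, err))
```

A caller that depends on the zero-fill behaviour would need to treat EINVAL here as "the pages were not dropped".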

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


end of thread, other threads:[~2015-02-09  9:13 UTC | newest]

Thread overview: 20+ messages
     [not found] <20150202165525.GM2395@suse.de>
     [not found] ` <54CFF8AC.6010102@intel.com>
2015-02-03  8:19   ` MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints Vlastimil Babka
     [not found]     ` <54D08483.40209-AlSwsSmVLrQ@public.gmane.org>
2015-02-03 10:53       ` Kirill A. Shutemov
2015-02-03 11:42         ` Vlastimil Babka
2015-02-03 16:20           ` Michael Kerrisk (man-pages)
     [not found]             ` <54D0F56A.9050003-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-02-04 13:46               ` Vlastimil Babka
     [not found]                 ` <54D22298.3040504-AlSwsSmVLrQ@public.gmane.org>
2015-02-04 14:00                   ` Michael Kerrisk (man-pages)
2015-02-04 17:02                     ` Vlastimil Babka
     [not found]                       ` <54D2508A.9030804-AlSwsSmVLrQ@public.gmane.org>
2015-02-04 19:24                         ` Michael Kerrisk (man-pages)
2015-02-05  1:07                           ` Minchan Kim
2015-02-06 15:41                             ` Michael Kerrisk (man-pages)
     [not found]                               ` <54D4E098.8050004-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-02-09  6:46                                 ` Minchan Kim
2015-02-09  9:13                                   ` Michael Kerrisk (man-pages)
2015-02-05 15:41                           ` Michal Hocko
2015-02-06 15:57                             ` Michael Kerrisk (man-pages)
     [not found]                               ` <54D4E47E.4020509-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-02-06 20:45                                 ` Michal Hocko
2015-02-09  6:50                                 ` Minchan Kim
     [not found]           ` <54D0B43D.8000209-AlSwsSmVLrQ@public.gmane.org>
2015-02-04  0:09             ` Minchan Kim
2015-02-03 11:16     ` Mel Gorman
     [not found]       ` <20150203111600.GR2395-l3A5Bk7waGM@public.gmane.org>
2015-02-03 15:21         ` Michal Hocko
     [not found]           ` <20150203152121.GC8914-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-02-03 16:25             ` Michael Kerrisk (man-pages)
