linux-mm.kvack.org archive mirror
* [PATCH v2] mm: mlock: remove lru_add_drain_all()
@ 2017-10-19 22:25 Shakeel Butt
  2017-10-20  6:19 ` Michal Hocko
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Shakeel Butt @ 2017-10-19 22:25 UTC (permalink / raw)
  To: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Michal Hocko,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins
  Cc: Balbir Singh, Anshuman Khandual, linux-mm, linux-kernel,
	Shakeel Butt

lru_add_drain_all() is not required by mlock() and it will drain
everything that has been cached at the time mlock is called. And
that is not really related to the memory which will be faulted in
(and cached) and mlocked by the syscall itself.

Without lru_add_drain_all() the mlocked pages can remain on pagevecs
and be moved to evictable LRUs. However they will eventually be moved
back to unevictable LRU by reclaim. So, we can safely remove
lru_add_drain_all() from mlock syscall. Also there is no need for
local lru_add_drain() as it will be called deep inside __mm_populate()
(in follow_page_pte()).

On larger machines the overhead of lru_add_drain_all() in mlock() can
be significant when mlocking data already in memory. We have observed
high latency in mlock() due to lru_add_drain_all() when the users
were mlocking in memory tmpfs files.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
Changelog since v1:
- updated commit message

 mm/mlock.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index dfc6f1912176..3ceb2935d1e0 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -669,8 +669,6 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
 	if (!can_do_mlock())
 		return -EPERM;
 
-	lru_add_drain_all();	/* flush pagevec */
-
 	len = PAGE_ALIGN(len + (offset_in_page(start)));
 	start &= PAGE_MASK;
 
@@ -797,9 +795,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
 	if (!can_do_mlock())
 		return -EPERM;
 
-	if (flags & MCL_CURRENT)
-		lru_add_drain_all();	/* flush pagevec */
-
 	lock_limit = rlimit(RLIMIT_MEMLOCK);
 	lock_limit >>= PAGE_SHIFT;
 
-- 
2.15.0.rc0.271.g36b669edcc-goog


* Re: [PATCH v2] mm: mlock: remove lru_add_drain_all()
  2017-10-19 22:25 [PATCH v2] mm: mlock: remove lru_add_drain_all() Shakeel Butt
@ 2017-10-20  6:19 ` Michal Hocko
  2017-10-20 15:07   ` Shakeel Butt
  2017-10-20 21:51 ` Balbir Singh
  2017-10-30  9:12 ` Vlastimil Babka
  2 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2017-10-20  6:19 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Joonsoo Kim,
	Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen, Hugh Dickins,
	Balbir Singh, Anshuman Khandual, linux-mm, linux-kernel

On Thu 19-10-17 15:25:07, Shakeel Butt wrote:
> lru_add_drain_all() is not required by mlock() and it will drain
> everything that has been cached at the time mlock is called. And
> that is not really related to the memory which will be faulted in
> (and cached) and mlocked by the syscall itself.
> 
> Without lru_add_drain_all() the mlocked pages can remain on pagevecs
> and be moved to evictable LRUs. However they will eventually be moved
> back to unevictable LRU by reclaim. So, we can safely remove
> lru_add_drain_all() from mlock syscall. Also there is no need for
> local lru_add_drain() as it will be called deep inside __mm_populate()
> (in follow_page_pte()).

This paragraph can still be a bit confusing. I suspect you meant to say
something like: "If anything, lru_add_drain_all() should be called _after_
the pages have been mlocked and faulted in, but even that is not strictly
needed because those pages would get to the appropriate LRUs lazily
during the reclaim path. Moreover, follow_page_pte (gup) will drain the
local pcp LRU cache."
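
To illustrate, here is a condensed sketch of the FOLL_MLOCK handling in
follow_page_pte() (mm/gup.c) that I am referring to -- a sketch only,
not verbatim kernel source, and the wrapper function name is made up:

/* Sketch: the FOLL_MLOCK branch of follow_page_pte(), condensed. */
static void foll_mlock_sketch(struct page *page, unsigned int flags)
{
	if (!(flags & FOLL_MLOCK))
		return;

	/* pte-mapped THP are not mlocked here */
	if (PageTransCompound(page))
		return;

	if (page->mapping && trylock_page(page)) {
		lru_add_drain();	/* drains the local pagevec only */
		mlock_vma_page(page);	/* moves the page to the unevictable LRU */
		unlock_page(page);
	}
}

So the populate path itself already makes sure the page being mlocked
does not linger on the local pcp cache.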

> On larger machines the overhead of lru_add_drain_all() in mlock() can
> be significant when mlocking data already in memory. We have observed
> high latency in mlock() due to lru_add_drain_all() when the users
> were mlocking in memory tmpfs files.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>

Anyway, this patch makes a lot of sense to me. Feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
> Changelog since v1:
> - updated commit message
> 
>  mm/mlock.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/mm/mlock.c b/mm/mlock.c
> index dfc6f1912176..3ceb2935d1e0 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -669,8 +669,6 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
>  	if (!can_do_mlock())
>  		return -EPERM;
>  
> -	lru_add_drain_all();	/* flush pagevec */
> -
>  	len = PAGE_ALIGN(len + (offset_in_page(start)));
>  	start &= PAGE_MASK;
>  
> @@ -797,9 +795,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>  	if (!can_do_mlock())
>  		return -EPERM;
>  
> -	if (flags & MCL_CURRENT)
> -		lru_add_drain_all();	/* flush pagevec */
> -
>  	lock_limit = rlimit(RLIMIT_MEMLOCK);
>  	lock_limit >>= PAGE_SHIFT;
>  
> -- 
> 2.15.0.rc0.271.g36b669edcc-goog
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v2] mm: mlock: remove lru_add_drain_all()
  2017-10-20  6:19 ` Michal Hocko
@ 2017-10-20 15:07   ` Shakeel Butt
  0 siblings, 0 replies; 6+ messages in thread
From: Shakeel Butt @ 2017-10-20 15:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Joonsoo Kim,
	Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen, Hugh Dickins,
	Balbir Singh, Anshuman Khandual, Linux MM, LKML

On Thu, Oct 19, 2017 at 11:19 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 19-10-17 15:25:07, Shakeel Butt wrote:
>> lru_add_drain_all() is not required by mlock() and it will drain
>> everything that has been cached at the time mlock is called. And
>> that is not really related to the memory which will be faulted in
>> (and cached) and mlocked by the syscall itself.
>>
>> Without lru_add_drain_all() the mlocked pages can remain on pagevecs
>> and be moved to evictable LRUs. However they will eventually be moved
>> back to unevictable LRU by reclaim. So, we can safely remove
>> lru_add_drain_all() from mlock syscall. Also there is no need for
>> local lru_add_drain() as it will be called deep inside __mm_populate()
>> (in follow_page_pte()).
>
> This paragraph can still be a bit confusing. I suspect you meant to say
> something like: "If anything, lru_add_drain_all() should be called _after_
> the pages have been mlocked and faulted in, but even that is not strictly
> needed because those pages would get to the appropriate LRUs lazily
> during the reclaim path. Moreover, follow_page_pte (gup) will drain the
> local pcp LRU cache."
>

Andrew, can you please replace the second paragraph of the commit with
Michal's suggested paragraph.

>> On larger machines the overhead of lru_add_drain_all() in mlock() can
>> be significant when mlocking data already in memory. We have observed
>> high latency in mlock() due to lru_add_drain_all() when the users
>> were mlocking in memory tmpfs files.
>>
>> Signed-off-by: Shakeel Butt <shakeelb@google.com>
>
> Anyway, this patch makes a lot of sense to me. Feel free to add
> Acked-by: Michal Hocko <mhocko@suse.com>
>

Thanks.


* Re: [PATCH v2] mm: mlock: remove lru_add_drain_all()
  2017-10-19 22:25 [PATCH v2] mm: mlock: remove lru_add_drain_all() Shakeel Butt
  2017-10-20  6:19 ` Michal Hocko
@ 2017-10-20 21:51 ` Balbir Singh
  2017-10-21  8:09   ` Michal Hocko
  2017-10-30  9:12 ` Vlastimil Babka
  2 siblings, 1 reply; 6+ messages in thread
From: Balbir Singh @ 2017-10-20 21:51 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Michal Hocko,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins, Anshuman Khandual, linux-mm,
	linux-kernel@vger.kernel.org

On Fri, Oct 20, 2017 at 9:25 AM, Shakeel Butt <shakeelb@google.com> wrote:
> lru_add_drain_all() is not required by mlock() and it will drain
> everything that has been cached at the time mlock is called. And
> that is not really related to the memory which will be faulted in
> (and cached) and mlocked by the syscall itself.
>
> Without lru_add_drain_all() the mlocked pages can remain on pagevecs
> and be moved to evictable LRUs. However they will eventually be moved
> back to unevictable LRU by reclaim. So, we can safely remove
> lru_add_drain_all() from mlock syscall. Also there is no need for
> local lru_add_drain() as it will be called deep inside __mm_populate()
> (in follow_page_pte()).
>
> On larger machines the overhead of lru_add_drain_all() in mlock() can
> be significant when mlocking data already in memory. We have observed
> high latency in mlock() due to lru_add_drain_all() when the users
> were mlocking in memory tmpfs files.
>
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---

I'm afraid I still don't fully understand the impact in terms of numbers and
statistics as seen from inside a cgroup. My understanding is that we'll slowly
see the unreclaimable stats go up as we drain the pvecs across CPUs.
I understand the optimization and I can see why lru_add_drain_all() is
expensive.

Acked-by: Balbir Singh <bsingharora@gmail.com>

Balbir Singh.


* Re: [PATCH v2] mm: mlock: remove lru_add_drain_all()
  2017-10-20 21:51 ` Balbir Singh
@ 2017-10-21  8:09   ` Michal Hocko
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Hocko @ 2017-10-21  8:09 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Shakeel Butt, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins, Anshuman Khandual, linux-mm,
	linux-kernel@vger.kernel.org

On Sat 21-10-17 08:51:04, Balbir Singh wrote:
> On Fri, Oct 20, 2017 at 9:25 AM, Shakeel Butt <shakeelb@google.com> wrote:
> > lru_add_drain_all() is not required by mlock() and it will drain
> > everything that has been cached at the time mlock is called. And
> > that is not really related to the memory which will be faulted in
> > (and cached) and mlocked by the syscall itself.
> >
> > Without lru_add_drain_all() the mlocked pages can remain on pagevecs
> > and be moved to evictable LRUs. However they will eventually be moved
> > back to unevictable LRU by reclaim. So, we can safely remove
> > lru_add_drain_all() from mlock syscall. Also there is no need for
> > local lru_add_drain() as it will be called deep inside __mm_populate()
> > (in follow_page_pte()).
> >
> > On larger machines the overhead of lru_add_drain_all() in mlock() can
> > be significant when mlocking data already in memory. We have observed
> > high latency in mlock() due to lru_add_drain_all() when the users
> > were mlocking in memory tmpfs files.
> >
> > Signed-off-by: Shakeel Butt <shakeelb@google.com>
> > ---
> 
> I'm afraid I still don't fully understand the impact in terms of numbers and
> statistics as seen from inside a cgroup.

I really fail to see why there would be anything cgroup specific here.

> My understanding is that we'll slowly
> see the unreclaimable stats go up as we drain the pvecs across CPUs.

Not really. Draining is a bit tricky. Anonymous page faults (gup) use
lru_cache_add_active_or_unevictable so we bypass the LRU cache for
mlocked pages altogether. Filemap faults go via the cache, and
__pagevec_lru_add_fn, which flushes a full cache, is not mlock aware.
But the gup (follow_page_pte) path tries to move existing and mapped
pages to the unevictable LRU list. So yes, we can see lazily mlocked
pages on an evictable LRU, but reclaim will get them to the unevictable
list when needed. This should be mostly reduced to file mappings. But I
haven't checked the code recently and mlock is quite tricky so I might
misremember.
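
For reference, a condensed sketch of lru_cache_add_active_or_unevictable()
(mm/swap.c) as I remember it from this era -- sketch only, not verbatim
source:

/* Sketch: anon faults into a VM_LOCKED vma are marked Mlocked up front. */
void lru_cache_add_active_or_unevictable(struct page *page,
					 struct vm_area_struct *vma)
{
	if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
		SetPageActive(page);	/* regular anon fault: active LRU */
	else if (!TestSetPageMlocked(page)) {
		/* account the page as mlocked right away */
		__mod_zone_page_state(page_zone(page), NR_MLOCK,
				      hpage_nr_pages(page));
		count_vm_event(UNEVICTABLE_PGMLOCKED);
	}
	lru_cache_add(page);	/* still goes through the per-cpu pagevec */
}

Once PageMlocked is set, reclaim will cull the page to the unevictable
list whenever it runs into it, no matter which LRU or pagevec it sits on.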

In any case, lru_add_drain_all is quite tangential to all this AFAICS.

> I understand the optimization and I can see why lru_add_drain_all() is
> expensive.

Not only is it expensive, it is paying the price for previous caching
which might not be directly related to the mlock syscall.
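
A condensed sketch of lru_add_drain_all() (mm/swap.c) shows where the
cost comes from -- sketch only, not verbatim; several of the checked
pagevecs and the serializing mutex are omitted:

/* Sketch: queue and wait for a drain work item on every busy CPU. */
void lru_add_drain_all(void)
{
	static struct cpumask has_work;
	int cpu;

	get_online_cpus();
	cpumask_clear(&has_work);

	for_each_online_cpu(cpu) {
		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);

		if (pagevec_count(&per_cpu(lru_add_pvec, cpu)) ||
		    need_activate_page_drain(cpu)) {
			INIT_WORK(work, lru_add_drain_per_cpu);
			queue_work_on(cpu, mm_percpu_wq, work);
			cpumask_set_cpu(cpu, &has_work);
		}
	}

	/* the mlock() caller blocks until every per-cpu drain has run */
	for_each_cpu(cpu, &has_work)
		flush_work(&per_cpu(lru_add_drain_work, cpu));

	put_online_cpus();
}

On a machine with many CPUs the flush_work() loop alone can add
noticeable latency to every mlock() call, which is what the changelog
is getting at.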
 
> Acked-by: Balbir Singh <bsingharora@gmail.com>
> 
> Balbir Singh.

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v2] mm: mlock: remove lru_add_drain_all()
  2017-10-19 22:25 [PATCH v2] mm: mlock: remove lru_add_drain_all() Shakeel Butt
  2017-10-20  6:19 ` Michal Hocko
  2017-10-20 21:51 ` Balbir Singh
@ 2017-10-30  9:12 ` Vlastimil Babka
  2 siblings, 0 replies; 6+ messages in thread
From: Vlastimil Babka @ 2017-10-30  9:12 UTC (permalink / raw)
  To: Shakeel Butt, Andrew Morton, Kirill A. Shutemov, Michal Hocko,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins
  Cc: Balbir Singh, Anshuman Khandual, linux-mm, linux-kernel

On 10/20/2017 12:25 AM, Shakeel Butt wrote:
> lru_add_drain_all() is not required by mlock() and it will drain
> everything that has been cached at the time mlock is called. And
> that is not really related to the memory which will be faulted in
> (and cached) and mlocked by the syscall itself.
> 
> Without lru_add_drain_all() the mlocked pages can remain on pagevecs
> and be moved to evictable LRUs. However they will eventually be moved
> back to unevictable LRU by reclaim. So, we can safely remove
> lru_add_drain_all() from mlock syscall. Also there is no need for
> local lru_add_drain() as it will be called deep inside __mm_populate()
> (in follow_page_pte()).
> 
> On larger machines the overhead of lru_add_drain_all() in mlock() can
> be significant when mlocking data already in memory. We have observed
> high latency in mlock() due to lru_add_drain_all() when the users
> were mlocking in memory tmpfs files.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
> Changelog since v1:
> - updated commit message
> 
>  mm/mlock.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/mm/mlock.c b/mm/mlock.c
> index dfc6f1912176..3ceb2935d1e0 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -669,8 +669,6 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
>  	if (!can_do_mlock())
>  		return -EPERM;
>  
> -	lru_add_drain_all();	/* flush pagevec */
> -
>  	len = PAGE_ALIGN(len + (offset_in_page(start)));
>  	start &= PAGE_MASK;
>  
> @@ -797,9 +795,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>  	if (!can_do_mlock())
>  		return -EPERM;
>  
> -	if (flags & MCL_CURRENT)
> -		lru_add_drain_all();	/* flush pagevec */
> -
>  	lock_limit = rlimit(RLIMIT_MEMLOCK);
>  	lock_limit >>= PAGE_SHIFT;
>  
> 


Thread overview: 6+ messages
2017-10-19 22:25 [PATCH v2] mm: mlock: remove lru_add_drain_all() Shakeel Butt
2017-10-20  6:19 ` Michal Hocko
2017-10-20 15:07   ` Shakeel Butt
2017-10-20 21:51 ` Balbir Singh
2017-10-21  8:09   ` Michal Hocko
2017-10-30  9:12 ` Vlastimil Babka
