* [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm() to oom_reap_task().
@ 2016-12-12 10:55 Tetsuo Handa
2016-12-12 11:59 ` Michal Hocko
0 siblings, 1 reply; 5+ messages in thread
From: Tetsuo Handa @ 2016-12-12 10:55 UTC (permalink / raw)
To: mhocko; +Cc: linux-mm, Tetsuo Handa
Since commit 862e3073b3eed13f
("mm, oom: get rid of signal_struct::oom_victims")
changed to wait until MMF_OOM_SKIP is set rather than wait while
TIF_MEMDIE is set, rationale comment for commit e2fe14564d3316d1
("oom_reaper: close race with exiting task") needs to be updated.
While holding oom_lock can make sure that other threads waiting for
oom_lock at __alloc_pages_may_oom() are given a chance to call
get_page_from_freelist() after the OOM reaper called unmap_page_range()
via __oom_reap_task_mm(), it can defer calling of __oom_reap_task_mm().
Therefore, this patch moves oom_lock from __oom_reap_task_mm() to
oom_reap_task() (without any functional change). By doing so, the OOM
killer can call __oom_reap_task_mm() if we don't want to defer calling
of __oom_reap_task_mm() (e.g. when oom_evaluate_task() aborted by
finding existing OOM victim's mm without MMF_OOM_SKIP).
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
mm/oom_kill.c | 39 +++++++++++++--------------------------
1 file changed, 13 insertions(+), 26 deletions(-)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ec9f11d..53b6e0c 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -467,28 +467,9 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
struct vm_area_struct *vma;
struct zap_details details = {.check_swap_entries = true,
.ignore_dirty = true};
- bool ret = true;
-
- /*
- * We have to make sure to not race with the victim exit path
- * and cause premature new oom victim selection:
- * __oom_reap_task_mm exit_mm
- * mmget_not_zero
- * mmput
- * atomic_dec_and_test
- * exit_oom_victim
- * [...]
- * out_of_memory
- * select_bad_process
- * # no TIF_MEMDIE task selects new victim
- * unmap_page_range # frees some memory
- */
- mutex_lock(&oom_lock);
- if (!down_read_trylock(&mm->mmap_sem)) {
- ret = false;
- goto unlock_oom;
- }
+ if (!down_read_trylock(&mm->mmap_sem))
+ return false;
/*
* increase mm_users only after we know we will reap something so
@@ -497,7 +478,7 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
*/
if (!mmget_not_zero(mm)) {
up_read(&mm->mmap_sem);
- goto unlock_oom;
+ return true;
}
/*
@@ -548,9 +529,7 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
* put the oom_reaper out of the way.
*/
mmput_async(mm);
-unlock_oom:
- mutex_unlock(&oom_lock);
- return ret;
+ return true;
}
#define MAX_OOM_REAP_RETRIES 10
@@ -560,8 +539,16 @@ static void oom_reap_task(struct task_struct *tsk)
struct mm_struct *mm = tsk->signal->oom_mm;
/* Retry the down_read_trylock(mmap_sem) a few times */
- while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm))
+ while (attempts++ < MAX_OOM_REAP_RETRIES) {
+ bool ret;
+
+ mutex_lock(&oom_lock);
+ ret = __oom_reap_task_mm(tsk, mm);
+ mutex_unlock(&oom_lock);
+ if (ret)
+ break;
schedule_timeout_idle(HZ/10);
+ }
if (attempts <= MAX_OOM_REAP_RETRIES)
goto done;
--
1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm() to oom_reap_task().
2016-12-12 10:55 [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm() to oom_reap_task() Tetsuo Handa
@ 2016-12-12 11:59 ` Michal Hocko
2016-12-22 9:35 ` Michal Hocko
0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2016-12-12 11:59 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: linux-mm
On Mon 12-12-16 19:55:52, Tetsuo Handa wrote:
> Since commit 862e3073b3eed13f
> ("mm, oom: get rid of signal_struct::oom_victims")
> changed to wait until MMF_OOM_SKIP is set rather than wait while
> TIF_MEMDIE is set, rationale comment for commit e2fe14564d3316d1
> ("oom_reaper: close race with exiting task") needs to be updated.
True.
> While holding oom_lock can make sure that other threads waiting for
> oom_lock at __alloc_pages_may_oom() are given a chance to call
> get_page_from_freelist() after the OOM reaper called unmap_page_range()
> via __oom_reap_task_mm(), it can defer calling of __oom_reap_task_mm().
>
> Therefore, this patch moves oom_lock from __oom_reap_task_mm() to
> oom_reap_task() (without any functional change). By doing so, the OOM
> killer can call __oom_reap_task_mm() if we don't want to defer calling
> of __oom_reap_task_mm() (e.g. when oom_evaluate_task() aborted by
> finding existing OOM victim's mm without MMF_OOM_SKIP).
But I fail to understand this part of the changelog. It sounds like a
preparatory for other changes. There doesn't seem to be any other user
of __oom_reap_task_mm in the current tree.
Please send a patch which removes the comment which is no longer true
on its own and feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>
but do not make other changes if you do not have any follow up patch
which would benefit from that.
Thanks!
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
> mm/oom_kill.c | 39 +++++++++++++--------------------------
> 1 file changed, 13 insertions(+), 26 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index ec9f11d..53b6e0c 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -467,28 +467,9 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
> struct vm_area_struct *vma;
> struct zap_details details = {.check_swap_entries = true,
> .ignore_dirty = true};
> - bool ret = true;
> -
> - /*
> - * We have to make sure to not race with the victim exit path
> - * and cause premature new oom victim selection:
> - * __oom_reap_task_mm exit_mm
> - * mmget_not_zero
> - * mmput
> - * atomic_dec_and_test
> - * exit_oom_victim
> - * [...]
> - * out_of_memory
> - * select_bad_process
> - * # no TIF_MEMDIE task selects new victim
> - * unmap_page_range # frees some memory
> - */
> - mutex_lock(&oom_lock);
>
> - if (!down_read_trylock(&mm->mmap_sem)) {
> - ret = false;
> - goto unlock_oom;
> - }
> + if (!down_read_trylock(&mm->mmap_sem))
> + return false;
>
> /*
> * increase mm_users only after we know we will reap something so
> @@ -497,7 +478,7 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
> */
> if (!mmget_not_zero(mm)) {
> up_read(&mm->mmap_sem);
> - goto unlock_oom;
> + return true;
> }
>
> /*
> @@ -548,9 +529,7 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
> * put the oom_reaper out of the way.
> */
> mmput_async(mm);
> -unlock_oom:
> - mutex_unlock(&oom_lock);
> - return ret;
> + return true;
> }
>
> #define MAX_OOM_REAP_RETRIES 10
> @@ -560,8 +539,16 @@ static void oom_reap_task(struct task_struct *tsk)
> struct mm_struct *mm = tsk->signal->oom_mm;
>
> /* Retry the down_read_trylock(mmap_sem) a few times */
> - while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm))
> + while (attempts++ < MAX_OOM_REAP_RETRIES) {
> + bool ret;
> +
> + mutex_lock(&oom_lock);
> + ret = __oom_reap_task_mm(tsk, mm);
> + mutex_unlock(&oom_lock);
> + if (ret)
> + break;
> schedule_timeout_idle(HZ/10);
> + }
>
> if (attempts <= MAX_OOM_REAP_RETRIES)
> goto done;
> --
> 1.8.3.1
>
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm() to oom_reap_task().
2016-12-12 11:59 ` Michal Hocko
@ 2016-12-22 9:35 ` Michal Hocko
2016-12-22 10:41 ` [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm()to oom_reap_task() Tetsuo Handa
0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2016-12-22 9:35 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: linux-mm
On Mon 12-12-16 12:59:18, Michal Hocko wrote:
> On Mon 12-12-16 19:55:52, Tetsuo Handa wrote:
> > Since commit 862e3073b3eed13f
> > ("mm, oom: get rid of signal_struct::oom_victims")
> > changed to wait until MMF_OOM_SKIP is set rather than wait while
> > TIF_MEMDIE is set, rationale comment for commit e2fe14564d3316d1
> > ("oom_reaper: close race with exiting task") needs to be updated.
>
> True.
>
> > While holding oom_lock can make sure that other threads waiting for
> > oom_lock at __alloc_pages_may_oom() are given a chance to call
> > get_page_from_freelist() after the OOM reaper called unmap_page_range()
> > via __oom_reap_task_mm(), it can defer calling of __oom_reap_task_mm().
> >
> > Therefore, this patch moves oom_lock from __oom_reap_task_mm() to
> > oom_reap_task() (without any functional change). By doing so, the OOM
> > killer can call __oom_reap_task_mm() if we don't want to defer calling
> > of __oom_reap_task_mm() (e.g. when oom_evaluate_task() aborted by
> > finding existing OOM victim's mm without MMF_OOM_SKIP).
>
> But I fail to understand this part of the changelog. It sounds like a
> preparatory for other changes. There doesn't seem to be any other user
> of __oom_reap_task_mm in the current tree.
>
> Please send a patch which removes the comment which is no longer true
> on its own and feel free to add
>
> Acked-by: Michal Hocko <mhocko@suse.com>
>
> but do not make other changes if you do not have any follow up patch
> which would benefit from that.
Do you plan to pursue this?
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm()to oom_reap_task().
2016-12-22 9:35 ` Michal Hocko
@ 2016-12-22 10:41 ` Tetsuo Handa
2016-12-22 11:50 ` Michal Hocko
0 siblings, 1 reply; 5+ messages in thread
From: Tetsuo Handa @ 2016-12-22 10:41 UTC (permalink / raw)
To: mhocko; +Cc: linux-mm
Michal Hocko wrote:
> On Mon 12-12-16 12:59:18, Michal Hocko wrote:
> > On Mon 12-12-16 19:55:52, Tetsuo Handa wrote:
> > > Since commit 862e3073b3eed13f
> > > ("mm, oom: get rid of signal_struct::oom_victims")
> > > changed to wait until MMF_OOM_SKIP is set rather than wait while
> > > TIF_MEMDIE is set, rationale comment for commit e2fe14564d3316d1
> > > ("oom_reaper: close race with exiting task") needs to be updated.
> >
> > True.
> >
> > > While holding oom_lock can make sure that other threads waiting for
> > > oom_lock at __alloc_pages_may_oom() are given a chance to call
> > > get_page_from_freelist() after the OOM reaper called unmap_page_range()
> > > via __oom_reap_task_mm(), it can defer calling of __oom_reap_task_mm().
> > >
> > > Therefore, this patch moves oom_lock from __oom_reap_task_mm() to
> > > oom_reap_task() (without any functional change). By doing so, the OOM
> > > killer can call __oom_reap_task_mm() if we don't want to defer calling
> > > of __oom_reap_task_mm() (e.g. when oom_evaluate_task() aborted by
> > > finding existing OOM victim's mm without MMF_OOM_SKIP).
> >
> > But I fail to understand this part of the changelog. It sounds like a
> > preparatory for other changes. There doesn't seem to be any other user
> > of __oom_reap_task_mm in the current tree.
I'm planning to call __oom_reap_task_mm() from out_of_memory() if OOM
situation is not solved immediately, after we made sure that we give
enough CPU time to OOM killer and OOM reaper to run reclaim code by
mutex_lock_killable(&oom_lock) change.
> >
> > Please send a patch which removes the comment which is no longer true
> > on its own and feel free to add
> >
> > Acked-by: Michal Hocko <mhocko@suse.com>
> >
> > but do not make other changes if you do not have any follow up patch
> > which would benefit from that.
>
> Do you plan to pursue this?
Although I don't know whether we agree with mutex_lock_killable(&oom_lock)
change, I think this patch alone can go as a cleanup.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm()to oom_reap_task().
2016-12-22 10:41 ` [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm()to oom_reap_task() Tetsuo Handa
@ 2016-12-22 11:50 ` Michal Hocko
0 siblings, 0 replies; 5+ messages in thread
From: Michal Hocko @ 2016-12-22 11:50 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: linux-mm
On Thu 22-12-16 19:41:41, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Mon 12-12-16 12:59:18, Michal Hocko wrote:
> > > On Mon 12-12-16 19:55:52, Tetsuo Handa wrote:
> > > > Since commit 862e3073b3eed13f
> > > > ("mm, oom: get rid of signal_struct::oom_victims")
> > > > changed to wait until MMF_OOM_SKIP is set rather than wait while
> > > > TIF_MEMDIE is set, rationale comment for commit e2fe14564d3316d1
> > > > ("oom_reaper: close race with exiting task") needs to be updated.
> > >
> > > True.
> > >
> > > > While holding oom_lock can make sure that other threads waiting for
> > > > oom_lock at __alloc_pages_may_oom() are given a chance to call
> > > > get_page_from_freelist() after the OOM reaper called unmap_page_range()
> > > > via __oom_reap_task_mm(), it can defer calling of __oom_reap_task_mm().
> > > >
> > > > Therefore, this patch moves oom_lock from __oom_reap_task_mm() to
> > > > oom_reap_task() (without any functional change). By doing so, the OOM
> > > > killer can call __oom_reap_task_mm() if we don't want to defer calling
> > > > of __oom_reap_task_mm() (e.g. when oom_evaluate_task() aborted by
> > > > finding existing OOM victim's mm without MMF_OOM_SKIP).
> > >
> > > But I fail to understand this part of the changelog. It sounds like a
> > > preparatory for other changes. There doesn't seem to be any other user
> > > of __oom_reap_task_mm in the current tree.
>
> I'm planning to call __oom_reap_task_mm() from out_of_memory() if OOM
> situation is not solved immediately, after we made sure that we give
> enough CPU time to OOM killer and OOM reaper to run reclaim code by
> mutex_lock_killable(&oom_lock) change.
Whatever you plan to do, my objection to the patch was that it mixes two
things together. The comment removal makes sense on its own and it
should be a separate patch which I've acked already.
> > > Please send a patch which removes the comment which is no longer true
> > > on its own and feel free to add
> > >
> > > Acked-by: Michal Hocko <mhocko@suse.com>
> > >
> > > but do not make other changes if you do not have any follow up patch
> > > which would benefit from that.
> >
> > Do you plan to pursue this?
>
> Although I don't know whether we agree with mutex_lock_killable(&oom_lock)
> change, I think this patch alone can go as a cleanup.
No, we don't agree on that part. As this is a printk issue I do not want
to workaround it in the oom related code. That is just ridiculous. The
very same issue would be possible due to other continous source of log
messages.
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-12-22 11:51 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-12-12 10:55 [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm() to oom_reap_task() Tetsuo Handa
2016-12-12 11:59 ` Michal Hocko
2016-12-22 9:35 ` Michal Hocko
2016-12-22 10:41 ` [PATCH] mm, oom_reaper: Move oom_lock from __oom_reap_task_mm()to oom_reap_task() Tetsuo Handa
2016-12-22 11:50 ` Michal Hocko
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).