From: Minchan Kim <minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Suren Baghdasaryan <surenb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
linux-mm <linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>,
linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
oleksandr-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
Tim Murray <timmurray-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Daniel Colascione
<dancol-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Sandeep Patil <sspatil-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Sonny Rao <sonnyrao-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Brian Geffon <bgeffon-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Michal Hocko <mhocko-IBi9RG/b67k@public.gmane.org>,
Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
John Dias <joaodias-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Joel Fernandes
<joel-QYYGw3jwrUn5owFQY34kdNi2O/JbrIOy@public.gmane.org>
Subject: Re: [PATCH v3 2/5] mm: introduce external memory hinting API
Date: Mon, 10 Feb 2020 13:27:27 -0800 [thread overview]
Message-ID: <20200210212727.GA48790@google.com> (raw)
In-Reply-To: <CAJuCfpFOBUYfxyQZZCvKjD0pyh_D-ZEpJP9kLD8xav+hVHZWYQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Hi Suren,
On Mon, Feb 10, 2020 at 09:50:20AM -0800, Suren Baghdasaryan wrote:
> On Mon, Jan 27, 2020 at 4:17 PM Minchan Kim <minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> >
> > There is usecase that System Management Software(SMS) want to give
> > a memory hint like MADV_[COLD|PAGEEOUT] to other processes and
> > in the case of Android, it is the ActivityManagerService.
> >
> > It's similar in spirit to madvise(MADV_WONTNEED), but the information
> > required to make the reclaim decision is not known to the app. Instead,
> > it is known to the centralized userspace daemon(ActivityManagerService),
> > and that daemon must be able to initiate reclaim on its own without
> > any app involvement.
> >
> > To solve the issue, this patch introduces a new syscall process_madvise(2).
> > It uses pidfd of an external process to give the hint.
> >
> > int process_madvise(int pidfd, void *addr, size_t length, int advise,
> > unsigned long flag);
> >
> > Since it could affect other process's address range, only privileged
> > process(CAP_SYS_PTRACE) or something else(e.g., being the same UID)
> > gives it the right to ptrace the process could use it successfully.
> > The flag argument is reserved for future use if we need to extend the
> > API.
> >
> > I think supporting all hints madvise has/will supported/support to
> > process_madvise is rather risky. Because we are not sure all hints make
> > sense from external process and implementation for the hint may rely on
> > the caller being in the current context so it could be error-prone.
> > Thus, I just limited hints as MADV_[COLD|PAGEOUT] in this patch.
> >
> > If someone want to add other hints, we could hear hear the usecase and
> > review it for each hint. It's safer for maintenance rather than
> > introducing a buggy syscall but hard to fix it later.
>
> I would definitely be interested in adding MADV_DONTNEED support for
> process_madvise() to allow quick memory reclaim after a kill. The
> scenario is that userspace daemon can kill a process and try to help
> reclaim its memory. Having process_madvise(MADV_DONTNEED) support
> helps in the following cases:
> 1. Process issuing process_madvise has a higher CPU bandwidth
> allowance than the victim process, therefore can reclaim victim's
> memory quicker.
> 2. In case the victim occupies large amounts of memory the process
> issuing process_madvise can spawn multiple (possibly high priority)
> threads each reclaiming portions of the victim's memory.
> Such an extension will add a destructive kind of madvise into the set
> supported by process_madvise and I want to make sure we can accomodate
> for that in the future. Do you see any issues with supporting
> MADV_DONTNEED in the future?
Or kernel could do by themselves to spawn mulitple tasks if the system has
available badwidth and target process has a lot memory to be reclaimed
Anyway, it doesn't have any issue because we already have some synchrnoization
methods(e.g., signal or cgroup freezer) to freeze target processes before
giving a hint. It's not different with usual write syscall on shared file
among processes.
< snip >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 0c901de531e4..00ffa7e92f79 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -17,6 +17,7 @@
> > #include <linux/falloc.h>
> > #include <linux/fadvise.h>
> > #include <linux/sched.h>
> > +#include <linux/sched/mm.h>
> > #include <linux/ksm.h>
> > #include <linux/fs.h>
> > #include <linux/file.h>
> > @@ -315,6 +316,8 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
> >
> > if (fatal_signal_pending(task))
> > return -EINTR;
> > + else if (current != task && fatal_signal_pending(current))
> > + return -EINTR;
>
> I think this can be simplified as:
>
> + if (fatal_signal_pending(current))
> + return -EINTR;
>
> current != task condition is not needed because if current == task
> then you would return earlier after checking
> fatal_signal_pending(task).
True, I will remove it.
Thanks!
WARNING: multiple messages have this Message-ID (diff)
From: Minchan Kim <minchan@kernel.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
linux-api@vger.kernel.org, oleksandr@redhat.com,
Tim Murray <timmurray@google.com>,
Daniel Colascione <dancol@google.com>,
Sandeep Patil <sspatil@google.com>,
Sonny Rao <sonnyrao@google.com>,
Brian Geffon <bgeffon@google.com>, Michal Hocko <mhocko@suse.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Shakeel Butt <shakeelb@google.com>,
John Dias <joaodias@google.com>,
Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [PATCH v3 2/5] mm: introduce external memory hinting API
Date: Mon, 10 Feb 2020 13:27:27 -0800 [thread overview]
Message-ID: <20200210212727.GA48790@google.com> (raw)
In-Reply-To: <CAJuCfpFOBUYfxyQZZCvKjD0pyh_D-ZEpJP9kLD8xav+hVHZWYQ@mail.gmail.com>
Hi Suren,
On Mon, Feb 10, 2020 at 09:50:20AM -0800, Suren Baghdasaryan wrote:
> On Mon, Jan 27, 2020 at 4:17 PM Minchan Kim <minchan@kernel.org> wrote:
> >
> > There is usecase that System Management Software(SMS) want to give
> > a memory hint like MADV_[COLD|PAGEEOUT] to other processes and
> > in the case of Android, it is the ActivityManagerService.
> >
> > It's similar in spirit to madvise(MADV_WONTNEED), but the information
> > required to make the reclaim decision is not known to the app. Instead,
> > it is known to the centralized userspace daemon(ActivityManagerService),
> > and that daemon must be able to initiate reclaim on its own without
> > any app involvement.
> >
> > To solve the issue, this patch introduces a new syscall process_madvise(2).
> > It uses pidfd of an external process to give the hint.
> >
> > int process_madvise(int pidfd, void *addr, size_t length, int advise,
> > unsigned long flag);
> >
> > Since it could affect other process's address range, only privileged
> > process(CAP_SYS_PTRACE) or something else(e.g., being the same UID)
> > gives it the right to ptrace the process could use it successfully.
> > The flag argument is reserved for future use if we need to extend the
> > API.
> >
> > I think supporting all hints madvise has/will supported/support to
> > process_madvise is rather risky. Because we are not sure all hints make
> > sense from external process and implementation for the hint may rely on
> > the caller being in the current context so it could be error-prone.
> > Thus, I just limited hints as MADV_[COLD|PAGEOUT] in this patch.
> >
> > If someone want to add other hints, we could hear hear the usecase and
> > review it for each hint. It's safer for maintenance rather than
> > introducing a buggy syscall but hard to fix it later.
>
> I would definitely be interested in adding MADV_DONTNEED support for
> process_madvise() to allow quick memory reclaim after a kill. The
> scenario is that userspace daemon can kill a process and try to help
> reclaim its memory. Having process_madvise(MADV_DONTNEED) support
> helps in the following cases:
> 1. Process issuing process_madvise has a higher CPU bandwidth
> allowance than the victim process, therefore can reclaim victim's
> memory quicker.
> 2. In case the victim occupies large amounts of memory the process
> issuing process_madvise can spawn multiple (possibly high priority)
> threads each reclaiming portions of the victim's memory.
> Such an extension will add a destructive kind of madvise into the set
> supported by process_madvise and I want to make sure we can accomodate
> for that in the future. Do you see any issues with supporting
> MADV_DONTNEED in the future?
Or kernel could do by themselves to spawn mulitple tasks if the system has
available badwidth and target process has a lot memory to be reclaimed
Anyway, it doesn't have any issue because we already have some synchrnoization
methods(e.g., signal or cgroup freezer) to freeze target processes before
giving a hint. It's not different with usual write syscall on shared file
among processes.
< snip >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 0c901de531e4..00ffa7e92f79 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -17,6 +17,7 @@
> > #include <linux/falloc.h>
> > #include <linux/fadvise.h>
> > #include <linux/sched.h>
> > +#include <linux/sched/mm.h>
> > #include <linux/ksm.h>
> > #include <linux/fs.h>
> > #include <linux/file.h>
> > @@ -315,6 +316,8 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
> >
> > if (fatal_signal_pending(task))
> > return -EINTR;
> > + else if (current != task && fatal_signal_pending(current))
> > + return -EINTR;
>
> I think this can be simplified as:
>
> + if (fatal_signal_pending(current))
> + return -EINTR;
>
> current != task condition is not needed because if current == task
> then you would return earlier after checking
> fatal_signal_pending(task).
True, I will remove it.
Thanks!
next prev parent reply other threads:[~2020-02-10 21:27 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-28 0:16 [PATCH v3 0/5] introduce memory hinting API for external process Minchan Kim
2020-01-28 0:16 ` Minchan Kim
2020-01-28 0:16 ` [PATCH v3 2/5] mm: introduce external memory hinting API Minchan Kim
[not found] ` <20200128001641.5086-3-minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2020-02-10 17:50 ` Suren Baghdasaryan
2020-02-10 17:50 ` Suren Baghdasaryan
[not found] ` <CAJuCfpFOBUYfxyQZZCvKjD0pyh_D-ZEpJP9kLD8xav+hVHZWYQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2020-02-10 21:27 ` Minchan Kim [this message]
2020-02-10 21:27 ` Minchan Kim
2020-01-28 0:16 ` [PATCH v3 3/5] mm/madvise: employ mmget_still_valid for write lock Minchan Kim
[not found] ` <20200128001641.5086-1-minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2020-01-28 0:16 ` [PATCH v3 1/5] mm: factor out madvise's core functionality Minchan Kim
2020-01-28 0:16 ` Minchan Kim
[not found] ` <20200128001641.5086-2-minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2020-02-10 23:00 ` Alexander Duyck
2020-02-10 23:00 ` Alexander Duyck
[not found] ` <CAKgT0UcsB_isBHGH-z5L9kMWma5dy0qc-OZTDLyFhYYs68iFog-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2020-02-11 21:08 ` Minchan Kim
2020-02-11 21:08 ` Minchan Kim
2020-01-28 0:16 ` [PATCH v3 4/5] mm/madvise: allow KSM hints for remote API Minchan Kim
2020-01-28 0:16 ` Minchan Kim
2020-02-10 21:29 ` [PATCH v3 0/5] introduce memory hinting API for external process Minchan Kim
2020-02-10 21:29 ` Minchan Kim
2020-01-28 0:16 ` [PATCH v3 5/5] mm: support both pid and pidfd for process_madvise Minchan Kim
[not found] ` <20200128001641.5086-6-minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2020-02-10 23:12 ` Alexander Duyck
2020-02-10 23:12 ` Alexander Duyck
2020-02-11 21:11 ` Minchan Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200210212727.GA48790@google.com \
--to=minchan-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
--cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
--cc=bgeffon-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=dancol-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
--cc=joaodias-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=joel-QYYGw3jwrUn5owFQY34kdNi2O/JbrIOy@public.gmane.org \
--cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
--cc=mhocko-IBi9RG/b67k@public.gmane.org \
--cc=oleksandr-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=sonnyrao-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=sspatil-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=surenb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=timmurray-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.