From: Jerome Glisse <jglisse@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
John Hubbard <jhubbard@nvidia.com>,
David Nellans <dnellans@nvidia.com>,
Dan Williams <dan.j.williams@intel.com>,
Balbir Singh <bsingharora@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
cgroups@vger.kernel.org
Subject: Re: [PATCH 4/5] mm/memcontrol: allow to uncharge page without using page->lru field
Date: Mon, 10 Jul 2017 11:32:23 -0400
Message-ID: <20170710153222.GA4964@redhat.com>
In-Reply-To: <20170710082805.GD19185@dhcp22.suse.cz>

On Mon, Jul 10, 2017 at 10:28:06AM +0200, Michal Hocko wrote:
> On Wed 05-07-17 10:35:29, Jerome Glisse wrote:
> > On Tue, Jul 04, 2017 at 02:51:13PM +0200, Michal Hocko wrote:
> > > On Mon 03-07-17 17:14:14, Jerome Glisse wrote:
> > > > HMM pages (private or public device pages) are ZONE_DEVICE page and
> > > > thus you can not use page->lru fields of those pages. This patch
> > > > re-arrange the uncharge to allow single page to be uncharge without
> > > > modifying the lru field of the struct page.
> > > >
> > > > There is no change to memcontrol logic, it is the same as it was
> > > > before this patch.
> > >
> > > What is the memcg semantic of the memory? Why is it even charged? AFAIR
> > > this is not a reclaimable memory. If yes how are we going to deal with
> > > memory limits? What should happen if go OOM? Does killing an process
> > > actually help to release that memory? Isn't it pinned by a device?
> > >
> > > For the patch itself. It is quite ugly but I haven't spotted anything
> > > obviously wrong with it. It is the memcg semantic with this class of
> > > memory which makes me worried.
> >
> > So i am facing 3 choices. First one not account device memory at all.
> > Second one is account device memory like any other memory inside a
> > process. Third one is account device memory as something entirely new.
> >
> > I pick the second one for two reasons. First because when migrating
> > back from device memory it means that migration can not fail because
> > of memory cgroup limit, this simplify an already complex migration
> > code. Second because i assume that device memory usage is a transient
> > state ie once device is done with its computation the most likely
> > outcome is memory is migrated back. From this assumption it means
> > that you do not want to allow a process to overuse regular memory
> > while it is using un-accounted device memory. It sounds safer to
> > account device memory and to keep the process within its memcg
> > boundary.
> >
> > Admittedly here i am making an assumption and i can be wrong. Thing
> > is we do not have enough real data of how this will be use and how
> > much of an impact device memory will have. That is why for now i
> > would rather restrict myself to either not account it or account it
> > as usual.
> >
> > If you prefer not accounting it until we have more experience on how
> > it is use and how it impacts memory resource management i am fine with
> > that too. It will make the migration code slightly more complex.
>
> I can see why you want to do this but the semantic _has_ to be clear.
> And as such make sure that the exiting task will simply unpin and
> invalidate all the device memory (assuming this memory is not shared
> which I am not sure is even possible).

So there are two different paths out of device memory:
  - munmap/process exit: the memory is uncharged from its memory
    cgroup just like regular memory
  - migration to non-device memory: the memory cgroup charge is
    transferred to the new page just like for any other page
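
To make the two paths concrete, here is a minimal user-space sketch of
the accounting. It is only a toy model under stated assumptions: the
toy_memcg and toy_page types and the toy_* helpers are hypothetical
names made up for illustration, not the kernel's memcg API, and it
ignores locking, batching and statistics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not kernel types. */
struct toy_memcg {
	long charged;	/* pages currently charged to this group */
	long limit;	/* maximum number of pages the group may charge */
};

struct toy_page {
	struct toy_memcg *memcg;	/* group holding the charge, NULL if none */
	bool is_device;			/* true for device memory */
};

/* Charge one page to a group; fails when the group is at its limit. */
static bool toy_charge(struct toy_memcg *cg, struct toy_page *page)
{
	if (cg->charged >= cg->limit)
		return false;
	cg->charged++;
	page->memcg = cg;
	return true;
}

/* Path 1: munmap or process exit drops the charge, device page or not. */
static void toy_uncharge(struct toy_page *page)
{
	if (!page->memcg)
		return;
	assert(page->memcg->charged > 0);
	page->memcg->charged--;
	page->memcg = NULL;
}

/*
 * Path 2: migration hands the existing charge to the new page. The
 * group's counter does not change, so this step has no limit check
 * and cannot fail because of the memory cgroup limit.
 */
static void toy_migrate_charge(struct toy_page *oldpage,
			       struct toy_page *newpage)
{
	newpage->memcg = oldpage->memcg;
	oldpage->memcg = NULL;
}
```

In this model a group with a limit of one page can still migrate its
single device page back to regular memory, because the charge is
transferred rather than taken a second time.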

Do you want me to document all this in any specific place? I will
add a comment in memcontrol.c and in the HMM documentation for this,
but should I add it anywhere else?

Note that the device memory is not pinned. The whole point of HMM is to
do away with any pinning. Though, as device pages are not on the LRU,
they are not reclaimed like other pages. However, we expect that a
device driver might implement something akin to device memory reclaim
to make room for more important data, based on statistics collected by
the device driver. If there is enough commonality across devices then
we might implement a more generic mechanism, but at this point I would
rather grow as we learn.

Cheers,
Jerome