From: Jerome Glisse <jglisse@redhat.com>
To: Bob Liu <liubo95@huawei.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
John Hubbard <jhubbard@nvidia.com>,
David Nellans <dnellans@nvidia.com>,
Dan Williams <dan.j.williams@intel.com>,
Balbir Singh <bsingharora@gmail.com>,
Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5
Date: Tue, 18 Jul 2017 11:38:16 -0400
Message-ID: <20170718153816.GA3135@redhat.com>
In-Reply-To: <2d534afc-28c5-4c81-c452-7e4c013ab4d0@huawei.com>
On Tue, Jul 18, 2017 at 11:26:51AM +0800, Bob Liu wrote:
> On 2017/7/14 5:15, Jerome Glisse wrote:
> > Sorry, I made a horrible mistake on names in v4; I completely
> > misunderstood the suggestion. So here I repost with proper naming.
> > This is the only change since v3. Again, sorry about the noise
> > with v4.
> >
> > Changes since v4:
> > - s/DEVICE_HOST/DEVICE_PUBLIC
> >
> > Git tree:
> > https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-cdm-v5
> >
> >
> > Cache coherent device memory applies to architectures with a system
> > bus like CAPI or CCIX. Devices connected to such a system bus can
> > expose their memory to the system and allow cache coherent access to
> > it from the CPU.
> >
> > Even if for all intents and purposes device memory behaves like regular
> > memory, we still want to manage it in isolation from regular memory.
> > There are several reasons for that. First and foremost, this memory is
> > less reliable than regular memory: if the device hangs because of
> > invalid commands, we can lose access to device memory. Second, CPU
> > access to this memory is expected to be slower than to regular memory.
> > Third, having random memory placed in device memory means that some of
> > the bus bandwidth would not be available to the device but would be
> > used by CPU access.
> >
> > This is why we want to manage such memory in isolation from regular
> > memory. The kernel should not try to use this memory, even as a last
> > resort when running out of memory, at least for now.
> >
>
> I think setting a very large node distance for "Cache Coherent Device
> Memory" may be an easier way to address these concerns.
Such an approach was discussed at length in the past, see the links below.
Outcome of the discussion:
- CPU-less nodes are bad
- device memory can be unreliable (device hang) and there is no way for
  an application to understand that
- application and driver NUMA madvise/mbind/mempolicy ... can conflict
  with each other, and there is no way for the kernel to figure out
  which should apply
- NUMA as it is now would not work, as we need further isolation than
  what a large node distance would provide
Probably a few other arguments I forget.
https://lists.gt.net/linux/kernel/2551369
https://groups.google.com/forum/#!topic/linux.kernel/Za_e8C3XnRs%5B1-25%5D
https://lwn.net/Articles/720380/
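To illustrate the last point, here is a toy model (plain Python, not kernel
code, and not how the page allocator is actually implemented) of why a
large node distance only orders the fallback and does not isolate the
memory. The topology, the distance values and the helper names
fallback_order() and allocate() are all hypothetical, made up for this
sketch:

```python
# Toy model only: no kernel API is used, every name here is hypothetical.

def fallback_order(local, distance):
    """Return node ids sorted by distance from `local`, nearest first."""
    return sorted(distance[local], key=lambda node: distance[local][node])


def allocate(local, distance, free_pages):
    """Take one page from the nearest node that still has free pages."""
    for node in fallback_order(local, distance):
        if free_pages[node] > 0:
            free_pages[node] -= 1
            return node
    return None  # every node is exhausted


# Assumed topology: nodes 0 and 1 are regular memory, node 2 stands for
# device memory that was given a very large distance.
DISTANCE = {
    0: {0: 10, 1: 20, 2: 255},
    1: {0: 20, 1: 10, 2: 255},
}

free_pages = {0: 1, 1: 1, 2: 4}

# Three allocations from node 0: once regular memory is exhausted the
# third one lands on the device node, despite its distance.
placements = [allocate(0, DISTANCE, free_pages) for _ in range(3)]
print(placements)
```

In this model the device node is simply last in the fallback list, so it is
still used once the regular nodes run out, which is exactly the "last
resort" use we want to avoid.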
Cheers,
Jerome