From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ot0-f197.google.com (mail-ot0-f197.google.com [74.125.82.197]) by kanga.kvack.org (Postfix) with ESMTP id 381D56B0286 for ; Tue, 16 Jan 2018 16:03:30 -0500 (EST)
Received: by mail-ot0-f197.google.com with SMTP id e19so11018404otf.4 for ; Tue, 16 Jan 2018 13:03:30 -0800 (PST)
Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id w130si1107072oib.393.2018.01.16.13.03.29 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 16 Jan 2018 13:03:29 -0800 (PST)
Date: Tue, 16 Jan 2018 16:03:21 -0500
From: Jerome Glisse
Subject: [LSF/MM TOPIC] CAPI/CCIX cache coherent device memory (NUMA too ?)
Message-ID: <20180116210321.GB8801@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
List-ID:
To: lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org, Anshuman Khandual, Balbir Singh, Dan Williams, John Hubbard, Jonathan Masters, Ross Zwisler

CAPI (on IBM Power8 and 9) and CCIX are two new standards that
build on top of an existing interconnect (like PCIe) and add the
possibility of cache-coherent access both ways (from CPU to
device memory and from device to main memory). This extends
what we are used to with PCIe (where only device access to main
memory can be cache coherent, but not CPU access to device memory).

How this memory is going to be exposed to the kernel, and how the
kernel is going to expose it to user space, is the topic I want to
discuss. I believe this is highly device specific; for instance,
for a GPU you want device memory allocation and usage to be
under the control of the GPU device driver. Other types of
devices may want a different strategy.

The HMAT patchset is partially related to all this, as it is about
exposing the different types of memory available to the CPU in a
system (HBM, main memory, ...) and some of their properties
(bandwidth, latency, ...).

We can start by looking at how CAPI and CCIX plan to expose this
to the kernel and try to list some of the types of devices we
expect to see. Discussion can then happen on how to represent this
internally in the kernel and how to expose it to userspace.

Note this might also trigger discussion on a NUMA-like model, or
on extending/replacing it with something more generic.

People (alphabetical order by first name), sorry if I missed
anyone:
    "Anshuman Khandual"
    "Balbir Singh"
    "Dan Williams"
    "John Hubbard"
    "Jonathan Masters"
    "Ross Zwisler"

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pf0-f198.google.com (mail-pf0-f198.google.com [209.85.192.198]) by kanga.kvack.org (Postfix) with ESMTP id 96469280263 for ; Tue, 16 Jan 2018 20:34:24 -0500 (EST)
Received: by mail-pf0-f198.google.com with SMTP id n6so12946782pfg.19 for ; Tue, 16 Jan 2018 17:34:24 -0800 (PST)
Received: from huawei.com (szxga05-in.huawei.com. [45.249.212.191]) by mx.google.com with ESMTPS id y77si3078821pfj.328.2018.01.16.17.34.22 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 16 Jan 2018 17:34:22 -0800 (PST)
Subject: Re: [LSF/MM TOPIC] CAPI/CCIX cache coherent device memory (NUMA too ?)
References: <20180116210321.GB8801@redhat.com>
From: "Liubo(OS Lab)"
Message-ID:
Date: Wed, 17 Jan 2018 09:32:59 +0800
MIME-Version: 1.0
In-Reply-To: <20180116210321.GB8801@redhat.com>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Jerome Glisse, lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org, Anshuman Khandual, Balbir Singh, Dan Williams, John Hubbard, Jonathan Masters, Ross Zwisler

On 2018/1/17 5:03, Jerome Glisse wrote:
> CAPI (on IBM Power8 and 9) and CCIX are two new standard that
> build on top of existing interconnect (like PCIE) and add the
> possibility for cache coherent access both way (from CPU to
> device memory and from device to main memory). This extend
> what we are use to with PCIE (where only device to main memory
> can be cache coherent but not CPU to device memory).
>

Yes, and this goes beyond CAPI/CCIX.
For example, an SoC may be connected to different types of memory through an internal system bus.

> How is this memory gonna be expose to the kernel and how the
> kernel gonna expose this to user space is the topic i want to
> discuss. I believe this is highly device specific for instance
> for GPU you want the device memory allocation and usage to be
> under the control of the GPU device driver. Maybe other type
> of device want different strategy.
>
> The HMAT patchset is partialy related to all this as it is about
> exposing different type of memory available in a system for CPU
> (HBM, main memory, ...) and some of their properties (bandwidth,
> latency, ...).
>

Yes, and "different types of memory" does not mean only device memory or NVDIMM (which are always considered less reliable than DDR).

>
> We can start by looking at how CAPI and CCIX plan to expose this
> to the kernel and try to list some of the type of devices we
> expect to see. Discussion can then happen on how to represent this
> internaly to the kernel and how to expose this to userspace.
>
> Note this might also trigger discussion on a NUMA like model or
> on extending/replacing it by something more generic.
>

Agreed. For the NUMA model, node distance is not enough when a system has different types of memory.
As the HMAT patches mention, there are different bandwidths, latencies, ...

>
> Peoples (alphabetical order on first name) sorry if i missed
> anyone:
>     "Anshuman Khandual"
>     "Balbir Singh"
>     "Dan Williams"
>     "John Hubbard"
>     "Jonathan Masters"
>     "Ross Zwisler"
>

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ot0-f199.google.com (mail-ot0-f199.google.com [74.125.82.199]) by kanga.kvack.org (Postfix) with ESMTP id 28C12280263 for ; Tue, 16 Jan 2018 20:55:16 -0500 (EST)
Received: by mail-ot0-f199.google.com with SMTP id 95so2176132otl.16 for ; Tue, 16 Jan 2018 17:55:16 -0800 (PST)
Received: from mail-sor-f41.google.com (mail-sor-f41.google.com. [209.85.220.41]) by mx.google.com with SMTPS id c188sor1145136oih.200.2018.01.16.17.55.15 for (Google Transport Security); Tue, 16 Jan 2018 17:55:15 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <20180116210321.GB8801@redhat.com>
References: <20180116210321.GB8801@redhat.com>
From: "Figo.zhang"
Date: Wed, 17 Jan 2018 09:55:14 +0800
Message-ID:
Subject: Re: [LSF/MM TOPIC] CAPI/CCIX cache coherent device memory (NUMA too ?)
Content-Type: multipart/alternative; boundary="001a113da306716d330562ef2364"
Sender: owner-linux-mm@kvack.org
List-ID:
To: Jerome Glisse
Cc: lsf-pc@lists.linux-foundation.org, Linux MM, Anshuman Khandual, Balbir Singh, Dan Williams, John Hubbard, Jonathan Masters, Ross Zwisler

--001a113da306716d330562ef2364
Content-Type: text/plain; charset="UTF-8"

2018-01-17 5:03 GMT+08:00 Jerome Glisse:

> CAPI (on IBM Power8 and 9) and CCIX are two new standard that
> build on top of existing interconnect (like PCIE) and add the
> possibility for cache coherent access both way (from CPU to
> device memory and from device to main memory). This extend
> what we are use to with PCIE (where only device to main memory
> can be cache coherent but not CPU to device memory).
>

The UPI bus also supports cache coherency on Intel platforms, right?
It seems the specifications of the CCIX/CAPI protocols are not public, so we cannot know the details
about them. Will your topic cover the details?

>
> How is this memory gonna be expose to the kernel and how the
> kernel gonna expose this to user space is the topic i want to
> discuss. I believe this is highly device specific for instance
> for GPU you want the device memory allocation and usage to be
> under the control of the GPU device driver. Maybe other type
> of device want different strategy.
>

I see a lack of simple examples of how to use HMM, because GPU drivers are
quite complicated for Linux driver developers other than the ATI/NVIDIA developers.

>
> The HMAT patchset is partialy related to all this as it is about
> exposing different type of memory available in a system for CPU
> (HBM, main memory, ...) and some of their properties (bandwidth,
> latency, ...).
>
>
> We can start by looking at how CAPI and CCIX plan to expose this
> to the kernel and try to list some of the type of devices we
> expect to see. Discussion can then happen on how to represent this
> internaly to the kernel and how to expose this to userspace.
>
> Note this might also trigger discussion on a NUMA like model or
> on extending/replacing it by something more generic.
>
>
> Peoples (alphabetical order on first name) sorry if i missed
> anyone:
>     "Anshuman Khandual"
>     "Balbir Singh"
>     "Dan Williams"
>     "John Hubbard"
>     "Jonathan Masters"
>     "Ross Zwisler"
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: email@kvack.org
>

--001a113da306716d330562ef2364
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

--001a113da306716d330562ef2364--

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-oi0-f69.google.com (mail-oi0-f69.google.com [209.85.218.69]) by kanga.kvack.org (Postfix) with ESMTP id 1C745280263 for ; Tue, 16 Jan 2018 21:30:32 -0500 (EST)
Received: by mail-oi0-f69.google.com with SMTP id z73so9923538oia.16 for ; Tue, 16 Jan 2018 18:30:32 -0800 (PST)
Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id r131si1352186oih.19.2018.01.16.18.30.30 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 16 Jan 2018 18:30:30 -0800 (PST)
Date: Tue, 16 Jan 2018 21:30:24 -0500
From: Jerome Glisse
Subject: Re: [LSF/MM TOPIC] CAPI/CCIX cache coherent device memory (NUMA too ?)
Message-ID: <20180117023024.GB3492@redhat.com>
References: <20180116210321.GB8801@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: "Figo.zhang"
Cc: lsf-pc@lists.linux-foundation.org, Linux MM, Anshuman Khandual, Balbir Singh, Dan Williams, John Hubbard, Jonathan Masters, Ross Zwisler

On Wed, Jan 17, 2018 at 09:55:14AM +0800, Figo.zhang wrote:
> 2018-01-17 5:03 GMT+08:00 Jerome Glisse:
>
> > CAPI (on IBM Power8 and 9) and CCIX are two new standard that
> > build on top of existing interconnect (like PCIE) and add the
> > possibility for cache coherent access both way (from CPU to
> > device memory and from device to main memory). This extend
> > what we are use to with PCIE (where only device to main memory
> > can be cache coherent but not CPU to device memory).
> >
>
> the UPI bus also support cache coherency for Intel platform, right?

AFAIK UPI only applies between processors and is not exposed to
devices, except integrated Intel devices (like Intel GPUs or FPGAs);
thus it is less generic/open than CAPI/CCIX.

> it seem the specification of CCIX/CAPI protocol is not public, we cannot
> know the details about them, your topic will cover the details?

I can only cover what will be public at the time of the summit, but for
the sake of discussion the important characteristic is the cache
coherency aspect. Discussing how it is implemented, the cache line
protocol and all the gory details of the protocol, is of little
interest from the kernel's point of view.

> > How is this memory gonna be expose to the kernel and how the
> > kernel gonna expose this to user space is the topic i want to
> > discuss. I believe this is highly device specific for instance
> > for GPU you want the device memory allocation and usage to be
> > under the control of the GPU device driver. Maybe other type
> > of device want different strategy.
> >
> i see it lack of some simple example for how to use the HMM, because
> GPU driver is more complicate for linux driver developer except the
> ATI/NVIDIA developers.

HMM requires a device that has an MMU and is capable of pausing
workloads that page fault. The only devices I know of that are complex
enough are GPUs, InfiniBand and FPGAs. The feedback I have had so far
is that most people working on any such device driver understand HMM.
I am always happy to answer any specific questions on the API and how
it is intended to be used by device drivers (and improve the kernel
documentation in the process). How HMM functionality is then exposed
to userspace is under the control of each individual device driver.

Cheers,
Jerome

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ua0-f197.google.com (mail-ua0-f197.google.com [209.85.217.197]) by kanga.kvack.org (Postfix) with ESMTP id 25D366B0069 for ; Wed, 17 Jan 2018 11:29:43 -0500 (EST)
Received: by mail-ua0-f197.google.com with SMTP id p17so7081577uap.12 for ; Wed, 17 Jan 2018 08:29:43 -0800 (PST)
Received: from mail-sor-f41.google.com (mail-sor-f41.google.com. [209.85.220.41]) by mx.google.com with SMTPS id x7sor2009000vkg.275.2018.01.17.08.29.42 for (Google Transport Security); Wed, 17 Jan 2018 08:29:42 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <20180116210321.GB8801@redhat.com>
References: <20180116210321.GB8801@redhat.com>
From: Balbir Singh
Date: Wed, 17 Jan 2018 21:59:40 +0530
Message-ID:
Subject: Re: [LSF/MM TOPIC] CAPI/CCIX cache coherent device memory (NUMA too ?)
Content-Type: text/plain; charset="UTF-8"
Sender: owner-linux-mm@kvack.org
List-ID:
To: Jerome Glisse
Cc: lsf-pc, linux-mm, Anshuman Khandual, Dan Williams, John Hubbard, Jonathan Masters, Ross Zwisler

On Wed, Jan 17, 2018 at 2:33 AM, Jerome Glisse wrote:
> CAPI (on IBM Power8 and 9) and CCIX are two new standard that
> build on top of existing interconnect (like PCIE) and add the
> possibility for cache coherent access both way (from CPU to
> device memory and from device to main memory). This extend
> what we are use to with PCIE (where only device to main memory
> can be cache coherent but not CPU to device memory).
>
> How is this memory gonna be expose to the kernel and how the
> kernel gonna expose this to user space is the topic i want to
> discuss. I believe this is highly device specific for instance
> for GPU you want the device memory allocation and usage to be
> under the control of the GPU device driver. Maybe other type
> of device want different strategy.
>
> The HMAT patchset is partialy related to all this as it is about
> exposing different type of memory available in a system for CPU
> (HBM, main memory, ...) and some of their properties (bandwidth,
> latency, ...).
>
>
> We can start by looking at how CAPI and CCIX plan to expose this
> to the kernel and try to list some of the type of devices we
> expect to see. Discussion can then happen on how to represent this
> internaly to the kernel and how to expose this to userspace.
>
> Note this might also trigger discussion on a NUMA like model or
> on extending/replacing it by something more generic.
>

Yes, I agree. I've had some experience with both NUMA and HMM/CDM
models. I think we should compare and contrast the trade-offs
and also discuss how we want to expose some of the ZONE_DEVICE
information back to user space.

>
> Peoples (alphabetical order on first name) sorry if i missed
> anyone:
>     "Anshuman Khandual"
>     "Balbir Singh"
>     "Dan Williams"
>     "John Hubbard"
>     "Jonathan Masters"
>     "Ross Zwisler"

I'd love to be there if invited.

Thanks,
Balbir Singh.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-vk0-f71.google.com (mail-vk0-f71.google.com [209.85.213.71]) by kanga.kvack.org (Postfix) with ESMTP id BAC996B0033 for ; Wed, 17 Jan 2018 11:43:43 -0500 (EST)
Received: by mail-vk0-f71.google.com with SMTP id e71so8860542vkd.4 for ; Wed, 17 Jan 2018 08:43:43 -0800 (PST)
Received: from mail-sor-f41.google.com (mail-sor-f41.google.com.
 [209.85.220.41]) by mx.google.com with SMTPS id f187sor1600683vkg.97.2018.01.17.08.43.42 for (Google Transport Security); Wed, 17 Jan 2018 08:43:42 -0800 (PST)
MIME-Version: 1.0
In-Reply-To:
References: <20180116210321.GB8801@redhat.com>
From: Balbir Singh
Date: Wed, 17 Jan 2018 22:13:41 +0530
Message-ID:
Subject: Re: [LSF/MM TOPIC] CAPI/CCIX cache coherent device memory (NUMA too ?)
Content-Type: text/plain; charset="UTF-8"
Sender: owner-linux-mm@kvack.org
List-ID:
To: "Liubo(OS Lab)"
Cc: Jerome Glisse, lsf-pc, linux-mm, Anshuman Khandual, Dan Williams, John Hubbard, Jonathan Masters, Ross Zwisler

On Wed, Jan 17, 2018 at 7:02 AM, Liubo(OS Lab) wrote:
> On 2018/1/17 5:03, Jerome Glisse wrote:
>> CAPI (on IBM Power8 and 9) and CCIX are two new standard that
>> build on top of existing interconnect (like PCIE) and add the
>> possibility for cache coherent access both way (from CPU to
>> device memory and from device to main memory). This extend
>> what we are use to with PCIE (where only device to main memory
>> can be cache coherent but not CPU to device memory).
>>
>
> Yes, and more than CAPI/CCIX.
> E.g A SoC may connected with different types of memory through internal system-bus.

Cool! Any references or docs?

>
>> How is this memory gonna be expose to the kernel and how the
>> kernel gonna expose this to user space is the topic i want to
>> discuss. I believe this is highly device specific for instance
>> for GPU you want the device memory allocation and usage to be
>> under the control of the GPU device driver. Maybe other type
>> of device want different strategy.
>>
>> The HMAT patchset is partialy related to all this as it is about
>> exposing different type of memory available in a system for CPU
>> (HBM, main memory, ...) and some of their properties (bandwidth,
>> latency, ...).
>>
>
> Yes, and different type of memory doesn't mean device-memory or Nvdimm only(which are always think not as reliable as DDR).
>

OK, so something probably as reliable as system memory, but with
different characteristics.

>>
>> We can start by looking at how CAPI and CCIX plan to expose this
>> to the kernel and try to list some of the type of devices we
>> expect to see. Discussion can then happen on how to represent this
>> internaly to the kernel and how to expose this to userspace.
>>
>> Note this might also trigger discussion on a NUMA like model or
>> on extending/replacing it by something more generic.
>>
>
> Agree, for NUMA model the node distance is not enough when a system has different type of memory.
> Like the HMAT patches mentioned, different bandwidth ,latency, ...
>

Yes, definitely worth discussing. Last time, I posted
N_COHERENT_MEMORY as a patchset to isolate memory, but it met with a
lot of opposition due to the lack of a full use case and an end-to-end
demonstration. I think we can work on a proposal that provides the
benefits of NUMA, but that might require us to revisit which algorithms
should run on which nodes, and the relationships between nodes.

Balbir Singh.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ot0-f198.google.com (mail-ot0-f198.google.com [74.125.82.198]) by kanga.kvack.org (Postfix) with ESMTP id 86BA16B0069 for ; Fri, 19 Jan 2018 00:14:08 -0500 (EST)
Received: by mail-ot0-f198.google.com with SMTP id e4so402443ote.7 for ; Thu, 18 Jan 2018 21:14:08 -0800 (PST)
Received: from hqemgate15.nvidia.com (hqemgate15.nvidia.com. [216.228.121.64]) by mx.google.com with ESMTPS id w13si4279559oth.212.2018.01.18.21.14.06 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Jan 2018 21:14:07 -0800 (PST)
Subject: Re: [LSF/MM TOPIC] CAPI/CCIX cache coherent device memory (NUMA too ?)
References: <20180116210321.GB8801@redhat.com>
From: John Hubbard
Message-ID: <8493581b-3e1b-1a61-00b2-59cebb1af452@nvidia.com>
Date: Thu, 18 Jan 2018 21:14:05 -0800
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Balbir Singh, Jerome Glisse
Cc: lsf-pc, linux-mm, Anshuman Khandual, Dan Williams, Jonathan Masters, Ross Zwisler

On 01/17/2018 08:29 AM, Balbir Singh wrote:
> On Wed, Jan 17, 2018 at 2:33 AM, Jerome Glisse wrote:
>> CAPI (on IBM Power8 and 9) and CCIX are two new standard that
>> build on top of existing interconnect (like PCIE) and add the
>> possibility for cache coherent access both way (from CPU to
>> device memory and from device to main memory). This extend
>> what we are use to with PCIE (where only device to main memory
>> can be cache coherent but not CPU to device memory).
>>
>> How is this memory gonna be expose to the kernel and how the
>> kernel gonna expose this to user space is the topic i want to
>> discuss. I believe this is highly device specific for instance
>> for GPU you want the device memory allocation and usage to be
>> under the control of the GPU device driver. Maybe other type
>> of device want different strategy.
>>
>> The HMAT patchset is partialy related to all this as it is about
>> exposing different type of memory available in a system for CPU
>> (HBM, main memory, ...) and some of their properties (bandwidth,
>> latency, ...).
>>
>>
>> We can start by looking at how CAPI and CCIX plan to expose this
>> to the kernel and try to list some of the type of devices we
>> expect to see. Discussion can then happen on how to represent this
>> internaly to the kernel and how to expose this to userspace.
>>
>> Note this might also trigger discussion on a NUMA like model or
>> on extending/replacing it by something more generic.
>>
>
> Yes, I agree. I've had some experience with both NUMA and HMM/CDM
> models. I think we should compare and contrast the trade-offs
> and also discuss how we want to expose some of the ZONE_DEVICE
> information back to user space.

Hi Jerome and all,

Thanks for adding me here. This area is something I'm interested in,
and would love to get a chance to discuss some more.

There are a lot of new types of computers popping up, with a
remarkable variety of memory-like components (and some unusual direct
connections between components), even within the same box. It really
is getting interesting.

I recall some key points from last year's discussions very clearly,
about doing careful experiments (for example, add HMM, and see how it's
used, rather than making large NUMA changes right away). So now that we
are (just barely) getting some experience with NUMA and HMM systems,
maybe we can look a bit further ahead. Admittedly, not much further; as
noted on the other thread ("HMM status upstream"), there is still
ongoing effort to finish up various device drivers, and get together an
open source compute stack.

thanks,
--
John Hubbard
NVIDIA

>
>>
>> Peoples (alphabetical order on first name) sorry if i missed
>> anyone:
>>     "Anshuman Khandual"
>>     "Balbir Singh"
>>     "Dan Williams"
>>     "John Hubbard"
>>     "Jonathan Masters"
>>     "Ross Zwisler"
>
> I'd love to be there if invited.
>
> Thanks,
> Balbir Singh.
>

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pg0-f71.google.com (mail-pg0-f71.google.com [74.125.83.71]) by kanga.kvack.org (Postfix) with ESMTP id C29916B0003 for ; Fri, 26 Jan 2018 13:47:04 -0500 (EST)
Received: by mail-pg0-f71.google.com with SMTP id 64so754772pgc.17 for ; Fri, 26 Jan 2018 10:47:04 -0800 (PST)
Received: from mga03.intel.com (mga03.intel.com.
 [134.134.136.65]) by mx.google.com with ESMTPS id m82si6703866pfi.343.2018.01.26.10.47.03 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 26 Jan 2018 10:47:03 -0800 (PST)
Date: Fri, 26 Jan 2018 11:47:01 -0700
From: Ross Zwisler
Subject: Re: [LSF/MM TOPIC] CAPI/CCIX cache coherent device memory (NUMA too ?)
Message-ID: <20180126184701.GA14734@linux.intel.com>
References: <20180116210321.GB8801@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180116210321.GB8801@redhat.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Jerome Glisse
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org, Anshuman Khandual, Balbir Singh, Dan Williams, John Hubbard, Jonathan Masters, Ross Zwisler

On Tue, Jan 16, 2018 at 04:03:21PM -0500, Jerome Glisse wrote:
> CAPI (on IBM Power8 and 9) and CCIX are two new standard that
> build on top of existing interconnect (like PCIE) and add the
> possibility for cache coherent access both way (from CPU to
> device memory and from device to main memory). This extend
> what we are use to with PCIE (where only device to main memory
> can be cache coherent but not CPU to device memory).
>
> How is this memory gonna be expose to the kernel and how the
> kernel gonna expose this to user space is the topic i want to
> discuss. I believe this is highly device specific for instance
> for GPU you want the device memory allocation and usage to be
> under the control of the GPU device driver. Maybe other type
> of device want different strategy.
>
> The HMAT patchset is partialy related to all this as it is about
> exposing different type of memory available in a system for CPU
> (HBM, main memory, ...) and some of their properties (bandwidth,
> latency, ...).
>
>
> We can start by looking at how CAPI and CCIX plan to expose this
> to the kernel and try to list some of the type of devices we
> expect to see. Discussion can then happen on how to represent this
> internaly to the kernel and how to expose this to userspace.
>
> Note this might also trigger discussion on a NUMA like model or
> on extending/replacing it by something more generic.
>
>
> Peoples (alphabetical order on first name) sorry if i missed
> anyone:
>     "Anshuman Khandual"
>     "Balbir Singh"
>     "Dan Williams"
>     "John Hubbard"
>     "Jonathan Masters"
>     "Ross Zwisler"

I'd love to be part of this discussion, thanks.