From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christoph Hellwig
Subject: Re: [PATCH] vmalloc: introduce vmap_pfn for persistent memory
Date: Wed, 8 Nov 2017 07:04:47 -0800
Message-ID: <20171108150447.GA10374@infradead.org>
References: <20171108095909.GA7390@infradead.org>
To: Mikulas Patocka
Cc: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org, Christoph Hellwig, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, Laura Abbott, "Kirill A . Shutemov"
List-Id: dm-devel.ids

On Wed, Nov 08, 2017 at 07:33:09AM -0500, Mikulas Patocka wrote:
> We could use the function clwb() (or arch-independent wrapper dax_flush())
> - that uses the clflushopt instruction on Broadwell or clwb on Skylake -
> but it is very slow, write performance on Broadwell is only 350MB/s.
>
> So in practice I use the movnti instruction that bypasses cache. The
> write-combining buffer is flushed with sfence.

And what do you do for an architecture with virtually indexed caches?
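
For readers unfamiliar with the approach described in the quoted text (movnti stores that bypass the cache, followed by sfence), here is a minimal user-space sketch in C. It is not code from the patch under discussion. The helper name nt_copy64 is hypothetical, and the sketch assumes an x86-64 compiler that provides the SSE2 intrinsics _mm_stream_si64 and _mm_sfence. On any other architecture it falls back to a plain memcpy, which has no cache-bypassing behaviour and therefore says nothing about the virtually indexed cache case raised above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>  /* SSE2: _mm_stream_si64 (movnti) and _mm_sfence */
#endif

/*
 * Hypothetical helper (not from the patch): copy n 64-bit words from
 * src to dst using non-temporal stores where the target supports them.
 */
void nt_copy64(uint64_t *dst, const uint64_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);

#if defined(__x86_64__)
    /* Each store is issued as a non-temporal (movnti) store. */
    for (size_t i = 0; i < n; i++)
        _mm_stream_si64((long long *)&dst[i], (long long)src[i]);

    /* sfence orders the non-temporal stores ahead of later stores. */
    _mm_sfence();
#else
    /* Assumption: no equivalent instruction here, so do an ordinary copy. */
    memcpy(dst, src, n * sizeof(*dst));
#endif
}
```

A caller would invoke nt_copy64(dst, src, n) on a destination buffer and treat the data as ordered once the function returns. Whether this is sufficient for persistence on a given platform is outside what this sketch shows.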