All of lore.kernel.org
 help / color / mirror / Atom feed
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	David Miller <davem@davemloft.net>,
	hch@infradead.org, kvm@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	peterx@redhat.com, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	linux-parisc@vger.kernel.org
Subject: Re: [RFC PATCH V2 0/5] vhost: accelerate metadata access through vmap()
Date: Tue, 12 Mar 2019 15:57:14 -0700	[thread overview]
Message-ID: <1552431434.14432.47.camel@HansenPartnership.com> (raw)
In-Reply-To: <20190312225032.GD25147@redhat.com>

On Tue, 2019-03-12 at 18:50 -0400, Andrea Arcangeli wrote:
> On Tue, Mar 12, 2019 at 03:02:54PM -0700, James Bottomley wrote:
> > I'm sure there must be workarounds elsewhere in the other arch code
> > otherwise things like this, which appear all over drivers/,
> > wouldn't
> > work:
> > 
> > drivers/scsi/isci/request.c:1430
> > 
> > 	kaddr = kmap_atomic(page);
> > 	memcpy(kaddr + sg->offset, src_addr, copy_len);
> > 	kunmap_atomic(kaddr);
> > 
> 
> Are you sure "page" is an userland page with an alias address?
> 
> 	sg->page_link = (unsigned long)virt_to_page(addr);

Yes, it's an element of a scatter gather list, which may be either a
kernel page or a user page, but is usually the latter.

> page_link seems to point to kernel memory.
> 
> I found an apparent solution like parisc on arm 32bit:
> 
> void __kunmap_atomic(void *kvaddr)
> {
> 	unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK;
> 	int idx, type;
> 
> 	if (kvaddr >= (void *)FIXADDR_START) {
> 		type = kmap_atomic_idx();
> 		idx = FIX_KMAP_BEGIN + type + KM_TYPE_NR *
> smp_processor_id();
> 
> 		if (cache_is_vivt())
> 			__cpuc_flush_dcache_area((void *)vaddr,
> PAGE_SIZE);
> 
> However on arm 64bit kunmap_atomic is not implemented at all and
> other 32bit implementations don't do it, for example sparc seems to
> do the cache flush too if the kernel is built with
> CONFIG_DEBUG_HIGHMEM (which makes the flushing conditional to the
> debug option).
> 
> The kunmap_atomic where fixmap is used, is flushing the tlb lazily so
> even on 32bit you can't even be sure if there was a tlb flush for
> each single page you unmapped, so it's hard to see how the above can
> work safe, is "page" would have been an userland page mapped with
> aliased CPU cache.
> 
> > the sequence dirties the kernel virtual address but doesn't flush
> > before doing kunmap.  There are hundreds of other examples which is
> > why I think adding flush_kernel_dcache_page() is an already lost
> > cause.
> 
> In lots of cases kmap is needed to just modify kernel memory not to
> modify userland memory (where get/put_user is more commonly used
> instead..), there's no cache aliasing in such case.

That's why I picked drivers/  The use case in there is mostly kmap to
put a special value into a scatter gather list entry.

> > Actually copy_user_page() is unused in the main kernel.  The big
> > problem is copy_user_highpage() but that's mostly highly optimised
> > by the VIPT architectures (in other words you can fiddle with kmap
> > without impacting it).
> 
> copy_user_page is not unused, it's called precisely by
> copy_user_highpage, which is why the cache flushes are done inside
> copy_user_page.
> 
> static inline void copy_user_highpage(struct page *to, struct page
> *from,
> 	unsigned long vaddr, struct vm_area_struct *vma)
> {
> 	char *vfrom, *vto;
> 
> 	vfrom = kmap_atomic(from);
> 	vto = kmap_atomic(to);
> 	copy_user_page(vto, vfrom, vaddr, to);
> 	kunmap_atomic(vto);
> 	kunmap_atomic(vfrom);
> }

That's the asm/generic implementation.  Most VIPT architectures
override it.

James


WARNING: multiple messages have this Message-ID (diff)
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: linux-parisc@vger.kernel.org, kvm@vger.kernel.org,
	"Michael S. Tsirkin" <mst@redhat.com>,
	netdev@vger.kernel.org, Jason Wang <jasowang@redhat.com>,
	linux-kernel@vger.kernel.org, peterx@redhat.com,
	virtualization@lists.linux-foundation.org, hch@infradead.org,
	linux-mm@kvack.org, David Miller <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH V2 0/5] vhost: accelerate metadata access through vmap()
Date: Tue, 12 Mar 2019 15:57:14 -0700	[thread overview]
Message-ID: <1552431434.14432.47.camel@HansenPartnership.com> (raw)
In-Reply-To: <20190312225032.GD25147@redhat.com>

On Tue, 2019-03-12 at 18:50 -0400, Andrea Arcangeli wrote:
> On Tue, Mar 12, 2019 at 03:02:54PM -0700, James Bottomley wrote:
> > I'm sure there must be workarounds elsewhere in the other arch code
> > otherwise things like this, which appear all over drivers/,
> > wouldn't
> > work:
> > 
> > drivers/scsi/isci/request.c:1430
> > 
> > 	kaddr = kmap_atomic(page);
> > 	memcpy(kaddr + sg->offset, src_addr, copy_len);
> > 	kunmap_atomic(kaddr);
> > 
> 
> Are you sure "page" is an userland page with an alias address?
> 
> 	sg->page_link = (unsigned long)virt_to_page(addr);

Yes, it's an element of a scatter gather list, which may be either a
kernel page or a user page, but is usually the latter.

> page_link seems to point to kernel memory.
> 
> I found an apparent solution like parisc on arm 32bit:
> 
> void __kunmap_atomic(void *kvaddr)
> {
> 	unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK;
> 	int idx, type;
> 
> 	if (kvaddr >= (void *)FIXADDR_START) {
> 		type = kmap_atomic_idx();
> 		idx = FIX_KMAP_BEGIN + type + KM_TYPE_NR *
> smp_processor_id();
> 
> 		if (cache_is_vivt())
> 			__cpuc_flush_dcache_area((void *)vaddr,
> PAGE_SIZE);
> 
> However on arm 64bit kunmap_atomic is not implemented at all and
> other 32bit implementations don't do it, for example sparc seems to
> do the cache flush too if the kernel is built with
> CONFIG_DEBUG_HIGHMEM (which makes the flushing conditional to the
> debug option).
> 
> The kunmap_atomic where fixmap is used, is flushing the tlb lazily so
> even on 32bit you can't even be sure if there was a tlb flush for
> each single page you unmapped, so it's hard to see how the above can
> work safe, is "page" would have been an userland page mapped with
> aliased CPU cache.
> 
> > the sequence dirties the kernel virtual address but doesn't flush
> > before doing kunmap.  There are hundreds of other examples which is
> > why I think adding flush_kernel_dcache_page() is an already lost
> > cause.
> 
> In lots of cases kmap is needed to just modify kernel memory not to
> modify userland memory (where get/put_user is more commonly used
> instead..), there's no cache aliasing in such case.

That's why I picked drivers/  The use case in there is mostly kmap to
put a special value into a scatter gather list entry.

> > Actually copy_user_page() is unused in the main kernel.  The big
> > problem is copy_user_highpage() but that's mostly highly optimised
> > by the VIPT architectures (in other words you can fiddle with kmap
> > without impacting it).
> 
> copy_user_page is not unused, it's called precisely by
> copy_user_highpage, which is why the cache flushes are done inside
> copy_user_page.
> 
> static inline void copy_user_highpage(struct page *to, struct page
> *from,
> 	unsigned long vaddr, struct vm_area_struct *vma)
> {
> 	char *vfrom, *vto;
> 
> 	vfrom = kmap_atomic(from);
> 	vto = kmap_atomic(to);
> 	copy_user_page(vto, vfrom, vaddr, to);
> 	kunmap_atomic(vto);
> 	kunmap_atomic(vfrom);
> }

That's the asm/generic implementation.  Most VIPT architectures
override it.

James


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-03-12 22:57 UTC|newest]

Thread overview: 175+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-06  7:18 [RFC PATCH V2 0/5] vhost: accelerate metadata access through vmap() Jason Wang
2019-03-06  7:18 ` Jason Wang
2019-03-06  7:18 ` [RFC PATCH V2 1/5] vhost: generalize adding used elem Jason Wang
2019-03-06  7:18 ` Jason Wang
2019-03-06  7:18 ` [RFC PATCH V2 2/5] vhost: fine grain userspace memory accessors Jason Wang
2019-03-06 10:45   ` Christophe de Dinechin
2019-03-07  2:38     ` Jason Wang
2019-03-07  2:38       ` Jason Wang
2019-03-06 10:45   ` Christophe de Dinechin
2019-03-06  7:18 ` Jason Wang
2019-03-06  7:18 ` [RFC PATCH V2 3/5] vhost: rename vq_iotlb_prefetch() to vq_meta_prefetch() Jason Wang
2019-03-06  7:18 ` Jason Wang
2019-03-06  7:18 ` [RFC PATCH V2 4/5] vhost: introduce helpers to get the size of metadata area Jason Wang
2019-03-06 10:56   ` Christophe de Dinechin
2019-03-06 10:56   ` Christophe de Dinechin
2019-03-07  2:40     ` Jason Wang
2019-03-07  2:40     ` Jason Wang
2019-03-06 18:43   ` Souptick Joarder
2019-03-07  2:42     ` Jason Wang
2019-03-07  2:42     ` Jason Wang
2019-03-06  7:18 ` Jason Wang
2019-03-06  7:18 ` [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address Jason Wang
2019-03-06 16:31   ` Michael S. Tsirkin
2019-03-06 16:31   ` Michael S. Tsirkin
2019-03-07  2:45     ` Jason Wang
2019-03-07  2:45     ` Jason Wang
2019-03-07 15:34       ` Michael S. Tsirkin
2019-03-07 15:34       ` Michael S. Tsirkin
2019-03-07 19:09         ` Jerome Glisse
2019-03-07 19:38           ` Andrea Arcangeli
2019-03-07 19:38             ` Andrea Arcangeli
2019-03-07 20:17             ` Jerome Glisse
2019-03-07 20:17             ` Jerome Glisse
2019-03-07 21:27               ` Andrea Arcangeli
2019-03-07 21:27               ` Andrea Arcangeli
2019-03-08  9:13                 ` Jason Wang
2019-03-08 19:11                   ` Andrea Arcangeli
2019-03-11  7:21                     ` Jason Wang
2019-03-11  7:21                     ` Jason Wang
2019-03-08 19:11                   ` Andrea Arcangeli
2019-03-08  9:13                 ` Jason Wang
2019-03-11 14:45                 ` Jan Kara
2019-03-11 14:45                 ` Jan Kara
2019-03-07 19:09         ` Jerome Glisse
2019-03-08  8:31         ` Jason Wang
2019-03-08  8:31         ` Jason Wang
2019-03-07 15:47   ` Michael S. Tsirkin
2019-03-07 17:56     ` Michael S. Tsirkin
2019-03-07 17:56     ` Michael S. Tsirkin
2019-03-07 19:16       ` Andrea Arcangeli
2019-03-07 19:16       ` Andrea Arcangeli
2019-03-08  8:50         ` Jason Wang
2019-03-08  8:50           ` Jason Wang
2019-03-08 14:58           ` Jerome Glisse
2019-03-11  7:18             ` Jason Wang
2019-03-11  7:18             ` Jason Wang
2019-03-08 14:58           ` Jerome Glisse
2019-03-08 19:48           ` Andrea Arcangeli
2019-03-08 19:48           ` Andrea Arcangeli
2019-03-08 20:06             ` Jerome Glisse
2019-03-08 20:06             ` Jerome Glisse
2019-03-11  7:40             ` Jason Wang
2019-03-11  7:40               ` Jason Wang
2019-03-11 12:48               ` Michael S. Tsirkin
2019-03-11 12:48               ` Michael S. Tsirkin
2019-03-11 13:43                 ` Andrea Arcangeli
2019-03-11 13:43                 ` Andrea Arcangeli
2019-03-12  2:56                   ` Jason Wang
2019-03-12  2:56                   ` Jason Wang
2019-03-12  3:51                     ` Michael S. Tsirkin
2019-03-12  3:51                     ` Michael S. Tsirkin
2019-03-12  2:52                 ` Jason Wang
2019-03-12  2:52                   ` Jason Wang
2019-03-12  3:50                   ` Michael S. Tsirkin
2019-03-12  7:15                     ` Jason Wang
2019-03-12  7:15                     ` Jason Wang
2019-03-12  3:50                   ` Michael S. Tsirkin
2019-03-07 19:17       ` Jerome Glisse
2019-03-08  2:21         ` Michael S. Tsirkin
2019-03-08  2:21           ` Michael S. Tsirkin
2019-03-08  2:55           ` Jerome Glisse
2019-03-08  2:55           ` Jerome Glisse
2019-03-08  3:16             ` Michael S. Tsirkin
2019-03-08  3:16             ` Michael S. Tsirkin
2019-03-08  3:40               ` Jerome Glisse
2019-03-08  3:43                 ` Michael S. Tsirkin
2019-03-08  3:43                 ` Michael S. Tsirkin
2019-03-08  3:45                   ` Jerome Glisse
2019-03-08  3:45                   ` Jerome Glisse
2019-03-08  9:15                     ` Jason Wang
2019-03-08  9:15                     ` Jason Wang
2019-03-08  3:40               ` Jerome Glisse
2019-03-08  8:58         ` Jason Wang
2019-03-08 12:56           ` Michael S. Tsirkin
2019-03-08 15:02             ` Jerome Glisse
2019-03-08 15:02             ` Jerome Glisse
2019-03-08 12:56           ` Michael S. Tsirkin
2019-03-08 19:13           ` Andrea Arcangeli
2019-03-08 19:13           ` Andrea Arcangeli
2019-03-08  8:58         ` Jason Wang
2019-03-07 19:17       ` Jerome Glisse
2019-03-07 15:47   ` Michael S. Tsirkin
2019-03-06  7:18 ` Jason Wang
2019-03-08 14:12 ` [RFC PATCH V2 0/5] vhost: accelerate metadata access through vmap() Christoph Hellwig
2019-03-08 14:12   ` Christoph Hellwig
2019-03-11  7:13   ` Jason Wang
2019-03-11  7:13   ` Jason Wang
2019-03-11  7:13     ` Jason Wang
2019-03-11 13:59     ` Michael S. Tsirkin
2019-03-11 13:59     ` Michael S. Tsirkin
2019-03-11 13:59       ` Michael S. Tsirkin
2019-03-11 18:14       ` David Miller
2019-03-11 18:14       ` David Miller
2019-03-11 18:14         ` David Miller
2019-03-12  2:59         ` Jason Wang
2019-03-12  2:59           ` Jason Wang
2019-03-12  3:52           ` Michael S. Tsirkin
2019-03-12  3:52             ` Michael S. Tsirkin
2019-03-12  3:52             ` Michael S. Tsirkin
2019-03-12  7:17             ` Jason Wang
2019-03-12  7:17             ` Jason Wang
2019-03-12  7:17               ` Jason Wang
2019-03-12 11:54               ` Michael S. Tsirkin
2019-03-12 11:54                 ` Michael S. Tsirkin
2019-03-12 15:46                 ` James Bottomley
2019-03-12 15:46                   ` James Bottomley
2019-03-12 20:04                   ` Andrea Arcangeli
2019-03-12 20:04                     ` Andrea Arcangeli
2019-03-12 20:04                     ` Andrea Arcangeli
2019-03-12 20:53                     ` James Bottomley
2019-03-12 20:53                       ` James Bottomley
2019-03-12 21:11                       ` Andrea Arcangeli
2019-03-12 21:11                         ` Andrea Arcangeli
2019-03-12 21:19                         ` James Bottomley
2019-03-12 21:19                           ` James Bottomley
2019-03-12 21:53                           ` Andrea Arcangeli
2019-03-12 21:53                           ` Andrea Arcangeli
2019-03-12 21:53                             ` Andrea Arcangeli
2019-03-12 22:02                             ` James Bottomley
2019-03-12 22:02                               ` James Bottomley
2019-03-12 22:50                               ` Andrea Arcangeli
2019-03-12 22:50                                 ` Andrea Arcangeli
2019-03-12 22:57                                 ` James Bottomley [this message]
2019-03-12 22:57                                   ` James Bottomley
2019-03-12 22:50                               ` Andrea Arcangeli
2019-03-12 21:11                       ` Andrea Arcangeli
2019-03-13 16:05                       ` Christoph Hellwig
2019-03-13 16:05                       ` Christoph Hellwig
2019-03-13 16:05                         ` Christoph Hellwig
2019-03-13 16:05                         ` Christoph Hellwig
2019-03-13 16:37                         ` James Bottomley
2019-03-13 16:37                           ` James Bottomley
2019-03-14 10:42                           ` Michael S. Tsirkin
2019-03-14 10:42                             ` Michael S. Tsirkin
2019-03-14 13:49                             ` Jason Wang
2019-03-14 13:49                             ` Jason Wang
2019-03-14 13:49                               ` Jason Wang
2019-03-14 19:33                               ` Andrea Arcangeli
2019-03-14 19:33                               ` Andrea Arcangeli
2019-03-14 19:33                                 ` Andrea Arcangeli
2019-03-15  4:39                                 ` Jason Wang
2019-03-15  4:39                                 ` Jason Wang
2019-03-15  4:39                                   ` Jason Wang
2019-03-14 10:42                           ` Michael S. Tsirkin
2019-03-12 11:54               ` Michael S. Tsirkin
2019-03-12  5:14           ` James Bottomley
2019-03-12  5:14             ` James Bottomley
2019-03-12  7:51             ` Jason Wang
2019-03-12  7:51             ` Jason Wang
2019-03-12  7:51               ` Jason Wang
2019-03-12  7:53               ` Jason Wang
2019-03-12  7:53                 ` Jason Wang
2019-03-12  7:53               ` Jason Wang
2019-03-12  2:59         ` Jason Wang
2019-03-08 14:12 ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1552431434.14432.47.camel@HansenPartnership.com \
    --to=james.bottomley@hansenpartnership.com \
    --cc=aarcange@redhat.com \
    --cc=davem@davemloft.net \
    --cc=hch@infradead.org \
    --cc=jasowang@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=peterx@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.