From: David Gibson <david@gibson.dropbear.id.au>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: linuxppc-dev@lists.ozlabs.org, kvm-ppc@vger.kernel.org,
	Alex Williamson <alex.williamson@redhat.com>,
	Paul Mackerras <paulus@ozlabs.org>
Subject: Re: [PATCH kernel v2 2/2] KVM: PPC: Check if IOMMU page is contained in the pinned physical page
Date: Fri, 29 Jun 2018 04:57:02 +0000
Message-ID: <20180629045702.GI3422@umbus.fritz.box>
In-Reply-To: <20180629145121.5d03e067@aik.ozlabs.ibm.com>

On Fri, Jun 29, 2018 at 02:51:21PM +1000, Alexey Kardashevskiy wrote:
> On Fri, 29 Jun 2018 14:12:41 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Tue, Jun 26, 2018 at 03:59:26PM +1000, Alexey Kardashevskiy wrote:
> > > We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
> > > an IOMMU page is contained in the physical page so the PCI hardware won't
> > > get access to unassigned host memory.
> > > 
> > > However we do not have this check in KVM fastpath (H_PUT_TCE accelerated
> > > code) so the user space can pin memory backed with 64k pages and create
> > > a hardware TCE table with a bigger page size. We were lucky so far and
> > > did not hit this yet as the very first time the mapping happens
> > > we do not have tbl::it_userspace allocated yet and fall back to
> > > the userspace which in turn calls VFIO IOMMU driver and that fails
> > > > because of the check in vfio_iommu_spapr_tce.c, which is not really
> > > > a sustainable solution.
> > > 
> > > This stores the smallest preregistered page size in the preregistered
> > > region descriptor and changes the mm_iommu_xxx API to check this against
> > > the IOMMU page size.
> > > 
> > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > ---
> > > Changes:
> > > v2:
> > > * explicitly check for compound pages before calling compound_order()
> > > 
> > > ---
> > > > The bug is: run QEMU _without_ hugepages (no -mem-path) and tell it to
> > > advertise 16MB pages to the guest; a typical pseries guest will use 16MB
> > > for IOMMU pages without checking the mmu pagesize and this will fail
> > > at https://git.qemu.org/?p=qemu.git;a=blob;f=hw/vfio/common.c;h=fb396cf00ac40eb35967a04c9cc798ca896eed57;hb=refs/heads/master#l256
> > > 
> > > With the change, mapping will fail in KVM and the guest will print:
> > > 
> > > mlx5_core 0000:00:00.0: ibm,create-pe-dma-window(2027) 0 8000000 20000000 18 1f returned 0 (liobn = 0x80000001 starting addr = 8000000 0)
> > > mlx5_core 0000:00:00.0: created tce table LIOBN 0x80000001 for /pci@800000020000000/ethernet@0
> > > mlx5_core 0000:00:00.0: failed to map direct window for
> > > /pci@800000020000000/ethernet@0: -1  
> > 
> > [snip]
> > > @@ -124,7 +125,7 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
> > >  		struct mm_iommu_table_group_mem_t **pmem)
> > >  {
> > >  	struct mm_iommu_table_group_mem_t *mem;
> > > -	long i, j, ret = 0, locked_entries = 0;
> > > +	long i, j, ret = 0, locked_entries = 0, pageshift;
> > >  	struct page *page = NULL;
> > >  
> > >  	mutex_lock(&mem_list_mutex);
> > > @@ -166,6 +167,8 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
> > >  		goto unlock_exit;
> > >  	}
> > >  
> > > > +	mem->pageshift = 30; /* start from 1G pages - the biggest we have */
> > 
> > What about 16G pages on an HPT system?
> 
> 
> Below in the loop mem->pageshift will reduce to the biggest actual size,
> which will be 16MB/64K/4K. Or remain 1GB if no memory is actually
> pinned, no loss there.

Are you saying that 16G IOMMU pages aren't supported?  Or that there's
some reason a guest can never use them?

> > >  	for (i = 0; i < entries; ++i) {
> > >  		if (1 != get_user_pages_fast(ua + (i << PAGE_SHIFT),
> > >  					1/* pages */, 1/* iswrite */, &page)) {
> > > @@ -199,6 +202,11 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
> > >  			}
> > >  		}
> > >  populate:
> > > +		pageshift = PAGE_SHIFT;
> > > +		if (PageCompound(page))
> > > +			pageshift += compound_order(compound_head(page));
> > > +		mem->pageshift = min_t(unsigned int, mem->pageshift, pageshift);  
> > 
> > Why not make mem->pageshift and pageshift local the same type to avoid
> > the min_t() ?
> 
> I was under the impression that min() is deprecated (maybe I misinterpreted
> checkpatch.pl) and therefore did not pay attention to it. I can fix this and
> repost if there are no other questions.

Hm, it's possible.


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

