From: Paul Mackerras <paulus@ozlabs.org>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: linuxppc-dev@lists.ozlabs.org, kvm-ppc@vger.kernel.org,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH kernel v2] KVM: PPC: Optimize clearing TCEs for sparse tables
Date: Sun, 21 Oct 2018 21:52:11 +0000
Message-ID: <20181021215211.GA16320@blackberry>
In-Reply-To: <20181015100841.33267-1-aik@ozlabs.ru>

On Mon, Oct 15, 2018 at 09:08:41PM +1100, Alexey Kardashevskiy wrote:
> The powernv platform maintains 2 TCE tables for VFIO - a hardware TCE
> table and a table with userspace addresses. These tables are radix trees;
> we allocate indirect levels when they are written to. Since
> memory allocation is problematic in real mode, we have 2 accessors
> to the entries:
> - for virtual mode: it allocates the memory and it is always expected
> to return non-NULL;
> - for real mode: it does not allocate and can return NULL.
> 
> Also, DMA windows can span up to 55 bits of the address space and since
> we never have this much RAM, such windows are sparse. However currently
> the SPAPR TCE IOMMU driver walks through all TCEs to unpin DMA memory.
> 
> Since we maintain a userspace addresses table for VFIO which is a mirror
> of the hardware table, we can use it to know which parts of the DMA
> window have not been mapped and skip these; this patch does so.
> 
> The bare metal systems do not have this problem as they use a bypass mode
> of a PHB which maps RAM directly.
> 
> This helps a lot with sparse DMA windows, reducing the shutdown time from
> about 3 minutes per 1 billion TCEs to a few seconds for a 32GB sparse guest.
> Just skipping the last level seems to be good enough.
> 
> As the non-allocating accessor is now used in virtual mode as well, rename it
> from IOMMU_TABLE_USERSPACE_ENTRY_RM (real mode) to _RO (read only).
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Thanks, applied to my kvm-ppc-next branch, and now in the kvm next
branch also.

Paul.
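
For readers following the thread, the idea in the quoted commit message
(an allocating accessor, a non-allocating one that may return NULL, and
a clearing loop that skips whole unallocated last-level chunks) can be
sketched roughly as below. This is only an illustration and not the
kernel code: the two-level layout, the sizes, and the names
sparse_table, entry_alloc, entry_ro and clear_range are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sizes for illustration; not the kernel's values. */
#define LEAF_ENTRIES 512UL
#define NUM_LEAVES   1024UL

/* Two-level table: last-level chunks stay NULL until first written. */
struct sparse_table {
	unsigned long *leaves[NUM_LEAVES];
};

/* Allocating accessor: creates the leaf on demand, never returns NULL. */
static unsigned long *entry_alloc(struct sparse_table *t, unsigned long idx)
{
	unsigned long **slot = &t->leaves[idx / LEAF_ENTRIES];

	if (!*slot) {
		*slot = calloc(LEAF_ENTRIES, sizeof(**slot));
		if (!*slot)
			abort();
	}
	return &(*slot)[idx % LEAF_ENTRIES];
}

/* Read-only accessor: never allocates, returns NULL for a missing leaf. */
static unsigned long *entry_ro(struct sparse_table *t, unsigned long idx)
{
	unsigned long *leaf = t->leaves[idx / LEAF_ENTRIES];

	return leaf ? &leaf[idx % LEAF_ENTRIES] : NULL;
}

/*
 * Clear entries in [start, start + count). When the read-only accessor
 * reports a missing leaf, jump straight to the first index of the next
 * leaf instead of visiting each entry. Returns the number of entries
 * actually visited.
 */
static unsigned long clear_range(struct sparse_table *t,
				 unsigned long start, unsigned long count)
{
	unsigned long i = start, end = start + count, visited = 0;

	while (i < end) {
		unsigned long *e = entry_ro(t, i);

		if (!e) {
			i = (i / LEAF_ENTRIES + 1) * LEAF_ENTRIES;
			continue;
		}
		*e = 0;
		visited++;
		i++;
	}
	return visited;
}
```

With only two leaves populated, clearing the whole index range touches
just those two leaves' entries, which matches the commit message's
remark that skipping at the last level alone is good enough.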
