From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Paul Mackerras <paulus@samba.org>,
linuxppc-dev@lists.ozlabs.org,
Gavin Shan <gwshan@linux.vnet.ibm.com>
Subject: Re: [PATCH v3 03/18] KVM: PPC: Account TCE pages in locked_vm
Date: Mon, 28 Jul 2014 14:34:33 +1000
Message-ID: <1406522073.4935.46.camel@pasglop>
In-Reply-To: <53D5D04F.7000507@ozlabs.ru>
On Mon, 2014-07-28 at 14:23 +1000, Alexey Kardashevskiy wrote:
> On 07/28/2014 10:43 AM, Benjamin Herrenschmidt wrote:
> > On Thu, 2014-07-24 at 18:47 +1000, Alexey Kardashevskiy wrote:
> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >> ---
> >
> > You need a description.
> >
> >> arch/powerpc/kvm/book3s_64_vio.c | 35 ++++++++++++++++++++++++++++++++++-
> >> 1 file changed, 34 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
> >> index 516f2ee..48b7ed4 100644
> >> --- a/arch/powerpc/kvm/book3s_64_vio.c
> >> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> >> @@ -45,18 +45,48 @@ static long kvmppc_stt_npages(unsigned long window_size)
> >> * sizeof(u64), PAGE_SIZE) / PAGE_SIZE;
> >> }
> >>
> >> +/*
> >> + * Checks ulimit in order not to let the user space to pin all
> >> + * available memory for TCE tables.
> >> + */
> >> +static long kvmppc_account_memlimit(long npages)
> >> +{
> >> + unsigned long ret = 0, locked, lock_limit;
> >> +
> >> + if (!current->mm)
> >> + return -ESRCH; /* process exited */
> >> +
> >> + down_write(&current->mm->mmap_sem);
> >> + locked = current->mm->locked_vm + npages;
> >> + lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> >> + if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
> >> + pr_warn("RLIMIT_MEMLOCK (%ld) exceeded\n",
> >> + rlimit(RLIMIT_MEMLOCK));
> >> + ret = -ENOMEM;
> >> + } else {
> >> + current->mm->locked_vm += npages;
> >> + }
> >> + up_write(&current->mm->mmap_sem);
> >> +
> >> + return ret;
> >> +}
> >> +
> >> static void release_spapr_tce_table(struct kvmppc_spapr_tce_table *stt)
> >> {
> >> struct kvm *kvm = stt->kvm;
> >> int i;
> >> + long npages = kvmppc_stt_npages(stt->window_size);
> >>
> >> mutex_lock(&kvm->lock);
> >> list_del(&stt->list);
> >> - for (i = 0; i < kvmppc_stt_npages(stt->window_size); i++)
> >> + for (i = 0; i < npages; i++)
> >> __free_page(stt->pages[i]);
> >> +
> >> kfree(stt);
> >> mutex_unlock(&kvm->lock);
> >>
> >> + kvmppc_account_memlimit(-(npages + 1));
> >> +
> >> kvm_put_kvm(kvm);
> >> }
> >>
> >> @@ -112,6 +142,9 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
> >> }
> >>
> >> npages = kvmppc_stt_npages(args->window_size);
> >> + ret = kvmppc_account_memlimit(npages + 1);
> >> + if (ret)
> >> + goto fail;
> >
> > This is called for VFIO only or is it also called when creating TCE
> > tables for emulated devices ? Because in the latter case, you don't
> > want to account the pages as locked, do you ?
>
> At the moment TCE-containing pages (for emulated TCE) are allocated with
> alloc_page() which is kernel memory and therefore always locked, no?
So the npages up there is the number of TCE-containing pages, not the
number of mapped-by-TCE pages ? In that case it makes sense yes.
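To make the distinction concrete, here is a rough user-space sketch of how many pages one TCE table occupies. It assumes one 8-byte TCE per 4K IOMMU page (a TCE_SHIFT of 12) and takes the host page size as a parameter. The helper name tce_table_pages() and those constants are assumptions for illustration, not taken from the patch, and the unexplained +1 is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: each TCE maps one 4K IOMMU page. */
#define TCE_SHIFT 12u

/*
 * Number of host pages needed to hold the TCEs of one DMA window.
 * This counts TCE-containing pages, not the pages mapped by the TCEs.
 */
static uint64_t tce_table_pages(uint64_t window_size, uint64_t page_size)
{
	uint64_t bytes;

	assert(page_size != 0);

	/* One 8-byte entry per IOMMU page in the window. */
	bytes = (window_size >> TCE_SHIFT) * sizeof(uint64_t);

	/* Round up to whole host pages. */
	return (bytes + page_size - 1) / page_size;
}
```

Under those assumptions, tce_table_pages(2ULL << 30, 65536) evaluates to 64, so a 2G window needs 64 pages of 64K to hold its TCEs.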
>
> > Also, you need to explain what the +1 is for.
> >
> > Finally, do I correctly deduce that creating 10 TCE tables of 2G
> > each will end up accounting 20G as locked even if the guest for
> > example only has 4G of RAM ?
>
>
> The user is required to set the limit to 20G, correct. But this does not
> mean all 20G will be pinned. Ugly but better than nothing. As I remember
> from your explanations, even if we give up real/virtual mode handlers for
> H_PUT_TCE & Co, we cannot rely on existing counters in page struct in order
> to understand whether we need to account a page again or not, so we are
> stuck with this code till we have a "clone DDW window" API.
Right but please put that explanation somewhere in one of the changeset
comments or as comments near the code.
> But this patch is not about guest pages, it is about pages with TCEs, there
> was no counting for this at all.
Ok.
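For reference, the check being discussed can be modelled in user space roughly as below. This is only a sketch: struct mem_account and account_pages() are hypothetical stand-ins for mm->locked_vm, rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT and capable(CAP_IPC_LOCK). There is no locking, and unlike the patch it uses a signed return value and skips the limit check when pages are released.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the per-mm counters. */
struct mem_account {
	unsigned long locked_pages; /* models mm->locked_vm */
	unsigned long limit_pages;  /* models the memlock limit in pages */
	bool can_ipc_lock;          /* models capable(CAP_IPC_LOCK) */
};

/*
 * Charge (npages > 0) or release (npages < 0) TCE-containing pages.
 * Returns 0 on success or -ENOMEM if charging would exceed the limit.
 */
static long account_pages(struct mem_account *acct, long npages)
{
	unsigned long n;

	assert(acct != NULL);

	if (npages > 0) {
		unsigned long want = acct->locked_pages + (unsigned long)npages;

		if (want > acct->limit_pages && !acct->can_ipc_lock)
			return -ENOMEM;
		acct->locked_pages = want;
		return 0;
	}

	/* Releasing never fails; clamp so the counter cannot underflow. */
	n = 0UL - (unsigned long)npages;
	if (n > acct->locked_pages)
		n = acct->locked_pages;
	acct->locked_pages -= n;
	return 0;
}
```

The caller charges the pages before allocating the table and releases the same count when the table is freed, which mirrors the npages + 1 / -(npages + 1) pairing in the patch.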
>
>
> >
> >> stt = kzalloc(sizeof(*stt) + npages * sizeof(struct page *),
> >> GFP_KERNEL);
> >
> > Ben.
> >
> >
>
>