From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Paul Mackerras <paulus@samba.org>,
linuxppc-dev@lists.ozlabs.org,
Gavin Shan <gwshan@linux.vnet.ibm.com>
Subject: Re: [PATCH v3 03/18] KVM: PPC: Account TCE pages in locked_vm
Date: Mon, 28 Jul 2014 14:34:33 +1000
Message-ID: <1406522073.4935.46.camel@pasglop>
In-Reply-To: <53D5D04F.7000507@ozlabs.ru>
On Mon, 2014-07-28 at 14:23 +1000, Alexey Kardashevskiy wrote:
> On 07/28/2014 10:43 AM, Benjamin Herrenschmidt wrote:
> > On Thu, 2014-07-24 at 18:47 +1000, Alexey Kardashevskiy wrote:
> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >> ---
> >
> > You need a description.
> >
> >> arch/powerpc/kvm/book3s_64_vio.c | 35 ++++++++++++++++++++++++++++++++++-
> >> 1 file changed, 34 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
> >> index 516f2ee..48b7ed4 100644
> >> --- a/arch/powerpc/kvm/book3s_64_vio.c
> >> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> >> @@ -45,18 +45,48 @@ static long kvmppc_stt_npages(unsigned long window_size)
> >> * sizeof(u64), PAGE_SIZE) / PAGE_SIZE;
> >> }
> >>
> >> +/*
> >> + * Checks ulimit in order not to let the user space to pin all
> >> + * available memory for TCE tables.
> >> + */
> >> +static long kvmppc_account_memlimit(long npages)
> >> +{
> >> + unsigned long ret = 0, locked, lock_limit;
> >> +
> >> + if (!current->mm)
> >> + return -ESRCH; /* process exited */
> >> +
> >> + down_write(&current->mm->mmap_sem);
> >> + locked = current->mm->locked_vm + npages;
> >> + lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> >> + if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
> >> + pr_warn("RLIMIT_MEMLOCK (%ld) exceeded\n",
> >> + rlimit(RLIMIT_MEMLOCK));
> >> + ret = -ENOMEM;
> >> + } else {
> >> + current->mm->locked_vm += npages;
> >> + }
> >> + up_write(&current->mm->mmap_sem);
> >> +
> >> + return ret;
> >> +}
> >> +
> >> static void release_spapr_tce_table(struct kvmppc_spapr_tce_table *stt)
> >> {
> >> struct kvm *kvm = stt->kvm;
> >> int i;
> >> + long npages = kvmppc_stt_npages(stt->window_size);
> >>
> >> mutex_lock(&kvm->lock);
> >> list_del(&stt->list);
> >> - for (i = 0; i < kvmppc_stt_npages(stt->window_size); i++)
> >> + for (i = 0; i < npages; i++)
> >> __free_page(stt->pages[i]);
> >> +
> >> kfree(stt);
> >> mutex_unlock(&kvm->lock);
> >>
> >> + kvmppc_account_memlimit(-(npages + 1));
> >> +
> >> kvm_put_kvm(kvm);
> >> }
> >>
> >> @@ -112,6 +142,9 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
> >> }
> >>
> >> npages = kvmppc_stt_npages(args->window_size);
> >> + ret = kvmppc_account_memlimit(npages + 1);
> >> + if (ret)
> >> + goto fail;
> >
> > Is this called for VFIO only, or is it also called when creating TCE
> > tables for emulated devices? Because in the latter case, you don't
> > want to account the pages as locked, do you?
>
> At the moment TCE-containing pages (for emulated TCE) are allocated with
> alloc_page() which is kernel memory and therefore always locked, no?

So the npages up there is the number of TCE-containing pages, not the
number of mapped-by-TCE pages? In that case it makes sense, yes.

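For scale, here is a rough userspace sketch of that arithmetic. It is
not the kernel code: tce_table_pages() and TCE_SHIFT are made-up names,
and it assumes 4K IOMMU pages and 8-byte TCEs (sizeof(u64), as in
kvmppc_stt_npages() above). The host page size is passed in explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4K IOMMU pages, i.e. one TCE per 4K of DMA window. */
#define TCE_SHIFT 12

/*
 * Hypothetical helper: number of host pages needed to hold the TCE
 * table for a DMA window of window_size bytes.
 */
static uint64_t tce_table_pages(uint64_t window_size, uint64_t host_page_size)
{
	assert(host_page_size != 0);

	uint64_t entries = window_size >> TCE_SHIFT;  /* one TCE per IOMMU page */
	uint64_t bytes = entries * sizeof(uint64_t);  /* 8 bytes per TCE */

	/* Round up to whole host pages. */
	return (bytes + host_page_size - 1) / host_page_size;
}
```

Under those assumptions a 2G window needs 524288 TCEs, i.e. a 4M table:
1024 pages with 4K host pages, or 64 pages with 64K host pages. That is
what npages counts here, before the extra +1 page the patch adds.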
>
> > Also, you need to explain what the +1 is for.
> >
> > Finally, do I correctly deduce that creating 10 TCE tables of 2G
> > each will end up accounting 20G as locked even if the guest for
> > example only has 4G of RAM ?
>
>
> The user is required to set the limit to 20G, correct. But this does not
> mean all 20G will be pinned. Ugly but better than nothing. As I remember
> from your explanations, even if we give up real/virtual mode handlers for
> H_PUT_TCE & Co, we cannot rely on existing counters in page struct in order
> to understand whether we need to account a page again or not so we are
> stuck with this code till we have a "clone DDW window" API.

Right, but please put that explanation in one of the changeset
comments or in a comment near the code.

> But this patch is not about guest pages, it is about pages with TCEs, there
> was no counting for this at all.
Ok.
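
To make the charge/uncharge behaviour discussed above easier to follow,
here is a minimal userspace model of the check. It is only a sketch:
struct mm_model and account_memlimit() are hypothetical stand-ins for
current->mm, rlimit(RLIMIT_MEMLOCK) and capable(CAP_IPC_LOCK), and there
is no locking.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the kernel helper reads. */
struct mm_model {
	long locked_vm;   /* pages currently accounted as locked */
	long lock_limit;  /* models RLIMIT_MEMLOCK, expressed in pages */
	bool ipc_lock;    /* models capable(CAP_IPC_LOCK) */
};

/*
 * Charge npages against the limit; a negative npages uncharges.
 * Returns 0 on success, -ESRCH without an mm, -ENOMEM over the limit.
 */
static long account_memlimit(struct mm_model *mm, long npages)
{
	long locked;

	if (!mm)
		return -ESRCH;

	locked = mm->locked_vm + npages;
	if (locked > mm->lock_limit && !mm->ipc_lock)
		return -ENOMEM;  /* counter is left unchanged */

	mm->locked_vm = locked;
	return 0;
}
```

In this model the create path would charge npages + 1 and the release
path would pass -(npages + 1). Note that the limit check also runs on
the uncharge path, so if the counter is still above the limit after
subtracting, the uncharge is refused as well.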
>
>
> >
> >> stt = kzalloc(sizeof(*stt) + npages * sizeof(struct page *),
> >> GFP_KERNEL);
> >
> > Ben.
> >
> >
>
>