From: christoffer.dall@linaro.org (Christoffer Dall)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH RESEND v15 07/10] KVM: arm: page logging 2nd stage fault handling
Date: Thu, 15 Jan 2015 11:20:32 +0100 [thread overview]
Message-ID: <20150115102032.GN26222@cbox> (raw)
In-Reply-To: <54B6F753.8000101@samsung.com>
On Wed, Jan 14, 2015 at 03:10:11PM -0800, Mario Smarduch wrote:
> On 01/14/2015 02:32 AM, Christoffer Dall wrote:
> > On Tue, Jan 13, 2015 at 03:14:47PM -0800, Mario Smarduch wrote:
> >
> > [...]
> >
> >>>>>
> >>>>>
> >>>>> What I meant last time around concerning user_mem_abort was more
> >>>>> something like this:
> >>>>>
> >>>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >>>>> index 1dc9778..38ea58e 100644
> >>>>> --- a/arch/arm/kvm/mmu.c
> >>>>> +++ b/arch/arm/kvm/mmu.c
> >>>>> @@ -935,7 +935,14 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >>>>> return -EFAULT;
> >>>>> }
> >>>>>
> >>>>> - if (is_vm_hugetlb_page(vma)) {
> >>>>> + /*
> >>>>> + * Writes to pages in a memslot with logging enabled are always logged
> >>>>> + * on a singe page-by-page basis.
> >>>>> + */
> >>>>> + if (memslot_is_logging(memslot) && write_fault)
> >>>>> + force_pte = true;
> >>>>
> >>>> If it's a write you take the pte route and
> >>>> dissolves huge page, if it's a read you reconstruct the
> >>>> THP that seems to yield pretty bad results.
> >>>
> >>> ok, then remove the ' && write_fault' part of the clause.
> >> Hi Christoffer,
> >> couple comments/questions.
> >>
> >> setting force_pte here, disables huge pages for
> >> non-writable regions.
> >>
> >
>
> Hi Christoffer,
> another round, although I'll go ahead and post another
> iteration, sorry but as you mentioned this code is
> important.
>
> > hmmm, by a non-writable region you mean a read-only memslot? Can you set
> > dirty page logging for such one? That doesn't make much sense to me.
>
> Come to think of it that's true.
>
> It's bit fuzzyy when I was looking at the API for KVM_MEM_LOG_DIRTY_PAGES,
> it appears user space needs to check if region is read-only and set region
> size to 0(qemu). I don't see any checks in kernel to disable logging if
> region is read only and we're enabling dirty page logging. API doesn't say
> anything else. You may be able to enable logging
> for read-only region if you leave region size as is.
>
> I guess this has been around for quite a while so we
> can just assume read-only slots will have logging disabled.
>
I'm a bit confused, IIUC we don't make any explicit checks in the code
right now, so either there is some generic code that never sets the
logging flag on a read-only memregion or you can change the
implementation of memslot_is_logging() to return false if the memslot is
read-only.
> >
> > Note, that if you receive writable == false from gfn_to_pfn_prot() that
> > doesn't mean that the page can never be written to, it just means that
> > the current mapping of the page is not a writable one, you can call that
> > same function again later with write_fault=true, and you either get a
> > writable page back or you simply get an error.
>
> Yes that's true after studying hva_to_pfn_slow(),
> and __get_user_pages_fast(), a lot of conditions
> handled there.
>
> >
> > [...]
> >
> >>>>> if (kvm_is_device_pfn(pfn))
> >>>>> mem_type = PAGE_S2_DEVICE;
> >>
> >> If we're not setting the IOMAP flag do we have need
> >> this, since we're forfeiting error checking later
> >> in stage2_set_pte()?
> >>
> >
> > we still need this, remember the error checking is about
> > cache == NULL, not about the IOMAP flag. I think I address this in the
> > new proposal below, but please check carefully.
>
> Ok so mmu notifier may call stage2_set_pte() with
> null cache poiner and intermediate table entries may
> not be there so stage2_get_pud() may return NULL.
> With logging on it won't happen, but just in case
> we check.
yes, to easily catch programming errors in the future, that's why if you
make it a VM_BUG_ON the check won't be compiled unless you have kernel
memory debugging enabled.
>
> And we'll continue to handle Device faults until
> further notice.
>
yes.
> >
> > Take a look at this one:
>
> Looks good to me, concise.
>
Thanks,
-Christoffer
WARNING: multiple messages have this Message-ID (diff)
From: Christoffer Dall <christoffer.dall@linaro.org>
To: Mario Smarduch <m.smarduch@samsung.com>
Cc: marc.zyngier@arm.com, pbonzini@redhat.com,
kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH RESEND v15 07/10] KVM: arm: page logging 2nd stage fault handling
Date: Thu, 15 Jan 2015 11:20:32 +0100 [thread overview]
Message-ID: <20150115102032.GN26222@cbox> (raw)
In-Reply-To: <54B6F753.8000101@samsung.com>
On Wed, Jan 14, 2015 at 03:10:11PM -0800, Mario Smarduch wrote:
> On 01/14/2015 02:32 AM, Christoffer Dall wrote:
> > On Tue, Jan 13, 2015 at 03:14:47PM -0800, Mario Smarduch wrote:
> >
> > [...]
> >
> >>>>>
> >>>>>
> >>>>> What I meant last time around concerning user_mem_abort was more
> >>>>> something like this:
> >>>>>
> >>>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >>>>> index 1dc9778..38ea58e 100644
> >>>>> --- a/arch/arm/kvm/mmu.c
> >>>>> +++ b/arch/arm/kvm/mmu.c
> >>>>> @@ -935,7 +935,14 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >>>>> return -EFAULT;
> >>>>> }
> >>>>>
> >>>>> - if (is_vm_hugetlb_page(vma)) {
> >>>>> + /*
> >>>>> + * Writes to pages in a memslot with logging enabled are always logged
> >>>>> + * on a singe page-by-page basis.
> >>>>> + */
> >>>>> + if (memslot_is_logging(memslot) && write_fault)
> >>>>> + force_pte = true;
> >>>>
> >>>> If it's a write you take the pte route and
> >>>> dissolves huge page, if it's a read you reconstruct the
> >>>> THP that seems to yield pretty bad results.
> >>>
> >>> ok, then remove the ' && write_fault' part of the clause.
> >> Hi Christoffer,
> >> couple comments/questions.
> >>
> >> setting force_pte here, disables huge pages for
> >> non-writable regions.
> >>
> >
>
> Hi Christoffer,
> another round, although I'll go ahead and post another
> iteration, sorry but as you mentioned this code is
> important.
>
> > hmmm, by a non-writable region you mean a read-only memslot? Can you set
> > dirty page logging for such one? That doesn't make much sense to me.
>
> Come to think of it that's true.
>
> It's bit fuzzyy when I was looking at the API for KVM_MEM_LOG_DIRTY_PAGES,
> it appears user space needs to check if region is read-only and set region
> size to 0(qemu). I don't see any checks in kernel to disable logging if
> region is read only and we're enabling dirty page logging. API doesn't say
> anything else. You may be able to enable logging
> for read-only region if you leave region size as is.
>
> I guess this has been around for quite a while so we
> can just assume read-only slots will have logging disabled.
>
I'm a bit confused, IIUC we don't make any explicit checks in the code
right now, so either there is some generic code that never sets the
logging flag on a read-only memregion or you can change the
implementation of memslot_is_logging() to return false if the memslot is
read-only.
> >
> > Note, that if you receive writable == false from gfn_to_pfn_prot() that
> > doesn't mean that the page can never be written to, it just means that
> > the current mapping of the page is not a writable one, you can call that
> > same function again later with write_fault=true, and you either get a
> > writable page back or you simply get an error.
>
> Yes that's true after studying hva_to_pfn_slow(),
> and __get_user_pages_fast(), a lot of conditions
> handled there.
>
> >
> > [...]
> >
> >>>>> if (kvm_is_device_pfn(pfn))
> >>>>> mem_type = PAGE_S2_DEVICE;
> >>
> >> If we're not setting the IOMAP flag do we have need
> >> this, since we're forfeiting error checking later
> >> in stage2_set_pte()?
> >>
> >
> > we still need this, remember the error checking is about
> > cache == NULL, not about the IOMAP flag. I think I address this in the
> > new proposal below, but please check carefully.
>
> Ok so mmu notifier may call stage2_set_pte() with
> null cache poiner and intermediate table entries may
> not be there so stage2_get_pud() may return NULL.
> With logging on it won't happen, but just in case
> we check.
yes, to easily catch programming errors in the future, that's why if you
make it a VM_BUG_ON the check won't be compiled unless you have kernel
memory debugging enabled.
>
> And we'll continue to handle Device faults until
> further notice.
>
yes.
> >
> > Take a look at this one:
>
> Looks good to me, concise.
>
Thanks,
-Christoffer
next prev parent reply other threads:[~2015-01-15 10:20 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-10 4:17 [PATCH RESEND v15 07/10] KVM: arm: page logging 2nd stage fault handling Mario Smarduch
2015-01-10 4:17 ` Mario Smarduch
2015-01-11 14:00 ` Christoffer Dall
2015-01-11 14:00 ` Christoffer Dall
2015-01-12 16:27 ` Mario Smarduch
2015-01-12 16:27 ` Mario Smarduch
2015-01-12 17:49 ` Christoffer Dall
2015-01-12 17:49 ` Christoffer Dall
2015-01-12 19:04 ` Mario Smarduch
2015-01-12 19:04 ` Mario Smarduch
2015-01-12 19:43 ` Christoffer Dall
2015-01-12 19:43 ` Christoffer Dall
2015-01-13 17:42 ` Mario Smarduch
2015-01-13 17:42 ` Mario Smarduch
2015-01-14 10:33 ` Christoffer Dall
2015-01-14 10:33 ` Christoffer Dall
2015-01-13 23:14 ` Mario Smarduch
2015-01-13 23:14 ` Mario Smarduch
2015-01-14 10:32 ` Christoffer Dall
2015-01-14 10:32 ` Christoffer Dall
2015-01-14 23:10 ` Mario Smarduch
2015-01-14 23:10 ` Mario Smarduch
2015-01-15 10:20 ` Christoffer Dall [this message]
2015-01-15 10:20 ` Christoffer Dall
-- strict thread matches above, loose matches on Subject: below --
2015-01-15 2:51 Mario Smarduch
2015-01-15 2:51 ` Mario Smarduch
2015-01-15 10:55 ` Christoffer Dall
2015-01-15 10:55 ` Christoffer Dall
2015-01-09 1:42 Mario Smarduch
2015-01-09 1:42 ` Mario Smarduch
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150115102032.GN26222@cbox \
--to=christoffer.dall@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.