Re: [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics

From: Andrea Arcangeli <aarcange@redhat.com>
To: Jerome Glisse <j.glisse@gmail.com>
Cc: Haggai Eran <haggaie@mellanox.com>,
	Mike Rapoport <mike.rapoport@ravellosystems.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Izik Eidus <izik.eidus@ravellosystems.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Sagi Grimberg <sagig@mellanox.com>,
	Shachar Raindel <raindel@mellanox.com>
Subject: Re: [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics
Date: Wed, 2 Apr 2014 18:43:24 +0200
Message-ID: <20140402164324.GK1500@redhat.com>
In-Reply-To: <20140402151825.GA3614@gmail.com>

Hi,

On Wed, Apr 02, 2014 at 11:18:27AM -0400, Jerome Glisse wrote:
> This would imply either scanning all mmu_notifiers currently registered or
> having a global flag for the mm to know if there is one mmu_notifier without
> change_pte. Moreover this would mean that kvm would remain "broken" if one
> of the mmu notifiers does not have the change_pte callback.
> 
> The solution I have in mind, which is part of a patchset I am working on,
> just involves passing along an enum value to the mmu notifier callback. The
> enum value would tell which exact event actually triggered the
> mmu notifier call (vmscan, migrate, ksm, ...). Knowing this, kvm could then
> simply ignore invalidate_range_start/end for events for which it knows it
> will get a change_pte callback.
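
The enum idea quoted above can be sketched in plain userspace C. This is
only a toy model under stated assumptions: the enum values, the struct and
the function names below are hypothetical and are not taken from any kernel
tree or from the patchset mentioned.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event codes naming what triggered the notifier call. */
enum mmu_event {
    MMU_EVENT_VMSCAN,
    MMU_EVENT_MIGRATE,
    MMU_EVENT_KSM,      /* assumed to be followed by a change_pte call */
};

/* Toy listener state: counts how often each callback did real work. */
struct listener_stats {
    int invalidations;
    int pte_changes;
};

/* True for events that the listener knows will deliver change_pte. */
static bool event_has_change_pte(enum mmu_event ev)
{
    return ev == MMU_EVENT_KSM;
}

/* KVM-like listener: skip the invalidate when change_pte will follow. */
static void kvm_like_invalidate_range_start(struct listener_stats *s,
                                            enum mmu_event ev)
{
    if (event_has_change_pte(ev))
        return;
    s->invalidations++;
}

static void kvm_like_change_pte(struct listener_stats *s)
{
    s->pte_changes++;
}

/* A KSM-triggered update reaches the listener only through change_pte. */
static void demo_ksm_event(void)
{
    struct listener_stats s = { 0, 0 };

    kvm_like_invalidate_range_start(&s, MMU_EVENT_KSM);
    kvm_like_change_pte(&s);
    assert(s.invalidations == 0 && s.pte_changes == 1);
}
```

Note the conditional inside the invalidate callback: each listener has to
test the event code on every call.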

That sounds similar to adding two new companion methods to change_pte
that, if not implemented, would fall back to
invalidate_range_start/end? And KVM would implement those as no-ops? It
would add bytes to the mmu notifier structure but it wouldn't add
branches.
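
A rough userspace sketch of the companion-method variant, assuming the
fallback is resolved once at registration time so that the notify path
itself needs no conditionals. The names change_pte_start/change_pte_end,
the struct layout and the register helper are all hypothetical, and real
kernel ops tables are const, so this only models the idea:

```c
#include <assert.h>
#include <stddef.h>

struct notifier;

struct notifier_ops {
    void (*invalidate_range_start)(struct notifier *n);
    void (*invalidate_range_end)(struct notifier *n);
    void (*change_pte)(struct notifier *n);
    /* Hypothetical companions bracketing change_pte. */
    void (*change_pte_start)(struct notifier *n);
    void (*change_pte_end)(struct notifier *n);
};

struct notifier {
    struct notifier_ops ops;
    int invalidations;
    int pte_changes;
};

static void generic_start(struct notifier *n) { n->invalidations++; }
static void generic_end(struct notifier *n)   { (void)n; }
static void noop(struct notifier *n)          { (void)n; }
static void count_change_pte(struct notifier *n) { n->pte_changes++; }

/* Fill in missing methods once, so callers never test for NULL. */
static void notifier_register(struct notifier *n)
{
    assert(n->ops.invalidate_range_start && n->ops.invalidate_range_end);
    if (!n->ops.change_pte_start)
        n->ops.change_pte_start = n->ops.invalidate_range_start;
    if (!n->ops.change_pte_end)
        n->ops.change_pte_end = n->ops.invalidate_range_end;
    if (!n->ops.change_pte)
        n->ops.change_pte = noop;
}

/* Branch-free notify path for a PTE change. */
static void notify_pte_change(struct notifier *n)
{
    n->ops.change_pte_start(n);
    n->ops.change_pte(n);
    n->ops.change_pte_end(n);
}
```

A KVM-like listener would set both companions to noop and keep its real
change_pte, while a listener that implements neither ends up with the
plain invalidate_range_start/end behaviour.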

It's not urgent but we clearly need to do something about this, or we
should drop change_pte entirely, because currently it does nothing and
only wastes CPU cycles.

Removing change_pte these days isn't a showstopper because the KVM page
fault handler has become smarter lately. In the old days, lacking
change_pte meant that the guest would break KSM COW if it ever accessed
the page in readonly (sharable) mode. Back then change_pte was a
fundamental optimization for using KSM, as it avoided all KSM pages
being COWed immediately after being merged.

These days reading from guest memory backed by KSM won't break COW
even if the spte isn't already established before the KVM fault fires
on the KSM memory. change_pte these days has only the benefit of
avoiding one vmexit/vmenter cycle after the KSM merge and one
vmexit/vmenter cycle after a KSM break COW (the event that triggers if
the guest eventually writes to the page).

KSM merges and KSM COWs aren't very frequent operations (and both
have a significant cost associated with them), so it's uncertain whether
it's worth keeping the change_pte optimization nowadays. Considering it's
already implemented, I personally feel it's worth keeping as a
micro-optimization because a vmexit/vmenter cycle is certainly more
expensive than calling change_pte.

Either what you suggested above (with an enum or two new companion
methods) or the other way Haggai suggested (checking if change_pte is
implemented in all registered mmu notifiers) sounds good to me.
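
For comparison, the "check all registered notifiers" option might look
roughly like the toy model below. The list layout, the has_change_pte
field and the helper names are made up for illustration; a real
implementation would more likely cache the result in a per-mm flag than
walk the list on every call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy notifier entry in a singly linked registration list. */
struct notifier {
    bool has_change_pte;
    int invalidations;
    int pte_changes;
    struct notifier *next;
};

/* True only if every registered notifier implements change_pte. */
static bool all_implement_change_pte(const struct notifier *head)
{
    const struct notifier *n;

    for (n = head; n != NULL; n = n->next)
        if (!n->has_change_pte)
            return false;
    return true;
}

/* Invalidate everyone unless all listeners can rely on change_pte. */
static void notify_pte_change(struct notifier *head)
{
    bool skip_invalidate = all_implement_change_pte(head);
    struct notifier *n;

    for (n = head; n != NULL; n = n->next) {
        if (!skip_invalidate)
            n->invalidations++;
        if (n->has_change_pte)
            n->pte_changes++;
    }
}

static void demo_all_registered_check(void)
{
    struct notifier other = { false, 0, 0, NULL };
    struct notifier kvm_like = { true, 0, 0, NULL };

    /* Only the KVM-like listener is registered: no invalidate needed. */
    notify_pte_change(&kvm_like);
    assert(kvm_like.invalidations == 0 && kvm_like.pte_changes == 1);

    /* One listener lacks change_pte: everybody gets invalidated. */
    kvm_like.next = &other;
    notify_pte_change(&kvm_like);
    assert(kvm_like.invalidations == 1 && kvm_like.pte_changes == 2);
    assert(other.invalidations == 1 && other.pte_changes == 0);
}
```

The second half of the demo is the case Jerome pointed out: a single
listener without change_pte makes the optimization useless for KVM too.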

Thanks,
Andrea

Thread overview: 9+ messages
2014-01-15  9:40 [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics Mike Rapoport
2014-01-22 13:10 ` Andrea Arcangeli
2014-01-22 14:01   ` Haggai Eran
2014-03-30 20:33     ` Jerome Glisse
2014-04-02 12:52       ` Haggai Eran
2014-04-02 15:18         ` Jerome Glisse
2014-04-02 16:43           ` Andrea Arcangeli [this message]
2014-01-22 21:54 ` Andrew Morton
2014-01-22 22:19   ` Andrea Arcangeli
