From: "Kirill A. Shutemov" <kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org>
To: Mark Williamson
	<mwilliamson-/4lU09Eg6ahx67MzidHQgQC/G2K4zDHf@public.gmane.org>
Cc: Mark Seaborn <mseaborn-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	kernel list
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"Kirill A. Shutemov"
	<kirill.shutemov-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
	Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
	Konstantin Khlebnikov
	<khlebnikov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Finn Grimwood
	<fgrimwood-/4lU09Eg6ahx67MzidHQgQC/G2K4zDHf@public.gmane.org>,
	Daniel James
	<djames-/4lU09Eg6ahx67MzidHQgQC/G2K4zDHf@public.gmane.org>
Subject: Re: Regression: Requiring CAP_SYS_ADMIN for /proc/<pid>/pagemap causes application-level breakage
Date: Wed, 29 Apr 2015 22:36:22 +0300
Message-ID: <20150429193622.GA11892@node.dhcp.inet.fi>
In-Reply-To: <CAEVpBa+DPbhZeRsA=+Jnn7f1BqVZm9SU=ABjy6bAHbbMi_yn+g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Wed, Apr 29, 2015 at 07:44:57PM +0100, Mark Williamson wrote:
> Hi all,
> 
> We've been investigating further and found a snag with the PFN-hiding
> approach discussed last week - looks like it won't be enough on all
> the architectures we support.  Our product runs on x86_32, x86_64 and
> ARM.  For now, it looks like soft-dirty is only available on x86_64.
> A patch that simply zeros out the physical addresses in
> /proc/PID/pagemap will therefore help us on x86_64 but we'll still
> have problems on other platforms[1].
> 
> For context, we were previously using pagemap as a cross-platform way
> to get soft-dirty-like functionality.  Specifically, to ask "did a
> process write to any pages since fork()" by comparing addresses and
> deducing where CoW must have occurred.  In the absence of soft-dirty
> and the physical addresses, it looks like we can't figure that out
> with the remaining information in pagemap.
> 
> If the pagemap file included the "writeable" bit from the PTE, we
> think we'd have all the information required to deduce what we need
> (although I realise that's a bit of a nasty workaround).  If I
> proposed including the PTE protection bits in pagemap, would that be
> controversial?  I'm guessing yes but thought it was worth a shot ;-)
> Would anybody be able to suggest a more tasteful approach?
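
For readers unfamiliar with the interface, the address-comparison approach described in the quoted text can be sketched roughly as follows. This is a minimal illustration, not the poster's actual tool: the bit positions follow the kernel's pagemap documentation (bits 0-54 PFN when present, bit 55 soft-dirty, bit 62 swapped, bit 63 present), while `decode_entry`, `read_entry`, `cow_broken` and the 4 KiB `PAGE_SIZE` are hypothetical names and assumptions introduced here.

```python
import struct

PAGE_SIZE = 4096   # assumption: 4 KiB pages; os.sysconf("SC_PAGE_SIZE") gives the real value
ENTRY_SIZE = 8     # each pagemap entry is one 64-bit word

PFN_MASK = (1 << 55) - 1   # bits 0-54: page frame number (valid only when present)
SOFT_DIRTY = 1 << 55       # bit 55: soft-dirty
SWAPPED = 1 << 62          # bit 62: page is swapped out
PRESENT = 1 << 63          # bit 63: page is present in RAM


def decode_entry(entry):
    """Split one 64-bit pagemap entry into the fields used below."""
    present = bool(entry & PRESENT)
    return {
        "present": present,
        "swapped": bool(entry & SWAPPED),
        "soft_dirty": bool(entry & SOFT_DIRTY),
        "pfn": (entry & PFN_MASK) if present else None,
    }


def read_entry(pid, vaddr):
    """Hypothetical helper: read the entry covering vaddr from /proc/<pid>/pagemap."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * ENTRY_SIZE)
        data = f.read(ENTRY_SIZE)
    return struct.unpack("=Q", data)[0]   # native byte order, 64-bit unsigned


def cow_broken(parent_entry, child_entry):
    """True if both pages are present and backed by different frames.

    Returns None when either page is not present, since no PFN is
    available to compare in that case.
    """
    p, c = decode_entry(parent_entry), decode_entry(child_entry)
    if not (p["present"] and c["present"]):
        return None
    return p["pfn"] != c["pfn"]
```

If the PFN field is zeroed for unprivileged readers, `cow_broken` sees identical PFNs everywhere and always reports False, which is the breakage being discussed.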

Hm. I have a hard time understanding how the writable bit is enough to
get soft-dirty-like functionality.

Let's say we have an anonymous mapping with COW set up after fork(). Its
PTEs are non-writable so that writes trigger COW on write-protect faults.
But you can easily get back to the same non-writable PTE after breaking
COW: fork() again, or mprotect(PROT_READ) followed by
mprotect(PROT_READ|PROT_WRITE).

?
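
The objection above can be illustrated with a toy model. This is only a sketch of the argument, not kernel code: the `Pte` class and the `fork`, `store`, `mprotect` and `replay` functions are hypothetical, and the rule that a PTE stays read-only after mprotect() restores write permission is taken from the message rather than verified against any particular kernel version.

```python
from dataclasses import dataclass


@dataclass
class Pte:
    writable: bool = True        # the bit proposed for export via pagemap
    vma_writable: bool = True    # mapping-level permission set by mprotect()
    written: bool = False        # ground truth: written since the first fork()


def fork(pte):
    # fork() write-protects private anonymous PTEs to set up COW.
    pte.writable = False


def store(pte):
    # A write to a read-only PTE faults; breaking COW makes the PTE writable.
    if not pte.vma_writable:
        raise PermissionError("mapping is read-only")
    pte.writable = True
    pte.written = True


def mprotect(pte, writable):
    pte.vma_writable = writable
    if not writable:
        pte.writable = False
    # Assumption from the message: restoring PROT_WRITE leaves the PTE
    # read-only until the next write fault.


def replay(ops):
    """Apply a sequence of operations to a fresh PTE and return it."""
    pte = Pte()
    for op in ops:
        op(pte)
    return pte


untouched = replay([fork])
refork = replay([fork, store, fork])
reprotect = replay([
    fork,
    store,
    lambda p: mprotect(p, False),
    lambda p: mprotect(p, True),
])

for name, pte in [("untouched", untouched), ("refork", refork), ("reprotect", reprotect)]:
    print(name, "writable:", pte.writable, "written:", pte.written)
```

All three histories end with a non-writable PTE, yet only the first page is unmodified, so in this model the writable bit alone cannot distinguish them.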

> 
> Thanks,
> Mark
> 
> [1] I'd note that using soft-dirty is clearly the right approach for
> us on x64, where available and that ideally we'd use it on other
> architectures - cross-arch support for soft-dirty is a slightly
> different discussion, which I hope to post another thread for.

-- 
 Kirill A. Shutemov

