From: Andrea Arcangeli <aarcange@redhat.com>
To: prakash sangappa <prakash.sangappa@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-api@vger.kernel.org, John Stultz <john.stultz@linaro.org>
Subject: Re: [RFC PATCH] userfaultfd: Add feature to request for a signal delivery
Date: Tue, 4 Jul 2017 18:40:34 +0200
Message-ID: <20170704164034.GH5738@redhat.com>
In-Reply-To: <5956F2EC.1000805@oracle.com>

On Fri, Jun 30, 2017 at 05:55:08PM -0700, prakash sangappa wrote:
> Interesting that UFFDIO_COPY is faster than fallocate().  In the DB use case
> the page does not need to be allocated at the time a process trips on the
> hugetlbfs file hole and receives SIGBUS.  fallocate() is called on the
> hugetlbfs file, when more memory needs to be allocated, by a separate process.

The major difference is that with UFFDIO_COPY the hugepage will be
immediately mapped into the virtual address without requiring any
further minor fault. So it's ideal if you could arrange to call
UFFDIO_COPY from the same process that is going to touch and use the
hugetlbfs data immediately after. You would eliminate a minor fault
that way.

UFFDIO_COPY at least for anon was measured to perform better than a
regular page fault too.

> Regarding the hugetlbfs mount option, one consideration is to allow mounts of
> hugetlbfs inside a user namespace's mount namespace, which would allow
> non-privileged processes to mount hugetlbfs for use inside a user namespace.
> This may be needed even for the 'min_size' mount option, with which an
> application could reserve huge pages and mount a filesystem for its use
> without needing privileges, given the system has enough hugepages
> configured.  It seems that if non-privileged processes are allowed to mount
> a hugetlbfs filesystem, then min_size should be subject to some resource
> limits.
>
> Mounting inside a user namespace will be a different patch proposal later.

There's no particular reason to make UFFDIO_FEATURE_SIGBUS a
privileged op unless we want to eliminate the branch with the static
key, so it's certainly simpler than dealing with hugetlbfs min_size
reserves.

I'm positive about the UFFDIO_FEATURE_SIGBUS tradeoffs, but others
feel free to comment.

If you could make a second patch that extends the selftest to exercise
and validate UFFDIO_FEATURE_SIGBUS in anon/shmem/hugetlbfs, it'd be great.

Thanks,
Andrea
