Re: [RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after page fault

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
To: Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
Cc: Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	linux-alpha-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mips-6z/3iImG2C8G8FEW9MqTrA@public.gmane.org,
	linux-parisc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org,
	sparclinux-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after page fault
Date: Mon, 22 Jun 2015 14:38:26 +0200	[thread overview]
Message-ID: <20150622123826.GF4430@dhcp22.suse.cz> (raw)
In-Reply-To: <20150619164333.GD2329-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>

On Fri 19-06-15 12:43:33, Eric B Munson wrote:
> On Fri, 19 Jun 2015, Michal Hocko wrote:
> 
> > On Thu 18-06-15 16:30:48, Eric B Munson wrote:
> > > On Thu, 18 Jun 2015, Michal Hocko wrote:
> > [...]
> > > > Wouldn't it be much more reasonable and straightforward to have
> > > > MAP_FAULTPOPULATE as a counterpart for MAP_POPULATE which would
> > > > explicitly disallow any form of pre-faulting? It would be usable for
> > > > other usecases than with MAP_LOCKED combination.
> > > 
> > > I don't see a clear case for it being more reasonable, it is one
> > > possible way to solve the problem.
> > 
> > MAP_FAULTPOPULATE would be usable for other cases as well. E.g. fault
> > around is all or nothing feature. Either all mappings (which support
> > this) fault around or none. There is no way to tell the kernel that
> > this particular mapping shouldn't fault around. I haven't seen such a
> > request yet but we have seen requests to have a way to opt out from
> > a global policy in the past (e.g. per-process opt out from THP). So
> > I can imagine somebody will come with a request to opt out from any
> > speculative operations on the mapped area in the future.
> > 
> > > But I think it leaves us in an even
> > > more akward state WRT VMA flags.  As you noted in your fix for the
> > > mmap() man page, one can get into a state where a VMA is VM_LOCKED, but
> > > not present.  Having VM_LOCKONFAULT states that this was intentional, if
> > > we go to using MAP_FAULTPOPULATE instead of MAP_LOCKONFAULT, we no
> > > longer set VM_LOCKONFAULT (unless we want to start mapping it to the
> > > presence of two MAP_ flags).  This can make detecting the MAP_LOCKED +
> > > populate failure state harder.
> > 
> > I am not sure I understand your point here. Could you be more specific
> > how would you check for that and what for?
> 
> My thought on detecting was that someone might want to know if they had
> a VMA that was VM_LOCKED but had not been made present becuase of a
> failure in mmap.  We don't have a way today, but adding VM_LOCKONFAULT
> is at least explicit about what is happening which would make detecting
> the VM_LOCKED but not present state easier. 

One could use /proc/<pid>/pagemap to query the residency.

> This assumes that
> MAP_FAULTPOPULATE does not translate to a VMA flag, but it sounds like
> it would have to.

Yes, it would have to have a VM flag for the vma.

> > From my understanding MAP_LOCKONFAULT is essentially
> > MAP_FAULTPOPULATE|MAP_LOCKED with a quite obvious semantic (unlike
> > single MAP_LOCKED unfortunately). I would love to also have
> > MAP_LOCKED|MAP_POPULATE (aka full mlock semantic) but I am really
> > skeptical considering how my previous attempt to make MAP_POPULATE
> > reasonable went.
> 
> Are you objecting to the addition of the VMA flag VM_LOCKONFAULT, or the
> new MAP_LOCKONFAULT flag (or both)? 

I thought the MAP_FAULTPOPULATE (or any other better name) would
directly translate into VM_FAULTPOPULATE and wouldn't be tight to the
locked semantic. We already have VM_LOCKED for that. The direct effect
of the flag would be to prevent from population other than the direct
page fault - including any speculative actions like fault around or
read-ahead.

> If you prefer that MAP_LOCKED |
> MAP_FAULTPOPULATE means that VM_LOCKONFAULT is set, I am fine with that
> instead of introducing MAP_LOCKONFAULT.  I went with the new flag
> because to date, we have a one to one mapping of MAP_* to VM_* flags.
> 
> > 
> > > If this is the preferred path for mmap(), I am fine with that. 
> > 
> > > However,
> > > I would like to see the new system calls that Andrew mentioned (and that
> > > I am testing patches for) go in as well. 
> > 
> > mlock with flags sounds like a good step but I am not sure it will make
> > sense in the future. POSIX has screwed that and I am not sure how many
> > applications would use it. This ship has sailed long time ago.
> 
> I don't know either, but the code is the question, right?  I know that
> we have at least one team that wants it here.
> 
> > 
> > > That way we give users the
> > > ability to request VM_LOCKONFAULT for memory allocated using something
> > > other than mmap.
> > 
> > mmap(MAP_FAULTPOPULATE); mlock() would have the same semantic even
> > without changing mlock syscall.
> 
> That is true as long as MAP_FAULTPOPULATE set a flag in the VMA(s).  It
> doesn't cover the actual case I was asking about, which is how do I get
> lock on fault on malloc'd memory?

OK I see your point now. We would indeed need a flag argument for mlock.
-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)

From: Michal Hocko <mhocko@suse.cz>
To: Eric B Munson <emunson@akamai.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-alpha@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mips@linux-mips.org, linux-parisc@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, sparclinux@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, linux-mm@kvack.org,
	linux-arch@vger.kernel.org, linux-api@vger.kernel.org
Subject: Re: [RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after page fault
Date: Mon, 22 Jun 2015 14:38:26 +0200	[thread overview]
Message-ID: <20150622123826.GF4430@dhcp22.suse.cz> (raw)
Message-ID: <20150622123826.nvGBTzVESqPqu_c0um220BjykrZaIz-RqFB0kfQXsYM@z> (raw)
In-Reply-To: <20150619164333.GD2329@akamai.com>

On Fri 19-06-15 12:43:33, Eric B Munson wrote:
> On Fri, 19 Jun 2015, Michal Hocko wrote:
> 
> > On Thu 18-06-15 16:30:48, Eric B Munson wrote:
> > > On Thu, 18 Jun 2015, Michal Hocko wrote:
> > [...]
> > > > Wouldn't it be much more reasonable and straightforward to have
> > > > MAP_FAULTPOPULATE as a counterpart for MAP_POPULATE which would
> > > > explicitly disallow any form of pre-faulting? It would be usable for
> > > > other usecases than with MAP_LOCKED combination.
> > > 
> > > I don't see a clear case for it being more reasonable, it is one
> > > possible way to solve the problem.
> > 
> > MAP_FAULTPOPULATE would be usable for other cases as well. E.g. fault
> > around is all or nothing feature. Either all mappings (which support
> > this) fault around or none. There is no way to tell the kernel that
> > this particular mapping shouldn't fault around. I haven't seen such a
> > request yet but we have seen requests to have a way to opt out from
> > a global policy in the past (e.g. per-process opt out from THP). So
> > I can imagine somebody will come with a request to opt out from any
> > speculative operations on the mapped area in the future.
> > 
> > > But I think it leaves us in an even
> > > more akward state WRT VMA flags.  As you noted in your fix for the
> > > mmap() man page, one can get into a state where a VMA is VM_LOCKED, but
> > > not present.  Having VM_LOCKONFAULT states that this was intentional, if
> > > we go to using MAP_FAULTPOPULATE instead of MAP_LOCKONFAULT, we no
> > > longer set VM_LOCKONFAULT (unless we want to start mapping it to the
> > > presence of two MAP_ flags).  This can make detecting the MAP_LOCKED +
> > > populate failure state harder.
> > 
> > I am not sure I understand your point here. Could you be more specific
> > how would you check for that and what for?
> 
> My thought on detecting was that someone might want to know if they had
> a VMA that was VM_LOCKED but had not been made present becuase of a
> failure in mmap.  We don't have a way today, but adding VM_LOCKONFAULT
> is at least explicit about what is happening which would make detecting
> the VM_LOCKED but not present state easier. 

One could use /proc/<pid>/pagemap to query the residency.

> This assumes that
> MAP_FAULTPOPULATE does not translate to a VMA flag, but it sounds like
> it would have to.

Yes, it would have to have a VM flag for the vma.

> > From my understanding MAP_LOCKONFAULT is essentially
> > MAP_FAULTPOPULATE|MAP_LOCKED with a quite obvious semantic (unlike
> > single MAP_LOCKED unfortunately). I would love to also have
> > MAP_LOCKED|MAP_POPULATE (aka full mlock semantic) but I am really
> > skeptical considering how my previous attempt to make MAP_POPULATE
> > reasonable went.
> 
> Are you objecting to the addition of the VMA flag VM_LOCKONFAULT, or the
> new MAP_LOCKONFAULT flag (or both)? 

I thought the MAP_FAULTPOPULATE (or any other better name) would
directly translate into VM_FAULTPOPULATE and wouldn't be tight to the
locked semantic. We already have VM_LOCKED for that. The direct effect
of the flag would be to prevent from population other than the direct
page fault - including any speculative actions like fault around or
read-ahead.

> If you prefer that MAP_LOCKED |
> MAP_FAULTPOPULATE means that VM_LOCKONFAULT is set, I am fine with that
> instead of introducing MAP_LOCKONFAULT.  I went with the new flag
> because to date, we have a one to one mapping of MAP_* to VM_* flags.
> 
> > 
> > > If this is the preferred path for mmap(), I am fine with that. 
> > 
> > > However,
> > > I would like to see the new system calls that Andrew mentioned (and that
> > > I am testing patches for) go in as well. 
> > 
> > mlock with flags sounds like a good step but I am not sure it will make
> > sense in the future. POSIX has screwed that and I am not sure how many
> > applications would use it. This ship has sailed long time ago.
> 
> I don't know either, but the code is the question, right?  I know that
> we have at least one team that wants it here.
> 
> > 
> > > That way we give users the
> > > ability to request VM_LOCKONFAULT for memory allocated using something
> > > other than mmap.
> > 
> > mmap(MAP_FAULTPOPULATE); mlock() would have the same semantic even
> > without changing mlock syscall.
> 
> That is true as long as MAP_FAULTPOPULATE set a flag in the VMA(s).  It
> doesn't cover the actual case I was asking about, which is how do I get
> lock on fault on malloc'd memory?

OK I see your point now. We would indeed need a flag argument for mlock.
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-arch" in

WARNING: multiple messages have this Message-ID (diff)

From: Michal Hocko <mhocko@suse.cz>
To: Eric B Munson <emunson@akamai.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-alpha@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mips@linux-mips.org, linux-parisc@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, sparclinux@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, linux-mm@kvack.org,
	linux-arch@vger.kernel.org, linux-api@vger.kernel.org
Subject: Re: [RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after page fault
Date: Mon, 22 Jun 2015 14:38:26 +0200	[thread overview]
Message-ID: <20150622123826.GF4430@dhcp22.suse.cz> (raw)
In-Reply-To: <20150619164333.GD2329@akamai.com>

On Fri 19-06-15 12:43:33, Eric B Munson wrote:
> On Fri, 19 Jun 2015, Michal Hocko wrote:
> 
> > On Thu 18-06-15 16:30:48, Eric B Munson wrote:
> > > On Thu, 18 Jun 2015, Michal Hocko wrote:
> > [...]
> > > > Wouldn't it be much more reasonable and straightforward to have
> > > > MAP_FAULTPOPULATE as a counterpart for MAP_POPULATE which would
> > > > explicitly disallow any form of pre-faulting? It would be usable for
> > > > other usecases than with MAP_LOCKED combination.
> > > 
> > > I don't see a clear case for it being more reasonable, it is one
> > > possible way to solve the problem.
> > 
> > MAP_FAULTPOPULATE would be usable for other cases as well. E.g. fault
> > around is all or nothing feature. Either all mappings (which support
> > this) fault around or none. There is no way to tell the kernel that
> > this particular mapping shouldn't fault around. I haven't seen such a
> > request yet but we have seen requests to have a way to opt out from
> > a global policy in the past (e.g. per-process opt out from THP). So
> > I can imagine somebody will come with a request to opt out from any
> > speculative operations on the mapped area in the future.
> > 
> > > But I think it leaves us in an even
> > > more akward state WRT VMA flags.  As you noted in your fix for the
> > > mmap() man page, one can get into a state where a VMA is VM_LOCKED, but
> > > not present.  Having VM_LOCKONFAULT states that this was intentional, if
> > > we go to using MAP_FAULTPOPULATE instead of MAP_LOCKONFAULT, we no
> > > longer set VM_LOCKONFAULT (unless we want to start mapping it to the
> > > presence of two MAP_ flags).  This can make detecting the MAP_LOCKED +
> > > populate failure state harder.
> > 
> > I am not sure I understand your point here. Could you be more specific
> > how would you check for that and what for?
> 
> My thought on detecting was that someone might want to know if they had
> a VMA that was VM_LOCKED but had not been made present becuase of a
> failure in mmap.  We don't have a way today, but adding VM_LOCKONFAULT
> is at least explicit about what is happening which would make detecting
> the VM_LOCKED but not present state easier. 

One could use /proc/<pid>/pagemap to query the residency.

> This assumes that
> MAP_FAULTPOPULATE does not translate to a VMA flag, but it sounds like
> it would have to.

Yes, it would have to have a VM flag for the vma.

> > From my understanding MAP_LOCKONFAULT is essentially
> > MAP_FAULTPOPULATE|MAP_LOCKED with a quite obvious semantic (unlike
> > single MAP_LOCKED unfortunately). I would love to also have
> > MAP_LOCKED|MAP_POPULATE (aka full mlock semantic) but I am really
> > skeptical considering how my previous attempt to make MAP_POPULATE
> > reasonable went.
> 
> Are you objecting to the addition of the VMA flag VM_LOCKONFAULT, or the
> new MAP_LOCKONFAULT flag (or both)? 

I thought the MAP_FAULTPOPULATE (or any other better name) would
directly translate into VM_FAULTPOPULATE and wouldn't be tight to the
locked semantic. We already have VM_LOCKED for that. The direct effect
of the flag would be to prevent from population other than the direct
page fault - including any speculative actions like fault around or
read-ahead.

> If you prefer that MAP_LOCKED |
> MAP_FAULTPOPULATE means that VM_LOCKONFAULT is set, I am fine with that
> instead of introducing MAP_LOCKONFAULT.  I went with the new flag
> because to date, we have a one to one mapping of MAP_* to VM_* flags.
> 
> > 
> > > If this is the preferred path for mmap(), I am fine with that. 
> > 
> > > However,
> > > I would like to see the new system calls that Andrew mentioned (and that
> > > I am testing patches for) go in as well. 
> > 
> > mlock with flags sounds like a good step but I am not sure it will make
> > sense in the future. POSIX has screwed that and I am not sure how many
> > applications would use it. This ship has sailed long time ago.
> 
> I don't know either, but the code is the question, right?  I know that
> we have at least one team that wants it here.
> 
> > 
> > > That way we give users the
> > > ability to request VM_LOCKONFAULT for memory allocated using something
> > > other than mmap.
> > 
> > mmap(MAP_FAULTPOPULATE); mlock() would have the same semantic even
> > without changing mlock syscall.
> 
> That is true as long as MAP_FAULTPOPULATE set a flag in the VMA(s).  It
> doesn't cover the actual case I was asking about, which is how do I get
> lock on fault on malloc'd memory?

OK I see your point now. We would indeed need a flag argument for mlock.
-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)

From: Michal Hocko <mhocko@suse.cz>
To: Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
Cc: Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	linux-alpha-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mips-6z/3iImG2C8G8FEW9MqTrA@public.gmane.org,
	linux-parisc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org,
	sparclinux-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after page fault
Date: Mon, 22 Jun 2015 12:38:26 +0000	[thread overview]
Message-ID: <20150622123826.GF4430@dhcp22.suse.cz> (raw)
In-Reply-To: <20150619164333.GD2329-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>

On Fri 19-06-15 12:43:33, Eric B Munson wrote:
> On Fri, 19 Jun 2015, Michal Hocko wrote:
> 
> > On Thu 18-06-15 16:30:48, Eric B Munson wrote:
> > > On Thu, 18 Jun 2015, Michal Hocko wrote:
> > [...]
> > > > Wouldn't it be much more reasonable and straightforward to have
> > > > MAP_FAULTPOPULATE as a counterpart for MAP_POPULATE which would
> > > > explicitly disallow any form of pre-faulting? It would be usable for
> > > > other usecases than with MAP_LOCKED combination.
> > > 
> > > I don't see a clear case for it being more reasonable, it is one
> > > possible way to solve the problem.
> > 
> > MAP_FAULTPOPULATE would be usable for other cases as well. E.g. fault
> > around is all or nothing feature. Either all mappings (which support
> > this) fault around or none. There is no way to tell the kernel that
> > this particular mapping shouldn't fault around. I haven't seen such a
> > request yet but we have seen requests to have a way to opt out from
> > a global policy in the past (e.g. per-process opt out from THP). So
> > I can imagine somebody will come with a request to opt out from any
> > speculative operations on the mapped area in the future.
> > 
> > > But I think it leaves us in an even
> > > more akward state WRT VMA flags.  As you noted in your fix for the
> > > mmap() man page, one can get into a state where a VMA is VM_LOCKED, but
> > > not present.  Having VM_LOCKONFAULT states that this was intentional, if
> > > we go to using MAP_FAULTPOPULATE instead of MAP_LOCKONFAULT, we no
> > > longer set VM_LOCKONFAULT (unless we want to start mapping it to the
> > > presence of two MAP_ flags).  This can make detecting the MAP_LOCKED +
> > > populate failure state harder.
> > 
> > I am not sure I understand your point here. Could you be more specific
> > how would you check for that and what for?
> 
> My thought on detecting was that someone might want to know if they had
> a VMA that was VM_LOCKED but had not been made present becuase of a
> failure in mmap.  We don't have a way today, but adding VM_LOCKONFAULT
> is at least explicit about what is happening which would make detecting
> the VM_LOCKED but not present state easier. 

One could use /proc/<pid>/pagemap to query the residency.

> This assumes that
> MAP_FAULTPOPULATE does not translate to a VMA flag, but it sounds like
> it would have to.

Yes, it would have to have a VM flag for the vma.

> > From my understanding MAP_LOCKONFAULT is essentially
> > MAP_FAULTPOPULATE|MAP_LOCKED with a quite obvious semantic (unlike
> > single MAP_LOCKED unfortunately). I would love to also have
> > MAP_LOCKED|MAP_POPULATE (aka full mlock semantic) but I am really
> > skeptical considering how my previous attempt to make MAP_POPULATE
> > reasonable went.
> 
> Are you objecting to the addition of the VMA flag VM_LOCKONFAULT, or the
> new MAP_LOCKONFAULT flag (or both)? 

I thought the MAP_FAULTPOPULATE (or any other better name) would
directly translate into VM_FAULTPOPULATE and wouldn't be tight to the
locked semantic. We already have VM_LOCKED for that. The direct effect
of the flag would be to prevent from population other than the direct
page fault - including any speculative actions like fault around or
read-ahead.

> If you prefer that MAP_LOCKED |
> MAP_FAULTPOPULATE means that VM_LOCKONFAULT is set, I am fine with that
> instead of introducing MAP_LOCKONFAULT.  I went with the new flag
> because to date, we have a one to one mapping of MAP_* to VM_* flags.
> 
> > 
> > > If this is the preferred path for mmap(), I am fine with that. 
> > 
> > > However,
> > > I would like to see the new system calls that Andrew mentioned (and that
> > > I am testing patches for) go in as well. 
> > 
> > mlock with flags sounds like a good step but I am not sure it will make
> > sense in the future. POSIX has screwed that and I am not sure how many
> > applications would use it. This ship has sailed long time ago.
> 
> I don't know either, but the code is the question, right?  I know that
> we have at least one team that wants it here.
> 
> > 
> > > That way we give users the
> > > ability to request VM_LOCKONFAULT for memory allocated using something
> > > other than mmap.
> > 
> > mmap(MAP_FAULTPOPULATE); mlock() would have the same semantic even
> > without changing mlock syscall.
> 
> That is true as long as MAP_FAULTPOPULATE set a flag in the VMA(s).  It
> doesn't cover the actual case I was asking about, which is how do I get
> lock on fault on malloc'd memory?

OK I see your point now. We would indeed need a flag argument for mlock.
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in

WARNING: multiple messages have this Message-ID (diff)

From: Michal Hocko <mhocko@suse.cz>
To: Eric B Munson <emunson@akamai.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-alpha@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mips@linux-mips.org, linux-parisc@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, sparclinux@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, linux-mm@kvack.org,
	linux-arch@vger.kernel.org, linux-api@vger.kernel.org
Subject: Re: [RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after page fault
Date: Mon, 22 Jun 2015 14:38:26 +0200	[thread overview]
Message-ID: <20150622123826.GF4430@dhcp22.suse.cz> (raw)
In-Reply-To: <20150619164333.GD2329@akamai.com>

On Fri 19-06-15 12:43:33, Eric B Munson wrote:
> On Fri, 19 Jun 2015, Michal Hocko wrote:
> 
> > On Thu 18-06-15 16:30:48, Eric B Munson wrote:
> > > On Thu, 18 Jun 2015, Michal Hocko wrote:
> > [...]
> > > > Wouldn't it be much more reasonable and straightforward to have
> > > > MAP_FAULTPOPULATE as a counterpart for MAP_POPULATE which would
> > > > explicitly disallow any form of pre-faulting? It would be usable for
> > > > other usecases than with MAP_LOCKED combination.
> > > 
> > > I don't see a clear case for it being more reasonable, it is one
> > > possible way to solve the problem.
> > 
> > MAP_FAULTPOPULATE would be usable for other cases as well. E.g. fault
> > around is all or nothing feature. Either all mappings (which support
> > this) fault around or none. There is no way to tell the kernel that
> > this particular mapping shouldn't fault around. I haven't seen such a
> > request yet but we have seen requests to have a way to opt out from
> > a global policy in the past (e.g. per-process opt out from THP). So
> > I can imagine somebody will come with a request to opt out from any
> > speculative operations on the mapped area in the future.
> > 
> > > But I think it leaves us in an even
> > > more akward state WRT VMA flags.  As you noted in your fix for the
> > > mmap() man page, one can get into a state where a VMA is VM_LOCKED, but
> > > not present.  Having VM_LOCKONFAULT states that this was intentional, if
> > > we go to using MAP_FAULTPOPULATE instead of MAP_LOCKONFAULT, we no
> > > longer set VM_LOCKONFAULT (unless we want to start mapping it to the
> > > presence of two MAP_ flags).  This can make detecting the MAP_LOCKED +
> > > populate failure state harder.
> > 
> > I am not sure I understand your point here. Could you be more specific
> > how would you check for that and what for?
> 
> My thought on detecting was that someone might want to know if they had
> a VMA that was VM_LOCKED but had not been made present becuase of a
> failure in mmap.  We don't have a way today, but adding VM_LOCKONFAULT
> is at least explicit about what is happening which would make detecting
> the VM_LOCKED but not present state easier. 

One could use /proc/<pid>/pagemap to query the residency.

> This assumes that
> MAP_FAULTPOPULATE does not translate to a VMA flag, but it sounds like
> it would have to.

Yes, it would have to have a VM flag for the vma.

> > From my understanding MAP_LOCKONFAULT is essentially
> > MAP_FAULTPOPULATE|MAP_LOCKED with a quite obvious semantic (unlike
> > single MAP_LOCKED unfortunately). I would love to also have
> > MAP_LOCKED|MAP_POPULATE (aka full mlock semantic) but I am really
> > skeptical considering how my previous attempt to make MAP_POPULATE
> > reasonable went.
> 
> Are you objecting to the addition of the VMA flag VM_LOCKONFAULT, or the
> new MAP_LOCKONFAULT flag (or both)? 

I thought the MAP_FAULTPOPULATE (or any other better name) would
directly translate into VM_FAULTPOPULATE and wouldn't be tight to the
locked semantic. We already have VM_LOCKED for that. The direct effect
of the flag would be to prevent from population other than the direct
page fault - including any speculative actions like fault around or
read-ahead.

> If you prefer that MAP_LOCKED |
> MAP_FAULTPOPULATE means that VM_LOCKONFAULT is set, I am fine with that
> instead of introducing MAP_LOCKONFAULT.  I went with the new flag
> because to date, we have a one to one mapping of MAP_* to VM_* flags.
> 
> > 
> > > If this is the preferred path for mmap(), I am fine with that. 
> > 
> > > However,
> > > I would like to see the new system calls that Andrew mentioned (and that
> > > I am testing patches for) go in as well. 
> > 
> > mlock with flags sounds like a good step but I am not sure it will make
> > sense in the future. POSIX has screwed that and I am not sure how many
> > applications would use it. This ship has sailed long time ago.
> 
> I don't know either, but the code is the question, right?  I know that
> we have at least one team that wants it here.
> 
> > 
> > > That way we give users the
> > > ability to request VM_LOCKONFAULT for memory allocated using something
> > > other than mmap.
> > 
> > mmap(MAP_FAULTPOPULATE); mlock() would have the same semantic even
> > without changing mlock syscall.
> 
> That is true as long as MAP_FAULTPOPULATE set a flag in the VMA(s).  It
> doesn't cover the actual case I was asking about, which is how do I get
> lock on fault on malloc'd memory?

OK I see your point now. We would indeed need a flag argument for mlock.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)

From: Michal Hocko <mhocko@suse.cz>
To: Eric B Munson <emunson@akamai.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-alpha@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mips@linux-mips.org, linux-parisc@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, sparclinux@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, linux-mm@kvack.org,
	linux-arch@vger.kernel.org, linux-api@vger.kernel.org
Subject: Re: [RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after page fault
Date: Mon, 22 Jun 2015 14:38:26 +0200	[thread overview]
Message-ID: <20150622123826.GF4430@dhcp22.suse.cz> (raw)
In-Reply-To: <20150619164333.GD2329@akamai.com>

On Fri 19-06-15 12:43:33, Eric B Munson wrote:
> On Fri, 19 Jun 2015, Michal Hocko wrote:
> 
> > On Thu 18-06-15 16:30:48, Eric B Munson wrote:
> > > On Thu, 18 Jun 2015, Michal Hocko wrote:
> > [...]
> > > > Wouldn't it be much more reasonable and straightforward to have
> > > > MAP_FAULTPOPULATE as a counterpart for MAP_POPULATE which would
> > > > explicitly disallow any form of pre-faulting? It would be usable for
> > > > other usecases than with MAP_LOCKED combination.
> > > 
> > > I don't see a clear case for it being more reasonable, it is one
> > > possible way to solve the problem.
> > 
> > MAP_FAULTPOPULATE would be usable for other cases as well. E.g. fault
> > around is all or nothing feature. Either all mappings (which support
> > this) fault around or none. There is no way to tell the kernel that
> > this particular mapping shouldn't fault around. I haven't seen such a
> > request yet but we have seen requests to have a way to opt out from
> > a global policy in the past (e.g. per-process opt out from THP). So
> > I can imagine somebody will come with a request to opt out from any
> > speculative operations on the mapped area in the future.
> > 
> > > But I think it leaves us in an even
> > > more akward state WRT VMA flags.  As you noted in your fix for the
> > > mmap() man page, one can get into a state where a VMA is VM_LOCKED, but
> > > not present.  Having VM_LOCKONFAULT states that this was intentional, if
> > > we go to using MAP_FAULTPOPULATE instead of MAP_LOCKONFAULT, we no
> > > longer set VM_LOCKONFAULT (unless we want to start mapping it to the
> > > presence of two MAP_ flags).  This can make detecting the MAP_LOCKED +
> > > populate failure state harder.
> > 
> > I am not sure I understand your point here. Could you be more specific
> > how would you check for that and what for?
> 
> My thought on detecting was that someone might want to know if they had
> a VMA that was VM_LOCKED but had not been made present becuase of a
> failure in mmap.  We don't have a way today, but adding VM_LOCKONFAULT
> is at least explicit about what is happening which would make detecting
> the VM_LOCKED but not present state easier. 

One could use /proc/<pid>/pagemap to query the residency.

> This assumes that
> MAP_FAULTPOPULATE does not translate to a VMA flag, but it sounds like
> it would have to.

Yes, it would have to have a VM flag for the vma.

> > From my understanding MAP_LOCKONFAULT is essentially
> > MAP_FAULTPOPULATE|MAP_LOCKED with a quite obvious semantic (unlike
> > single MAP_LOCKED unfortunately). I would love to also have
> > MAP_LOCKED|MAP_POPULATE (aka full mlock semantic) but I am really
> > skeptical considering how my previous attempt to make MAP_POPULATE
> > reasonable went.
> 
> Are you objecting to the addition of the VMA flag VM_LOCKONFAULT, or the
> new MAP_LOCKONFAULT flag (or both)? 

I thought the MAP_FAULTPOPULATE (or any other better name) would
directly translate into VM_FAULTPOPULATE and wouldn't be tight to the
locked semantic. We already have VM_LOCKED for that. The direct effect
of the flag would be to prevent from population other than the direct
page fault - including any speculative actions like fault around or
read-ahead.

> If you prefer that MAP_LOCKED |
> MAP_FAULTPOPULATE means that VM_LOCKONFAULT is set, I am fine with that
> instead of introducing MAP_LOCKONFAULT.  I went with the new flag
> because to date, we have a one to one mapping of MAP_* to VM_* flags.
> 
> > 
> > > If this is the preferred path for mmap(), I am fine with that. 
> > 
> > > However,
> > > I would like to see the new system calls that Andrew mentioned (and that
> > > I am testing patches for) go in as well. 
> > 
> > mlock with flags sounds like a good step but I am not sure it will make
> > sense in the future. POSIX has screwed that and I am not sure how many
> > applications would use it. This ship has sailed long time ago.
> 
> I don't know either, but the code is the question, right?  I know that
> we have at least one team that wants it here.
> 
> > 
> > > That way we give users the
> > > ability to request VM_LOCKONFAULT for memory allocated using something
> > > other than mmap.
> > 
> > mmap(MAP_FAULTPOPULATE); mlock() would have the same semantic even
> > without changing mlock syscall.
> 
> That is true as long as MAP_FAULTPOPULATE set a flag in the VMA(s).  It
> doesn't cover the actual case I was asking about, which is how do I get
> lock on fault on malloc'd memory?

OK I see your point now. We would indeed need a flag argument for mlock.
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at  http://www.tux.org/lkml/

next prev parent reply	other threads:[~2015-06-22 12:38 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-10 13:26 [RESEND PATCH V2 0/3] Allow user to request memory to be locked on page fault Eric B Munson
2015-06-10 13:26 ` Eric B Munson
2015-06-10 13:26 ` Eric B Munson
2015-06-10 13:26 ` [RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after " Eric B Munson
2015-06-10 13:26   ` Eric B Munson
2015-06-10 13:26   ` Eric B Munson
2015-06-18 15:29   ` Michal Hocko
2015-06-18 15:29     ` Michal Hocko
2015-06-18 15:29     ` Michal Hocko
2015-06-18 20:30     ` Eric B Munson
2015-06-18 20:30       ` Eric B Munson
2015-06-19 14:57       ` Michal Hocko
2015-06-19 14:57         ` Michal Hocko
2015-06-19 14:57         ` Michal Hocko
2015-06-19 14:57         ` Michal Hocko
2015-06-19 14:57         ` Michal Hocko
2015-06-19 14:57         ` Michal Hocko
2015-06-19 16:43         ` Eric B Munson
2015-06-19 16:43           ` Eric B Munson
     [not found]           ` <20150619164333.GD2329-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-06-22 12:38             ` Michal Hocko [this message]
2015-06-22 12:38               ` Michal Hocko
2015-06-22 12:38               ` Michal Hocko
2015-06-22 12:38               ` Michal Hocko
2015-06-22 12:38               ` Michal Hocko
2015-06-22 12:38               ` Michal Hocko
     [not found]               ` <20150622123826.GF4430-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-06-22 14:18                 ` Eric B Munson
2015-06-22 14:18                   ` Eric B Munson
2015-06-22 14:18                   ` Eric B Munson
2015-06-23 12:45                   ` Vlastimil Babka
2015-06-23 12:45                     ` Vlastimil Babka
2015-06-23 12:45                     ` Vlastimil Babka
     [not found]                     ` <558954DD.4060405-AlSwsSmVLrQ@public.gmane.org>
2015-06-24  9:47                       ` Michal Hocko
2015-06-24  9:47                         ` Michal Hocko
2015-06-24  9:47                         ` Michal Hocko
2015-06-24  9:47                         ` Michal Hocko
2015-06-24  8:50                   ` Michal Hocko
2015-06-24  8:50                     ` Michal Hocko
2015-06-24  8:50                     ` Michal Hocko
2015-06-25 14:46                     ` Eric B Munson
2015-06-25 14:46                       ` Eric B Munson
2015-06-10 13:26 ` [RESEND PATCH V2 2/3] Add mlockall flag for locking pages on fault Eric B Munson
2015-06-10 13:26   ` Eric B Munson
2015-06-10 13:26   ` Eric B Munson
     [not found] ` <1433942810-7852-1-git-send-email-emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-06-10 13:26   ` [RESEND PATCH V2 3/3] Add tests for lock " Eric B Munson
2015-06-10 13:26     ` Eric B Munson
2015-06-10 13:26     ` Eric B Munson
2015-06-10 21:59   ` [RESEND PATCH V2 0/3] Allow user to request memory to be locked on page fault Andrew Morton
2015-06-10 21:59     ` Andrew Morton
2015-06-10 21:59     ` Andrew Morton
2015-06-10 21:59     ` Andrew Morton
2015-06-11 19:21     ` Eric B Munson
2015-06-11 19:21       ` Eric B Munson
2015-06-11 19:21       ` Eric B Munson
2015-06-11 19:21       ` Eric B Munson
2015-06-11 19:34       ` Andrew Morton
2015-06-11 19:34         ` Andrew Morton
2015-06-11 19:34         ` Andrew Morton
2015-06-11 19:55         ` Eric B Munson
2015-06-11 19:55           ` Eric B Munson
2015-06-11 19:55           ` Eric B Munson
2015-06-12 12:05         ` Vlastimil Babka
2015-06-12 12:05           ` Vlastimil Babka
2015-06-12 12:05           ` Vlastimil Babka
2015-06-15 14:43           ` Eric B Munson
2015-06-15 14:43             ` Eric B Munson
     [not found]             ` <20150615144356.GB12300-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-06-23 13:04               ` Vlastimil Babka
2015-06-23 13:04                 ` Vlastimil Babka
2015-06-23 13:04                 ` Vlastimil Babka
2015-06-23 13:04                 ` Vlastimil Babka
2015-06-25 14:16                 ` Eric B Munson
2015-06-25 14:16                   ` Eric B Munson
2015-06-25 14:26                   ` Andy Lutomirski
2015-06-25 14:26                     ` Andy Lutomirski
2015-06-25 14:26                     ` Andy Lutomirski
2015-06-15 14:39         ` Eric B Munson
2015-06-15 14:39           ` Eric B Munson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150622123826.GF4430@dhcp22.suse.cz \
    --to=mhocko-alswssmvlrq@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org \
    --cc=linux-alpha-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-mips-6z/3iImG2C8G8FEW9MqTrA@public.gmane.org \
    --cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
    --cc=linux-parisc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw@public.gmane.org \
    --cc=linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org \
    --cc=sparclinux-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.