Re: [PATCH 1/1] mm: make start_isolate_page_range() fail if already isolated

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Mike Kravetz <mike.kravetz@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Luiz Capitulino <lcapitulino@redhat.com>,
	Michal Nazarewicz <mina86@mina86.com>,
	Michal Hocko <mhocko@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH 1/1] mm: make start_isolate_page_range() fail if already isolated
Date: Fri, 2 Mar 2018 16:38:33 -0800	[thread overview]
Message-ID: <3887b37d-2bc0-1eff-9aec-6a99cc0715fb@oracle.com> (raw)
In-Reply-To: <20180302160607.570e13f2157f56503fe1bdaa@linux-foundation.org>

On 03/02/2018 04:06 PM, Andrew Morton wrote:
> On Mon, 26 Feb 2018 11:10:54 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
> 
>> start_isolate_page_range() is used to set the migrate type of a
>> set of page blocks to MIGRATE_ISOLATE while attempting to start
>> a migration operation.  It assumes that only one thread is
>> calling it for the specified range.  This routine is used by
>> CMA, memory hotplug and gigantic huge pages.  Each of these users
>> synchronize access to the range within their subsystem.  However,
>> two subsystems (CMA and gigantic huge pages for example) could
>> attempt operations on the same range.  If this happens, page
>> blocks may be incorrectly left marked as MIGRATE_ISOLATE and
>> therefore not available for page allocation.
>>
>> Without 'locking code' there is no easy way to synchronize access
>> to the range of page blocks passed to start_isolate_page_range.
>> However, if two threads are working on the same set of page blocks
>> one will stumble upon blocks set to MIGRATE_ISOLATE by the other.
>> In such conditions, make the thread noticing MIGRATE_ISOLATE
>> clean up as normal and return -EBUSY to the caller.
>>
>> This will allow start_isolate_page_range to serve as a
>> synchronization mechanism and will allow for more general use
>> of callers making use of these interfaces.  So, update comments
>> in alloc_contig_range to reflect this new functionality.
>>
>> ...
>>
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -28,6 +28,13 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
>>  
>>  	spin_lock_irqsave(&zone->lock, flags);
>>  
>> +	/*
>> +	 * We assume we are the only ones trying to isolate this block.
>> +	 * If MIGRATE_ISOLATE already set, return -EBUSY
>> +	 */
>> +	if (is_migrate_isolate_page(page))
>> +		goto out;
>> +
>>  	pfn = page_to_pfn(page);
>>  	arg.start_pfn = pfn;
>>  	arg.nr_pages = pageblock_nr_pages;
> 
> Seems a bit ugly and I'm not sure that it's correct.  If the loop in
> start_isolate_page_range() gets partway through a number of pages then
> we hit the race, start_isolate_page_range() will then go and "undo" the
> work being done by the thread which it is racing against?

I agree that it is a bit ugly.  However, when a thread hits the above
condition it will only undo what it has done.  Only one thread is able
to set migrate state to isolate (under the zone lock).  So, a thread
will only undo what it has done.

The exact problem of one thread undoing what another thread has done
is possible with the code today and is what this patch is attempting
to address.

> Even if that can't happen, blundering through a whole bunch of pages
> then saying whoops then undoing everything is unpleasing.
> 
> Should we be looking at preventing these races at a higher level?

I could not immediately come up with a good idea here.  The zone lock
would be the obvious choice, but I don't think we want to hold it while
examining each of the page blocks.  Perhaps a new lock or semaphore
associated with the zone?  I'm open to suggestions.

-- 
Mike Kravetz

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2018-03-03  0:38 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-26 19:10 [PATCH 0/1] make start_isolate_page_range() thread safe Mike Kravetz
2018-02-26 19:10 ` [PATCH 1/1] mm: make start_isolate_page_range() fail if already isolated Mike Kravetz
2018-03-03  0:06   ` Andrew Morton
2018-03-03  0:38     ` Mike Kravetz [this message]
2018-03-03  0:56       ` Andrew Morton
2018-03-03  1:39         ` Mike Kravetz
2018-03-06  0:57           ` Mike Kravetz
2018-03-06 22:32             ` Andrew Morton
2018-03-09 22:47 ` [PATCH v2] " Mike Kravetz
2018-03-13 21:14   ` Andrew Morton
2018-03-13 21:27     ` Mike Kravetz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3887b37d-2bc0-1eff-9aec-6a99cc0715fb@oracle.com \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=lcapitulino@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=mina86@mina86.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).