Date: Mon, 2 Jan 2023 14:37:02 +0000
From: Matthew Wilcox <willy@infradead.org>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org, liam.howlett@oracle.com, surenb@google.com,
	ldufour@linux.ibm.com, michel@lespinasse.org, vbabka@suse.cz,
	linux-kernel@vger.kernel.org
Subject: Re: [QUESTION] about the maple tree and current status of mmap_lock scalability

On Mon, Jan 02, 2023 at 09:04:12PM +0900, Hyeonggon Yoo wrote:
> > https://www.infradead.org/~willy/linux/store-free-page-faults.html
> > outlines how I intend to proceed from Suren's current scheme (where
> > RCU is only used to protect the tree walk) to using RCU for the
> > entire page fault.
>
> Thank you for sharing your outline.
> Okay, so the planned scheme is:
>
> 1. Try to process the entire page fault under RCU protection
>    - if failed, goto 2; if succeeded, goto 4.
>
> 2. Fall back to Suren's scheme (try to take the VMA rwsem)
>    - if failed, goto 3; if succeeded, goto 4.

Right.  The question is whether to restart the page fault under
Suren's scheme, or just to grab the VMA rwsem and continue.
Experimentation needed.

It's also worth noting that Michel has an alternative proposal, which
is to drop out of RCU protection before trying to allocate memory,
then re-enter RCU mode and check that the sequence count on the
entire MM hasn't changed.  His proposal has the advantage of not
trying to allocate memory while holding the RCU read lock, but the
disadvantage of having to retry the page fault if anyone has called
mmap() or munmap() in the meantime.

Which alternative is better is going to depend on the workload: do we
see more calls to mmap()/munmap(), or do we need to enter page
reclaim more often?  I think they're largely equivalent
performance-wise in the fast path.  Another metric to consider is
code complexity; he thinks his method is easier to understand, and I
think mine is.  To be expected, I suppose ;-)
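As a rough sketch of how Michel's variant might look (all names here
are illustrative, not taken from any posted patch; in particular,
mm_seq stands in for a hypothetical per-MM seqcount bumped by
mmap()/munmap()):

/*
 * Sketch only.  Drop RCU before a potentially sleeping allocation,
 * then re-enter RCU and revalidate the whole MM via a seqcount.
 */
vm_fault_t sketch_rcu_fault(struct mm_struct *mm, unsigned long addr)
{
	struct page *page;
	unsigned int seq;

retry:
	rcu_read_lock();
	seq = read_seqcount_begin(&mm->mm_seq);	/* hypothetical field */
	/* ... walk the maple tree and find the VMA ... */

	/* Leave the RCU read-side section before we might sleep. */
	rcu_read_unlock();
	page = alloc_page(GFP_KERNEL);
	if (!page)
		return VM_FAULT_OOM;
	rcu_read_lock();

	if (read_seqcount_retry(&mm->mm_seq, seq)) {
		/* Someone called mmap()/munmap(); start over. */
		rcu_read_unlock();
		__free_page(page);
		goto retry;
	}

	/* ... seqcount unchanged, so the VMA is still valid; install
	 * the PTE ... */
	rcu_read_unlock();
	return 0;
}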
> 3. Fall back to mmap_lock
>    - goto 4.
>
> 4. Finish page fault.
>
> To implement 1, __p*d_alloc() needs to take gfp flags
> so as not to sleep in an RCU read-side critical section.
>
> What about introducing a PF_MEMALLOC_NOWAIT process flag forcing
> GFP_NOWAIT | __GFP_NOWARN,
> similar to PF_MEMALLOC_NO{FS,IO}, looking like this?
>
> Will be less churn.

Certainly less churn, but also far more risky.  All of a sudden,
codepaths which used to always succeed will now start failing, and
either there are no checks for memory allocation failure, or those
failure paths have never been tested before.
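For reference, the PF_MEMALLOC_NO{FS,IO} pattern referred to above is
the scoped-allocation API in <linux/sched/mm.h>.  A NOWAIT analogue
would presumably mirror the NOFS helpers; the sketch below is purely
illustrative (PF_MEMALLOC_NOWAIT and the memalloc_nowait_*() helpers
are not mainline APIs, and a real patch would need to find a free
PF_* bit):

/* The existing, real API from <linux/sched/mm.h>: */
unsigned int flags = memalloc_nofs_save();
/* ... any allocation here implicitly behaves as if GFP_NOFS ... */
memalloc_nofs_restore(flags);

/* A hypothetical NOWAIT analogue, mirroring the NOFS helpers: */
static inline unsigned int memalloc_nowait_save(void)
{
	unsigned int flags = current->flags & PF_MEMALLOC_NOWAIT;

	current->flags |= PF_MEMALLOC_NOWAIT;
	return flags;
}

static inline void memalloc_nowait_restore(unsigned int flags)
{
	current->flags = (current->flags & ~PF_MEMALLOC_NOWAIT) | flags;
}

/*
 * current_gfp_context() would then strip __GFP_DIRECT_RECLAIM (and
 * could add __GFP_NOWARN) while PF_MEMALLOC_NOWAIT is set, so that
 * __p*d_alloc() callers under rcu_read_lock() never sleep.
 */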