From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Date: Mon, 20 Feb 2023 14:26:49 +0000
To: Matthew Wilcox
Cc: linux-mm@kvack.org, liam.howlett@oracle.com, surenb@google.com, ldufour@linux.ibm.com, michel@lespinasse.org, vbabka@suse.cz, linux-kernel@vger.kernel.org
Subject: Re: [QUESTION] about the maple tree and current status of mmap_lock scalability

On Mon, Jan 02, 2023 at 02:37:02PM +0000, Matthew Wilcox wrote:
> On Mon, Jan 02, 2023 at 09:04:12PM +0900, Hyeonggon Yoo wrote:
> > >
> > > https://www.infradead.org/~willy/linux/store-free-page-faults.html
> > > outlines how I intend to proceed from Suren's current scheme (where
> > > RCU is only used to protect the tree walk) to using RCU for the
> > > entire page fault.
> >
> > Thank you for sharing your outline.
> > Okay, so the planned scheme is:
> >
> > 1. Try to process entire page fault under RCU protection
> >   - if failed, goto 2. if succeeded, goto 4.
> >
> > 2. Fall back to Suren's scheme (try to take VMA rwsem)
> >   - if failed, goto 3. if succeeded, goto 4.
>
> Right. The question is whether to restart the page fault under Suren's
> scheme, or just grab the VMA rwsem and continue. Experimentation
> needed.
>
> It's also worth noting that Michel has an alternative proposal, which
> is to drop out of RCU protection before trying to allocate memory, then
> re-enter RCU mode and check the sequence count hasn't changed on the
> entire MM. His proposal has the advantage of not trying to allocate
> memory while holding the RCU read lock, but the disadvantage of having
> to retry the page fault if anyone has called mmap() or munmap(). Which
> alternative is better is going to depend on the workload; do we see more
> calls to mmap()/munmap(), or do we need to enter page reclaim more often?
> I think they're largely equivalent performance-wise in the fast path.
> Another metric to consider is code complexity; he thinks his method
> is easier to understand and I think mine is easier. To be expected,
> I suppose ;-)

I'm planning to propose a collaborative project to my colleagues that
would involve making __p?d_alloc() take gfp flags.

Has there been any progress or conclusion on which approach is better
for full RCU page faults, or has another solution been proposed?

I'm asking because I don't want to spend time on this if the approach
has been abandoned.

Regards,
Hyeonggon

> > 3. Fall back to mmap_lock
> >   - goto 4.
> >
> > 4. Finish page fault.
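
For readers following the thread, the four-step fallback order quoted above
can be modelled in a few lines of userspace C. This is only a sketch of the
control flow, not kernel code: `struct fault_env`, `enum fault_path` and
`handle_fault()` are hypothetical names invented for illustration, and the
two booleans simply stand in for "the RCU attempt succeeded" and "the VMA
rwsem attempt succeeded". It models the "fall through and continue" reading;
whether the real code should instead restart the fault is the open question
raised in the quoted reply.

```c
#include <assert.h>
#include <stdbool.h>

/* Which path ended up finishing the fault (step 4 in the thread). */
enum fault_path { PATH_RCU, PATH_VMA_LOCK, PATH_MMAP_LOCK };

/* Hypothetical inputs: whether each optimistic attempt succeeds. */
struct fault_env {
    bool rcu_attempt_ok;      /* step 1: whole fault under RCU */
    bool vma_lock_attempt_ok; /* step 2: per-VMA rwsem */
};

/* Try the cheapest path first; each failure drops to the next one. */
static enum fault_path handle_fault(const struct fault_env *env)
{
    if (env->rcu_attempt_ok)
        return PATH_RCU;        /* step 1 succeeded, goto 4 */
    if (env->vma_lock_attempt_ok)
        return PATH_VMA_LOCK;   /* step 2 succeeded, goto 4 */
    return PATH_MMAP_LOCK;      /* step 3: final fallback, then goto 4 */
}
```

In this model the mmap_lock path is the only one that cannot fail, which is
why it sits last in the chain.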
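
The alternative attributed to Michel in the quoted text (leave RCU mode
before allocating, then re-enter and check an MM-wide sequence count) can be
sketched the same way. Again this is a userspace model under stated
assumptions, not the proposed kernel implementation: `struct mm_model`,
`mm_model_change_mappings()` and `alloc_outside_rcu()` are hypothetical,
`malloc()` stands in for a page-table allocation that may sleep, and the
`racing_mmap` flag simulates a concurrent mmap()/munmap().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the MM: one counter bumped on every
 * mmap()/munmap(). */
struct mm_model {
    unsigned long seq;
};

static void mm_model_change_mappings(struct mm_model *mm)
{
    mm->seq++;
}

/*
 * Returns true if the fault may continue with *out, false if it must be
 * retried. The allocation happens outside the (modelled) RCU section.
 */
static bool alloc_outside_rcu(struct mm_model *mm, bool racing_mmap,
                              void **out)
{
    unsigned long snapshot = mm->seq; /* sampled while still in RCU mode */
    void *page;

    *out = NULL;

    /* RCU mode is left here, so the allocation is allowed to block. */
    page = malloc(4096);
    if (!page)
        return false;

    if (racing_mmap)
        mm_model_change_mappings(mm); /* simulated concurrent writer */

    /* Back in RCU mode: any change to the MM invalidates the walk. */
    if (mm->seq != snapshot) {
        free(page);
        return false;
    }

    *out = page;
    return true;
}
```

The cost named in the thread is visible here: any mapping change on the
whole MM forces a retry, even one unrelated to the faulting address.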