From: Yosry Ahmed
Date: Mon, 18 Sep 2023 11:55:34 -0700
Subject: Re: [PATCH 6/6] shmem: add large folios support to the write path
To: Daniel Gomez
Cc: "minchan@kernel.org", "senozhatsky@chromium.org", "axboe@kernel.dk",
    "djwong@kernel.org", "willy@infradead.org", "hughd@google.com",
    "akpm@linux-foundation.org", "mcgrof@kernel.org",
    "linux-kernel@vger.kernel.org", "linux-block@vger.kernel.org",
    "linux-xfs@vger.kernel.org", "linux-fsdevel@vger.kernel.org",
    "linux-mm@kvack.org", "gost.dev@samsung.com", Pankaj Raghav
In-Reply-To: <20230918075758.vlufrhq22es2dhuu@sarkhan>
References: <20230915095042.1320180-1-da.gomez@samsung.com>
 <20230915095042.1320180-7-da.gomez@samsung.com>
 <20230918075758.vlufrhq22es2dhuu@sarkhan>

On Mon, Sep 18, 2023 at 1:00 AM Daniel Gomez wrote:
>
> On Fri, Sep 15, 2023 at 11:26:37AM -0700, Yosry Ahmed wrote:
> > On Fri, Sep 15, 2023 at 2:51 AM Daniel Gomez wrote:
> > >
> > > Add large folio support for shmem write path matching the same high
> > > order preference mechanism used for iomap buffered IO path as used in
> > > __filemap_get_folio().
> > >
> > > Use the __folio_get_max_order to get a hint for the order of the folio
> > > based on file size which takes care of the mapping requirements.
> > >
> > > Swap does not support high order folios for now, so make it order 0 in
> > > case swap is enabled.
> >
> > I didn't take a close look at the series, but I am not sure I
> > understand the rationale here. Reclaim will split high order shmem
> > folios anyway, right?
>
> For context, this is part of the enablement of large block sizes (LBS)
> effort [1][2][3], so the assumption here is that the kernel will
> reclaim memory with the same (large) block sizes that were written to
> the device.
>
> I'll add more context in the V2.
>
> [1] https://kernelnewbies.org/KernelProjects/large-block-size
> [2] https://docs.google.com/spreadsheets/d/e/2PACX-1vS7sQfw90S00l2rfOKm83Jlg0px8KxMQE4HHp_DKRGbAGcAV-xu6LITHBEc4xzVh9wLH6WM2lR0cZS8/pubhtml#
> [3] https://lore.kernel.org/all/ZQfbHloBUpDh+zCg@dread.disaster.area/
>
> > It seems like we only enable high order folios if the "noswap" mount
> > option is used, which is fairly recent. I doubt it is widely used.
>
> For now, I skipped the swap path as it currently lacks support for
> high order folios. But I'm currently looking into it as part of the LBS
> effort (please check spreadsheet at [2] for that).

Thanks for the context, but I am not sure I understand. IIUC we are
skipping allocating large folios in shmem if swap is enabled in this
patch. Swap does not support swapping out large folios as a whole
(except THPs), but page reclaim will split those large folios and swap
them out as order-0 pages anyway. So I am not sure I understand why we
need to skip allocating large folios if swap is enabled.
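For reference, the split in question is the one the shmem writeout path
already performs when reclaim hands it a large folio. A simplified sketch
follows; shmem_writepage(), split_huge_page() and the folio helpers are the
real kernel interfaces, but the body below is abridged and renamed, not the
exact upstream code:

/*
 * Abridged sketch of how reclaim already deals with a large shmem folio
 * on the swap-out path: split it back to order-0 pages first, because
 * swap cannot take the large folio as a whole.
 */
static int shmem_writepage_sketch(struct page *page,
				  struct writeback_control *wbc)
{
	struct folio *folio = page_folio(page);

	if (folio_test_large(folio)) {
		/* Keep the data dirty across the split so nothing is lost. */
		folio_test_set_dirty(folio);
		if (split_huge_page(page) < 0)
			goto redirty;
		folio = page_folio(page);
		folio_clear_dirty(folio);
	}

	/* ... continue with the normal order-0 swap-out path ... */
	return 0;

redirty:
	folio_redirty_for_writepage(wbc, folio);
	return 0;
}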
> >
> > >
> > > Signed-off-by: Daniel Gomez
> > > ---
> > >  mm/shmem.c | 16 +++++++++++++---
> > >  1 file changed, 13 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/mm/shmem.c b/mm/shmem.c
> > > index adff74751065..26ca555b1669 100644
> > > --- a/mm/shmem.c
> > > +++ b/mm/shmem.c
> > > @@ -1683,13 +1683,19 @@ static struct folio *shmem_alloc_folio(gfp_t gfp,
> > >  }
> > >
> > >  static struct folio *shmem_alloc_and_acct_folio(gfp_t gfp, struct inode *inode,
> > > -		pgoff_t index, bool huge, unsigned int *order)
> > > +		pgoff_t index, bool huge, unsigned int *order,
> > > +		struct shmem_sb_info *sbinfo)
> > >  {
> > >  	struct shmem_inode_info *info = SHMEM_I(inode);
> > >  	struct folio *folio;
> > >  	int nr;
> > >  	int err;
> > >
> > > +	if (!sbinfo->noswap)
> > > +		*order = 0;
> > > +	else
> > > +		*order = (*order == 1) ? 0 : *order;
> > > +
> > >  	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> > >  		huge = false;
> > >  	nr = huge ? HPAGE_PMD_NR : 1U << *order;
> > > @@ -2032,6 +2038,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
> > >  		return 0;
> > >  	}
> > >
> > > +	order = mapping_size_order(inode->i_mapping, index, len);
> > > +
> > >  	if (!shmem_is_huge(inode, index, false,
> > >  			   vma ? vma->vm_mm : NULL, vma ? vma->vm_flags : 0))
> > >  		goto alloc_nohuge;
> > > @@ -2039,11 +2047,11 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
> > >  	huge_gfp = vma_thp_gfp_mask(vma);
> > >  	huge_gfp = limit_gfp_mask(huge_gfp, gfp);
> > >  	folio = shmem_alloc_and_acct_folio(huge_gfp, inode, index, true,
> > > -					   &order);
> > > +					   &order, sbinfo);
> > >  	if (IS_ERR(folio)) {
> > > alloc_nohuge:
> > >  		folio = shmem_alloc_and_acct_folio(gfp, inode, index, false,
> > > -						   &order);
> > > +						   &order, sbinfo);
> > >  	}
> > >  	if (IS_ERR(folio)) {
> > >  		int retry = 5;
> > > @@ -2147,6 +2155,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
> > >  	if (folio_test_large(folio)) {
> > >  		folio_unlock(folio);
> > >  		folio_put(folio);
> > > +		if (order > 0)
> > > +			order--;
> > >  		goto alloc_nohuge;
> > >  	}
> > > unlock:
> > > --
> > > 2.39.2
> > >
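The mapping_size_order() call added above, and the __folio_get_max_order()
helper the commit message refers to, come from an earlier patch in this
series and are not shown in the hunk. One plausible shape for such a
write-size-to-order hint, in the spirit of the order handling that
__filemap_get_folio() does for the iomap buffered-IO path, might look like
the sketch below; the name and details here are illustrative assumptions,
not the series' actual helper:

/*
 * Illustrative sketch only: derive a folio-order hint from the length of
 * the write, clamp it to what the page cache supports, and keep the folio
 * naturally aligned at @index.
 */
static unsigned int size_to_order_hint(struct address_space *mapping,
				       pgoff_t index, size_t len)
{
	unsigned int order = ilog2(DIV_ROUND_UP(len, PAGE_SIZE));

	/* Never exceed the largest order the page cache allows. */
	order = min(order, (unsigned int)MAX_PAGECACHE_ORDER);

	/* Shrink until the folio would be naturally aligned at @index. */
	while (order && (index & ((1UL << order) - 1)))
		order--;

	return order;
}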