Date: Wed, 29 Jun 2022 08:57:30 -0400
From: Brian Foster
To: "Darrick J. Wong"
Cc: Dave Chinner, Matthew Wilcox, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Christoph Hellwig, linux-mm@kvack.org
Subject: Re: Multi-page folio issues in 5.19-rc4 (was [PATCH v3 25/25] xfs: Support large folios)
References: <20220628073120.GI227878@dread.disaster.area> <20220628221757.GJ227878@dread.disaster.area>

On Tue, Jun 28, 2022 at 04:21:55PM -0700, Darrick J. Wong wrote:
> On Wed, Jun 29, 2022 at 08:17:57AM +1000, Dave Chinner wrote:
> > On Tue, Jun 28, 2022 at 02:18:24PM +0100, Matthew Wilcox wrote:
> > > On Tue, Jun 28, 2022 at 12:31:55PM +0100, Matthew Wilcox wrote:
> > > > On Tue, Jun 28, 2022 at 12:27:40PM +0100, Matthew Wilcox wrote:
> > > > > On Tue, Jun 28, 2022 at 05:31:20PM +1000, Dave Chinner wrote:
> > > > > > So using this technique, I've discovered that there's a dirty page
> > > > > > accounting leak that eventually results in fsx hanging in
> > > > > > balance_dirty_pages().
> > > > >
> > > > > Alas, I think this is only an accounting error, and not related to
> > > > > the problem(s) that Darrick & Zorro are seeing. I think what you're
> > > > > seeing is dirty pages being dropped at truncation without the
> > > > > appropriate accounting. ie this should be the fix:
> > > >
> > > > Argh, try one that actually compiles.
> > >
> > > ... that one's going to underflow the accounting. Maybe I shouldn't
> > > be writing code at 6am?
> > >
> > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > > index f7248002dad9..4eec6ee83e44 100644
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -18,6 +18,7 @@
> > >  #include
> > >  #include
> > >  #include
> > > +#include
> > >  #include
> > >  #include
> > >  #include
> > > @@ -2439,11 +2440,15 @@ static void __split_huge_page(struct page *page, struct list_head *list,
> > >  		__split_huge_page_tail(head, i, lruvec, list);
> > >  		/* Some pages can be beyond EOF: drop them from page cache */
> > >  		if (head[i].index >= end) {
> > > -			ClearPageDirty(head + i);
> > > -			__delete_from_page_cache(head + i, NULL);
> > > +			struct folio *tail = page_folio(head + i);
> > > +
> > >  			if (shmem_mapping(head->mapping))
> > >  				shmem_uncharge(head->mapping->host, 1);
> > > -			put_page(head + i);
> > > +			else if (folio_test_clear_dirty(tail))
> > > +				folio_account_cleaned(tail,
> > > +					inode_to_wb(folio->mapping->host));
> > > +			__filemap_remove_folio(tail, NULL);
> > > +			folio_put(tail);
> > >  		} else if (!PageAnon(page)) {
> > >  			__xa_store(&head->mapping->i_pages, head[i].index,
> > >  					head + i, 0);
> > >
> >
> > Yup, that fixes the leak.
> >
> > Tested-by: Dave Chinner
>
> Four hours of generic/522 running is long enough to conclude that this
> is likely the fix for my problem and migrate long soak testing to my
> main g/522 rig and:
>
> Tested-by: Darrick J. Wong
>

Just based on Willy's earlier comment... what I would be a little
careful/curious about here is whether the accounting fix leads to an
indirect behavior change that does impact reproducibility of the
corruption problem. For example, does artificially escalated dirty page
tracking lead to more reclaim/writeback activity than might otherwise
occur, and thus contend with the fs workload? Clearly it has some impact
based on Dave's balance_dirty_pages() problem reproducer, but I don't
know if it extends beyond that off the top of my head.
That might make some sense if the workload is fsx, since that doesn't
typically stress cache/memory usage the way a large fsstress workload or
something might.

So for example, interesting questions might be... Do your corruption
events happen to correspond with dirty page accounting crossing some
threshold based on available memory in your test environment? Does
reducing available memory affect reproducibility? Etc.

Brian

> --D
>
> > Cheers,
> >
> > Dave.
> > --
> > Dave Chinner
> > david@fromorbit.com
>