From: Ming Lei
Date: Wed, 1 Nov 2023 09:27:24 +0800
Subject: Re: Intermittent storage (dm-crypt?) freeze - regression 6.4->6.5
To: Marek Marczykowski-Górecki
Cc: Jan Kara, Mikulas Patocka, Vlastimil Babka, Andrew Morton,
	Matthew Wilcox, Michal Hocko, stable@vger.kernel.org,
	regressions@lists.linux.dev, Alasdair Kergon, Mike Snitzer,
	dm-devel@lists.linux.dev, linux-mm@kvack.org

On Tue, Oct 31, 2023 at 11:42 PM Marek Marczykowski-Górecki wrote:
>
> On Tue, Oct 31, 2023 at 03:01:36PM +0100, Jan Kara wrote:
> > On Tue 31-10-23 04:48:44, Marek Marczykowski-Górecki wrote:
> > > Then tried:
> > > - PAGE_ALLOC_COSTLY_ORDER=4, order=4 - cannot reproduce,
> > > - PAGE_ALLOC_COSTLY_ORDER=4, order=5 - cannot reproduce,
> > > - PAGE_ALLOC_COSTLY_ORDER=4, order=6 - freeze rather quickly
> > >
> > > I've retried the PAGE_ALLOC_COSTLY_ORDER=4,order=5 case several times
> > > and I can't reproduce the issue there. I'm confused...
> >
> > And this kind of confirms that allocations > PAGE_ALLOC_COSTLY_ORDER
> > causing hangs is most likely just a coincidence. Rather something either in
> > the block layer or in the storage driver has problems with handling bios
> > with sufficiently high order pages attached. This is going to be a bit
> > painful to debug I'm afraid. How long does it take for you trigger the
> > hang? I'm asking to get rough estimate how heavy tracing we can afford so
> > that we don't overwhelm the system...
>
> Sometimes it freezes just after logging in, but in worst case it takes
> me about 10min of more or less `tar xz` + `dd`.

blk-mq debugfs is usually helpful for hang issues in the block layer or
underlying drivers:

	(cd /sys/kernel/debug/block && find . -type f -exec grep -aH . {} \;)

BTW, if you know which disks are behind dm-crypt (`lsblk` can show
that), you can collect logs for just those disks. The logs have to be
collected after the hang is triggered.

Thanks,
Ming Lei
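
For collecting only the disks behind dm-crypt, a minimal sketch could look like the following. It reuses the find/grep command above but restricts it to named disk directories. The helper name `dump_blk_debugfs`, the disk name `nvme0n1`, and the output file name are assumptions for illustration only; debugfs is assumed to be mounted at /sys/kernel/debug.

```shell
#!/bin/sh
# Sketch: dump blk-mq debugfs state for selected disks only.
# dump_blk_debugfs is a hypothetical helper name, not an existing tool.
# Usage: dump_blk_debugfs DEBUGFS_BLOCK_DIR DISK...
dump_blk_debugfs() {
    root=$1
    shift
    for disk in "$@"; do
        if [ ! -d "$root/$disk" ]; then
            echo "no debugfs entry for $disk under $root" >&2
            continue
        fi
        # Same find/grep as above, restricted to one disk's directory.
        # -a treats every file as text, -H prefixes each line with its file name.
        (cd "$root" && find "./$disk" -type f -exec grep -aH . {} \;)
    done
}

# Example, run as root after the hang has been triggered
# (nvme0n1 is a placeholder; use the disk names that lsblk reports):
#   dump_blk_debugfs /sys/kernel/debug/block nvme0n1 > blk-debugfs.log
```

Taking the debugfs directory as the first argument keeps the helper usable on a copied tree as well as on the live system.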