From: Stefan <linux-kernel@simg.de>
Date: Wed, 15 Jan 2025 11:47:28 +0100
Subject: Re: [Bug 219609] File corruptions on SSD in 1st M.2 socket of AsRock X600M-STX + Ryzen 8700G
To: Bruno Gravato, bugzilla-daemon@kernel.org
Cc: Keith Busch, Adrian Huang, Linux kernel regressions list, linux-nvme@lists.infradead.org, Jens Axboe, iommu@lists.linux.dev, LKML, Thorsten Leemhuis, Christoph Hellwig
Message-ID: <6c2a34ac-d158-4109-a166-e6d06cafa360@simg.de>
Hi,

(replying to both the mailing list and the kernel bug tracker)

On 15.01.25 at 07:37, Bruno Gravato wrote:
> I then removed the Solidigm disk from the secondary and kept the WD
> disk in the main M.2 slot. I reran my tests (on kernel 6.11.5) and
> bang! btrfs scrub now detected quite a few checksum errors!
>
> I then tried disabling the volatile write cache with "nvme set-feature
> /dev/nvme0 -f 6 -v 0". "nvme get-feature /dev/nvme0 -f 6" confirmed it
> was disabled, but /sys/block/nvme0n1/queue/fua still showed 1... Was
> that supposed to turn into 0?

You can check this using `nvme get-feature /dev/nvme0n1 -f 6`.
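For example, something like this (a minimal sketch, assuming nvme-cli
and the device names from your mail; the comments on the sysfs
attributes are my understanding of the driver, not verified against
your kernel):

  # Feature 0x06 is the Volatile Write Cache feature; -H prints the
  # current value decoded.
  nvme get-feature /dev/nvme0n1 -f 6 -H

  # The block layer sets these attributes once at probe time from the
  # controller's "VWC present" bit, so they do not track a later
  # set-feature: fua=1 only means the kernel issues FUA writes on this
  # queue, so it staying 1 is (as far as I can tell) expected.
  cat /sys/block/nvme0n1/queue/write_cache
  cat /sys/block/nvme0n1/queue/fua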
> So it looks like the corruption only happens if only the main M.2
> slot is occupied and the secondary M.2 slot is free. With two nvme
> disks (one on each M.2 slot), there were no errors at all.
>
> Stefan, did you ever try running your tests with 2 nvme disks
> installed on both slots? Or did you use only one slot at a time?

No, I only tested these configurations:

1. 1st M.2: Lexar; 2nd M.2: empty
   (easy to reproduce write errors)

2. 1st M.2: Kingston; 2nd M.2: Lexar
   (read errors difficult to reproduce with a 6.1 kernel, no issues
   with newer ones within several months of intense use)

I'll swap the SSDs soon. Then I will also test other configurations
and try out a third SSD. If I get corruption with other SSDs, I will
check which modifications help.

Note that I need both SSDs (configuration 2) in about one week and
cannot change this for about 3 months (I already announced this in
December). Thus, if there are things I should test with
configuration 1, please inform me quickly.

Just as a reminder (for those who did not read the two bug trackers):
I tested with `f3` (a utility used to detect counterfeit flash
drives) on ext4. `f3` reports overwritten sectors. In configuration 1
these are write errors (they appear when I read the files back). If
no other SSD-intensive jobs are running, the corruption does not
occur in the most recently written files, and I never noticed file
system corruption, only file contents are corrupt. (This is probably
luck, but it also has something to do with the journal and the time
at which file system metadata is written.)

On 13.01.25 at 22:01, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=219609
>
> --- Comment #21 from mbe ---
> Hi,
>
> I did some more tests. At first I retrieved the following values
> under Debian:
>
>> Debian 12, Kernel 6.1.119, no corruption
>> cat /sys/class/block/nvme0n1/queue/max_hw_sectors_kb
>> 2048
>>
>> cat /sys/class/block/nvme0n1/queue/max_sectors_kb
>> 1280
>>
>> cat /sys/class/block/nvme0n1/queue/max_segments
>> 127
>>
>> cat /sys/class/block/nvme0n1/queue/max_segment_size
>> 4294967295
>
> To achieve the same values on kernel 6.11.0-13, I had to make the
> following changes to drivers/nvme/host/pci.c:
>
>> --- pci.c.org	2024-09-15 16:57:56.000000000 +0200
>> +++ pci.c	2025-01-13 21:18:54.475903619 +0100
>> @@ -41,8 +41,8 @@
>>   * These can be higher, but we need to ensure that any command doesn't
>>   * require an sg allocation that needs more than a page of data.
>>   */
>> -#define NVME_MAX_KB_SZ	8192
>> -#define NVME_MAX_SEGS	128
>> +#define NVME_MAX_KB_SZ	4096
>> +#define NVME_MAX_SEGS	127
>>  #define NVME_MAX_NR_ALLOCATIONS	5
>>
>>  static int use_threaded_interrupts;
>> @@ -3048,8 +3048,8 @@
>>   * Limit the max command size to prevent iod->sg allocations going
>>   * over a single page.
>>   */
>> -	dev->ctrl.max_hw_sectors = min_t(u32,
>> -		NVME_MAX_KB_SZ << 1, dma_opt_mapping_size(&pdev->dev) >> 9);
>> +	//dev->ctrl.max_hw_sectors = min_t(u32,
>> +	//	NVME_MAX_KB_SZ << 1, dma_opt_mapping_size(&pdev->dev) >> 9);
>>  	dev->ctrl.max_segments = NVME_MAX_SEGS;
>>
>>  	/*
>
> So basically, dev->ctrl.max_hw_sectors stays zero, so that in core.c
> it is set to the value of nvme_mps_to_sectors(ctrl, id->mdts)
> (=> 4096 in my case).

This has the same effect as setting it to `dma_max_mapping_size(...)`:

>> 	if (id->mdts)
>> 		max_hw_sectors = nvme_mps_to_sectors(ctrl, id->mdts);
>> 	else
>> 		max_hw_sectors = UINT_MAX;
>> 	ctrl->max_hw_sectors =
>> 		min_not_zero(ctrl->max_hw_sectors, max_hw_sectors);

> But that alone was not enough:
> Tests with ctrl->max_hw_sectors=4096 and NVME_MAX_SEGS=128 still
> resulted in corruptions.
> They only went away after reverting this value back to 127 (the
> value from kernel 6.1).

That change was introduced in 6.3-rc1 by the patch "nvme-pci: place
descriptor addresses in iod"
(https://github.com/torvalds/linux/commit/7846c1b5a5db8bb8475603069df7c7af034fd081).
This patch has no effect for me, i.e. unmodified kernels work up to
6.3.6. The patch that triggers the corruptions is the one introduced
in 6.3.7, which replaces `dma_max_mapping_size(...)` with
`dma_opt_mapping_size(...)`. If I apply this change to 6.1, the
corruptions also occur in that kernel.
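For comparison, this is a quick way to dump the queue limits discussed
above on a running kernel (a sketch, assuming the disk shows up as
nvme0n1; same sysfs files as in comment #21):

  # On an affected 6.3.7+ kernel, max_hw_sectors_kb should follow
  # dma_opt_mapping_size() (128 KiB on this board); on 6.1 it follows
  # MDTS (2048 KiB here).
  for f in max_hw_sectors_kb max_sectors_kb max_segments max_segment_size; do
      printf '%-20s %s\n' "$f" "$(cat /sys/block/nvme0n1/queue/$f)"
  done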
Matthias, did you check what happens if you only modify NVME_MAX_SEGS
(and leave the line `dev->ctrl.max_hw_sectors = min_t(u32,
NVME_MAX_KB_SZ << 1, dma_opt_mapping_size(&pdev->dev) >> 9);`
unchanged)?

> Additional logging gave the values of the following statements:
>> (dma_opt_mapping_size(&pdev->dev) >> 9) = 256
>> (dma_max_mapping_size(&pdev->dev) >> 9) = 36028797018963967 [sic!]
>
> @Stefan, can you check which value NVME_MAX_SEGS had in your tests?
> It also seems to have an influence.

"128", see above.

Regards Stefan
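P.S.: The strange dma_max_mapping_size() value is presumably just "no
limit": as far as I know, dma_max_mapping_size() returns SIZE_MAX when
the DMA layer does not constrain the mapping size, and on a 64-bit
system SIZE_MAX >> 9 is exactly the number logged above:

  # (2^64 - 1) >> 9 == 2^55 - 1
  echo $(( (1 << 55) - 1 ))    # prints 36028797018963967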