From: Uladzislau Rezki
Date: Fri, 21 Nov 2025 14:21:34 +0100
To: Christoph Hellwig, Mikulas Patocka
Cc: Uladzislau Rezki, Mikulas Patocka, Benjamin Marzinski, Alasdair Kergon, DMML, Andrew Morton, Mike Snitzer, LKML
Subject: Re: [RESEND PATCH] dm-ebs: Mark full buffer dirty even on partial write
X-Mailing-List: linux-kernel@vger.kernel.org
In-Reply-To: <20251121072421.GA29754@lst.de>

On Fri, Nov 21, 2025 at 08:24:21AM +0100, Christoph Hellwig wrote:
> On Thu, Nov 20, 2025 at 01:08:57PM +0100, Uladzislau Rezki wrote:
> > Could you please check below? Is the last one correctly reported?
>
> The latter looks unexpected, but it is because qemu is not passing through
> the qemu physical_block_size attribute to any of the nvme settings Linux
> interprets as such for NVMe (NVMe doesn't actually have the concept of
> a physical block size, unlike SCSI/ATA):
>
OK, understood, and thank you for checking this.

>
> root@testvm:~# nvme id-ns -H /dev/nvme0n1 | grep npw
> npwg : 0
> npwa : 0
> root@testvm:~# nvme id-ns -H /dev/nvme0n1 | grep naw
> nawun : 0
> nawupf : 0
> root@testvm:~# nvme id-ctrl -H /dev/nvme0 | grep awupf
> awupf : 0
>
> but as said multiple times, that should not really matter - the logical
> block size is the granularity of I/O, the physical block size is just
> a performance hint.
>
Right, as stated in the commit message of the patch in question.

An 8K device emulated in qemu, with CONFIG_TRANSPARENT_HUGEPAGE=y:

urezki@pc638:~$ sudo nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1            foo                  QEMU NVMe Ctrl                           1         8.49 GB / 8.49 GB          8 KiB + 0 B      10.0.6
urezki@pc638:~$ cat bin/dmsetup.sh
#!/bin/bash

lower=/dev/nvme0n1
len=$(blockdev --getsz "$lower")
echo "0 $len ebs $lower 0 1 16" | dmsetup create nvme-8k
urezki@pc638:~$ sudo bin/dmsetup.sh
urezki@pc638:~$ sudo cat /sys/block/nvme0n1/queue/logical_block_size
8192
urezki@pc638:~$ sudo cat /sys/block/nvme0n1/queue/physical_block_size
8192
urezki@pc638:~$ sudo cat /sys/block/dm-0/queue/logical_block_size
512
urezki@pc638:~$ sudo cat /sys/block/dm-0/queue/physical_block_size
8192
urezki@pc638:~$ sudo mkfs.ext4 -F /dev/dm-0
mke2fs 1.47.0 (5-Feb-2023)
/dev/dm-0 contains a ext4 file system
	last mounted on Fri Nov 21 12:22:55 2025
Discarding device blocks: done
Creating filesystem with 2072576 4k blocks and 518144 inodes
Filesystem UUID: f71adb05-c020-4406-bc0d-bdb9e5c29af7
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: mkfs.ext4: Input/output error while writing out and closing file system
urezki@pc638:~$ sudo dmesg | grep -i "i/o"
[   71.813322] Buffer I/O error on dev dm-0, logical block 10, lost async page write
[   71.813373] Buffer I/O error on dev dm-0, logical block 11, lost async page write
[   71.813395] Buffer I/O error on dev dm-0, logical block 12, lost async page write
[   71.813415] Buffer I/O error on dev dm-0, logical block 13, lost async page write
[   71.813433] Buffer I/O error on dev dm-0, logical block 14, lost async page write
[   71.813451] Buffer I/O error on dev dm-0, logical block 15, lost async page write
[   71.813475] Buffer I/O error on dev dm-0, logical block 16, lost async page write
[   71.813493] Buffer I/O error on dev dm-0, logical block 17, lost async page write
[   71.813516] Buffer I/O error on dev dm-0, logical block 18, lost async page write
[   71.813537] Buffer I/O error on dev dm-0, logical block 19, lost async page write
urezki@pc638:~$

With the patch:

urezki@pc638:~$ sudo nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1            foo                  QEMU NVMe Ctrl                           1         8.49 GB / 8.49 GB          8 KiB + 0 B      10.0.6
urezki@pc638:~$ cat bin/dmsetup.sh
#!/bin/bash

lower=/dev/nvme0n1
len=$(blockdev --getsz "$lower")
echo "0 $len ebs $lower 0 1 16" | dmsetup create nvme-8k
urezki@pc638:~$ sudo bin/dmsetup.sh
urezki@pc638:~$ sudo cat /sys/block/nvme0n1/queue/logical_block_size
8192
urezki@pc638:~$ sudo cat /sys/block/nvme0n1/queue/physical_block_size
8192
urezki@pc638:~$ sudo cat /sys/block/dm-0/queue/logical_block_size
512
urezki@pc638:~$ sudo cat /sys/block/dm-0/queue/physical_block_size
8192
urezki@pc638:~$ sudo mkfs.ext4 -F /dev/dm-0
mke2fs 1.47.0 (5-Feb-2023)
Discarding device blocks: done
Creating filesystem with 2072576 4k blocks and 518144 inodes
Filesystem UUID: c7dff4c7-aa7e-4c94-98ee-f9ea2da92a06
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

urezki@pc638:~$ sudo mount /dev/dm-0 /mnt/
urezki@pc638:~$ ls -al /mnt/
total 24
drwxr-xr-x  3 root root  4096 Nov 21 12:22 .
drwxr-xr-x 19 root root  4096 Jul 10 19:42 ..
drwx------  2 root root 16384 Nov 21 12:22 lost+found
urezki@pc638:~$

How do we solve this? Mikulas proposed the patch below:

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
@@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
 {
 	unsigned int n_sectors;
 	sector_t sector;
-	unsigned int offset, end;
+	unsigned int offset, end, align;
 
 	b->end_io = end_io;
 
@@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
 			b->c->write_callback(b);
 		offset = b->write_start;
 		end = b->write_end;
-		offset &= -DM_BUFIO_WRITE_ALIGN;
-		end += DM_BUFIO_WRITE_ALIGN - 1;
-		end &= -DM_BUFIO_WRITE_ALIGN;
+		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
+		offset &= -align;
+		end += align - 1;
+		end &= -align;
 		if (unlikely(end > b->c->block_size))
 			end = b->c->block_size;
 

and it fixes the setup which I described in the commit message, but I have
a question: why does dm-ebs need to offload a partial buffer (< ubf size)?

Thank you for your answers!

--
Uladzislau Rezki
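
For reference, the rounding done in the proposed submit_io() change can be
checked in userspace. This is only a minimal sketch under assumptions:
WRITE_ALIGN stands in for DM_BUFIO_WRITE_ALIGN and is assumed to be 4096,
and round_write_range(), struct range and max_uint() are hypothetical
names made up for this illustration, not kernel code.

```c
#include <assert.h>

/* Assumed stand-in for DM_BUFIO_WRITE_ALIGN (value assumed to be 4096). */
#define WRITE_ALIGN 4096u

/* Hypothetical helper type: a byte range [offset, end) inside one buffer. */
struct range {
	unsigned int offset;
	unsigned int end;
};

static unsigned int max_uint(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Round the dirty range outward to a multiple of
 * max(WRITE_ALIGN, logical block size) and clamp the end to the buffer
 * size, mirroring the arithmetic in the patch above.
 */
static struct range round_write_range(unsigned int offset, unsigned int end,
				      unsigned int logical_block_size,
				      unsigned int block_size)
{
	unsigned int align = max_uint(WRITE_ALIGN, logical_block_size);
	struct range r;

	/* The "& -align" masking only works for a power-of-two alignment. */
	assert(align != 0 && (align & (align - 1)) == 0);

	r.offset = offset & -align;          /* round start down */
	r.end = (end + align - 1) & -align;  /* round end up */
	if (r.end > block_size)
		r.end = block_size;
	return r;
}
```

With an 8192-byte buffer on a device with an 8192-byte logical block size,
a dirty range of 4096..8192 becomes 0..8192, so the whole buffer is
submitted. With a 512-byte logical block size the same range stays
4096..8192, which is the sub-block write an 8K device cannot accept.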