Linux block layer

Linux block layer
 help / color / mirror / Atom feed

* [PATCH] rnbd-clt: Use common error handling code in rnbd_get_iu()
From: Markus Elfring @ 2026-06-10 19:03 UTC (permalink / raw)
  To: linux-block, Jack Wang, Jens Axboe, Md. Haris Iqbal; +Cc: LKML, kernel-janitors

From: Markus Elfring <elfring@users.sourceforge.net>
Date: Wed, 10 Jun 2026 20:58:47 +0200

Use an additional label so that a bit of exception handling can be better
reused at the end of an if branch.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
 drivers/block/rnbd/rnbd-clt.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 4d6725a0035e..d8e3f145ee2f 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -329,10 +329,8 @@ static struct rnbd_iu *rnbd_get_iu(struct rnbd_clt_session *sess,
 		return NULL;
 
 	permit = rnbd_get_permit(sess, con_type, wait);
-	if (!permit) {
-		kfree(iu);
-		return NULL;
-	}
+	if (!permit)
+		goto free_iu;
 
 	iu->permit = permit;
 	/*
@@ -349,6 +347,7 @@ static struct rnbd_iu *rnbd_get_iu(struct rnbd_clt_session *sess,
 
 	if (sg_alloc_table(&iu->sgt, 1, GFP_KERNEL)) {
 		rnbd_put_permit(sess, permit);
+free_iu:
 		kfree(iu);
 		return NULL;
 	}
-- 
2.54.0


^ permalink raw reply related

* Re: [PATCH] iomap: enforce DIO alignment check in iomap]
From: Carlos Maiolino @ 2026-06-10 18:19 UTC (permalink / raw)
  To: Keith Busch; +Cc: brauner, linux-block
In-Reply-To: <aimbdmqC10uXrZuo@kbusch-mbp>

On Wed, Jun 10, 2026 at 11:14:30AM -0600, Keith Busch wrote:
> On Wed, Jun 10, 2026 at 06:58:13PM +0200, Carlos Maiolino wrote:
> > On Wed, Jun 10, 2026 at 09:37:17AM -0600, Keith Busch wrote:
> > > On Wed, Jun 10, 2026 at 05:27:42PM +0200, Carlos Maiolino wrote:
> > > > The DIO alignment check has been lifted from iomap layer to rely on the
> > > > block layer to enforce proper alignment when issuing direct IO
> > > > operations. This though, depending on the IO size and buffer address
> > > > passed to the IO operation may lead to user-visible behavior change.
> > > > 
> > > > This has been caught initially by LTP test diotest4 running on
> > > > PPC architecture, where the test fails because a read() operation
> > > > with a supposedly misaligned buffer succeeds instead of an expected
> > > > -EINVAL.
> > > 
> > > It's not supposed to matter where in the stack we determined this be an
> > > invalid request: it should still fail if it's misaligned. Could you
> > > clarify how this is succeeding?
> > 
> > Fair enough, can you point me to where the alignment is supposed to be
> > checked? 
> 
> https://elixir.bootlin.com/linux/v7.1-rc7/source/block/blk-merge.c#L352
> 
> It does require that someone calls the bio split-to-limits routine,
> which I had taken for granted as a given, but I realize that some
> drivers don't do that. What block device are you using for your test?

In the PPC machine, it's a virtual scsi vdasd device from one of the
virtual nodes

NAME HCTL       TYPE VENDOR   MODEL  REV SERIAL                             TRAN
sda  0:0:1:0    disk AIX      VDASD 0001 000a508a00007a0000000175dcba35ac.5

ibmvfc                262144  0
ibmvscsi              196608  2

For my x86 machine (remind I reduce the buffer size to 512 on x86), it's
a commodity sata samsung SSD:

sdb  6:0:0:0    disk ATA      Samsung SSD 860 EVO 1TB RVT02B6Q S3Z9NB0KB82416J   sata

Both with an ext4 straight on top of a partition (no device-mapper or
any other volume layer in between), although I also tried with a
linear device-mapper (lvm) device and the results seemed the same.

Those are two machines I have had more reliable results. My laptop with
a samsung NVME makes read() return an -EIO but I can't tell if it's a
device failure or just a wrong error being returned by now

^ permalink raw reply

* Re: [PATCH 21/27] null_blk: Enable lock context analysis
From: Bart Van Assche @ 2026-06-10 18:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Marco Elver, Keith Busch, Damien Le Moal,
	Chaitanya Kulkarni, Johannes Thumshirn, Nilay Shroff,
	Genjian Zhang, Kees Cook
In-Reply-To: <aij2yP1Y1Dven4Bz@infradead.org>

On 6/9/26 10:31 PM, Christoph Hellwig wrote:
> On Tue, Jun 09, 2026 at 03:05:08PM -0700, Bart Van Assche wrote:
>> Add __must_hold() annotations where these are missing. Annotate two
>> functions that use conditional locking with __context_unsafe().
> 
> Please explain why that is needed and there is no better way to have
> proper annotations.  Both here and in a comment in the code.

Hi Christoph,

Is the change shown below considered acceptable? It converts the
null_lock_zone() and null_unlock_zone() calls into scoped_guard()
and thereby eliminates the risk that any null_lock_zone() call
would not be paired properly with a null_unlock_zone() call. The
DEFINE_CLASS() macro below does the following:
* Before any code block that is protected by scoped_guard(null_zone)
   or guard(null_zone), call null_lock_zone() and create a local variable
   with type struct nullb_dev_and_zone. That local variable is declared
   with the __cleanup__ attribute.
* The __cleanup__ attribute causes null_unlock_zone() to be called with
   the same arguments as null_lock_zone().

Thanks,

Bart.


diff --git a/drivers/block/null_blk/zoned.c b/drivers/block/null_blk/zoned.c
index 12a3534ecb85..3bf3b6057e8a 100644
--- a/drivers/block/null_blk/zoned.c
+++ b/drivers/block/null_blk/zoned.c
@@ -50,6 +50,23 @@ static inline void null_unlock_zone(struct 
nullb_device *dev,
  		mutex_unlock(&zone->mutex);
  }

+struct nullb_dev_and_zone {
+	struct nullb_device *dev;
+	struct nullb_zone *zone;
+};
+
+DEFINE_CLASS(null_zone, struct nullb_dev_and_zone,
+	     null_unlock_zone(_T.dev, _T.zone),
+	     ({
+		     null_lock_zone(dev, zone);
+		     (struct nullb_dev_and_zone){dev, zone};
+	     }),
+	     struct nullb_device *dev, struct nullb_zone *zone)
+
+DEFINE_CLASS_IS_UNCONDITIONAL(null_zone)
+
  int null_init_zoned_dev(struct nullb_device *dev,
  			struct queue_limits *lim)
  {
@@ -218,14 +235,14 @@ int null_report_zones(struct gendisk *disk, 
sector_t sector,
  		 * So use a local copy to avoid corruption of the device zone
  		 * array.
  		 */
-		null_lock_zone(dev, zone);
+		scoped_guard(null_zone, dev, zone) {
  		blkz.start = zone->start;
  		blkz.len = zone->len;
  		blkz.wp = zone->wp;
  		blkz.type = zone->type;
  		blkz.cond = zone->cond;
  		blkz.capacity = zone->capacity;
-		null_unlock_zone(dev, zone);
+		}

  		error = disk_report_zone(disk, &blkz, i, args);
  		if (error)
@@ -366,7 +383,7 @@ static blk_status_t null_zone_write(struct nullb_cmd 
*cmd, sector_t sector,
  		return null_process_cmd(cmd, REQ_OP_WRITE, sector, nr_sectors);
  	}

-	null_lock_zone(dev, zone);
+	scoped_guard(null_zone, dev, zone) {

  	/*
  	 * Regular writes must be at the write pointer position. Zone append
@@ -446,7 +463,8 @@ static blk_status_t null_zone_write(struct nullb_cmd 
*cmd, sector_t sector,
  	ret = badblocks_ret;

  unlock_zone:
-	null_unlock_zone(dev, zone);
+	;
+	}

  	return ret;
  }
@@ -657,14 +675,14 @@ static blk_status_t null_zone_mgmt(struct 
nullb_cmd *cmd, enum req_op op,
  	if (op == REQ_OP_ZONE_RESET_ALL) {
  		for (i = dev->zone_nr_conv; i < dev->nr_zones; i++) {
  			zone = &dev->zones[i];
-			null_lock_zone(dev, zone);
+			scoped_guard(null_zone, dev, zone) {
  			if (zone->cond != BLK_ZONE_COND_EMPTY &&
  			    zone->cond != BLK_ZONE_COND_READONLY &&
  			    zone->cond != BLK_ZONE_COND_OFFLINE) {
  				null_reset_zone(dev, zone);
  				trace_nullb_zone_op(cmd, i, zone->cond);
  			}
-			null_unlock_zone(dev, zone);
+			}
  		}
  		return BLK_STS_OK;
  	}
@@ -672,7 +690,7 @@ static blk_status_t null_zone_mgmt(struct nullb_cmd 
*cmd, enum req_op op,
  	zone_no = null_zone_no(dev, sector);
  	zone = &dev->zones[zone_no];

-	null_lock_zone(dev, zone);
+	scoped_guard(null_zone, dev, zone) {

  	if (zone->cond == BLK_ZONE_COND_READONLY ||
  	    zone->cond == BLK_ZONE_COND_OFFLINE) {
@@ -702,7 +720,8 @@ static blk_status_t null_zone_mgmt(struct nullb_cmd 
*cmd, enum req_op op,
  		trace_nullb_zone_op(cmd, zone_no, zone->cond);

  unlock:
-	null_unlock_zone(dev, zone);
+	;
+	}

  	return ret;
  }
@@ -712,7 +731,6 @@ blk_status_t null_process_zoned_cmd(struct nullb_cmd 
*cmd, enum req_op op,
  {
  	struct nullb_device *dev;
  	struct nullb_zone *zone;
-	blk_status_t sts;

  	switch (op) {
  	case REQ_OP_WRITE:
@@ -731,10 +749,8 @@ blk_status_t null_process_zoned_cmd(struct 
nullb_cmd *cmd, enum req_op op,
  		if (zone->cond == BLK_ZONE_COND_OFFLINE)
  			return BLK_STS_IOERR;

-		null_lock_zone(dev, zone);
-		sts = null_process_cmd(cmd, op, sector, nr_sectors);
-		null_unlock_zone(dev, zone);
-		return sts;
+		scoped_guard(null_zone, dev, zone)
+			return null_process_cmd(cmd, op, sector, nr_sectors);
  	}
  }

@@ -748,7 +764,7 @@ static void null_set_zone_cond(struct nullb_device *dev,
  			 cond != BLK_ZONE_COND_OFFLINE))
  		return;

-	null_lock_zone(dev, zone);
+	guard(null_zone)(dev, zone);

  	/*
  	 * If the read-only condition is requested again to zones already in
@@ -769,8 +785,6 @@ static void null_set_zone_cond(struct nullb_device *dev,
  		zone->cond = cond;
  		zone->wp = NULL_ZONE_INVALID_WP;
  	}
-
-	null_unlock_zone(dev, zone);
  }

  /*


^ permalink raw reply related

* Re: [PATCH 20/27] nbd: Enable lock context analysis
From: Bart Van Assche @ 2026-06-10 17:16 UTC (permalink / raw)
  To: Nilay Shroff, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Marco Elver, Christoph Hellwig,
	Josef Bacik
In-Reply-To: <4c8438e3-2415-43c9-ba6a-27321070c58e@linux.ibm.com>

On 6/10/26 1:02 AM, Nilay Shroff wrote:
> Above changes are good, however I see nbd also uses @nbd_index_mutex
> which guards @nbd_index_idr. So should we also annotate @nbd_index_idr
> using __guarded_by(&nbd_index_mutex)?

How about adding these changes as an additional patch?

Thanks,

Bart.

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 345e4b73009d..b9e0ad0b3ca0 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -49,8 +49,8 @@
  #define CREATE_TRACE_POINTS
  #include <trace/events/nbd.h>

-static DEFINE_IDR(nbd_index_idr);
  static DEFINE_MUTEX(nbd_index_mutex);
+static __guarded_by(&nbd_index_mutex) DEFINE_IDR(nbd_index_idr);
  static struct workqueue_struct *nbd_del_wq;
  static int nbd_total_devices = 0;

@@ -2739,7 +2739,9 @@ static void __exit nbd_cleanup(void)
  	/* Also wait for nbd_dev_remove_work() completes */
  	destroy_workqueue(nbd_del_wq);

-	idr_destroy(&nbd_index_idr);
+	scoped_guard(mutex_init, &nbd_index_mutex)
+		idr_destroy(&nbd_index_idr);
+
  	unregister_blkdev(NBD_MAJOR, "nbd");
  }



^ permalink raw reply related

* Re: [PATCH] iomap: enforce DIO alignment check in iomap]
From: Keith Busch @ 2026-06-10 17:14 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: brauner, linux-block
In-Reply-To: <aimGzU_UY3jV-ece@nidhogg.toxiclabs.cc>

On Wed, Jun 10, 2026 at 06:58:13PM +0200, Carlos Maiolino wrote:
> On Wed, Jun 10, 2026 at 09:37:17AM -0600, Keith Busch wrote:
> > On Wed, Jun 10, 2026 at 05:27:42PM +0200, Carlos Maiolino wrote:
> > > The DIO alignment check has been lifted from iomap layer to rely on the
> > > block layer to enforce proper alignment when issuing direct IO
> > > operations. This though, depending on the IO size and buffer address
> > > passed to the IO operation may lead to user-visible behavior change.
> > > 
> > > This has been caught initially by LTP test diotest4 running on
> > > PPC architecture, where the test fails because a read() operation
> > > with a supposedly misaligned buffer succeeds instead of an expected
> > > -EINVAL.
> > 
> > It's not supposed to matter where in the stack we determined this be an
> > invalid request: it should still fail if it's misaligned. Could you
> > clarify how this is succeeding?
> 
> Fair enough, can you point me to where the alignment is supposed to be
> checked? 

https://elixir.bootlin.com/linux/v7.1-rc7/source/block/blk-merge.c#L352

It does require that someone calls the bio split-to-limits routine,
which I had taken for granted as a given, but I realize that some
drivers don't do that. What block device are you using for your test?

^ permalink raw reply

* Re: [PATCH 18/27] loop: Add lock context annotations
From: Bart Van Assche @ 2026-06-10 17:13 UTC (permalink / raw)
  To: Nilay Shroff, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Marco Elver, Nathan Chancellor
In-Reply-To: <c71180a3-c049-490f-a8ab-b3faa9714de1@linux.ibm.com>

On 6/10/26 2:21 AM, Nilay Shroff wrote:
> One thing I noticed while looking through the loop driver is that it 
> also defines
> @loop_ctl_mutex, which protects @loop_index_idr. It might be worth 
> annotating
> @loop_index_idr with `__guarded_by(&loop_ctl_mutex) as well so that 
> Clang can
> validate accesses to the IDR against the corresponding locking 
> requirements.

I'm considering to add the changes below as an additional patch:


diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ff7eff102c5a..30a2b2696368 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -90,8 +90,8 @@ struct loop_cmd {
  #define LOOP_IDLE_WORKER_TIMEOUT (60 * HZ)
  #define LOOP_DEFAULT_HW_Q_DEPTH 128

-static DEFINE_IDR(loop_index_idr);
  static DEFINE_MUTEX(loop_ctl_mutex);
+static __guarded_by(&loop_ctl_mutex) DEFINE_IDR(loop_index_idr);
  static DEFINE_MUTEX(loop_validate_mutex);

  /**
@@ -2326,6 +2326,8 @@ static void __exit loop_exit(void)
  	struct loop_device *lo;
  	int id;

+	guard(mutex_init)(&loop_ctl_mutex);
+
  	unregister_blkdev(LOOP_MAJOR, "loop");
  	misc_deregister(&loop_misc);



^ permalink raw reply related

* Re: [PATCH 11/27] drbd: Split drbd_req_state()
From: Bart Van Assche @ 2026-06-10 17:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
	Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <61eb3024-f9bb-465a-938c-c12db1218923@acm.org>

On 6/10/26 10:05 AM, Bart Van Assche wrote:
> Do you perhaps want me to combine this patch and the next patch in this
> series (12/27)?

Answering my own question: I just noticed that this has been requested
in the review comment on the next patch. I will combine these two
patches.

Bart.

^ permalink raw reply

* Re: [PATCH 11/27] drbd: Split drbd_req_state()
From: Bart Van Assche @ 2026-06-10 17:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
	Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <aij1DtWM8N7O3q-E@infradead.org>

On 6/9/26 10:24 PM, Christoph Hellwig wrote:
>> +{
>> +	enum drbd_state_rv rv;
>> +
>> +	if (f & CS_SERIALIZE)
>> +		mutex_lock(device->state_mutex);
>> +	rv = __drbd_req_state(device, mask, val, f & ~CS_SERIALIZE);
>>   	if (f & CS_SERIALIZE)
>>   		mutex_unlock(device->state_mutex);
> 
> Wouldn't something like:
> 
> 	if (f & CS_SERIALIZE) {
> 		mutex_lock(device->state_mutex);
> 		rv = __drbd_req_state(device, mask, val, f & ~CS_SERIALIZE);
>   		mutex_unlock(device->state_mutex);
> 	} else {
> 		rv = __drbd_req_state(device, mask, val, f & ~CS_SERIALIZE);
> 	}
> 
> be either to follow?  Is there much of a point in the CS_SERIALIZE
> clearing here?

Do you perhaps want me to combine this patch and the next patch in this
series (12/27)?

Thanks,

Bart.

^ permalink raw reply

* Re: [PATCH 00/27] Enable lock context analysis in drivers/block/
From: Bart Van Assche @ 2026-06-10 17:00 UTC (permalink / raw)
  To: Christoph Hellwig, Marco Elver; +Cc: Jens Axboe, linux-block
In-Reply-To: <ailK6eqeWrwRLPkz@infradead.org>

On 6/10/26 4:30 AM, Christoph Hellwig wrote:
> Bart: maytbe for next version just enable it on a per-driver basis.
> Once all are covered we can switch to directory-wide.

I can do that. Since the merge window likely will open this Monday, I
probably should wait with reposting any patches from this series until
after the merge window has closed.

Thanks for having taken the time to review this patch series.

Bart.

^ permalink raw reply

* [PATCH 1/1] block: partitions: bound sysv68 slice table count
From: Ren Wei @ 2026-06-10 16:58 UTC (permalink / raw)
  To: linux-block
  Cc: kees, axboe, objecting, akpm, phdm, yuantan098, zcliangcn, bird,
	zzhan461, n05ec
In-Reply-To: <cover.1781036698.git.zzhan461@ucr.edu>

From: Zhao Zhang <zzhan461@ucr.edu>

sysv68_partition() reads a single sector for the slice table, but it
trusts ios_slccnt from disk and walks that many entries after skipping
the synthetic whole-disk slice. A crafted image can set ios_slccnt
larger than the 64 struct slice records that fit in one sector and
trigger an out-of-bounds read while scanning partitions.

Limit the slice count to the number of records that fit in the sector
returned by read_part_sector(), then drop the whole-disk entry only
when the bounded count is non-zero.

Fixes: 19d0e8ce856a ("partition: add support for sysv68 partitions")
Cc: stable@vger.kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Zhao Zhang <zzhan461@ucr.edu>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
---
 block/partitions/sysv68.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/block/partitions/sysv68.c b/block/partitions/sysv68.c
index 470e0f9de7be..5110ed83c541 100644
--- a/block/partitions/sysv68.c
+++ b/block/partitions/sysv68.c
@@ -48,7 +48,8 @@ struct slice {
 
 int sysv68_partition(struct parsed_partitions *state)
 {
-	int i, slices;
+	sector_t slice_sector;
+	unsigned int i, slices;
 	int slot = 1;
 	Sector sect;
 	unsigned char *data;
@@ -65,14 +66,16 @@ int sysv68_partition(struct parsed_partitions *state)
 		return 0;
 	}
 	slices = be16_to_cpu(b->dk_ios.ios_slccnt);
-	i = be32_to_cpu(b->dk_ios.ios_slcblk);
+	slice_sector = be32_to_cpu(b->dk_ios.ios_slcblk);
 	put_dev_sector(sect);
 
-	data = read_part_sector(state, i, &sect);
+	data = read_part_sector(state, slice_sector, &sect);
 	if (!data)
 		return -1;
 
-	slices -= 1; /* last slice is the whole disk */
+	slices = min_t(unsigned int, slices, SECTOR_SIZE / sizeof(*slice));
+	if (slices)
+		slices -= 1; /* last slice is the whole disk */
 	seq_buf_printf(&state->pp_buf, "sysV68: %s(s%u)", state->name, slices);
 	slice = (struct slice *)data;
 	for (i = 0; i < slices; i++, slice++) {
-- 
2.47.3


^ permalink raw reply related

* Re: [PATCH] iomap: enforce DIO alignment check in iomap]
From: Carlos Maiolino @ 2026-06-10 16:58 UTC (permalink / raw)
  To: Keith Busch; +Cc: brauner, linux-block
In-Reply-To: <aimErft1NoW-4Map@kbusch-mbp>

On Wed, Jun 10, 2026 at 09:37:17AM -0600, Keith Busch wrote:
> On Wed, Jun 10, 2026 at 05:27:42PM +0200, Carlos Maiolino wrote:
> > The DIO alignment check has been lifted from iomap layer to rely on the
> > block layer to enforce proper alignment when issuing direct IO
> > operations. This though, depending on the IO size and buffer address
> > passed to the IO operation may lead to user-visible behavior change.
> > 
> > This has been caught initially by LTP test diotest4 running on
> > PPC architecture, where the test fails because a read() operation
> > with a supposedly misaligned buffer succeeds instead of an expected
> > -EINVAL.
> 
> It's not supposed to matter where in the stack we determined this be an
> invalid request: it should still fail if it's misaligned. Could you
> clarify how this is succeeding?

Fair enough, can you point me to where the alignment is supposed to be
checked? I've been seeing kind of different behaviors with different
machines so I kind of am not sure where the alignment is supposed to be
validated within the block layer (I should probably have tagged this
patch as RFC as my understanding of the block layer is superficial).

A few runs when the test failed (because the read() call succeded) I
could see this for example:

openat(dirfd=AT_FDCWD, pathname="testdata-4.2063", flags=O_RDWR|O_DIRECT) = 3
_llseek(fd=3, offset=4096, result=[4096], whence=SEEK_SET) = 0
read(arg1=0x3, arg2=0x25960001, arg3=0x1000) = 0x1000

FWIW, on my laptop, the test fails because read() returns an -EIO (while
the test expects -EINVAL), I'm pointing the fingers to a faulty hardware
now but I didn't discard the possibility of an EIO being returned EINVAL
should have been returned instead.
I'm trying to find another machine to test and see whatever differences
appear...

Cheers

^ permalink raw reply

* Re: [PATCH 09/27] drbd: Split drbd_nl_get_connections_dumpit()
From: Bart Van Assche @ 2026-06-10 16:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
	Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <aij0fuVz_Bxi6NXi@infradead.org>

On 6/9/26 10:22 PM, Christoph Hellwig wrote:
> On Tue, Jun 09, 2026 at 03:04:56PM -0700, Bart Van Assche wrote:
>> +static int drbd_nl_put_dump_connections_result(
>> +	struct sk_buff *skb, struct netlink_callback *cb,
>> +	struct drbd_resource *resource, struct drbd_connection *connection,
>> +	enum drbd_ret_code retcode)
> 
> Weird indenation here.  The usual style is either lining up after the
> opening brace or two tabs.

This formatting comes from git clang-format. Anyway, I will fix the
formatting.

Thanks,

Bart.



^ permalink raw reply

* Re: [PATCH v3 3/4] block: drop shared-tag fairness throttling
From: Keith Busch @ 2026-06-10 16:35 UTC (permalink / raw)
  To: Sumit Saxena
  Cc: Christoph Hellwig, Martin K . Petersen, Jens Axboe,
	James E . J . Bottomley, linux-scsi, linux-block, Adam Radford,
	Khalid Aziz, Adaptec OEM Raid Solutions, Matthew Wilcox,
	Hannes Reinecke, Juergen E . Fischer, Russell King,
	linux-arm-kernel, Finn Thain, Michael Schmitz, Anil Gurumurthy,
	Sudarsana Kalluru, Oliver Neukum, Ali Akcaagac, Jamie Lenehan,
	Ram Vegesna, target-devel, Bradley Grove, Satish Kharat,
	Sesidhar Baddela, Karan Tilak Kumar, Yihang Li, Don Brace,
	storagedev, HighPoint Linux Team, Tyrel Datwyler,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, linuxppc-dev, Brian King, Lee Duncan,
	Chris Leech, Mike Christie, open-iscsi, Justin Tee, Paul Ely,
	Kashyap Desai, Shivasharan S, Chandrakanth Patil,
	megaraidlinux.pdl, Sathya Prakash Veerichetty, Sreekanth Reddy,
	mpi3mr-linuxdrv.pdl, Suganath Prabu Subramani, Ranjan Kumar,
	MPT-FusionLinux.pdl, Daniel Palmer, GOTO Masanori, YOKOTA Hiroshi,
	Jack Wang, Geoff Levand, Michael Reed, Nilesh Javali,
	GR-QLogic-Storage-Upstream, Narsimhulu Musini, K . Y . Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, linux-hyperv,
	Michael S . Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Perez, virtualization, Vishal Bhakta,
	bcm-kernel-feedback-list, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, xen-devel, Bart Van Assche
In-Reply-To: <CAL2rwxr1uGshb1o=jvP2OnBffNz2cKXj8tHuAUCN5HFuy2vB_g@mail.gmail.com>

On Wed, Jun 10, 2026 at 09:16:11PM +0530, Sumit Saxena wrote:
> The motivation for this change stems from performance issue we
> encountered due to false sharing of the 'nr_active_requests_shared_tags'
> counter
> on certain CPU architectures. I initially submitted a patch to move that
> counter to
> its own cache line to avoid conflicts with 'nr_requests' and other hot
> fields
> (see:
> https://patchwork.kernel.org/project/linux-scsi/patch/20260402074637.92417-3-sumit.saxena@broadcom.com/
> ).
> 
> During the review, Bart shared his work, which eliminates the
> counter entirely by removing the fairness throttling. My testing confirmed
> that
> this approach resolved the performance issues and improved IOPS.
> This patch is part of a larger set, and I have reported the cumulative
> performance
> improvements in the cover letter.

So the problem is just the atomic operation accounting overhead? I
previously thought the device just really needed to consume all the tags
to hit performance.

^ permalink raw reply

* Re: [PATCH v3 3/4] block: drop shared-tag fairness throttling
From: Bart Van Assche @ 2026-06-10 16:05 UTC (permalink / raw)
  To: open-iscsi, Hannes Reinecke, Sumit Saxena, Martin K . Petersen,
	Jens Axboe
  Cc: James E . J . Bottomley, linux-scsi, linux-block
In-Reply-To: <93a82831-608d-4462-a019-26b3adc7089c@suse.de>

On 6/9/26 11:18 PM, Hannes Reinecke wrote:
> The whole point of this was to increase fairness between drives, so
> of course removing it will make an individual drive going faster ...
Data that shows that fairness is preserved even with this patch applied
is available here:
https://lore.kernel.org/linux-block/20240529213921.3166462-1-bvanassche@acm.org/

Thanks,

Bart.

^ permalink raw reply

* Re: [PATCH v3 3/4] block: drop shared-tag fairness throttling
From: Bart Van Assche @ 2026-06-10 16:03 UTC (permalink / raw)
  To: open-iscsi, Christoph Hellwig, Sumit Saxena
  Cc: Martin K . Petersen, Jens Axboe, James E . J . Bottomley,
	linux-scsi, linux-block
In-Reply-To: <aikAs4X-2NWTuwCc@infradead.org>

On 6/9/26 11:14 PM, Christoph Hellwig wrote:
> Just dropping the fairness was rejected before and there is no
> explanation here on why any of that has changed.
Hmm ... has anyone ever rejected this patch? Jens' latest feedback
(May 2024) is available here:
https://lore.kernel.org/linux-block/7a69eba2-42e4-4c67-8a54-37b5b41675f9@kernel.dk/

Thanks,

Bart.

^ permalink raw reply

* Re: [PATCH v3 3/4] block: drop shared-tag fairness throttling
From: Sumit Saxena @ 2026-06-10 15:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Martin K . Petersen, Jens Axboe, James E . J . Bottomley,
	linux-scsi, linux-block, Adam Radford, Khalid Aziz,
	Adaptec OEM Raid Solutions, Matthew Wilcox, Hannes Reinecke,
	Juergen E . Fischer, Russell King, linux-arm-kernel, Finn Thain,
	Michael Schmitz, Anil Gurumurthy, Sudarsana Kalluru,
	Oliver Neukum, Ali Akcaagac, Jamie Lenehan, Ram Vegesna,
	target-devel, Bradley Grove, Satish Kharat, Sesidhar Baddela,
	Karan Tilak Kumar, Yihang Li, Don Brace, storagedev,
	HighPoint Linux Team, Tyrel Datwyler, Madhavan Srinivasan,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, linuxppc-dev,
	Brian King, Lee Duncan, Chris Leech, Mike Christie, open-iscsi,
	Justin Tee, Paul Ely, Kashyap Desai, Shivasharan S,
	Chandrakanth Patil, megaraidlinux.pdl, Sathya Prakash Veerichetty,
	Sreekanth Reddy, mpi3mr-linuxdrv.pdl, Suganath Prabu Subramani,
	Ranjan Kumar, MPT-FusionLinux.pdl, Daniel Palmer, GOTO Masanori,
	YOKOTA Hiroshi, Jack Wang, Geoff Levand, Michael Reed,
	Nilesh Javali, GR-QLogic-Storage-Upstream, Narsimhulu Musini,
	K . Y . Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	linux-hyperv, Michael S . Tsirkin, Jason Wang, Paolo Bonzini,
	Stefan Hajnoczi, Eugenio Perez, virtualization, Vishal Bhakta,
	bcm-kernel-feedback-list, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, xen-devel, Bart Van Assche
In-Reply-To: <aikAs4X-2NWTuwCc@infradead.org>


[-- Attachment #1.1: Type: text/plain, Size: 1889 bytes --]

On Wed, Jun 10, 2026 at 11:44 AM Christoph Hellwig <hch@infradead.org>
wrote:
>
> Just dropping the fairness was rejected before and there is no
> explanation here on why any of that has changed.

I missed the fact that this patch had been previously rejected.

The motivation for this change stems from performance issue we
encountered due to false sharing of the 'nr_active_requests_shared_tags'
counter
on certain CPU architectures. I initially submitted a patch to move that
counter to
its own cache line to avoid conflicts with 'nr_requests' and other hot
fields
(see:
https://patchwork.kernel.org/project/linux-scsi/patch/20260402074637.92417-3-sumit.saxena@broadcom.com/
).

During the review, Bart shared his work, which eliminates the
counter entirely by removing the fairness throttling. My testing confirmed
that
this approach resolved the performance issues and improved IOPS.
This patch is part of a larger set, and I have reported the cumulative
performance
improvements in the cover letter.

Thanks,
Sumit

>
> On Tue, Jun 09, 2026 at 05:48:02PM +0530, Sumit Saxena wrote:
> > From: Bart Van Assche <bvanassche@acm.org>
> >
> > Original patch [1] by Bart Van Assche; this version is rebased onto the
> > current tree.  In testing it improves IOPS by roughly 16-18% by removing
> > the fair-sharing throttle on shared tag queues.
> >
> > This patch removes the following code and structure members:
> > - The function hctx_may_queue().
> > - blk_mq_hw_ctx.nr_active and
request_queue.nr_active_requests_shared_tags
> >   and also all the code that modifies these two member variables.
>
> .. and besides that, this commit message is still entirely useless
> as it doesn't explain any of the thoughts of why this change is safe
> and desirable.  While the mechanics above are totally obvious from
> the code change itself.
>

[-- Attachment #1.2: Type: text/html, Size: 2485 bytes --]

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5469 bytes --]

^ permalink raw reply

* Re: [PATCH] iomap: enforce DIO alignment check in iomap]
From: Keith Busch @ 2026-06-10 15:37 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: brauner, linux-block
In-Reply-To: <aimCIEpCwUUyoORf@nidhogg.toxiclabs.cc>

On Wed, Jun 10, 2026 at 05:27:42PM +0200, Carlos Maiolino wrote:
> The DIO alignment check has been lifted from iomap layer to rely on the
> block layer to enforce proper alignment when issuing direct IO
> operations. This though, depending on the IO size and buffer address
> passed to the IO operation may lead to user-visible behavior change.
> 
> This has been caught initially by LTP test diotest4 running on
> PPC architecture, where the test fails because a read() operation
> with a supposedly misaligned buffer succeeds instead of an expected
> -EINVAL.

It's not supposed to matter where in the stack we determined this be an
invalid request: it should still fail if it's misaligned. Could you
clarify how this is succeeding?

^ permalink raw reply

* [PATCH] iomap: enforce DIO alignment check in iomap]
From: Carlos Maiolino @ 2026-06-10 15:27 UTC (permalink / raw)
  To: brauner, linux-block

Sorry, I screwed up two addresses, bouncing them to the right addresses
to avoid spamming the lists again.

----- Forwarded message from cem@kernel.org -----

Date: Wed, 10 Jun 2026 16:52:11 +0200
From: cem@kernel.org
To: brauner@vger.kernel.org
Cc: linux-block@vger.kernel.devel, linux-fsdevel@vger.kernel.org,
  linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, Carlos Maiolino
 <cem@kernel.org>,  Keith Busch <kbusch@kernel.org>, Hannes Reinecke
 <hare@suse.de>,  "Martin K. Petersen" <martin.petersen@oracle.com>,
 Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>
Subject: [PATCH] iomap: enforce DIO alignment check in iomap
Message-ID: <20260610145218.141369-1-cem@kernel.org>
X-Mailer: git-send-email 2.54.0
Tags: sent

From: Carlos Maiolino <cem@kernel.org>

The DIO alignment check has been lifted from iomap layer to rely on the
block layer to enforce proper alignment when issuing direct IO
operations. This though, depending on the IO size and buffer address
passed to the IO operation may lead to user-visible behavior change.

This has been caught initially by LTP test diotest4 running on
PPC architecture, where the test fails because a read() operation
with a supposedly misaligned buffer succeeds instead of an expected
-EINVAL.
This has no direct relationship with PPC, but seems to do with the
IO size crossing page borders or not.

The test allocates a 4k buffer, and then increments the buffer pointer
by a single byte to enforce a misaligned address. It then issues a 4k
read() using such buffer. The read is supposed to return an -EINVAL but
it ends up succeeding.

The allocated buffer is at least a single page, so the read() size being
smaller will end up most of the time within the very same page initially
allocated which seems to suffice the block layer to accept the IO.

On x86 though, the same 4k read will end up crossing page boundaries
causing a bio_split which ends up properly checking the address and
rejecting it due to misalignment.
The test itself is buggy (which seems by design) because it ends up
attempting to read 4096 bytes into a 4095, but I believe the test
expected the address to be rejected prior to any write attempt.

The problematic behavior is reproducible on x86 by reducing the IO size
to something < PAGE_SIZE, so the misaligned read()s will also be accepted
by the block layer.

Fixing this is just a matter of enforcing daddr and memory
alignment back into iomap.

This behavior is reproducible in ext4 and xfs due to both relying on
iomap layer, btrfs does not present this behavior change as it does its
own DIO alignment checking.

Fixes: 7eac33186957 ("iomap: simplify direct io validity check")
Cc: Keith Busch <kbusch@kernel.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
---

While I didn't spot any memory/disk corruption looking into this, it
changes the user behavior that dictates buffer addresses must be
properly aligned when issuing direct IO operations so I thought making
iomap check again for the buffer address alignment is reasonable.

 fs/iomap/direct-io.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 95254aa1b654..0064984e64e5 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -400,6 +400,9 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
 	if ((pos | length) & (alignment - 1))
 		return -EINVAL;

+	if (iov_iter_alignment(dio->submit.iter) & (alignment - 1))
+		return -EINVAL;
+
 	if (dio->flags & IOMAP_DIO_WRITE) {
 		bool need_completion_work = true;

-- 
2.54.0

----- End forwarded message -----

^ permalink raw reply related

* Re: [PATCH v2 00/14] list: Prepare entry iterators to cache cursor state
From: Andy Shevchenko @ 2026-06-10 15:02 UTC (permalink / raw)
  To: Christian König
  Cc: Kaitao Cheng, Thierry Reding, Jonathan Hunter,
	Sowjanya Komatineni, Davidlohr Bueso, Paul E . McKenney,
	Josh Triplett, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Boqun Feng, Liam Girdwood, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Eddie James, Mark Brown,
	Maxime Coquelin, Alexandre Torgue, Laxman Dewangan,
	Neil Armstrong, Robert Foss, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Laurent Pinchart,
	Jonas Karlman, Jernej Skrabec, Matthew Auld, Matthew Brost,
	Waiman Long, drbd-dev, linux-block, linux1394-devel, dri-devel,
	intel-gfx, linux-spi, linux-stm32, linux-arm-kernel, linux-tegra,
	linux-sound, linux-kernel, Andrew Morton, Randy Dunlap,
	Christian Brauner, David Howells, Luca Ceresoli, Kaito Cheng,
	Muchun Song, Philipp Reisner, Lars Ellenberg,
	Christoph Böhmwalder, Jens Axboe, Takashi Sakamoto,
	Andrzej Hajda, Jaroslav Kysela, Takashi Iwai
In-Reply-To: <d974a2ea-6102-45ff-bf36-3b25a2404e40@amd.com>

On Wed, Jun 10, 2026 at 11:11:34AM +0200, Christian König wrote:
> On 6/10/26 10:18, Kaitao Cheng wrote:
> > 在 2026/6/10 16:07, Christian König 写道:

...

> > Should we revert to v1, or keep list_for_each_entry() and
> > list_for_each_entry_safe() as they are, close this thread, and make no
> > changes?
> > 
> > Link to v1:
> > https://lore.kernel.org/all/20260529082149.76764-1-kaitao.cheng@linux.dev/
> > 
> > Or do you have any better suggestions?
> 
> v1 looks perfectly reasonable to me.

But why not just hiding that once for all (in case they don't use the temporary
iterator)? Easy to automate, robust — everyone is happy?

> You should just include some patches in the same patch set to actually use
> the new macros.
> 
> If you modify the files under drivers/dma-buf or drivers/gpu/drm/amd to use
> the new macro I'm happy to review that.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply

* Re: [PATCH v2] rust: add procedural macro for declaring configfs attributes
From: Malte Wechter @ 2026-06-10 14:48 UTC (permalink / raw)
  To: Gary Guo, Andreas Hindborg, Breno Leitao, Miguel Ojeda,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Jens Axboe
  Cc: linux-kernel, rust-for-linux, linux-block
In-Reply-To: <DJ5A9HIJ43EJ.2K3SCFI7UR3NF@garyguo.net>


On 6/10/26 12:05 PM, Gary Guo wrote:
> On Wed Jun 10, 2026 at 10:15 AM BST, Malte Wechter wrote:
>>>> +
>>>> +        Ok(ConfigfsAttrs {
>>>> +            container,
>>>> +            data,
>>>> +            child,
>>>> +            attributes,
>>>> +        })
>>>> +    }
>>>> +}
>>>> +
>>>> +fn make_static_ident<T: ToTokens>(ty: &T, suffix: &str) -> syn::Ident {
>>>> +    let raw_id = quote! { #ty }.to_string();
>>>> +
>>>> +    // Sanitizing syn::Type::Path, this is safe since it is
>>>> +    // only used as the identifier.
>>>> +    let normalized = raw_id
>>>> +        .split("::")
>>>> +        .map(|s| String::from(s.trim()))
>>>> +        .reduce(|a, b| format!("{a}_{b}"))
>>>> +        .expect("Cannot be empty")
>>>> +        .to_uppercase()
>>>> +        .replace(|c: char| !c.is_alphanumeric(), "_");
>>>> +
>>>> +    Ident::new(&format!("{}_{}", normalized, suffix), ty.span())
>>>> +}
>>>> +
>>>> +pub(crate) fn configfs_attrs(cfs_attrs: ConfigfsAttrs) -> proc_macro2::TokenStream {
>>>> +    let (container_ty, data_ty) = (&cfs_attrs.container, &cfs_attrs.data);
>>>> +
>>>> +    let data_tp_ident = make_static_ident(data_ty, "TPE");
>>>> +    let data_attr_ident = make_static_ident(data_ty, "ATTR_LIST");
>>> Instead of creating identifers like this, just scope them properly so that
>>> a fixed identifier is used without colliding.
>> I believe it is nice to keep the identifiers unique, especially if you
>> have many fields
>> you can easily differentiate them.
> What make_static_ident is doing I think is quite hacky. I'd rather not having
> them. Plus, the identifiers are not seen by users anyway.
>
> Best,
> Gary

I will simplify the naming of the static variables greatly (removing 
make_static_ident), but still generate unique identifiers.

Best regards,

Malte


^ permalink raw reply

* Re: [PATCH v2 00/14] list: Prepare entry iterators to cache cursor state
From: Andy Shevchenko @ 2026-06-10 14:43 UTC (permalink / raw)
  To: Kaitao Cheng
  Cc: Christian König, Thierry Reding, Jonathan Hunter,
	Sowjanya Komatineni, Davidlohr Bueso, Paul E . McKenney,
	Josh Triplett, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Boqun Feng, Liam Girdwood, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Eddie James, Mark Brown,
	Maxime Coquelin, Alexandre Torgue, Laxman Dewangan,
	Neil Armstrong, Robert Foss, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Laurent Pinchart,
	Jonas Karlman, Jernej Skrabec, Matthew Auld, Matthew Brost,
	Waiman Long, drbd-dev, linux-block, linux1394-devel, dri-devel,
	intel-gfx, linux-spi, linux-stm32, linux-arm-kernel, linux-tegra,
	linux-sound, linux-kernel, Andrew Morton, Randy Dunlap,
	Christian Brauner, David Howells, Luca Ceresoli, Kaito Cheng,
	Muchun Song, Philipp Reisner, Lars Ellenberg,
	Christoph Böhmwalder, Jens Axboe, Takashi Sakamoto,
	Andrzej Hajda, Jaroslav Kysela, Takashi Iwai
In-Reply-To: <5152089a-2808-4fe9-b633-b03018105dd2@linux.dev>

On Wed, Jun 10, 2026 at 02:14:06PM +0800, Kaitao Cheng wrote:
> 在 2026/6/9 18:33, Christian König 写道:
> > On 6/9/26 08:13, Kaitao Cheng wrote:
> >>
> >> This series prepares for, and then updates, the list_for_each_entry()
> >> family so the common entry iterators cache their next or previous cursor
> >> before the loop body runs.
> > 
> > Why in the world would we want to do that?
> > 
> > The safe and non-safe variants have very distinct use cases and that is completely intentional.
> > 
> > What we could improve maybe is the documentation, from my experience an astonishing large amount of people have misconceptions about the safe variants.
> > 
> >> The first 13 patches open-code loops that intentionally depend on the
> >> old "derive the next entry from the current cursor at the end of the
> >> iteration" behaviour.  These loops append work to the list being walked,
> >> restart traversal after dropping a lock, skip an entry consumed by the
> >> current iteration, or otherwise adjust the cursor in the loop body.
> > 
> > Well I have to clearly reject the changes for subsystems/components I'm maintaining, that just looks horrible to me and I clearly don't see a good reason for that.
> 
> Hi Christian and Andy Shevchenko,
> 
> Thanks for taking a look. I would like to clarify the point you raised.
> 
> The reason I started looking at this is the original motivation behind
> the _safe() variants.  They exist because some users need to remove, move
> or otherwise consume the current entry while walking the list.  In that
> case the next cursor has to be preserved before the loop body can modify
> the current entry.
> 
> The unfortunate part is that this could not be expressed with the
> existing list_for_each_entry() interface without changing its calling
> convention.  The _safe() variants had to grow an extra argument for the
> temporary cursor, and that is why we ended up with a separate family of
> macros.
> 
> But conceptually, the distinction does not have to be exposed as two
> different iterator families forever.  The difference is an implementation
> detail: whether the iterator keeps the next/previous cursor before the
> body runs.  This series makes the common list_for_each_entry() iterators
> do that internally, so the safe and non-safe forms can effectively be
> folded together, or at least the need for a separate public _safe()
> interface becomes much weaker.
> 
> There is also a usability issue with the current _safe() interface.  The
> caller is forced to define a temporary cursor outside the macro and pass
> it in, even though almost all users never use that cursor directly.  It is
> just boilerplate required by the macro implementation.  I find that
> redundant and awkward: the temporary cursor is an internal detail of the
> iteration, but every caller has to spell it out.

Ah, I think the distinct macro families is that what we want.
But the hiding of the parameter can be done inside list_for_each_*_safe().
You can do a treewide change with coccinelle.

Sorry if I didn't get the whole idea from your previous contributions.

Note, even cases that would need a temporary cursor may be switched to
new list_for_each_*_safe(), see how PCI macros for iterating over resources
are implemented (include/linux/pci.h).

> With the updated list_for_each_entry() implementation, that extra cursor
> can be kept inside the iterator itself.  Callers that only want to walk
> the list, including callers that delete or consume the current entry, no
> longer need to carry an otherwise-unused temporary variable just to make
> the macro work.
> 
> >> The final patch changes include/linux/list.h to keep a private cursor in
> >> the common entry iterators while preserving the public macro interface.
> >> The safe variants remain available when callers need the temporary
> >> cursor explicitly or have stronger mutation requirements.


-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply

* Re: [PATCH v2 23/83] block: rnull: add discard support
From: Malte Wechter @ 2026-06-10 13:55 UTC (permalink / raw)
  To: Andreas Hindborg, Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen,
	Benno Lossin, Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross
  Cc: linux-block, linux-kernel, linux-mm, rust-for-linux
In-Reply-To: <20260609-rnull-v6-19-rc5-send-v2-23-82c7404542e2@kernel.org>

On Tue Jun 9, 2026 at 9:08 PM CEST, Andreas Hindborg wrote:
> Add support for discard operations to the rnull block driver:
> - Add discard module parameter and configfs attribute.
> - Set max_hw_discard_sectors when discard is enabled.
> - Add sector occupancy tracking.
> - Add discard handling that frees sectors and removes empty pages.
> - Discard operations require memory backing to function.
>
> The discard feature uses a bitmap to track which sectors in each page are
> occupied, allowing cleanup of pages when they are empty.
>
> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
> ---
>  drivers/block/rnull/configfs.rs |  15 +++++
>  drivers/block/rnull/rnull.rs    | 120 +++++++++++++++++++++++++++++++++++-----
>  2 files changed, 121 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
> index 2f3fa81ea121..e47399cd45a4 100644
> --- a/drivers/block/rnull/configfs.rs
> +++ b/drivers/block/rnull/configfs.rs
...
>          }
>      })
>  );
> +
> +configfs_attribute!(DeviceConfig, 10,
> +    show: |this, page| show_field(this.data.lock().discard, page),
> +    store: |this, page| store_with_power_check(this, page, |data, page| {
> +        if !data.memory_backed {
> +            return Err(EINVAL);
> +        }
> +        data.discard = kstrtobool_bytes(page)?;
> +        Ok(())
> +    })
> +);
Should it be ok to set 'discard' to 0 if 'emory_backed' is not set?
In the C null_blk driver, 'discard' defaults to 0 if 'memory_backed' is not set,
it is also ignored (and defaulted to 0) if 'zoned' is enabled.

Best regards,
Malte

^ permalink raw reply

* Re: [PATCH v7 00/43] btrfs: add fscrypt support
From: Daniel Vacek @ 2026-06-10 13:19 UTC (permalink / raw)
  To: Eric Biggers
  Cc: David Sterba, linux-block, linux-fscrypt, linux-btrfs,
	linux-kernel, Chris Mason, Josef Bacik, Theodore Y. Ts'o,
	Jaegeuk Kim, Jens Axboe
In-Reply-To: <20260531002812.GA2302@sol>

On Sun, 31 May 2026 at 02:29, Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Fri, May 22, 2026 at 09:00:46AM +0200, Daniel Vacek wrote:
> > Hi Eric,
> >
> > This is just a gentle ping.
> > I was wondering if you had a chance to look at this version?
> > I believe all your previous feedback has been addressed and this
> > version is solid.
> > Please, let me know your thoughts.
> >
> > Regards,
> > Daniel
>
> It's been really hard to find time to review this huge patchset.  I've
> started going through it and will try to leave comments next week.

Hi Eric,

First, thank you for looking into this.

> In the mean time it would be really helpful if you went through the
> Sashiko reviews
> (https://sashiko.dev/#/patchset/20260513085340.3673127-1-neelx%40suse.com)
> and address the ones that make sense to.  It found 93 issues including
> 16 critical ones, which is kind of a lot.  Some of them are the same
> things I'm noticing already.  Same for the issue that Christoph noticed
> where new devices can be added; Sashiko had already found that too.
>
> If I'm going to have to use my limited human review time to point out
> issues that were already found, that's not a great use of time.

I already went through some of the Sashiko reviews (they were slowly
coming up one by one) before leaving for my vacation. And I found them
mostly confusing or misleading. I'm gonna have a look into the rest I
haven't seen yet to see if there are any useful points. And I'll
compare it to your notes to see if it was Sashiko being confused or if
it was me.

> I also don't see any information about how this was tested (and will
> continue to be tested in the future).

I'm using the `encrypt` group of xfstests as the acceptance criteria
and the full test run to ensure no regressions.

General support for btrfs had to be added. And since we only support
the v2 fscrypt policy, some tests had to be split into two tests - one
testing v1 and the other testing v2 policy.
There are also additional new btrfs specific tests (reflinks,
snapshots). xfstests also need fscrypt support in btrfs-progs with
related changes to export nonces and other metadata like device
offsets of encrypted blocks for validation.

I'll send the fscrypt updated btrfs-progs and xfstests next. I just
need to clean them up a bit first. I'm sorry about the delay, but I
only managed to post the kernel part before my vacation. I was hoping
it would give you all more time to review.

Thanks,
Daniel

> - Eric

^ permalink raw reply

* Re: configurable block error injection v4
From: Christoph Hellwig @ 2026-06-10 13:11 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
	linux-block, linux-doc
In-Reply-To: <20260610051015.1906799-1-hch@lst.de>

Sashiko had two comments on this:

 - the one about the race of removal vs. addition is spot on, and
   trivially fixed by actually removing the code (unlock/lock cycle).
 - the other about zero-sized commands is valid, but more of an
   enhancement.  And one that if implemented right now actually
   make things worse, as flushed actually show up as empty writes
   with the preflush bit and not as REQ_OP_FLUSH.  So if we'd special
   case zero-sized bios, we'd make a flush hit all write rules,
   which would defeat the point.  We really need to do flushes as
   REQ_OP_FLUSH at the bio level :(

Below is what I plan to fold in, and I'm thinking of the empty bio
issue above can be caught in a comment or the documentation nicely.

diff --git a/block/error-injection.c b/block/error-injection.c
index 7f7f0d3327bc..3bc91f199dc7 100644
--- a/block/error-injection.c
+++ b/block/error-injection.c
@@ -119,11 +119,7 @@ static void error_inject_removeall(struct gendisk *disk)
 	while ((inj = list_first_entry_or_null(&disk->error_injection_list,
 			struct blk_error_inject, entry))) {
 		list_del_rcu(&inj->entry);
-		mutex_unlock(&disk->error_injection_lock);
-
 		kfree_rcu_mightsleep(inj);
-
-		mutex_lock(&disk->error_injection_lock);
 	}
 	static_branch_dec(&blk_error_injection_enabled);
 	mutex_unlock(&disk->error_injection_lock);

^ permalink raw reply related

* Re: [PATCH v1] floppy: Drop unused pnp driver data
From: Jens Axboe @ 2026-06-10 12:28 UTC (permalink / raw)
  To: Denis Efremov, Uwe Kleine-König (The Capable Hub)
  Cc: linux-block, linux-kernel
In-Reply-To: <99dbf851ffb99229ea1dcfd8f58e9ee6a1f05349.1781075967.git.u.kleine-koenig@baylibre.com>


On Wed, 10 Jun 2026 09:27:55 +0200, Uwe Kleine-König (The Capable Hub) wrote:
> The pnp_device_id array is only used for module data to support
> auto-loading the floppy module. So the .driver_data member is unused and
> this assignment can be dropped.
> 
> While touching that array, align the coding style to what is used most
> for these.
> 
> [...]

Applied, thanks!

[1/1] floppy: Drop unused pnp driver data
      commit: 64e335790272d9b0468af3a70e34f377924b156f

Best regards,
-- 
Jens Axboe




^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox