linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] block/dm: use BIOSET_PERCPU_CACHE from bio_alloc_clone
@ 2022-03-22 19:49 Mike Snitzer
  2022-03-22 19:49 ` [PATCH 1/3] block: allow BIOSET_PERCPU_CACHE use " Mike Snitzer
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Mike Snitzer @ 2022-03-22 19:49 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, hch, dm-devel, linux-block

Hi Jens,

I ran with your suggestion and DM now sees a ~7% improvement in hipri
bio polling with io_uring (using dm-linear on null_blk, IOPS went from
900K to 966K).

I'd appreciate it if you could pick up the first patch for 5.19.
I'll rebase dm's 5.19 branch on block once it lands.

(FYI, this series builds on linux-dm.git's "dm-5.18" branch, and the
commits in this series are available in linux-dm.git's "dm-5.19"
branch).

Thanks,
Mike

Mike Snitzer (3):
  block: allow BIOSET_PERCPU_CACHE use from bio_alloc_clone
  dm: enable BIOSET_PERCPU_CACHE for dm_io bioset
  dm: conditionally enable BIOSET_PERCPU_CACHE for bio-based dm_io bioset

 block/bio.c           | 56 ++++++++++++++++++++++++++++++++-------------------
 block/blk.h           |  7 -------
 drivers/md/dm-table.c | 11 +++++++---
 drivers/md/dm.c       | 10 ++++-----
 drivers/md/dm.h       |  4 ++--
 include/linux/bio.h   |  7 +++++++
 6 files changed, 57 insertions(+), 38 deletions(-)

-- 
2.15.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] block: allow BIOSET_PERCPU_CACHE use from bio_alloc_clone
  2022-03-22 19:49 [PATCH 0/3] block/dm: use BIOSET_PERCPU_CACHE from bio_alloc_clone Mike Snitzer
@ 2022-03-22 19:49 ` Mike Snitzer
  2022-03-22 19:54   ` Christoph Hellwig
  2022-03-22 19:49 ` [PATCH 2/3] dm: enable BIOSET_PERCPU_CACHE for dm_io bioset Mike Snitzer
  2022-03-22 19:49 ` [PATCH 3/3] dm: conditionally enable BIOSET_PERCPU_CACHE for bio-based " Mike Snitzer
  2 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2022-03-22 19:49 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, hch, dm-devel, linux-block

These changes allow DM core to make full use of BIOSET_PERCPU_CACHE:

Factor out bio_alloc_percpu_cache() from bio_alloc_kiocb() to allow
use by bio_alloc_clone() too.

Update bioset_init_from_src() to set BIOSET_PERCPU_CACHE if
bio_src->cache is not NULL.

Move bio_clear_polled() to include/linux/bio.h to allow users outside
of block core.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 block/bio.c         | 56 +++++++++++++++++++++++++++++++++--------------------
 block/blk.h         |  7 -------
 include/linux/bio.h |  7 +++++++
 3 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index b15f5466ce08..2c3a1f678461 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -728,6 +728,33 @@ void bio_put(struct bio *bio)
 }
 EXPORT_SYMBOL(bio_put);
 
+static struct bio *bio_alloc_percpu_cache(struct block_device *bdev,
+		unsigned short nr_vecs, unsigned int opf, gfp_t gfp,
+		struct bio_set *bs)
+{
+	struct bio_alloc_cache *cache;
+	struct bio *bio;
+
+	cache = per_cpu_ptr(bs->cache, get_cpu());
+	if (cache->free_list) {
+		bio = cache->free_list;
+		cache->free_list = bio->bi_next;
+		cache->nr--;
+		put_cpu();
+		bio_init(bio, bdev, nr_vecs ? bio->bi_inline_vecs : NULL,
+			 nr_vecs, opf);
+		bio->bi_pool = bs;
+		bio_set_flag(bio, BIO_PERCPU_CACHE);
+		return bio;
+	}
+	put_cpu();
+	bio = bio_alloc_bioset(bdev, nr_vecs, opf, gfp, bs);
+	if (!bio)
+		return NULL;
+	bio_set_flag(bio, BIO_PERCPU_CACHE);
+	return bio;
+}
+
 static int __bio_clone(struct bio *bio, struct bio *bio_src, gfp_t gfp)
 {
 	bio_set_flag(bio, BIO_CLONED);
@@ -768,7 +795,10 @@ struct bio *bio_alloc_clone(struct block_device *bdev, struct bio *bio_src,
 {
 	struct bio *bio;
 
-	bio = bio_alloc_bioset(bdev, 0, bio_src->bi_opf, gfp, bs);
+	if (bs->cache && bio_src->bi_opf & REQ_POLLED)
+		bio = bio_alloc_percpu_cache(bdev, 0, bio_src->bi_opf, gfp, bs);
+	else
+		bio = bio_alloc_bioset(bdev, 0, bio_src->bi_opf, gfp, bs);
 	if (!bio)
 		return NULL;
 
@@ -1736,6 +1766,8 @@ int bioset_init_from_src(struct bio_set *bs, struct bio_set *src)
 		flags |= BIOSET_NEED_BVECS;
 	if (src->rescue_workqueue)
 		flags |= BIOSET_NEED_RESCUER;
+	if (src->cache)
+		flags |= BIOSET_PERCPU_CACHE;
 
 	return bioset_init(bs, src->bio_pool.min_nr, src->front_pad, flags);
 }
@@ -1753,35 +1785,17 @@ EXPORT_SYMBOL(bioset_init_from_src);
  *    Like @bio_alloc_bioset, but pass in the kiocb. The kiocb is only
  *    used to check if we should dip into the per-cpu bio_set allocation
  *    cache. The allocation uses GFP_KERNEL internally. On return, the
- *    bio is marked BIO_PERCPU_CACHEABLE, and the final put of the bio
+ *    bio is marked BIO_PERCPU_CACHE, and the final put of the bio
  *    MUST be done from process context, not hard/soft IRQ.
  *
  */
 struct bio *bio_alloc_kiocb(struct kiocb *kiocb, struct block_device *bdev,
 		unsigned short nr_vecs, unsigned int opf, struct bio_set *bs)
 {
-	struct bio_alloc_cache *cache;
-	struct bio *bio;
-
 	if (!(kiocb->ki_flags & IOCB_ALLOC_CACHE) || nr_vecs > BIO_INLINE_VECS)
 		return bio_alloc_bioset(bdev, nr_vecs, opf, GFP_KERNEL, bs);
 
-	cache = per_cpu_ptr(bs->cache, get_cpu());
-	if (cache->free_list) {
-		bio = cache->free_list;
-		cache->free_list = bio->bi_next;
-		cache->nr--;
-		put_cpu();
-		bio_init(bio, bdev, nr_vecs ? bio->bi_inline_vecs : NULL,
-			 nr_vecs, opf);
-		bio->bi_pool = bs;
-		bio_set_flag(bio, BIO_PERCPU_CACHE);
-		return bio;
-	}
-	put_cpu();
-	bio = bio_alloc_bioset(bdev, nr_vecs, opf, GFP_KERNEL, bs);
-	bio_set_flag(bio, BIO_PERCPU_CACHE);
-	return bio;
+	return bio_alloc_percpu_cache(bdev, nr_vecs, opf, GFP_KERNEL, bs);
 }
 EXPORT_SYMBOL_GPL(bio_alloc_kiocb);
 
diff --git a/block/blk.h b/block/blk.h
index ebaa59ca46ca..8e338e76d303 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -451,13 +451,6 @@ extern struct device_attribute dev_attr_events;
 extern struct device_attribute dev_attr_events_async;
 extern struct device_attribute dev_attr_events_poll_msecs;
 
-static inline void bio_clear_polled(struct bio *bio)
-{
-	/* can't support alloc cache if we turn off polling */
-	bio_clear_flag(bio, BIO_PERCPU_CACHE);
-	bio->bi_opf &= ~REQ_POLLED;
-}
-
 long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
 long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
 
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 7523aba4ddf7..709663ae757a 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -787,6 +787,13 @@ static inline void bio_set_polled(struct bio *bio, struct kiocb *kiocb)
 		bio->bi_opf |= REQ_NOWAIT;
 }
 
+static inline void bio_clear_polled(struct bio *bio)
+{
+	/* can't support alloc cache if we turn off polling */
+	bio_clear_flag(bio, BIO_PERCPU_CACHE);
+	bio->bi_opf &= ~REQ_POLLED;
+}
+
 struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
 		unsigned int nr_pages, unsigned int opf, gfp_t gfp);
 
-- 
2.15.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] dm: enable BIOSET_PERCPU_CACHE for dm_io bioset
  2022-03-22 19:49 [PATCH 0/3] block/dm: use BIOSET_PERCPU_CACHE from bio_alloc_clone Mike Snitzer
  2022-03-22 19:49 ` [PATCH 1/3] block: allow BIOSET_PERCPU_CACHE use " Mike Snitzer
@ 2022-03-22 19:49 ` Mike Snitzer
  2022-03-22 19:49 ` [PATCH 3/3] dm: conditionally enable BIOSET_PERCPU_CACHE for bio-based " Mike Snitzer
  2 siblings, 0 replies; 7+ messages in thread
From: Mike Snitzer @ 2022-03-22 19:49 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, hch, dm-devel, linux-block

Also change dm_io_complete() to use bio_clear_polled() so that it
clears both REQ_POLLED and BIO_PERCPU_CACHE if the bio is requeued due
to BLK_STS_DM_REQUEUE.

Only io_uring benefits from using BIOSET_PERCPU_CACHE, it is only safe
to use in non-interrupt context but io_uring's completions are all in
process context.

This change improves DM's hipri bio polling performance by ~7%.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 drivers/md/dm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index a2e80c376827..06f3720a190b 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -899,9 +899,9 @@ static void dm_io_complete(struct dm_io *io)
 		/*
 		 * Upper layer won't help us poll split bio, io->orig_bio
 		 * may only reflect a subset of the pre-split original,
-		 * so clear REQ_POLLED in case of requeue
+		 * so clear REQ_POLLED and BIO_PERCPU_CACHE on requeue.
 		 */
-		bio->bi_opf &= ~REQ_POLLED;
+		bio_clear_polled(bio);
 		return;
 	}
 
@@ -3014,7 +3014,7 @@ struct dm_md_mempools *dm_alloc_md_mempools(struct mapped_device *md, enum dm_qu
 		pool_size = max(dm_get_reserved_bio_based_ios(), min_pool_size);
 		front_pad = roundup(per_io_data_size, __alignof__(struct dm_target_io)) + DM_TARGET_IO_BIO_OFFSET;
 		io_front_pad = roundup(per_io_data_size,  __alignof__(struct dm_io)) + DM_IO_BIO_OFFSET;
-		ret = bioset_init(&pools->io_bs, pool_size, io_front_pad, 0);
+		ret = bioset_init(&pools->io_bs, pool_size, io_front_pad, BIOSET_PERCPU_CACHE);
 		if (ret)
 			goto out;
 		if (integrity && bioset_integrity_create(&pools->io_bs, pool_size))
-- 
2.15.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] dm: conditionally enable BIOSET_PERCPU_CACHE for bio-based dm_io bioset
  2022-03-22 19:49 [PATCH 0/3] block/dm: use BIOSET_PERCPU_CACHE from bio_alloc_clone Mike Snitzer
  2022-03-22 19:49 ` [PATCH 1/3] block: allow BIOSET_PERCPU_CACHE use " Mike Snitzer
  2022-03-22 19:49 ` [PATCH 2/3] dm: enable BIOSET_PERCPU_CACHE for dm_io bioset Mike Snitzer
@ 2022-03-22 19:49 ` Mike Snitzer
  2 siblings, 0 replies; 7+ messages in thread
From: Mike Snitzer @ 2022-03-22 19:49 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, hch, dm-devel, linux-block

There is no point establishing the bioset percpu cache if request_queue
won't have QUEUE_FLAG_POLL.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 drivers/md/dm-table.c | 11 ++++++++---
 drivers/md/dm.c       |  6 +++---
 drivers/md/dm.h       |  4 ++--
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index c0be4f60b427..7ebc70e3eb2f 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1002,6 +1002,8 @@ bool dm_table_request_based(struct dm_table *t)
 	return __table_type_request_based(dm_table_get_type(t));
 }
 
+static int dm_table_supports_poll(struct dm_table *t);
+
 static int dm_table_alloc_md_mempools(struct dm_table *t, struct mapped_device *md)
 {
 	enum dm_queue_mode type = dm_table_get_type(t);
@@ -1009,21 +1011,24 @@ static int dm_table_alloc_md_mempools(struct dm_table *t, struct mapped_device *
 	unsigned min_pool_size = 0;
 	struct dm_target *ti;
 	unsigned i;
+	bool poll_supported = false;
 
 	if (unlikely(type == DM_TYPE_NONE)) {
 		DMWARN("no table type is set, can't allocate mempools");
 		return -EINVAL;
 	}
 
-	if (__table_type_bio_based(type))
+	if (__table_type_bio_based(type)) {
 		for (i = 0; i < t->num_targets; i++) {
 			ti = t->targets + i;
 			per_io_data_size = max(per_io_data_size, ti->per_io_data_size);
 			min_pool_size = max(min_pool_size, ti->num_flush_bios);
 		}
+		poll_supported = !!dm_table_supports_poll(t);
+	}
 
-	t->mempools = dm_alloc_md_mempools(md, type, t->integrity_supported,
-					   per_io_data_size, min_pool_size);
+	t->mempools = dm_alloc_md_mempools(md, type, per_io_data_size, min_pool_size,
+					   t->integrity_supported, poll_supported);
 	if (!t->mempools)
 		return -ENOMEM;
 
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 06f3720a190b..071e7615f153 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2997,8 +2997,8 @@ int dm_noflush_suspending(struct dm_target *ti)
 EXPORT_SYMBOL_GPL(dm_noflush_suspending);
 
 struct dm_md_mempools *dm_alloc_md_mempools(struct mapped_device *md, enum dm_queue_mode type,
-					    unsigned integrity, unsigned per_io_data_size,
-					    unsigned min_pool_size)
+					    unsigned per_io_data_size, unsigned min_pool_size,
+					    bool integrity, bool poll)
 {
 	struct dm_md_mempools *pools = kzalloc_node(sizeof(*pools), GFP_KERNEL, md->numa_node_id);
 	unsigned int pool_size = 0;
@@ -3014,7 +3014,7 @@ struct dm_md_mempools *dm_alloc_md_mempools(struct mapped_device *md, enum dm_qu
 		pool_size = max(dm_get_reserved_bio_based_ios(), min_pool_size);
 		front_pad = roundup(per_io_data_size, __alignof__(struct dm_target_io)) + DM_TARGET_IO_BIO_OFFSET;
 		io_front_pad = roundup(per_io_data_size,  __alignof__(struct dm_io)) + DM_IO_BIO_OFFSET;
-		ret = bioset_init(&pools->io_bs, pool_size, io_front_pad, BIOSET_PERCPU_CACHE);
+		ret = bioset_init(&pools->io_bs, pool_size, io_front_pad, poll ? BIOSET_PERCPU_CACHE : 0);
 		if (ret)
 			goto out;
 		if (integrity && bioset_integrity_create(&pools->io_bs, pool_size))
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index 9013dc1a7b00..3f89664fea01 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -221,8 +221,8 @@ void dm_kcopyd_exit(void);
  * Mempool operations
  */
 struct dm_md_mempools *dm_alloc_md_mempools(struct mapped_device *md, enum dm_queue_mode type,
-					    unsigned integrity, unsigned per_bio_data_size,
-					    unsigned min_pool_size);
+					    unsigned per_io_data_size, unsigned min_pool_size,
+					    bool integrity, bool poll);
 void dm_free_md_mempools(struct dm_md_mempools *pools);
 
 /*
-- 
2.15.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] block: allow BIOSET_PERCPU_CACHE use from bio_alloc_clone
  2022-03-22 19:49 ` [PATCH 1/3] block: allow BIOSET_PERCPU_CACHE use " Mike Snitzer
@ 2022-03-22 19:54   ` Christoph Hellwig
  2022-03-22 20:16     ` Mike Snitzer
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-03-22 19:54 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: axboe, ming.lei, hch, dm-devel, linux-block

On Tue, Mar 22, 2022 at 03:49:25PM -0400, Mike Snitzer wrote:
> -	bio = bio_alloc_bioset(bdev, 0, bio_src->bi_opf, gfp, bs);
> +	if (bs->cache && bio_src->bi_opf & REQ_POLLED)
> +		bio = bio_alloc_percpu_cache(bdev, 0, bio_src->bi_opf, gfp, bs);
> +	else

I don't think we can just unconditionally do this based on REQ_POLLED.
We'd either need a flag for bio_alloc_bioset or (probably better)
a separate interface.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] block: allow BIOSET_PERCPU_CACHE use from bio_alloc_clone
  2022-03-22 19:54   ` Christoph Hellwig
@ 2022-03-22 20:16     ` Mike Snitzer
  2022-03-23  6:46       ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2022-03-22 20:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, ming.lei, dm-devel, linux-block

On Tue, Mar 22 2022 at  3:54P -0400,
Christoph Hellwig <hch@lst.de> wrote:

> On Tue, Mar 22, 2022 at 03:49:25PM -0400, Mike Snitzer wrote:
> > -	bio = bio_alloc_bioset(bdev, 0, bio_src->bi_opf, gfp, bs);
> > +	if (bs->cache && bio_src->bi_opf & REQ_POLLED)
> > +		bio = bio_alloc_percpu_cache(bdev, 0, bio_src->bi_opf, gfp, bs);
> > +	else
> 
> I don't think we can just unconditionally do this based on REQ_POLLED.
> We'd either need a flag for bio_alloc_bioset or (probably better)
> a separate interface.
> 

I did initially think it worthwhile to push the use of
bio_alloc_percpu_cache() down to bio_alloc_bioset() rather than
bio_alloc_clone() -- but I started slower with more targetted change
for DM's needs.

And yeah, since there isn't a REQ_ flag equivalent for IOCB_ALLOC_CACHE
(other than just allowing all REQ_POLLED access) there isn't a
meaningful way to limit access to the bioset's percpu cache.

Curious: how do bio_alloc_kiocb() callers know when it appropriate to
set IOCB_ALLOC_CACHE or not?  Seems io_uring is only one and it
unilaterally does:
kiocb->ki_flags |= IOCB_HIPRI | IOCB_ALLOC_CACHE;

So like IOCB_HIPRI maps to REQ_POLL, should IOCB_ALLOC_CACHE map to
REQ_ALLOC_CACHE? Or better name?

Open to further suggestions on which way to go with these details.

Thanks,
Mike


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] block: allow BIOSET_PERCPU_CACHE use from bio_alloc_clone
  2022-03-22 20:16     ` Mike Snitzer
@ 2022-03-23  6:46       ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2022-03-23  6:46 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Christoph Hellwig, axboe, ming.lei, dm-devel, linux-block

On Tue, Mar 22, 2022 at 04:16:07PM -0400, Mike Snitzer wrote:
> I did initially think it worthwhile to push the use of
> bio_alloc_percpu_cache() down to bio_alloc_bioset() rather than
> bio_alloc_clone() -- but I started slower with more targetted change
> for DM's needs.

Note that the nvme io_uring passthrough patches also want a non-kiocb
cached alloc version.  That code isn't quite ready yet but shows we'll
need something there as well.

> And yeah, since there isn't a REQ_ flag equivalent for IOCB_ALLOC_CACHE
> (other than just allowing all REQ_POLLED access) there isn't a
> meaningful way to limit access to the bioset's percpu cache.
> 
> Curious: how do bio_alloc_kiocb() callers know when it appropriate to
> set IOCB_ALLOC_CACHE or not?  Seems io_uring is only one and it
> unilaterally does:
> kiocb->ki_flags |= IOCB_HIPRI | IOCB_ALLOC_CACHE;

Yes, an uring deals with making sure they are freed in the same place.

> So like IOCB_HIPRI maps to REQ_POLL, should IOCB_ALLOC_CACHE map to
> REQ_ALLOC_CACHE? Or better name?

The name sounds good fine, and I just it would be best done by
lifting BIO_PERCPU_CACHE to the REQ_ namespace and ensure it is
cleared by the bio allocator if the percpu cache isn't actually
used.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-03-23  6:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-22 19:49 [PATCH 0/3] block/dm: use BIOSET_PERCPU_CACHE from bio_alloc_clone Mike Snitzer
2022-03-22 19:49 ` [PATCH 1/3] block: allow BIOSET_PERCPU_CACHE use " Mike Snitzer
2022-03-22 19:54   ` Christoph Hellwig
2022-03-22 20:16     ` Mike Snitzer
2022-03-23  6:46       ` Christoph Hellwig
2022-03-22 19:49 ` [PATCH 2/3] dm: enable BIOSET_PERCPU_CACHE for dm_io bioset Mike Snitzer
2022-03-22 19:49 ` [PATCH 3/3] dm: conditionally enable BIOSET_PERCPU_CACHE for bio-based " Mike Snitzer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).