linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 04/20] block: Add some exports for bcachefs
       [not found] <20230712211115.2174650-1-kent.overstreet@linux.dev>
@ 2023-07-12 21:10 ` Kent Overstreet
  2023-07-24 17:31   ` Christoph Hellwig
  2023-07-25  2:59   ` Matthew Wilcox
  2023-07-12 21:11 ` [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset Kent Overstreet
  2023-07-12 21:11 ` [PATCH 06/20] block: Bring back zero_fill_bio_iter Kent Overstreet
  2 siblings, 2 replies; 18+ messages in thread
From: Kent Overstreet @ 2023-07-12 21:10 UTC (permalink / raw)
  To: linux-bcachefs, linux-fsdevel, linux-kernel
  Cc: Kent Overstreet, linux-block, Jens Axboe, Kent Overstreet

From: Kent Overstreet <kent.overstreet@gmail.com>

 - bio_set_pages_dirty(), bio_check_pages_dirty() - dio path
 - blk_status_to_str() - error messages
 - bio_add_folio() - this should definitely be exported for everyone,
   it's the modern version of bio_add_page()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Cc: linux-block@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 block/bio.c            | 2 ++
 block/blk-core.c       | 1 +
 block/blk.h            | 1 -
 include/linux/blkdev.h | 1 +
 4 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index 043944fd46..1e75840d17 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1481,6 +1481,7 @@ void bio_set_pages_dirty(struct bio *bio)
 			set_page_dirty_lock(bvec->bv_page);
 	}
 }
+EXPORT_SYMBOL_GPL(bio_set_pages_dirty);
 
 /*
  * bio_check_pages_dirty() will check that all the BIO's pages are still dirty.
@@ -1540,6 +1541,7 @@ void bio_check_pages_dirty(struct bio *bio)
 	spin_unlock_irqrestore(&bio_dirty_lock, flags);
 	schedule_work(&bio_dirty_work);
 }
+EXPORT_SYMBOL_GPL(bio_check_pages_dirty);
 
 static inline bool bio_remaining_done(struct bio *bio)
 {
diff --git a/block/blk-core.c b/block/blk-core.c
index 1da77e7d62..b7b0237c36 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -205,6 +205,7 @@ const char *blk_status_to_str(blk_status_t status)
 		return "<null>";
 	return blk_errors[idx].name;
 }
+EXPORT_SYMBOL_GPL(blk_status_to_str);
 
 /**
  * blk_sync_queue - cancel any pending callbacks on a queue
diff --git a/block/blk.h b/block/blk.h
index 45547bcf11..f20f9ca03e 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -251,7 +251,6 @@ static inline void bio_integrity_free(struct bio *bio)
 
 unsigned long blk_rq_timeout(unsigned long timeout);
 void blk_add_timer(struct request *req);
-const char *blk_status_to_str(blk_status_t status);
 
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 		unsigned int nr_segs);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c0ffe203a6..7a32dc98e1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -854,6 +854,7 @@ extern const char *blk_op_str(enum req_op op);
 
 int blk_status_to_errno(blk_status_t status);
 blk_status_t errno_to_blk_status(int errno);
+const char *blk_status_to_str(blk_status_t status);
 
 /* only poll the hardware once, don't continue until a completion was found */
 #define BLK_POLL_ONESHOT		(1 << 0)
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset
       [not found] <20230712211115.2174650-1-kent.overstreet@linux.dev>
  2023-07-12 21:10 ` [PATCH 04/20] block: Add some exports for bcachefs Kent Overstreet
@ 2023-07-12 21:11 ` Kent Overstreet
  2023-07-24 17:34   ` Christoph Hellwig
  2023-07-12 21:11 ` [PATCH 06/20] block: Bring back zero_fill_bio_iter Kent Overstreet
  2 siblings, 1 reply; 18+ messages in thread
From: Kent Overstreet @ 2023-07-12 21:11 UTC (permalink / raw)
  To: linux-bcachefs, linux-fsdevel, linux-kernel
  Cc: Kent Overstreet, Jens Axboe, linux-block

bio_iov_iter_get_pages() trims the IO based on the block size of the
block device the IO will be issued to.

However, bcachefs is a multi device filesystem; when we're creating the
bio we don't yet know which block device the bio will be submitted to -
we have to handle the alignment checks elsewhere.

Thus this is needed to avoid a null ptr deref.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
---
 block/bio.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 1e75840d17..e74a04ea14 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1245,7 +1245,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	struct page **pages = (struct page **)bv;
 	ssize_t size, left;
 	unsigned len, i = 0;
-	size_t offset, trim;
+	size_t offset;
 	int ret = 0;
 
 	/*
@@ -1274,10 +1274,12 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 
 	nr_pages = DIV_ROUND_UP(offset + size, PAGE_SIZE);
 
-	trim = size & (bdev_logical_block_size(bio->bi_bdev) - 1);
-	iov_iter_revert(iter, trim);
+	if (bio->bi_bdev) {
+		size_t trim = size & (bdev_logical_block_size(bio->bi_bdev) - 1);
+		iov_iter_revert(iter, trim);
+		size -= trim;
+	}
 
-	size -= trim;
 	if (unlikely(!size)) {
 		ret = -EFAULT;
 		goto out;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 06/20] block: Bring back zero_fill_bio_iter
       [not found] <20230712211115.2174650-1-kent.overstreet@linux.dev>
  2023-07-12 21:10 ` [PATCH 04/20] block: Add some exports for bcachefs Kent Overstreet
  2023-07-12 21:11 ` [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset Kent Overstreet
@ 2023-07-12 21:11 ` Kent Overstreet
  2023-07-24 17:35   ` Christoph Hellwig
  2 siblings, 1 reply; 18+ messages in thread
From: Kent Overstreet @ 2023-07-12 21:11 UTC (permalink / raw)
  To: linux-bcachefs, linux-fsdevel, linux-kernel
  Cc: Kent Overstreet, Kent Overstreet, Jens Axboe, linux-block

From: Kent Overstreet <kent.overstreet@gmail.com>

This reverts 6f822e1b5d9dda3d20e87365de138046e3baa03a - this helper is
used by bcachefs.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
---
 block/bio.c         | 6 +++---
 include/linux/bio.h | 7 ++++++-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index e74a04ea14..70b5c987bc 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -606,15 +606,15 @@ struct bio *bio_kmalloc(unsigned short nr_vecs, gfp_t gfp_mask)
 }
 EXPORT_SYMBOL(bio_kmalloc);
 
-void zero_fill_bio(struct bio *bio)
+void zero_fill_bio_iter(struct bio *bio, struct bvec_iter start)
 {
 	struct bio_vec bv;
 	struct bvec_iter iter;
 
-	bio_for_each_segment(bv, bio, iter)
+	__bio_for_each_segment(bv, bio, iter, start)
 		memzero_bvec(&bv);
 }
-EXPORT_SYMBOL(zero_fill_bio);
+EXPORT_SYMBOL(zero_fill_bio_iter);
 
 /**
  * bio_truncate - truncate the bio to small size of @new_size
diff --git a/include/linux/bio.h b/include/linux/bio.h
index b3e7529ff5..f2620f8d18 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -484,7 +484,12 @@ extern void bio_copy_data_iter(struct bio *dst, struct bvec_iter *dst_iter,
 extern void bio_copy_data(struct bio *dst, struct bio *src);
 extern void bio_free_pages(struct bio *bio);
 void guard_bio_eod(struct bio *bio);
-void zero_fill_bio(struct bio *bio);
+void zero_fill_bio_iter(struct bio *bio, struct bvec_iter iter);
+
+static inline void zero_fill_bio(struct bio *bio)
+{
+	zero_fill_bio_iter(bio, bio->bi_iter);
+}
 
 static inline void bio_release_pages(struct bio *bio, bool mark_dirty)
 {
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 04/20] block: Add some exports for bcachefs
  2023-07-12 21:10 ` [PATCH 04/20] block: Add some exports for bcachefs Kent Overstreet
@ 2023-07-24 17:31   ` Christoph Hellwig
  2023-07-25  3:00     ` Kent Overstreet
  2023-07-25  2:59   ` Matthew Wilcox
  1 sibling, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2023-07-24 17:31 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Kent Overstreet,
	linux-block, Jens Axboe

On Wed, Jul 12, 2023 at 05:10:59PM -0400, Kent Overstreet wrote:
> From: Kent Overstreet <kent.overstreet@gmail.com>
> 
>  - bio_set_pages_dirty(), bio_check_pages_dirty() - dio path

Why?  We've so far have been able to get away without file systems
reinventing their own DIO path.  I'd really like to keep it that way,
so if you think the iomap dio code should be improved please explain
why.  And please also cycle the fsdevel list in.

>  - blk_status_to_str() - error messages
>  - bio_add_folio() - this should definitely be exported for everyone,
>    it's the modern version of bio_add_page()

These look ok to me to go in when the actual user shows up.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset
  2023-07-12 21:11 ` [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset Kent Overstreet
@ 2023-07-24 17:34   ` Christoph Hellwig
  2023-07-25  2:43     ` Kent Overstreet
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2023-07-24 17:34 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Jens Axboe,
	linux-block

On Wed, Jul 12, 2023 at 05:11:00PM -0400, Kent Overstreet wrote:
> bio_iov_iter_get_pages() trims the IO based on the block size of the
> block device the IO will be issued to.
> 
> However, bcachefs is a multi device filesystem; when we're creating the
> bio we don't yet know which block device the bio will be submitted to -
> we have to handle the alignment checks elsewhere.

So, we've been trying really hard to always make sure to pass a bdev
to anything that allocates a bio, mostly due due the fact that we
actually derive information like the blk-cgroup associations from it.

The whole blk-cgroup stuff is actually a problem for non-trivial
multi-device setups.  XFS gets away fine because each file just
sits on either the main or RT device and no user I/O goes to the
log device, and btrfs papers over it in a weird way by always
associating with the last added device, which is in many ways gross
and wrong, but at least satisfies the assumptions made in blk-cgroup.

How do you plan to deal with this?  Because I really don't want folks
just to go ahead and ignore the issues, we need to actually sort this
out.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 06/20] block: Bring back zero_fill_bio_iter
  2023-07-12 21:11 ` [PATCH 06/20] block: Bring back zero_fill_bio_iter Kent Overstreet
@ 2023-07-24 17:35   ` Christoph Hellwig
  2023-07-25  2:45     ` Kent Overstreet
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2023-07-24 17:35 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Kent Overstreet,
	Jens Axboe, linux-block

On Wed, Jul 12, 2023 at 05:11:01PM -0400, Kent Overstreet wrote:
> From: Kent Overstreet <kent.overstreet@gmail.com>
> 
> This reverts 6f822e1b5d9dda3d20e87365de138046e3baa03a - this helper is
> used by bcachefs.
> 
> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: linux-block@vger.kernel.org
> ---
>  block/bio.c         | 6 +++---
>  include/linux/bio.h | 7 ++++++-
>  2 files changed, 9 insertions(+), 4 deletions(-)

I really don't see any point in offering this in the block layer.  By
the lack of any other caller it obviously isn't such a generic and
always used helper, but more importantly it literally is three lines
of code to implement it.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset
  2023-07-24 17:34   ` Christoph Hellwig
@ 2023-07-25  2:43     ` Kent Overstreet
  2023-07-26 13:23       ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Kent Overstreet @ 2023-07-25  2:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Jens Axboe,
	linux-block

On Mon, Jul 24, 2023 at 10:34:20AM -0700, Christoph Hellwig wrote:
> On Wed, Jul 12, 2023 at 05:11:00PM -0400, Kent Overstreet wrote:
> > bio_iov_iter_get_pages() trims the IO based on the block size of the
> > block device the IO will be issued to.
> > 
> > However, bcachefs is a multi device filesystem; when we're creating the
> > bio we don't yet know which block device the bio will be submitted to -
> > we have to handle the alignment checks elsewhere.
> 
> So, we've been trying really hard to always make sure to pass a bdev
> to anything that allocates a bio, mostly due due the fact that we
> actually derive information like the blk-cgroup associations from it.
> 
> The whole blk-cgroup stuff is actually a problem for non-trivial
> multi-device setups.  XFS gets away fine because each file just
> sits on either the main or RT device and no user I/O goes to the
> log device, and btrfs papers over it in a weird way by always
> associating with the last added device, which is in many ways gross
> and wrong, but at least satisfies the assumptions made in blk-cgroup.
> 
> How do you plan to deal with this?  Because I really don't want folks
> just to go ahead and ignore the issues, we need to actually sort this
> out.

Doing the blk-cgroup association at bio alloc time sounds broken to me,
because of stacking block devices - why was the association not done at
generic_make_request() time?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 06/20] block: Bring back zero_fill_bio_iter
  2023-07-24 17:35   ` Christoph Hellwig
@ 2023-07-25  2:45     ` Kent Overstreet
  2023-07-26 13:21       ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Kent Overstreet @ 2023-07-25  2:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Kent Overstreet,
	Jens Axboe, linux-block

On Mon, Jul 24, 2023 at 10:35:45AM -0700, Christoph Hellwig wrote:
> On Wed, Jul 12, 2023 at 05:11:01PM -0400, Kent Overstreet wrote:
> > From: Kent Overstreet <kent.overstreet@gmail.com>
> > 
> > This reverts 6f822e1b5d9dda3d20e87365de138046e3baa03a - this helper is
> > used by bcachefs.
> > 
> > Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> > Cc: Jens Axboe <axboe@kernel.dk>
> > Cc: linux-block@vger.kernel.org
> > ---
> >  block/bio.c         | 6 +++---
> >  include/linux/bio.h | 7 ++++++-
> >  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> I really don't see any point in offering this in the block layer.  By
> the lack of any other caller it obviously isn't such a generic and
> always used helper, but more importantly it literally is three lines
> of code to implement it.

And yet, we've had a subtle bug introduced in that code that took quite
awhile to be fixed - I'm not pro code duplication in general and I don't
think this is a good place to start.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 04/20] block: Add some exports for bcachefs
  2023-07-12 21:10 ` [PATCH 04/20] block: Add some exports for bcachefs Kent Overstreet
  2023-07-24 17:31   ` Christoph Hellwig
@ 2023-07-25  2:59   ` Matthew Wilcox
  1 sibling, 0 replies; 18+ messages in thread
From: Matthew Wilcox @ 2023-07-25  2:59 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Kent Overstreet,
	linux-block, Jens Axboe

On Wed, Jul 12, 2023 at 05:10:59PM -0400, Kent Overstreet wrote:
>  - bio_add_folio() - this should definitely be exported for everyone,
>    it's the modern version of bio_add_page()

Looks like this one got added in cd57b77197a4, so it just needs to be
dropped from the changelog.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 04/20] block: Add some exports for bcachefs
  2023-07-24 17:31   ` Christoph Hellwig
@ 2023-07-25  3:00     ` Kent Overstreet
  2023-07-26 13:20       ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Kent Overstreet @ 2023-07-25  3:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Kent Overstreet,
	linux-block, Jens Axboe

On Mon, Jul 24, 2023 at 10:31:04AM -0700, Christoph Hellwig wrote:
> On Wed, Jul 12, 2023 at 05:10:59PM -0400, Kent Overstreet wrote:
> > From: Kent Overstreet <kent.overstreet@gmail.com>
> > 
> >  - bio_set_pages_dirty(), bio_check_pages_dirty() - dio path
> 
> Why?  We've so far have been able to get away without file systems
> reinventing their own DIO path.  I'd really like to keep it that way,
> so if you think the iomap dio code should be improved please explain
> why.  And please also cycle the fsdevel list in.

It's been discussed at length why bcachefs doesn't use iomap.

In short, iomap is heavily callback based, the bcachefs io paths are
not - we pass around data structures instead. I discussed this with
people when iomap was first being written, but iomap ended up being a
much more conservative approach, more in line with the old buffer heads
code where the generic code calls into the filesystem to obtain
mappings.

I'm gradually convincing people of the merits of the bcachefs approach -
in particular reducing indirect function calls is getting more attention
these days.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 04/20] block: Add some exports for bcachefs
  2023-07-25  3:00     ` Kent Overstreet
@ 2023-07-26 13:20       ` Christoph Hellwig
  2023-08-01 18:59         ` Kent Overstreet
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2023-07-26 13:20 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Christoph Hellwig, linux-bcachefs, linux-fsdevel, linux-kernel,
	Kent Overstreet, linux-block, Jens Axboe

On Mon, Jul 24, 2023 at 11:00:37PM -0400, Kent Overstreet wrote:
> In short, iomap is heavily callback based, the bcachefs io paths are
> not - we pass around data structures instead. I discussed this with
> people when iomap was first being written, but iomap ended up being a
> much more conservative approach, more in line with the old buffer heads
> code where the generic code calls into the filesystem to obtain
> mappings.
> 
> I'm gradually convincing people of the merits of the bcachefs approach -
> in particular reducing indirect function calls is getting more attention
> these days.

FYI, Matthew has had patches that convert iomap to be an iterator,
and I've massage the first half of them and actuall got them in
before.  I'd much rather finish off that work (even if only for
direct I/O initially) than adding another direct I/O code.  But
even with out that we should be able to easily pass more private
data, in fact btrfs makes pretty heavy use of that.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 06/20] block: Bring back zero_fill_bio_iter
  2023-07-25  2:45     ` Kent Overstreet
@ 2023-07-26 13:21       ` Christoph Hellwig
  2023-08-01 19:05         ` Kent Overstreet
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2023-07-26 13:21 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Christoph Hellwig, linux-bcachefs, linux-fsdevel, linux-kernel,
	Kent Overstreet, Jens Axboe, linux-block

On Mon, Jul 24, 2023 at 10:45:53PM -0400, Kent Overstreet wrote:
> And yet, we've had a subtle bug introduced in that code that took quite
> awhile to be fixed - I'm not pro code duplication in general and I don't
> think this is a good place to start.

I'm not sure arguing for adding a helper your can triviall implement
yourself really helps to streamline your upstreaming process.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset
  2023-07-25  2:43     ` Kent Overstreet
@ 2023-07-26 13:23       ` Christoph Hellwig
  2023-08-01 19:04         ` Kent Overstreet
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2023-07-26 13:23 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Christoph Hellwig, linux-bcachefs, linux-fsdevel, linux-kernel,
	Jens Axboe, linux-block

On Mon, Jul 24, 2023 at 10:43:12PM -0400, Kent Overstreet wrote:
> Doing the blk-cgroup association at bio alloc time sounds broken to me,
> because of stacking block devices - why was the association not done at
> generic_make_request() time?

Because blk-cgroup not only works at the lowest level in the stack,
but also for stackable block devices.  It's not a design decision I
particularly agree with, but it's been there forever.

So we need to assign it when creating the bio (we used to do it at
submission time, but the way it was done was horribly ineffcient,
that's why I moved it to allocation time), and then when hitting a
stacked device it get reassinged (which still is horribly inefficient).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 04/20] block: Add some exports for bcachefs
  2023-07-26 13:20       ` Christoph Hellwig
@ 2023-08-01 18:59         ` Kent Overstreet
  0 siblings, 0 replies; 18+ messages in thread
From: Kent Overstreet @ 2023-08-01 18:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Kent Overstreet,
	linux-block, Jens Axboe

On Wed, Jul 26, 2023 at 06:20:42AM -0700, Christoph Hellwig wrote:
> On Mon, Jul 24, 2023 at 11:00:37PM -0400, Kent Overstreet wrote:
> > In short, iomap is heavily callback based, the bcachefs io paths are
> > not - we pass around data structures instead. I discussed this with
> > people when iomap was first being written, but iomap ended up being a
> > much more conservative approach, more in line with the old buffer heads
> > code where the generic code calls into the filesystem to obtain
> > mappings.
> > 
> > I'm gradually convincing people of the merits of the bcachefs approach -
> > in particular reducing indirect function calls is getting more attention
> > these days.
> 
> FYI, Matthew has had patches that convert iomap to be an iterator,
> and I've massage the first half of them and actuall got them in
> before.  I'd much rather finish off that work (even if only for
> direct I/O initially) than adding another direct I/O code.  But
> even with out that we should be able to easily pass more private
> data, in fact btrfs makes pretty heavy use of that.

That's wonderful, but getting iomap up to the level of what bcachefs
needs is still going to be a pretty big project and it's not going to be
my highest priority.

bcachefs also hangs more state off of the pagecache, in bcachefs's
equivvalent of iomap_page - we store reservations for dirty data there
and a few other things, which means the buffered IO paths don't have to
walk any other data structures.

I think that's another idea you guys will want to steal, but a higher
priority for me is getting a proper FUSE port done - and making bcachefs
more tightly weddded to VFS library code is not likely to make that
process any easier.

Once a proper fuse port is done and we know what that looks like will be
a better time for some consolidation.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset
  2023-07-26 13:23       ` Christoph Hellwig
@ 2023-08-01 19:04         ` Kent Overstreet
  2023-08-02 11:47           ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Kent Overstreet @ 2023-08-01 19:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Jens Axboe,
	linux-block

On Wed, Jul 26, 2023 at 06:23:05AM -0700, Christoph Hellwig wrote:
> On Mon, Jul 24, 2023 at 10:43:12PM -0400, Kent Overstreet wrote:
> > Doing the blk-cgroup association at bio alloc time sounds broken to me,
> > because of stacking block devices - why was the association not done at
> > generic_make_request() time?
> 
> Because blk-cgroup not only works at the lowest level in the stack,
> but also for stackable block devices.  It's not a design decision I
> particularly agree with, but it's been there forever.

You're setting the association only to the highest block device in the
stack - how on earth is it supposed to work with anything lower?

And looking at bio_associate_blkg(), this code looks completely broken.
It's checking bio->bi_blkg, but that's just been set to NULL in
bio_init().

And this is your code, so I think you need to go over this again.

Anyways, bio_associate_blkg() is also called by bio_set_dev(), which
means passing the block device to bio_init() was a completely pointless
change. bcachefs uses bio_set_dev(), so everything is fine.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 06/20] block: Bring back zero_fill_bio_iter
  2023-07-26 13:21       ` Christoph Hellwig
@ 2023-08-01 19:05         ` Kent Overstreet
  0 siblings, 0 replies; 18+ messages in thread
From: Kent Overstreet @ 2023-08-01 19:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Kent Overstreet,
	Jens Axboe, linux-block

On Wed, Jul 26, 2023 at 06:21:23AM -0700, Christoph Hellwig wrote:
> On Mon, Jul 24, 2023 at 10:45:53PM -0400, Kent Overstreet wrote:
> > And yet, we've had a subtle bug introduced in that code that took quite
> > awhile to be fixed - I'm not pro code duplication in general and I don't
> > think this is a good place to start.
> 
> I'm not sure arguing for adding a helper your can triviall implement
> yourself really helps to streamline your upstreaming process.

I gave you my engineering reasons, you're the one who's arguing.

And to make everything perfectly clear: this is code that I originally
wrote, and then you started changing without CCing me - your patch that
deleted zero_fill_bio_iter() never should've gone in.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset
  2023-08-01 19:04         ` Kent Overstreet
@ 2023-08-02 11:47           ` Christoph Hellwig
  2023-08-02 16:44             ` Kent Overstreet
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2023-08-02 11:47 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Christoph Hellwig, linux-bcachefs, linux-fsdevel, linux-kernel,
	Jens Axboe, linux-block

On Tue, Aug 01, 2023 at 03:04:50PM -0400, Kent Overstreet wrote:
> > Because blk-cgroup not only works at the lowest level in the stack,
> > but also for stackable block devices.  It's not a design decision I
> > particularly agree with, but it's been there forever.
> 
> You're setting the association only to the highest block device in the
> stack - how on earth is it supposed to work with anything lower?

Hey, ask the cgroup folks as they come up with it.  I'm not going to
defend the logic here.

> And looking at bio_associate_blkg(), this code looks completely broken.
> It's checking bio->bi_blkg, but that's just been set to NULL in
> bio_init().

It's checking bi_blkg because it can also be called from bio_set_dev.

> And this is your code, so I think you need to go over this again.

It's "my code" in the sene of that I did one big round of unwinding
the even bigger mess that was there.  There is another few rounds needed
for the code to vaguely make sense.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset
  2023-08-02 11:47           ` Christoph Hellwig
@ 2023-08-02 16:44             ` Kent Overstreet
  0 siblings, 0 replies; 18+ messages in thread
From: Kent Overstreet @ 2023-08-02 16:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-bcachefs, linux-fsdevel, linux-kernel, Jens Axboe,
	linux-block

On Wed, Aug 02, 2023 at 04:47:18AM -0700, Christoph Hellwig wrote:
> On Tue, Aug 01, 2023 at 03:04:50PM -0400, Kent Overstreet wrote:
> > > Because blk-cgroup not only works at the lowest level in the stack,
> > > but also for stackable block devices.  It's not a design decision I
> > > particularly agree with, but it's been there forever.
> > 
> > You're setting the association only to the highest block device in the
> > stack - how on earth is it supposed to work with anything lower?
> 
> Hey, ask the cgroup folks as they come up with it.  I'm not going to
> defend the logic here.
> 
> > And looking at bio_associate_blkg(), this code looks completely broken.
> > It's checking bio->bi_blkg, but that's just been set to NULL in
> > bio_init().
> 
> It's checking bi_blkg because it can also be called from bio_set_dev.

So bio_set_dev() has subtly different behaviour than passing the block
device to bio_init()?

That's just broken.

> 
> > And this is your code, so I think you need to go over this again.
> 
> It's "my code" in the sene of that I did one big round of unwinding
> the even bigger mess that was there.  There is another few rounds needed
> for the code to vaguely make sense.

Well, I'll watch for those patches then...

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-08-02 16:44 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20230712211115.2174650-1-kent.overstreet@linux.dev>
2023-07-12 21:10 ` [PATCH 04/20] block: Add some exports for bcachefs Kent Overstreet
2023-07-24 17:31   ` Christoph Hellwig
2023-07-25  3:00     ` Kent Overstreet
2023-07-26 13:20       ` Christoph Hellwig
2023-08-01 18:59         ` Kent Overstreet
2023-07-25  2:59   ` Matthew Wilcox
2023-07-12 21:11 ` [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset Kent Overstreet
2023-07-24 17:34   ` Christoph Hellwig
2023-07-25  2:43     ` Kent Overstreet
2023-07-26 13:23       ` Christoph Hellwig
2023-08-01 19:04         ` Kent Overstreet
2023-08-02 11:47           ` Christoph Hellwig
2023-08-02 16:44             ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 06/20] block: Bring back zero_fill_bio_iter Kent Overstreet
2023-07-24 17:35   ` Christoph Hellwig
2023-07-25  2:45     ` Kent Overstreet
2023-07-26 13:21       ` Christoph Hellwig
2023-08-01 19:05         ` Kent Overstreet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).