* [PATCH 1/3] bcachefs: allow writeback to fill bio completely
2023-11-03 13:09 [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Brian Foster
@ 2023-11-03 13:09 ` Brian Foster
2023-11-03 13:09 ` [PATCH 2/3] bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field Brian Foster
From: Brian Foster @ 2023-11-03 13:09 UTC (permalink / raw)
To: linux-bcachefs
The bcachefs folio writeback code includes a bio full check as well
as a fixed size check to determine when to split off and submit
writeback I/O. The inclusive check of the latter against the limit
means that writeback can submit slightly prematurely. This is not a
functional problem, but it results in unnecessarily split I/Os and
extra extent merge overhead.
This can be observed with a buffered write sized exactly to the
current maximum value (1MB) and with key_merging_disabled=1. The
latter prevents the second extent from being merged, such that a
subsequent check of the extent list shows a 1020k extent followed by
a contiguous 4k extent.
The purpose of the fixed size check is also undocumented and
somewhat obscure. Lift this check into a new helper that wraps the
bio check, fix the comparison logic, and add a comment documenting
the purpose and how we might improve on this in the future.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/bcachefs/fs-io-buffered.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/fs/bcachefs/fs-io-buffered.c b/fs/bcachefs/fs-io-buffered.c
index 58ccc7b91ac7..52f0e7acda3d 100644
--- a/fs/bcachefs/fs-io-buffered.c
+++ b/fs/bcachefs/fs-io-buffered.c
@@ -389,6 +389,21 @@ static inline struct bch_writepage_state bch_writepage_state_init(struct bch_fs
return ret;
}
+/*
+ * Determine when a writepage io is full. We have to limit writepage bios to a
+ * single page per bvec (i.e. 1MB with 4k pages) because that is the limit to
+ * what the bounce path in bch2_write_extent() can handle. In theory we could
+ * loosen this restriction for non-bounce I/O, but we don't have that context
+ * here. Ideally, we can up this limit and make it configurable in the future
+ * when the bounce path can be enhanced to accommodate larger source bios.
+ */
+static inline bool bch_io_full(struct bch_writepage_io *io, unsigned len)
+{
+ struct bio *bio = &io->op.wbio.bio;
+ return bio_full(bio, len) ||
+ (bio->bi_iter.bi_size + len > BIO_MAX_VECS * PAGE_SIZE);
+}
+
static void bch2_writepage_io_done(struct bch_write_op *op)
{
struct bch_writepage_io *io =
@@ -606,9 +621,7 @@ static int __bch2_writepage(struct folio *folio,
if (w->io &&
(w->io->op.res.nr_replicas != nr_replicas_this_write ||
- bio_full(&w->io->op.wbio.bio, sectors << 9) ||
- w->io->op.wbio.bio.bi_iter.bi_size + (sectors << 9) >=
- (BIO_MAX_VECS * PAGE_SIZE) ||
+ bch_io_full(w->io, sectors << 9) ||
bio_end_sector(&w->io->op.wbio.bio) != sector))
bch2_writepage_do_io(w);
--
2.41.0
* [PATCH 2/3] bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field
2023-11-03 13:09 [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Brian Foster
2023-11-03 13:09 ` [PATCH 1/3] bcachefs: allow writeback to fill bio completely Brian Foster
@ 2023-11-03 13:09 ` Brian Foster
2023-11-03 13:09 ` [PATCH 3/3] bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield Brian Foster
2023-11-03 15:18 ` [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Kent Overstreet
From: Brian Foster @ 2023-11-03 13:09 UTC (permalink / raw)
To: linux-bcachefs
A simple test to populate a filesystem on one CPU architecture and
fsck on an arch of the opposite byte order produces errors related
to the fragmentation LRU. This occurs because the 64-bit
fragmentation_lru field is not byte-order swapped when reads detect
that the on-disk/bset key values were written in the opposite
byte order of the current CPU.
Update the bch2_alloc_v4 swab callback to handle fragmentation_lru
as is done for other multi-byte fields. This doesn't affect existing
filesystems when accessed by CPUs of the same endianness because the
->swab() callback is only called when the bset flags indicate an
endianness mismatch between the CPU and on-disk data.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/bcachefs/alloc_background.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/bcachefs/alloc_background.c b/fs/bcachefs/alloc_background.c
index bcfae91667af..ad256a88cb5c 100644
--- a/fs/bcachefs/alloc_background.c
+++ b/fs/bcachefs/alloc_background.c
@@ -319,6 +319,7 @@ void bch2_alloc_v4_swab(struct bkey_s k)
a->io_time[1] = swab64(a->io_time[1]);
a->stripe = swab32(a->stripe);
a->nr_external_backpointers = swab32(a->nr_external_backpointers);
+ a->fragmentation_lru = swab64(a->fragmentation_lru);
bps = alloc_v4_backpointers(a);
for (bp = bps; bp < bps + BCH_ALLOC_V4_NR_BACKPOINTERS(a); bp++) {
--
2.41.0
* [PATCH 3/3] bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield
2023-11-03 13:09 [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Brian Foster
2023-11-03 13:09 ` [PATCH 1/3] bcachefs: allow writeback to fill bio completely Brian Foster
2023-11-03 13:09 ` [PATCH 2/3] bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field Brian Foster
@ 2023-11-03 13:09 ` Brian Foster
2023-11-03 15:18 ` [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Kent Overstreet
From: Brian Foster @ 2023-11-03 13:09 UTC (permalink / raw)
To: linux-bcachefs
The bucket_offset field of bch_backpointer is a 40-bit bitfield, but the
bch2_backpointer_swab() helper uses swab32. This leads to inconsistency
when an on-disk fs is accessed from an opposite-endian machine.
As it turns out, we already have an internal swab40() helper that is
used from the bch_alloc_v4 swab callback. Lift it into the backpointers
header file and use it consistently in both places.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/bcachefs/alloc_background.c | 9 ---------
fs/bcachefs/backpointers.c | 2 +-
fs/bcachefs/backpointers.h | 9 +++++++++
3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/fs/bcachefs/alloc_background.c b/fs/bcachefs/alloc_background.c
index ad256a88cb5c..1fec0e67891f 100644
--- a/fs/bcachefs/alloc_background.c
+++ b/fs/bcachefs/alloc_background.c
@@ -297,15 +297,6 @@ int bch2_alloc_v4_invalid(struct bch_fs *c, struct bkey_s_c k,
return ret;
}
-static inline u64 swab40(u64 x)
-{
- return (((x & 0x00000000ffULL) << 32)|
- ((x & 0x000000ff00ULL) << 16)|
- ((x & 0x0000ff0000ULL) >> 0)|
- ((x & 0x00ff000000ULL) >> 16)|
- ((x & 0xff00000000ULL) >> 32));
-}
-
void bch2_alloc_v4_swab(struct bkey_s k)
{
struct bch_alloc_v4 *a = bkey_s_to_alloc_v4(k).v;
diff --git a/fs/bcachefs/backpointers.c b/fs/bcachefs/backpointers.c
index 3b79bde1ce2f..5ed96dddae08 100644
--- a/fs/bcachefs/backpointers.c
+++ b/fs/bcachefs/backpointers.c
@@ -77,7 +77,7 @@ void bch2_backpointer_swab(struct bkey_s k)
{
struct bkey_s_backpointer bp = bkey_s_to_backpointer(k);
- bp.v->bucket_offset = swab32(bp.v->bucket_offset);
+ bp.v->bucket_offset = swab40(bp.v->bucket_offset);
bp.v->bucket_len = swab32(bp.v->bucket_len);
bch2_bpos_swab(&bp.v->pos);
}
diff --git a/fs/bcachefs/backpointers.h b/fs/bcachefs/backpointers.h
index 4ab9f3562912..ab866feeaf66 100644
--- a/fs/bcachefs/backpointers.h
+++ b/fs/bcachefs/backpointers.h
@@ -7,6 +7,15 @@
#include "buckets.h"
#include "super.h"
+static inline u64 swab40(u64 x)
+{
+ return (((x & 0x00000000ffULL) << 32)|
+ ((x & 0x000000ff00ULL) << 16)|
+ ((x & 0x0000ff0000ULL) >> 0)|
+ ((x & 0x00ff000000ULL) >> 16)|
+ ((x & 0xff00000000ULL) >> 32));
+}
+
int bch2_backpointer_invalid(struct bch_fs *, struct bkey_s_c k,
enum bkey_invalid_flags, struct printbuf *);
void bch2_backpointer_to_text(struct printbuf *, const struct bch_backpointer *);
--
2.41.0
* Re: [PATCH 0/3] bcachefs: writeback and byte-order misc fixes
2023-11-03 13:09 [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Brian Foster
2023-11-03 13:09 ` [PATCH 3/3] bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield Brian Foster
@ 2023-11-03 15:18 ` Kent Overstreet
From: Kent Overstreet @ 2023-11-03 15:18 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-bcachefs
On Fri, Nov 03, 2023 at 09:09:35AM -0400, Brian Foster wrote:
> Hi all,
>
> This series is a repost of a handful of patches that happened to have
> fallen through the cracks. Patch 1 relates back to the discussion[1] on
> increasing the writeback bio size limit. We can't safely do that until
> the bounce path can be enhanced to handle larger sizes, but we can at
> least fix the writeback code to completely fill the bio. Patches 2-3 are
> a couple more byte order fixes that were uncovered by swapping an
> on-disk filesystem between big and little endian machines. As of these
> two, I've not encountered any further byte-order issues via this sort of
> test.
>
> All three of these patches have been soaking in my test branch [2] for
> quite some time. Thoughts, reviews, flames appreciated.
>
> Brian
>
> [1] https://lore.kernel.org/linux-bcachefs/20230927112338.262207-3-bfoster@redhat.com/
> [2] https://evilpiepirate.org/~testdashboard/ci?branch=bfoster
>
> Brian Foster (3):
> bcachefs: allow writeback to fill bio completely
> bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field
> bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield
These will all go out in the next pull request.