Linux bcachefs list
 help / color / mirror / Atom feed
* [PATCH] bcachefs: Increase blacklist range
@ 2025-03-15  7:39 Alan Huang
  2025-03-15 21:49 ` Kent Overstreet
  2025-03-17 20:58 ` John Stoffel
  0 siblings, 2 replies; 8+ messages in thread
From: Alan Huang @ 2025-03-15  7:39 UTC (permalink / raw)
  To: kent.overstreet; +Cc: linux-bcachefs, Alan Huang

Now there are 16 journal buffers, 8 is too small to be enough.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
---
 fs/bcachefs/recovery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
index 71c786cdb192..a6e26733854d 100644
--- a/fs/bcachefs/recovery.c
+++ b/fs/bcachefs/recovery.c
@@ -899,7 +899,7 @@ int bch2_fs_recovery(struct bch_fs *c)
 	 * journal sequence numbers:
 	 */
 	if (!c->sb.clean)
-		journal_seq += 8;
+		journal_seq += JOURNAL_BUF_NR * 4;
 
 	if (blacklist_seq != journal_seq) {
 		ret =   bch2_journal_log_msg(c, "blacklisting entries %llu-%llu",
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcachefs: Increase blacklist range
  2025-03-15  7:39 [PATCH] bcachefs: Increase blacklist range Alan Huang
@ 2025-03-15 21:49 ` Kent Overstreet
  2025-03-17 20:58 ` John Stoffel
  1 sibling, 0 replies; 8+ messages in thread
From: Kent Overstreet @ 2025-03-15 21:49 UTC (permalink / raw)
  To: Alan Huang; +Cc: linux-bcachefs

On Sat, Mar 15, 2025 at 03:39:42PM +0800, Alan Huang wrote:
> Now there are 16 journal buffers, 8 is too small to be enough.
> 
> Signed-off-by: Alan Huang <mmpgouride@gmail.com>

Applied! Thanks

> ---
>  fs/bcachefs/recovery.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
> index 71c786cdb192..a6e26733854d 100644
> --- a/fs/bcachefs/recovery.c
> +++ b/fs/bcachefs/recovery.c
> @@ -899,7 +899,7 @@ int bch2_fs_recovery(struct bch_fs *c)
>  	 * journal sequence numbers:
>  	 */
>  	if (!c->sb.clean)
> -		journal_seq += 8;
> +		journal_seq += JOURNAL_BUF_NR * 4;
>  
>  	if (blacklist_seq != journal_seq) {
>  		ret =   bch2_journal_log_msg(c, "blacklisting entries %llu-%llu",
> -- 
> 2.48.1
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcachefs: Increase blacklist range
  2025-03-15  7:39 [PATCH] bcachefs: Increase blacklist range Alan Huang
  2025-03-15 21:49 ` Kent Overstreet
@ 2025-03-17 20:58 ` John Stoffel
  2025-03-18  0:33   ` Kent Overstreet
  1 sibling, 1 reply; 8+ messages in thread
From: John Stoffel @ 2025-03-17 20:58 UTC (permalink / raw)
  To: Alan Huang; +Cc: kent.overstreet, linux-bcachefs

>>>>> "Alan" == Alan Huang <mmpgouride@gmail.com> writes:

> Now there are 16 journal buffers, 8 is too small to be enough.
> Signed-off-by: Alan Huang <mmpgouride@gmail.com>
> ---
>  fs/bcachefs/recovery.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

> diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
> index 71c786cdb192..a6e26733854d 100644
> --- a/fs/bcachefs/recovery.c
> +++ b/fs/bcachefs/recovery.c
> @@ -899,7 +899,7 @@ int bch2_fs_recovery(struct bch_fs *c)
>  	 * journal sequence numbers:
>  	 */
>  	if (!c->sb.clean)
> -		journal_seq += 8;
> +		journal_seq += JOURNAL_BUF_NR * 4;
 
Instead of magic numbers, could you put in a define with an
explanation of how you arrived at this number?  Just to document the
assumptions better?

John

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcachefs: Increase blacklist range
  2025-03-17 20:58 ` John Stoffel
@ 2025-03-18  0:33   ` Kent Overstreet
  2025-03-18 14:26     ` John Stoffel
  0 siblings, 1 reply; 8+ messages in thread
From: Kent Overstreet @ 2025-03-18  0:33 UTC (permalink / raw)
  To: John Stoffel; +Cc: Alan Huang, linux-bcachefs

On Mon, Mar 17, 2025 at 04:58:26PM -0400, John Stoffel wrote:
> >>>>> "Alan" == Alan Huang <mmpgouride@gmail.com> writes:
> 
> > Now there are 16 journal buffers, 8 is too small to be enough.
> > Signed-off-by: Alan Huang <mmpgouride@gmail.com>
> > ---
> >  fs/bcachefs/recovery.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> > diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
> > index 71c786cdb192..a6e26733854d 100644
> > --- a/fs/bcachefs/recovery.c
> > +++ b/fs/bcachefs/recovery.c
> > @@ -899,7 +899,7 @@ int bch2_fs_recovery(struct bch_fs *c)
> >  	 * journal sequence numbers:
> >  	 */
> >  	if (!c->sb.clean)
> > -		journal_seq += 8;
> > +		journal_seq += JOURNAL_BUF_NR * 4;
>  
> Instead of magic numbers, could you put in a define with an
> explanation of how you arrived at this number?  Just to document the
> assumptions better?
> 
> John

The * 4 is a fudge factor.

But actually, I was giving this more thought and I don't think we have
the correct number.

The real bound is "number of unflushed journal entries that might have
been allocated, and have other items (btree nodes) referring to that
sequence number, but which don't hit because beacuse they weren't
flushed".

And we don't have an actual bound on that.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcachefs: Increase blacklist range
  2025-03-18  0:33   ` Kent Overstreet
@ 2025-03-18 14:26     ` John Stoffel
  2025-03-18 14:57       ` Kent Overstreet
  0 siblings, 1 reply; 8+ messages in thread
From: John Stoffel @ 2025-03-18 14:26 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: John Stoffel, Alan Huang, linux-bcachefs

>>>>> "Kent" == Kent Overstreet <kent.overstreet@linux.dev> writes:

> On Mon, Mar 17, 2025 at 04:58:26PM -0400, John Stoffel wrote:
>> >>>>> "Alan" == Alan Huang <mmpgouride@gmail.com> writes:
>> 
>> > Now there are 16 journal buffers, 8 is too small to be enough.
>> > Signed-off-by: Alan Huang <mmpgouride@gmail.com>
>> > ---
>> >  fs/bcachefs/recovery.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> > diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
>> > index 71c786cdb192..a6e26733854d 100644
>> > --- a/fs/bcachefs/recovery.c
>> > +++ b/fs/bcachefs/recovery.c
>> > @@ -899,7 +899,7 @@ int bch2_fs_recovery(struct bch_fs *c)
>> >  	 * journal sequence numbers:
>> >  	 */
>> >  	if (!c->sb.clean)
>> > -		journal_seq += 8;
>> > +		journal_seq += JOURNAL_BUF_NR * 4;
>> 
>> Instead of magic numbers, could you put in a define with an
>> explanation of how you arrived at this number?  Just to document the
>> assumptions better?
>> 
>> John

> The * 4 is a fudge factor.

Ok.

> But actually, I was giving this more thought and I don't think we have
> the correct number.

We have a WAG here.  :-)

> The real bound is "number of unflushed journal entries that might have
> been allocated, and have other items (btree nodes) referring to that
> sequence number, but which don't hit because beacuse they weren't
> flushed".

> And we don't have an actual bound on that.

So what happens if journal_seq overflows?  I don't know the code and
haven't looked.   

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcachefs: Increase blacklist range
  2025-03-18 14:26     ` John Stoffel
@ 2025-03-18 14:57       ` Kent Overstreet
  2025-03-18 15:19         ` Alan Huang
  0 siblings, 1 reply; 8+ messages in thread
From: Kent Overstreet @ 2025-03-18 14:57 UTC (permalink / raw)
  To: John Stoffel; +Cc: Alan Huang, linux-bcachefs

On Tue, Mar 18, 2025 at 10:26:03AM -0400, John Stoffel wrote:
> >>>>> "Kent" == Kent Overstreet <kent.overstreet@linux.dev> writes:
> 
> > On Mon, Mar 17, 2025 at 04:58:26PM -0400, John Stoffel wrote:
> >> >>>>> "Alan" == Alan Huang <mmpgouride@gmail.com> writes:
> >> 
> >> > Now there are 16 journal buffers, 8 is too small to be enough.
> >> > Signed-off-by: Alan Huang <mmpgouride@gmail.com>
> >> > ---
> >> >  fs/bcachefs/recovery.c | 2 +-
> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> > diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
> >> > index 71c786cdb192..a6e26733854d 100644
> >> > --- a/fs/bcachefs/recovery.c
> >> > +++ b/fs/bcachefs/recovery.c
> >> > @@ -899,7 +899,7 @@ int bch2_fs_recovery(struct bch_fs *c)
> >> >  	 * journal sequence numbers:
> >> >  	 */
> >> >  	if (!c->sb.clean)
> >> > -		journal_seq += 8;
> >> > +		journal_seq += JOURNAL_BUF_NR * 4;
> >> 
> >> Instead of magic numbers, could you put in a define with an
> >> explanation of how you arrived at this number?  Just to document the
> >> assumptions better?
> >> 
> >> John
> 
> > The * 4 is a fudge factor.
> 
> Ok.
> 
> > But actually, I was giving this more thought and I don't think we have
> > the correct number.
> 
> We have a WAG here.  :-)
> 
> > The real bound is "number of unflushed journal entries that might have
> > been allocated, and have other items (btree nodes) referring to that
> > sequence number, but which don't hit because beacuse they weren't
> > flushed".
> 
> > And we don't have an actual bound on that.
> 
> So what happens if journal_seq overflows?  I don't know the code and
> haven't looked.   

In the olden days, in the before times, blacklisting insufficient
sequence numbers would mean btree node entries could become visible
after a crash that should've been ignored, because they were newer than
the newest (flush) journal entry.

That can't happen anymore because pointers to btree nodes now indicate
how many sectors we've written and completed within that node, and
they're updated after every btree node write (log append).

IOW - our mechanism for sequential consistency now is that the b-tree
behaves like a pure COW btree.

So I retract what I said before :) Having just checked the journal read
path, and the recovery path that creates the blacklist entries, I think
we're good.

The only reason we'd need to blacklist more is if we want some kind of a
double check on the modern sequential consistency mechanism, and I don't
think we need that.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcachefs: Increase blacklist range
  2025-03-18 14:57       ` Kent Overstreet
@ 2025-03-18 15:19         ` Alan Huang
  2025-03-18 16:46           ` Kent Overstreet
  0 siblings, 1 reply; 8+ messages in thread
From: Alan Huang @ 2025-03-18 15:19 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: John Stoffel, linux-bcachefs

On Mar 18, 2025, at 22:57, Kent Overstreet <kent.overstreet@linux.dev> wrote:
> 
> On Tue, Mar 18, 2025 at 10:26:03AM -0400, John Stoffel wrote:
>>>>>>> "Kent" == Kent Overstreet <kent.overstreet@linux.dev> writes:
>> 
>>> On Mon, Mar 17, 2025 at 04:58:26PM -0400, John Stoffel wrote:
>>>>>>>>> "Alan" == Alan Huang <mmpgouride@gmail.com> writes:
>>>> 
>>>>> Now there are 16 journal buffers, 8 is too small to be enough.
>>>>> Signed-off-by: Alan Huang <mmpgouride@gmail.com>
>>>>> ---
>>>>> fs/bcachefs/recovery.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>> 
>>>>> diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
>>>>> index 71c786cdb192..a6e26733854d 100644
>>>>> --- a/fs/bcachefs/recovery.c
>>>>> +++ b/fs/bcachefs/recovery.c
>>>>> @@ -899,7 +899,7 @@ int bch2_fs_recovery(struct bch_fs *c)
>>>>> * journal sequence numbers:
>>>>> */
>>>>> if (!c->sb.clean)
>>>>> - journal_seq += 8;
>>>>> + journal_seq += JOURNAL_BUF_NR * 4;
>>>> 
>>>> Instead of magic numbers, could you put in a define with an
>>>> explanation of how you arrived at this number?  Just to document the
>>>> assumptions better?
>>>> 
>>>> John
>> 
>>> The * 4 is a fudge factor.
>> 
>> Ok.
>> 
>>> But actually, I was giving this more thought and I don't think we have
>>> the correct number.
>> 
>> We have a WAG here.  :-)
>> 
>>> The real bound is "number of unflushed journal entries that might have
>>> been allocated, and have other items (btree nodes) referring to that
>>> sequence number, but which don't hit because beacuse they weren't
>>> flushed".
>> 
>>> And we don't have an actual bound on that.
>> 
>> So what happens if journal_seq overflows?  I don't know the code and
>> haven't looked.   
> 
> In the olden days, in the before times, blacklisting insufficient
> sequence numbers would mean btree node entries could become visible
> after a crash that should've been ignored, because they were newer than
> the newest (flush) journal entry.
> 
> That can't happen anymore because pointers to btree nodes now indicate
> how many sectors we've written and completed within that node, and
> they're updated after every btree node write (log append).
> 
> IOW - our mechanism for sequential consistency now is that the b-tree
> behaves like a pure COW btree.
> 
> So I retract what I said before :) Having just checked the journal read
> path, and the recovery path that creates the blacklist entries, I think
> we're good.
> 
> The only reason we'd need to blacklist more is if we want some kind of a
> double check on the modern sequential consistency mechanism, and I don't
> think we need that.

What about multi-btree sequential consistency ?

i.e. one btree completes the entire tree update, but the other doesn’t




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcachefs: Increase blacklist range
  2025-03-18 15:19         ` Alan Huang
@ 2025-03-18 16:46           ` Kent Overstreet
  0 siblings, 0 replies; 8+ messages in thread
From: Kent Overstreet @ 2025-03-18 16:46 UTC (permalink / raw)
  To: Alan Huang; +Cc: John Stoffel, linux-bcachefs

On Tue, Mar 18, 2025 at 11:19:52PM +0800, Alan Huang wrote:
> On Mar 18, 2025, at 22:57, Kent Overstreet <kent.overstreet@linux.dev> wrote:
> > In the olden days, in the before times, blacklisting insufficient
> > sequence numbers would mean btree node entries could become visible
> > after a crash that should've been ignored, because they were newer than
> > the newest (flush) journal entry.
> > 
> > That can't happen anymore because pointers to btree nodes now indicate
> > how many sectors we've written and completed within that node, and
> > they're updated after every btree node write (log append).
> > 
> > IOW - our mechanism for sequential consistency now is that the b-tree
> > behaves like a pure COW btree.
> > 
> > So I retract what I said before :) Having just checked the journal read
> > path, and the recovery path that creates the blacklist entries, I think
> > we're good.
> > 
> > The only reason we'd need to blacklist more is if we want some kind of a
> > double check on the modern sequential consistency mechanism, and I don't
> > think we need that.
> 
> What about multi-btree sequential consistency ?
> 
> i.e. one btree completes the entire tree update, but the other doesn’t

Btree updates are completed via journal updates, it's all synchronized
via the journal

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-03-18 16:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-15  7:39 [PATCH] bcachefs: Increase blacklist range Alan Huang
2025-03-15 21:49 ` Kent Overstreet
2025-03-17 20:58 ` John Stoffel
2025-03-18  0:33   ` Kent Overstreet
2025-03-18 14:26     ` John Stoffel
2025-03-18 14:57       ` Kent Overstreet
2025-03-18 15:19         ` Alan Huang
2025-03-18 16:46           ` Kent Overstreet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox