* infinite looping in logical_ino ioctl
From: Zygo Blaxell @ 2022-11-12 3:49 UTC (permalink / raw)
To: linux-btrfs
I've been chasing an infinite loop in the logical_ino ioctl that appears
when dedupe and logical_ino are used together on the same filesystem.
An easy way to do that is to run bees on a CPU with a double-digit
number of cores, but I've also knocked down servers several times in
the last year by accidentally allowing 'btdu' and 'duperemove' to run
at the same time.
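(For context: LOGICAL_INO takes a logical extent address and walks backrefs
to report every (inode, offset, root) that references it. A minimal
userspace sketch of the call--the mount point and extent address are
placeholders, error handling trimmed--is roughly what these tools are
doing in a tight loop while dedupe rewrites the extent tree underneath:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/btrfs.h>

int main(void)
{
	int fd = open("/mnt/btrfs", O_RDONLY);	/* placeholder path */
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Result buffer: a btrfs_data_container header followed by
	 * (inum, offset, root) __u64 triples.  The v1 ioctl caps
	 * this buffer at 64KiB. */
	struct btrfs_data_container *inodes = calloc(1, 65536);
	struct btrfs_ioctl_logical_ino_args args;

	memset(&args, 0, sizeof(args));
	args.logical = 318775296ULL;	/* extent bytenr to resolve */
	args.size = 65536;
	args.inodes = (uintptr_t)inodes;

	if (ioctl(fd, BTRFS_IOC_LOGICAL_INO, &args) < 0) {
		perror("BTRFS_IOC_LOGICAL_INO");
		return 1;
	}

	for (uint32_t i = 0; i < inodes->elem_cnt; i += 3)
		printf("inum %llu offset %llu root %llu\n",
		       (unsigned long long)inodes->val[i],
		       (unsigned long long)inodes->val[i + 1],
		       (unsigned long long)inodes->val[i + 2]);
	return 0;
}
```

Each of those resolutions ends up in find_parent_nodes/add_all_parents in
fs/btrfs/backref.c, which is where the loop below lives.)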
The bug has been highly resistant to analysis. Even in the best cases,
it takes up to 70 hours of heavy dedupe+logical_ino on every core to
trigger. bpftrace relies on RCU, but the RCU grace period doesn't happen
on the core running the infinite loop, so bpftraces simply stop when
the bug occurs. Almost any change to the code in fs/btrfs/backref.c,
even incrementing a static variable counter in some other function,
causes the problem to become much harder to repro, and another similar
change makes it come back. Once the infinite loop is started, it's
fairly robust--nothing but a reboot gets it out of the loop.
Yesterday I was able to capture the bug in kgdb on an
unmodified 5.19.16 kernel after 60 hours, and the infinite loop is in
add_all_parents:
462 while (!ret && count < ref->count) {
...
486 fi = btrfs_item_ptr(eb, slot, struct btrfs_file_extent_item);
487 disk_byte = btrfs_file_extent_disk_bytenr(eb, fi);
488 data_offset = btrfs_file_extent_offset(eb, fi);
489
490 if (disk_byte == wanted_disk_byte) {
...
517 next:
518 if (time_seq == BTRFS_SEQ_LAST)
519 ret = btrfs_next_item(root, path);
520 else
521 ret = btrfs_next_old_item(root, path, time_seq);
522 }
In the infinite looping case, time_seq is a 4-digit number, ret and
count are always 0, ref->count is 1, and disk_byte != wanted_disk_byte.
Those conditions never change, so we can't get out of this loop.
When I tried to probe more deeply what btrfs_next_old_item was doing,
I found that the code is somehow executing btrfs_next_item on line 519,
despite time_seq having the value 3722 at the time. Iteration over the
items in views at two different points in time sounds like it could
result in infinite looping, but I wasn't able to confirm that before
the gdb session died. I'm now waiting for my test VM to repro again.
Maybe this bug isn't in the _code_ after all...? I wasn't expecting
to get here, so I'm not sure what to try next.
Kernel built on Debian with gcc (Debian 12.2.0-7) 12.2.0.
Old references:
User report of bees lockup:
[1] https://lore.kernel.org/linux-btrfs/c9f1640177563f545ef70eb6ec1560faa1bb1bd7.camel@bcom.cz/
My previous attempt to bisect the bug or use bpftrace on it:
[2] https://lore.kernel.org/linux-btrfs/20211210183456.GP17148@hungrycats.org/
* Re: infinite looping in logical_ino ioctl
From: Zygo Blaxell @ 2022-11-13 20:05 UTC (permalink / raw)
To: linux-btrfs
On Fri, Nov 11, 2022 at 10:49:17PM -0500, Zygo Blaxell wrote:
> I've been chasing an infinite loop in the logical_ino ioctl that appears
> when dedupe and logical_ino are used together on the same filesystem.
[...]
> add_all_parents:
I added a simple loop counter to make it easier to get gdb to stop
at the right place:
	size_t loop_count = 0;

> 462 	while (!ret && count < ref->count) {
		if (loop_count++ == 999999) {
			printk(KERN_ERR "loop_count exceeded, break here\n");
		}
Then I set a breakpoint there, and had the debugger set count = 1
to force logical_ino out of the loop in add_all_parents. Almost
immediately, the same thread goes right back into add_all_parents
with a different search key:
(gdb) commands 6
>set var count=1
>c
>end
(gdb) continue
Thread 3 hit Breakpoint 6, add_all_parents (ignore_offset=<optimized out>, extent_item_pos=0xffffc90008897be8, time_seq=11943, level=<optimized out>, ref=0xffff8881dca3a210, preftrees=0xffffc90008897ac0, parents=0xffff88842c9920a0, path=0xffff88815d347460, root=<optimized out>) at fs/btrfs/backref.c:466
466 printk(KERN_ERR "loop_count exceeded, break here\n");
1: key = {objectid = 304, type = 108 'l', offset = 318775296}
2: count = 0
3: *key_for_search = {objectid = 304, type = 108 'l', offset = 721883136}
4: ref->count = 1
Thread 3 hit Breakpoint 6, add_all_parents (ignore_offset=<optimized out>, extent_item_pos=0xffffc90008897be8, time_seq=11943, level=<optimized out>, ref=0xffff88842f8a9630, preftrees=0xffffc90008897ac0, parents=0xffff88842c9920a0, path=0xffff88815d347460, root=<optimized out>) at fs/btrfs/backref.c:466
466 printk(KERN_ERR "loop_count exceeded, break here\n");
1: key = {objectid = 304, type = 108 'l', offset = 318775296}
2: count = 0
3: *key_for_search = {objectid = 304, type = 108 'l', offset = 721879040}
4: ref->count = 1
Thread 3 hit Breakpoint 6, add_all_parents (ignore_offset=<optimized out>, extent_item_pos=0xffffc90008897be8, time_seq=11943, level=<optimized out>, ref=0xffff8881a5a86478, preftrees=0xffffc90008897ac0, parents=0xffff88842c9920a0, path=0xffff88815d347460, root=<optimized out>) at fs/btrfs/backref.c:466
466 printk(KERN_ERR "loop_count exceeded, break here\n");
1: key = {objectid = 304, type = 108 'l', offset = 318775296}
2: count = 0
3: *key_for_search = {objectid = 304, type = 108 'l', offset = 721874944}
4: ref->count = 2
Thread 3 hit Breakpoint 6, add_all_parents (ignore_offset=<optimized out>, extent_item_pos=0xffffc90008897be8, time_seq=11943, level=<optimized out>, ref=0xffff8881a03d1f20, preftrees=0xffffc90008897ac0, parents=0xffff88842c9920a0, path=0xffff88815d347460, root=<optimized out>) at fs/btrfs/backref.c:466
466 printk(KERN_ERR "loop_count exceeded, break here\n");
1: key = {objectid = 304, type = 108 'l', offset = 318775296}
2: count = 0
3: *key_for_search = {objectid = 304, type = 108 'l', offset = 721850368}
4: ref->count = 1
Thread 3 hit Breakpoint 6, add_all_parents (ignore_offset=<optimized out>, extent_item_pos=0xffffc90008897be8, time_seq=11943, level=<optimized out>, ref=0xffff88816b340c60, preftrees=0xffffc90008897ac0, parents=0xffff88842c9920a0, path=0xffff88815d347460, root=<optimized out>) at fs/btrfs/backref.c:466
466 printk(KERN_ERR "loop_count exceeded, break here\n");
1: key = {objectid = 304, type = 108 'l', offset = 318775296}
2: count = 0
3: *key_for_search = {objectid = 304, type = 108 'l', offset = 721829888}
4: ref->count = 1
That did get logical_ino to eventually give up and return to userspace,
which ends my debugging session for today. Next time it gets stuck,
I'll put some breakpoints further up the call stack to see what the
caller's state looks like.
I'm not sure how to debug btrfs_next_old_item(), as gdb has some real
problems controlling execution in that function, and examining variables
in there is a great way to find the bugs in gdb itself.
Is there a GDB trick to get the kernel to dump out the eb contents
from gdb? Calling btrfs_print_leaf(eb) from the debugger seems to make
the kernel splat. I suppose I could call it when loop_count hits 999999.
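(Untested sketch of that idea: wire the dump into the existing counter so
the kernel prints the leaf itself instead of gdb calling into it.
btrfs_print_leaf() here is the existing helper declared in
fs/btrfs/print-tree.h:

```c
	/* In add_all_parents(), extending the loop_count hack above. */
	if (loop_count++ == 999999) {
		printk(KERN_ERR "add_all_parents: stuck, dumping eb\n");
		btrfs_print_leaf(eb);	/* from fs/btrfs/print-tree.h */
	}
```

That would avoid calling kernel functions from the debugger context
entirely, which is presumably what makes it splat.)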
This part was wrong in my earlier post:
> When I tried to probe more deeply what btrfs_next_old_item was doing,
> I found that the code is somehow executing btrfs_next_item on line 519,
> despite time_seq having the value 3722 at the time. Iteration over the
> items in views at two different points in time sounds like it could
> result in infinite looping, but I wasn't able to confirm that before
> the gdb session died. I'm now waiting for my test VM to repro again.
This is just an artifact of inline function optimization. The assembly
code for that if statement is full of jumps that don't correspond neatly
to source lines. If the pc is pointing anywhere near that block, gdb
thinks it's on line 519.