public inbox for linux-kernel@vger.kernel.org
* "netfs: Can't donate prior to front"
@ 2025-02-07 18:40 Max Kellermann
  2025-02-10 14:07 ` David Howells
  2025-02-10 19:11 ` [PATCH] fs/netfs/read_collect: add to next->prev_donated Max Kellermann
  0 siblings, 2 replies; 13+ messages in thread
From: Max Kellermann @ 2025-02-07 18:40 UTC (permalink / raw)
  To: David Howells, netfs, LKML, linux-nfs

Hi,

the following crash occurs with 6.13.1 on our servers every 20 minutes or so:

 netfs: Can't donate prior to front
 R=00070d30[3] s=9a000-9bfff 0/2000/2000
 folio: 98000-9bfff
 donated: prev=0 next=0
 s=9a000 av=2000 part=2000
 ------------[ cut here ]------------
 kernel BUG at fs/netfs/read_collect.c:315!
 Oops: invalid opcode: 0000 [#1] SMP PTI
 CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Not tainted 6.13.1-cm4all2-hp #416
 Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 11/23/2021
 RIP: 0010:netfs_consume_read_data.isra.0+0xa72/0xab0
 Code: 48 89 ea 31 f6 48 c7 c7 bb 7a d0 ae e8 b7 d2 d1 ff 48 8b 4c 24 20 4c 89 e2 48 c7 c7 d7 7a d0 ae 48 8b 74 24 18 e8 9e d2 d1 ff <0f> 0b 4c 89 ef 48 89 54 24 10 4c 89 44 24 08 e8 1a 4e b5 00 48 c7
 RSP: 0018:ffffb434cc448db0 EFLAGS: 00010246
 RAX: 0000000000000019 RBX: ffff8fa63d9cbec0 RCX: 0000000000000027
 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8fbb1f9db840
 RBP: 0000000000000000 R08: 00000000ffffbfff R09: 0000000000000001
 R10: 0000000000000003 R11: ffff8fd31f6a0000 R12: 0000000000002000
 R13: ffff8fa5350aaee8 R14: 0000000000004000 R15: ffff8fa5350aaee8
 FS:  0000000000000000(0000) GS:ffff8fbb1f9c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f9c5000ef48 CR3: 0000000bcee2e001 CR4: 00000000001706f0
 Call Trace:
  <IRQ>
  ? die+0x32/0x80
  ? do_trap+0xd8/0x100
  ? do_error_trap+0x65/0x80
  ? netfs_consume_read_data.isra.0+0xa72/0xab0
  ? exc_invalid_op+0x4c/0x60
  ? netfs_consume_read_data.isra.0+0xa72/0xab0
  ? asm_exc_invalid_op+0x16/0x20
  ? netfs_consume_read_data.isra.0+0xa72/0xab0
  ? __pfx_cachefiles_read_complete+0x10/0x10
  netfs_read_subreq_terminated+0x22d/0x370
  cachefiles_read_complete+0x48/0xf0
  iomap_dio_bio_end_io+0x125/0x160
  blk_update_request+0xea/0x3e0
  scsi_end_request+0x27/0x190
  scsi_io_completion+0x43/0x6c0
  blk_complete_reqs+0x40/0x50
  handle_softirqs+0xd1/0x280
  irq_exit_rcu+0x91/0xb0
  common_interrupt+0x79/0xa0
  </IRQ>
  <TASK>
  asm_common_interrupt+0x22/0x40
 RIP: 0010:cpuidle_enter_state+0xba/0x3b0
 Code: 00 e8 ea 86 1c ff e8 45 f7 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 73 b9 1b ff 45 84 ff 0f 85 f8 01 00 00 fb 45 85 f6 <0f> 88 46 01 00 00 48 8b 04 24 49 63 ce 48 6b d1 68 49 29 c5 48 89
 RSP: 0018:ffffb434c018be98 EFLAGS: 00000202
 RAX: ffff8fbb1f9c0000 RBX: ffffd41cbe7e3448 RCX: 000000000000001f
 RDX: 0000000000000007 RSI: 000000003149acb2 RDI: 0000000000000000
 RBP: 0000000000000004 R08: 0000000000000002 R09: 0000000000000000
 R10: 0000000000000004 R11: 000000000000001f R12: ffffffffaf660060
 R13: 0000030f6179fa73 R14: 0000000000000004 R15: 0000000000000000
  ? cpuidle_enter_state+0xad/0x3b0
  cpuidle_enter+0x29/0x40
  do_idle+0x19c/0x200
  cpu_startup_entry+0x25/0x30
  start_secondary+0xf3/0x100
  common_startup_64+0x13e/0x148
  </TASK>
 Modules linked in:
 ---[ end trace 0000000000000000 ]---
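
The hexadecimal offsets in the diagnostic lines can be compared with a few lines of arithmetic. The sketch below is only a reading of the logged numbers, not a diagnosis: the variable names and the helper `uncovered_front` are hypothetical, and the interpretation of `prev` as bytes donated by a preceding subrequest is an assumption based on the label in the message.

```python
# Values copied from the diagnostic lines above (all hexadecimal).
folio_start, folio_end = 0x98000, 0x9BFFF  # "folio: 98000-9bfff"
sub_start, sub_end = 0x9A000, 0x9BFFF      # "s=9a000-9bfff"
prev_donated = 0x0                         # "donated: prev=0"


def uncovered_front(folio_start: int, sub_start: int, prev_donated: int) -> int:
    """Hypothetical helper: bytes at the front of the folio that are covered
    neither by this subrequest nor by data donated from a previous one."""
    covered_from = sub_start - prev_donated
    return max(0, covered_from - folio_start)


sub_len = sub_end - sub_start + 1
gap = uncovered_front(folio_start, sub_start, prev_donated)

print(f"subrequest length:      {sub_len:#x}")
print(f"uncovered front of folio: {gap:#x}")
```

Under these assumptions the subrequest is 0x2000 bytes long, the same value as the `av=2000` field, and it starts 0x2000 bytes into the folio while nothing has been donated to cover that leading part.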

This is a server with heavy NFS traffic (with fscache enabled).

Please help - and let me know if you need more information.

Max

* Re: [PATCH] fs/netfs/read_collect: add to next->prev_donated
@ 2025-03-07 16:09 Norbert Lange
  0 siblings, 0 replies; 13+ messages in thread
From: Norbert Lange @ 2025-03-07 16:09 UTC (permalink / raw)
  To: max.kellermann
  Cc: Salvatore Bonaccorso, dhowells, gregkh, linux-kernel, linux-nfs,
	netfs

I reproduced this with the kernel versions available in Debian:

linux-image-6.12.12-amd64  6.12.12-1 -> segfault
linux-image-6.12.17-amd64  6.12.17-1 -> segfault
linux-image-6.13-amd64  6.13.5-1~exp1 -> 'kernel BUG at fs/netfs/read_collect.c:316!'

Then I took the Debian 6.12.17-1 kernel (latest LTS) and added these three patches:

https://lore.kernel.org/netfs/20250211093432.3524035-1-max.kellermann@ionos.com/
https://lore.kernel.org/netfs/20250210223144.3481766-1-max.kellermann@ionos.com/
https://lore.kernel.org/netfs/20250210191118.3444416-1-max.kellermann@ionos.com/

The resulting kernel apparently fixes the issue. So far I have only tested
it in QEMU (no signed kernel for Secure Boot).

Tested-by: Norbert Lange <nolange79@gmail.com>



Thread overview: 13+ messages
2025-02-07 18:40 "netfs: Can't donate prior to front" Max Kellermann
2025-02-10 14:07 ` David Howells
2025-02-10 17:41   ` Max Kellermann
2025-02-10 17:45     ` Max Kellermann
2025-02-10 19:11 ` [PATCH] fs/netfs/read_collect: add to next->prev_donated Max Kellermann
2025-02-14 12:47   ` David Howells
2025-02-20 13:09     ` Max Kellermann
2025-02-20 14:17       ` Greg Kroah-Hartman
2025-02-20 15:00         ` Max Kellermann
2025-02-20 15:10           ` Greg Kroah-Hartman
2025-03-01 14:17           ` Salvatore Bonaccorso
2025-03-01 16:51             ` Max Kellermann
  -- strict thread matches above, loose matches on Subject: below --
2025-03-07 16:09 Norbert Lange
