linux-ext4.vger.kernel.org archive mirror
* kernel BUG at fs/ext4/inode.c:1914 - page_buffers()
@ 2022-11-23 14:48 Ivan Zahariev
  2022-12-05 17:27 ` Ivan Zahariev
  2022-12-05 21:10 ` Theodore Ts'o
  0 siblings, 2 replies; 15+ messages in thread
From: Ivan Zahariev @ 2022-11-23 14:48 UTC (permalink / raw)
  To: linux-ext4; +Cc: Theodore Ts'o, Greg Kroah-Hartman

Hello,

Since moving to kernel 5.15, over the past eight months we have had a
total of 12 kernel panics across a fleet of 1000 KVM/QEMU machines, all
of which look like this:

     kernel BUG at fs/ext4/inode.c:1914

The problem appeared almost immediately after we switched from kernel
4.14 to 5.15. Userland activity stayed more or less the same, so we are
very confident that it is not the root cause. The kernel function which
triggers the BUG is __ext4_journalled_writepage(). In 5.15 the code for
__ext4_journalled_writepage() in "fs/ext4/inode.c" is the same as in the
current kernel "master". The line where the BUG is triggered is:

     struct buffer_head *page_bufs = page_buffers(page)

The definition of "page_buffers(page)" in "include/linux/buffer_head.h" 
hasn't changed since 4.14, so no difference here. This is where the 
actual "kernel BUG" event is triggered:

     /* If we *know* page->private refers to buffer_heads */
     #define page_buffers(page) \
         ({ \
             BUG_ON(!PagePrivate(page)); \
             ((struct buffer_head *)page_private(page)); \
         })
     #define page_has_buffers(page) PagePrivate(page)
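
To make the failure mode concrete, here is a minimal user-space sketch
of the two macros above. It is only an illustration, not kernel code and
not a proposed fix: "struct page" and "struct buffer_head" are reduced
to hypothetical stand-ins, and the model_* names are invented for this
example. The checked variant returns NULL where the real page_buffers()
would hit BUG_ON(), which is the condition our panics correspond to (a
page without PagePrivate set reaching the writepage path).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the kernel's definitions. */
struct buffer_head {
    int dummy;
};

struct page {
    bool has_private;    /* models the PagePrivate(page) flag */
    void *private_data;  /* models page_private(page) */
};

/* Models page_has_buffers(): a plain test with no side effects. */
static bool model_page_has_buffers(const struct page *page)
{
    return page->has_private;
}

/*
 * Models page_buffers(), except that it returns NULL in the case
 * where the kernel macro would trigger BUG_ON(!PagePrivate(page)).
 */
static struct buffer_head *model_page_buffers_checked(const struct page *page)
{
    if (!model_page_has_buffers(page))
        return NULL;
    return (struct buffer_head *)page->private_data;
}
```

In this model, a page with has_private set yields its buffer_head
pointer, while a page with has_private cleared yields NULL; in the
kernel the second case is the BUG we are seeing.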

Initially I thought that this issue had already been discussed here:
https://lore.kernel.org/all/Yg0m6IjcNmfaSokM@google.com/
But that seems to be a different (already solved) problem, for which
Theodore Ts'o made a quick fix that simply reports the rare occurrence
and continues. That commit is already in 5.15 (and in the latest
kernel), so it does not help in our case:
https://github.com/torvalds/linux/commit/cc5095747edfb054ca2068d01af20be3fcc3634f

Back to the problem! 99% of the difference between 4.14 and the latest 
kernel for __ext4_journalled_writepage() in "fs/ext4/inode.c" comes from 
the following commit: 
https://github.com/torvalds/linux/commit/5c48a7df91499e371ef725895b2e2d21a126e227

Is it safe for us to revert this patch on the latest 5.15 kernel, so
that we can confirm whether this resolves the issue for us?

Best regards.
--Ivan




Thread overview: 15+ messages
2022-11-23 14:48 kernel BUG at fs/ext4/inode.c:1914 - page_buffers() Ivan Zahariev
2022-12-05 17:27 ` Ivan Zahariev
2022-12-05 21:10 ` Theodore Ts'o
2022-12-05 21:50   ` Ivan Zahariev
2023-01-12 15:07   ` Jan Kara
2023-03-15 11:27     ` Ivan Zahariev
2023-03-15 17:32       ` Jan Kara
2023-03-15 18:57         ` Theodore Ts'o
2023-05-11  9:21           ` Marcus Hoffmann
2023-05-12 12:19             ` Greg KH
2023-05-12 14:24               ` Marcus Hoffmann
2023-05-12 22:50                 ` Greg KH
2023-05-15  9:10                 ` Jan Kara
2023-09-20  9:40             ` Mathieu Othacehe
2023-09-22  9:54               ` Jan Kara
