public inbox for linux-ext4@vger.kernel.org
* [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
@ 2024-12-12  3:58 Sergey Senozhatsky
  2024-12-12  4:14 ` Matthew Wilcox
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-12-12  3:58 UTC (permalink / raw)
  To: Theodore Ts'o, Christoph Hellwig, Jens Axboe, caiqingfu
  Cc: Andrew Morton, Sergey Senozhatsky, linux-ext4, linux-block,
	linux-fsdevel

Hi,

We've got two reports [1] [2] (could be the same person) which
suggest that ext4 may change page content while the page is under
write().  The particular problem here is the case when ext4 is on
the zram device.  zram compresses every page written to it, so if
the page content can be modified concurrently with zram's compression
then we can't really use zram with ext4.
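
To make the hazard concrete, here is a minimal userspace sketch.  It is
not zram's code: toy_compress() is a hypothetical run-length encoder
standing in for a real compressor, and TOY_PAGE_SIZE is a toy value.
It only shows that a destination sized from one pass over the page can
be too small if the page changes before a second pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 64 /* toy value, not the kernel's PAGE_SIZE */

/*
 * Toy run-length encoder (hypothetical, stands in for a real compressor).
 * Emits (run length, byte) pairs into dst, never writing past cap.
 * Returns the number of bytes the full output needs.
 */
size_t toy_compress(const unsigned char *src, size_t n,
                    unsigned char *dst, size_t cap)
{
    size_t out = 0, i = 0;

    assert(src != NULL && dst != NULL);
    while (i < n) {
        size_t run = 1;

        while (i + run < n && src[i + run] == src[i] && run < 255)
            run++;
        if (out + 2 <= cap) {
            dst[out] = (unsigned char)run;
            dst[out + 1] = src[i];
        }
        out += 2;
        i += run;
    }
    return out;
}

/*
 * Models the race: size the destination from a first pass, let the
 * "file system" change the page, then compress again.
 * Returns how many bytes the second pass needs beyond the first-pass size.
 */
size_t toy_overflow_after_modification(void)
{
    unsigned char page[TOY_PAGE_SIZE];
    unsigned char scratch[2 * TOY_PAGE_SIZE];
    size_t first_len, second_len;

    memset(page, 0, sizeof(page));             /* highly compressible */
    first_len = toy_compress(page, sizeof(page), scratch, sizeof(scratch));

    for (size_t i = 0; i < sizeof(page); i++)  /* simulated concurrent change */
        page[i] = (unsigned char)i;

    second_len = toy_compress(page, sizeof(page), scratch, sizeof(scratch));
    return second_len > first_len ? second_len - first_len : 0;
}
```

A nonzero return from toy_overflow_after_modification() is the number of
bytes that would land past the end of an allocation sized from the first
pass.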

Can you take a look please?

[1] https://bugzilla.kernel.org/show_bug.cgi?id=219548
[2] https://lore.kernel.org/linux-kernel/20241129115735.136033-1-baicaiaichibaicai@gmail.com


* Re: [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
  2024-12-12  3:58 [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device Sergey Senozhatsky
@ 2024-12-12  4:14 ` Matthew Wilcox
  2024-12-12  4:35   ` Sergey Senozhatsky
  2024-12-12  4:49 ` Christoph Hellwig
  2024-12-12  5:37 ` Theodore Ts'o
  2 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2024-12-12  4:14 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Theodore Ts'o, Christoph Hellwig, Jens Axboe, caiqingfu,
	Andrew Morton, linux-ext4, linux-block, linux-fsdevel

On Thu, Dec 12, 2024 at 12:58:26PM +0900, Sergey Senozhatsky wrote:
> Hi,
> 
> We've got two reports [1] [2] (could be the same person) which
> suggest that ext4 may change page content while the page is under
> write().  The particular problem here is the case when ext4 is on
> the zram device.  zram compresses every page written to it, so if
> the page content can be modified concurrently with zram's compression
> then we can't really use zram with ext4.

Do you set BLK_FEAT_STABLE_WRITES on zram?


* Re: [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
  2024-12-12  4:14 ` Matthew Wilcox
@ 2024-12-12  4:35   ` Sergey Senozhatsky
  0 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-12-12  4:35 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Sergey Senozhatsky, Theodore Ts'o, Christoph Hellwig,
	Jens Axboe, caiqingfu, Andrew Morton, linux-ext4, linux-block,
	linux-fsdevel

On (24/12/12 04:14), Matthew Wilcox wrote:
> > We've got two reports [1] [2] (could be the same person) which
> > suggest that ext4 may change page content while the page is under
> > write().  The particular problem here is the case when ext4 is on
> > the zram device.  zram compresses every page written to it, so if
> > the page content can be modified concurrently with zram's compression
> > then we can't really use zram with ext4.
> 
> Do you set BLK_FEAT_STABLE_WRITES on zram?

Yes, zram sets BLK_FEAT_STABLE_WRITES and BLK_FEAT_SYNCHRONOUS.


* Re: [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
  2024-12-12  3:58 [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device Sergey Senozhatsky
  2024-12-12  4:14 ` Matthew Wilcox
@ 2024-12-12  4:49 ` Christoph Hellwig
  2024-12-12  5:37 ` Theodore Ts'o
  2 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2024-12-12  4:49 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Theodore Ts'o, Christoph Hellwig, Jens Axboe, caiqingfu,
	Andrew Morton, linux-ext4, linux-block, linux-fsdevel

On Thu, Dec 12, 2024 at 12:58:26PM +0900, Sergey Senozhatsky wrote:
> Hi,
> 
> We've got two reports [1] [2] (could be the same person) which
> suggest that ext4 may change page content while the page is under
> write().  The particular problem here is the case when ext4 is on
> the zram device.  zram compresses every page written to it, so if
> the page content can be modified concurrently with zram's compression
> then we can't really use zram with ext4.

This smells like ext4 doesn't respect BDI_CAP_STABLE_WRITES somewhere.



* Re: [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
  2024-12-12  3:58 [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device Sergey Senozhatsky
  2024-12-12  4:14 ` Matthew Wilcox
  2024-12-12  4:49 ` Christoph Hellwig
@ 2024-12-12  5:37 ` Theodore Ts'o
  2024-12-12  6:30   ` Sergey Senozhatsky
                     ` (2 more replies)
  2 siblings, 3 replies; 10+ messages in thread
From: Theodore Ts'o @ 2024-12-12  5:37 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Christoph Hellwig, Jens Axboe, caiqingfu, Andrew Morton,
	linux-ext4, linux-block, linux-fsdevel

On Thu, Dec 12, 2024 at 12:58:26PM +0900, Sergey Senozhatsky wrote:
> Hi,
> 
> We've got two reports [1] [2] (could be the same person) which
> suggest that ext4 may change page content while the page is under
> write().  The particular problem here is the case when ext4 is on
> the zram device.  zram compresses every page written to it, so if
> the page content can be modified concurrently with zram's compression
> then we can't really use zram with ext4.
> 
> Can you take a look please?
> 
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=219548
> [2] https://lore.kernel.org/linux-kernel/20241129115735.136033-1-baicaiaichibaicai@gmail.com

The link in [2] is a bit busted, since the message in question wasn't
cc'ed to LKML, but rather to mm-commits.  But dropping "/linux-kernel"
allows the link to work, and what's interesting is this message from
that thread:

https://lore.kernel.org/all/20241202060632.139067-1-baicaiaichibaicai@gmail.com/

The blocks which are getting modified while a write is in flight are
ext4 metadata blocks, which are in the buffer cache.  Ext4 is
modifying those blocks via bh->b_data, and ext4 isn't issuing the
write; those are happening via the buffer cache's writeback functions.

Hmmm.... was the user using an ext4 file system with the journal
disabled, by any chance?  If ext4 is using the journal (which is the
common case), metadata blocks only get modified via jbd2 journal
functions, and blocks only get modified when they are part of a jbd2
transaction --- and while the transaction is active, the buffer cache
writeback is disabled.  It's only after the transaction is committed
that the dirty blocks associated with that transaction are allowed to
be written back.  So I *think* the only way we could run into problems
is if ext4's jbd2 journalling is disabled.
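
The argument above can be written down as a toy state model.  This is
only a sketch of the reasoning, not jbd2's implementation: the struct,
the three states, and every toy_* name are hypothetical, and real jbd2
tracks considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transaction states; real jbd2 has more than these. */
enum toy_txn_state { TOY_TXN_NONE, TOY_TXN_RUNNING, TOY_TXN_COMMITTED };

struct toy_buffer {
    bool dirty;              /* has changes not yet on disk */
    bool journalled;         /* file system runs with a journal */
    enum toy_txn_state txn;  /* state of the owning transaction */
};

/* With a journal, metadata changes only inside a running transaction. */
bool toy_may_modify(const struct toy_buffer *b)
{
    assert(b != NULL);
    return !b->journalled || b->txn == TOY_TXN_RUNNING;
}

/* Writeback is held off while the owning transaction is still running. */
bool toy_may_write_back(const struct toy_buffer *b)
{
    assert(b != NULL);
    return b->dirty && !(b->journalled && b->txn == TOY_TXN_RUNNING);
}

/* The hazard exists only if both are permitted at the same time. */
bool toy_can_race(const struct toy_buffer *b)
{
    return toy_may_modify(b) && toy_may_write_back(b);
}
```

In this model a journalled buffer is never both modifiable and eligible
for writeback, while a dirty buffer on a file system without a journal
always is.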

More generally, any file system which uses the buffer cache, and
doesn't use jbd2 to control when writeback happens, I think is going
to be at risk with a block device which requires stable writes.  The
only way to fix this, really, is to have the buffer cache code copy
the data to a bounce buffer, and then issue the write from the bounce
buffer.
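
As a rough userspace illustration of the bounce buffer idea (not the
buffer cache code, and not a proposed patch): toy_device_write() is a
hypothetical device that reads its source twice, and the flip of
live[0] between the two reads stands in for a file system touching
bh->b_data while the write is in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK_SIZE 32 /* toy value */

/* Simple polynomial checksum, used only to detect a changed block. */
static unsigned toy_sum(const unsigned char *p, size_t n)
{
    unsigned s = 0;

    for (size_t i = 0; i < n; i++)
        s = s * 31u + p[i];
    return s;
}

/*
 * Hypothetical device write that reads src twice, as a checksumming or
 * compressing device might.  Between the passes the owner of `live`
 * modifies it.  Returns true if what was stored matches the first pass.
 */
bool toy_device_write(unsigned char *media, const unsigned char *src,
                      unsigned char *live)
{
    unsigned expected;

    assert(media != NULL && src != NULL && live != NULL);
    expected = toy_sum(src, TOY_BLOCK_SIZE);   /* pass 1 */
    live[0] ^= 0xff;                           /* simulated concurrent change */
    memcpy(media, src, TOY_BLOCK_SIZE);        /* pass 2 */
    return toy_sum(media, TOY_BLOCK_SIZE) == expected;
}

/* Writes straight from the live buffer: the device can see the change. */
bool toy_write_direct(unsigned char *media, unsigned char *live)
{
    return toy_device_write(media, live, live);
}

/* Copies to a private bounce buffer first: the device sees a snapshot. */
bool toy_write_bounced(unsigned char *media, unsigned char *live)
{
    unsigned char bounce[TOY_BLOCK_SIZE];

    memcpy(bounce, live, TOY_BLOCK_SIZE);
    return toy_device_write(media, bounce, live);
}
```

The cost of the bounced path is one extra copy per block written, which
is the trade-off any such fix would have to accept.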

						- Ted




* Re: [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
  2024-12-12  5:37 ` Theodore Ts'o
@ 2024-12-12  6:30   ` Sergey Senozhatsky
  2024-12-12  7:35   ` Christoph Hellwig
  2024-12-12  8:37   ` Sergey Senozhatsky
  2 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-12-12  6:30 UTC (permalink / raw)
  To: Theodore Ts'o, Yu Huabing
  Cc: Sergey Senozhatsky, Christoph Hellwig, Jens Axboe, caiqingfu,
	Andrew Morton, linux-ext4, linux-block, linux-fsdevel

On (24/12/12 00:37), Theodore Ts'o wrote:
> On Thu, Dec 12, 2024 at 12:58:26PM +0900, Sergey Senozhatsky wrote:
> > Hi,
> > 
> > We've got two reports [1] [2] (could be the same person) which
> > suggest that ext4 may change page content while the page is under
> > write().  The particular problem here is the case when ext4 is on
> > the zram device.  zram compresses every page written to it, so if
> > the page content can be modified concurrently with zram's compression
> > then we can't really use zram with ext4.
> > 
> > Can you take a look please?
> > 
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=219548
> > [2] https://lore.kernel.org/linux-kernel/20241129115735.136033-1-baicaiaichibaicai@gmail.com
> 
> The link in [2] is a bit busted, since the message in question wasn't
> cc'ed to LKML, but rather to mm-commits.  But dropping "/linux-kernel"
> allows the link to work, and what's interesting is this message from
> that thread:

My bad.

> https://lore.kernel.org/all/20241202060632.139067-1-baicaiaichibaicai@gmail.com/

Let me Cc Yu Huabing on this:

> The blocks which are getting modified while a write is in flight are
> ext4 metadata blocks, which are in the buffer cache.  Ext4 is
> modifying those blocks via bh->b_data, and ext4 isn't issuing the
> write; those are happening via the buffer cache's writeback functions.
> 
> Hmmm.... was the user using an ext4 file system with the journal
> disabled, by any chance?  If ext4 is using the journal (which is the
> common case), metadata blocks only get modified via jbd2 journal
> functions, and blocks only get modified when they are part of a jbd2
> transaction --- and while the transaction is active, the buffer cache
> writeback is disabled.  It's only after the transaction is committed
> that the dirty blocks associated with that transaction are allowed to
> be written back.  So I *think* the only way we could run into problems
> is if ext4's jbd2 journalling is disabled.
> 
> More generally, any file system which uses the buffer cache, and
> doesn't use jbd2 to control when writeback happens, I think is going
> to be at risk with a block device which requires stable writes.  The
> only way to fix this, really, is to have the buffer cache code copy
> the data to a bounce buffer, and then issue the write from the bounce
> buffer.


* Re: [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
  2024-12-12  5:37 ` Theodore Ts'o
  2024-12-12  6:30   ` Sergey Senozhatsky
@ 2024-12-12  7:35   ` Christoph Hellwig
  2024-12-12 14:04     ` Theodore Ts'o
  2024-12-12  8:37   ` Sergey Senozhatsky
  2 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2024-12-12  7:35 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Sergey Senozhatsky, Christoph Hellwig, Jens Axboe, caiqingfu,
	Andrew Morton, linux-ext4, linux-block, linux-fsdevel

On Thu, Dec 12, 2024 at 12:37:39AM -0500, Theodore Ts'o wrote:
> More generally, any file system which uses the buffer cache, and
> doesn't use jbd2 to control when writeback happens, I think is going
> to be at risk with a block device which requires stable writes.  The
> only way to fix this, really, is to have the buffer cache code copy
> the data to a bounce buffer, and then issue the write from the bounce
> buffer.

Should there be a pr_warn_once when a file system using the legacy
buffer cache interfaces is used on a device that requires stable pages?
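
For illustration only, a userspace imitation of such a one-time warning
might look like the sketch below.  The structs, their fields, and
toy_warn_legacy_on_stable() are all hypothetical; this does not use the
kernel's pr_warn_once and says nothing about where such a check would
live.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical descriptions of a block device and a file system. */
struct toy_bdev {
    const char *name;
    bool stable_writes;      /* device requires stable pages */
};

struct toy_fs {
    const char *name;
    bool uses_buffer_cache;  /* writes metadata via the buffer cache */
};

/*
 * Warns the first time a buffer-cache file system is used on a device
 * that requires stable writes.  Returns true if this call printed.
 */
bool toy_warn_legacy_on_stable(const struct toy_fs *fs,
                               const struct toy_bdev *bdev)
{
    static bool warned;  /* one flag for the whole process */

    assert(fs != NULL && bdev != NULL);
    if (!fs->uses_buffer_cache || !bdev->stable_writes || warned)
        return false;
    warned = true;
    fprintf(stderr, "%s: buffer cache writeback on %s, which requires stable writes\n",
            fs->name, bdev->name);
    return true;
}
```

A single static flag means later offenders stay silent, which keeps the
log quiet but also hides every case after the first.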



* Re: [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
  2024-12-12  5:37 ` Theodore Ts'o
  2024-12-12  6:30   ` Sergey Senozhatsky
  2024-12-12  7:35   ` Christoph Hellwig
@ 2024-12-12  8:37   ` Sergey Senozhatsky
  2 siblings, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2024-12-12  8:37 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Sergey Senozhatsky, Christoph Hellwig, Jens Axboe, caiqingfu,
	Andrew Morton, linux-ext4, linux-block, linux-fsdevel

On (24/12/12 00:37), Theodore Ts'o wrote:
> The blocks which are getting modified while a write is in flight are
> ext4 metadata blocks, which are in the buffer cache.  Ext4 is
> modifying those blocks via bh->b_data, and ext4 isn't issuing the
> write; those are happening via the buffer cache's writeback functions.
>
> Hmmm.... was the user using an ext4 file system with the journal
> disabled, by any chance?

I believe you are right, at least that's what caiqingfu said [1]:

echo 524288000 > /sys/devices/virtual/block/zram0/disksize
mkfs.ext4 -O ^has_journal -b 4096 -F -L TEMP -m 0 /dev/zram0
mkdir /tmp/zram
mount -t ext4 -o errors=continue,nosuid,nodev,noatime /dev/zram0 /tmp/zram

[1] https://lore.kernel.org/mm-commits/20241202100753.139305-1-baicaiaichibaicai@gmail.com/


* Re: [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
  2024-12-12  7:35   ` Christoph Hellwig
@ 2024-12-12 14:04     ` Theodore Ts'o
  2024-12-12 14:12       ` Theodore Ts'o
  0 siblings, 1 reply; 10+ messages in thread
From: Theodore Ts'o @ 2024-12-12 14:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sergey Senozhatsky, Jens Axboe, caiqingfu, Andrew Morton,
	linux-ext4, linux-block, linux-fsdevel

On Wed, Dec 11, 2024 at 11:35:40PM -0800, Christoph Hellwig wrote:
> On Thu, Dec 12, 2024 at 12:37:39AM -0500, Theodore Ts'o wrote:
> > More generally, any file system which uses the buffer cache, and
> > doesn't use jbd2 to control when writeback happens, I think is going
> > to be at risk with a block device which requires stable writes.  The
> > only way to fix this, really, is to have the buffer cache code copy
> > the data to a bounce buffer, and then issue the write from the bounce
> > buffer.
> 
> Should there be a pr_warn_once when a file system using the legacy
> buffer cache interfaces is used on a device that requires stable pages?

Well, either that, or we need to teach the buffer cache writeback code
to issue writes through a bounce buffer if the device requires stable
writes.

I'll note that this could also manifest if some program was writing to
a device that requires stable writes using buffered I/O.  For example,
if they are using Postgres, which won't be switching to direct I/O for
another 2-5 years (depending on how optimistic you are and how willing
enterprise customers will be to move to the latest version of
Postgres; some are still using very ancient Postgres for the same
reason that RHEL 7 systems based on the 3.10 kernel are still in
production use even today).

For this particular use case, which is running VM's on
Chromium/ChromeOS, I suspect we do need to have some kind of solution
other than triggering a WARN_ON.  Besides, I'd really rather not get
the kind of syzbot noise we would have by having some scheme that
would be trivially easy for syzbot to trigger.  (We're not supposed to
use WARN_ON for things that can be triggered by Stupid User Tricks,
because syzbot fuzzers can be so ingenious.  :-)

						- Ted


* Re: [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device
  2024-12-12 14:04     ` Theodore Ts'o
@ 2024-12-12 14:12       ` Theodore Ts'o
  0 siblings, 0 replies; 10+ messages in thread
From: Theodore Ts'o @ 2024-12-12 14:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sergey Senozhatsky, Jens Axboe, caiqingfu, Andrew Morton,
	linux-ext4, linux-block, linux-fsdevel

On Thu, Dec 12, 2024 at 09:04:37AM -0500, Theodore Ts'o wrote:
> For this particular use case, which is running VM's on
> Chromium/ChromeOS, I suspect we do need to have some kind of solution
> other than triggering a WARN_ON.

Sorry, I didn't complete my thought here.  We could just say, "don't
use ext4 without a journal in a Chrome VM."  But if they are going to
allow the VM's access to external USB storage, then ext4 in no-journal
mode will be the least of their problems.  People trying to access USB
thumbdrives or sdcards from their digital cameras using FAT file
systems will be triggering ZBLK buffer overflow kernel crashes left,
right, and center.  Especially if they are on a low-cost ChromeOS
device with a tiny amount of memory, such that memory pressure should
be considered a foregone conclusion.  :-)

						- Ted


Thread overview: 10+ messages
2024-12-12  3:58 [bugzilla:219548] the kernel crashes when storing an EXT4 file system in a ZRAM device Sergey Senozhatsky
2024-12-12  4:14 ` Matthew Wilcox
2024-12-12  4:35   ` Sergey Senozhatsky
2024-12-12  4:49 ` Christoph Hellwig
2024-12-12  5:37 ` Theodore Ts'o
2024-12-12  6:30   ` Sergey Senozhatsky
2024-12-12  7:35   ` Christoph Hellwig
2024-12-12 14:04     ` Theodore Ts'o
2024-12-12 14:12       ` Theodore Ts'o
2024-12-12  8:37   ` Sergey Senozhatsky