Linux block layer
* Direct IO page bouncing got some garbage?
From: Qu Wenruo @ 2026-06-12  1:41 UTC
  To: linux-btrfs, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, Linux Memory Management List

Hi,

I've recently been trying to make btrfs use IOMAP_DIO_BOUNCE, but I'm
seeing weird data corruption.

In test case generic/708, I'm reliably hitting garbage pages in the
last 64KiB; the garbage even contains an ELF header.

In that test case, we mmap a 2MiB buffer from another file and use that
mmapped memory as the buffer for a direct IO write into a different
file.

The source file has dirty page cache for that 2MiB range, and no 
writeback happened during that direct IO write.
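
For reference, the access pattern described above can be sketched in
userspace roughly as follows. This is a minimal sketch and not the actual
generic/708 source: the function name dio_write_from_mmap(), the helper
names, the 0x5a fill pattern and the buffered fallback when O_DIRECT is
rejected with EINVAL are all assumptions made for illustration.

```c
#define _GNU_SOURCE		/* for O_DIRECT */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define BUF_LEN ((size_t)2 << 20)	/* 2MiB, as in the description */

/* Write all of buf at offset 0, retrying short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = pwrite(fd, buf + done, len - done, (off_t)done);

		if (n < 0 && errno == EINVAL) {
			/* Assumption: no O_DIRECT support, retry buffered. */
			int fl = fcntl(fd, F_GETFL);

			if (fl < 0 || !(fl & O_DIRECT) ||
			    fcntl(fd, F_SETFL, fl & ~O_DIRECT) < 0)
				return -1;
			continue;
		}
		if (n <= 0)
			return -1;
		done += (size_t)n;
	}
	return 0;
}

/* Read len bytes from the start of path. Returns 0 or -1. */
static int read_all(const char *path, char *buf, size_t len)
{
	int fd = open(path, O_RDONLY);
	size_t done = 0;

	if (fd < 0)
		return -1;
	while (done < len) {
		ssize_t n = read(fd, buf + done, len - done);

		if (n <= 0)
			break;
		done += (size_t)n;
	}
	close(fd);
	return done == len ? 0 : -1;
}

/*
 * Returns 0 if dst_path reads back identical to the mapping, 1 on a
 * mismatch, -1 on a setup or I/O error.
 */
int dio_write_from_mmap(const char *src_path, const char *dst_path)
{
	int ret = -1, src, dst;
	char *map = MAP_FAILED, *check = NULL;

	assert(src_path && dst_path);
	src = open(src_path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	dst = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
	if (dst < 0 && errno == EINVAL)	/* same fallback as in write_all() */
		dst = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	check = malloc(BUF_LEN);
	if (src < 0 || dst < 0 || !check || ftruncate(src, (off_t)BUF_LEN) < 0)
		goto out;
	map = mmap(NULL, BUF_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, src, 0);
	if (map == MAP_FAILED)
		goto out;
	memset(map, 0x5a, BUF_LEN);	/* leaves src with dirty page cache */
	if (write_all(dst, map, BUF_LEN) == 0 &&
	    read_all(dst_path, check, BUF_LEN) == 0)
		ret = memcmp(map, check, BUF_LEN) ? 1 : 0;
out:
	if (map != MAP_FAILED)
		munmap(map, BUF_LEN);
	free(check);
	if (src >= 0)
		close(src);
	if (dst >= 0)
		close(dst);
	return ret;
}
```

A return value of 1 would correspond to the kind of mismatch reported
here. The sketch only shows the shape of the I/O and is not expected to
reproduce the corruption by itself.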

So as long as we fault in all the pages of that 2MiB buffer, we
should be able to copy them into the newly allocated folio and submit a
bio using the bounced pages.

But the last 64KiB is reliably corrupted with some ELF header.

I'm wondering where the corruption comes from, especially since btrfs
seems to do very little here beyond calling fault_in_iov_iter_readable()
to fault in all the pages.

Thanks,
Qu

* Re: Direct IO page bouncing got some garbage?
From: Qu Wenruo @ 2026-06-12  5:24 UTC
  To: linux-btrfs, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, Linux Memory Management List



On 2026/6/12 11:11, Qu Wenruo wrote:
> Hi,
> 
> Recently I'm trying to make btrfs utilize IOMAP_DIO_BOUNCE, however I'm 
> experiencing weird data corruption.
> 
> During test case generic/708, I'm reliably hitting garbage pages at the 
> last 64KiB, the garbage even contains an ELF header.

Adding ftrace shows that, since btrfs has to disable page faults to avoid
a certain deadlock, bio_iov_iter_bounced() failed with a short copy.

Initially bio_iov_iter_bounced() got a 1MiB folio, but copy_from_iter()
only copied 64K and then failed due to the disabled page faults.

I don't think it's a coincidence that the short 64K exactly matches where
the garbage is (the last 64K).

I guess we didn't properly revert the iov_iter in the error path?
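
To make the suspected failure mode concrete, here is a minimal userspace
sketch of a short copy followed by a revert. It is a toy model and not
the kernel code: struct toy_iter, toy_copy_from_iter(), toy_revert() and
toy_bounce() are hypothetical stand-ins that only mimic how
copy_from_iter() advances the iterator by whatever it managed to copy and
how iov_iter_revert() undoes that progress. The "resident" field is an
assumption used to simulate pages that cannot be read with page faults
disabled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct iov_iter over a flat user buffer. */
struct toy_iter {
	const char *base;	/* start of the user buffer */
	size_t pos;		/* bytes consumed so far */
	size_t count;		/* bytes left to consume */
	size_t resident;	/* bytes from base readable without faulting */
};

/* Copies what it can and advances the iterator by exactly that much. */
static size_t toy_copy_from_iter(void *dst, size_t len, struct toy_iter *it)
{
	size_t avail = it->resident > it->pos ? it->resident - it->pos : 0;

	if (len > it->count)
		len = it->count;
	if (len > avail)
		len = avail;	/* faults disabled: stop at the first miss */
	memcpy(dst, it->base + it->pos, len);
	it->pos += len;
	it->count -= len;
	return len;
}

/* Undo the last 'bytes' of progress on the iterator. */
static void toy_revert(struct toy_iter *it, size_t bytes)
{
	assert(bytes <= it->pos);
	it->pos -= bytes;
	it->count += bytes;
}

/*
 * Bounce 'len' bytes into 'bounce'.  On a short copy, fail and leave the
 * iterator exactly where it was on entry.  Returns 0 or -1.
 */
int toy_bounce(void *bounce, size_t len, struct toy_iter *it)
{
	size_t copied = toy_copy_from_iter(bounce, len, it);

	if (copied != len) {
		/* The step suspected to be missing in the error path. */
		toy_revert(it, copied);
		return -1;
	}
	return 0;
}
```

Without the toy_revert() call, a caller that faults the pages in and
retries would resume past the bytes that were copied into the discarded
bounce buffer, so those bytes would never be written from the right
source.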



* Re: Direct IO page bouncing got some garbage?
From: Christoph Hellwig @ 2026-06-12  8:15 UTC
  To: Qu Wenruo
  Cc: linux-btrfs, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, Linux Memory Management List

On Fri, Jun 12, 2026 at 02:54:22PM +0930, Qu Wenruo wrote:
> Adding ftrace shows that, since btrfs has to disable page faults to avoid
> a certain deadlock, bio_iov_iter_bounced() failed with a short copy.
> 
> Initially bio_iov_iter_bounced() got a 1MiB folio, but copy_from_iter() only
> copied 64K and then failed due to the disabled page faults.
> 
> I don't think it's a coincidence that the short 64K exactly matches where the
> garbage is (the last 64K).
> 
> I guess we didn't properly revert the iov_iter in the error path?

Looks like it.  The better option would probably be not to give up on
a short copy, and just reduce the bio size to fit the short copy
even if that wastes a little memory.
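
A minimal userspace sketch of that idea, as a toy model rather than
kernel code: struct src_iter, struct bounce_bio, copy_resident() and
bounce_partial() are hypothetical names, and "resident" simulates how far
the copy gets with page faults disabled. Rounding the result down to a
block size is an added assumption (direct IO bios need block-aligned
lengths), not something spelled out above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct iov_iter over a flat user buffer. */
struct src_iter {
	const char *base;	/* start of the user buffer */
	size_t pos;		/* bytes consumed so far */
	size_t count;		/* bytes left to consume */
	size_t resident;	/* bytes from base readable without faulting */
};

/* Hypothetical stand-in for a bio backed by one bounce buffer. */
struct bounce_bio {
	char *data;		/* bounce memory */
	size_t capacity;	/* bytes allocated */
	size_t size;		/* bytes the bio will actually write */
};

/* Copy up to len bytes, stopping early at the first non-resident byte. */
static size_t copy_resident(void *dst, size_t len, struct src_iter *it)
{
	size_t avail = it->resident > it->pos ? it->resident - it->pos : 0;

	if (len > it->count)
		len = it->count;
	if (len > avail)
		len = avail;
	memcpy(dst, it->base + it->pos, len);
	it->pos += len;
	it->count -= len;
	return len;
}

/*
 * Fill the bio from the iterator.  A short copy shrinks the bio instead
 * of failing; only "no full block copied" is an error.  Returns the bio
 * size in bytes, or -1 as a stand-in for -EFAULT.
 */
long bounce_partial(struct bounce_bio *bio, struct src_iter *it, size_t blksz)
{
	size_t want, copied, tail;

	assert(blksz > 0);
	want = it->count < bio->capacity ? it->count : bio->capacity;
	copied = copy_resident(bio->data, want, it);

	/* Give back the bytes past the last full block. */
	tail = copied % blksz;
	it->pos -= tail;
	it->count += tail;
	copied -= tail;

	if (copied == 0)
		return -1;
	bio->size = copied;	/* the rest of the bounce memory is wasted */
	return (long)copied;
}
```

The iterator ends up advanced by exactly the bytes the bio covers, so the
caller can fault in the remaining pages and continue from there.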


* Re: Direct IO page bouncing got some garbage?
From: Qu Wenruo @ 2026-06-12  8:27 UTC
  To: Christoph Hellwig, Qu Wenruo
  Cc: linux-btrfs, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, Linux Memory Management List



On 2026/6/12 17:45, Christoph Hellwig wrote:
> On Fri, Jun 12, 2026 at 02:54:22PM +0930, Qu Wenruo wrote:
>> Adding ftrace shows that, since btrfs has to disable page faults to avoid
>> a certain deadlock, bio_iov_iter_bounced() failed with a short copy.
>>
>> Initially bio_iov_iter_bounced() got a 1MiB folio, but copy_from_iter() only
>> copied 64K and then failed due to the disabled page faults.
>>
>> I don't think it's a coincidence that the short 64K exactly matches where the
>> garbage is (the last 64K).
>>
>> I guess we didn't properly revert the iov_iter in the error path?
> 
> Looks like it.  The better option would probably be not to give up on
> a short copy, and just reduce the bio size to fit the short copy
> even if that wastes a little memory.

And that will only work if IOMAP_DIO_PARTIAL is specified.

For now I'll just fix the short copy path, and continue to make btrfs
work with IOMAP_DIO_BOUNCE first (testing is under way and one round has
already finished).

Returning a partial bio will require either some way to pass the dio
flag down, or a refactor that moves all the error handling into the only
caller. Either way, that change will be its own dedicated series;
meanwhile the simple revert fix will be sent out soon along with the
btrfs enablement.

Thanks,
Qu
