public inbox for linux-fsdevel@vger.kernel.org
* [PATCH] ntfs: use page allocation for resident attribute inline data
@ 2026-04-22 10:46 Namjae Jeon
  2026-04-22 12:55 ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Namjae Jeon @ 2026-04-22 10:46 UTC (permalink / raw)
  To: hyc.lee; +Cc: linux-fsdevel, Namjae Jeon

The current kmemdup() based allocation for IOMAP_INLINE can result in
inline_data pointer having a non-zero page offset. This causes
iomap_inline_data_valid() to fail the check:

    iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)

and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.

This particularly affects workloads with frequent small file access
(e.g. a Firefox Nightly profile on NTFS with a bind mount) when using the
new ntfs driver. Fix this by allocating a full page with alloc_page() so
that page_address() always returns a page-aligned address.

Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
---
 fs/ntfs/iomap.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/fs/ntfs/iomap.c b/fs/ntfs/iomap.c
index 3d1458dea90f..74a4d3e971f4 100644
--- a/fs/ntfs/iomap.c
+++ b/fs/ntfs/iomap.c
@@ -89,6 +89,7 @@ static int ntfs_read_iomap_begin_resident(struct inode *inode, loff_t offset, lo
 	u32 attr_len;
 	int err = 0;
 	char *kattr;
+	struct page *ipage;
 
 	if (NInoAttr(ni))
 		base_ni = ni->ext.base_ntfs_ino;
@@ -129,15 +130,18 @@ static int ntfs_read_iomap_begin_resident(struct inode *inode, loff_t offset, lo
 
 	kattr = (u8 *)ctx->attr + le16_to_cpu(ctx->attr->data.resident.value_offset);
 
-	iomap->inline_data = kmemdup(kattr, attr_len, GFP_KERNEL);
-	if (!iomap->inline_data) {
+	ipage = alloc_page(GFP_NOFS | __GFP_ZERO);
+	if (!ipage) {
 		err = -ENOMEM;
 		goto out;
 	}
 
+	memcpy(page_address(ipage), kattr, attr_len);
 	iomap->type = IOMAP_INLINE;
+	iomap->inline_data = page_address(ipage);
 	iomap->offset = 0;
 	iomap->length = attr_len;
+	iomap->private = ipage;
 
 out:
 	if (ctx)
@@ -285,8 +289,11 @@ static int ntfs_read_iomap_begin(struct inode *inode, loff_t offset, loff_t leng
 static int ntfs_read_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 		ssize_t written, unsigned int flags, struct iomap *iomap)
 {
-	if (iomap->type == IOMAP_INLINE)
-		kfree(iomap->inline_data);
+	if (iomap->type == IOMAP_INLINE) {
+		struct page *ipage = iomap->private;
+
+		put_page(ipage);
+	}
 
 	return written;
 }
@@ -652,6 +659,7 @@ static int ntfs_write_iomap_begin_resident(struct inode *inode, loff_t offset,
 	u32 attr_len;
 	int err = 0;
 	char *kattr;
+	struct page *ipage;
 
 	ctx = ntfs_attr_get_search_ctx(ni, NULL);
 	if (!ctx) {
@@ -672,16 +680,19 @@ static int ntfs_write_iomap_begin_resident(struct inode *inode, loff_t offset,
 	attr_len = le32_to_cpu(a->data.resident.value_length);
 	kattr = (u8 *)a + le16_to_cpu(a->data.resident.value_offset);
 
-	iomap->inline_data = kmemdup(kattr, attr_len, GFP_KERNEL);
-	if (!iomap->inline_data) {
+	ipage = alloc_page(GFP_NOFS | __GFP_ZERO);
+	if (!ipage) {
 		err = -ENOMEM;
 		goto out;
 	}
 
+	memcpy(page_address(ipage), kattr, attr_len);
 	iomap->type = IOMAP_INLINE;
+	iomap->inline_data = page_address(ipage);
 	iomap->offset = 0;
 	/* iomap requires there is only one INLINE_DATA extent */
 	iomap->length = attr_len;
+	iomap->private = ipage;
 
 out:
 	if (ctx)
@@ -771,6 +782,7 @@ static int ntfs_write_iomap_end_resident(struct inode *inode, loff_t pos,
 	u32 attr_len;
 	int err;
 	char *kattr;
+	struct page *ipage = iomap->private;
 
 	mutex_lock(&ni->mrec_lock);
 	ctx = ntfs_attr_get_search_ctx(ni, NULL);
@@ -799,7 +811,7 @@ static int ntfs_write_iomap_end_resident(struct inode *inode, loff_t pos,
 	mark_mft_record_dirty(ctx->ntfs_ino);
 err_out:
 	ntfs_attr_put_search_ctx(ctx);
-	kfree(iomap->inline_data);
+	put_page(ipage);
 	mutex_unlock(&ni->mrec_lock);
 	return written;
 
-- 
2.25.1



* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
  2026-04-22 10:46 [PATCH] ntfs: use page allocation for resident attribute inline data Namjae Jeon
@ 2026-04-22 12:55 ` Matthew Wilcox
  2026-04-22 14:35   ` Namjae Jeon
  2026-04-23  5:49   ` Christoph Hellwig
  0 siblings, 2 replies; 7+ messages in thread
From: Matthew Wilcox @ 2026-04-22 12:55 UTC (permalink / raw)
  To: Namjae Jeon
  Cc: hyc.lee, linux-fsdevel, Christian Brauner, Darrick J. Wong,
	linux-xfs, Gao Xiang

On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
> The current kmemdup() based allocation for IOMAP_INLINE can result in
> inline_data pointer having a non-zero page offset. This causes
> iomap_inline_data_valid() to fail the check:
> 
>     iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
> 
> and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.

Hang on, hang on, hang on.

First, maybe this check is too strict.  I'm sure it's true for EROFS,
but I don't see why it should be true for everybody.  Perhaps we should
delete this check or relax it?

Second, why are you calling kmemdup() to begin with?  This seems
entirely pointless; the iomap code is going to call memcpy() on it.
You're supposed to just be pointing into your data structures.


* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
  2026-04-22 12:55 ` Matthew Wilcox
@ 2026-04-22 14:35   ` Namjae Jeon
  2026-04-22 15:28     ` Darrick J. Wong
  2026-04-23  5:49   ` Christoph Hellwig
  1 sibling, 1 reply; 7+ messages in thread
From: Namjae Jeon @ 2026-04-22 14:35 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: hyc.lee, linux-fsdevel, Christian Brauner, Darrick J. Wong,
	linux-xfs, Gao Xiang

On Wed, Apr 22, 2026 at 9:55 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
> > The current kmemdup() based allocation for IOMAP_INLINE can result in
> > inline_data pointer having a non-zero page offset. This causes
> > iomap_inline_data_valid() to fail the check:
> >
> >     iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
> >
> > and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
>
> Hang on, hang on, hang on.
>
> First, maybe this check is too strict.  I'm sure it's true for EROFS,
> but I don't see why it should be true for everybody.  Perhaps we should
> delete this check or relax it?
I agree that the current check might be unnecessarily strict for
general cases. So I will prepare another patch to remove this trap for
further discussion with iomap maintainers.
>
> Second, why are you calling kmemdup() to begin with?  This seems
> entirely pointless; the iomap code is going to call memcpy() on it.
> You're supposed to just be pointing into your data structures.
In the initial implementation of NTFS with iomap, I pointed directly
to the internal data structures. However, I encountered this BUG_ON
trap during testing, so I switched to page allocation to avoid it.
Then, during the review process for the NTFS series, I changed it to
kmemdup() without much thought. If this BUG_ON trap can be removed, I
can simply point to the internal data structures, as you said.


* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
  2026-04-22 14:35   ` Namjae Jeon
@ 2026-04-22 15:28     ` Darrick J. Wong
  2026-04-22 15:36       ` Gao Xiang
  0 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2026-04-22 15:28 UTC (permalink / raw)
  To: Namjae Jeon
  Cc: Matthew Wilcox, hyc.lee, linux-fsdevel, Christian Brauner,
	linux-xfs, Gao Xiang

On Wed, Apr 22, 2026 at 11:35:32PM +0900, Namjae Jeon wrote:
> On Wed, Apr 22, 2026 at 9:55 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
> > > The current kmemdup() based allocation for IOMAP_INLINE can result in
> > > inline_data pointer having a non-zero page offset. This causes
> > > iomap_inline_data_valid() to fail the check:
> > >
> > >     iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
> > >
> > > and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
> >
> > Hang on, hang on, hang on.
> >
> > First, maybe this check is too strict.  I'm sure it's true for EROFS,
> > but I don't see why it should be true for everybody.  Perhaps we should
> > delete this check or relax it?
> I agree that the current check might be unnecessarily strict for
> general cases. So I will prepare another patch to remove this trap for
> further discussion with iomap maintainers.
> >
> > Second, why are you calling kmemdup() to begin with?  This seems
> > entirely pointless; the iomap code is going to call memcpy() on it.
> > You're supposed to just be pointing into your data structures.
> In the initial implementation of NTFS with iomap, I pointed directly
> to the internal data structures. However, I encountered this BUG_ON
> trap during testing, so I switched to page allocation to avoid it.
> Then, during the review process for the NTFS series, I changed it to
> kmemdup() without much thought. If this BUG_ON trap can be removed, I
> could have simply pointed to the internal data structures as you said.

I think the check is wrong.  We rely on the filesystem to point
iomap::inline_data to kernel memory that is at least iomap::length bytes
in size.  If that crosses a PAGE_SIZE boundary that's fine, so long as
the caller actually mapped that much memory.  IOWs, if you have an
iomap:

{pos = 0, inline_data = 0xB0000, length = 32768, ...}

then we trust that you really did map all of the MDA text mode memory
and that memcpy'ing 100 bytes to pos 4090 is ok.

(Perhaps this is a relic of the bs<=ps days?)

--D


* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
  2026-04-22 15:28     ` Darrick J. Wong
@ 2026-04-22 15:36       ` Gao Xiang
  2026-04-23  5:20         ` Namjae Jeon
  0 siblings, 1 reply; 7+ messages in thread
From: Gao Xiang @ 2026-04-22 15:36 UTC (permalink / raw)
  To: Darrick J. Wong, Namjae Jeon
  Cc: Matthew Wilcox, hyc.lee, linux-fsdevel, Christian Brauner,
	linux-xfs, Gao Xiang

Hi,

On 2026/4/22 23:28, Darrick J. Wong wrote:
> On Wed, Apr 22, 2026 at 11:35:32PM +0900, Namjae Jeon wrote:
>> On Wed, Apr 22, 2026 at 9:55 PM Matthew Wilcox <willy@infradead.org> wrote:
>>>
>>> On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
>>>> The current kmemdup() based allocation for IOMAP_INLINE can result in
>>>> inline_data pointer having a non-zero page offset. This causes
>>>> iomap_inline_data_valid() to fail the check:
>>>>
>>>>      iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
>>>>
>>>> and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
>>>
>>> Hang on, hang on, hang on.
>>>
>>> First, maybe this check is too strict.  I'm sure it's true for EROFS,
>>> but I don't see why it should be true for everybody.  Perhaps we should
>>> delete this check or relax it?
>> I agree that the current check might be unnecessarily strict for
>> general cases. So I will prepare another patch to remove this trap for
>> further discussion with iomap maintainers.
>>>
>>> Second, why are you calling kmemdup() to begin with?  This seems
>>> entirely pointless; the iomap code is going to call memcpy() on it.
>>> You're supposed to just be pointing into your data structures.
>> In the initial implementation of NTFS with iomap, I pointed directly
>> to the internal data structures. However, I encountered this BUG_ON
>> trap during testing, so I switched to page allocation to avoid it.
>> Then, during the review process for the NTFS series, I changed it to
>> kmemdup() without much thought. If this BUG_ON trap can be removed, I
>> could have simply pointed to the internal data structures as you said.
> 
> I think the check is wrong.  We rely on the filesystem to point
> iomap::inline_data to kernel memory that is at least iomap::length bytes
> in size.  If that crosses a PAGE_SIZE boundary that's fine, so long as
> the caller actually mapped that much memory.  IOWs, if you have an
> iomap:
> 
> {pos = 0, inline_data = 0xB0000, length = 32768, ...}
> 
> then we trust that you really did map all of the MDA text mode memory
> and that memcpy'ing 100 bytes to pos 4090 is ok.
> 
> (Perhaps this is a relic of the bs<=ps days?)

Anyway, as said before, I think that particular assertion
can be removed too.

Thanks,
Gao Xiang

> 
> --D



* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
  2026-04-22 15:36       ` Gao Xiang
@ 2026-04-23  5:20         ` Namjae Jeon
  0 siblings, 0 replies; 7+ messages in thread
From: Namjae Jeon @ 2026-04-23  5:20 UTC (permalink / raw)
  To: Gao Xiang
  Cc: Darrick J. Wong, Matthew Wilcox, hyc.lee, linux-fsdevel,
	Christian Brauner, linux-xfs, Gao Xiang

> >
> > I think the check is wrong.  We rely on the filesystem to point
> > iomap::inline_data to kernel memory that is at least iomap::length bytes
> > in size.  If that crosses a PAGE_SIZE boundary that's fine, so long as
> > the caller actually mapped that much memory.  IOWs, if you have an
> > iomap:
> >
> > {pos = 0, inline_data = 0xB0000, length = 32768, ...}
> >
> > then we trust that you really did map all of the MDA text mode memory
> > and that memcpy'ing 100 bytes to pos 4090 is ok.
> >
> > (Perhaps this is a relic of the bs<=ps days?)
>
> Anyway, as said before, I think that particular assertion
> can be removed too.
Okay, I will submit the patch for this.
Thanks!


* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
  2026-04-22 12:55 ` Matthew Wilcox
  2026-04-22 14:35   ` Namjae Jeon
@ 2026-04-23  5:49   ` Christoph Hellwig
  1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2026-04-23  5:49 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Namjae Jeon, hyc.lee, linux-fsdevel, Christian Brauner,
	Darrick J. Wong, linux-xfs, Gao Xiang

On Wed, Apr 22, 2026 at 01:55:40PM +0100, Matthew Wilcox wrote:
> On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
> > The current kmemdup() based allocation for IOMAP_INLINE can result in
> > inline_data pointer having a non-zero page offset. This causes
> > iomap_inline_data_valid() to fail the check:
> > 
> >     iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
> > 
> > and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
> 
> Hang on, hang on, hang on.
> 
> First, maybe this check is too strict.  I'm sure it's true for EROFS,
> but I don't see why it should be true for everybody.  Perhaps we should
> delete this check or relax it?

I think the current check should just go.  ->iomap_inline_data is
treated as a normal linear address everywhere, so any offset in page
check is weird.

> Second, why are you calling kmemdup() to begin with?  This seems
> entirely pointless; the iomap code is going to call memcpy() on it.
> You're supposed to just be pointing into your data structures.

Yes.


