* [PATCH] ntfs: use page allocation for resident attribute inline data
@ 2026-04-22 10:46 Namjae Jeon
2026-04-22 12:55 ` Matthew Wilcox
0 siblings, 1 reply; 7+ messages in thread
From: Namjae Jeon @ 2026-04-22 10:46 UTC (permalink / raw)
To: hyc.lee; +Cc: linux-fsdevel, Namjae Jeon
The current kmemdup() based allocation for IOMAP_INLINE can result in
inline_data pointer having a non-zero page offset. This causes
iomap_inline_data_valid() to fail the check:
iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
This particularly affects workloads with frequent small file access
(e.g. Firefox Nightly profile on NTFS with bind mount) when using the
new ntfs. Fix this by allocating a full page with alloc_page() so that
page_address() always returns a page-aligned address.
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
---
fs/ntfs/iomap.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/fs/ntfs/iomap.c b/fs/ntfs/iomap.c
index 3d1458dea90f..74a4d3e971f4 100644
--- a/fs/ntfs/iomap.c
+++ b/fs/ntfs/iomap.c
@@ -89,6 +89,7 @@ static int ntfs_read_iomap_begin_resident(struct inode *inode, loff_t offset, lo
u32 attr_len;
int err = 0;
char *kattr;
+ struct page *ipage;
if (NInoAttr(ni))
base_ni = ni->ext.base_ntfs_ino;
@@ -129,15 +130,18 @@ static int ntfs_read_iomap_begin_resident(struct inode *inode, loff_t offset, lo
kattr = (u8 *)ctx->attr + le16_to_cpu(ctx->attr->data.resident.value_offset);
- iomap->inline_data = kmemdup(kattr, attr_len, GFP_KERNEL);
- if (!iomap->inline_data) {
+ ipage = alloc_page(GFP_NOFS | __GFP_ZERO);
+ if (!ipage) {
err = -ENOMEM;
goto out;
}
+ memcpy(page_address(ipage), kattr, attr_len);
iomap->type = IOMAP_INLINE;
+ iomap->inline_data = page_address(ipage);
iomap->offset = 0;
iomap->length = attr_len;
+ iomap->private = ipage;
out:
if (ctx)
@@ -285,8 +289,11 @@ static int ntfs_read_iomap_begin(struct inode *inode, loff_t offset, loff_t leng
static int ntfs_read_iomap_end(struct inode *inode, loff_t pos, loff_t length,
ssize_t written, unsigned int flags, struct iomap *iomap)
{
- if (iomap->type == IOMAP_INLINE)
- kfree(iomap->inline_data);
+ if (iomap->type == IOMAP_INLINE) {
+ struct page *ipage = iomap->private;
+
+ put_page(ipage);
+ }
return written;
}
@@ -652,6 +659,7 @@ static int ntfs_write_iomap_begin_resident(struct inode *inode, loff_t offset,
u32 attr_len;
int err = 0;
char *kattr;
+ struct page *ipage;
ctx = ntfs_attr_get_search_ctx(ni, NULL);
if (!ctx) {
@@ -672,16 +680,19 @@ static int ntfs_write_iomap_begin_resident(struct inode *inode, loff_t offset,
attr_len = le32_to_cpu(a->data.resident.value_length);
kattr = (u8 *)a + le16_to_cpu(a->data.resident.value_offset);
- iomap->inline_data = kmemdup(kattr, attr_len, GFP_KERNEL);
- if (!iomap->inline_data) {
+ ipage = alloc_page(GFP_NOFS | __GFP_ZERO);
+ if (!ipage) {
err = -ENOMEM;
goto out;
}
+ memcpy(page_address(ipage), kattr, attr_len);
iomap->type = IOMAP_INLINE;
+ iomap->inline_data = page_address(ipage);
iomap->offset = 0;
/* iomap requires there is only one INLINE_DATA extent */
iomap->length = attr_len;
+ iomap->private = ipage;
out:
if (ctx)
@@ -771,6 +782,7 @@ static int ntfs_write_iomap_end_resident(struct inode *inode, loff_t pos,
u32 attr_len;
int err;
char *kattr;
+ struct page *ipage = iomap->private;
mutex_lock(&ni->mrec_lock);
ctx = ntfs_attr_get_search_ctx(ni, NULL);
@@ -799,7 +811,7 @@ static int ntfs_write_iomap_end_resident(struct inode *inode, loff_t pos,
mark_mft_record_dirty(ctx->ntfs_ino);
err_out:
ntfs_attr_put_search_ctx(ctx);
- kfree(iomap->inline_data);
+ put_page(ipage);
mutex_unlock(&ni->mrec_lock);
return written;
--
2.25.1
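
The inequality quoted in the commit message can be illustrated with a small userspace sketch. This is not kernel code: SKETCH_PAGE_SIZE, sketch_offset_in_page() and sketch_inline_data_valid() are hypothetical stand-ins that assume a 4096-byte page, and the addresses are made up. It only shows why a pointer with a non-zero page offset can fail the check even for a small attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/* Userspace stand-in for offset_in_page(): low bits of the address. */
static size_t sketch_offset_in_page(uintptr_t addr)
{
	return (size_t)(addr & (SKETCH_PAGE_SIZE - 1));
}

/* Same inequality as the one quoted in the commit message. */
static bool sketch_inline_data_valid(uintptr_t inline_data, size_t length)
{
	return length <= SKETCH_PAGE_SIZE - sketch_offset_in_page(inline_data);
}

/* Returns 1 when the three cases below behave as described. */
static int sketch_demo_check(void)
{
	uintptr_t aligned = 0x10000;		/* page offset 0 */
	uintptr_t unaligned = 0x10000 + 3800;	/* page offset 3800 */

	/* A page-aligned buffer passes for any length up to one page. */
	if (!sketch_inline_data_valid(aligned, 700))
		return 0;
	/* 3800 + 700 > 4096, so the same 700 bytes fail the check. */
	if (sketch_inline_data_valid(unaligned, 700))
		return 0;
	/* Exactly 296 bytes still fit before the page boundary. */
	return sketch_inline_data_valid(unaligned, 296);
}
```

Under these assumptions, a slab allocation that happens to start late in a page fails the check, while a buffer obtained from a whole page always starts at offset 0.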
* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
2026-04-22 10:46 [PATCH] ntfs: use page allocation for resident attribute inline data Namjae Jeon
@ 2026-04-22 12:55 ` Matthew Wilcox
2026-04-22 14:35 ` Namjae Jeon
2026-04-23 5:49 ` Christoph Hellwig
0 siblings, 2 replies; 7+ messages in thread
From: Matthew Wilcox @ 2026-04-22 12:55 UTC (permalink / raw)
To: Namjae Jeon
Cc: hyc.lee, linux-fsdevel, Christian Brauner, Darrick J. Wong,
linux-xfs, Gao Xiang
On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
> The current kmemdup() based allocation for IOMAP_INLINE can result in
> inline_data pointer having a non-zero page offset. This causes
> iomap_inline_data_valid() to fail the check:
>
> iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
>
> and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
Hang on, hang on, hang on.
First, maybe this check is too strict. I'm sure it's true for EROFS,
but I don't see why it should be true for everybody. Perhaps we should
delete this check or relax it?
Second, why are you calling kmemdup() to begin with? This seems
entirely pointless; the iomap code is going to call memcpy() on it.
You're supposed to just be pointing into your data structures.
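
A rough userspace sketch of the "point into your data structures" idea follows. Everything in it is hypothetical: struct sketch_record and struct sketch_map are simplified stand-ins, not the NTFS or iomap types, and the sketch assumes the record stays valid and unchanged until the consumer has finished copying, which a real filesystem would have to guarantee itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a record holding a resident value. */
struct sketch_record {
	uint16_t value_offset;	/* where the value starts inside bytes[] */
	uint32_t value_length;	/* number of valid value bytes */
	unsigned char bytes[256];
};

/* Hypothetical stand-in for the two mapping fields discussed here. */
struct sketch_map {
	const void *inline_data;
	size_t length;
};

/* Point straight into the record; no intermediate duplicate is made. */
static void sketch_map_resident(const struct sketch_record *rec,
				struct sketch_map *map)
{
	map->inline_data = rec->bytes + rec->value_offset;
	map->length = rec->value_length;
}

/* The consumer performs the one and only copy. */
static size_t sketch_read_inline(const struct sketch_map *map,
				 void *dst, size_t dst_len)
{
	size_t n = map->length < dst_len ? map->length : dst_len;

	memcpy(dst, map->inline_data, n);
	return n;
}

/* Returns 1 when the value is read back through the direct pointer. */
static int sketch_demo_direct(void)
{
	struct sketch_record rec = { .value_offset = 24, .value_length = 5 };
	struct sketch_map map;
	char out[8] = { 0 };

	memcpy(rec.bytes + rec.value_offset, "hello", 5);
	sketch_map_resident(&rec, &map);
	assert(map.inline_data == rec.bytes + 24);
	return sketch_read_inline(&map, out, sizeof(out)) == 5 &&
	       memcmp(out, "hello", 5) == 0;
}
```

In this model a duplicate made before the mapping is filled in would add a second copy of the same bytes without changing what the consumer reads.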
* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
2026-04-22 12:55 ` Matthew Wilcox
@ 2026-04-22 14:35 ` Namjae Jeon
2026-04-22 15:28 ` Darrick J. Wong
2026-04-23 5:49 ` Christoph Hellwig
1 sibling, 1 reply; 7+ messages in thread
From: Namjae Jeon @ 2026-04-22 14:35 UTC (permalink / raw)
To: Matthew Wilcox
Cc: hyc.lee, linux-fsdevel, Christian Brauner, Darrick J. Wong,
linux-xfs, Gao Xiang
On Wed, Apr 22, 2026 at 9:55 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
> > The current kmemdup() based allocation for IOMAP_INLINE can result in
> > inline_data pointer having a non-zero page offset. This causes
> > iomap_inline_data_valid() to fail the check:
> >
> > iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
> >
> > and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
>
> Hang on, hang on, hang on.
>
> First, maybe this check is too strict. I'm sure it's true for EROFS,
> but I don't see why it should be true for everybody. Perhaps we should
> delete this check or relax it?
I agree that the current check might be unnecessarily strict for
general cases. So I will prepare another patch to remove this trap for
further discussion with iomap maintainers.
>
> Second, why are you calling kmemdup() to begin with? This seems
> entirely pointless; the iomap code is going to call memcpy() on it.
> You're supposed to just be pointing into your data structures.
In the initial implementation of NTFS with iomap, I pointed directly
to the internal data structures. However, I encountered this BUG_ON
trap during testing, so I switched to page allocation to avoid it.
Then, during the review process for the NTFS series, I changed it to
kmemdup() without much thought. If this BUG_ON trap is removed, I
can simply point to the internal data structures, as you said.
* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
2026-04-22 14:35 ` Namjae Jeon
@ 2026-04-22 15:28 ` Darrick J. Wong
2026-04-22 15:36 ` Gao Xiang
0 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2026-04-22 15:28 UTC (permalink / raw)
To: Namjae Jeon
Cc: Matthew Wilcox, hyc.lee, linux-fsdevel, Christian Brauner,
linux-xfs, Gao Xiang
On Wed, Apr 22, 2026 at 11:35:32PM +0900, Namjae Jeon wrote:
> On Wed, Apr 22, 2026 at 9:55 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
> > > The current kmemdup() based allocation for IOMAP_INLINE can result in
> > > inline_data pointer having a non-zero page offset. This causes
> > > iomap_inline_data_valid() to fail the check:
> > >
> > > iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
> > >
> > > and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
> >
> > Hang on, hang on, hang on.
> >
> > First, maybe this check is too strict. I'm sure it's true for EROFS,
> > but I don't see why it should be true for everybody. Perhaps we should
> > delete this check or relax it?
> I agree that the current check might be unnecessarily strict for
> general cases. So I will prepare another patch to remove this trap for
> further discussion with iomap maintainers.
> >
> > Second, why are you calling kmemdup() to begin with? This seems
> > entirely pointless; the iomap code is going to call memcpy() on it.
> > You're supposed to just be pointing into your data structures.
> In the initial implementation of NTFS with iomap, I pointed directly
> to the internal data structures. However, I encountered this BUG_ON
> trap during testing, so I switched to page allocation to avoid it.
> Then, during the review process for the NTFS series, I changed it to
> kmemdup() without much thought. If this BUG_ON trap is removed, I
> can simply point to the internal data structures, as you said.
I think the check is wrong. We rely on the filesystem to point
iomap::inline_data to kernel memory that is at least iomap::length bytes
in size. If that crosses a PAGE_SIZE boundary that's fine, so long as
the caller actually mapped that much memory. IOWs, if you have an
iomap:
{pos = 0, inline_data = 0xB0000, length = 32768, ...}
then we trust that you really did map all of the MDA text mode memory
and that memcpy'ing 100 bytes to pos 4090 is ok.
(Perhaps this is a relic of the bs<=ps days?)
--D
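
The argument above can be modelled in userspace as a bounds check that depends only on the mapping length, not on page boundaries. This is a sketch under stated assumptions: struct sketch_inline_map, sketch_range_ok() and sketch_copy_in() are hypothetical names, the 32768-byte array stands in for the mapped region, and 4096 is an assumed page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an inline mapping: a linear region and its size. */
struct sketch_inline_map {
	unsigned char *inline_data;	/* start of the inline region */
	size_t length;			/* bytes valid from inline_data */
};

/* Bounds check based only on the mapping length, written to avoid overflow. */
static bool sketch_range_ok(const struct sketch_inline_map *m,
			    size_t pos, size_t len)
{
	return pos <= m->length && len <= m->length - pos;
}

/* Copy into the region if, and only if, the range lies inside it. */
static bool sketch_copy_in(struct sketch_inline_map *m, size_t pos,
			   const void *src, size_t len)
{
	if (!sketch_range_ok(m, pos, len))
		return false;
	memcpy(m->inline_data + pos, src, len);
	return true;
}

/* Returns 1 when both cases below behave as described. */
static int sketch_demo_linear(void)
{
	static unsigned char region[32768];
	unsigned char src[100];
	struct sketch_inline_map m = { region, sizeof(region) };

	memset(src, 0xAB, sizeof(src));
	/* 100 bytes at pos 4090 straddle the first 4096-byte boundary. */
	if (!sketch_copy_in(&m, 4090, src, sizeof(src)))
		return 0;
	/* A copy that would run past the mapped length is refused. */
	if (sketch_copy_in(&m, 32700, src, sizeof(src)))
		return 0;
	return region[4090] == 0xAB && region[4189] == 0xAB &&
	       region[4190] == 0;
}
```

In this model the copy at pos 4090 succeeds even though it crosses a 4096-byte boundary, because the whole range lies inside the region the caller claims to have mapped.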
* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
2026-04-22 15:28 ` Darrick J. Wong
@ 2026-04-22 15:36 ` Gao Xiang
2026-04-23 5:20 ` Namjae Jeon
0 siblings, 1 reply; 7+ messages in thread
From: Gao Xiang @ 2026-04-22 15:36 UTC (permalink / raw)
To: Darrick J. Wong, Namjae Jeon
Cc: Matthew Wilcox, hyc.lee, linux-fsdevel, Christian Brauner,
linux-xfs, Gao Xiang
Hi,
On 2026/4/22 23:28, Darrick J. Wong wrote:
> On Wed, Apr 22, 2026 at 11:35:32PM +0900, Namjae Jeon wrote:
>> On Wed, Apr 22, 2026 at 9:55 PM Matthew Wilcox <willy@infradead.org> wrote:
>>>
>>> On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
>>>> The current kmemdup() based allocation for IOMAP_INLINE can result in
>>>> inline_data pointer having a non-zero page offset. This causes
>>>> iomap_inline_data_valid() to fail the check:
>>>>
>>>> iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
>>>>
>>>> and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
>>>
>>> Hang on, hang on, hang on.
>>>
>>> First, maybe this check is too strict. I'm sure it's true for EROFS,
>>> but I don't see why it should be true for everybody. Perhaps we should
>>> delete this check or relax it?
>> I agree that the current check might be unnecessarily strict for
>> general cases. So I will prepare another patch to remove this trap for
>> further discussion with iomap maintainers.
>>>
>>> Second, why are you calling kmemdup() to begin with? This seems
>>> entirely pointless; the iomap code is going to call memcpy() on it.
>>> You're supposed to just be pointing into your data structures.
>> In the initial implementation of NTFS with iomap, I pointed directly
>> to the internal data structures. However, I encountered this BUG_ON
>> trap during testing, so I switched to page allocation to avoid it.
>> Then, during the review process for the NTFS series, I changed it to
>> kmemdup() without much thought. If this BUG_ON trap is removed, I
>> can simply point to the internal data structures, as you said.
>
> I think the check is wrong. We rely on the filesystem to point
> iomap::inline_data to kernel memory that is at least iomap::length bytes
> in size. If that crosses a PAGE_SIZE boundary that's fine, so long as
> the caller actually mapped that much memory. IOWs, if you have an
> iomap:
>
> {pos = 0, inline_data = 0xB0000, length = 32768, ...}
>
> then we trust that you really did map all of the MDA text mode memory
> and that memcpy'ing 100 bytes to pos 4090 is ok.
>
> (Perhaps this is a relic of the bs<=ps days?)
Anyway, as said before, I think that particular assertion
can be removed too.
Thanks,
Gao Xiang
>
> --D
* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
2026-04-22 15:36 ` Gao Xiang
@ 2026-04-23 5:20 ` Namjae Jeon
0 siblings, 0 replies; 7+ messages in thread
From: Namjae Jeon @ 2026-04-23 5:20 UTC (permalink / raw)
To: Gao Xiang
Cc: Darrick J. Wong, Matthew Wilcox, hyc.lee, linux-fsdevel,
Christian Brauner, linux-xfs, Gao Xiang
> >
> > I think the check is wrong. We rely on the filesystem to point
> > iomap::inline_data to kernel memory that is at least iomap::length bytes
> > in size. If that crosses a PAGE_SIZE boundary that's fine, so long as
> > the caller actually mapped that much memory. IOWs, if you have an
> > iomap:
> >
> > {pos = 0, inline_data = 0xB0000, length = 32768, ...}
> >
> > then we trust that you really did map all of the MDA text mode memory
> > and that memcpy'ing 100 bytes to pos 4090 is ok.
> >
> > (Perhaps this is a relic of the bs<=ps days?)
>
> Anyway, as said before, I think that particular assertion
> can be removed too.
Okay, I will submit the patch for this.
Thanks!
* Re: [PATCH] ntfs: use page allocation for resident attribute inline data
2026-04-22 12:55 ` Matthew Wilcox
2026-04-22 14:35 ` Namjae Jeon
@ 2026-04-23 5:49 ` Christoph Hellwig
1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2026-04-23 5:49 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Namjae Jeon, hyc.lee, linux-fsdevel, Christian Brauner,
Darrick J. Wong, linux-xfs, Gao Xiang
On Wed, Apr 22, 2026 at 01:55:40PM +0100, Matthew Wilcox wrote:
> On Wed, Apr 22, 2026 at 07:46:27PM +0900, Namjae Jeon wrote:
> > The current kmemdup() based allocation for IOMAP_INLINE can result in
> > inline_data pointer having a non-zero page offset. This causes
> > iomap_inline_data_valid() to fail the check:
> >
> > iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)
> >
> > and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
>
> Hang on, hang on, hang on.
>
> First, maybe this check is too strict. I'm sure it's true for EROFS,
> but I don't see why it should be true for everybody. Perhaps we should
> delete this check or relax it?
I think the current check should just go. iomap->inline_data is
treated as a normal linear address everywhere, so any offset-in-page
check is weird.
> Second, why are you calling kmemdup() to begin with? This seems
> entirely pointless; the iomap code is going to call memcpy() on it.
> You're supposed to just be pointing into your data structures.
Yes.