Linux filesystem development
 help / color / mirror / Atom feed
From: Hyunchul Lee <hyc.lee@gmail.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Cc: "glaubitz@physik.fu-berlin.de" <glaubitz@physik.fu-berlin.de>,
	"frank.li@vivo.com" <frank.li@vivo.com>,
	"slava@dubeyko.com" <slava@dubeyko.com>,
	"hch@infradead.org" <hch@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"cheol.lee@lge.com" <cheol.lee@lge.com>
Subject: Re: [PATCH] hfsplus: limit sb_maxbytes to partition size
Date: Mon, 9 Mar 2026 09:52:20 +0900	[thread overview]
Message-ID: <aa4ZxJ1Lkk_9-f-C@hyunchul-PC02> (raw)
In-Reply-To: <cecaebb4c333439ea6e10808908f69cd3f3dbf95.camel@ibm.com>

On Fri, Mar 06, 2026 at 08:08:40PM +0000, Viacheslav Dubeyko wrote:
> On Fri, 2026-03-06 at 11:05 +0900, Hyunchul Lee wrote:
> > On Fri, Mar 06, 2026 at 01:23:16AM +0000, Viacheslav Dubeyko wrote:
> > > On Fri, 2026-03-06 at 09:57 +0900, Hyunchul Lee wrote:
> > > > On Thu, Mar 05, 2026 at 11:21:19PM +0000, Viacheslav Dubeyko wrote:
> > > > > On Thu, 2026-03-05 at 10:52 +0900, Hyunchul Lee wrote:
> > > > > > > > 
> > > > > > > > Sorry it's generic/285, not generic/268.
> > > > > > > > in generic/285, there is a test that creates a hole exceeding the block
> > > > > > > > size and appends small data to the file. hfsplus fails because it fills
> > > > > > > > the block device and returns ENOSPC. However if it returns EFBIG
> > > > > > > > instead, the test is skipped.
> > > > > > > > 
> > > > > > > > For writes like xfs_io -c "pwrite 8t 512", should fops->write_iter
> > > > > > > > returns ENOSPC, or would it be better to return EFBIG?
> > > > > > > > > 
> > > > > > > 
> > > > > > > Current hfsplus_file_extend() implementation doesn't support holes. I assume you
> > > > > > > mean this code [1]:
> > > > > > > 
> > > > > > >         len = hip->clump_blocks;
> > > > > > >         start = hfsplus_block_allocate(sb, sbi->total_blocks, goal, &len);
> > > > > > >         if (start >= sbi->total_blocks) {
> > > > > > >                 start = hfsplus_block_allocate(sb, goal, 0, &len);
> > > > > > >                 if (start >= goal) {
> > > > > > >                         res = -ENOSPC;
> > > > > > >                         goto out;
> > > > > > >                 }
> > > > > > >         }
> > > > > > > 
> > > > > > > Am I correct?
> > > > > > > 
> > > > > > Yes,
> > > > > > 
> > > > > > hfsplus_write_begin()
> > > > > >   cont_write_begin()
> > > > > >     cont_expand_zero()
> > > > > > 
> > > > > > 1) xfs_io -c "pwrite 8t 512"
> > > > > > 2) hfsplus_begin_write() is called with offset 2^43 and length 512
> > > > > > 3) cont_expand_zero() allocates and zeroes out one block repeatedly
> > > > > > for the range
> > > > > > 0 to 2^43 - 1. To achieve this, hfsplus_write_begin() is called repeatedly.
> > > > > > 4) hfsplus_write_begin() allocates one block through hfsplus_get_block() =>
> > > > > > hfsplus_file_extend()
> > > > > 
> > > > > I think we can consider these directions:
> > > > > 
> > > > > (1) Currently, HFS+ code doesn't support holes. So, it means that
> > > > > hfsplus_write_begin() can check pos variable and i_size_read(inode). If pos is
> > > > > bigger than i_size_read(inode), then hfsplus_file_extend() will reject such
> > > > > request. So, we can return error code (probably, -EFBIG) for this case without
> > > > > calling hfsplus_file_extend(). But, from another point of view, maybe,
> > > > > hfsplus_file_extend() could be one place for this check. Does it make sense?
> > > > > 
> > > > > (2) I think that hfsplus_file_extend() could treat hole or absence of free
> > > > > blocks like -ENOSPC. Probably, we can change the error code from -ENOSPC to -
> > > > > EFBIG in hfsplus_write_begin(). What do you think?
> > > > > 
> > > > Even if holes are not supported, shouldn't the following writes be
> > > > supported?
> > > > 
> > > > xfs_io -f -c "pwrite 4k 512" <file-path>
> > > > 
> > > > If so, since we need to support cases where pos > i_size_read(inode),
> > > 
> > > The pos > i_size_read(inode) means that you create the hole. Because,
> > 
> > That's correct. However I believe that not supporting writes like the
> > one mentioned above is a significant limitation. Filesystems that don't
> > support sparse files, such as exFAT, allocate blocks and fill them with
> > zeros.
> > 
> 
> You are welcomed to write the code for HFS/HFS+. :) I'll be happy to see such
> support.
> 
> > > oppositely, when HFS+ logic tries to allocate new block, then it expects to have
> > > pos == i_size_read(inode). And we need to take into account this code [1]:
> > > 
> > > 	if (iblock >= hip->fs_blocks) {
> > > 		if (!create)
> > > 			return 0;
> > > 		if (iblock > hip->fs_blocks) <-- This is the rejection of hole
> > > 			return -EIO;
> > > 		if (ablock >= hip->alloc_blocks) {
> > > 			res = hfsplus_file_extend(inode, false);
> > > 			if (res)
> > > 				return res;
> > > 		}
> > > 	}
> > > 
> > > The generic_write_end() changes the inode size: i_size_write(inode, pos +
> > > copied).
> > 
> > I think that it's not problem.
> > 
> > hfsplus_write_begin()
> >   cont_write_begin()
> >     cont_expand_zero()
> > 
> > cont_expand_zero() calls hfsplus_get_block() to allocate blocks between
> > i_size_read(inode) and pos, if pos > i_size_read(inode).
> > 
> 
> Currently, HFS/HFS+ expect that file should be extended without holes. It means
> that next allocating block should be equal to number of allocated blocks in
> file. If pos > i_size_read(inode), then it means that next allocating block is
> not equal to number of allocated blocks in file.
> 
> If you imply that requested length could include multiple blocks for allocation,
> then next allocating block should be equal to number of allocated blocks on
> every step. And if the next allocating block is bigger than number of allocated
> blocks in file, then hole creation is requested.
> 
> So, what are we discussing here? :)
> 

The email is getting lengty, so I will try to summarize the
discussion. :)

As mentioned before, if hfsplus_write_begin() returns an error when
pos > i_size_read(inode), the following write will fail:

xfs_io -f -c "pwrite 4k 512" <file-path>

However, currently hfsplus allows this write.

Therefore, the condition, pos - i_size_read(inode) > free space is
necessary, and futhermore, I think it is better to check the condtion
in write_iter() instead of write_begin().


> > > 
> > > > wouldn't the condition "pos - i_size_read(inode) > free space" be better?
> > > > Also instead of checking every time in hfsplus_write_begin() or
> > > > hfsplus_file_extend(), how about implementing the check in the
> > > > file_operations->write_iter callback function, and returing EFBIG?
> > > 
> > > Which callback do you mean here? I am not sure that it's good idea.
> > > 
> > 
> > Here is a simple code snippet.
> > 
> >  static const struct file_operations hfsplus_file_operations = {
> > ...
> > -       .write_iter     = generic_file_write_iter,
> > +       .write_iter     = hfsplus_file_write_iter,
> > ...
> > 
> > +ssize_t hfsplus_file_write_iter(struct kiocb *iocb, struct iov_iter *iter)
> > +{
> > ...
> > +       // check iocb->ki_pos is beyond i_size
> > +
> > +       ret = generic_file_write_iter(iocb, iter);
> > 
> 
> The hfsplus_write_begin() will be called before hfsplus_file_write_iter() if we
> are trying to extend the file. And hfsplus_get_block() calls
> hfsplus_file_extend() that will call hfsplus_block_allocate(). So, everything
> will happen before hfsplus_file_write_iter() call. What's the point to have the
> check here?

The callstack for write(2) is as follows. Am I misunderstanding
something?

write()
  do_sync_write()
    fops->write_iter()
    generic_file_write_iter()
      aops->write_begin()
      hfsplus_write_begin()

> 
> Thanks,
> Slava.
> 
> > 

-- 
Thanks,
Hyunchul

  reply	other threads:[~2026-03-09  0:52 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-03  8:28 [PATCH] hfsplus: limit sb_maxbytes to partition size Hyunchul Lee
2026-03-04 13:08 ` Christoph Hellwig
2026-03-04 20:04   ` Viacheslav Dubeyko
2026-03-05  0:29     ` Hyunchul Lee
2026-03-05  0:46       ` Viacheslav Dubeyko
2026-03-05  1:52         ` Hyunchul Lee
2026-03-05 23:21           ` Viacheslav Dubeyko
2026-03-06  0:57             ` Hyunchul Lee
2026-03-06  1:23               ` Viacheslav Dubeyko
2026-03-06  2:05                 ` Hyunchul Lee
2026-03-06 20:08                   ` Viacheslav Dubeyko
2026-03-09  0:52                     ` Hyunchul Lee [this message]
2026-03-09 19:47                       ` Viacheslav Dubeyko
2026-03-09 23:25                         ` Hyunchul Lee
2026-03-05 14:27       ` hch
2026-03-06  0:40         ` Hyunchul Lee
2026-03-04 23:49   ` Hyunchul Lee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aa4ZxJ1Lkk_9-f-C@hyunchul-PC02 \
    --to=hyc.lee@gmail.com \
    --cc=Slava.Dubeyko@ibm.com \
    --cc=cheol.lee@lge.com \
    --cc=frank.li@vivo.com \
    --cc=glaubitz@physik.fu-berlin.de \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=slava@dubeyko.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox