From: Stefan Hajnoczi <stefanha@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	Alasdair Kergon <agk@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	dm-devel@lists.linux.dev, David Teigland <teigland@redhat.com>,
	Mike Snitzer <snitzer@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@lst.de>, Joe Thornber <ejt@redhat.com>
Subject: Re: [RFC 4/9] dm: add llseek(SEEK_HOLE/SEEK_DATA) support
Date: Wed, 3 Apr 2024 10:11:47 -0400
Message-ID: <20240403141147.GD2524049@fedora>
In-Reply-To: <6awt5gq36kzwhuobabtye5vhnexc6cufuamy4frymehuv57ky5@esel3f5naqyu>
On Thu, Mar 28, 2024 at 07:38:20PM -0500, Eric Blake wrote:
> On Thu, Mar 28, 2024 at 04:39:05PM -0400, Stefan Hajnoczi wrote:
> > Delegate SEEK_HOLE/SEEK_DATA to device-mapper targets. The new
> > dm_seek_hole_data() callback allows target types to customize behavior.
> > The default implementation treats the target as all data with no holes.
> >
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  include/linux/device-mapper.h |  5 +++
> >  drivers/md/dm.c               | 68 +++++++++++++++++++++++++++++++++++
> >  2 files changed, 73 insertions(+)
> >
>
> > +/* Default implementation for targets that do not implement the callback */
> > +static loff_t dm_blk_seek_hole_data_default(loff_t offset, int whence,
> > +		loff_t size)
> > +{
> > +	switch (whence) {
> > +	case SEEK_DATA:
> > +		if ((unsigned long long)offset >= size)
> > +			return -ENXIO;
> > +		return offset;
> > +	case SEEK_HOLE:
> > +		if ((unsigned long long)offset >= size)
> > +			return -ENXIO;
> > +		return size;
>
> These fail with -ENXIO if offset == size (matching what we do on files)...
>
> > +	default:
> > +		return -EINVAL;
> > +	}
> > +}
> > +
> > +static loff_t dm_blk_do_seek_hole_data(struct dm_table *table, loff_t offset,
> > +		int whence)
> > +{
> > +	struct dm_target *ti;
> > +	loff_t end;
> > +
> > +	/* Loop when the end of a target is reached */
> > +	do {
> > +		ti = dm_table_find_target(table, offset >> SECTOR_SHIFT);
> > +		if (!ti)
> > +			return whence == SEEK_DATA ? -ENXIO : offset;
>
> ...but this blindly returns offset for SEEK_HOLE, even when offset is
> beyond the end of the dm. I think you want 'return -ENXIO;'
> unconditionally here.

If the initial offset is beyond the end of the table, then SEEK_HOLE
should return -ENXIO. I agree that the code doesn't handle this case.
However, returning offset here is correct for SEEK_HOLE when the loop
arrives at the end of the table because the last target ends in data.

I'll update the code to address the out-of-bounds offset case, perhaps
by checking the initial offset before entering the loop.
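
One way to handle the out-of-bounds case is a single check before the
loop. The following is a minimal userspace sketch, not the actual fix:
check_initial_offset() and dev_size are hypothetical names, with
dev_size standing in for the table size in bytes.

```c
#include <errno.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3	/* Linux values, defined here so the sketch is portable */
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical pre-loop check. Returns 0 if the seek may proceed, or a
 * negative errno that the caller should return immediately.
 */
static long long check_initial_offset(long long offset, int whence,
				      long long dev_size)
{
	if (whence != SEEK_DATA && whence != SEEK_HOLE)
		return -EINVAL;

	/* The unsigned cast also rejects negative offsets, as in the patch. */
	if ((unsigned long long)offset >= (unsigned long long)dev_size)
		return -ENXIO;

	return 0;
}
```

With this in place, the !ti branch inside the loop is only reached after
walking off the end of the last target, where returning offset for
SEEK_HOLE is the intended result.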
>
> > +
> > +		end = (ti->begin + ti->len) << SECTOR_SHIFT;
> > +
> > +		if (ti->type->seek_hole_data)
> > +			offset = ti->type->seek_hole_data(ti, offset, whence);
>
> Are we guaranteed that ti->type->seek_hole_data will not return a
> value exceeding end? Or can dm be used to truncate the view of an
> underlying device, and the underlying seek_hole_data can now return an
> answer beyond where dm_table_find_target should look for the next part
> of the dm's view?

That is a requirement on the target: ti->type->seek_hole_data() must not
return a value larger than (ti->begin + ti->len) << SECTOR_SHIFT.
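
To illustrate what honoring that limit could look like in a target that
exposes a window of a larger underlying device, here is a rough
userspace sketch. struct model_target and target_clamp_result() are
hypothetical and work in bytes; this is not code from the series.

```c
#include <errno.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3	/* Linux values, defined here so the sketch is portable */
#define SEEK_HOLE 4
#endif

/* Hypothetical model of a target mapping a window of a lower device. */
struct model_target {
	long long begin;	/* start of the target within the dm device */
	long long len;		/* length of the target */
	long long start;	/* offset of the window in the lower device */
};

/*
 * Translate a seek result from the lower device back into dm device
 * coordinates and keep it within [begin, begin + len].
 */
static long long target_clamp_result(const struct model_target *ti,
				     long long lower_ret, int whence)
{
	long long end = ti->begin + ti->len;
	long long ret;

	if (lower_ret < 0)
		return lower_ret;	/* pass errors through unchanged */

	ret = lower_ret - ti->start + ti->begin;
	if (ret >= end) {
		/*
		 * The lower device answered beyond the window: there is no
		 * more data in this target, and a hole search stops at end.
		 */
		return whence == SEEK_DATA ? -ENXIO : end;
	}
	return ret;
}
```

Returning end for SEEK_HOLE and -ENXIO for SEEK_DATA lets the caller's
loop move on to the next target in both cases.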
>
> In which case, should the blkdev_seek_hole_data callback be passed a
> max size parameter everywhere, similar to how fixed_size_llseek does
> things?
>
> > +		else
> > +			offset = dm_blk_seek_hole_data_default(offset, whence, end);
> > +
> > +		if (whence == SEEK_DATA && offset == -ENXIO)
> > +			offset = end;
>
> You have a bug here. If I have a dm constructed of two underlying targets:
>
> |A |B |
>
> and A is all data, then whence == SEEK_HOLE will have offset = -ENXIO
> at this point, and you fail to check whether B is also data. That is,
> you have silently treated the rest of the block device as data, which
> is semantically not wrong (as that is always a safe fallback), but not
> optimal.
>
> I think the correct logic is s/whence == SEEK_DATA &&//.

No, with whence == SEEK_HOLE and an initial offset in A, the new offset
will be (A->begin + A->len) << SECTOR_SHIFT, which equals end. The loop
will iterate and continue seeking into B.

The if statement you commented on ensures that we also continue looping
with whence == SEEK_DATA, because the loop would otherwise end
prematurely with the new offset = -ENXIO.
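
The loop behavior can be checked with a small userspace model. This is
a sketch that mirrors the quoted loop, not kernel code: struct
model_target, find_target(), target_seek() and model_seek() are
hypothetical, offsets are in bytes, and each target is either all data
or all hole.

```c
#include <errno.h>
#include <stddef.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3	/* Linux values, defined here so the sketch is portable */
#define SEEK_HOLE 4
#endif

struct model_target {
	long long begin;
	long long len;
	int all_hole;		/* 0: all data, 1: all hole */
};

static const struct model_target *
find_target(const struct model_target *t, size_t n, long long offset)
{
	for (size_t i = 0; i < n; i++)
		if (offset >= t[i].begin && offset < t[i].begin + t[i].len)
			return &t[i];
	return NULL;
}

/* Per-target seek: never returns a value beyond the target's end. */
static long long target_seek(const struct model_target *ti,
			     long long offset, int whence)
{
	long long end = ti->begin + ti->len;

	if (whence == SEEK_DATA)
		return ti->all_hole ? -ENXIO : offset;
	return ti->all_hole ? offset : end;
}

/* Same control flow as the loop in the quoted patch. */
static long long model_seek(const struct model_target *t, size_t n,
			    long long offset, int whence)
{
	const struct model_target *ti;
	long long end;

	do {
		ti = find_target(t, n, offset);
		if (!ti)
			return whence == SEEK_DATA ? -ENXIO : offset;

		end = ti->begin + ti->len;
		offset = target_seek(ti, offset, whence);

		/* Keep looping into the next target for SEEK_DATA too. */
		if (whence == SEEK_DATA && offset == -ENXIO)
			offset = end;
	} while (offset == end);

	return offset;
}
```

With A and B both all data, SEEK_HOLE from inside A walks through B and
ends at the device size. With A all hole and B all data, SEEK_DATA from
inside A lands on the start of B, which is the case the if statement
exists for.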
>
> > +	} while (offset == end);
>
> I'm trying to make sure that we can never return the equivalent of
> lseek(dm, 0, SEEK_END). If you make my above suggested changes, we
> will iterate through the do loop once more at EOF, and
> dm_table_find_target() will then fail to match at which point we do
> get the desired -ENXIO for both SEEK_HOLE and SEEK_DATA.

Wait, lseek() is supposed to return the equivalent of lseek(dm, 0,
SEEK_END) when whence == SEEK_HOLE and there is data at the end, because
the end of the device counts as an implicit hole.
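
The same behavior can be observed on a regular file. The sketch below
assumes Linux, a writable /tmp and a filesystem that supports SEEK_HOLE;
hole_at_eof_check() is a hypothetical helper name.

```c
#define _GNU_SOURCE	/* exposes SEEK_HOLE on glibc */
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef SEEK_HOLE
#define SEEK_HOLE 4	/* Linux value, fallback if headers came in earlier */
#endif

/*
 * Returns 1 if SEEK_HOLE on a fully written file reports the file size
 * and seeking from the file size itself fails with ENXIO, 0 if not, and
 * -1 if the temporary file could not be set up.
 */
static int hole_at_eof_check(void)
{
	char path[] = "/tmp/seek_hole_XXXXXX";
	char buf[4096];
	off_t hole, size;
	int fd, ok;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	/* Write one block of data so the file has no hole before EOF. */
	memset(buf, 'x', sizeof(buf));
	if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
		close(fd);
		return -1;
	}

	hole = lseek(fd, 0, SEEK_HOLE);
	size = lseek(fd, 0, SEEK_END);
	ok = (hole == size);

	/* Starting at offset == size is out of range for SEEK_HOLE. */
	errno = 0;
	if (lseek(fd, size, SEEK_HOLE) != (off_t)-1 || errno != ENXIO)
		ok = 0;

	close(fd);
	return ok;
}
```

So the end offset is a valid return value for SEEK_HOLE, while it is
never a valid starting offset.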
>
> > +
> > +	return offset;
> > +}
> > +
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.
> Virtualization: qemu.org | libguestfs.org
>